[syslog-ng] incorrect firewall logs with syslogng-2.0.0 [security error?]

Balazs Scheidler bazsi at balabit.hu
Fri Dec 1 12:06:43 CET 2006


On Thu, 2006-11-30 at 19:23 +0100, Jacek Kalinski wrote:
> Dnia 2006-11-30 10:53, Użytkownik Balazs Scheidler napisał:
> >> After upgradeing from syslogng-1.6.11 to 2.0.0 I've got a strange
> >> messages in /var/log/messages file:
> >>
> >> Nov 29 08:14:05 denise WINDOW=16985 RES=0x00 ACK URGP=0
> >> Nov 29 11:10:19 denise PT=34536 WINDOW=16985 RES=0x00 ACK URGP=0
> >> Nov 29 16:42:16 denise W=0 RES=0x00 RST URGP=0
> >> Nov 29 21:37:15 denise 116 ID=34901 DF PROTO=TCP SPT=3584 DPT=80
> >> WINDOW=65535 RES=0x00 ACK FIN URGP=0
> >>     
> > It is not necessarily syslog-ng that is at fault here, if the kernel
> > ring buffer is overflown (because of higher traffic for example), the
> > kernel might give partial lines while reading /proc/kmsg.
> >
> > You can increase the kernel ring buffer size by increasing the config
> > option CONFIG_LOG_BUF_SHIFT
> >   
> 
> I don't think so.
> Previously syslog-ng 1.6.11 was installed for 3 months (exactly since 1
> August). Kernel also wasn't changed at least within last month or two.
> There wasn't before this kind of problems (even in a lot of higher
> network traffic). And I'm sure that there was no "truncated" entries,
> because every strange line in system logs is reported to me immediately.
> 
> After upgrade (27th November) truncated lines are reported in logs every
> 3-10 hours. So it is very often.
> Everything I can do in this situation is to downgrade to syslog-ng 1.6
> (yes I have previous binary in backup) and check if this errors will be
> repeated.
> 
> PS: I'm not sure it is a syslog-ng error, but circumstances are very,
> very strange and connected with upgrade.

Syslog-ng is software, and as such it can certainly have bugs of its
own. I just said that symptoms are very similar to kernel ring buffer
overflows. This can happen if kernel log messages are generated faster
than syslog-ng is able to process them

The possible reasons for syslog-ng not being able to process messages:

1) the rate of kernel messages got higher
2) syslog-ng got slower in your specific scenario
3) there's some kind of external cause that makes syslog-ng to block
(DNS for instance), which increases latency

A possible reason for #3 might be that 2.0.0 lacked DNS cache, which I
have added to my TLA tree.

What could help to track down the issue is to have syslog-ng strace-ed
with timestamps (strace -ttT -s 4096 -p <pid>), wait for a message to be
garbled and analyze the results.

I'd look at:
- identify the garbled message in the strace output (it should appear in
a read() call)
- check the file descriptor of the /proc/kmsg file
- identify the read() call that is before the previously identified read
call and refers to the same file descriptor
- check the time stamps, how much time is spent between the two read()
calls, and what happened between the two
- clearly syslog-ng is working on something instead of reading the
kernel message source.

If you already know where time is being spent (possible causes: DNS
lookups, time_sleep, etc.), you can either work in decreasing these
latencies, or if nothing jumps out:

1) then either syslog-ng 2.0.0 got slower in your specific setup
(although I had reports that syslog-ng 2.0.0 is significantly faster
than 1.6.x) To improve this situation I'd need a profile generated by
oprofile

2) increase the kernel ring buffer size


-- 
Bazsi



More information about the syslog-ng mailing list