[syslog-ng] lost logs when using syslog-ng on Linux due to /dev/log reopen

Balazs Scheidler bazsi@balabit.hu
Sun, 2 Jun 2002 12:18:49 +0200


On Sat, Jun 01, 2002 at 10:42:40PM +0200, Peter J. Holzer wrote:
> On 2002-06-01 22:40:19 +0400, Borsenkow Andrej wrote:
> > On Sat, 01.06.2002, at 00:14, Peter J. Holzer wrote:
> > > On 2002-05-28 12:50:04 +0400, Borsenkow Andrej wrote:
> > > > I am using syslog-ng on Mandrake (8.2 and post-8.2), glibc-2.2.5.
> > > > 
> > > > Unfortunately, it tends to lose logs. As far as I can tell it is related to
> > > > the fact that syslog-ng recreates /dev/log on HUP. It means that every
> > > > program that has opened syslog connection won't be able to write to /dev/log
> > > > anymore.
> > > 
> > > Not quite. I had the same problem with syslog-ng 1.4.something (I think
> > > I recompiled a Mandrake source rpm on Redhat, but I'm not 100% sure) and
> > > 1.5.13. When I investigated the problem I found two bugs: One in
> > > syslog-ng and one in glibc 2.1:
> > > 
> > > The bug in syslog-ng was related to recreating /dev/log on HUP. It was
> > > fixed in 1.5.15, I think (mentioned in the changelog), so I didn't
> > > bother to report it. 
> > > 
> > 
> > I am not as sure:
> > 
> > {pts/1}% rpm -q syslog-ng
> > syslog-ng-1.5.18-0.3mdk
> > {pts/1}% LC_TIME=C ll /dev/log
> > srw-rw-rw-    1 root     root            0 Jun  1 22:32 /dev/log=
> > {pts/1}% sudo kill -1 1301
> > {pts/1}% LC_TIME=C ll /dev/log
> > srw-rw-rw-    1 root     root            0 Jun  1 22:33 /dev/log=
> > 
> > actually it quite happily does recreate /dev/log every time.
> 
> Yes, that's ok. Existing connections shouldn't be affected by that. The
> problem was (IIRC) that existing connections were thrown away even
> though keep-alive(yes) was set. Maybe Bazsi can provide the details, I
> gave up trying to understand libol at 4:00 am and the next day I saw
> that 1.5.15 was out and it seemed to fix the problem. 

The bug was in the keep-alive(yes) case: syslog-ng was losing messages even
though it should not have.

/dev/log is recreated every time, but if it is a SOCK_STREAM socket, the
sending program can notice the broken pipe and reconnect, so those clients
keep working.
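
The stream case can be illustrated with a minimal sketch. This is not glibc's
syslog() code; the helper name send_with_reconnect, the temporary socket path
(a stand-in for /dev/log) and the single-process "daemon" are all assumptions
made for the example. It only shows that a SOCK_STREAM sender gets an error
once its peer is gone and can recover by reconnecting to the recreated socket.

```python
import os
import socket
import tempfile


def send_with_reconnect(sock, path, payload):
    """Send payload; if the stream peer is gone, reconnect to path and retry once.

    Hypothetical helper for illustration, not a real syslog API.
    Returns the socket that ended up carrying the payload.
    """
    try:
        sock.sendall(payload)
        return sock
    except (BrokenPipeError, ConnectionResetError):
        # The old connection is dead: open a new one to the recreated socket.
        sock.close()
        fresh = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        fresh.connect(path)
        fresh.sendall(payload)
        return fresh


def stream_demo():
    # Assumed stand-in for /dev/log so the sketch runs without root.
    path = os.path.join(tempfile.mkdtemp(), "log")

    def listen():
        srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        srv.bind(path)
        srv.listen(1)
        return srv

    server = listen()
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(path)
    conn, _ = server.accept()

    # Simulate the daemon dropping its connections and recreating the socket.
    conn.close()
    server.close()
    os.unlink(path)
    server = listen()

    # The first send hits the dead connection; the helper reconnects and retries.
    client = send_with_reconnect(client, path, b"after reopen\n")
    conn, _ = server.accept()
    received = conn.recv(1024)

    for s in (conn, server, client):
        s.close()
    os.unlink(path)
    return received
```

Calling stream_demo() returns the payload as read by the recreated listener,
so in this sketch the message sent after the reopen is not lost.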

If it is SOCK_DGRAM (to which several distributions have switched), the
recreation of /dev/log is bad.
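
A rough sketch of the datagram case, under stated assumptions: the old
receiving socket is closed when the path is recreated, the sender had
connect()ed to it beforehand, and the platform is Linux. The function name
dgram_demo and the temporary path are made up for the example, and this is
not a trace of syslog-ng itself. Under those assumptions the sender stays
tied to the old socket, so the datagram sent after the recreation does not
reach the new one.

```python
import os
import socket
import tempfile


def dgram_demo():
    # Assumed stand-in for /dev/log so the sketch runs without root.
    path = os.path.join(tempfile.mkdtemp(), "log")

    def bind():
        s = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        s.bind(path)
        return s

    receiver = bind()
    sender = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sender.connect(path)  # the sender is now tied to this particular socket

    sender.send(b"before\n")
    first = receiver.recv(1024)

    # Simulate the daemon closing the old socket and recreating the path.
    receiver.close()
    os.unlink(path)
    receiver = bind()
    receiver.setblocking(False)

    # The sender still points at the old, now closed, socket.
    try:
        sender.send(b"after\n")
        error = None
    except OSError as exc:
        error = exc.errno

    # Check whether anything arrived on the recreated socket.
    try:
        delivered = receiver.recv(1024)
    except BlockingIOError:
        delivered = None

    sender.close()
    receiver.close()
    os.unlink(path)
    return first, error, delivered
```

On Linux the second send is expected to fail (typically with ECONNREFUSED)
and nothing arrives on the new socket, so a sender that does not resend after
reconnecting loses that message.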

> > > Apparently[1] the syslog library function works like this:
> [... why the syslog function in glibc reproducibly loses messages ...]
> > > Doesn't matter much if you lose one message every 24 hours (or
> > > how often you switch logs), don't you think?
> > > 
> > 
> > You are not serious, are you? 
> 
> No I'm not. But the guy who wrote the syslog function was obviously
> thinking along those lines. Given that syslog has traditionally been
> implemented over an unreliable transport protocol (UDP) the assumption
> that nobody would rely on getting all messages probably wasn't that
> implausible.

I'm going to address this issue by adding several features to syslog-ng 2:

* flow controlled message paths
* disk buffering of messages when a destination is absent
* maybe a preloadable shared library that replaces the syslog()
  implementation in libc

-- 
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1