[syslog-ng] syslog-ng dropping log messages forwarded w/TCP

Balazs Scheidler bazsi@balabit.hu
Tue, 7 Mar 2000 21:11:17 +0100


> >Probably the speed of your network is less than required. How fast are you
> >sending those messages? syslog-ng can cope with quite high loads (2G of
> >logs a day), but it still cannot widen your network bandwidth. Note that
> >log connections in syslog-ng (e.g. a log statement) are not flow-controlled.
> >This means that messages are continuously read even if they have not yet
> >been flushed to the destination. The reason behind this is to prevent
> >syslog-ng from becoming the bottleneck.
> >
> 
> I don't think we have been quite clear on the nature of the problem.  We
> are sending messages at a very high rate over a named pipe to the local
> syslog-ng, and these messages are being forwarded via tcp to a remote
> machine, and also are logged in a local file.  Our network is switched 100
> Mbps, and the machines at each end are 450 MHz and 600 MHz Intel Pentium
> III with 3Com 3c905B NICs.  It seems unlikely that our network is the
> problem.
> 
> The exact sequence of events is that our program is started which makes
> 100000 log entries in a tight loop.  After the program exits, the local
> log file has all 100000 entries, but the remote log file has only about
> 25000 entries.  No amount of waiting increases the number of entries in
> the remote log.  The entries in the remote log are not sequentially
> numbered.  For example, they skip from 95432 to 96109.  The intervening
> log entries are simply dropped.  This is why we believe the problem is in
> syslog-ng.
> 
> At any rate it is possible that James has found the problem.  Apparently
> the return value of write is not being correctly interpreted when zero
> bytes were written.  Perhaps James will have a patch soonish.

This can also be caused by the garbage collection phase of syslog-ng. Some
time is spent in the garbage collector, and during this period /dev/log is
not read. If /dev/log is a unix-dgram device, some messages may be lost here.

Garbage collection thresholds can be set via the gc_idle_threshold and
gc_busy_threshold directives in the options statement, like this:

options { gc_busy_threshold(10000); gc_idle_threshold(30); };

The number you give here is the count of allocated objects at which garbage
collection starts. I think you should try increasing the busy threshold to
keep syslog-ng from going into gc during bursts.
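
As a rough sketch of how this might fit the setup described above (the pipe
path, host name, port, file name and threshold values are all made up for
illustration, they are not recommendations or measured numbers):

```
options {
    # illustrative values only; adjust them against what -dv reports
    gc_busy_threshold(50000);
    gc_idle_threshold(30);
};

# hypothetical names and paths standing in for the real ones
source s_app { pipe("/var/run/app.pipe"); };
destination d_remote { tcp("loghost.example.com" port(514)); };
destination d_local { file("/var/log/app.log"); };

log { source(s_app); destination(d_remote); destination(d_local); };
```

The idea is only to push the busy threshold high enough that a burst finishes
before gc kicks in. Whether that is enough depends on how many objects each
message allocates, so treat the value as a starting point.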

To see whether this is the real cause of the problem, run syslog-ng with
-dv and check for "Garbage collecting while {idle,busy}" messages.

-- 
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
     url: http://www.balabit.hu/pgpkey.txt