[syslog-ng] Program destination seems to block

Balazs Scheidler bazsi at balabit.hu
Wed Apr 14 22:37:35 CEST 2010


On Wed, 2010-04-14 at 13:18 -0700, Evan Rempel wrote:
> A little background.
> 
> There is a "server" syslog-ng process that accepts messages from the network and
> sends the messages to a variety of destinations. For this report, I am only interested
> in one destination that happens to be a pipe.
> 
> There is a "slave" syslog-ng process that reads from the pipe that the "server" writes to,
> and writes to a program destination.
> 
> The program reads the standard in, and does "something".
> All works well.
> 
> At some point our application (we know why and don't want to discuss it) application stops
> reading standard in for a while (1,000,000 lines over an hour). We expect that the memory
> footprint of syslog-ng "slave" to grow during this time but it does not. Instead, the
> memory footprint of the syslog-ng "server" grows. When our application starts reading
> its standard in again, the memory footprint of the syslog "slave" grows very quickly,
> and all messages reach the destination.
> 
> I think that the syslog-ng "slave" get blocked on the program destination in a way that
> prevents it from reading its source, resulting in the upstream syslog-ng "server" having
> to buffer all of the messages.
> 
> There is no flow control anywhere, and both syslog-ng instances have log_fifo_size(8000000)
> for all of the destinations.
> 
> Anyone have any suggestions?

Hmm, either the slave syslog-ng really blocks (but I don't know any
similar bugs right now), or flow control is enabled.

There was a bug that caused flow-control to be enabled, if any of the
flags was used. Do you have fallback enabled?

Can you post the exact versions of syslog-ng you are using?

Also, you could confirm if the slave instance really blocks, or it has
just stalled one of its sources. You could do that by attaching to the
slave process using strace for a little while.

1) first check what fd is being used between master/slave (lsof -p
<pid>)
2) then check via strace if that fd is being polled for POLLIN or not

If it is not polled, then flow-control is somehow enabled, if syslog-ng
is not polling but waiting somewhere, then it might be blocked as you
suggest.

Anyway, the list of open file descriptors and the strace dump could help
in tracking down both cases.


-- 
Bazsi



More information about the syslog-ng mailing list