[syslog-ng] Program destination seems to block

Evan Rempel erempel at uvic.ca
Wed Apr 14 23:04:58 CEST 2010


Balazs Scheidler wrote:
> On Wed, 2010-04-14 at 13:18 -0700, Evan Rempel wrote:
>> A little background.
>>
>> There is a "server" syslog-ng process that accepts messages from the network and
>> sends the messages to a variety of destinations. For this report, I am only interested
>> in one destination that happens to be a pipe.
>>
>> There is a "slave" syslog-ng process that reads from the pipe that the "server" writes to,
>> and writes to a program destination.
>>
>> The program reads the standard in, and does "something".
>> All works well.
>>
>> At some point our application (we know why and don't want to discuss it) stops
>> reading standard in for a while (1,000,000 lines over an hour). We expect that the memory
>> footprint of syslog-ng "slave" to grow during this time but it does not. Instead, the
>> memory footprint of the syslog-ng "server" grows. When our application starts reading
>> its standard in again, the memory footprint of the syslog-ng "slave" grows very quickly,
>> and all messages reach the destination.
>>
>> I think that the syslog-ng "slave" gets blocked on the program destination in a way that
>> prevents it from reading its source, resulting in the upstream syslog-ng "server" having
>> to buffer all of the messages.
>>
>> There is no flow control anywhere, and both syslog-ng instances have log_fifo_size(8000000)
>> for all of the destinations.
>>
>> Anyone have any suggestions?
> 
> Hmm, either the slave syslog-ng really blocks (but I don't know any
> similar bugs right now), or flow control is enabled.
> 
> There was a bug that caused flow-control to be enabled, if any of the
> flags was used. Do you have fallback enabled?

No, the only flag set in the "slave" is no-parse.


> Can you post the exact versions of syslog-ng you are using?

3.0.5 OSE


> Also, you could confirm if the slave instance really blocks, or it has
> just stalled one of its sources. You could do that by attaching to the
> slave process using strace for a little while.
> 
> 1) first check what fd is being used between master/slave (lsof -p
> <pid>)
> 2) then check via strace if that fd is being polled for POLLIN or not
> 
> If it is not polled, then flow-control is somehow enabled, if syslog-ng
> is not polling but waiting somewhere, then it might be blocked as you
> suggest.
> 
> Anyway, the list of open file descriptors and the strace dump could help
> in tracking down both cases.
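
For reference, a rough sketch of the two steps above, assuming the slave's
main loop uses poll(2) as the quoted text implies. PID and FD are
placeholders to fill in by hand, and fd_polled_for_input is a made-up
helper name, not part of syslog-ng or strace. It only scans strace output
for the fd in a poll() set with POLLIN requested.

```shell
#!/bin/sh
# Hypothetical helper: reads strace output on stdin and succeeds if
# fd $1 appears in a poll() set with POLLIN among the requested events,
# e.g.  poll([{fd=3, events=POLLIN}, ...], 2, 1000) = 0 (Timeout)
fd_polled_for_input() {
    grep -Eq "\{fd=$1, events=[A-Z|]*POLLIN"
}

# Intended use (PID = slave syslog-ng, FD = pipe fd taken from lsof):
#   lsof -p "$PID"
#   timeout 10 strace -p "$PID" -e trace=poll 2>&1 \
#       | fd_polled_for_input "$FD" \
#       && echo "fd $FD is polled for POLLIN" \
#       || echo "fd $FD is not polled for POLLIN"
```

If the fd never shows up with POLLIN during the sample window, that points
at the flow-control case; no poll() calls at all would point at a real block.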


Not to be lazy here, but I am not going to get to this for weeks, really busy
at my site right now.

Here is a program whose reading of standard in can be turned on and off with
kill -USR1 <PID>     (start reading)
kill -USR2 <PID>     (stop reading)


---------------
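#!/usr/bin/perl
# Copies STDIN to STDOUT. SIGUSR1 resumes reading, SIGUSR2 pauses it.
use strict;
use warnings;

my $read = 0;    # start paused until the first USR1 arrives

$SIG{USR1} = sub { $read = 1 };
$SIG{USR2} = sub { $read = 0 };

while ( not eof(STDIN) ) {
   if ( $read == 1 ) {
     my $line = <STDIN>;
     print $line;
   } else {
     sleep 1;    # paused: do not read any lines
   }
}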
---------------

So hopefully you can reproduce this really easily at your end.

If I don't hear back, in a few weeks I'll get to this.


-- 
Evan Rempel

