[syslog-ng] Program destination seems to block
Evan Rempel
erempel at uvic.ca
Wed Apr 14 23:04:58 CEST 2010
Balazs Scheidler wrote:
> On Wed, 2010-04-14 at 13:18 -0700, Evan Rempel wrote:
>> A little background.
>>
>> There is a "server" syslog-ng process that accepts messages from the network and
>> sends the messages to a variety of destinations. For this report, I am only interested
>> in one destination that happens to be a pipe.
>>
>> There is a "slave" syslog-ng process that reads from the pipe that the "server" writes to,
>> and writes to a program destination.
>>
>> The program reads the standard in, and does "something".
>> All works well.
>>
>> At some point our application (we know why and don't want to discuss it) application stops
>> reading standard in for a while (1,000,000 lines over an hour). We expect that the memory
>> footprint of syslog-ng "slave" to grow during this time but it does not. Instead, the
>> memory footprint of the syslog-ng "server" grows. When our application starts reading
>> its standard in again, the memory footprint of the syslog "slave" grows very quickly,
>> and all messages reach the destination.
>>
>> I think that the syslog-ng "slave" get blocked on the program destination in a way that
>> prevents it from reading its source, resulting in the upstream syslog-ng "server" having
>> to buffer all of the messages.
>>
>> There is no flow control anywhere, and both syslog-ng instances have log_fifo_size(8000000)
>> for all of the destinations.
>>
>> Anyone have any suggestions?
>
> Hmm, either the slave syslog-ng really blocks (but I don't know any
> similar bugs right now), or flow control is enabled.
>
> There was a bug that caused flow-control to be enabled, if any of the
> flags was used. Do you have fallback enabled?
No, the only flags in the "slave" is no-parse.
> Can you post the exact versions of syslog-ng you are using?
3.0.5 OSE
> Also, you could confirm if the slave instance really blocks, or it has
> just stalled one of its sources. You could do that by attaching to the
> slave process using strace for a little while.
>
> 1) first check what fd is being used between master/slave (lsof -p
> <pid>)
> 2) then check via strace if that fd is being polled for POLLIN or not
>
> If it is not polled, then flow-control is somehow enabled, if syslog-ng
> is not polling but waiting somewhere, then it might be blocked as you
> suggest.
>
> Anyway, the list of open file descriptors and the strace dump could help
> in tracking down both cases.
Not to be lazy here, but I am not going to get to this for weeks, really busy
at my site right now.
Here is a program that can turn on/off the reading of standard in by using
kill -USR1 <PID>
kill -USR2 <PID>
---------------
#!/usr/bin/perl
my $read = 0;
$SIG{USR1} = sub { $read = 1 };
$SIG{USR2} = sub { $read = 0 };
while ( not eof(STDIN)) {
if ( $read == 1 ) {
$line = <STDIN>;
print $line;
} else {
sleep 1;
}
}
---------------
So hopefully you can reproduce this this really easy at your end.
If I don't hear back, in a few weeks I'll get to this.
--
Evan Rempel
More information about the syslog-ng
mailing list