[syslog-ng] Deadlock with full disk buffer

Ronald Fenner rfenner at gamecircus.com
Mon Aug 3 22:19:01 UTC 2020


What string can I grep for, with tracing enabled, to find the log lines from the Kafka plugin? In trace mode the output also includes the messages being sent to it, and there is a large volume of them.

Ronald Fenner
Network Architect
Game Circus LLC.

rfenner at gamecircus.com

> On Aug 2, 2020, at 1:38 PM, Ronald Fenner <rfenner at gamecircus.com> wrote:
> 
> Thanks, I'll keep it in mind, but in order to upgrade I have to rebuild our base image and then roll it up and redeploy our production servers.
> 
> After further investigation, I don't think it's a syslog-ng issue. Apparently the traffic coming into our Kafka cluster dropped significantly, and after cycling out all the old instances pushing messages to it, it hasn't increased. It's possible AWS may be responsible.
> 
> 
> Ronald Fenner
> Network Architect
> Game Circus LLC.
> 
> rfenner at gamecircus.com
> 
>> On Aug 2, 2020, at 1:12 PM, Peter Czanik (pczanik) <Peter.Czanik at oneidentity.com> wrote:
>> 
>> Hi,
>> 
>> I don't use Kafka, but I recall that there were some Kafka- and buffering-related fixes in 3.28. You should give it a try and see if it fixes your problem.
>> 
>> Peter
>> 
>> Peter Czanik (CzP) <peter.czanik at oneidentity.com>
>> Balabit (a OneIdentity company) / syslog-ng upstream
>> https://syslog-ng.com/community/
>> https://twitter.com/PCzanik
>> From: syslog-ng <syslog-ng-bounces at lists.balabit.hu> on behalf of Ronald Fenner <rfenner at gamecircus.com>
>> Sent: Sunday, August 2, 2020 18:56
>> To: Syslog-ng users' and developers' mailing list <syslog-ng at lists.balabit.hu>
>> Subject: [syslog-ng] Deadlock with full disk buffer
>>  
>> I've run into a problem: we enabled reliable disk buffering on our Kafka destination, and its disk buffer eventually filled the drive it's stored on, partly because the drive is sized to the buffer it's holding.
>> 
>> When restarting, syslog-ng complains with "Daemon exited due to a deadlock/signal/failure, restarting; exitcode='6'"
>> 
>> The version we are using is 3.23.1.
>> 
>> I'm not sure why the disk buffer filled up. The only thing I can think of is that syslog-ng was unable to send messages to the Kafka destination. However, our Kafka cluster is up and functioning fine, and I can even consume messages from a topic that we are pushing messages to on one of the affected instances.
>> 
>> One question: once a reliable message is sent, is it removed from the buffer? I assume that's the case.
>> 
>> The other question: is it possible that the Java Kafka plugin can't keep up with the number of messages we are sending through syslog-ng?
>> 
>> We see 12.5 million requests per day to the servers with the disk buffers. Each request generates at least 2 messages, one access log message and one application log message, so that's roughly 25 million messages a day.
>> 
>> I designed the disk buffers to hold roughly two days' worth of data in case the Kafka cluster went down. That doesn't seem to be what happened here, since monitoring tools show it's online and none of our hourly consumers complained about being unable to connect.
>> 
>> I'll have to build a recovery server for the buffers. They are housed on attached drives, and the drives are preserved when the instance is terminated; this was done to make sure we didn't lose any unsent messages when autoscaling scaled out an instance. I'm currently manually terminating the affected instances to keep those buffers and restore our logging.
>> 
>> Once I have it, if more info is needed, I can see what happens when trying to start syslog-ng while it isn't also receiving a bunch of messages.
>> 
>> 
>> Ronald Fenner
>> Network Architect
>> Game Circus LLC.
>> 
>> rfenner at gamecircus.com
>> ______________________________________________________________________________
>> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng <https://lists.balabit.hu/mailman/listinfo/syslog-ng>
>> Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng <http://www.balabit.com/support/documentation/?product=syslog-ng>
>> FAQ: http://www.balabit.com/wiki/syslog-ng-faq <http://www.balabit.com/wiki/syslog-ng-faq>
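
The volume figures in the original report quoted above can be turned into an average rate and a rough buffer size. This is a minimal sketch: the request count, messages per request, and two-day retention come from the thread, while the average message size is an assumption that was not stated anywhere. The totals cover all servers with disk buffers, so per-instance numbers would be lower and depend on an instance count the thread does not give.

```python
# Figures taken from the report quoted above.
REQUESTS_PER_DAY = 12_500_000
MESSAGES_PER_REQUEST = 2  # one access log message, one application log message
BUFFER_DAYS = 2

# Assumption (not from the thread): average size of one buffered message.
ASSUMED_AVG_MESSAGE_BYTES = 512

SECONDS_PER_DAY = 24 * 60 * 60


def messages_per_day(requests_per_day, messages_per_request):
    """Total messages generated per day across all servers."""
    return requests_per_day * messages_per_request


def average_rate_per_second(msgs_per_day):
    """Average message rate, ignoring daily peaks."""
    return msgs_per_day / SECONDS_PER_DAY


def buffer_bytes(msgs_per_day, days, avg_message_bytes):
    """Payload bytes needed to hold the given number of days of messages."""
    return msgs_per_day * days * avg_message_bytes


daily = messages_per_day(REQUESTS_PER_DAY, MESSAGES_PER_REQUEST)
rate = average_rate_per_second(daily)
size = buffer_bytes(daily, BUFFER_DAYS, ASSUMED_AVG_MESSAGE_BYTES)

print(f"{daily:,} messages/day")
print(f"{rate:.0f} messages/second on average")
print(f"{size / 1024**3:.1f} GiB of payload for {BUFFER_DAYS} days (assumed size)")
```

With these inputs the daily total is 25,000,000 messages, or about 289 messages per second on average. Peak traffic will be higher than the average, and the on-disk size also depends on per-message overhead in the buffer file, which this sketch does not model.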
