[syslog-ng] syslog-ng lock ups

Tue Mar 7 21:01:33 UTC 2017

This is probably a deadlock or a hung main loop. Neither of them is
considered normal behavior.

To troubleshoot the issue, we would probably need a core file. To get one,
first make sure that syslog-ng is able to write core files (--enable-core
and friends), and then wait for the hang to happen again.
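
One rough way to check the "able to write core files" part on Linux is to
look at the running process's resource limits. This is only a sketch: the
function names are made up for illustration and are not part of syslog-ng,
and it assumes /proc/<pid>/limits is readable. A soft "Max core file size"
of 0 means the kernel will not write a core for that process.

```python
def core_limits(limits_text):
    """Return (soft, hard) from the 'Max core file size' row, or None."""
    for line in limits_text.splitlines():
        if line.startswith("Max core file size"):
            fields = line.split()
            # Row layout: Max core file size <soft> <hard> <units>
            return fields[4], fields[5]
    return None


def can_dump_core(limits_text):
    """True when the soft core-size limit is non-zero (or 'unlimited')."""
    limits = core_limits(limits_text)
    return limits is not None and limits[0] != "0"


def read_limits(pid):
    """Read the limits table of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as fh:
        return fh.read()

# Hypothetical usage, with 1234 standing in for the syslog-ng PID:
#   can_dump_core(read_limits(1234))
```

The limit is only one precondition; where the core ends up also depends on
the kernel's core_pattern setting and on the process's working directory
being writable.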

When it does, send SIGABRT to the process, which should write a core with
the correct thread information intact.
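
Sending the signal is the same as running kill -ABRT on the PID. As a
minimal Python sketch (abort_for_core is a hypothetical helper name, and the
PID in the comment is a placeholder):

```python
import os
import signal


def abort_for_core(pid):
    """Send SIGABRT to pid.

    The default action for SIGABRT terminates the process and, if core
    dumps are permitted for it, makes the kernel write a core file.
    """
    os.kill(pid, signal.SIGABRT)

# Hypothetical usage, with 1234 standing in for the hung syslog-ng PID:
#   abort_for_core(1234)
```

This needs to run as the same user as the target process, or as root.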

Then, with the core files and all binaries with debug symbols, we can
potentially troubleshoot the issue.
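
For a hang, the first thing worth looking at in a core is usually a
backtrace of every thread. A sketch of collecting that with gdb in batch
mode follows; it assumes gdb is installed, and the binary and core paths in
the comment are placeholders, not paths taken from the report.

```python
import subprocess


def backtrace_command(binary, core):
    """Build a gdb command line that prints a backtrace of every thread."""
    return ["gdb", "--batch", "-ex", "thread apply all bt", binary, core]


def all_thread_backtraces(binary, core):
    """Run gdb non-interactively and return what it printed."""
    result = subprocess.run(backtrace_command(binary, core),
                            capture_output=True, text=True, check=False)
    return result.stdout

# Hypothetical usage (both paths are placeholders):
#   print(all_thread_backtraces("/usr/sbin/syslog-ng", "core.1234"))
```

Without debug symbols for syslog-ng and its libraries, the frames will
mostly show raw addresses instead of function names.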

This may or may not be trivial, depending on what is in the core files. But
this is the only way to find the root cause.

Bazsi

On Mar 7, 2017 8:00 PM, "Evan Rempel" <erempel at uvic.ca> wrote:

> I have been having a problem with syslog-ng where it just stops processing
> all input. There are destinations that receive and count all of the
> messages that come out of syslog-ng and they stop getting any messages.
>
> This is occurring on two syslog servers that have similar configurations.
> One is a superset of the simpler one. On these hosts we run two syslog-ng
> instances. One for the regular OS log messages from /dev/log, /proc/kmsg,
> localhost:1514 and the second one which only listens on the network port(s)
> which we call our syslog server. The syslog server has its internal log
> messages going to localhost:1514 so any internal events such as new
> connections etc should be logged to the "OS syslog" instance.
>
> We had this problem on 3.7.x and after upgrading to 3.9.1 we still
> encounter this problem.
>
> The OS is now the latest Red Hat 6 (6.8), but the same problem occurred on
> every Red Hat 6.x system we ran.
>
> Our host monitoring shows that the syslog-ng process did NOT increase its
> memory footprint which indicates that it stopped reading its source.
>
> All of the hosts that send logs to syslog server showed increased message
> queueing which confirms that the source stopped being read.
>
> The CPU consumption by the syslog-ng process is zero during this time
> period.
>
> Our network monitoring infrastructure opens a connection to the syslog
> server every 30 seconds. This gets logged via the localhost:1514
> connection. These accepted and closed log lines are missing during our
> problem window even though the monitoring tool WAS able to connect through
> the entire problem window.
>
> We have configured the syslog stats log line to be sent to an external
> program which invokes syslog-ng-ctl stats and it processes this data. This
> process did not get any such stats line, so the exact syslog-ng stats
> counters are unavailable during this problem window. The syslog-ng-ctl
> program was NOT hung on the socket.
>
> Attempting a graceful shutdown of the syslog-ng process either with
> syslog-ng-ctl or with a kill -TERM appears to have no effect. I assume this
> is because syslog-ng is unable to flush its buffers so it does not
> terminate.
>
> Killing syslog-ng and restarting it starts processing again correctly.
>
>
> Has anyone else had this symptom?
>
> Has anyone found a solution to this?
>
> I realize that this is all anecdotal but was hoping it would trigger
> someone's memory.
>
> Thanks in advance.
>
>
> Evan.
>
> ______________________________________________________________________________
> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
> Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>
>