[syslog-ng] syslog hangup

Balazs Scheidler bazsi at balabit.hu
Fri Jul 28 14:02:06 CEST 2006


On Fri, 2006-07-28 at 12:38 +0200, Vincent Régnard wrote:
> Hi,
> 
> For at least a couple of months now, we have been experiencing some 
> strange problems on some of our routers, which we now attribute to a 
> hang in syslog-ng (or in one of its internal facilities). We run 
> syslog-ng 1.6.8 on different Linux kernels, 2.4.31 and 2.6.8.
> 
> I read some posts on this list 
> (https://lists.balabit.hu/pipermail/syslog-ng/2006-May/008784.html) 
> describing similar problems.
> 
> The conditions under which the problem occurs are not clear to us. There 
> are evidently several different circumstances that lead syslog-ng to hang.
> 
> We first noticed the problem on a server where some users were not able 
> to log in, presumably because they could not write to the /dev/log 
> socket. Reloading syslog-ng from a shell that was already open on the 
> server corrected the problem. For this server we "solved" the problem by 
> adding a cron job that checks syslog activity and reloads syslog-ng if 
> needed. This is not a nice solution, but it avoids a hard reboot of the 
> server.
> 
> More recently we realized that some routers were not logging some 
> (iptables firewall) events sent to syslog, while at the same time logs 
> from other daemons were handled correctly. Again, reloading syslog-ng 
> fixes the problem until it randomly occurs again. I am currently 
> studying how these log messages are sent to syslog in order to 
> understand this problem better.
> 
> We have been tracking the causes of this annoying behaviour without 
> success so far. First of all, we would like to understand what is 
> happening in syslog-ng itself: at what level does this hang occur? The 
> kernel? syslog? Is it related to the /dev/log socket? Maybe some experts 
> or syslog developers can send us some hints? Is it related to the 
> kernel environment? /proc? udev? Or is it possible that another daemon 
> is responsible for this syslog hang?
> 
> Apparently the problem is also present in newer releases of the 1.6.x 
> branch, according to the posts on the list. I checked the branch 
> changelogs without seeing anything about it. Has any work been devoted 
> to fixing this kind of problem in more recent branches (1.9 and 2.0)?
> 
> We are planning to develop a daemon to monitor syslog-ng and reload the 
> service in case of a hang. If any of you have already done some work in 
> that direction, we would be glad to share the effort or learn the best 
> and most efficient way to proceed.
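
On the monitoring idea: a rough sketch, in Python, of a probe that such a
cron job or daemon could run. It assumes that a hung syslog-ng eventually
stops draining /dev/log, so that writes to it block, which would match the
login symptom described above. A single probe can still succeed while the
socket buffer has room, so a success is only weak evidence. The function
name, the probe message and the 2 second timeout are illustrative choices,
not anything taken from syslog-ng.

```python
import socket


def syslog_socket_responsive(path="/dev/log", timeout=2.0):
    """Try to write one message to the local log socket within `timeout`.

    Returns True if the write completed, False if it timed out or the
    socket could not be reached with either socket type.
    """
    msg = b"<14>syslog-probe: liveness check"  # priority 14 = user.info
    # /dev/log may be a datagram or a stream socket depending on the setup.
    for kind in (socket.SOCK_DGRAM, socket.SOCK_STREAM):
        sock = socket.socket(socket.AF_UNIX, kind)
        sock.settimeout(timeout)
        try:
            sock.connect(path)
            sock.sendall(msg + (b"\n" if kind == socket.SOCK_STREAM else b""))
            return True
        except socket.timeout:
            return False  # connected, but the write blocked
        except OSError:
            continue  # wrong socket type or socket missing; try the next
        finally:
            sock.close()
    return False
```

If the probe returns False, the monitor could reload syslog-ng the same way
the cron job does today.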

The only cause of hangs I know about is not really a syslog-ng issue, or
at least not one that can be fixed in syslog-ng alone (although I have
already tried to work around it).

The problem is related to reading the /proc/kmsg special file: if
multiple processes poll /proc/kmsg, one of them might block, because the
kernel does not support non-blocking I/O on /proc/kmsg.

This is usually caused by:

1) klogd and syslog-ng running on the same host, syslog-ng
referencing /proc/kmsg

2) two syslog-ng instances running for some reason (e.g. started twice
because of lost pidfiles)

3) one syslog-ng having more than a single /proc/kmsg source
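
All three cases amount to more than one open descriptor on /proc/kmsg.
A minimal Python sketch for checking that on a Linux host follows. It
assumes that the symlinks under /proc/<pid>/fd resolve to the literal path
/proc/kmsg, and that it is run as root so it can read other processes' fd
directories. The name kmsg_readers is made up for illustration.

```python
import os


def kmsg_readers(proc_root="/proc", target="/proc/kmsg"):
    """Map each PID to the number of descriptors it has open on `target`."""
    counts = {}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return counts  # no procfs available here
    for entry in entries:
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        n = 0
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    n += 1
            except OSError:
                pass  # descriptor was closed while we were looking
        if n:
            counts[int(entry)] = n
    return counts


def main():
    readers = kmsg_readers()
    for pid, n in sorted(readers.items()):
        print("pid %d has %d descriptor(s) open on /proc/kmsg" % (pid, n))
    if sum(readers.values()) > 1:
        print("more than one reader of /proc/kmsg: one of them may block")


if __name__ == "__main__":
    main()
```

Two different PIDs in the output would point at case 1 or 2; a single PID
with a count above one would point at case 3.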


-- 
Bazsi


