[syslog-ng] syslog hangup
Balazs Scheidler
bazsi at balabit.hu
Fri Jul 28 14:02:06 CEST 2006
On Fri, 2006-07-28 at 12:38 +0200, Vincent Régnard wrote:
> Hi,
>
> For at least a couple of months now, we've been experiencing some
> strange trouble on some of our routers, which we now attribute to a
> syslog-ng (or syslog-ng internal facility) hangup. The version of
> syslog-ng we use is 1.6.8, on Linux kernels 2.4.31 and 2.6.8.
>
> I read some posts on this list
> (https://lists.balabit.hu/pipermail/syslog-ng/2006-May/008784.html)
> describing some similar problems.
>
> The conditions under which the problem occurs are not clear to us. There are
> obviously different circumstances that lead syslog-ng to hang up.
>
> We first noticed the problem on a server where some users were not able
> to log in, certainly because they could not write to the
> /dev/log socket. Reloading syslog-ng while we had an active shell on the
> server corrected the problem. For this server we "solved" the problem by
> adding a cron job to check syslog activity and reload syslog-ng if needed.
> This is not a nice solution, but it avoids a hard reboot of the server.
>
> More recently we realized that some routers were not logging some
> (iptables firewall) events sent to syslog, while at the same time logs
> from other daemons were handled correctly. Again, reloading syslog-ng
> fixes the problem until it randomly occurs again. I am
> currently studying the way these log messages are sent to syslog to
> understand this trouble better.
>
> We have been tracking the causes of this annoying behaviour without success
> until now. First of all, we would like to understand what is happening
> in syslog-ng itself: at what level does this hangup occur? Kernel? syslog?
> Is it related to the /dev/log socket? Maybe some experts or syslog
> developers can send us some hints? Is it related to the kernel
> environment? /proc? udev? Or is it possible that another daemon is
> responsible for this syslog hangup?
>
> Apparently the problem is also present in newer releases in the 1.6.X
> branch, according to the posts on the list. I checked the branch
> changelogs without seeing anything about it. Has some work been devoted
> to fixing this kind of trouble in more recent branches (1.9 and 2.0)?
>
> We are planning to develop a daemon to monitor syslog-ng and reload the
> service in case of a hangup. If some of you have already done some work in
> that direction, we would be glad to share the effort or learn the best
> and most efficient way to proceed.
The only cause of hangs I know about is not really a syslog-ng issue, or at
least it is not fixable in syslog-ng alone (although I've already tried to
work around it).
This problem is related to reading the /proc/kmsg special file: if
multiple processes poll /proc/kmsg, one of them might block, as the
kernel does not support non-blocking I/O on /proc/kmsg.
This is usually caused by:
1) klogd and syslog-ng running on the same host, syslog-ng
referencing /proc/kmsg
2) two syslog-ng instances running for some reason (started two times
because of lost pidfiles)
3) one syslog-ng having more than a single /proc/kmsg source
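
To check for the first two cases on a running host, one option is to look
for every process that currently holds /proc/kmsg open. The following is
only a rough sketch and not part of syslog-ng: the function name
find_kmsg_readers and its parameters are made up for illustration, and it
assumes that the symlinks under /proc/<pid>/fd resolve to the literal path
"/proc/kmsg" and that the script runs with enough privileges (normally
root) to read other processes' fd directories.

```python
import os


def find_kmsg_readers(proc_root="/proc", target="/proc/kmsg"):
    """Return {pid: program name} for each process holding `target` open.

    Hypothetical helper for illustration; `proc_root` is a parameter only
    so the function can be pointed at a different directory for testing.
    """
    readers = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                link = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while we were scanning
            if link == target:
                readers[int(entry)] = _program_name(proc_root, entry)
                break  # one match per process is enough
    return readers


def _program_name(proc_root, pid):
    """Best-effort program name: first NUL-separated field of cmdline."""
    try:
        with open(os.path.join(proc_root, pid, "cmdline"), "rb") as f:
            first = f.read().split(b"\0", 1)[0]
    except OSError:
        return "?"
    return first.decode("utf-8", "replace") or "?"
```

Calling find_kmsg_readers() as root and getting back more than one PID
(for example a klogd and a syslog-ng, or two syslog-ng processes) would
point to case 1 or 2 above. It cannot detect case 3 if both sources live
in the same process, since only one entry per process is reported.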
--
Bazsi