syslog-ng fails to drop file handles and reopen on HUP signal
Since 3.3.2 I've been seeing this weird behavior where syslog-ng fails to reload cleanly on a HUP. This has been particularly hard to troubleshoot since it appears as though when the reload fails it is causing both sshd and login to block, which I suspect is because /dev/log has disappeared. Luckily I do have nagios nrpe set up to allow me to remotely restart syslog-ng, which alleviates the inability to log in, but also makes it very hard to collect any troubleshooting information.
Anyone else seeing any failures like this?
I also sometimes see a kernel error like one of the following on reload:
Nov 17 10:30:02 app-bogus kernel: [ 3788.489852] syslog-ng[28520]: segfault at 7fd054022350 ip 00007fd054022350 sp 00007fffc93404e8 error 15
OR
Nov 17 10:32:35 app-bogus kernel: [ 3941.605005] syslog-ng[29805] general protection ip:7fd05f0bcc4e sp:7fd05ac71240 error:0 in libsyslog-ng-3.3.2.so[7fd05f092000+7b000]
I was really excited about the features this version of syslog-ng has brought with it, but I'm finding it WAY too unstable to use in production.
Is there any information I could provide to help resolve these possibly related issues?
-Dave
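One piece of information that can already be pulled out of the second kernel line above is the offset of the faulting instruction inside libsyslog-ng-3.3.2.so, since the line prints both the instruction pointer and the library's mapping base and size. The following is only a minimal sketch: the regular expression and the helper name `fault_offset` are made up for illustration, and it assumes the kernel message has exactly the layout quoted above.

```python
import re

# Matches the "general protection" style line quoted above:
#   ... ip:<hex> sp:<hex> error:<n> in <library>[<base>+<size>]
# (pattern written for this thread's log lines; other kernels may format differently)
GP_FAULT = re.compile(
    r"ip:(?P<ip>[0-9a-f]+) sp:[0-9a-f]+ error:\d+ "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (library, offset) for a kernel fault line, or None if no mapping is shown."""
    m = GP_FAULT.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    offset = ip - base
    # Sanity check: the instruction pointer must fall inside the mapping.
    if not 0 <= offset < size:
        return None
    return m.group("lib"), offset


line = (
    "Nov 17 10:32:35 app-bogus kernel: [ 3941.605005] syslog-ng[29805] "
    "general protection ip:7fd05f0bcc4e sp:7fd05ac71240 error:0 "
    "in libsyslog-ng-3.3.2.so[7fd05f092000+7b000]"
)
lib, offset = fault_offset(line)
print(lib, hex(offset))  # libsyslog-ng-3.3.2.so 0x2ac4e
```

If debug symbols for the library are installed, an offset like this can usually be handed to a tool such as addr2line (with `-e` pointing at the library file) to get a function name. The first kernel line names no mapping, so the helper returns None for it.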
On Thu, 2011-11-17 at 10:36 -0800, Dave Rawks wrote:
Since 3.3.2 I've been seeing this weird behavior where syslog-ng fails to reload cleanly on a HUP. This has been particularly hard to troubleshoot since it appears as though when the reload fails it is causing both sshd and login to block, which I suspect is because /dev/log has disappeared. Luckily I do have nagios nrpe set up to allow me to remotely restart syslog-ng, which alleviates the inability to log in, but also makes it very hard to collect any troubleshooting information.
Anyone else seeing any failures like this?
I also sometimes see a kernel error like one of the following on reload:
Nov 17 10:30:02 app-bogus kernel: [ 3788.489852] syslog-ng[28520]: segfault at 7fd054022350 ip 00007fd054022350 sp 00007fffc93404e8 error 15
OR
Nov 17 10:32:35 app-bogus kernel: [ 3941.605005] syslog-ng[29805] general protection ip:7fd05f0bcc4e sp:7fd05ac71240 error:0 in libsyslog-ng-3.3.2.so[7fd05f092000+7b000]
I was really excited about the features this version of syslog-ng has brought with it, but I'm finding it WAY too unstable to use in production.
Is there any information I could provide to help resolve these possibly related issues?
Yup, Patrick has also found this issue, and I posted a fix in a separate thread. One of the last fixes going into 3.3.2 was flawed, and I didn't notice. I'm working on 3.3.3 with just that fix. In fact, it is being uploaded to our webserver (syncs to the public every 3 hours), grab it while it's hot. For the impatient, the patch is available on GitHub.
-- Bazsi
participants (2)
- Balazs Scheidler
- Dave Rawks