Interesting issue with syslog-ng 3.3

30 Aug 2011

Hi!

While this mail might sound a bit vague, it will - if nothing else -
serve as a reminder for me to investigate the issue further.

On one of my servers (PowerPC, running Debian Squeeze), I am running
syslog-ng 3.3, built from a reasonably recent (2-3 day old) git
snapshot. It works quite well, except that I was able to trace my
server's recent hangs back to syslog-ng:

The server had an uptime of ~120 days when I upgraded from 3.1 to 3.3,
and since then it has had to be rebooted twice already, in just two
weeks. Last time, I didn't have any open connections to it, so I
couldn't investigate, but tonight, I had an ssh session open with a
screen session inside.

So I tried to look around: first, I wanted to check the logs, but knew I
wouldn't find anything, as it had stopped sending logs to my other
server about two hours before I noticed the problem. Even worse, when I
tried to sudo, that hung indefinitely. Weird.

There was nothing in dmesg, and nothing interesting in the logs it did
send before becoming unresponsive. HTTP still worked too, as did a few
other services. I could do nearly anything as a user.

So I tried stracing crontab, and it hung when it tried to send logs to
/dev/log. Interesting! I tried logger, and the same thing happened.
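
For next time, a probe that gives up after a timeout instead of hanging
like logger did might be handy. The following is only a rough sketch:
the function name, the timeout, and the test message are all made up for
illustration, and it assumes /dev/log is a Unix socket of either
datagram or stream type (it simply tries both).

```python
import socket


def probe_log_socket(path="/dev/log", timeout=2.0,
                     message=b"<14>probe: syslog socket test"):
    """Try to deliver one message to a syslog socket, with a timeout.

    Returns "ok" if the message was accepted, "blocked" if the socket
    would not take it within the timeout, and "unavailable" if no
    connection could be made at all.
    """
    # Assumption: the socket type is not known in advance, so try both.
    for sock_type in (socket.SOCK_DGRAM, socket.SOCK_STREAM):
        sock = socket.socket(socket.AF_UNIX, sock_type)
        sock.settimeout(timeout)
        try:
            sock.connect(path)
            if sock_type == socket.SOCK_STREAM:
                # Stream sockets need a newline to terminate the message.
                sock.sendall(message + b"\n")
            else:
                sock.send(message)
            return "ok"
        except (socket.timeout, BlockingIOError):
            # The peer is not draining its queue: the hang seen above.
            return "blocked"
        except OSError:
            # Wrong socket type, missing path, connection refused, ...
            continue
        finally:
            sock.close()
    return "unavailable"


if __name__ == "__main__":
    print(probe_log_socket())
```

Run as an ordinary user, this should print "blocked" rather than hang if
the socket is in the state described above, which makes it usable from
cron or a monitoring script that does not itself log via syslog.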

I suspect that for one reason or another, /dev/log got overwhelmed,
and even worse, syslog-ng ended up trying to log something as well, which
made it hang too. And thus the queue remained full, and everything that
tried to log got stuck.
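
To illustrate the kind of stall I have in mind (this is not syslog-ng
code, just a toy with hypothetical names, and it assumes Linux-style
Unix socket behaviour): if the reading side of a Unix datagram socket
never reads, the writer can only queue a limited number of messages
before further sends are refused, or block forever on a blocking socket.

```python
import socket


def fill_unread_socket(payload_size=1024, limit=100_000):
    """Count how many datagrams fit before an unread socket refuses more."""
    # The reader stands in for a stuck log daemon: it never calls recv().
    reader, writer = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    # Non-blocking, so the full queue shows up as an error, not a hang.
    writer.setblocking(False)
    sent = 0
    try:
        for _ in range(limit):
            try:
                writer.send(b"x" * payload_size)
            except BlockingIOError:
                # Queue is full: a blocking logger would be stuck here.
                break
            sent += 1
    finally:
        reader.close()
        writer.close()
    return sent


if __name__ == "__main__":
    print("datagrams queued before the socket filled up:",
          fill_unread_socket())
```

The exact count depends on kernel buffer settings, so it is printed
rather than assumed. The point is only that it is finite, and that every
later writer ends up waiting on a reader that never comes back.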

HTTP continued to work, since my httpd isn't using syslog for its
logs. I could poke around in my shell, since that wasn't logging,
either.

This never happened with 3.1, and the only thing I changed in the config
is the @version, pretty much. Thus, I suspect, there's some very nasty
bug in 3.3beta2 that I haven't found yet.

I'm leaving a root shell open this time, so that I can poke around
further next time (along with a syslog-ng compiled with debug symbols).

In the meantime, I thought I'd drop a note, hoping that perhaps Bazsi
or someone from the syslog-ng devel team would have an idea where to
look, and what to check next time this happens.

-- 
|8]

Gergely Nagy
