[syslog-ng] syslog-ng deadlock if /dev/console locks?

Sandor Geller Sandor.Geller at morganstanley.com
Wed Jan 26 17:03:37 CET 2011


Hello,

On Wed, Jan 26, 2011 at 4:12 PM, Paul Krizak <paul.krizak at amd.com> wrote:
> Hi, we're using syslog-ng 3.1.2 and have run into what appears to be a
> bug, but I'd like to get the community's opinion before we dig further
> into it.
>
> We have a bunch of HP servers with iLO2 and iLO3 devices, configured
> with their virtual serial ports on COM1 (ttyS0).  We subsequently have
> the OS (RHEL4, RHEL5) configured to use COM1 as its console (e.g.
> /dev/console).  This is a very standard configuration that allows us to
> get remote access to the machines without having to purchase the iLO
> Advanced KVM feature.  It also lets us use the Magic SysRq keys to probe
> dead systems and stuff, so in general it's not something we're keen to
> change.
>
> What we have found, however, is that there are some cases where the iLO
> will freeze and requires a reboot.  When the iLO reboots, however, the
> kernel's connection to /dev/console (through the virtual serial port)
> hangs and blocks.  Any traffic to /dev/console just sits in the kernel's
> buffer and is never delivered.  Once the buffer is full, the kernel
> simply blocks on any write to /dev/console.
>
> Now this is a Bad Thing in general, and we're working with HP to try and
> remedy this bug.  However, what concerns me is that syslog-ng, when
> faced with this behavior, also blocks, even for log messages not bound
> for /dev/console.

syslog-ng runs its event loop in a single thread (database
destinations are the exception), so when a read() or a write()
blocks, all log processing stalls.
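
To illustrate the point, here is a minimal sketch of one pass of a
single-threaded delivery loop. It is not syslog-ng code: the helper
names (set_nonblocking, fill_until_full, deliver_to_all) are made up
for this example, and pipes stand in for real destinations. Because the
destinations are served one after another, a write() that never returns
would keep every later destination waiting. With O_NONBLOCK set, a full
pipe returns EAGAIN instead and the loop moves on.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: switch a descriptor to non-blocking mode. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: write to a non-blocking pipe until its buffer is full. */
static void fill_until_full(int fd)
{
    char chunk[4096];

    memset(chunk, 'x', sizeof chunk);
    while (write(fd, chunk, sizeof chunk) > 0)
        ;
    /* Top up byte by byte in case the capacity is not a multiple of 4096. */
    while (write(fd, chunk, 1) > 0)
        ;
    /* We must have stopped because the buffer is full, not on a real error. */
    assert(errno == EAGAIN || errno == EWOULDBLOCK);
}

/*
 * One pass of a single-threaded delivery loop: the message is written to
 * each destination in turn. Returns the number of destinations that took
 * the whole message; stalled[i] is set to 1 for those that did not.
 * If fds[i] were a blocking descriptor with a full buffer, the write()
 * below would not return and fds[i + 1] onwards would never be served.
 */
static int deliver_to_all(const int *fds, int nfds, const char *msg, int *stalled)
{
    size_t len = strlen(msg);
    int delivered = 0;

    for (int i = 0; i < nfds; i++) {
        ssize_t n = write(fds[i], msg, len);
        if (n == (ssize_t) len) {
            stalled[i] = 0;
            delivered++;
        } else {
            stalled[i] = 1;   /* e.g. EAGAIN: retry later, do not wait here */
        }
    }
    return delivered;
}
```

With two pipes where the first one has been filled by fill_until_full(),
deliver_to_all() marks the first as stalled and still delivers to the
second.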

> What we have observed is that a system with syslog-ng will keep
> delivering the occasional console message to /dev/console (ex. *.emerg
> messages) and meanwhile the file-based log paths keep working.  But once
> /dev/console blocks, the next time a console message is delivered, *all*
> of syslog-ng blocks waiting for that message to be delivered, and all of
> the file-based paths block as well.  The result is that pretty much
> everything on the system stops working.  For example, you can't log in,
> even as root, because the login process blocks on the syslog command
> that writes to /var/log/secure.  Anything that uses syslog suddenly blocks.
>
> Is this expected behavior?  I would think that syslog-ng would be able
> to continue accepting and delivering messages, even if one of the log
> paths is stalled on a blocked write.

syslog-ng uses non-blocking I/O for all sources and destinations, but
despite this the kernel can still block it. For that reason syslog-ng
protects reads and writes in logtransport.c with alarm(), so it should
recover when a timeout is set and a read or write blocks. To me it
looks like the timeout is not set in all cases: only file and program
sources initialise transport->timeout to 10 secs. So I'd say this
isn't expected behaviour; it is a bug.

Regards,

Sandor

