<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#0050d0">
<font size="-1"><font face="Helvetica, Arial, sans-serif">Nope, just
'echo h', that was it.<br>
<br>
-Patrick<br>
</font></font><br>
Sent: Wed Jan 26 2011 11:16:31 GMT-0500 (Eastern Standard Time)<br>
From: Paul Krizak <a class="moz-txt-link-rfc2396E" href="mailto:paul.krizak@amd.com">&lt;paul.krizak@amd.com&gt;</a><br>
To: Syslog-ng users' and developers' mailing list
<a class="moz-txt-link-rfc2396E" href="mailto:syslog-ng@lists.balabit.hu">&lt;syslog-ng@lists.balabit.hu&gt;</a>, "Patrick H."
<a class="moz-txt-link-rfc2396E" href="mailto:syslogng@feystorm.net">&lt;syslogng@feystorm.net&gt;</a>, "Sowell, Brett"
<a class="moz-txt-link-rfc2396E" href="mailto:Brett.Sowell@amd.com">&lt;Brett.Sowell@amd.com&gt;</a>, "Petrini, Bryce"
<a class="moz-txt-link-rfc2396E" href="mailto:Bryce.Petrini@amd.com">&lt;Bryce.Petrini@amd.com&gt;</a>, "Hart, Corey" <a class="moz-txt-link-rfc2396E" href="mailto:Corey.Hart@amd.com">&lt;Corey.Hart@amd.com&gt;</a><br>
Subject: Re: [syslog-ng] syslog-ng deadlock if /dev/console locks?
<blockquote cite="mid:4D4048DF.1090802@amd.com" type="cite">Fascinating.
So just triggering the kernel to print something to the console (h is
"help") caused /dev/console to properly realign and syslog-ng woke back
up? You didn't have to restart syslog-ng or reboot the box or
anything?
<br>
<br>
<br>
Paul Krizak 7171 Southwest Pkwy MS B200.3A
<br>
MTS Systems Engineer Austin, TX 78735
<br>
Advanced Micro Devices Desk: (512) 602-8775
<br>
Linux/Unix Systems Engineering Cell: (512) 791-0686
<br>
Global IT Infrastructure Fax: (512) 602-0468
<br>
<br>
On 01/26/11 10:11, Patrick H. wrote:
<br>
<blockquote type="cite">We ran into this issue when upgrading iLO on
all our boxes. When the iLO
<br>
was upgraded, /dev/console went completely unresponsive, and things
<br>
started to hang. The solution turned out to be
<br>
'echo h &gt; /proc/sysrq-trigger'. Apparently, when the kernel went to write to
<br>
the serial port, it ran into problems and reinitialized it. After
<br>
that everything started working fine.
<br>
<br>
-Patrick
<br>
<br>
Sent: Wed Jan 26 2011 11:03:37 GMT-0500 (Eastern Standard Time)
<br>
From: Sandor Geller <a class="moz-txt-link-rfc2396E" href="mailto:Sandor.Geller@morganstanley.com">&lt;Sandor.Geller@morganstanley.com&gt;</a>
<br>
To: Syslog-ng users' and developers' mailing list
<br>
<a class="moz-txt-link-rfc2396E" href="mailto:syslog-ng@lists.balabit.hu">&lt;syslog-ng@lists.balabit.hu&gt;</a>, "Sowell, Brett"
<a class="moz-txt-link-rfc2396E" href="mailto:Brett.Sowell@amd.com">&lt;Brett.Sowell@amd.com&gt;</a>,
<br>
"Petrini, Bryce" <a class="moz-txt-link-rfc2396E" href="mailto:Bryce.Petrini@amd.com">&lt;Bryce.Petrini@amd.com&gt;</a>, "Hart, Corey"
<a class="moz-txt-link-rfc2396E" href="mailto:Corey.Hart@amd.com">&lt;Corey.Hart@amd.com&gt;</a>
<br>
Subject: Re: [syslog-ng] syslog-ng deadlock if /dev/console locks?
<br>
<blockquote type="cite">Hello,
<br>
<br>
On Wed, Jan 26, 2011 at 4:12 PM, Paul
Krizak <a class="moz-txt-link-rfc2396E" href="mailto:paul.krizak@amd.com">&lt;paul.krizak@amd.com&gt;</a> wrote:
<br>
<br>
<blockquote type="cite">Hi, we're using syslog-ng 3.1.2 and have
run into what appears to be a
<br>
bug, but I'd like to get the community's opinion before we dig further
<br>
into it.
<br>
<br>
We have a bunch of HP servers with iLO2 and iLO3 devices, configured
<br>
with their virtual serial ports on COM1 (ttyS0). We subsequently have
<br>
the OS (RHEL4, RHEL5) configured to use COM1 as its console (i.e.
<br>
/dev/console). This is a very standard configuration that allows us to
<br>
get remote access to the machines without having to purchase the iLO
<br>
Advanced KVM feature. It also lets us use the Magic SysRq keys to
probe
<br>
dead systems and stuff, so in general it's not something we're keen to
<br>
change.
<br>
<br>
What we have found, however, is that there are some cases where the iLO
<br>
freezes and requires a reboot. When the iLO reboots, however, the
<br>
kernel's connection to /dev/console (through the virtual serial port)
<br>
hangs and blocks. Any traffic to /dev/console just sits in the
kernel's
<br>
buffer and is never delivered. Once the buffer is full, the kernel
<br>
simply blocks on any write to /dev/console.
<br>
<br>
Now this is a Bad Thing in general, and we're working with HP to try
and
<br>
remedy this bug. However, what concerns me is that syslog-ng, when
<br>
faced with this behavior, also blocks, even for log messages not bound
<br>
for /dev/console.
<br>
<br>
</blockquote>
<br>
syslog-ng uses a single thread (with the exception of database
<br>
destinations) running the event loop, so when a read() or a write()
<br>
blocks, it stalls the whole of log processing.
<br>
<br>
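To illustrate the point above, here is a hypothetical single-threaded dispatcher in Python. It is not syslog-ng's actual code; the dispatch name and the use of raw file descriptors are assumptions made for the example. With descriptors in non-blocking mode, a full destination raises BlockingIOError and the loop can move on. With a descriptor that blocks, the same loop would wait inside os.write() and every destination after it would wait too.
<br>

<pre>
```python
import os


def dispatch(message, destinations):
    """Write message to each fd in turn; return the fds that would block.

    Every fd is assumed to be in non-blocking mode. Partial writes are
    ignored here to keep the sketch short.
    """
    stalled = []
    for fd in destinations:
        try:
            os.write(fd, message)
        except BlockingIOError:
            # The destination's buffer is full: note it and keep going
            # instead of holding up the remaining destinations.
            stalled.append(fd)
    return stalled
```
</pre>
A caller would put each descriptor into non-blocking mode first, for example with os.set_blocking(fd, False), and decide separately whether to queue or drop messages for the stalled descriptors.
<br>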
<br>
<blockquote type="cite">What we have observed is that a system
with syslog-ng will keep
<br>
delivering the occasional console message to /dev/console (ex. *.emerg
<br>
messages) and meanwhile the file-based log paths keep working. But
once
<br>
/dev/console blocks, the next time a console message is delivered,
*all*
<br>
of syslog-ng blocks waiting for that message to be delivered, and all
of
<br>
the file-based paths block as well. The result is that pretty much
<br>
everything on the system stops working. For example, you can't log in,
<br>
even as root, because the login process blocks on the syslog command
<br>
that writes to /var/log/secure. Anything that uses syslog suddenly
blocks.
<br>
<br>
Is this expected behavior? I would think that syslog-ng would be able
<br>
to continue accepting and delivering messages, even if one of the log
<br>
paths is stalled on a blocked write.
<br>
<br>
</blockquote>
<br>
syslog-ng uses non-blocking I/O for all sources / destinations, but
<br>
despite this the kernel could still block it. Therefore syslog-ng
<br>
protects reads/writes in logtransport.c with alarm(), so it should
<br>
recover when a timeout is set and a read/write blocks. To me it looks
<br>
like the timeout is not set in all cases: only file and program
<br>
sources initialise transport-&gt;timeout to 10 secs, so I'd say this
isn't
<br>
expected behaviour - it is a bug.
<br>
<br>
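The alarm()-guarded write described above can be sketched roughly as follows. This is a Python illustration under assumptions, not the code in logtransport.c: guarded_write, WriteTimeout and the whole-second timeout are hypothetical names and choices. Note that in this sketch a timeout of 0 arms no alarm at all, so the write is left unprotected.
<br>

<pre>
```python
import os
import signal


class WriteTimeout(Exception):
    """Raised by the SIGALRM handler when the guarded write takes too long."""


def _on_alarm(signum, frame):
    # Raising here interrupts the blocked os.write() call.
    raise WriteTimeout()


def guarded_write(fd, data, timeout_secs):
    """Write data to fd; return bytes written, or None if the alarm fired.

    timeout_secs is in whole seconds. A value of 0 arms no alarm, so a
    blocked write would then wait indefinitely.
    """
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout_secs)
    try:
        return os.write(fd, data)
    except WriteTimeout:
        return None
    finally:
        # Always cancel the pending alarm and restore the old handler.
        signal.alarm(0)
        signal.signal(signal.SIGALRM, old_handler)
```
</pre>
This only works on Unix and only from the main thread, because Python delivers signals there. A call such as guarded_write(fd, b"message", 10) would give up after about ten seconds instead of stalling the whole process.
<br>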
Regards,
<br>
<br>
Sandor
<br>
______________________________________________________________________________
<br>
Member info: <a class="moz-txt-link-freetext" href="https://lists.balabit.hu/mailman/listinfo/syslog-ng">https://lists.balabit.hu/mailman/listinfo/syslog-ng</a>
<br>
Documentation: <a class="moz-txt-link-freetext" href="http://www.balabit.com/support/documentation/?product=syslog-ng">http://www.balabit.com/support/documentation/?product=syslog-ng</a>
<br>
FAQ: <a class="moz-txt-link-freetext" href="http://www.campin.net/syslog-ng/faq.html">http://www.campin.net/syslog-ng/faq.html</a>
<br>
<br>
<br>
</blockquote>
</blockquote>
<br>
</blockquote>
</body>
</html>