[syslog-ng] syslog-ng deadlock if /dev/console locks?
Patrick H.
syslogng at feystorm.net
Wed Jan 26 17:17:19 CET 2011
Nope, just 'echo h', that was it.
-Patrick
Sent: Wed Jan 26 2011 11:16:31 GMT-0500 (Eastern Standard Time)
From: Paul Krizak <paul.krizak at amd.com>
To: Syslog-ng users' and developers' mailing list
<syslog-ng at lists.balabit.hu> "Patrick H." <syslogng at feystorm.net>,
"Sowell, Brett" <Brett.Sowell at amd.com>, "Petrini, Bryce"
<Bryce.Petrini at amd.com>, "Hart, Corey" <Corey.Hart at amd.com>
Subject: Re: [syslog-ng] syslog-ng deadlock if /dev/console locks?
> Fascinating. So just triggering the kernel to print something to the
> console (h is "help") caused /dev/console to properly realign and
> syslog-ng woke back up? You didn't have to restart syslog-ng or
> reboot the box or anything?
>
>
> Paul Krizak 7171 Southwest Pkwy MS B200.3A
> MTS Systems Engineer Austin, TX 78735
> Advanced Micro Devices Desk: (512) 602-8775
> Linux/Unix Systems Engineering Cell: (512) 791-0686
> Global IT Infrastructure Fax: (512) 602-0468
>
> On 01/26/11 10:11, Patrick H. wrote:
>> We ran into this issue when upgrading iLO on all our boxes. When the iLO
>> was upgraded, /dev/console went completely unresponsive, and things
>> started to hang. The solution turned out to be 'echo h >
>> /proc/sysrq-trigger'. Apparently when the kernel went to write out to
>> the serial port, it ran into problems and would reinitialize it. After
>> that everything started working fine.
>>
>> -Patrick
>>
>> Sent: Wed Jan 26 2011 11:03:37 GMT-0500 (Eastern Standard Time)
>> From: Sandor Geller <Sandor.Geller at morganstanley.com>
>> To: Syslog-ng users' and developers' mailing list
>> <syslog-ng at lists.balabit.hu> "Sowell, Brett" <Brett.Sowell at amd.com>,
>> "Petrini, Bryce" <Bryce.Petrini at amd.com>, "Hart, Corey"
>> <Corey.Hart at amd.com>
>> Subject: Re: [syslog-ng] syslog-ng deadlock if /dev/console locks?
>>> Hello,
>>>
>>> On Wed, Jan 26, 2011 at 4:12 PM, Paul Krizak <paul.krizak at amd.com> wrote:
>>>
>>>> Hi, we're using syslog-ng 3.1.2 and have run into what appears to be a
>>>> bug, but I'd like to get the community's opinion before we dig further
>>>> into it.
>>>>
>>>> We have a bunch of HP servers with iLO2 and iLO3 devices, configured
>>>> with their virtual serial ports on COM1 (ttyS0). We subsequently have
>>>> the OS (RHEL4, RHEL5) configured to use COM1 as its console (i.e.
>>>> /dev/console). This is a very standard configuration that allows us
>>>> to get remote access to the machines without having to purchase the
>>>> iLO Advanced KVM feature. It also lets us use the Magic SysRq keys to
>>>> probe dead systems and stuff, so in general it's not something we're
>>>> keen to change.
>>>>
>>>> What we have found, however, is that there are some cases where the
>>>> iLO freezes and requires a reboot. When the iLO reboots, however, the
>>>> kernel's connection to /dev/console (through the virtual serial port)
>>>> hangs and blocks. Any traffic to /dev/console just sits in the
>>>> kernel's buffer and is never delivered. Once the buffer is full, the
>>>> kernel simply blocks on any write to /dev/console.
>>>>
>>>> Now this is a Bad Thing in general, and we're working with HP to try
>>>> and remedy this bug. However, what concerns me is that syslog-ng,
>>>> when faced with this behavior, also blocks, even for log messages not
>>>> bound for /dev/console.
>>>>
>>>
>>> syslog-ng uses a single thread (with the exception of database
>>> destinations) to run the event loop, so when a read() or a write()
>>> blocks, the whole of log processing is affected.
>>>
>>>
>>>> What we have observed is that a system with syslog-ng will keep
>>>> delivering the occasional console message to /dev/console (ex. *.emerg
>>>> messages) and meanwhile the file-based log paths keep working. But
>>>> once /dev/console blocks, the next time a console message is
>>>> delivered, *all* of syslog-ng blocks waiting for that message to be
>>>> delivered, and all of the file-based paths block as well. The result
>>>> is that pretty much everything on the system stops working. For
>>>> example, you can't log in, even as root, because the login process
>>>> blocks on the syslog command that writes to /var/log/secure. Anything
>>>> that uses syslog suddenly blocks.
>>>>
>>>> Is this expected behavior? I would think that syslog-ng would be able
>>>> to continue accepting and delivering messages, even if one of the log
>>>> paths is stalled on a blocked write.
>>>>
>>>
>>> syslog-ng uses non-blocking I/O for all sources and destinations, but
>>> despite this the kernel can still block it. Therefore syslog-ng
>>> protects reads/writes in logtransport.c with alarm(), so it should
>>> recover when a timeout is set and a read/write blocks. To me it looks
>>> like the timeout is not set in all cases: only file and program
>>> sources initialise transport->timeout to 10 secs. So I'd say this
>>> isn't expected behaviour - it is a bug.
>>>
>>> Regards,
>>>
>>> Sandor
>