[syslog-ng] Lock up problems when using unix-dgram("/dev/log") type on

Balazs Scheidler bazsi at balabit.hu
Mon Dec 7 13:58:32 CET 2009


On Thu, 2009-12-03 at 20:05 -0800, Lance Laursen wrote:
> Hello,
> 
> 
> I'm having problems with machines eventually hanging on all processes
> that write to /dev/log when using unix-dgram("/dev/log") with
> Syslog-NG 3.0.4. The servers run fine for a while and hum along as
> expected. Unfortunately the success does not last, with various
> programs completely hanging after an undetermined time. Having an
> existing root shell as this happens allows me to kill syslog-ng,
> freeing up all locks.
> 
> 
> Repro'ing this is...well, annoying. I have 300+ servers running this
> build of syslog-ng fine, all using unix-stream(). The 4 servers that
> are locking up are the only ones I have running unix-dgram().
> Completely fresh ubuntu 8.04 installs with syslog-ng 3.0.4, identical
> to all other boxes aside from the one syslog-ng option. I've got
> strace output that is hanging after programs try to write to /dev/log
> as well.
> 
> 
> I'm currently doing a repro by running "while true ; do logger -p
> local0.info ...longest_message_possible... ; sleep 1s ; done" in
> non-exact science and have managed to pile things up after just over
> 120 messages, or two minutes. I can still hop around as root, but all
> programs that try to write to /dev/log pile up. The pile up seems to
> be log-size/throughput based, not time-based after some rudimentary
> tests - though it could be something random that is triggering it
> while my crappy tests are running. My next test plans to have small
> log messages in very rapid succession.
> 
> 
> I'm running:
> # uname -a
> Linux tny0032 2.6.24-24-generic #1 SMP Tue Jul 7 19:10:36 UTC 2009
> x86_64 GNU/Linux
> # cat /etc/debian_version
> lenny/sid
> (ubuntu 8.04)
> 
> 
> Here's my source definition:
> # all known message sources
> source s_all {
>         internal();
>         unix-dgram("/dev/log");
>         file("/proc/kmsg" program_override("kernel: "));
> };
> 
> 
> 
> 
> Here's some strace output that locks after trying to write
> to /dev/log:
> # strace su - lance
> ...
> ...
> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
> socket(PF_FILE, SOCK_DGRAM, 0) = 3
> fcntl(3, F_SETFD, FD_CLOEXEC) = 0
> connect(3, {sa_family=AF_FILE, path="/dev/log"}, 110) = 0
> sendto(3, "<85>Dec 4 02:12:43 su[7086]: pa"..., 148, MSG_NOSIGNAL,
> NULL, 0
> 
> 
> # strace logger -p local0.info lalala
> produces the same lock-point as above.
> 
> 
> 
> 
> I thought dgram should be connectionless? I'm not sure how syslog-ng
> could be locking up resources. Has anyone seen this before? I will
> continue looking for a better repro case, if anyone has any ideas
> though shout.
> 
> 
> I am using unix-dgram solely because it does not break to a new log
> entry on NewLines. I was encountering a problem where, using
> unix-stream, lighttpd's multi-line log output was getting broken up
> into multiple syslog lines. This would have been fine, except when the
> new line is broken out and made into new log entries, the $hostname
> and $program fields get stripped out, leaving me with just $date $msg.
> This basically negated the ability to filter and relay logs
> effectively. I can elaborate further here if requested, but making
> unix-stream behave the same as unix-dgram with regards to multi-line
> log messages would solve all my problems.


This should probably be caused by the same problem I answered in the
"pipe causing lockup" thread yesterday.

This patch should fix it (you can find it in the git repo):

commit 495bdc3690fe1c01ed95b29f16e97829444973ee
Author: Balazs Scheidler <bazsi at balabit.hu>
Date:   Mon Dec 7 13:36:30 2009 +0100

    The flow-control flag was sometimes enabled even if not requested by the user
    
    In case a final or fallback flag was enabled on a log statement, it could enable
    the flow-control on the same level.



-- 
Bazsi




More information about the syslog-ng mailing list