[syslog-ng] Lock up problems when using unix-dgram("/dev/log") type on

Lance Laursen lance at demonware.net
Fri Dec 4 05:05:34 CET 2009


Hello,

I'm having problems with machines eventually hanging on all processes that
write to /dev/log when using unix-dgram("/dev/log") with Syslog-NG 3.0.4.
The servers run fine for a while and hum along as expected. Unfortunately
the success does not last, with various programs completely hanging after an
undetermined time. Having an existing root shell as this happens allows me
to kill syslog-ng, freeing up all locks.

Repro'ing this is...well, annoying. I have 300+ servers running this build
of syslog-ng fine, all using unix-stream(). The 4 servers that are locking
up are the only ones I have running unix-dgram(). Completely fresh ubuntu
8.04 installs with syslog-ng 3.0.4, identical to all other boxes aside from
the one syslog-ng option. I've got strace output that is hanging after
programs try to write to /dev/log as well.

I'm currently doing a repro by running "while true ; do logger -p
local0.info ...longest_message_possible... ; sleep 1s ; done" in non-exact
science and have managed to pile things up after just over 120 messages, or
two minutes. I can still hop around as root, but all programs that try to
write to /dev/log pile up. The pile up seems to be log-size/throughput
based, not time-based after some rudimentary tests - though it could be
something random that is triggering it while my crappy tests are running. My
next test plans to have small log messages in very rapid succession.

I'm running:
# uname -a
Linux tny0032 2.6.24-24-generic #1 SMP Tue Jul 7 19:10:36 UTC 2009 x86_64
GNU/Linux
# cat /etc/debian_version
lenny/sid
(ubuntu 8.04)

Here's my source definition:
# all known message sources
source s_all {
        internal();
        unix-dgram("/dev/log");
        file("/proc/kmsg" program_override("kernel: "));
};


Here's some strace output that locks after trying to write to /dev/log:
# strace su - lance
...
...
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
socket(PF_FILE, SOCK_DGRAM, 0) = 3
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
connect(3, {sa_family=AF_FILE, path="/dev/log"}, 110) = 0
sendto(3, "<85>Dec 4 02:12:43 su[7086]: pa"..., 148, MSG_NOSIGNAL, NULL, 0

# strace logger -p local0.info lalala
produces the same lock-point as above.


I thought dgram should be connectionless? I'm not sure how syslog-ng could
be locking up resources. Has anyone seen this before? I will continue
looking for a better repro case, if anyone has any ideas though shout.

I am using unix-dgram solely because it does not break to a new log entry on
NewLines. I was encountering a problem where, using unix-stream, lighttpd's
multi-line log output was getting broken up into multiple syslog lines. This
would have been fine, except when the new line is broken out and made into
new log entries, the $hostname and $program fields get stripped out, leaving
me with just $date $msg. This basically negated the ability to filter and
relay logs effectively. I can elaborate further here if requested, but
making unix-stream behave the same as unix-dgram with regards to multi-line
log messages would solve all my problems.

Thanks,

-Lance
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.balabit.hu/pipermail/syslog-ng/attachments/20091203/41831f61/attachment.htm 


More information about the syslog-ng mailing list