<div>Hello,</div><div><br></div><div>I'm having problems with machines where every process that writes to /dev/log eventually hangs when syslog-ng 3.0.4 is configured with unix-dgram("/dev/log"). The servers run fine for a while and hum along as expected. Unfortunately the success does not last: after an unpredictable amount of time, various programs hang completely. If I already have a root shell open when this happens, I can kill syslog-ng, which frees up all the hung processes.</div>
<div><br></div><div>Repro'ing this is...well, annoying. I have 300+ servers running this build of syslog-ng fine, all using unix-stream(). The 4 servers that are locking up are the only ones I have running unix-dgram(). They are completely fresh Ubuntu 8.04 installs with syslog-ng 3.0.4, identical to all the other boxes aside from this one syslog-ng option. I also have strace output showing programs hanging when they try to write to /dev/log.</div>
<div><br></div><div>I'm currently reproducing it by running "while true ; do logger -p local0.info ...longest_message_possible... ; sleep 1s ; done". It's not an exact science, but I have managed to pile things up after just over 120 messages, or about two minutes. I can still hop around as root, but all programs that try to write to /dev/log pile up. After some rudimentary tests, the pile-up seems to depend on log size/throughput rather than on time, though it could be something random that happens to trigger it while my crappy tests are running. My next test will send small log messages in very rapid succession.</div>
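<div><br></div><div>A minimal sketch of how the rapid-succession test could be scripted, assuming Python is available on the affected box. The function name send_until_blocked and its parameters are made up for illustration. The socket timeout turns an indefinite hang into an exception, so the script reports how many datagrams went through before a write stalled. As far as I understand, a datagram socket has no connection handshake, but on Linux a send to an AF_UNIX datagram socket can still block while the receiver's queue is full, which would be consistent with writers piling up when the reader stops draining /dev/log.</div>

```python
import socket


def send_until_blocked(path, payload, limit=1000, timeout=2.0):
    """Send datagrams to a unix-dgram socket until one send stalls.

    Returns the number of datagrams accepted before a send timed out,
    or `limit` if every send completed in time.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sock.settimeout(timeout)  # a stalled send raises instead of hanging forever
    try:
        sock.connect(path)
        for sent in range(limit):
            try:
                sock.send(payload)
            except socket.timeout:
                return sent  # this many datagrams were accepted before the stall
        return limit
    finally:
        sock.close()


# Hypothetical usage against the real socket (not run here):
#   count = send_until_blocked("/dev/log", b"repro-test: " + b"x" * 50)
#   print("datagrams accepted before stall:", count)
```

<div>Varying the payload size between runs would show whether the stall tracks the number of messages or the number of bytes.</div>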
<div><br></div><div>I'm running:</div><div># uname -a</div><div>Linux tny0032 2.6.24-24-generic #1 SMP Tue Jul 7 19:10:36 UTC 2009 x86_64 GNU/Linux</div><div># cat /etc/debian_version</div><div>lenny/sid</div><div>(ubuntu 8.04)</div>
<div><br></div><div>Here's my source definition:</div><div><div># all known message sources</div><div>source s_all {</div><div> internal();</div><div> unix-dgram("/dev/log");</div><div> file("/proc/kmsg" program_override("kernel: "));</div>
<div>};</div><div><br></div><div><br></div><div>Here's some strace output that locks after trying to write to /dev/log:</div><div># strace su - lance</div><div>...</div><div>...</div>stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=118, ...}) = 0<br>
socket(PF_FILE, SOCK_DGRAM, 0) = 3<br>fcntl(3, F_SETFD, FD_CLOEXEC) = 0<br>connect(3, {sa_family=AF_FILE, path="/dev/log"}, 110) = 0<br>sendto(3, "<85>Dec 4 02:12:43 su[7086]: pa"..., 148, MSG_NOSIGNAL, NULL, 0</div>
<div><br></div><div># strace logger -p local0.info lalala</div><div>produces the same lock-point as above.</div><div><br></div><div><br></div><div>I thought dgram was supposed to be connectionless? I'm not sure how syslog-ng could be locking up resources. Has anyone seen this before? I will keep looking for a better repro case, but if anyone has any ideas, shout.</div>
<div><div><br></div><div>I am using unix-dgram solely because it does not start a new log entry at each newline. With unix-stream, lighttpd's multi-line log output was getting broken up into multiple syslog lines. That would have been fine, except that when the continuation lines are split off into new log entries, the $hostname and $program fields get stripped out, leaving me with just $date $msg. This basically negated the ability to filter and relay logs effectively. I can elaborate further if requested, but making unix-stream behave the same as unix-dgram with regard to multi-line log messages would solve all my problems.</div>
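<div><br></div><div>To illustrate the difference described above, here is a small sketch. It assumes the stream reader frames records on newlines, as the behaviour I'm seeing suggests; it is not syslog-ng's actual code. The function names and the lighttpd[123] message are made up for illustration.</div>

```python
import socket


def records_from_stream(data):
    """Split a byte stream into records the way a newline-framed reader would."""
    return [line for line in data.split(b"\n") if line]


def records_from_dgram(payload):
    """Send one datagram over a socketpair and return what the receiver sees."""
    tx, rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        tx.send(payload)
        # A datagram keeps its boundaries, so one send is one receive.
        return [rx.recv(65536)]
    finally:
        tx.close()
        rx.close()


# Hypothetical two-line message; only the first line carries the program tag.
message = b"lighttpd[123]: first line\nsecond line"
stream_records = records_from_stream(message)  # two records, second has no tag
dgram_records = records_from_dgram(message)    # one record, newline preserved
```

<div>In the stream case the second record arrives without any program tag, which is the same shape as the stripped entries I described.</div>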
<div><br></div><div>Thanks,</div></div><br>-Lance<br>