Hi,

I recently converted our RedHat EL servers (Release 3 Update 6) from the sysklogd package to syslog-ng. On some of the servers syslog-ng appears to hang occasionally, i.e. messages stop being written to the various log files. This has the knock-on effect that logins stop working, cron jobs hang, and so on, presumably because they are waiting to write a log message.

I read in the archives that this can be caused by something else using /dev/log, but there are no other syslogd or minilogd processes running, and lsof gives the following:

COMMAND    PID   USER  FD  TYPE  DEVICE      SIZE  NODE  NAME
syslog-ng  1384  root  5u  unix  0xf778c880        1703  /dev/log
syslog-ng  1384  root  6u  unix  0xf777e8c0        1715  /dev/log

I'm using the RedHat conf file bundled in the contribs directory that comes with the source code.

Any ideas would be welcome.

Thanks
Tony
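For anyone who wants to track the number of /dev/log sockets over time, here is a minimal sketch in Python that counts them in lsof-style text. The function name and the SAMPLE constant are made up for illustration; the sample rows are the ones from the lsof output in the message above. It assumes the command name is in the first column and the socket path is in the last one.

```python
def count_dev_log_sockets(lsof_output, command="syslog-ng"):
    """Count lsof rows for `command` whose NAME column is /dev/log."""
    count = 0
    for line in lsof_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if not fields or fields[0] == "COMMAND":
            continue
        # COMMAND is the first column and NAME is the last one.
        if fields[0] == command and fields[-1] == "/dev/log":
            count += 1
    return count


# Hypothetical sample, copied from the lsof output quoted in the thread.
SAMPLE = """\
COMMAND    PID  USER FD TYPE DEVICE     SIZE NODE NAME
syslog-ng  1384 root 5u unix 0xf778c880      1703 /dev/log
syslog-ng  1384 root 6u unix 0xf777e8c0      1715 /dev/log
"""

print(count_dev_log_sockets(SAMPLE))  # prints 2
```

A count that keeps growing between runs would suggest connections are piling up rather than being serviced, but what counts as "too many" depends on the host and is not something this sketch can decide.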
I'm seeing the same problem. When the TCP connection to a remote server that we send logs to is broken, presumably by a temporary DNS or network error, the sending server starts stacking up /dev/log connections until there are hundreds of them, and any program that attempts to write a log entry hangs, including logins.

I can recreate the problem at will by sending logs from a test sender to a test collector, then blocking traffic with iptables to simulate a network issue. The sending server never recovers the connections, even after the iptables rules are removed. syslog-ng has to be restarted on the sender to clear the problem; if it is not restarted, it eventually locks up the server. On critical servers I've had to write a "baby-sitter" process that watches for logging to stop and then restarts syslog-ng automatically.

I've already tried several variations of keep-alive, tcp-keep-alive, log_fifo_size, etc., to no avail. We were running 1.6.0rc3 and upgraded to 1.6.10 on a few servers in the hope of correcting it, but that hasn't helped.

Here are the simplified configs I'm using on the test servers:

-----------------------------------------------------------------
# Syslog-NG Test Sending Server

options {
    use_dns(no);
    use_fqdn(yes);
    sync(0);
    stats(3600);
    time_reopen(10);
    log_fifo_size(4096);
    log_msg_size(8192);
};

source s_local {
    internal();
    unix-stream("/dev/log" keep-alive(yes) max-connections(100));
    file("/proc/kmsg");
};

destination d_collector {
    tcp("testcollector.hertz.com" port(514) tcp-keep-alive(yes));
};

filter f_loc2 { facility(local2); };

log { source(s_local); filter(f_loc2); destination(d_collector); };
-----------------------------------------------------------------
# Syslog-NG Test Collector Server

options {
    use_dns(no);
    sync(0);
    stats(3600);
    time_reopen(10);
    log_fifo_size(4096);
    log_msg_size(8192);
};

source s_local {
    internal();
    unix-stream("/dev/log" keep-alive(yes) max-connections(100));
    file("/proc/kmsg");
};

source s_tcp {
    tcp(port(514) keep-alive(yes) tcp-keep-alive(yes) max-connections(1000));
};

filter f_loc2 { facility(local2); };

destination d_loc2 { file("/tmp/test-loc2.log"); };

log { source(s_tcp); filter(f_loc2); destination(d_loc2); };
-----------------------------------------------------------------

Thank you,

Chris Whipple
Sr. Security Analyst
Unix Security Group
The Hertz Corporation
5601 NW Expressway
Oklahoma City, OK 73132, USA
cwhipple@hertz.com
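The "baby-sitter" idea described above could be sketched roughly as follows. This is not the script used in the thread, only a minimal illustration in Python. It assumes that logging counts as stalled when a chosen log file has not been modified for some time. LOG_PATH, RESTART_CMD, the thresholds and all function names are assumptions, not values from the thread.

```python
import os
import subprocess
import time

# Assumptions for illustration only; neither value comes from the thread.
LOG_PATH = "/var/log/messages"
RESTART_CMD = ["/etc/init.d/syslog-ng", "restart"]


def seconds_since_write(path, now=None):
    """Return seconds since `path` was last modified, or None if it is missing."""
    try:
        mtime = os.stat(path).st_mtime
    except OSError:
        return None
    if now is None:
        now = time.time()
    return now - mtime


def logging_stalled(path, max_age, now=None):
    """Treat a missing file, or one older than `max_age` seconds, as stalled."""
    age = seconds_since_write(path, now)
    return age is None or age > max_age


def watch(path=LOG_PATH, max_age=600, interval=60, restart_cmd=RESTART_CMD):
    """Poll forever and run the restart command whenever logging looks stalled."""
    while True:
        if logging_stalled(path, max_age):
            subprocess.call(restart_cmd)
        time.sleep(interval)
```

Calling watch() starts the polling loop. Two caveats: a quiet host can look stalled simply because nothing was logged, so max_age has to be chosen with the normal message rate in mind, and the watchdog should not report through syslog itself, since that write could block in the same way as the programs it is guarding.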
On Tue, May 02, 2006 at 10:38:05AM +0100, Tony Davis wrote:
> Hi,
>
> I recently converted our RedHat EL servers (Release 3 Update 6) from the sysklogd package to syslog-ng. On some of the servers syslog-ng appears to hang occasionally, i.e. messages stop being written to the various log files. This has the knock-on effect that logins stop working, cron jobs hang, and so on, presumably because they are waiting to write a log message. I read in the archives that this can be caused by something else using /dev/log, but there are no other syslogd or minilogd processes running, and lsof gives the following:
>
> COMMAND    PID   USER  FD  TYPE  DEVICE      SIZE  NODE  NAME
> syslog-ng  1384  root  5u  unix  0xf778c880        1703  /dev/log
> syslog-ng  1384  root  6u  unix  0xf777e8c0        1715  /dev/log
>
> I'm using the RedHat conf file bundled in the contribs directory that comes with the source code.

A while back I had problems with a ppp interface coming up and down (when using ppp over ssh for a temporary VPN), because a Debian script in /etc/ppp/ip-up.d/ restarted postfix every time the interface came up. Something about the interaction between chrooted postfix and syslog-ng (and maybe my kernel version?) caused /dev/log to get hung up, and the system would become unusable. I simply removed the postfix restart script and all has been well ever since.

I have no idea what's causing your problem, but thought it worth mentioning what I'd seen in the past.

-- 
Nate

"The real question is not whether machines think but whether men do. The mystery which surrounds a thinking machine already surrounds a thinking man." - B. F. Skinner, Contingencies of Reinforcement

participants (3)
- Chris Whipple
- Nate Campi
- Tony Davis