[syslog-ng] syslog-ng hanging, bringing machine into trouble

Peter Bieringer pb@bieringer.de
Tue, 11 Feb 2003 23:39:48 +0100


Hi,

thanks for the fast answer and for the hints on which information
to collect at the next hang.


--On Tuesday, February 11, 2003 11:10:02 PM +0100 Roberto Nibali
<ratz@drugphish.ch> wrote:

>> Red Hat Linux 7.3 with all updates
>> Running kernel: currently 2.4.18-18.7.x extended with Openwall
>> patch
> 
> Is it reproducible without OWL? Only test it if you can easily do
> it; if it's a production machine, I suspect the downtime is too big
> to do heuristic tests.

Sorry, can't do that.


>> Feb 11 19:10:44 gromit syslog-ng[17700]: STATS: dropped 0
>> Feb 11 19:20:44 gromit syslog-ng[17700]: STATS: dropped 0
>> Feb 11 19:30:45 gromit syslog-ng[17700]: STATS: dropped 0 <--
>> Feb 11 20:00:47 gromit syslog-ng[6771]: syslog-ng version 1.5.26
>> starting
>> Feb 11 20:00:48 gromit syslog-ng: syslog-ng startup succeeded
>> Feb 11 20:00:48 gromit syslog-ng: klogd startup succeeded
> 
> You do not need to run klogd if you've configured syslog-ng
> accordingly unless you need address decoding.

Hmm, I'm using the supplied initscript to start syslog-ng:

start() {
        echo -n $"Starting system logger: "
        daemon syslog-ng $SYSLOGD_OPTIONS -f /etc/syslog-ng.conf
        RETVAL=$?
        echo
        echo -n $"Starting kernel logger: "
        daemon klogd $KLOGD_OPTIONS
        echo
        [ $RETVAL -eq 0 ] && touch /var/lock/subsys/syslog-ng
        return $RETVAL
}

Does this mean that starting klogd isn't required?
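
If so, I guess start() could be reduced to something like the
following untested sketch. START_KLOGD and LOCKFILE are names I made
up for illustration, and it assumes that /etc/syslog-ng.conf already
has a source reading /proc/kmsg, which I haven't verified here:

```shell
# Hypothetical variant of start(): klogd is launched only if
# START_KLOGD=yes. Assumption: syslog-ng itself reads kernel messages
# via a /proc/kmsg source in /etc/syslog-ng.conf.
# daemon() is expected to come from the distribution's init functions.
START_KLOGD=${START_KLOGD:-no}
LOCKFILE=${LOCKFILE:-/var/lock/subsys/syslog-ng}

start() {
        echo -n "Starting system logger: "
        daemon syslog-ng $SYSLOGD_OPTIONS -f /etc/syslog-ng.conf
        RETVAL=$?
        echo
        if [ "$START_KLOGD" = "yes" ]; then
                echo -n "Starting kernel logger: "
                daemon klogd $KLOGD_OPTIONS
                echo
        fi
        # The lock file reflects only the syslog-ng start result,
        # as in the original script.
        [ $RETVAL -eq 0 ] && touch "$LOCKFILE"
        return $RETVAL
}
```

With START_KLOGD left at "no", only syslog-ng would be started; setting
it to "yes" should give the current behaviour back.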


> Could you provide a snapshot of 'vmstat 1' output accompanying the
> 'hangs'?

Added to "emergency-case-todo-list" ;-)


>> The last few times I also saw some CROND entries in "ps -ax", one
>> with stat "D".
> 
> crond can be in D state sometimes, nothing to worry about ;).

But they are gone after stopping LDAP, and I see them every time a
hang occurs. The number of such cron processes also increases; it
looks like they cannot make progress while the machine hangs.


>> 2) is this a LDAP problem? I've already increased threads.
> 
> We need more information about the machine's health during the hang
> occurrence. netstat, sockstat, /proc/net/* information, ...

Also added to the list.


> Please provide us with a strace -f -v -i -t -p
> $PID_OF_HANGING_SYSLOG-NG when it happens the next time. Maybe we
> can see where it hangs exactly. It's always better than shooting
> holes into the dark by making questionable assumptions.

Sure, added as well.
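
So that nothing gets forgotten in the emergency case, I'll probably
put the requested commands into a small script. A rough sketch, not
yet tried during a real hang: the function names, output file names
and the VMSTAT_SAMPLES/STRACE_SECONDS variables are my own invention,
and the set of /proc/net files copied is just a guess at what is
useful:

```shell
# Sketch of a snapshot collector for the next hang (hypothetical names).
# Usage: collect_hang_info OUTDIR [PID_OF_HANGING_SYSLOG_NG]
VMSTAT_SAMPLES=${VMSTAT_SAMPLES:-10}
STRACE_SECONDS=${STRACE_SECONDS:-30}

# Run a command and save stdout+stderr; note it if the tool is missing.
save_output() {
        outfile=$1; shift
        if command -v "$1" >/dev/null 2>&1; then
                "$@" >"$outfile" 2>&1 || true
        else
                echo "$1: not installed" >"$outfile"
        fi
}

collect_hang_info() {
        outdir=$1
        pid=$2
        mkdir -p "$outdir" || return 1
        save_output "$outdir/vmstat.txt"  vmstat 1 "$VMSTAT_SAMPLES"
        save_output "$outdir/ps.txt"      ps -ax
        save_output "$outdir/netstat.txt" netstat -an
        # Copy a few /proc/net files, including sockstat.
        for name in sockstat dev tcp udp unix; do
                if [ -r "/proc/net/$name" ]; then
                        cat "/proc/net/$name" >"$outdir/proc_net_$name.txt" 2>/dev/null || true
                fi
        done
        # Attach strace to the hanging process for a limited time only.
        if [ -n "$pid" ] && command -v strace >/dev/null 2>&1; then
                strace -f -v -i -t -p "$pid" -o "$outdir/strace.txt" &
                strace_pid=$!
                sleep "$STRACE_SECONDS"
                kill "$strace_pid" 2>/dev/null || true
                wait "$strace_pid" 2>/dev/null || true
        fi
        return 0
}
```

It would be called e.g. as
collect_hang_info /root/hang-$(date +%Y%m%d-%H%M%S) "$PID"
with PID being the process ID of the hanging syslog-ng.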


>> BTW: is it OK that after syslog-ng restarts, crond doesn't log
>> anymore until it is restarted, too?
> 
> I would say no but I'm not sure here, I would also suspect it
> depends on the version of cron deployed on your machine.

vixie-cron-3.0.1-64

Would the output of "lsof | grep crond" help? I see only some libs,
pipes and sockets.


BTW: after the first hangs I also enabled UDP logging to a remote
host, but that logging stops during a hang, too.


        Peter
-- 
Dr. Peter Bieringer                     http://www.bieringer.de/pb/
GPG/PGP Key 0x958F422D               mailto: pb at bieringer dot de 
Deep Space 6 Co-Founder and Core Member  http://www.deepspace6.net/