[syslog-ng] syslog-ng takes 100% CPU when network fails

Balazs Scheidler bazsi at balabit.hu
Sun Oct 26 17:26:45 CET 2008


On Fri, 2008-10-24 at 05:50 +0000, D S, Manu (STSD) wrote:
> Hi,
> 
>         We are running syslog-ng 2.0.9 on an HP-UX 11.31 server. We have configured this system as a client that forwards logs to a remote server. When there is a network failure (simulated with ifconfig down), syslog-ng starts to consume CPU, and even after the network comes back it does not forward any log messages and continues to hog the CPU.
> 
>         We traced the system calls using tusc and found that "poll()" gets a "POLLERR" event on the TCP socket descriptor, but syslog-ng then makes no socket calls on that descriptor; it only calls "gettimeofday()".
> 
>         In the logs provided, the TCP connection to the server is dropped at 13:10:30. From that time on, poll() reports POLLERR on the TCP socket (fd=6) and syslog-ng starts looping on gettimeofday(). Attached are the sar, netstat and tusc logs.

First of all, thanks for the detailed error report.

As far as I can see, the problem is caused by HP-UX returning POLLERR
alone, without any of the other bits (e.g. POLLHUP). syslog-ng would
handle this gracefully if one of the other bits were set, or if there
were pending messages to send, in which case a normal write() error
would occur.
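
To make the case analysis concrete, here is a minimal standalone sketch
of the decision, not the actual syslog-ng code. The enum and function
names are hypothetical, and it assumes the hang-up case is checked
before anything else. The point is only that a bare POLLERR with an
empty queue needs a branch of its own; otherwise nothing clears the
condition and poll() returns immediately again.

```c
#include <assert.h>
#include <poll.h>

/* Hypothetical names for illustration; these are not syslog-ng's. */
enum writer_action {
    WRITER_CLOSE,   /* peer hung up: close the connection */
    WRITER_BROKEN,  /* poll() reported an error: tear down and reconnect */
    WRITER_FLUSH,   /* messages queued: write() will surface any error */
    WRITER_IDLE     /* nothing to do, go back to poll() */
};

/* Pick an action from the poll() result bits and the queue state.
 * Without the POLLERR branch, a bare POLLERR on an idle writer would
 * fall through to WRITER_IDLE, and the next poll() would return at
 * once with the same event, which gives the busy loop. */
enum writer_action
writer_action_for(short revents, int has_pending)
{
    if (revents & POLLHUP)
        return WRITER_CLOSE;
    if (revents & POLLERR)
        return WRITER_BROKEN;
    if (has_pending)
        return WRITER_FLUSH;
    return WRITER_IDLE;
}

/* Small self-check of the cases discussed above. */
void
writer_action_self_check(void)
{
    assert(writer_action_for(POLLERR, 0) == WRITER_BROKEN);
    assert(writer_action_for(POLLERR | POLLHUP, 0) == WRITER_CLOSE);
    assert(writer_action_for(0, 1) == WRITER_FLUSH);
    assert(writer_action_for(0, 0) == WRITER_IDLE);
}
```

Calling writer_action_self_check() from a test program exercises the
four cases; the first one is the HP-UX situation from the report.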

This patch should fix the problem, although I have only compile-tested it.
I'd appreciate it if you could test this patch in your environment.

diff --git a/src/logwriter.c b/src/logwriter.c
index bb82b43..7a5fcf7 100644
--- a/src/logwriter.c
+++ b/src/logwriter.c
@@ -139,6 +139,13 @@ log_writer_fd_dispatch(GSource *source,
       log_writer_broken(self->writer, NC_CLOSE);
       return FALSE;
     }
+  else if (self->pollfd.revents & (G_IO_ERR))
+    {
+      msg_error("POLLERR occurred while idle",
+                evt_tag_int("fd", self->fd->fd),
+                NULL);
+      log_writer_broken(self->writer, NC_WRITE_ERROR);
+    }
   else if (self->writer->queue->length || self->writer->partial)
     {
       if (!log_writer_flush_log(self->writer, self->fd))





-- 
Bazsi



