[syslog-ng] Version 3.0.5 high system cpu time

Balazs Scheidler bazsi at balabit.hu
Mon Mar 22 19:59:48 CET 2010


On Mon, 2010-03-15 at 08:48 -0700, Evan Rempel wrote:
> We have been running a syslog-ng 2.0.x syslog server for
> a long time now, collecting messages from a few hundred systems.
> We have been very happy that the CPU for that process was running at less than 5%.
> 
> We are getting a version 3.0.5 system ready for deployment, and as a test,
> have configured our 2.0.x system to copy all of its logs to the 3.0.5 server.
> We have basically used the same configuration file from our 2.0.x server with
> minimal changes to make it a "@version: 3.0" config file.
> 
> 
> What I see as the major differences in this setup are:
> 
> 1. The new server is 64-bit Red Hat 5 Linux and the old one was 32-bit Red Hat 4.
> 
> 2. The new server sees one TCP connection with the full syslog stream where
>     the old server has approx 300 TCP connections with only a few messages on
>     each stream.
> 
> 
> Now for the bad stuff.
> 
> On the syslog-ng 3.0.5 system the process causes 85% CPU load (during light load periods)
> and most (80% of the 85%) of that is system time, not user time.
> 
> 
> Has anyone seen this type of behavior?
> 
> Can anyone give me some pointers to where this issue might be?
> 
> 
> I have a few ideas, but I didn't want to "lead the witness" :-)

I know about an issue not yet fixed in 3.0.5 and introduced somewhere
between 3.0.2 and 3.0.4.

Can you check whether you have a pipe() source or destination that refers
to a file which is not a named pipe?

For example, it is common to see something like pipe('/dev/console'),
but /dev/console is a character device. The solution is to use the
file() driver, which is meant to be used in these cases.
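
As a rough way to check which paths in a configuration are affected, the
sketch below tests whether a path is really a FIFO. It is not part of
syslog-ng; the helper name path_is_fifo is made up for this illustration,
and it only uses the standard stat() call and the S_ISFIFO macro.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper, not taken from syslog-ng.
 * Returns 1 if path is a FIFO (named pipe), 0 if it exists but is
 * some other file type, and -1 if stat() fails. */
int path_is_fifo(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return -1;
    return S_ISFIFO(st.st_mode) ? 1 : 0;
}

/* Prints a one-line verdict for a path named in a pipe() statement. */
void report_path(const char *path)
{
    int rc = path_is_fifo(path);

    if (rc < 0)
        printf("%s: cannot stat\n", path);
    else if (rc == 1)
        printf("%s: FIFO, pipe() is appropriate\n", path);
    else
        printf("%s: not a FIFO, consider file() instead\n", path);
}
```

Calling report_path("/dev/console") on a typical Linux system should
report that the path is not a FIFO, since it is a character device.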

The patch is really simple:

commit 81f27c22e3f6b0f8f6148bdc6e98d1491b80d128
Author: Balazs Scheidler <bazsi at balabit.hu>
Date:   Fri Mar 19 13:27:39 2010 +0100

    affile: fixed a possible infinite loop if pipe driver is used for a non-pipe device
    
    In this case the file descriptor representing the pipe is not initialized,
    but success is returned. This causes the main poll() call to return with a
    failure condition repeatedly, which in turn tops up the CPU at 100%

diff --git a/src/affile.c b/src/affile.c
index 15dbee0..41c71f3 100644
--- a/src/affile.c
+++ b/src/affile.c
@@ -64,6 +64,7 @@ affile_open_file(gchar *name, gint flags,
       g_process_cap_modify(CAP_DAC_READ_SEARCH, TRUE);
       g_process_cap_modify(CAP_SYS_ADMIN, TRUE);
     }
+  *fd = -1;
   if (stat(name, &st) >= 0)
     {
       if (is_pipe && !S_ISFIFO(st.st_mode))
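
To illustrate why a stray descriptor value can keep poll() spinning, here
is a small standalone sketch. It is not syslog-ng's main loop, and the
helper names poll_one and stale_fd are invented for this example. It
relies on two documented poll() behaviours: a descriptor that is not open
is reported immediately with POLLNVAL, while a negative descriptor is
ignored.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: polls one descriptor for input with a timeout in
 * milliseconds. Stores the reported events in *revents and returns the
 * value returned by poll(). */
int poll_one(int fd, int timeout_ms, short *revents)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    rc = poll(&pfd, 1, timeout_ms);
    *revents = pfd.revents;
    return rc;
}

/* Hypothetical helper: returns a descriptor number that is no longer
 * open, by creating a pipe and closing both ends. Returns -1 if pipe()
 * fails. Only meaningful in a single-threaded program. */
int stale_fd(void)
{
    int fds[2];

    if (pipe(fds) < 0)
        return -1;
    close(fds[0]);
    close(fds[1]);
    return fds[0];
}
```

With a stale descriptor, poll_one() returns at once with POLLNVAL set, so
a loop that simply polls again never sleeps. With fd set to -1, the entry
is skipped and the call just waits for the timeout.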




-- 
Bazsi



