[syslog-ng] Setting UDP Receive Buffer size, then when increase performance problems

Balazs Scheidler bazsi at balabit.hu
Mon Jul 31 09:50:54 CEST 2006


On Fri, 2006-07-28 at 13:44 -0500, James wrote:
> SUMMARY:

> DETAILS:
> 
> We have an HP-UX system that is regularly receiving 6,700 log entries per 
> second from several systems via UDP.  Originally we tried handling this 
> workload with one daemon, but it was CPU constrained because it is single 
> threaded.  We then split the work among 3 daemons each on their own 
> virtual IP so we could utilize the other CPUs in the system.  We also 
> increased socket_udp_rcvbuf_default (default receive buffer size for UDP) 
> from 65K to between 75 MB and 150 MB.  It would be nice if the syslog-ng 
> config file allowed us to set this value instead of impacting ALL UDP 
> listeners on the system.  I have edited our copy of io.c to include 
> something like the following.  This example is crude, but seems to work 
> for us, setting a 75 MB buffer (I hard-coded the value, but obtaining it 
> from the config file would be a better solution):
> 
>     int Buffer = 78643199;
>     socklen_t Len = sizeof(Buffer);
> 
>     if ( (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &Buffer, Len)) == 0 ) {
>       Buffer=0;
>       getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &Buffer, &Len);
>       debug("io.c: SO_RCVBUF now set to: %i\n", Buffer);
>     }
>     else {
>       werror("io.c: setsockopt() failed setting SO_RCVBUF (errno %i), %z\n", errno, strerror(errno));
>     }

Such an option was added to syslog-ng 2.0.x as so_rcvbuf().
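
For readers who want to see what a per-socket setting amounts to at the
socket level, here is a minimal sketch in the spirit of the patch quoted
above. It is not the syslog-ng implementation; the function name
set_rcvbuf() is hypothetical, and the requested size is assumed to come
from wherever the caller keeps its configuration. The kernel may clamp or
round the request, so the sketch reads the value back and returns what was
actually granted.

```c
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: request a receive buffer of `requested` bytes on
 * one socket only, then return the size the kernel actually granted.
 * Returns -1 if either socket call fails. */
int set_rcvbuf(int fd, int requested)
{
    int effective = 0;
    socklen_t len = sizeof(effective);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &requested, sizeof(requested)) != 0) {
        perror("setsockopt(SO_RCVBUF)");
        return -1;
    }
    /* The granted size can differ from the request, so read it back. */
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &effective, &len) != 0) {
        perror("getsockopt(SO_RCVBUF)");
        return -1;
    }
    return effective;
}
```

Comparing the return value against the request is a cheap way to notice
that a system-wide maximum is capping the buffer.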

> 
> 
> However, this can lead to another problem with syslog-ng (at least on 
> HP-UX).  Once the buffer starts to fill, the OS starts spending a LOT of 
> CPU in system time.  This appears to be caused by the poll() event flags 
> selected when _INCLUDE_XOPEN_SOURCE_EXTENDED is defined at compile time. 
> On HP-UX 11.11 poll.h sets POLLIN to (POLLRDNORM|POLLRDBAND) when 
> _INCLUDE_XOPEN_SOURCE_EXTENDED is defined.  It is my understanding that 
> this means the OS will walk the entire buffer looking for a priority 
> message when syslog-ng calls poll() to look for the next packet.  Once we 
> increase the buffer past the 65K default in HP-UX, this walk can take 
> some time and lots of system CPU time (I would hate to see this if the 
> buffer was set to the 2GB maximum).  Overriding POLLIN with the value 
> poll.h would use if _INCLUDE_XOPEN_SOURCE_EXTENDED was NOT defined 
> (0x0001) seems to resolve this problem, and actually allows syslog-ng to 
> empty out this large UDP buffer space very quickly.

That's interesting, as UDP does not really support OOB data. As I see it,
there is no point in iterating over the whole buffer for data that might
not exist.
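
As an illustration of the workaround described above, a poll() call can
ask for normal-priority data only, instead of relying on whatever POLLIN
expands to on the platform. This is a minimal sketch, not syslog-ng code;
the function name wait_normal_data() is hypothetical, and whether
requesting POLLRDNORM alone avoids the buffer walk on a given HP-UX
release is an assumption based on the report above.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* some libcs only expose POLLRDNORM with this */
#endif
#include <poll.h>
#include <sys/socket.h>

/* Hypothetical helper: wait up to timeout_ms for normal-priority data.
 * Returns 1 if fd is readable, 0 on timeout, -1 on error. */
int wait_normal_data(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLRDNORM;    /* deliberately not POLLIN or POLLRDBAND */
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;              /* poll() itself failed */
    if (rc == 0)
        return 0;               /* timed out, nothing to read */
    /* Anything other than normal data here is POLLERR/POLLHUP/POLLNVAL. */
    return (pfd.revents & POLLRDNORM) ? 1 : -1;
}
```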

> 
> Another solution (which looks like it may have been discussed in a post 
> from John Morrissey) may be to change the way syslog-ng loops on the UDP 
> socket. It appears to me that syslog-ng is currently doing:
>    poll()
>    read()
>    Back to poll()
> 
> This delay might be incurred only once if syslog-ng were to do 
> something like:
>    poll()
>    read() with O_NONBLOCK set
>    if return -1 with EAGAIN, Back to poll()
>    else Back to read() with O_NONBLOCK set

This behaviour was added in 1.6.10, which had a platform-specific bug on
Linux that was fixed in 1.6.11.
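
The loop proposed above can be sketched as follows. This is only an
illustration of the technique, not the code that went into 1.6.10; the
names set_nonblocking() and drain_socket() are hypothetical, and the
per-wakeup budget is an addition of this sketch, shown as one common way
to keep a single busy socket from starving other listeners.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* After poll() reports fd readable, keep reading until the socket is
 * empty or `budget` datagrams have been handled, then return to poll().
 * Returns the number of datagrams read, or -1 on a real error. */
int drain_socket(int fd, int budget)
{
    char buf[8192];
    int handled = 0;

    while (handled < budget) {
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;                   /* interrupted, retry */
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                break;                      /* empty: back to poll() */
            return -1;
        }
        /* A real daemon would parse and queue the message here. */
        handled++;
    }
    return handled;
}
```

With this shape, poll() is called once per burst rather than once per
datagram, and the budget bounds how long one socket can hold the loop.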

An option to add slight delays to the processing loop, time_sleep(), was
also added in 1.6.10, and it also increases performance. (I have written
a blog entry about this.)

> 
> I am not sure what impact this would have if syslog-ng was receiving data 
> from more than one UDP socket (which is what we do), as it may spend too 
> much time emptying the one buffer while the other data listeners 
> wait.
> 
> 
> Have concerns like this already been addressed in more current releases? 
> Are the issues I have seen on HP-UX also concerns for other operating 
> systems, or should I just continue to modify the standard syslog-ng code 
> as there would be little to no benefit for others?

I think most of the changes that you made are already incorporated,
albeit some of them only in the 2.0.x tree, which is at 2.0rc1 now. The
2.0 tree should be better performance-wise, though I have received a
memory leak report that I have not looked into yet.

1.6.11 supports everything except the SO_RCVBUF setting.

-- 
Bazsi


