[syslog-ng] Profiling syslog-ng

Balazs Scheidler bazsi at balabit.hu
Tue Feb 14 10:48:33 CET 2006


On Mon, 2006-02-13 at 10:57 -0500, John Morrissey wrote:
> On Sun, Feb 12, 2006 at 08:07:14PM +0100, Balazs Scheidler wrote:
> On Sun, Feb 12, 2006 at 02:53:55PM +0100, Balazs Scheidler wrote:
> > Strange, it's been a while since I last profiled syslog-ng. What were
> > your build options?
> 
> They're pretty stock. We used the Debian package for 1.6.9, adding 'debug'
> to DEB_BUILD_OPTIONS, which seems to use:
> 
> ./configure --prefix=/ --mandir=/usr/share/man --sysconfdir=/etc \
> 	--enable-debug
> 
> and '-g -Wall' for CFLAGS. We added -pg manually.

Those options should be OK. Maybe some kind of kernel patch (grsec or
something like that) is interfering?


> > io_iter basically traverses all registered fds and constructs a "struct
> > pollfd" array every time it is invoked. If there are a lot of fds, this
> > might be a problem.
> 
> I agree; poll(2) is fairly cheap(?), so regenerating the struct pollfd is a
> likely suspect. Could the struct pollfd be cached, regenerated every n msec?
> That way latency for new processes isn't too bad, but it keeps syslog-ng
> from spinning so much?
> 
> Or perhaps it could keep a struct pollfd that is only regenerated when file
> descriptors are opened/closed. Our processes are fairly long-lived, so this
> too could keep the struct from being regenerated so often. Even if logging
> processes are not long-lived, I doubt it would be regenerated more often
> than once every second or few.

The problem is that it is not so simple: this code is in libol, it is
generic, and the program relies on this behaviour in a lot of places.
(To request a read callback, the program simply changes the value of a
want_read field; no function is called to update the pollfd state.)

io_iter currently iterates through the registered fds three times:
first to call the prepare callbacks, then to count the final set of
pollfd structs, and then to fill in the array. We might spare one of
the loops (the counting pass, by growing the array with reallocs), but
the basic structure has to remain the same.
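
To illustrate the idea of dropping the counting pass, here is a minimal
sketch, not libol's actual code. The names fd_entry, poll_buf and
build_pollfds are hypothetical; only struct pollfd, POLLIN and POLLOUT
come from poll(2). It assumes each registered fd carries want_read and
want_write flags plus an optional prepare callback, as described above,
and it keeps one pollfd buffer between iterations, growing it with
realloc only when it runs out of room.

```c
#include <assert.h>
#include <poll.h>
#include <stdlib.h>

/* Hypothetical stand-in for a registered fd; not libol's real type. */
struct fd_entry {
    int fd;
    int want_read;   /* set directly by the program, no function call */
    int want_write;
    void (*prepare)(struct fd_entry *self);  /* may change the flags; may be NULL */
};

/* pollfd buffer that is kept between iterations and only ever grows. */
struct poll_buf {
    struct pollfd *fds;
    size_t cap;
};

/*
 * Pass 1 calls the prepare callbacks, pass 2 fills the buffer.
 * There is no separate counting pass: the buffer is doubled with
 * realloc when it is full. Returns the number of pollfd entries
 * written, or -1 if realloc fails.
 */
long build_pollfds(struct fd_entry *entries, size_t n, struct poll_buf *buf)
{
    size_t used = 0;

    assert(buf != NULL);
    assert(entries != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        if (entries[i].prepare)
            entries[i].prepare(&entries[i]);

    for (size_t i = 0; i < n; i++) {
        short events = 0;

        if (entries[i].want_read)
            events |= POLLIN;
        if (entries[i].want_write)
            events |= POLLOUT;
        if (!events)
            continue;  /* this fd wants nothing right now */

        if (used == buf->cap) {
            size_t new_cap = buf->cap ? buf->cap * 2 : 16;
            struct pollfd *p = realloc(buf->fds, new_cap * sizeof *p);

            if (!p)
                return -1;
            buf->fds = p;
            buf->cap = new_cap;
        }
        buf->fds[used].fd = entries[i].fd;
        buf->fds[used].events = events;
        buf->fds[used].revents = 0;
        used++;
    }
    return (long)used;
}
```

The result would be handed to poll(buf.fds, count, timeout). Every
iteration still walks all registered fds twice, so this only trims a
constant factor; the cost stays linear in the number of fds.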

I have now looked at the development branch of syslog-ng, and the
main loop in GLib basically does the same: it traverses the registered
sources multiple times for each iteration. (I have not looked that
closely, though; it probably spares the counting pass.)

So it seems that syslog-ng does not scale well to many parallel
connections and prefers load patterns with fewer connections and more
traffic on each of them.

I also looked at epoll now, but it is not a trivial change to convert
either libol or GLib to use it. (I looked at ivykis as well, but again
it would be a big change.)
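
For comparison, a minimal Linux-only sketch of why epoll does not drop
in easily: the interest set lives in the kernel, so every change that
is a plain flag assignment today would need an explicit epoll_ctl(2)
call. The helper set_interest and its "registered" parameter are
hypothetical and not part of libol, GLib or ivykis; only the
epoll_ctl call and the EPOLL_* constants are real.

```c
#include <assert.h>
#include <sys/epoll.h>

/*
 * Hypothetical helper: tell the kernel which events we want for fd.
 * 'registered' says whether fd was already added to epfd; the caller
 * has to track that, which poll(2) never required.
 * Returns 0 on success, -1 on error (errno set by epoll_ctl).
 */
int set_interest(int epfd, int fd, int want_read, int want_write,
                 int registered)
{
    struct epoll_event ev = {0};

    assert(epfd >= 0 && fd >= 0);

    if (want_read)
        ev.events |= EPOLLIN;
    if (want_write)
        ev.events |= EPOLLOUT;
    ev.data.fd = fd;

    return epoll_ctl(epfd, registered ? EPOLL_CTL_MOD : EPOLL_CTL_ADD,
                     fd, &ev);
}
```

With this model, each place that now just assigns want_read would have
to call something like set_interest instead, which is the invasive part
of the conversion.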

I'll look into removing one of the loops, which could help a bit.

-- 
Bazsi


