[syslog-ng] Profiling syslog-ng
John Morrissey
jwm at horde.net
Mon Feb 13 16:57:20 CET 2006
On Sun, Feb 12, 2006 at 08:07:14PM +0100, Balazs Scheidler wrote:
> Can you check if the attached patch decreases your load? It avoids
> getting back to the poll loop after every log message and loops on the
> currently processed fd until it either returns EAGAIN or it reaches a
> hard-coded number. (which is currently 30)
Unfortunately, it didn't help that much. I strace(1)ed it for a few minutes;
here are some read(2) statistics. It shows how many reads were done before
encountering an EAGAIN:
Count - # Reads
44268 - 1 reads
370 - 2 reads
279 - 3 reads
9 - 4 reads
6 - 5 reads
1 - 6 reads
[tail removed - ~34 lines with a count < 10]
Repeated reads might not make sense for our load pattern, since there are
many, many processes all syslogging a line or two at once. Few processes (as
you can see from the stats) syslog several messages in immediate succession.
On Sun, Feb 12, 2006 at 02:53:55PM +0100, Balazs Scheidler wrote:
> Strange, it's been a while since I last profiled syslog-ng. What were
> your build options?
They're pretty stock. We used the Debian package for 1.6.9, adding 'debug'
to DEB_BUILD_OPTIONS, which seems to use:
./configure --prefix=/ --mandir=/usr/share/man --sysconfdir=/etc \
--enable-debug
and '-g -Wall' for CFLAGS. We added -pg manually.
> It would be nice to know how blocking affects this, io_iter is the
> central poll loop and unless syslog-ng is spinning for some reason it
> should be really low in the profile. Hm... as I see you have
> max-connections set to 2000, how many connections are there in reality?
> Can you check lsof -p <syslog-ng pid> how many open fds syslog-ng is
> polling?
Currently, it has about 1,000 file descriptors open. We had to increase it
since we were hitting the stock limit (1024?) at peak times.
> io_iter basically traverses all registered fds and constructs a "struct
> pollfd" array every time it is invoked. If there are a lot of fds, this
> might be a problem.
I agree; poll(2) is fairly cheap(?), so regenerating the struct pollfd is a
likely suspect. Could the struct pollfd be cached, regenerated every n msec?
That way latency for new processes isn't too bad, but it keeps syslog-ng
from spinning so much?
Or perhaps it could keep a struct pollfd that is only regenerated when file
descriptors are opened/closed. Our processes are fairly long-lived, so this
too could keep the struct from being regenerated so often. Even if logging
processes are not long-lived, I doubt it would be regenerated more often
than once every second or few.
Here are some poll(2) stats. It seems that over half of poll(2) calls return
a single file descriptor with activity.
Count - # of descriptors returned by poll(2)
20140 - 1
8372 - 2
3941 - 3
1601 - 4
635 - 5
260 - 6
103 - 7
62 - 8
30 - 9
15 - 10
[tail removed - ~40 lines with a count < 10]
Thanks for your help!
john
--
John Morrissey _o /\ ---- __o
jwm at horde.net _-< \_ / \ ---- < \,
www.horde.net/ __(_)/_(_)________/ \_______(_) /_(_)__
More information about the syslog-ng
mailing list