Hi,
On Sat, May 30, 2009 at 4:57 PM, Jan Schaumann <jschauma@netmeister.org> wrote:
> Hello,
> I have a FreeBSD 6.2 (amd64) host where I'd like to replace the stock syslogd with syslog-ng (3.0.2). This host receives a lot of syslog messages per second from a large number of clients via UDP.
> The stock syslogd configuration is trivial:
> *.* /var/log/all
> This host currently drops about 2-4% of all UDP packets; syslogd takes about 50-65% of one CPU.
> A drop-in replacement configuration using syslog-ng:
> options { create_dirs(yes); use_dns(no); };
> template t_default { template("${DATE} <${FACILITY}.${PRIORITY}> ${HOST} ${MSG}\n"); };
> source s_standard { file("/dev/klog"); internal(); udp(); unix-dgram("/var/run/log"); };
> destination d_all { file("/var/log/all" template(t_default) ); };
> # *.* /var/log/all
> log { source(s_standard); destination(d_all); };
> syslog-ng uses about 90% of one CPU and drops between 15% and 20% of UDP packets (and, depending on traffic patterns, logfile size, concurrent logfile rotation etc., even as high as 30%).
This is somewhat expected, as syslog-ng parses incoming messages. So my guess is that syslog-ng can't drain the receive buffer fast enough, and the kernel simply drops messages that don't fit in the buffer. It would be good to know whether the source side or the destination side is the limiting factor. As you're using local files, my guess is the former.
> I've tried a number of things to improve this performance, including:
> log_fetch_limit(100);
This option should be raised when you've got a very busy source; it controls how many messages syslog-ng tries to read in one loop iteration.
> log_iw_size(10000); flush_lines(100000); flush_timeout(10);
The flush* options are useful to tune destination performance.
> in the global options and
> log_fifo_size(100000)
> in the destination definition
> with
> flags(flow-control)
> in the log definition.
AFAIK, with files/UDP, flow-control is a no-op.
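For reference, the settings quoted above, put together in one place, would look roughly like the sketch below. It is assembled only from the fragments in this thread and reuses the names s_standard, d_all and t_default from the configuration quoted earlier; it has not been tested, and the values are the poster's, not recommendations.

```
# Sketch only: the tuning options quoted in this thread, collected in one place.
options {
    create_dirs(yes);
    use_dns(no);
    log_fetch_limit(100);   # messages read per loop iteration
    log_iw_size(10000);
    flush_lines(100000);    # destination-side tuning
    flush_timeout(10);
};

destination d_all {
    file("/var/log/all" template(t_default) log_fifo_size(100000));
};

# flags(flow-control) is included as quoted, though it likely has no effect
# with a UDP source and a file destination.
log { source(s_standard); destination(d_all); flags(flow-control); };
```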
> The best I was able to get with these numbers was 5-7% of UDP drops (i.e. still double what the stock syslogd drops).
> I also tried adjusting "so_rcvbuf" for UDP with no noticeable difference.
Increasing the receive buffer size helps when there are only occasional log bursts/peaks. However, when it is seriously overloaded by incoming traffic, syslog-ng can't drain the receive buffer fast enough and the buffer ends up full; it's only a matter of time before this occurs.
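As an illustration of where these two knobs go, here is a minimal sketch that sets them on the udp() source itself, reusing the s_standard source from the configuration quoted earlier. The 1048576 value is an arbitrary example, not a recommendation, and the kernel may cap the effective size (on FreeBSD, I believe the kern.ipc.maxsockbuf sysctl is the relevant limit).

```
# Sketch only: per-source receive buffer and fetch limit.
source s_standard {
    file("/dev/klog");
    internal();
    # 1048576 is an example value; it only helps absorb short bursts.
    udp(so_rcvbuf(1048576) log_fetch_limit(100));
    unix-dgram("/var/run/log");
};
```

A bigger buffer only buys time during peaks; under sustained overload it still fills up and the drops resume.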
> Now consider that I did not do any sysctl tuning, as those should equally influence the stock syslog and I'm trying to sort out why one performs so significantly better than the other.
There is a vast difference in functionality and therefore in behaviour. syslogd just transports the logs, while syslog-ng does a lot of additional work (such as parsing and filtering), which requires CPU cycles.
> Leaving aside any of the things I can do with syslog-ng further down the road (such as filtering and intentionally dropping certain messages), what can I do to get syslog-ng up to the same performance as the stock syslogd?
Unfortunately, this can't happen. You can use the 'no-parse' option to skip the initial parsing of messages, which could improve performance. This means you can't use the template above, as the variables won't get defined. Generally, when it comes to parsing, syslog-ng can be CPU-limited. In that case you should consider deploying multiple syslog servers and sharing the load. Ideally, flow control could be turned on on the client side as well (using TCP).
Regards,
Sandor
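For reference, the 'no-parse' suggestion above could look roughly like the following sketch. The names s_net, t_raw and d_raw are made up for illustration. As far as I understand the documentation, with no-parse the whole incoming line ends up in ${MSG}, so the template writes only that macro instead of the parsed fields. Untested.

```
# Sketch only: receive UDP messages without parsing them.
source s_net { udp(flags(no-parse)); };

# Only ${MSG} is used here, since the parsed fields are not available.
template t_raw { template("${MSG}\n"); };

destination d_raw { file("/var/log/all" template(t_raw)); };

log { source(s_net); destination(d_raw); };
```

Whether this gets close to the stock syslogd's drop rate would have to be measured on the host in question.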