[syslog-ng] udp drops

Sat May 30 17:54:32 CEST 2009

Hi,

On Sat, May 30, 2009 at 4:57 PM, Jan Schaumann <jschauma at netmeister.org> wrote:
> Hello,
>
> I have a FreeBSD 6.2 (amd64) host where I'd like to replace the stock
> syslogd with syslog-ng (3.0.2).  This host receives a lot of syslog
> messages per second from a large number of clients via UDP.
>
> The stock syslogd configuration is trivial:
>
> *.*     /var/log/all
>
> This host currently drops about 2-4% of all UDP packets, syslog takes
> about 50-65% of one CPU.
>
> A drop-in replacement configuration using syslog-ng:
>
> options {
>        create_dirs(yes);
>        use_dns(no);
> };
>
> template t_default {
>        template("${DATE} <${FACILITY}.${PRIORITY}> ${HOST} ${MSG}\n");
> };
>
> source s_standard {
>        file("/dev/klog");
>        internal();
>        udp();
>        unix-dgram("/var/run/log");
> };
>
> destination d_all {
>        file("/var/log/all"
>                template(t_default)
>                );
> };
>
> # *.*           /var/log/all
> log {
>        source(s_standard);
>        destination(d_all);
> };
>
>
> syslog-ng uses about 90% of one CPU and drops between 15% and 20% of UDP
> packets (and, based on traffic patterns and logfile size, concurrent
> logfile rotation etc. even as high as 30%).

This is somewhat expected, as syslog-ng parses incoming messages. My
guess is that syslog-ng can't drain the receive buffer fast enough,
and the kernel simply drops the messages that don't fit in the buffer.

It would be good to know whether the source side or the destination
side is the limiting factor. As you're using local files, my guess is
the former.

> I've tried a number of things to improve this performance, including:
>
>       log_fetch_limit(100);

This option should be raised when you have a very busy source; it
controls how many messages syslog-ng tries to read from a source in
one loop iteration.
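
As a rough sketch, the limit can also be set on the UDP source itself
instead of globally. The value is only a placeholder to experiment
from, not a measured recommendation:

```
# read up to 1000 messages from the UDP socket per loop iteration
source s_standard {
       file("/dev/klog");
       internal();
       udp(log_fetch_limit(1000));
       unix-dgram("/var/run/log");
};
```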

>       log_iw_size(10000);
>       flush_lines(100000);
>       flush_timeout(10);

The flush* options are useful for tuning destination performance.
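
For illustration, they can also be set per destination. The numbers
are untested placeholders, and as far as I know flush_timeout() is
given in milliseconds:

```
# write in batches of 100 lines, or after one second if fewer
# lines have accumulated by then
destination d_all {
       file("/var/log/all"
               template(t_default)
               flush_lines(100)
               flush_timeout(1000)
               );
};
```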

> in the global options and
>
>        log_fifo_size(100000)
>
> in the destination definition
>
> with
>
>        flags(flow-control)
>
> in the log definition.

AFAIK with files/UDP flow-control is a no-op.

> The best I was able to get with these numbers was 5-7% of UDP drops (ie
> still double of what the stock syslogd drops).
>
> I also tried adjusting "so_rcvbuf" for UDP with no noticable difference.

Increasing the receive buffer size helps when there are only a few log
bursts/peaks. However, when it is seriously overloaded by incoming
traffic, syslog-ng can't drain the receive buffer fast enough and the
buffer ends up full; it's only a matter of time before this occurs.
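
For reference, a sketch of setting the receive buffer on the UDP
source. The 1 MiB value is an arbitrary example. The operating system
may cap the size it actually grants (on FreeBSD the
kern.ipc.maxsockbuf sysctl is, as far as I know, the ceiling), so a
large value here can silently have no effect:

```
# ask for a 1 MiB socket receive buffer on the UDP source
source s_standard {
       file("/dev/klog");
       internal();
       udp(so_rcvbuf(1048576));
       unix-dgram("/var/run/log");
};
```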

> Now consider that I did not do any sysctl tuning, as those should
> equally influence the stock syslog and I'm trying to sort out why one
> performs so significantly better than the other.

There is a vast difference in functionality and therefore in behaviour.
syslogd just transports the logs, while syslog-ng does a lot of
additional work (like parsing and filtering), which requires CPU cycles.

> Leaving aside any of the things I can do with syslog-ng further down the
> road (such as filtering and intentionally dropping certain messages),
> what can I do to get syslog-ng up to the same performance as the stock
> syslogd?

Unfortunately this can't happen. You can use the 'no-parse' flag to
skip the initial parsing of the messages, which could improve
performance. This means you can't use the template above, as the
variables won't get defined.
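
A minimal sketch of what that could look like; s_udp and d_raw are
names made up for the example. As far as I recall, with no-parse the
complete incoming line is kept as the message part, so the template
uses only ${MSG}. It is worth checking the resulting file format on a
test host first:

```
# take UDP messages without parsing them
source s_udp {
       udp(flags(no-parse));
};

# write each message out as received
destination d_raw {
       file("/var/log/all" template("${MSG}\n"));
};

log {
       source(s_udp);
       destination(d_raw);
};
```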

Generally, when it comes to parsing, syslog-ng can be CPU-limited. In
this case you should consider deploying multiple syslog servers and
sharing the load. Ideally, flow control could be turned on on the
client side as well (using TCP).
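
A rough sketch of the TCP variant, assuming the clients run syslog-ng
as well. loghost.example.com, s_local, d_loghost and s_tcp are made-up
names, and the connection limit is a placeholder:

```
# client: forward local logs over TCP with flow control
source s_local {
       unix-dgram("/var/run/log");
       internal();
};

destination d_loghost {
       tcp("loghost.example.com" port(514));
};

log {
       source(s_local);
       destination(d_loghost);
       flags(flow-control);
};

# server: accept the TCP connections; the default connection
# limit is low, so raise it for a large number of clients
source s_tcp {
       tcp(port(514) max_connections(500));
};
```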

Regards,

Sandor