Re: [syslog-ng] UDP packet loss with syslog-ng

16 Oct 2010

On Fri, 2010-10-15 at 16:39 -0400, Lars Kellogg-Stedman wrote:
...
Hello all,
I'm deploying syslog-ng 3.0.8 on a quad-core 2.4GHz system with 4GB of
memory.  Using stock kernel settings (e.g., without adjusting
net.core.rmem_default), we're not able to handle much more than 100
messages/second (generated from a remote host using the "loggen"
tool).  At 500 msg/sec (-r 500), we see about 50% loss, and at 1000
msg/sec, we see closer to 60% packet loss.
Our configuration looks approximately like this (template definitions
elided for brevity):
options {
        time_reap(30);
        mark_freq(10);
        keep_hostname(yes);
        use_fqdn(yes);
        dns_cache(2000);
        dns_cache_expire(86400);
};
source s_network {
        udp();
        tcp(port(514));
};
destination d_syslog {
        file("/srv/syslog/bydate/$YEAR-$MONTH-$DAY/messages"
                template(t_daily_log)
                create_dirs(yes)
        );
        file("/srv/syslog/byhost/$FULLHOST_FROM/$YEAR-$MONTH-$DAY"
                template(t_host_log)
                create_dirs(yes)
        );
};
log {
        source(s_network);
        destination(d_syslog);
};
I didn't think these message rates were terribly high, so I was
surprised at the loss.  We've confirmed that the loss is entirely
between the kernel and the application: using wireshark, we've
verified that all of the packets are arriving at the host, and when
we read the same port with this instead of syslog-ng:
awk '{print}' /inet/udp/514/0/0 > out
our packet loss is < 1%.
If I raise the rmem settings like this:
net.core.rmem_default = 512000
net.core.rmem_max = 1024000
then it looks like I can support message rates around 1000 msgs/sec.
If I try with 2000 msgs/sec, the loss rate jumps up again (to around
30%).
Do these numbers make sense?  This is an unloaded server.  The only
log traffic hitting this system is from my loggen runs.  The
filesystem is ext3 on top of a hardware RAID5 array.  I've tried
fiddling with some of the syslog-ng global options (e.g.,
flush_lines(), log_fetch_limit()), but without having much impact on
performance.
I would appreciate any help you can send our way.  Thanks!

Hmm. The numbers you are seeing are indeed low; with sufficient buffer
sizes I could get up to the 20k messages/sec range with syslog-ng,
although it's been a while since I last tested it.

What I'd recommend is to calculate how many _bytes_ per second the
message rate you are generating amounts to.

If you generate 2000 messages per second, 300 bytes each (loggen
default IIRC), that's 600000 bytes every second. syslog-ng is
single-threaded, so the latency of writing to the disk applies: while
it is busy writing out messages, it may take some time before
syslog-ng gets back to reading its source. This is the #1 reason why I
want to work on multithreading. With a flow-controlled source,
syslog-ng is able to do about 70-75k msg/sec. But not with UDP.
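
To make the arithmetic above concrete, here is a minimal Python sketch. It assumes the 300-byte message size quoted above and the 512000-byte rmem_default from the original message; the function names are made up for illustration. It estimates how long syslog-ng can be busy with disk writes before a receive buffer of a given size overflows. The estimate is optimistic, because the kernel also charges per-packet bookkeeping overhead against the buffer.

```python
# Rough arithmetic only; the 300-byte message size is the figure quoted
# in the email ("loggen default IIRC"), not something measured here.
MSG_SIZE_BYTES = 300


def bytes_per_second(msgs_per_sec: int, msg_size: int = MSG_SIZE_BYTES) -> int:
    """Payload bytes arriving per second at a given message rate."""
    return msgs_per_sec * msg_size


def stall_budget_seconds(buffer_bytes: int, rate_bytes_per_sec: int) -> float:
    """How long the reader can stay away before the buffer overflows.

    Optimistic: assumes the whole buffer holds payload and ignores the
    kernel's per-packet bookkeeping overhead.
    """
    return buffer_bytes / rate_bytes_per_sec


if __name__ == "__main__":
    # 512000 is the rmem_default value tried in the original message.
    for rate in (500, 1000, 2000):
        bps = bytes_per_second(rate)
        budget = stall_budget_seconds(512000, bps)
        print(f"{rate} msg/s = {bps} bytes/s; 512000-byte buffer lasts {budget:.2f} s")
```

At 2000 msg/sec a 512000-byte buffer covers less than one second of stalled reading, which fits the loss pattern described in the original message.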

In order to improve the numbers, I'd:

1) increase the receive buffer size so it can hold 3-5 seconds' worth
of traffic (e.g. 3-5MB, not just 0.5MB)

2) increase log_fetch_limit() to a larger value; this controls how many
messages syslog-ng fetches in each poll iteration. Increase this to
300-500

3) increase log_fifo_size() for the destination so that it covers the
sum of the fetch limits of all sources feeding that destination (so if
you have two sources, each with a fetch limit of 1000, then the
destination queue should be _at least_ 2000, preferably rounded up to
the next order of magnitude, e.g. with 2x1000 fetch limits, increase
the fifo to 10000)
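
The sizing rules in the list above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the 600000 bytes/sec rate is the figure from earlier in this message, and "next order of magnitude" is interpreted here as the smallest power of ten that is at least the summed fetch limits.

```python
from typing import Iterable, Tuple


def recv_buffer_bytes(rate_bytes_per_sec: int, seconds: float) -> int:
    """Receive buffer needed to absorb `seconds` of traffic (payload only)."""
    return int(rate_bytes_per_sec * seconds)


def suggested_fifo_size(fetch_limits: Iterable[int]) -> Tuple[int, int]:
    """Return (minimum, preferred) log_fifo_size() for one destination.

    minimum   = sum of log_fetch_limit() over all sources feeding it
    preferred = smallest power of ten >= minimum (an interpretation of
                "rounded to the next order of magnitude")
    """
    minimum = sum(fetch_limits)
    preferred = 1
    while preferred < minimum:
        preferred *= 10
    return minimum, preferred


if __name__ == "__main__":
    rate = 600000  # bytes/sec: 2000 msg/s at 300 bytes each
    for secs in (3, 5):
        print(f"{secs} s of traffic: {recv_buffer_bytes(rate, secs)} bytes")
    print("two sources at 1000 each:", suggested_fifo_size([1000, 1000]))
```

Three to five seconds at this rate works out to 1.8-3.0MB of payload, so the 3-5MB suggestion leaves some headroom for kernel overhead. On the syslog-ng side the per-socket buffer is requested with the so_rcvbuf() source option, and the kernel caps that request at net.core.rmem_max, so both need raising together.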

You haven't included in your email whether syslog-ng itself is dropping
messages, or the kernel. netstat drop counts or syslog-ng statistics
should help decide that.
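
One way to read the kernel-side drop count is from the Udp lines of /proc/net/snmp on Linux, where the RcvbufErrors column counts datagrams discarded because a socket's receive buffer was full. A minimal parsing sketch follows; the function names are hypothetical and the sample text holds made-up values purely to show the file layout, not measurements from any system.

```python
from typing import Dict

# Made-up sample in the layout of /proc/net/snmp: a header line naming
# the columns, then a line of values. These are NOT real measurements.
SAMPLE = """\
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 1000 2 500 10 500 0
"""


def udp_counters(snmp_text: str) -> Dict[str, int]:
    """Parse the 'Udp:' header/value line pair into a name -> count dict."""
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Udp:")]
    header, values = rows[0], rows[1]
    # Skip the leading "Udp:" tag on both lines.
    return dict(zip(header[1:], (int(v) for v in values[1:])))


def read_udp_counters(path: str = "/proc/net/snmp") -> Dict[str, int]:
    """Read the live counters (Linux only)."""
    with open(path) as f:
        return udp_counters(f.read())


if __name__ == "__main__":
    counters = udp_counters(SAMPLE)
    print("RcvbufErrors:", counters["RcvbufErrors"])
```

Calling read_udp_counters() before and after a loggen run and comparing RcvbufErrors shows how many datagrams the kernel dropped; if that difference accounts for the missing messages, the loss is happening in the socket buffer rather than inside syslog-ng.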

-- 
Bazsi

Balazs Scheidler