Hi Lars, try increasing your UDP receive buffers as outlined here:
http://nms.gdd.net/index.php/Install_Guide_for_LogZilla_v3.0#UDP_Buffers
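
To see what a UDP socket actually gets before and after the sysctl change, a few lines of Python will do. This is only a sketch and not part of the guide above; the helper name effective_rcvbuf is made up for illustration. On Linux an explicit SO_RCVBUF request is capped at net.core.rmem_max, and the value reported back is roughly double the request because the kernel adds room for its own bookkeeping.

```python
import socket


def effective_rcvbuf(requested=None):
    """Return the receive buffer size (bytes) the kernel reports for a fresh UDP socket.

    With requested=None the socket keeps the system default
    (net.core.rmem_default on Linux). Otherwise the size is requested
    via SO_RCVBUF and the kernel may cap or adjust it.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        if requested is not None:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


print("default:", effective_rcvbuf())
print("after requesting 1024000:", effective_rcvbuf(1024000))
```

If the second number does not grow after you raise the sysctl values, the new limits are not in effect yet. As far as I recall, the udp() source in syslog-ng also takes a so_rcvbuf() option to request a larger buffer per source; check the documentation for your version.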

______________________________________________________________

Clayton Dukes
______________________________________________________________


On Fri, Oct 15, 2010 at 4:39 PM, Lars Kellogg-Stedman <lars@oddbit.com> wrote:
Hello all,

I'm deploying syslog-ng 3.0.8 on a quad-core 2.4 GHz system with 4 GB of
memory.  Using stock kernel settings (e.g., without adjusting
net.core.rmem_default), we're not able to handle much more than 100
messages/second (generated from a remote host using the "loggen"
tool).  At 500 msg/sec (-r 500), we see about 50% loss, and at 1000
msg/sec, we see closer to 60% packet loss.

Our configuration looks approximately like this (template definitions
elided for brevity):

 options {
         time_reap(30);
         mark_freq(10);
         keep_hostname(yes);
         use_fqdn(yes);
         dns_cache(2000);
         dns_cache_expire(86400);
 };

 source s_network {
         udp();
         tcp(port(514));
 };

 destination d_syslog {
         file("/srv/syslog/bydate/$YEAR-$MONTH-$DAY/messages"
                 template(t_daily_log)
                 create_dirs(yes)
                 );
         file("/srv/syslog/byhost/$FULLHOST_FROM/$YEAR-$MONTH-$DAY"
                 template(t_host_log)
                 create_dirs(yes)
                 );
 };

 log {
         source(s_network);
         destination(d_syslog);
 };

I didn't think these message rates were terribly high, so I was
surprised at the loss.  We've confirmed that the loss is entirely
between the kernel and the application -- using wireshark, we've
verified that all of the packets are arriving at the host, and using
this gawk one-liner as the receiver:

 awk '{print}' /inet/udp/514/0/0 > out

our packet loss is < 1%.
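
The same receiver-side check can be done without gawk. The following is a minimal sketch, assuming Python is available on the log host; count_datagrams is an illustrative helper, not part of syslog-ng or loggen. It counts datagrams on an already-bound UDP socket until the traffic goes quiet, so the total can be compared with the number loggen reports as sent.

```python
import socket


def count_datagrams(sock, idle_timeout=2.0):
    """Count datagrams received on a bound UDP socket.

    Returns once no datagram has arrived for idle_timeout seconds.
    """
    sock.settimeout(idle_timeout)
    count = 0
    while True:
        try:
            sock.recvfrom(65535)  # large enough for any single UDP datagram
        except socket.timeout:
            return count
        count += 1
```

To use it in place of syslog-ng, create a SOCK_DGRAM socket, bind it to ("0.0.0.0", 514) (this needs root, and syslog-ng must be stopped first), start loggen, and print the return value. A receiver this simple does almost no work per message, so it shows what the kernel can deliver with the current buffer settings rather than what syslog-ng can process.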

If I raise the rmem settings like this:

 net.core.rmem_default = 512000
 net.core.rmem_max = 1024000

Then it looks like I can support message rates around 1000 msg/sec.
If I try with 2000 msg/sec, the loss rate jumps up again (to around
30%).
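
For a rough sense of what those buffer sizes buy, the arithmetic below estimates how long the receiving application can stall before a buffer of a given size fills at a steady arrival rate. It is a back-of-the-envelope sketch: the 512 bytes charged per datagram is an assumed figure, not a measurement, since the kernel accounts for more than the payload of each packet.

```python
def seconds_until_full(buffer_bytes, msgs_per_sec, bytes_per_msg):
    """Time for the receive buffer to fill if the reader stops draining it."""
    if msgs_per_sec <= 0 or bytes_per_msg <= 0:
        raise ValueError("rate and per-message size must be positive")
    return buffer_bytes / (msgs_per_sec * bytes_per_msg)


# ASSUMPTION: bytes the kernel charges against the buffer per datagram.
CHARGED_BYTES_PER_MSG = 512

for rate in (1000, 2000):
    headroom = seconds_until_full(512000, rate, CHARGED_BYTES_PER_MSG)
    print(f"{rate} msg/sec: buffer full after {headroom:.2f} s without a read")
```

Under that assumption a 512000-byte buffer covers a stall of about one second at 1000 msg/sec and half that at 2000 msg/sec, so doubling the rate halves the tolerance for any pause in the reader.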

Do these numbers make sense?  This is an unloaded server.  The only
log traffic hitting this system is from my loggen runs.  The
filesystem is ext3 on top of a hardware RAID5 array.  I've tried
fiddling with some of the syslog-ng global options (e.g.,
flush_lines(), log_fetch_limit()), but without much impact on
performance.

I would appreciate any help you can send our way.  Thanks!

-- Lars