Hi Mike,<div>I&#39;m heading out of town on a trip, so not enough time to read the whole thread.</div><div>You may or may not have tried some of this, but I had similar issues a while back and noted it here:</div><div><a href="http://nms.gdd.net/index.php/Install_Guide_for_LogZilla_v3.1#UDP_Buffers">http://nms.gdd.net/index.php/Install_Guide_for_LogZilla_v3.1#UDP_Buffers</a></div>

<div>Hope it helps :-)</div><div><br></div><div><br clear="all">______________________________________________________________ <br><br>

Clayton Dukes<br>______________________________________________________________<br>

<br><br><div class="gmail_quote">On Fri, Apr 15, 2011 at 2:01 PM, Mishou Michael <span dir="ltr">&lt;<a href="mailto:Michael.Mishou@csirc.irs.gov">Michael.Mishou@csirc.irs.gov</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

Matthew,<br>

<br>

Thanks for the suggestion.  I&#39;m not using so_sndbuf anywhere in this<br>

configuration, just receiving and writing directly to disk.  As for<br>

so_rcvbuf, I&#39;ve already tried that per the initial message, no dice.<br>

Even if I run a so_rcvbuf size that is 10 times the recommended value in<br>

the configuration note you linked to, it still fills up and then<br>

drops/udpInOverflows start to occur at the rate of about 5k/sec.<br>

<br>

Is there something else I&#39;m missing in the config perhaps?  The setting<br>

of so_rcvbuf to a 64 MB buffer only delays the problem for a few seconds<br>

until the buffer again fills.  If I set it to 1 GB (tried this, have a<br>

ton of RAM to work with) it delays the problem for about 10 minutes,<br>

then the drops start.  It seems as if the buffer is not being emptied<br>

fast enough, but the CPU is by no means pegged by syslog-ng.<br>
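<br>
A rough way to sanity-check those delays is to divide the buffer size by the net fill rate. The sketch below is only an estimate: fill_seconds is a hypothetical helper, the 1769538 bytes/sec figure is the average from the tcpdump capture quoted further down, and it assumes that only payload bytes count against so_rcvbuf and that the drain rate is constant. On Solaris 10 the -v option may need nawk or /usr/xpg4/bin/awk instead of /usr/bin/awk.<br>
<pre>
```shell
#!/bin/sh
# Estimate how long a UDP receive buffer lasts before it overflows.
# All three arguments are illustrative inputs, not syslog-ng options:
#   $1 buffer size in bytes
#   $2 inbound rate in bytes/sec
#   $3 drain rate in bytes/sec (how fast the reader empties the buffer)
fill_seconds() {
    awk -v buf="$1" -v in_rate="$2" -v drain="$3" 'BEGIN {
        net = in_rate - drain
        if (net > 0) {
            # Buffer grows by "net" bytes every second until it is full.
            printf "%.0f\n", buf / net
        } else {
            # Reader keeps up, so the buffer never fills.
            print "never"
        }
    }'
}

fill_seconds 67108864 1769538 0     # 64 MB buffer, assuming nothing is drained
fill_seconds 1073741824 1769538 0   # 1 GB buffer, assuming nothing is drained
```
</pre>
With a drain rate of 0 the 1 GB case comes out at roughly ten minutes, which is in the same range as the delay described above. That would be consistent with the buffer being emptied very slowly, but it is an inference from averages, not a measurement.<br>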

<br>

I left out the resources I have to work with on this system, and how<br>

bad/good things are with syslog-ng running (and dropping), I&#39;ll include<br>

those now.  As you can see, it&#39;s an older server, but it has a ton of<br>

RAM and the CPUs should have enough pop for this I think.<br>

<br>

# uname -a<br>

SunOS ms00310 5.10 Generic_127111-10 sun4u sparc SUNW,Sun-Fire-V490 Solaris<br>

# psrinfo -v | grep MHz<br>

  The sparcv9 processor operates at 1350 MHz,<br>

  The sparcv9 processor operates at 1350 MHz,<br>

  The sparcv9 processor operates at 1350 MHz,<br>

  The sparcv9 processor operates at 1350 MHz,<br>

  The sparcv9 processor operates at 1350 MHz,<br>

  The sparcv9 processor operates at 1350 MHz,<br>

  The sparcv9 processor operates at 1350 MHz,<br>

  The sparcv9 processor operates at 1350 MHz,<br>

# swap -s<br>

total: 4042128k bytes allocated + 967184k reserved = 5009312k used, 48662184k available<br>

# ps -e -o pcpu -o pid -o user -o args | grep syslog<br>

 0.0    70     root vxconfigd -x syslog -m boot<br>

 0.0  6110     root grep syslog<br>

 7.7 22802     root /usr/local/sbin/syslog-ng -f /etc/crap_config.txt -p /var/run/syslog-ng<br>

 0.0 22801     root /usr/local/sbin/syslog-ng -f /etc/crap_config.txt -p /var/run/syslog-ng<br>

# top -b -n 5<br>

last pid:  6355;  load avg:  1.36,  1.34,  1.34;       up 58+23:59:11   17:37:27<br>

94 processes: 91 sleeping, 3 on cpu<br>

CPU states: 82.1% idle,  7.0% user, 10.9% kernel,  0.0% iowait,  0.0% swap<br>

Memory: 32G phys mem, 16G free mem, 32G total swap, 32G free swap<br>

<br>

   PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND<br>

 22802 root       2  50    0 3067M 3063M cpu/2   79:50  7.81% syslog-ng<br>

 29459 root      15  40    0  193M  166M cpu/1  150.5H  4.49% issCSF<br>

  6352 root       1  55    0 3376K 2032K cpu/19   0:00  0.20% top<br>

  4229 root      82  59    0  327M  324M sleep  661:05  0.17% java<br>

  2695 root       6  59    0 8000K 2984K sleep  802:59  0.11% rmserver<br>

<br>

I&#39;m just not sure what to do next to troubleshoot.  I&#39;m hoping someone<br>

here can point me in the right direction, or at least confirm that they<br>

are running syslog-ng in a similar configuration without drops so I know<br>

that it&#39;s at least possible?<br>

<br>

Regards,<br>

<br>

--Mike<br>

<div><div></div><div class="h5"><br>

-----Original Message-----<br>

From: <a href="mailto:syslog-ng-bounces@lists.balabit.hu">syslog-ng-bounces@lists.balabit.hu</a><br>

[mailto:<a href="mailto:syslog-ng-bounces@lists.balabit.hu">syslog-ng-bounces@lists.balabit.hu</a>] On Behalf Of Matthew Hall<br>

Sent: Friday, April 15, 2011 12:12 PM<br>

To: Syslog-ng users&#39; and developers&#39; mailing list<br>

Subject: Re: [syslog-ng] Solaris 10 UDP overflows, message drops<br>

<br>

Probably you need to adjust so_sndbuf and so_rcvbuf:<br>

<br>

<a href="http://www.balabit.com/sites/default/files/documents/syslog-ng-ose-v3.2-guide-admin-en.html/index.html-single.html#reference_source_tcpudp" target="_blank">http://www.balabit.com/sites/default/files/documents/syslog-ng-ose-v3.2-guide-admin-en.html/index.html-single.html#reference_source_tcpudp</a><br>

<br>

That should make it run better.<br>

<br>

Matthew.<br>

<br>

On Fri, Apr 15, 2011 at 10:52:59AM -0400, Mishou Michael wrote:<br>

&gt; All,<br>

&gt;<br>

&gt; I&#39;ve done a lot of reading, and I can&#39;t figure out what I can do to this<br>

&gt; config in order to fix the UDP drops due to udpInOverflows on netstat<br>

&gt; -s.  Here are some statistics relating to the amount of traffic we<br>

&gt; receive via syslog-ng, it&#39;s pretty busy but in reading I&#39;m finding that<br>

&gt; some folks are doing much more.  These stats are based on a ~30 second<br>

&gt; window of traffic during peak times, but variance due to time is not so<br>

&gt; much in our environment.  I used tcpdump with a bpf to capture only<br>

&gt; inbound udp/514, so this is what the interface is seeing in the way of<br>

&gt; syslog.<br>

&gt;<br>

&gt; Elapsed:              00:00:34<br>

&gt; Packets:              200000<br>

&gt; Avg. packets/sec:     5836.546<br>

&gt; Avg. packet size:     303.182 bytes<br>

&gt; Bytes:                60636477<br>

&gt; Avg. bytes/sec:       1769537.884<br>

&gt; Avg. MBit/sec:        14.156<br>

&gt;<br>

&gt; So, about 6k messages per second.  Here are the drop numbers over a time<br>

&gt; sample (done right after a process restart, you can see the buffer takes<br>

&gt; a moment to fill up [64 MB so_rcvbuf]):<br>

&gt;<br>

&gt; # while true; do echo -en &quot;$(date) :: &quot;; netstat -s | grep udpInOverflows | head -n 1 | sed &#39;s|.*=||&#39;; sleep 10; done<br>

&gt; Fri Apr 15 14:12:46 GMT 2011 :: 472517477<br>

&gt; Fri Apr 15 14:12:56 GMT 2011 :: 472517477<br>

&gt; Fri Apr 15 14:13:06 GMT 2011 :: 472517477<br>

&gt; Fri Apr 15 14:13:16 GMT 2011 :: 472517477<br>

&gt; Fri Apr 15 14:13:26 GMT 2011 :: 472543152<br>

&gt; Fri Apr 15 14:13:36 GMT 2011 :: 472592800<br>

&gt; Fri Apr 15 14:13:46 GMT 2011 :: 472638848<br>

&gt; Fri Apr 15 14:13:56 GMT 2011 :: 472684407<br>

&gt;<br>

&gt; So that&#39;s about 5k overflows a second, which jibes with our<br>

&gt; calculations, suggesting we&#39;re getting only ~10% of our messages logged<br>

&gt; to disk.<br>
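<br>
The per-second figure can be derived from any two samples in the listing above. The sketch below is illustrative only: overflow_rate and drop_percent are hypothetical helper names, the counter values are the last two samples shown, and the inbound rate is the tcpdump average from a separate ~30 second window, so the percentage is a rough cross-check at best. On Solaris 10 the -v option may need nawk or /usr/xpg4/bin/awk instead of /usr/bin/awk.<br>
<pre>
```shell
#!/bin/sh
# Per-second overflow rate from two udpInOverflows samples.
#   $1 earlier counter, $2 later counter, $3 seconds between the samples
overflow_rate() {
    awk -v a="$1" -v b="$2" -v secs="$3" 'BEGIN { printf "%.1f\n", (b - a) / secs }'
}

# Share of inbound packets dropped, as a percentage.
#   $1 drops/sec, $2 inbound packets/sec
drop_percent() {
    awk -v d="$1" -v p="$2" 'BEGIN { printf "%.1f\n", 100 * d / p }'
}

# Last two samples from the listing, taken 10 seconds apart.
rate=$(overflow_rate 472638848 472684407 10)
echo "overflows/sec: $rate"

# Inbound rate taken from the tcpdump statistics (average packets/sec).
echo "percent dropped: $(drop_percent "$rate" 5836.546)"
```
</pre>
Because the drop counter and the tcpdump capture cover different time windows, the resulting percentage may not match the ~10% estimate exactly.<br>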

&gt;<br>

&gt; I inherited a config with _very_ many filter statements, but have<br>

&gt; decided to cut all that out to see if my performance problems in the way<br>

&gt; of udp drops continue (they do).  I&#39;ve attached a sanitized config to<br>

&gt; this message, all the stuff here concerns this config running (even<br>

&gt; though I thought eliminating the filters would really help, it didn&#39;t).<br>

&gt;<br>

&gt; We&#39;re running Solaris 10 SPARC.  The syslog-ng version is:<br>

&gt;<br>

&gt; # /usr/local/sbin/syslog-ng -V<br>

&gt; syslog-ng 3.1.2<br>

&gt; Installer-Version: 3.1.2<br>

&gt; Revision: ssh+git://bazsi@git.balabit//var/scm/git/syslog-ng/syslog-ng-ose--mainline--3.1#master#8bf13c304b6ab5fc1a372b49d55c78370efe14ca<br>

&gt; Compile-Date: Oct 25 2010 23:56:18<br>

&gt; Enable-Threads: off<br>

&gt; Enable-Debug: off<br>

&gt; Enable-GProf: off<br>

&gt; Enable-Memtrace: off<br>

&gt; Enable-Sun-STREAMS: on<br>

&gt; Enable-Sun-Door: on<br>

&gt; Enable-IPv6: on<br>

&gt; Enable-Spoof-Source: on<br>

&gt; Enable-TCP-Wrapper: off<br>

&gt; Enable-SSL: on<br>

&gt; Enable-SQL: off<br>

&gt; Enable-Linux-Caps: off<br>

&gt; Enable-Pcre: on<br>

&gt;<br>

&gt; The following options are set for the OS:<br>

&gt;<br>

&gt; # ndd /dev/udp udp_max_buf<br>

&gt; 1073741824<br>

&gt; # ndd /dev/udp udp_recv_hiwat<br>

&gt; 65536<br>

&gt;<br>

&gt; Some options lines from the config based on what I&#39;ve seen:<br>

&gt;<br>

&gt; * note the TCP stuff can be safely ignored, it&#39;s legacy from some<br>

&gt; testing but isn&#39;t currently seeing traffic<br>

&gt; * all 3 udp sources set with so_rcvbuf(67108864) (64 MB)<br>

&gt;<br>

&gt; options { # things I&#39;ve changed/tweaked<br>

&gt;           flush_lines(1000);<br>

&gt;           flush_timeout(20);<br>

&gt;           log_fifo_size (67108864);<br>

&gt;           log_msg_size(8192);<br>

&gt;           chain_hostnames(yes);<br>

&gt;           # end my changes<br>

&gt;         &lt;snip&gt;<br>

&gt;         };<br>

&gt;<br>

&gt; So I&#39;m totally stumped.  I can set the buffers with so_rcvbuf() to 1 GB,<br>

&gt; it still doesn&#39;t matter, they eventually fill up and I start losing<br>

&gt; packets.  I&#39;m hoping that someone can point me to some tweaks I can do<br>

&gt; to get the numbers of drops down or eliminated.  Is it unreasonable to<br>

&gt; expect to be able to process this many messages per second via UDP?<br>

&gt; Maybe that&#39;s the problem.  I might experiment some with default syslog<br>

&gt; to see if it can write this many messages without drops...this doesn&#39;t<br>

&gt; seem like an insane amount of traffic.  But perhaps my expectations are<br>

&gt; unrealistic, that&#39;s what I&#39;m hoping someone can tell me.<br>

&gt;<br>

&gt; Regards,<br>

&gt;<br>

&gt; --Mike<br>

<br>

<br>

&gt; ______________________________________________________________________________<br>

&gt; Member info: <a href="https://lists.balabit.hu/mailman/listinfo/syslog-ng" target="_blank">https://lists.balabit.hu/mailman/listinfo/syslog-ng</a><br>

&gt; Documentation: <a href="http://www.balabit.com/support/documentation/?product=syslog-ng" target="_blank">http://www.balabit.com/support/documentation/?product=syslog-ng</a><br>

&gt; FAQ: <a href="http://www.campin.net/syslog-ng/faq.html" target="_blank">http://www.campin.net/syslog-ng/faq.html</a><br>

&gt;<br>

<br>


<br>

</div></div></blockquote></div><br></div>