Re: [syslog-ng] Solaris 10 UDP overflows, message drops

30 Apr 2011

Hi,

On Tue, 2011-04-26 at 12:05 -0400, Mishou Michael wrote:
...
> For those following this thread, I have applied the "thundering herd"
> UDP patch and saw no change in the drops experienced by
> syslog-ng 3.1.2.  Sorry I took so long to respond; the patching was a
> much more time-consuming process than I thought it would be.
>
> At this point, based on Michael Hocke's response, I'm thinking that
> perhaps there is just too much UDP traffic for single-threaded syslog-ng
> to deal with in light of what filtering and parsing it does up front
> (for macro usage).
>
> I'm going to experiment with syslog-ng and the loggen tool to find the
> point at which a single syslog-ng instance starts dropping inbound UDP
> traffic with a simple configuration writing to disk.  Once I have that
> number, I have a few options:
>
> 1.  Experiment with syslog-ng 3.3 and the new threaded code to see if I
> have performance gains.  I'm hesitant to push alpha code into production;
> if anyone has experience running 3.3 consistently in a semi-production
> environment, I'd love to hear it.

I think the most difficult part of compiling syslog-ng for Solaris is
ivykis, the new I/O backend library that we've started using for
threading (it supports epoll, /dev/poll, kqueue, etc.).

The ivykis version that we use is available on git.balabit.hu, but you
need a complete toolchain (autoconf, automake, libtool, gcc, gmake) to
compile it.
...
> 2.  So I don't have to change the configuration on a lot of clients, use
> PF to rewrite incoming UDP messages from specific, busy clients to other
> syslog-ng listeners, configured exactly as my main instance (which will
> handle all the non-insanely-busy clients).  I could run multiple
> listeners in this manner, and not need threading to take advantage of
> multiple processors, though obviously each process would still be
> limited to the magic number determined above.  I have 10 or so really
> busy clients, so this is one solution I'm leaning towards if syslog-ng
> 3.1.2 can handle just one of them.

This could work.
...
> 3.  Give up on syslog-ng until 3.3, or move to some other solution.  Not
> sure what I could do here; rsyslog is the other major contender, I guess,
> but I'm not sure what gains I would get.  Could also run the native
> syslog server and post-process to different buckets/relays, which is what
> we mainly use syslog-ng for.
>
> 4.  Get a faster box (not likely to happen).
>
> If anyone has any thoughts on any of the above I'd love to hear them.
> Also, if this is unique to Solaris SPARC systems (similarly spec'd x86
> Solaris systems having none of these limitations) I'd love to know that
> as well.  Does anyone know a way to figure out at what point the
> SPARC is hitting a ceiling?  The CPU is not pegged, so why would we be
> experiencing CPU-based drops?  Maybe the code is not efficient for how
> SPARC does things, or how some syscall is implemented on Solaris?

Yes, I think this is the root cause of the problem.
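
For the experiment described in the quoted text (finding the rate at
which a single instance starts dropping UDP), the idea can be sketched
roughly as follows. This is not loggen and not part of syslog-ng; it is
a minimal Python sketch under stated assumptions. The function names
send_numbered and count_missing, the "probe" message text and the seq=
field are all made up for illustration, and it assumes the receiver
writes each message on its own line to a file.

```python
import socket
import time


def send_numbered(host, port, count, rate):
    """Send `count` numbered syslog-style UDP datagrams at about `rate` msgs/sec."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    interval = 1.0 / rate
    start = time.monotonic()
    try:
        for seq in range(count):
            # <13> is the syslog priority for facility "user", severity "notice".
            sock.sendto(b"<13>probe: seq=%d" % seq, (host, port))
            # Pace against the schedule, not the previous send, to avoid drift.
            delay = start + (seq + 1) * interval - time.monotonic()
            if delay > 0:
                time.sleep(delay)
    finally:
        sock.close()


def count_missing(received_lines, count):
    """Return the sequence numbers in range(count) that never arrived."""
    seen = set()
    for line in received_lines:
        # Everything after the last "seq=" should be the sequence number.
        _, sep, tail = line.rpartition("seq=")
        tail = tail.strip()
        if sep and tail.isdigit():
            seen.add(int(tail))
    return sorted(set(range(count)) - seen)
```

One way to use it: run send_numbered against the test listener at a
given rate, feed the lines of the resulting log file to count_missing,
and raise the rate until the returned list is no longer empty. A Python
sender tops out well below what loggen can generate, so this only
illustrates the method of counting gaps in a numbered stream.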

-- 
Bazsi

Balazs Scheidler