[syslog-ng] Solaris 10 UDP overflows, message drops
Balazs Scheidler
bazsi at balabit.hu
Sat Apr 30 14:47:35 CEST 2011
Hi,
On Tue, 2011-04-26 at 12:05 -0400, Mishou Michael wrote:
> For those following this thread, I have applied the "thundering herd"
> UDP patch and saw no change in the drops experienced by syslog-ng
> 3.1.2. Sorry I took so long to respond; the patching was a much more
> time-consuming process than I thought it would be.
>
> At this point, based on Michael Hocke's response, I'm thinking that
> perhaps there is just too much UDP traffic for single-threaded syslog-ng
> to deal with in light of what filtering and parsing it does up front
> (for macro usage).
>
> I'm going to experiment with syslog-ng and the loggen tool to find a
> point at which a single syslog-ng instance starts dropping inbound UDP
> traffic with a simple configuration writing to disk. Once I have that
> number, I have a few options:
>
> 1. Experiment with syslog-ng 3.3 and the new threaded code to see if I
> get performance gains. I'm hesitant to push alpha code into production;
> if anyone has experience running 3.3 consistently in a semi-production
> environment, I'd love to hear it.
I think the most difficult part of compiling syslog-ng 3.3 for Solaris
is ivykis, the new I/O backend library that we've started using for
threading (it supports epoll, /dev/poll, kqueue etc).
The ivykis version that we use is available on git.balabit.hu, but you
need a complete toolchain (autoconf, automake, libtool, gcc, gmake) to
compile it.
> 2. So I don't have to change the configuration on a lot of clients, use
> PF to rewrite incoming UDP messages from specific, busy clients to other
> syslog-ng listeners, configured exactly as my main instance (which will
> handle all the non-insanely-busy clients). I could run multiple
> listeners in this manner, and not need threading to take advantage of
> multiple processors, though obviously each process would still be
> limited to the magic number determined above. I have 10 or so really
> busy clients, so this is one solution I'm leaning towards if syslog-ng
> 3.1.2 can handle just one of them.
This could work.
>
> 3. Give up on syslog-ng until 3.3, or move to some other solution. I'm
> not sure what I could do here; rsyslog is the other major contender, I
> guess, but I'm not sure what gains I would get. I could also run the
> native syslog server and post-process to different buckets/relay,
> which is what we mainly use syslog-ng for.
>
> 4. Get a faster box (not likely to happen).
>
> If anyone has any thoughts on any of the above I'd love to hear them.
> Also, if this is unique to Solaris SPARC systems (similarly spec'd x86
> Solaris systems having none of these limitations) I'd love to know that
> as well. Is there any way anyone knows to figure out at what point the
> SPARC is hitting a ceiling? The CPU is not pegged, so why would we be
> experiencing CPU-based drops? Maybe the code is not efficient for how
> SPARC does things, or how some syscall is implemented on Solaris?
Yes, I think this is the root cause of the problem.
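
For the experiment mentioned earlier in the thread (finding the rate at
which a single listener starts dropping inbound UDP), loggen is the tool
to use. If it helps to cross-check the numbers, the idea can be roughly
sketched in Python as below. This is only an illustration under stated
assumptions: the function names, the "seq=N" message layout and the
stand-in receiver are all hypothetical and are not part of syslog-ng or
loggen. Each datagram carries a sequence number, and the drop count is
the number of sequence numbers that never arrive.

```python
import socket


def send_numbered(host, port, count):
    """Send `count` UDP datagrams, each carrying its own sequence number.

    The "<13>seq=N ..." layout is an assumption made for this sketch.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        for seq in range(count):
            s.sendto(b"<13>seq=%d test message" % seq, (host, port))


def parse_seq(datagram):
    """Extract the sequence number from a message built by send_numbered."""
    return int(datagram.split(b"seq=", 1)[1].split(b" ", 1)[0])


def receive_until_idle(sock, idle_seconds=1.0):
    """Stand-in receiver: collect sequence numbers until the socket is idle."""
    sock.settimeout(idle_seconds)
    seqs = []
    while True:
        try:
            data, _ = sock.recvfrom(65535)
        except socket.timeout:
            return seqs
        seqs.append(parse_seq(data))


def loss_report(received_seqs, sent_count):
    """Return (received, dropped, drop_ratio) for one test run.

    Duplicates are counted once, so `received` is the number of distinct
    sequence numbers that made it through.
    """
    received = len(set(received_seqs))
    dropped = sent_count - received
    ratio = dropped / sent_count if sent_count else 0.0
    return received, dropped, ratio
```

To test syslog-ng itself rather than the stand-in receiver, the same
loss_report() can be fed with sequence numbers parsed (via parse_seq)
from the lines of the file that syslog-ng writes, increasing the send
rate between runs until the drop ratio becomes non-zero.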
--
Bazsi