[syslog-ng] Solaris 10 UDP overflows, message drops

Sat Apr 30 14:33:40 CEST 2011

On Mon, 2011-04-18 at 11:43 -0400, Mishou Michael wrote:
> Fred,
>  
> Great find!  For those following on the TV at home, here is the link to
> the patch notes that I found:
>  
> https://getupdates.oracle.com/readme/144488-11
>  
> Which contains this tantalizing tidbit:
>  
> 6638967 UDP recv (think DNS) suffers from thundering herd problem
> (bug report for above: http://bit.ly/eD57KB+ )
>  
> I'm going to install this patch and see what comes of it.  That
> certainly seems like it could be related.
> 
> Martin,
> 
> I checked based on Matthew's suggestion of the ipInDiscards counter
> incrementing, and it's not, so no dice with the checksum errors, good
> call though!
>  
> Matthew, 
>  
> Following your suggestions I set up all of the network-based
> destinations to be /dev/null and the problem persists, shows no change
> in terms of how fast the buffers fill up or how many overflows are
> generated per second at all.  As for loggen, keep in mind that I was
> using that before to write to /dev/null and also directly to disk and
> topped loggen out (locally) at ~8k/msgs/sec, so I wonder if Fred isn't
> on the right track with some OS issues?
>  
> I used Google to find scripts to run for Dtrace to track syscalls, and
> one called procsystime stood out
> (http://www.brendangregg.com/DTrace/procsystime)  Here's a sample of the
> output over about 20 seconds each time, while the overflows are
> happening.  It's hard for me to tell if the write() syscall is really
> freaking slow here or if this is as it should be and not weird.  Either
> way, write() doesn't seem to be the issue if you look at the /dev/null
> output for which UDP overflows still occur.
> 
> This first one is the output for the simple config I posted earlier,
> writing to disk:
>  
> # /root/procsystime -aTn syslog-ng
> Hit Ctrl-C to stop sampling...
> ^C
>  
> Elapsed Times for processes syslog-ng,
>  
>          SYSCALL          TIME (ns)
>           getpid             193600
>           fchmod             443000
>             bind             757700
>            fcntl             786800
>       setsockopt            1355700
>           fchown            1446200
>          connect            1542000
>        so_socket            2626100
>            close            4382500
>           open64            4691400
>           stat64            4957400
>              brk           23978700
>          pollsys          113785800
>           llseek          158217200
>            gtime          244735300
>         recvfrom          404501900
>            write        11343815700
>           TOTAL:        12312217000

Hmm... I think I know what the culprit is, based on these statistics.

1) syslog-ng < 3.3 issued a single write() syscall for each and every
line written in the log files. It had overhead in Linux too, but not
this bad.

2) time() is invoked by syslog-ng a lot, and it is clearly visible from
your profile. These invocations were also decreased in numbers,
unluckily only in 3.3.

_Perhaps_ it'd be make sense to backport the performance improvements in
question, so you wouldn't have to fight 3.3 related issues, although I'm
more than happy if you choose that.

This is the bulk of the change that should more-or-less cleanly apply to
3.2:

Author: NagyAttila <naat at balabit.hu>  2010-11-15 18:00:10
Committer: Balazs Scheidler <bazsi at balabit.hu>  2010-12-21 16:31:09
Parent: 88a2a660255147c5ebd35951d1ccead9fd779e13 (LogProto: apply_state shouldn't allow file offsets over the end-of-file)
Child:  fe541bcb0368342228eebbb8dcf08e7a5f5f6a05 (LogProto: simplify prepare method)
Branches: many (84)
Follows: v3.2.1
Precedes: v3.3.0alpha1

    Performance improvement: file write uses writev instead of per-message write() to write larger chunks

There were three fixes on the code since then:

3fa8e900453fb6af1767caf42e17ab6d8c42452b    syslog() destination driver: fixed potential framing problem on contended connections
20e523a53dd8259061b2d277927f79681b8a9334    LogProtoFileWriter: flush should not attempt to call writev() if there are no buffers
3abdd8773662f9d779429262262a1ed9229e98e6    logproto: Handle EAGAIN and EINTR correctly in _text_client_flush()

The time related changes are much more scattered, so I wouldn't recommend 
to go there first, and seeing your profile, shouldn't matter that much.

Also, please note that in order to really make use of the new code, you need to 
set flush_lines() to non-zero, something like 100-1000 should make a big difference.

And one last point: please don't take it as a plug, but it _could_ make sense to 
check the PE version first, you can get a free eval (in binary), and PE 4.0 had 
these patches already (and got ported to OSE later). If you really can see the 
performance boost on your system, you can still decide whether to take the PE 
route, or stick with the OSE, possibly with porting patches.

-- 
Bazsi