[syslog-ng] Solaris 10 UDP overflows, message drops
Balazs Scheidler
bazsi at balabit.hu
Sat Apr 30 14:33:40 CEST 2011
On Mon, 2011-04-18 at 11:43 -0400, Mishou Michael wrote:
> Fred,
>
> Great find! For those following on the TV at home, here is the link to
> the patch notes that I found:
>
> https://getupdates.oracle.com/readme/144488-11
>
> Which contains this tantalizing tidbit:
>
> 6638967 UDP recv (think DNS) suffers from thundering herd problem
> (bug report for above: http://bit.ly/eD57KB+ )
>
> I'm going to install this patch and see what comes of it. That
> certainly seems like it could be related.
>
> Martin,
>
> I checked based on Matthew's suggestion of the ipInDiscards counter
> incrementing, and it's not, so no dice with the checksum errors, good
> call though!
>
> Matthew,
>
> Following your suggestions I set up all of the network-based
> destinations to be /dev/null and the problem persists, shows no change
> in terms of how fast the buffers fill up or how many overflows are
> generated per second at all. As for loggen, keep in mind that I was
> using that before to write to /dev/null and also directly to disk and
> topped loggen out (locally) at ~8k/msgs/sec, so I wonder if Fred isn't
> on the right track with some OS issues?
>
> I used Google to find scripts to run for Dtrace to track syscalls, and
> one called procsystime stood out
> (http://www.brendangregg.com/DTrace/procsystime) Here's a sample of the
> output over about 20 seconds each time, while the overflows are
> happening. It's hard for me to tell if the write() syscall is really
> freaking slow here or if this is as it should be and not weird. Either
> way, write() doesn't seem to be the issue if you look at the /dev/null
> output for which UDP overflows still occur.
>
> This first one is the output for the simple config I posted earlier,
> writing to disk:
>
> # /root/procsystime -aTn syslog-ng
> Hit Ctrl-C to stop sampling...
> ^C
>
> Elapsed Times for processes syslog-ng,
>
> SYSCALL TIME (ns)
> getpid 193600
> fchmod 443000
> bind 757700
> fcntl 786800
> setsockopt 1355700
> fchown 1446200
> connect 1542000
> so_socket 2626100
> close 4382500
> open64 4691400
> stat64 4957400
> brk 23978700
> pollsys 113785800
> llseek 158217200
> gtime 244735300
> recvfrom 404501900
> write 11343815700
> TOTAL: 12312217000
Hmm... I think I know what the culprit is, based on these statistics.
1) syslog-ng < 3.3 issued a single write() syscall for each and every
line written in the log files. It had overhead in Linux too, but not
this bad.
2) time() is invoked by syslog-ng a lot, and it is clearly visible from
your profile. These invocations were also decreased in numbers,
unluckily only in 3.3.
_Perhaps_ it'd be make sense to backport the performance improvements in
question, so you wouldn't have to fight 3.3 related issues, although I'm
more than happy if you choose that.
This is the bulk of the change that should more-or-less cleanly apply to
3.2:
Author: NagyAttila <naat at balabit.hu> 2010-11-15 18:00:10
Committer: Balazs Scheidler <bazsi at balabit.hu> 2010-12-21 16:31:09
Parent: 88a2a660255147c5ebd35951d1ccead9fd779e13 (LogProto: apply_state shouldn't allow file offsets over the end-of-file)
Child: fe541bcb0368342228eebbb8dcf08e7a5f5f6a05 (LogProto: simplify prepare method)
Branches: many (84)
Follows: v3.2.1
Precedes: v3.3.0alpha1
Performance improvement: file write uses writev instead of per-message write() to write larger chunks
There were three fixes on the code since then:
3fa8e900453fb6af1767caf42e17ab6d8c42452b syslog() destination driver: fixed potential framing problem on contended connections
20e523a53dd8259061b2d277927f79681b8a9334 LogProtoFileWriter: flush should not attempt to call writev() if there are no buffers
3abdd8773662f9d779429262262a1ed9229e98e6 logproto: Handle EAGAIN and EINTR correctly in _text_client_flush()
The time related changes are much more scattered, so I wouldn't recommend
to go there first, and seeing your profile, shouldn't matter that much.
Also, please note that in order to really make use of the new code, you need to
set flush_lines() to non-zero, something like 100-1000 should make a big difference.
And one last point: please don't take it as a plug, but it _could_ make sense to
check the PE version first, you can get a free eval (in binary), and PE 4.0 had
these patches already (and got ported to OSE later). If you really can see the
performance boost on your system, you can still decide whether to take the PE
route, or stick with the OSE, possibly with porting patches.
--
Bazsi
More information about the syslog-ng
mailing list