Hi,
I did some UDP buffer testing and wrote it up on my wiki at http://nms.gdd.net/index.php/Install_Guide_for_LogZilla_v3.1#UDP_Buffers

The other guy that Bazsi refers to made some nice graphs, but they were hosted at a temporary URL.
At the time, I thought the email thread would be useful, so I copied it to text and saved it locally. Here's that thread:

---
I've run a series of tests against our log server, using
loggen-generated logging rates of 100, 1000, 2000, 4000, 8000, 16000,
and 32000 messages/sec, and measured the results for each rate using
socket buffer sizes ranging from 128KB to 16MB.  The results show,
essentially, what buffer size you need to meet a target rate of
message logging.

So I'm happy now, and I'm getting the sort of rates I expect to be
able to support.  I figured other folks might like the data.  I've put
the raw data online as well as a graph of the results:

 http://drop.io/syslog_ng

The graph shows so_rcvbuf() size along the X axis, and packet loss
along the Y axis.


From Bazsi:
Hmm. The numbers you are seeing are indeed low. With sufficient buffer
sizes I could get up to the 20k messages/sec range with syslog-ng,
although it's been a while since I last tested it.

What I'd recommend is to calculate how many _bytes_ per second the
message rate you are generating amounts to.

If you generate 2000 messages per second, 300 bytes each (loggen default
IIRC), that's 600000 bytes every second. syslog-ng is single-threaded, so
the latency of writing to disk applies: while it is busy writing out
messages, it may take some time before syslog-ng gets back to reading
from its source. This is the #1 reason why I want to work on
multithreading. With a flow-controlled source, syslog-ng is able to do
about 70-75k msg/sec. But not with UDP.
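
To make the arithmetic above concrete, here is a minimal Python sketch. It is not part of the original thread; the helper names are made up for illustration, and the 300-byte message size is simply the figure quoted above. It converts a message rate into bytes per second and estimates how long a receive buffer of a given size could absorb traffic while syslog-ng is busy elsewhere.

```python
# Sketch only: 300 bytes/message is the loggen default quoted above (IIRC),
# and both helper names are hypothetical, chosen for this illustration.

def bytes_per_second(msgs_per_sec, msg_size_bytes=300):
    """Raw payload volume arriving per second at a given message rate."""
    return msgs_per_sec * msg_size_bytes


def stall_tolerance_seconds(buffer_bytes, msgs_per_sec, msg_size_bytes=300):
    """How long the receiver can stop reading before the buffer fills.

    This is an upper bound: the kernel also charges per-packet bookkeeping
    against the socket buffer, so the real headroom is smaller.
    """
    return buffer_bytes / bytes_per_second(msgs_per_sec, msg_size_bytes)


if __name__ == "__main__":
    half_mb = 512 * 1024
    # Same rates as the loggen test series described earlier in the thread.
    for rate in (100, 1000, 2000, 4000, 8000, 16000, 32000):
        print(f"{rate:>6} msg/s -> {bytes_per_second(rate):>9} B/s, "
              f"0.5MB buffer lasts {stall_tolerance_seconds(half_mb, rate):.2f}s")
```

At 2000 msg/sec a 0.5MB buffer covers less than one second of traffic, so any disk stall longer than that would be expected to cost messages.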

In order to improve the numbers, I'd:

1) increase the receive buffer size to hold 3-5 seconds of traffic (e.g.
3-5MB, not just 0.5MB)

2) increase log_fetch_limit() to a larger value; this controls how many
messages syslog-ng fetches in each poll iteration. Increase this to
300-500.

3) increase log_fifo_size() for the destination, by adding up the
fetch_limit values of each source feeding the destination (so if you
have two sources, each with a fetch limit of 1000, then the destination
queue should be _at least_ 2000, preferably rounded up to the next order
of magnitude; e.g. with 2x1000 fetch limits, increase the fifo to 10000)
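
The sizing rules in the list above can be sketched in a few lines of Python. This is an illustration rather than anything syslog-ng ships: the function names are hypothetical, and "next order of magnitude" is interpreted here as the smallest power of ten that is not below the summed fetch limits, which is an assumption about what was meant.

```python
# Sketch of the sizing rules above. Function names are hypothetical, and the
# rounding rule is one possible reading of "next order of magnitude".

def rcvbuf_bytes(msgs_per_sec, msg_size_bytes, seconds):
    """Receive buffer large enough to hold `seconds` worth of traffic."""
    return msgs_per_sec * msg_size_bytes * seconds


def suggested_fifo_size(fetch_limits):
    """Sum the fetch limits of all sources feeding one destination,
    then round up to a power of ten."""
    total = sum(fetch_limits)
    size = 1
    while size < total:
        size *= 10
    return size


if __name__ == "__main__":
    # 2000 msg/s at 300 bytes each, buffered for 3 and for 5 seconds.
    for secs in (3, 5):
        print(f"{secs}s of traffic: {rcvbuf_bytes(2000, 300, secs)} bytes")
    # Two sources, each with a fetch limit of 1000.
    print("fifo size:", suggested_fifo_size([1000, 1000]))
```

The resulting numbers are candidates for so-rcvbuf() on the source and log_fifo_size() on the destination; they are starting points to test with loggen, not guaranteed values.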

You didn't say in your email whether it is syslog-ng itself or the
kernel that is dropping messages. netstat drop counts or syslog-ng
statistics should help determine that.
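
As one way to look at the kernel side, here is a small Python sketch that reads the UDP counters Linux exposes in /proc/net/snmp (the same counters `netstat -su` summarizes on Linux, as far as I know). It assumes a Linux host; the function name and the sample text are made up for illustration and are not real measurements. A growing RcvbufErrors count suggests the kernel dropped datagrams because the socket receive buffer was full, i.e. before syslog-ng ever saw them.

```python
import os

# Illustrative sample in the /proc/net/snmp layout; the numbers are invented.
SAMPLE = """\
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 12345 0 678 9012 678 0
UdpLite: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
UdpLite: 0 0 0 0 0 0
"""


def parse_udp_counters(snmp_text):
    """Return the 'Udp:' counters as a dict of name -> int.

    The file holds a header line and a value line per protocol, both
    starting with the same prefix.
    """
    rows = [line.split()[1:] for line in snmp_text.splitlines()
            if line.startswith("Udp:")]
    if len(rows) < 2:
        return {}
    names, values = rows[0], rows[1]
    return dict(zip(names, (int(v) for v in values)))


if __name__ == "__main__":
    path = "/proc/net/snmp"  # Linux only
    if os.path.exists(path):
        with open(path) as fh:
            text = fh.read()
    else:
        text = SAMPLE
    counters = parse_udp_counters(text)
    print("InErrors:", counters.get("InErrors"))
    print("RcvbufErrors:", counters.get("RcvbufErrors"))
```

Comparing the counters before and after a loggen run shows how many datagrams the kernel discarded during the test; if they stay flat while messages still go missing, the loss is more likely inside syslog-ng.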


______________________________________________________________

Clayton Dukes
______________________________________________________________


On Wed, Jul 20, 2011 at 7:11 AM, Balazs Scheidler <bazsi@balabit.hu> wrote:
On Wed, 2011-07-20 at 11:14 +0200, maxime.denier@orange-ftgroup.com
wrote:
> Hello,
>
> I have recently installed syslog-ng OSE 3.1 as a log collector and I
> am facing a problem.
> A great number of logs arrive at the server, but only a small part of
> them reach the destination files, although every destination file does
> contain some logs.
> I have enabled verbose mode and I see this:
> Jul 20 07:52:04 sparte1 syslog-ng[2557]: Initializing destination file
> writer;
> template='/var/logs/${NSM.DEVICE:-Unknown_device}/${NSM.RECEIVED_TIME.YEAR}${NSM.RECEIVED_TIME.MONTH}${NSM.RECEIVED_TIME.DAY}2400.csv', filename='/var/logs/zidane2/201107202400.csv'
> Jul 20 07:52:31 sparte1 syslog-ng[2557]: Reaping unused destination
> files;
> template='/var/logs/${NSM.DEVICE:-Unknown_device}/${NSM.RECEIVED_TIME.YEAR}${NSM.RECEIVED_TIME.MONTH}${NSM.RECEIVED_TIME.DAY}2400.csv'
> Jul 20 07:53:01 sparte1 syslog-ng[2557]: Reaping unused destination
> files;
> template='/var/logs/${NSM.DEVICE:-Unknown_device}/${NSM.RECEIVED_TIME.YEAR}${NSM.RECEIVED_TIME.MONTH}${NSM.RECEIVED_TIME.DAY}2400.csv'
> Jul 20 07:53:01 sparte1 syslog-ng[2557]: Destination timed out,
> reaping;
> template='/var/logs/${NSM.DEVICE:-Unknown_device}/${NSM.RECEIVED_TIME.YEAR}${NSM.RECEIVED_TIME.MONTH}${NSM.RECEIVED_TIME.DAY}2400.csv', filename='/var/logs/peony2/201107202400.csv'
> Jul 20 07:53:01 sparte1 syslog-ng[2557]: Closing log transport fd;
> fd='31'
> Jul 20 07:53:01 sparte1 syslog-ng[2557]: Destination timed out,
> reaping;
> template='/var/logs/${NSM.DEVICE:-Unknown_device}/${NSM.RECEIVED_TIME.YEAR}${NSM.RECEIVED_TIME.MONTH}${NSM.RECEIVED_TIME.DAY}2400.csv', filename='/var/logs/decca2/201107202400.csv'
> Jul 20 07:53:01 sparte1 syslog-ng[2557]: Closing log transport fd;
> fd='19'
> Jul 20 07:53:16 sparte1 syslog-ng[2557]: Initializing destination file
> writer;
> template='/var/logs/${NSM.DEVICE:-Unknown_device}/${NSM.RECEIVED_TIME.YEAR}${NSM.RECEIVED_TIME.MONTH}${NSM.RECEIVED_TIME.DAY}2400.csv', filename='/var/logs/hyenne2/201107202400.csv'
> Jul 20 07:53:17 sparte1 syslog-ng[2557]: Initializing destination file
> writer;
> template='/var/logs/${NSM.DEVICE:-Unknown_device}/${NSM.RECEIVED_TIME.YEAR}${NSM.RECEIVED_TIME.MONTH}${NSM.RECEIVED_TIME.DAY}2400.csv', filename='/var/logs/olive2/201107202400.csv'
> Jul 20 07:53:31 sparte1 syslog-ng[2557]: Reaping unused destination
> files;
> template='/var/logs/${NSM.DEVICE:-Unknown_device}/${NSM.RECEIVED_TIME.YEAR}${NSM.RECEIVED_TIME.MONTH}${NSM.RECEIVED_TIME.DAY}2400.csv'
> Jul 20 07:53:31 sparte1 syslog-ng[2557]: Destination timed out,
> reaping;
> template='/var/logs/${NSM.DEVICE:-Unknown_device}/${NSM.RECEIVED_TIME.YEAR}${NSM.RECEIVED_TIME.MONTH}${NSM.RECEIVED_TIME.DAY}2400.csv', filename='/var/logs/zidane2/201107202400.csv'
> Jul 20 07:53:31 sparte1 syslog-ng[2557]: Closing log transport fd;
> fd='24'
>
> I haven't found any information about the root cause of these timeouts.

These only indicate that syslog-ng is properly garbage-collecting
destination files that receive no data. This is not an error, which is
why you only see it if you enable --debug / --verbose (I'm not sure
which one).

> This seems to be a write problem.
> Before using syslog-ng, logs were processed by an application from
> the firewall vendor on the same type of hardware, without this great
> number of lost logs.

Is this UDP? syslog-ng doesn't increase UDP receive buffer sizes unless
explicitly told to, using the so-rcvbuf() option on the udp source. You
probably need to increase that.
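
For anyone who wants to see what the operating system actually grants when a larger receive buffer is requested, here is a minimal Python sketch. It does not involve syslog-ng at all; it only issues the same kind of SO_RCVBUF socket request that so-rcvbuf() stands for, and the function name is hypothetical. On Linux the kernel caps the request at net.core.rmem_max and reports back double the value it set, so the number read back may differ from the number asked for.

```python
import socket


def effective_rcvbuf(requested_bytes):
    """Request a UDP receive buffer size and report what the kernel granted.

    Returns (default_size, granted_size) in bytes as reported by the OS.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        default = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
        granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return default, granted


if __name__ == "__main__":
    wanted = 4 * 1024 * 1024  # 4MB, within the 3-5MB range suggested earlier
    default, granted = effective_rcvbuf(wanted)
    print(f"default={default} requested={wanted} granted={granted}")
```

If the granted size comes back well below the request, the system-wide limit probably needs raising before a larger so-rcvbuf() value can take effect.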

But I'd recommend not using UDP, as it can easily cause a lot of lost
messages. (If you create a simple loop that sends UDP frames to the
syslog receiver, you can easily see as much as 97% of messages lost!)
It's an easy DoS.

>
> If anybody has already faced this problem and has a solution, it
> would be wonderful.

There was a guy on this list who published charts and numbers showing
how he had to tune the various buffering options. I forgot his name, but
if you google "syslog-ng udp buffer size", you'll probably find it.


--
Bazsi


______________________________________________________________________________
Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
FAQ: http://www.balabit.com/wiki/syslog-ng-faq