Hello,

I have a FreeBSD 6.2 (amd64) host where I'd like to replace the stock syslogd with syslog-ng (3.0.2). This host receives a lot of syslog messages per second from a large number of clients via UDP.

The stock syslogd configuration is trivial:

*.* /var/log/all

This host currently drops about 2-4% of all UDP packets; syslogd takes about 50-65% of one CPU.

A drop-in replacement configuration using syslog-ng:

options { create_dirs(yes); use_dns(no); };
template t_default { template("${DATE} <${FACILITY}.${PRIORITY}> ${HOST} ${MSG}\n"); };
source s_standard { file("/dev/klog"); internal(); udp(); unix-dgram("/var/run/log"); };
destination d_all { file("/var/log/all" template(t_default) ); };
# *.* /var/log/all
log { source(s_standard); destination(d_all); };

syslog-ng uses about 90% of one CPU and drops between 15% and 20% of UDP packets (and, depending on traffic patterns, logfile size, concurrent logfile rotation, etc., even as high as 30%).

I've tried a number of things to improve this performance, including:

log_fetch_limit(100);
log_iw_size(10000);
flush_lines(100000);
flush_timeout(10);

in the global options, log_fifo_size(100000) in the destination definition, and flags(flow-control) in the log definition.

The best I was able to get with these numbers was 5-7% UDP drops (i.e. still double what the stock syslogd drops). I also tried adjusting "so_rcvbuf" for UDP with no noticeable difference.

Note that I did not do any sysctl tuning, as that should influence the stock syslogd and syslog-ng equally, and I'm trying to sort out why one performs so significantly better than the other.

Leaving aside any of the things I can do with syslog-ng further down the road (such as filtering and intentionally dropping certain messages), what can I do to get syslog-ng up to the same performance as the stock syslogd?

Many thanks in advance for any pointers and help.

-Jan
Hi,

On Sat, May 30, 2009 at 4:57 PM, Jan Schaumann <jschauma@netmeister.org> wrote:
Hello,
I have a FreeBSD 6.2 (amd64) host where I'd like to replace the stock syslogd with syslog-ng (3.0.2). This host receives a lot of syslog messages per second from a large number of clients via UDP.
The stock syslogd configuration is trivial:
*.* /var/log/all
This host currently drops about 2-4% of all UDP packets; syslogd takes about 50-65% of one CPU.
A drop-in replacement configuration using syslog-ng:
options { create_dirs(yes); use_dns(no); };
template t_default { template("${DATE} <${FACILITY}.${PRIORITY}> ${HOST} ${MSG}\n"); };
source s_standard { file("/dev/klog"); internal(); udp(); unix-dgram("/var/run/log"); };
destination d_all { file("/var/log/all" template(t_default) ); };
# *.* /var/log/all log { source(s_standard); destination(d_all); };
syslog-ng uses about 90% of one CPU and drops between 15% and 20% of UDP packets (and, depending on traffic patterns, logfile size, concurrent logfile rotation, etc., even as high as 30%).
This is somewhat expected, as syslog-ng parses incoming messages. So my guess is that syslog-ng can't drain the receive buffer fast enough, and the kernel simply drops messages that don't fit in the buffer. It would be good to know whether the source side or the destination side is the limiting factor. As you're using local files, my guess is the former.
I've tried a number of things to improve this performance, including:
log_fetch_limit(100);
This option should be raised when you've got a very busy source; it controls how many messages syslog-ng tries to read in one loop iteration.
log_iw_size(10000); flush_lines(100000); flush_timeout(10);
The flush* options are useful to tune destination performance.
in the global options and
log_fifo_size(100000)
in the destination definition
with
flags(flow-control)
in the log definition.
AFAIK with files/ UDP flow-control is a no-op.
The best I was able to get with these numbers was 5-7% of UDP drops (i.e. still double what the stock syslogd drops).
I also tried adjusting "so_rcvbuf" for UDP with no noticeable difference.
Increasing the receive buffer size helps when there are only a few log bursts/peaks. However, when seriously overloaded by incoming traffic, syslog-ng can't drain the receive buffer fast enough and the buffer ends up full; it's only a matter of time before this occurs.
Note that I did not do any sysctl tuning, as that should influence the stock syslogd and syslog-ng equally, and I'm trying to sort out why one performs so significantly better than the other.
There is a vast difference in functionality and therefore in behaviour. syslogd just transports the logs, while syslog-ng does a lot of additional work (like parsing and filtering) which requires CPU cycles.
Leaving aside any of the things I can do with syslog-ng further down the road (such as filtering and intentionally dropping certain messages), what can I do to get syslog-ng up to the same performance as the stock syslogd?
Unfortunately this can't happen. You can use the 'no-parse' option to skip the initial parsing of the messages, which could improve performance (see the sketch below). This means you can't use the template above, as those variables won't get defined.

Generally, when it comes to parsing, syslog-ng can be CPU-limited. In that case you should consider deploying multiple syslog servers and sharing the load. Ideally, flow control could be enabled on the client side as well (using TCP).

Regards,

Sandor
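A minimal sketch of what the no-parse variant could look like (assuming the 3.0 flags(no-parse) syntax; the simplified template is illustrative only, since parsed macros such as ${FACILITY} are no longer available):

source s_standard_raw {
    file("/dev/klog");
    internal();
    udp( flags(no-parse) );    # the whole datagram ends up in ${MSG}
    unix-dgram("/var/run/log");
};
template t_raw { template("${DATE} ${MSG}\n"); };    # only macros that don't depend on parsing
destination d_all_raw { file("/var/log/all" template(t_raw)); };
log { source(s_standard_raw); destination(d_all_raw); };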
Sandor Geller <Sandor.Geller@morganstanley.com> wrote:
This is somewhat expected, as syslog-ng parses incoming messages. So my guess is that syslog-ng can't drain the receive buffer fast enough, and the kernel simply drops messages that don't fit in the buffer.
Exactly.
It would be good to know whether the source side or the destination side is the limiting factor. As you're using local files, my guess is the former.
I'm quite sure the source side is the problem. I.e., I/O to the file on disk ought to be reasonably fast (otherwise the stock syslogd would have the same problems). As you noted, the additional processing that syslog-ng does for every message it receives seems to keep it from draining the buffers fast enough.
flags(flow-control)
in the log definition.
AFAIK with files/ UDP flow-control is a no-op.
Ah, good to know.
Unfortunately this can't happen. You can use the 'no-parse' option to skip initial parsing the messages which could improve performance. This means you can't use the template above as the variables won't get defined.
I'll have to give that a try, if only to determine what, if any, performance difference it causes.
Generally, when it comes to parsing, syslog-ng can be CPU-limited. In that case you should consider deploying multiple syslog servers and sharing the load. Ideally, flow control could be enabled on the client side as well (using TCP).
Yes, those are the long-term plans. :-) Well, we can't switch all clients to TCP, since many of them are network/storage devices etc. that are only capable of logging via UDP.

For the time being, though, I need to lay the groundwork for establishing syslog-ng as a suitable replacement for the stock syslogd used on our servers.

Thanks for your help. I'll keep poking at this...

-Jan
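For reference, a minimal client-side sketch of the TCP + flow-control idea; the loghost name, the port, and the source contents here are made up for illustration:

source s_local { unix-dgram("/var/run/log"); internal(); };
destination d_loghost { tcp("loghost.example.com" port(5140)); };    # hypothetical central server
log { source(s_local); destination(d_loghost); flags(flow-control); };

The idea is that back-pressure is applied on the client rather than messages being silently dropped on the wire or at the server's UDP socket.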
Out of curiosity, how many messages per second was the stock syslogd able to process with minimal loss? What method did you employ to determine loss?

I am setting up a similar logging solution with syslog-ng using the db-parser module, which takes considerable CPU. I plan on using Cisco server load balancing to round-robin across N syslog nodes to achieve zero loss, but I'm concerned that detecting loss will be difficult.

--Martin
Martin Holste <mcholste@gmail.com> wrote:
Out of curiosity, how many messages per second was the stock syslogd able to process with minimal loss?
Between 15K and 18K / s.
What method did you employ to determine loss?
Effectively:

n1=$(netstat -s -p udp | awk '/dropped due to full socket/ { print $1 }')
sleep 5
n2=$(netstat -s -p udp | awk '/dropped due to full socket/ { print $1 }')
echo "$(( (n2 - n1) / 5 )) drops/sec"

(In reality, there's a tool that works much like sar(1) does and I can query it for stats, but underneath it happens to use the above logic.)
I am setting up a similar logging solution with NG using the db-parser module which takes considerable CPU. I plan on using Cisco server load balancing to round-robin load balance on N number of syslog nodes to achieve zero loss
What's your plan for handling the messages on the N nodes? Will they all just log to their own filesystem, write to a shared filesystem, write into a database, forward to another system, ... ? -Jan
The plan, which is mostly through the proof-of-concept phase, is to pipe the pre-parsed and pre-formatted logs from syslog-ng to a Perl script which takes care of inserting the log lines into a database. On the surface, this sounds like I should just use the built-in SQL destination, but the Perl script also handles inserts into special tables which are indexed for the individual custom macros db-parser creates. The templates are stored in the database as well to keep track of how many and what kind of fields each log type (or class, in my system) has, and to arrange them for easy retrieval.

This all sounds fairly complicated (and maybe it is overcomplicated), but the goal is to not need a separate table for each type of log class I want to collect. For instance, there are many different types of firewall logs I want to collect, and I want to index the source IP, destination IP, etc. fields from each message in real time. I accomplish this by having four table templates in the schema: a main table with the log line metadata like timestamp, sender, etc., one which just contains the text of the message, one for integer fields (like IP and port), and one for strings (like interface). So, if a log line comes in with 8 fields, 4 integers and 4 strings, then there is an insert into the meta table, an insert into the message table, 4 inserts into the ints table, and 4 inserts into the strings table.

The tables themselves are split into a database per day and a table per hour per class. That means that when a client makes a query for a particular log class, only some of the tables have to be searched. That query pruning can make a huge difference.

Perl has to arrange these transactions, and right now I'm finding that by far the most efficient way of doing this is to write to tab-separated files on the hard disk (or a RAM disk if necessary) and do "LOAD DATA INFILE" commands in MySQL in a second, concurrent Perl script. This is incredibly efficient on the DB end, and it also has the additional benefit of accommodating bursts from syslog-ng that the database would normally not be able to handle. The only downside is that there is a few-second delay between when the log is received and when it is written to the database and available for query, but any alerting would have already taken place in the Perl script. I implemented this extra step lately because we were experiencing log loss due to a database insert bottleneck. The test system is an older Dell 32-bit 4-way server and it was attempting to handle about 15 Mbps of syslog from various firewall sources. I think it was something in the 2-3k msg/sec range; I'm trying to narrow that down now.

In any case, each node will receive 1/Nth of the total logs and will write to its own local database. The frontend will issue parallel queries to each node to achieve horizontal partitioning, and will be responsible for collating the query results to deliver to the end user. With just the one test node, I'm finding that you can request arbitrary text from 150 million messages and get the results in less than 1 second. Indexed fields are even faster (which is what I'm counting on for writing reporting later).

The plan right now is to eventually put this framework and frontend up on SourceForge when it becomes Alpha-quality. It will also include a frontend for creating db-parser XML patterns from example log files with just point-and-click. Suggestions/criticism are welcome for all I've mentioned.
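For what it's worth, a rough sketch of the syslog-ng side of such a pipeline; the script path and the template fields are made up for illustration, and s_net stands in for whatever network source is configured:

destination d_db_loader {
    # hypothetical loader script: reads tab-separated lines on stdin and
    # batches them into "LOAD DATA INFILE" jobs
    program("/usr/local/bin/db_loader.pl"
            template("${ISODATE}\t${HOST}\t${FACILITY}\t${PRIORITY}\t${MSG}\n"));
};
log { source(s_net); destination(d_db_loader); };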
Thanks,

Martin
On Sat, 2009-05-30 at 10:57 -0400, Jan Schaumann wrote:
Hello,
I have a FreeBSD 6.2 (amd64) host where I'd like to replace the stock syslogd with syslog-ng (3.0.2). This host receives a lot of syslog messages per second from a large number of clients via UDP.
The stock syslogd configuration is trivial:
*.* /var/log/all
This host currently drops about 2-4% of all UDP packets; syslogd takes about 50-65% of one CPU.
A drop-in replacement configuration using syslog-ng:
options { create_dirs(yes); use_dns(no); };
template t_default { template("${DATE} <${FACILITY}.${PRIORITY}> ${HOST} ${MSG}\n"); };
source s_standard { file("/dev/klog"); internal(); udp(); unix-dgram("/var/run/log"); };
destination d_all { file("/var/log/all" template(t_default) ); };
# *.* /var/log/all log { source(s_standard); destination(d_all); };
syslog-ng uses about 90% of one CPU and drops between 15% and 20% of UDP packets (and, depending on traffic patterns, logfile size, concurrent logfile rotation, etc., even as high as 30%).
Hmm... one possible problem is that syslog-ng wakes up too often, processes a small number of messages, and goes back to sleep. Since each poll iteration has its overhead, this might add up to be significant. You could perhaps play with time_sleep(); I'd go for 30 msec, which would limit syslog-ng to waking up about 30 times per second.

Then, make sure that you actually have a large enough UDP receive buffer. so_rcvbuf() might not be enough, as systems usually add further limits on the maximum per-socket receive buffer size. On Linux it is /proc/sys/net/core/rmem_{max,default}; here's what I've found using Google for FreeBSD:

http://www.29west.com/docs/THPM/udp-buffer-sizing.html

sysctl -w kern.ipc.maxsockbuf=<number>

I would increase the size of the receive buffer to at least 3-5 seconds worth of messages, so given you have 15k/sec, you'd need about 22MB of receive buffer space.
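A sketch of what that could look like, reusing the existing source definition; the concrete numbers are illustrative only (roughly 15k msg/sec x ~300 bytes/msg x 5 seconds ≈ 22MB):

# raise the kernel cap first (FreeBSD); it must be at least as large as the
# per-socket buffer requested below, e.g.:
#   sysctl -w kern.ipc.maxsockbuf=33554432
options { create_dirs(yes); use_dns(no); time_sleep(30); };   # wake up ~30 times/sec at most
source s_standard {
    file("/dev/klog");
    internal();
    udp( so_rcvbuf(23068672) );   # ~5 seconds of headroom at ~15k msg/sec
    unix-dgram("/var/run/log");
};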
I've tried a number of things to improve this performance, including:
log_fetch_limit(100); log_iw_size(10000); flush_lines(100000); flush_timeout(10);
fetch_limit() might be related: if you have only a small number of sources you could increase it, but don't forget to adjust the destination window size as well, as described in the documentation:

http://wwwen.balabit/dl/html/syslog-ng-admin-guide_en.html/ch02s12.html

The flush_* stuff should not matter that much.
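Roughly, the relationship could look like this; the numbers are only an example, and s_net is a stand-in for the udp() source:

source s_net {
    udp( log_fetch_limit(1000)     # messages read per poll iteration from this socket
         log_iw_size(10000) );     # initial window: messages allowed "in flight" from this source
};
destination d_all {
    # the output queue should be at least as large as the sum of the
    # log_iw_size() values of the sources feeding it
    file("/var/log/all" log_fifo_size(100000) template(t_default));
};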
in the global options and
log_fifo_size(100000)
in the destination definition
with
flags(flow-control)
in the log definition.
Contrary to what Sandor said, flow-control is not a no-op for udp() sources: if flow-control kicks in, syslog-ng will stop receiving udp() messages altogether, basically making the kernel drop them. Flow-control for destination files, however, is basically a no-op, as destination files are _always_ writable.
The best I was able to get with these numbers was 5-7% of UDP drops (i.e. still double what the stock syslogd drops).
I also tried adjusting "so_rcvbuf" for UDP with no noticeable difference.
Please check the kernel parameters.
Note that I did not do any sysctl tuning, as that should influence the stock syslogd and syslog-ng equally, and I'm trying to sort out why one performs so significantly better than the other.
The syslog-ng core can do about 130k msg/sec without writing anything to files, and about 70k msg/sec if you have a single destination file. However, it might have a latency that causes the udp() receive buffer to fill up. If you carefully size your udp() receive buffer you can probably achieve no message loss at about 15k msg/sec; I'd guess you'd be losing messages for sure if we're talking over 20k msg/sec.
Leaving aside any of the things I can do with syslog-ng further down the road (such as filtering and intentionally dropping certain messages), what can I do to get syslog-ng up to the same performance as the stock syslogd?
Many thanks in advance for any pointers and help.
-- Bazsi
Balazs Scheidler <bazsi@balabit.hu> wrote:
Hmm.. one possible problem is that syslog-ng wakes up too often processes a small number of messages and goes back to sleep. Since the poll iteration has its overhead, this might add up to be significant.
You could perhaps play with time_sleep(), I'd go for 30msecs which would limit syslog-ng to wake up at most 30 times per second.
That actually makes things a lot worse, as the buffers immediately fill up and aren't drained quickly enough.
Then, make sure that you actually have a large enough UDP receive buffer. so_rcvbuf() might not be enough, as systems usually add further limits on the maximum per-socket receive buffer size.
Yeah, that helps a lot. I had initially resisted making those changes as I was trying to see how/if I can tune syslog-ng to get the same performance as regular syslog without any outside changes.
fetch_limit() might be related, if you have only a small number of sources, you could increase that, but don't forget to adjust the destination window size, as described in the documentation:
That helps, too.
syslog-ng core can do about 130k msg/sec without writing things to files, and about 70k/sec if you have a single destination file. however it might have a latency that causes the udp() receive buffer to fill up. If you carefully size your udp() receive buffer you can probably achieve no message losses for about 15k msg/sec.
With the above changes (and the fix for Bug #49, thank you very much), I'm now getting syslog-ng down to between 2% and 4% UDP drops, which is about the same as the stock syslogd was.

Well, it's slightly worse, since syslog-ng is now intentionally dropping a large number of messages that the stock syslogd is dutifully writing to disk. Also, the increased buffer size of course also makes the stock syslogd more performant, but for now the above should be acceptable until I have load balancing added.

Thanks!

-Jan
On Wed, 2009-06-03 at 14:20 -0400, Jan Schaumann wrote:
Balazs Scheidler <bazsi@balabit.hu> wrote:
Hmm.. one possible problem is that syslog-ng wakes up too often processes a small number of messages and goes back to sleep. Since the poll iteration has its overhead, this might add up to be significant.
You could perhaps play with time_sleep(), I'd go for 30msecs which would limit syslog-ng to wake up at most 30 times per second.
That actually makes things a lot worse, as the buffers immediately fill up and aren't drained quickly enough.
hmm, if you size your input buffer large enough, it shouldn't be an issue, and the CPU usage of syslog-ng should go down significantly.
Then, make sure that you actually have a large enough UDP receive buffer. so_rcvbuf() might not be enough, as systems usually add further limits on the maximum per-socket receive buffer size.
Yeah, that helps a lot. I had initially resisted making those changes as I was trying to see how/if I can tune syslog-ng to get the same performance as regular syslog without any outside changes.
syslog-ng has larger latencies than stock syslogd, since it watches several input file descriptors whereas syslogd only has to care about one UDP socket. Also, syslog-ng uses a generic I/O framework for managing all its I/O-related events, whereas syslogd probably uses a plain simple select() to query its inputs. That poll() iteration is what needs more CPU, especially if it runs several thousand times per second.

The output part of syslog-ng is also non-blocking, whereas syslogd usually sends the message to its output in blocking mode (since UDP sockets never block and files cannot be used in non-blocking mode).

All in all, we have CPU overhead in the poll() iteration, and more latency because of the non-blocking I/O. I wouldn't think that the message parsing is the culprit here; I made a serious effort to optimize that (although that was more than a year ago, so cruft might have gathered since then). And latency is what causes the udp() source to drop messages, especially if the input socket buffer is not large enough.
fetch_limit() might be related, if you have only a small number of sources, you could increase that, but don't forget to adjust the destination window size, as described in the documentation:
That helps, too.
syslog-ng core can do about 130k msg/sec without writing things to files, and about 70k/sec if you have a single destination file. however it might have a latency that causes the udp() receive buffer to fill up. If you carefully size your udp() receive buffer you can probably achieve no message losses for about 15k msg/sec.
With the above changes (and the fix for Bug #49, thank you very much), I'm now getting syslog-ng down to between 2% and 4% UDP drops, which is about the same as the stock syslogd was.
Well, it's slightly worse, since syslog-ng is now intentionally dropping a large number of messages that the stock syslogd is dutifully writing to disk. Also, the increased buffer size of course also makes the stock syslogd more performant, but for now the above should be acceptable until I have load balancing added.
-- Bazsi
participants (4)
- Balazs Scheidler
- Jan Schaumann
- Martin Holste
- Sandor Geller