Odd behavior with very small log_fifo_size()
Hi, I was doing a little bit of testing with syslog-ng today, and I think that with a very small log_fifo_size() it was dropping messages but not telling me. If I set my log_fifo_size to 1, here are the results: [jshaw@ars1dev7 ~/syslog-ng-2.0.9/tests/loggen]$ ./loggen -r 1000000 -s 1024000 -I 60 localhost 2000 average rate = 96.48 msg/sec, count=5789 [jshaw@ars1dev7 ~/syslog-ng-2.0.9/tests/loggen]$ ls -lh /ita/jshaw/messages -rw------- 1 jshaw ita 101M Jul 1 13:30 /ita/jshaw/messages However, 1024000 * 5789 = 5.9 GB Grepping for drop messages in the output file: [jshaw@ars1dev7 ~/syslog-ng-2.0.9/tests/loggen]$ grep -i drop /ita/jshaw/messages returns nothing. I have stats_freq(5) in my config file, and that messages file is the only output. I do have other statistical lines, which imply to me that no messages are dropped: Jul 1 13:49:52 s_local@ars1dev7 syslog-ng[6344]: Log statistics; processed='center(queued)=6045', processed='center(received)=6045', processed='destination(d_file)=6045', processed='source(s_local)=6045' But it seems clear to me that they are -- or loggen is lying. With more normal log_fifo_sizes (100, 2000) the throughput rates from loggen are lower and the file sizes match. Any ideas? Thanks, Joe
On Tue, 2008-07-01 at 13:52 -0400, Joe Shaw wrote:
Hi,
I was doing a little bit of testing with syslog-ng today, and I think that with a very small log_fifo_size() it was dropping messages but not telling me.
If I set my log_fifo_size to 1, here are the results:
[jshaw@ars1dev7 ~/syslog-ng-2.0.9/tests/loggen]$ ./loggen -r 1000000 -s 1024000 -I 60 localhost 2000 average rate = 96.48 msg/sec, count=5789 [jshaw@ars1dev7 ~/syslog-ng-2.0.9/tests/loggen]$ ls -lh /ita/jshaw/messages -rw------- 1 jshaw ita 101M Jul 1 13:30 /ita/jshaw/messages
However, 1024000 * 5789 = 5.9 GB
Grepping for drop messages in the output file:
[jshaw@ars1dev7 ~/syslog-ng-2.0.9/tests/loggen]$ grep -i drop /ita/jshaw/messages
returns nothing. I have stats_freq(5) in my config file, and that messages file is the only output. I do have other statistical lines, which imply to me that no messages are dropped:
Jul 1 13:49:52 s_local@ars1dev7 syslog-ng[6344]: Log statistics; processed='center(queued)=6045', processed='center(received)=6045', processed='destination(d_file)=6045', processed='source(s_local)=6045'
But it seems clear to me that they are -- or loggen is lying.
With more normal log_fifo_sizes (100, 2000) the throughput rates from loggen are lower and the file sizes match.
syslog-ng does not log message loss on file destinations, because that'd clutter the log statistics line, on the asssumption that no messages are can be lost there. However this is not always the case, if you log_fifo_size() is smaller than the window size of incoming sources, messages can be lost. Here is a description of how flow-control and window sizes work in syslog-ng: http://www.balabit.hu/dl/html/syslog-ng-admin-guide_en.html/ch08s03.html Please let me (and the doc team) know if the description could be made clearer somehow. -- Bazsi
Hi, On Wed, Jul 2, 2008 at 5:13 AM, Balazs Scheidler <bazsi@balabit.hu> wrote:
syslog-ng does not log message loss on file destinations, because that'd clutter the log statistics line, on the asssumption that no messages are can be lost there.
However this is not always the case, if you log_fifo_size() is smaller than the window size of incoming sources, messages can be lost.
Could something be added if -v were passed to syslog-ng then? It's (very) useful diagnostic information when you're tuning an installation and I probably wouldn't have noticed if not for the data that loggen outputs and the fact that the log file seemed small.
Here is a description of how flow-control and window sizes work in syslog-ng:
http://www.balabit.hu/dl/html/syslog-ng-admin-guide_en.html/ch08s03.html
Please let me (and the doc team) know if the description could be made clearer somehow.
I hadn't seen this before -- I was only using the docs shipped with the tarball -- and it's good information. My setting of the log_fifo_size() in testing was for two reasons: (1) to see how setting it affected memory usage when being hit with many large messages and (2) to demonstrably see what its behavior was if it was overloaded and had to drop messages. I would have liked a little more info in the latter. As for the docs, it does mention that without the flow control messages may be lost, but doesn't indicate how this might be detected or what happens if they are. Thanks, Joe
On Wed, 2008-07-02 at 13:21 -0400, Joe Shaw wrote:
Hi,
On Wed, Jul 2, 2008 at 5:13 AM, Balazs Scheidler <bazsi@balabit.hu> wrote:
syslog-ng does not log message loss on file destinations, because that'd clutter the log statistics line, on the asssumption that no messages are can be lost there.
However this is not always the case, if you log_fifo_size() is smaller than the window size of incoming sources, messages can be lost.
Could something be added if -v were passed to syslog-ng then? It's (very) useful diagnostic information when you're tuning an installation and I probably wouldn't have noticed if not for the data that loggen outputs and the fact that the log file seemed small.
Here is a description of how flow-control and window sizes work in syslog-ng:
http://www.balabit.hu/dl/html/syslog-ng-admin-guide_en.html/ch08s03.html
Please let me (and the doc team) know if the description could be made clearer somehow.
I hadn't seen this before -- I was only using the docs shipped with the tarball -- and it's good information
I know I should've removed that old documentation from the source tarball. It is unmaintained there, and false information is worse than no information at all. So that's what I'm going to do, remove the doc and add a pointer to the website. (or maybe add a txt version if I can figure out how to do that).
.
My setting of the log_fifo_size() in testing was for two reasons: (1) to see how setting it affected memory usage when being hit with many large messages and (2) to demonstrably see what its behavior was if it was overloaded and had to drop messages. I would have liked a little more info in the latter.
As for the docs, it does mention that without the flow control messages may be lost, but doesn't indicate how this might be detected or what happens if they are.
Without flow control messages are lost when the output FIFO size is full. That's basically the only possibility. It even displays a debug message in this case: if (!log_queue_push_tail(self->queue, lm, path_flags)) { /* drop incoming message, we must ack here, otherwise the sender might * block forever, however this should not happen unless the sum of * window_sizes of sources feeding this writer exceeds log_fifo_size * or if flow control is not turned on. */ /* we don't send a message here since the system is draining anyway */ if (self->dropped_messages) (*self->dropped_messages)++; msg_debug("Destination queue full, dropping message", evt_tag_int("queue_len", log_queue_get_length(self->queue)), evt_tag_int("mem_fifo_size", self->options->mem_fifo_size), NULL); log_msg_drop(lm, path_flags); return; } -- Bazsi
Balazs Scheidler (bazsi@balabit.hu) wrote on 2 July 2008 11:13:
syslog-ng does not log message loss on file destinations, because that'd clutter the log statistics line, on the asssumption that no messages are can be lost there.
However this is not always the case, if you log_fifo_size() is smaller than the window size of incoming sources, messages can be lost.
Here is a description of how flow-control and window sizes work in syslog-ng:
http://www.balabit.hu/dl/html/syslog-ng-admin-guide_en.html/ch08s03.html
Please let me (and the doc team) know if the description could be made clearer somehow.
I found this part confusing: The syslog-ng application monitors (polls) the sources defined in its configuration file, periodically checking each source for messages. When a log message is found in one of the sources, syslog-ng polls every source and reads the available messages. This means it breaks the periodicity of the checks, because when it finds something in one source it checks the others immediately. Is this true? If it is it doesn't look very useful because each source should be polled frequently enough independently of the others.
On Sun, 2008-07-20 at 12:59 -0300, Carlos Carvalho wrote:
Balazs Scheidler (bazsi@balabit.hu) wrote on 2 July 2008 11:13:
syslog-ng does not log message loss on file destinations, because that'd clutter the log statistics line, on the asssumption that no messages are can be lost there.
However this is not always the case, if you log_fifo_size() is smaller than the window size of incoming sources, messages can be lost.
Here is a description of how flow-control and window sizes work in syslog-ng:
http://www.balabit.hu/dl/html/syslog-ng-admin-guide_en.html/ch08s03.html
Please let me (and the doc team) know if the description could be made clearer somehow.
I found this part confusing:
The syslog-ng application monitors (polls) the sources defined in its configuration file, periodically checking each source for messages. When a log message is found in one of the sources, syslog-ng polls every source and reads the available messages.
This means it breaks the periodicity of the checks, because when it finds something in one source it checks the others immediately. Is this true? If it is it doesn't look very useful because each source should be polled frequently enough independently of the others.
You are right, this is a bit confusing. syslog-ng polls its inputs independently from each other, using the poll() system call. poll() blocks until no input is available, and wakes up when something becomes available _OR_ a timer elapses. Normal sources (network connections or pipes) are polled using the first method, but others like file sources, are polled using timers: syslog-ng periodically checks whether the files in question have grown in size or not. Now, poll() can return that multiple sources have inputs available. All of these are processed in the same poll loop iteration. File sources on the other hand are checked whenever poll wakes up: either because the follow-freq() timer of the file source elapsed or because another source indicated that data is available. This is because checking a file whether it has grown is simple and less CPU intensive, than waiting for another poll loop. Therefore file-size checking is always performed regardless how much time elapsed. This way normal sources are treated equally. File sources are checked every time when poll wakes up, or the follow-freq timer elapses. -- Bazsi
participants (3)
-
Balazs Scheidler
-
carlos@fisica.ufpr.br
-
Joe Shaw