Hi there, Thanks for the great syslog-ng program, it rocks. I have a step-by-step recipe below that yields unpredictable results in the syslog output, and this worries me a bit as in my company we have similar messages sent to syslog by PHP's print_r() function. The example is simplified to a message that consists of 40 occurences of an incrementing number, with a newline separating each number. Let's call it syslog_overflow.txt: ------------------------------------------------------------------------ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 ------------------------------------------------------------------------ Now let's send this message repeatedly using a simple Perl script. Let's call it syslog_overflow.pl: ------------------------------------------------------------------------ #!/usr/bin/perl use Sys::Syslog qw( :DEFAULT setlogsock ); setlogsock('unix'); # First argument is program name, used for passing vhost openlog(@ARGV[0], 'cons,pid', 'user'); $log = `cat syslog_overflow.txt`; while (1) { syslog('notice', $log); } closelog ------------------------------------------------------------------------ Run this script, and watch the syslog output, while filtering out the expected message, where newlines are converted to spaces: tail -f /var/log/syslog | grep -v ': 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39' You will notice unpredictable results like this: ------------------------------------------------------------------------ Mar 16 16:17:42 dockee jbq[12563]: 0 Mar 16 16:17:42 dockee 1 Mar 16 16:17:42 dockee 2 Mar 16 16:17:42 dockee 3 Mar 16 16:17:42 dockee 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 Mar 16 16:17:42 dockee jbq[12563]: 0 Mar 16 16:17:42 dockee 1 Mar 16 16:17:42 dockee 2 Mar 16 16:17:42 dockee 3 Mar 16 16:17:42 dockee 4 Mar 16 16:17:42 dockee 5 Mar 16 16:17:42 dockee 6 Mar 16 16:17:42 dockee 7 Mar 16 16:17:42 dockee 8 Mar 16 16:17:42 dockee 9 Mar 16 16:17:42 dockee 10 Mar 16 16:17:42 dockee 11 Mar 16 16:17:42 dockee 12 Mar 16 16:17:42 dockee 13 Mar 16 16:17:42 dockee 14 Mar 16 16:17:42 dockee 15 Mar 16 16:17:42 dockee 16 Mar 16 16:17:42 dockee 17 Mar 16 16:17:42 dockee 18 ------------------------------------------------------------------------ I could reproduce this behaviour easily on various hosts, with syslog-ng 2.0.0 up to 2.0.8 from a fresh git checkout. However, this problem does not appear with the legacy syslog daemon. I'd be glad to provide more details if necessary. I tried to fiddle with the various options: log_fetch_limit, log_msg_size, with no luck. I also activated the various debugging features and had a look at logreader.c, but couldn't find anything obvious. The only noteworthy debugging output I could get is the one below, which means the bogus lines are already present on the logreader side: ------------------------------------------------------------------------ Incoming log entry; line='<13>Mar 16 16:33:36 jbq[12638]: 0' Incoming log entry; line='1' Incoming log entry; line='2' Incoming log entry; line='3' Incoming log entry; line='4\x0a5\x0a6\x0a7\x0a8\x0a9\x0a10\x0a11\x0a12\x0a13\x0a14\x0a15\x0a16\x0a17\x0a18\x0a19\x0a20\x0a21\x0a22\x0a23\x0a24\x0a25\x0a26\x0a27\x0a28\x0a29\x0a30\x0a31\x0a32\x0a33\x0a34\x0a35\x0a36\x0a37\x0a38\x0a39\x0a' ------------------------------------------------------------------------ Thanks in advance for your help, -- Jean-Baptiste Quenot http://caraldi.com/jbq/blog/
On Sun, 2008-03-16 at 16:38 +0100, Jean-Baptiste Quenot wrote:
Hi there,
Thanks for the great syslog-ng program, it rocks.
I have a step-by-step recipe below that yields unpredictable results in the syslog output, and this worries me a bit as in my company we have similar messages sent to syslog by PHP's print_r() function. The example is simplified to a message that consists of 40 occurences of an incrementing number, with a newline separating each number. Let's call it syslog_overflow.txt:
------------------------------------------------------------------------ 0 1 2 3 ... ------------------------------------------------------------------------
Now let's send this message repeatedly using a simple Perl script. Let's call it syslog_overflow.pl:
------------------------------------------------------------------------ #!/usr/bin/perl
use Sys::Syslog qw( :DEFAULT setlogsock );
setlogsock('unix'); # First argument is program name, used for passing vhost openlog(@ARGV[0], 'cons,pid', 'user');
$log = `cat syslog_overflow.txt`;
while (1) { syslog('notice', $log); } closelog ------------------------------------------------------------------------
Run this script, and watch the syslog output, while filtering out the expected message, where newlines are converted to spaces:
tail -f /var/log/syslog | grep -v ': 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39'
You will notice unpredictable results like this:
------------------------------------------------------------------------ Mar 16 16:17:42 dockee jbq[12563]: 0 Mar 16 16:17:42 dockee 1 Mar 16 16:17:42 dockee 2 Mar 16 16:17:42 dockee 3 Mar 16 16:17:42 dockee 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 Mar 16 16:17:42 dockee jbq[12563]: 0 Mar 16 16:17:42 dockee 1 Mar 16 16:17:42 dockee 2 Mar 16 16:17:42 dockee 3 Mar 16 16:17:42 dockee 4 Mar 16 16:17:42 dockee 5 Mar 16 16:17:42 dockee 6 Mar 16 16:17:42 dockee 7 Mar 16 16:17:42 dockee 8 Mar 16 16:17:42 dockee 9 Mar 16 16:17:42 dockee 10 Mar 16 16:17:42 dockee 11 Mar 16 16:17:42 dockee 12 Mar 16 16:17:42 dockee 13 Mar 16 16:17:42 dockee 14 Mar 16 16:17:42 dockee 15 Mar 16 16:17:42 dockee 16 Mar 16 16:17:42 dockee 17 Mar 16 16:17:42 dockee 18 ------------------------------------------------------------------------
I could reproduce this behaviour easily on various hosts, with syslog-ng 2.0.0 up to 2.0.8 from a fresh git checkout. However, this problem does not appear with the legacy syslog daemon.
I'd be glad to provide more details if necessary. I tried to fiddle with the various options: log_fetch_limit, log_msg_size, with no luck. I also activated the various debugging features and had a look at logreader.c, but couldn't find anything obvious.
The only noteworthy debugging output I could get is the one below, which means the bogus lines are already present on the logreader side:
------------------------------------------------------------------------ Incoming log entry; line='<13>Mar 16 16:33:36 jbq[12638]: 0' Incoming log entry; line='1' Incoming log entry; line='2' Incoming log entry; line='3' Incoming log entry; line='4\x0a5\x0a6\x0a7\x0a8\x0a9\x0a10\x0a11\x0a12\x0a13\x0a14\x0a15\x0a16\x0a17\x0a18\x0a19\x0a20\x0a21\x0a22\x0a23\x0a24\x0a25\x0a26\x0a27\x0a28\x0a29\x0a30\x0a31\x0a32\x0a33\x0a34\x0a35\x0a36\x0a37\x0a38\x0a39\x0a' ------------------------------------------------------------------------
What driver are you using for /dev/log? unix-stream or unix-dgram? Can you check how your perl program actually _sends_ data to syslog-ng? e.g. can you strace it to check in what way does it call the send() system call? If you use unix-stream, syslog-ng will interpret NL and NUL characters as the end of message indicator, for unix-dgram NL characters can be embedded in messages and will be replaced by space in syslog-ng. Judging the debug output that you posted, I assume you are using unix-dgram (otherwise NL characters would not be included in the debugging output), in which case I guess it is perl which does it wrong. Again the strace output would help here. -- Bazsi
I use the default settings, ie unix-stream("/dev/log") defined in the s_all source. I ran the tests with unix-dgram instead, and it fixes the problem. Maybe the documentation at http://www.cs.utah.edu/~clake/docs/syslog-ng-2.0/#id2524704 may require some update? What about making dgram the default? Anyway thanks for the tip. Cheers, -- Jean-Baptiste Quenot http://caraldi.com/jbq/blog/
Hi,
I use the default settings, ie unix-stream("/dev/log") defined in the s_all source. I ran the tests with unix-dgram instead, and it fixes
There are no default settings. You're up to make your own config file as syslog-ng doesn't install any kind of default configuration AFAIK.
the problem. Maybe the documentation at http://www.cs.utah.edu/~clake/docs/syslog-ng-2.0/#id2524704 may require some update? What about making dgram the default?
There is no default as I pointed out above. If you ever want to build a reliable syslog infrastructure you have to omit UDP and unix-dgram. BTW you can send random data to syslog but before doing so sometimes it is handy to read at least the RFC about syslog. Some people think syslog has been designed to be a data transport. No, it hasn't. See RFC3164 Section 4.1.3 for reference. Your data definitely violates the syslog protocol so I won't expect predictable/ correct behaviour. I was able to reproduce your problem and the strace output showed that write() calls send whole lines. The problem is on the receiver's side. When syslog-ng is under strace the problem occurs more frequently. My assumption is that the stress testing program fills up the input buffer and syslog-ng reads to the first '\n'. As the test string does contain linefeed characters syslog-ng assumes that it is the end of the message and writes the partial string to the destination. If you haven't used the default facility/priority settings in your test script you would have notice the difference as syslog-ng applies user/notice to the logs which don't have fac/pri set. For UDP/ dgram there is no such problem as syslog-ng handles the whole packet in one iteration. When the dgram queue is full the logs would be dropped by the OS without any kind of notification. So using unix-dgram you won't see partial messages, but this doesn't mean that there are no problems. Regards, Sandor -------------------------------------------------------- NOTICE: If received in error, please destroy and notify sender. Sender does not intend to waive confidentiality or privilege. Use of this email is prohibited when received in error.
On Sun, 2008-03-16 at 16:38 +0100, Jean-Baptiste Quenot wrote:
Hi there,
Thanks for the great syslog-ng program, it rocks.
I have a step-by-step recipe below that yields unpredictable results in the syslog output, and this worries me a bit as in my company we have similar messages sent to syslog by PHP's print_r() function. The example is simplified to a message that consists of 40 occurences of an incrementing number, with a newline separating each number. Let's call it syslog_overflow.txt:
This seems to be a genuine bug in syslog-ng. It is related to the fact that syslog-ng tries to fetch multiple messages at a single poll iteration, and that reading EOF from a stream results in the remainder message to be flushed unconditonally. This is what happens: * the complete 0-39 sequence of numbers is sent to syslog-ng in a single "packet", which is read by syslog-ng as one packet * syslog-ng then reads an EOF while there are still some messages in the incoming queue But on the reception of an EOF, syslog-ng takes the remainder of the buffer as a single message. This assumption is wrong. The problem is in log_reader_fetch_log() message and the way EOF is handled. For some reason I can't reproduce the same problem with 2.1 and I still can't see why. I'll give it a spin to fix this. -- Bazsi
On Fri, 2008-03-21 at 12:21 +0100, Balazs Scheidler wrote:
On Sun, 2008-03-16 at 16:38 +0100, Jean-Baptiste Quenot wrote:
Hi there,
Thanks for the great syslog-ng program, it rocks.
I have a step-by-step recipe below that yields unpredictable results in the syslog output, and this worries me a bit as in my company we have similar messages sent to syslog by PHP's print_r() function. The example is simplified to a message that consists of 40 occurences of an incrementing number, with a newline separating each number. Let's call it syslog_overflow.txt:
This seems to be a genuine bug in syslog-ng.
It is related to the fact that syslog-ng tries to fetch multiple messages at a single poll iteration, and that reading EOF from a stream results in the remainder message to be flushed unconditonally.
This is what happens: * the complete 0-39 sequence of numbers is sent to syslog-ng in a single "packet", which is read by syslog-ng as one packet * syslog-ng then reads an EOF while there are still some messages in the incoming queue
But on the reception of an EOF, syslog-ng takes the remainder of the buffer as a single message. This assumption is wrong.
The problem is in log_reader_fetch_log() message and the way EOF is handled.
For some reason I can't reproduce the same problem with 2.1 and I still can't see why.
I'll give it a spin to fix this.
Can you check if this patch fixes it? diff --git a/src/logreader.c b/src/logreader.c index 11b84bb..a723b1c 100644 --- a/src/logreader.c +++ b/src/logreader.c @@ -416,10 +416,14 @@ log_reader_fetch_log(LogReader *self, FDRead *fd) msg_verbose("EOF occurred while reading", evt_tag_int(EVT_TAG_FD, fd->fd), NULL); - log_reader_iterate_buf(self, NULL, TRUE, &msg_count); - log_pipe_notify(self->control, &self->super.super, NC_CLOSE, self); + log_reader_iterate_buf(self, NULL, FALSE, &msg_count); g_sockaddr_unref(sa); - return FALSE; + if (self->ofs == 0) + { + log_pipe_notify(self->control, &self->super.super, NC_CLOSE, self); + return FALSE; + } + break; } else { I'm unable to test personally as my perl interpreter appends a '\0' character at the end of the log message and in that case (this is done by recent libc versions), syslog-ng considers it to be terminated by NUL, instead of NL. -- Bazsi
Dear Balazs, I incorporated your patch to both 2.1 and 2.2 taken from git, and I could still reproduce the problem. Note that your patch did not apply directly on either of those branches, I had to redo by hand. These are the versions I have: libc6 2.6.1-1ubuntu10 perl 5.8.8-7ubuntu3.1 Linux 2.6.22-14-generic Cheers, -- Jean-Baptiste Quenot http://caraldi.com/jbq/blog/
On Fri, 2008-03-21 at 15:45 +0100, Jean-Baptiste Quenot wrote:
Dear Balazs,
I incorporated your patch to both 2.1 and 2.2 taken from git, and I could still reproduce the problem. Note that your patch did not apply directly on either of those branches, I had to redo by hand.
These are the versions I have:
libc6 2.6.1-1ubuntu10 perl 5.8.8-7ubuntu3.1 Linux 2.6.22-14-generic
Hmm... I seem to have been wrong. This issue has nothing to do with EOF handling, rather it seems to be related to the fact that syslog-ng accepts both NL and NUL as line terminators, but it prefers NUL over NL. So if the current buffer contains a NUL character, it is used, if it does not NL is used instead. So if the following happens: message1\nmessage2\nmessage3\n\0 ^ | +---- this is the read position In this case message1 is processed normally, NL is taken as its line terminator. Then the remainder is read and the buffer is like: message2\nmessage3\n\0 ^ read position In this case NUL is spotted and the complete line "message2\nmessage3\n" is taken as the log message. This behaviour was changed in syslog-ng PE 2.1 (because of an unrelated change), but not yet in syslog-ng OSE 2.1 & 2.0 (because the same change was not commited to these branches). I'm somewhat reluctant to change the 2.0 branch, as this change is right in the middle of input processing, and I'd like to avoid regressions at all costs. So, I've just pushed a new change in syslog-ng 2.1, it includes the same message parser as the PE version, which drops the NUL preference and accepts either NUL or NL, whichever comes first. I reran your testcase and it fixes the problem for me. You can find the patch here: http://git.balabit.hu/?p=bazsi/syslog-ng-2.1.git;a=commit;h=4de7f364ce8f29eb... -- Bazsi
With the update the behavior is different, but there is still something wrong I think. This is what I did: 1) Run syslog-ng 2) Run syslog_overflow_test.pl 3) The following lines are written to syslog: Mar 25 15:46:02 dockee jbq[12238]: 0 Mar 25 15:46:05 dockee 1 Mar 25 15:46:05 dockee 2 Mar 25 15:46:05 dockee 3 Mar 25 15:46:05 dockee 4 Mar 25 15:46:05 dockee 5 Mar 25 15:46:05 dockee 6 Mar 25 15:46:05 dockee 7 Mar 25 15:46:05 dockee 8 Mar 25 15:46:05 dockee 9 Mar 25 15:46:05 dockee 10 Mar 25 15:46:05 dockee 11 Mar 25 15:46:05 dockee 12 Mar 25 15:46:05 dockee 13 Mar 25 15:46:05 dockee 14 Mar 25 15:46:05 dockee 15 Mar 25 15:46:05 dockee 16 Mar 25 15:46:05 dockee 17 Mar 25 15:46:05 dockee 18 Mar 25 15:46:05 dockee 19 Mar 25 15:46:05 dockee 20 Mar 25 15:46:05 dockee 21 Mar 25 15:46:05 dockee 22 Mar 25 15:46:05 dockee 23 Mar 25 15:46:05 dockee 24 Mar 25 15:46:05 dockee 25 Mar 25 15:46:05 dockee 26 Mar 25 15:46:05 dockee 27 Mar 25 15:46:05 dockee 28 Mar 25 15:46:05 dockee 29 Mar 25 15:46:05 dockee 30 Mar 25 15:46:05 dockee 31 This block of messages is repeated many times during a short period of 3 seconds, but then no output appears anymore, and syslog-ng reports an error: Destination queue full, dropping message; queue_len='8192', mem_fifo_size='8192' If I increase the mem_fifo_size to 32000, syslog-ng is hogging 101% (!) of the CPU, and nothing is logged... Also, I notice that now every number is on its own line. This is not necessarily a problem, but it is different from what we had before, ie all numbers on the same line. Best regards, -- Jean-Baptiste Quenot http://caraldi.com/jbq/blog/
On Tue, 2008-03-25 at 16:41 +0100, Jean-Baptiste Quenot wrote:
With the update the behavior is different, but there is still something wrong I think.
This is what I did:
1) Run syslog-ng 2) Run syslog_overflow_test.pl 3) The following lines are written to syslog:
Mar 25 15:46:02 dockee jbq[12238]: 0 Mar 25 15:46:05 dockee 1 Mar 25 15:46:05 dockee 2 ... Mar 25 15:46:05 dockee 31
This block of messages is repeated many times during a short period of 3 seconds, but then no output appears anymore, and syslog-ng reports an error:
Destination queue full, dropping message; queue_len='8192', mem_fifo_size='8192'
If I increase the mem_fifo_size to 32000, syslog-ng is hogging 101% (!) of the CPU, and nothing is logged...
Can you also post your configuration file? I can't seem to reproduce this problem.
Also, I notice that now every number is on its own line. This is not necessarily a problem, but it is different from what we had before, ie all numbers on the same line.
This is the intended behaviour. Either NL or NUL characters can terminate a message, but earlier if a NUL was found, the NL wasn't even looked for. -- Bazsi
Here it is attached, it comes initially from the Ubuntu package and I copied it to /usr/local/etc/syslog-ng.conf where the hand-compiled syslog-ng looks for its config file. Cheers, -- Jean-Baptiste Quenot http://caraldi.com/jbq/blog/
participants (3)
-
Balazs Scheidler
-
Geller, Sandor (IT)
-
Jean-Baptiste Quenot