Syslog-NG, log4j, tabs, whitespace, indent
Hello, I believe that the log4j SyslogAppender is sending logs to syslog-ng that begin with a '\t' (or four spaces). The leading whitespace is trimmed either because it's not allowed per the syslog RFC, or for some other (hopefully) good reason. Can I turn this behavior off? an example would be a message: Dec 10 16:17:52 testbox at java.lang.Thread.run(Thread.java:619) which should be Dec 10 16:17:52 testbox \tat java.lang.Thread.run(Thread.java:619) Thanks, John Skopis Systems Administrator
On Wed, 2008-12-10 at 15:30 -0600, John Skopis wrote:
Hello, I believe that the log4j SyslogAppender is sending logs to syslog-ng that begin with a '\t' (or four spaces).
The leading whitespace is trimmed either because it's not allowed per the syslog RFC, or for some other (hopefully) good reason.
Can I turn this behavior off?
an example would be a message: Dec 10 16:17:52 testbox at java.lang.Thread.run(Thread.java:619) which should be Dec 10 16:17:52 testbox \tat java.lang.Thread.run(Thread.java:619)
I've just checked with syslog-ng OSE 2.1, and the tabs stay where they are. It should be the same with syslog-ng OSE 2.0 Can you strace/tcpdump the exact traffic as it is received by syslog-ng? -- Bazsi
Bazsi, Strace: recvfrom(5, "<147> at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)", 8192, 0, {sa_family=AF_INET, sin_port=htons(41277), sin_addr=inet_addr("1.2.3.4")}, [16]) = 75 write(14, "Dec 10 17:52:04 1.2.3.4 at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)\n", 97) = 97 recvfrom(5, "<147> at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)", 8192, 0, {sa_family=AF_INET, sin_port=htons(41277), sin_addr=inet_addr("1.2.3.4")}, [16]) = 75 write(14, "Dec 10 17:52:04 1.2.3.4 at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)\n", 97) = 97 Let me know if you need tcpdump still (I started the cap but would like to anonymize it and/or send it to you off-list). Thanks, John Skopis Systems Administrator Backstop Solutions Group
-----Original Message----- From: syslog-ng-bounces@lists.balabit.hu [mailto:syslog-ng- bounces@lists.balabit.hu] On Behalf Of Balazs Scheidler Sent: Wednesday, December 10, 2008 4:38 PM To: Syslog-ng users' and developers' mailing list Subject: Re: [syslog-ng] Syslog-NG, log4j, tabs, whitespace, indent
On Wed, 2008-12-10 at 15:30 -0600, John Skopis wrote:
Hello, I believe that the log4j SyslogAppender is sending logs to syslog-ng that begin with a '\t' (or four spaces).
The leading whitespace is trimmed either because it's not allowed per the syslog RFC, or for some other (hopefully) good reason.
Can I turn this behavior off?
an example would be a message: Dec 10 16:17:52 testbox at java.lang.Thread.run(Thread.java:619) which should be Dec 10 16:17:52 testbox \tat java.lang.Thread.run(Thread.java:619)
I've just checked with syslog-ng OSE 2.1, and the tabs stay where they are. It should be the same with syslog-ng OSE 2.0
Can you strace/tcpdump the exact traffic as it is received by syslog- ng?
-- Bazsi
_______________________________________________________________________ _______ Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng FAQ: http://www.campin.net/syslog-ng/faq.html
On Wed, 2008-12-10 at 16:55 -0600, John Skopis wrote:
Bazsi,
Strace: recvfrom(5, "<147> at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)", 8192, 0, {sa_family=AF_INET, sin_port=htons(41277), sin_addr=inet_addr("1.2.3.4")}, [16]) = 75 write(14, "Dec 10 17:52:04 1.2.3.4 at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)\n", 97) = 97 recvfrom(5, "<147> at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)", 8192, 0, {sa_family=AF_INET, sin_port=htons(41277), sin_addr=inet_addr("1.2.3.4")}, [16]) = 75 write(14, "Dec 10 17:52:04 1.2.3.4 at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)\n", 97) = 97
Let me know if you need tcpdump still (I started the cap but would like to anonymize it and/or send it to you off-list).
Ah, this means that there's no tab at the beginning of the message, but rather it is a space. syslog-ng skips spaces if the syslog header is not complete. Try enabling setHeader(true) on your SyslogAppender class. This will probably make log4j emit a complete syslog header, in which case syslog-ng will not drop the preceding spaces. Please let me know if this works for you. -- Bazsi
-----Original Message----- From: syslog-ng-bounces@lists.balabit.hu [mailto:syslog-ng- bounces@lists.balabit.hu] On Behalf Of Balazs Scheidler Sent: Thursday, December 11, 2008 7:21 AM To: Syslog-ng users' and developers' mailing list Subject: Re: [syslog-ng] Syslog-NG, log4j, tabs, whitespace, indent
On Wed, 2008-12-10 at 16:55 -0600, John Skopis wrote:
Bazsi,
Strace: recvfrom(5, "<147> at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)", 8192, 0, {sa_family=AF_INET, sin_port=htons(41277), sin_addr=inet_addr("1.2.3.4")}, [16]) = 75 write(14, "Dec 10 17:52:04 1.2.3.4 at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)\n", 97) = 97 recvfrom(5, "<147> at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)", 8192, 0, {sa_family=AF_INET, sin_port=htons(41277), sin_addr=inet_addr("1.2.3.4")}, [16]) = 75 write(14, "Dec 10 17:52:04 1.2.3.4 at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)\n", 97) = 97
Let me know if you need tcpdump still (I started the cap but would like to anonymize it and/or send it to you off-list).
Ah, this means that there's no tab at the beginning of the message, but rather it is a space. syslog-ng skips spaces if the syslog header is not complete.
Try enabling setHeader(true) on your SyslogAppender class. This will probably make log4j emit a complete syslog header, in which case syslog-ng will not drop the preceding spaces.
Please let me know if this works for you.
Well, I tried setting Header=true but it did nothing. Actually, I think that it *is* setting the header for the first line of output, however as a stack trace is a log message spanning multiple lines it only sets the header for the first line. Can I request a feature like the one is this patch? I am not sure of the implications in keeping the whitespace. It doesn't appear to break anything for me...but then again I am only using the $HOST macro.. I admit that it is a bit of a hack...perhaps the original logic was meant to replace the first space, which acts a field separator, not trim leading whitespace chars? Also, scuse' the indentation in the patch -- clearly my vi settings don't match your indentation rules ;] Thanks diff -uar syslog-ng-2.0.9.pristine/src/cfg-lex.c syslog-ng-2.0.9/src/cfg-lex.c --- syslog-ng-2.0.9.pristine/src/cfg-lex.c 2008-03-23 15:42:01.000000000 -0500 +++ syslog-ng-2.0.9/src/cfg-lex.c 2008-12-11 09:59:44.000000000 -0600 @@ -2074,6 +2074,9 @@ return LRO_NOPARSE; if (strcmp(flag, "kernel") == 0) return LRO_KERNEL; + if (strcmp(flag, "keep-whitespace") == 0 || strcmp(flag, "keep_whitespace") == 0); + return LR_KEEP_WHITESPACE; + msg_error("Unknown parse flag", evt_tag_str("flag", flag), NULL); return 0; } diff -uar syslog-ng-2.0.9.pristine/src/cfg-lex.l syslog-ng-2.0.9/src/cfg-lex.l --- syslog-ng-2.0.9.pristine/src/cfg-lex.l 2007-12-18 09:25:14.000000000 -0600 +++ syslog-ng-2.0.9/src/cfg-lex.l 2008-12-11 09:56:06.000000000 -0600 @@ -285,6 +285,8 @@ return LRO_NOPARSE; if (strcmp(flag, "kernel") == 0) return LRO_KERNEL; + if (strcmp(flag, "keep-whitespace") == 0 || strcmp(flag, "keep_whitespace") == 0); + return LR_KEEP_WHITESPACE; msg_error("Unknown parse flag", evt_tag_str("flag", flag), NULL); return 0; } diff -uar syslog-ng-2.0.9.pristine/src/logmsg.c syslog-ng-2.0.9/src/logmsg.c --- syslog-ng-2.0.9.pristine/src/logmsg.c 2007-12-29 11:39:33.000000000 -0600 +++ syslog-ng-2.0.9/src/logmsg.c 2008-12-11 10:01:35.000000000 -0600 @@ -190,11 +190,13 @@ } - while (left && *src == ' ') - { /* Move past whitespace */ - src++; - left--; - } + if ( ! (flags & LP_KEEP_WHITESPACE) ) { + while (left && *src == ' ') + { /* Move past whitespace */ + src++; + left--; + } + }; /* If the next chars look like a date, then read them as a date. */ if (left >= 19 && src[4] == '-' && src[7] == '-' && src[10] == 'T' && src[13] == ':' && src[16] == ':') diff -uar syslog-ng-2.0.9.pristine/src/logmsg.h syslog-ng-2.0.9/src/logmsg.h --- syslog-ng-2.0.9.pristine/src/logmsg.h 2008-02-13 13:07:00.000000000 -0600 +++ syslog-ng-2.0.9/src/logmsg.h 2008-12-11 09:54:09.000000000 -0600 @@ -39,6 +39,8 @@ #define LP_CHECK_HOSTNAME 0x0008 #define LP_STRICT 0x0010 #define LP_KERNEL 0x0020 +#define LP_KEEP_WHITESPACE 0x0030 + #define LF_UNPARSED 0x0001 #define LF_INTERNAL 0x0002 diff -uar syslog-ng-2.0.9.pristine/src/logreader.c syslog-ng-2.0.9/src/logreader.c --- syslog-ng-2.0.9.pristine/src/logreader.c 2008-03-23 15:35:25.000000000 -0500 +++ syslog-ng-2.0.9/src/logreader.c 2008-12-11 10:06:22.000000000 -0600 @@ -246,6 +246,8 @@ parse_flags |= LP_INTERNAL; if (self->flags & LR_LOCAL) parse_flags |= LF_LOCAL; + if (self->options->options & LR_KEEP_WHITESPACE) + parse_flags |= LP_KEEP_WHITESPACE; if ((self->flags & LR_PKTTERM) || (!eol && (self->ofs == self->options->msg_size)) || diff -uar syslog-ng-2.0.9.pristine/src/logreader.h syslog-ng-2.0.9/src/logreader.h --- syslog-ng-2.0.9.pristine/src/logreader.h 2008-01-26 09:15:14.000000000 -0600 +++ syslog-ng-2.0.9/src/logreader.h 2008-12-11 09:55:52.000000000 -0600 @@ -36,6 +36,7 @@ #define LR_NOMREAD 0x0008 #define LR_FOLLOW 0x0010 #define LR_STRICT 0x0020 +#define LR_KEEP_WHITESPACE 0x0030 #define LR_COMPLETE_LINE 0x0100 diff -uar syslog-ng-2.0.9.pristine/syslog-ng.spec.bb syslog-ng-2.0.9/syslog-ng.spec.bb --- syslog-ng-2.0.9.pristine/syslog-ng.spec.bb 2008-03-23 15:41:57.000000000 -0500 +++ syslog-ng-2.0.9/syslog-ng.spec.bb 2008-12-11 10:05:09.000000000 -0600 @@ -14,6 +14,9 @@ BuildRoot: %{_tmppath}/%{name}-root BuildRequires: bison, flex, gcc-c++, pkgconfig, glib2-devel, libevtlog-devel Provides: syslog + +Patch1: keep_ws.patch + #BuildConflicts: #Exclusivearch: i386 @@ -30,6 +33,7 @@ %prep %setup -q -n syslog-ng-%{version} +%patch1 -p1 %build
On Thu, 2008-12-11 at 10:37 -0600, John Skopis wrote:
-----Original Message----- From: syslog-ng-bounces@lists.balabit.hu [mailto:syslog-ng- bounces@lists.balabit.hu] On Behalf Of Balazs Scheidler Sent: Thursday, December 11, 2008 7:21 AM To: Syslog-ng users' and developers' mailing list Subject: Re: [syslog-ng] Syslog-NG, log4j, tabs, whitespace, indent
On Wed, 2008-12-10 at 16:55 -0600, John Skopis wrote:
Bazsi,
Strace: recvfrom(5, "<147> at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)", 8192, 0, {sa_family=AF_INET, sin_port=htons(41277), sin_addr=inet_addr("1.2.3.4")}, [16]) = 75 write(14, "Dec 10 17:52:04 1.2.3.4 at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)\n", 97) = 97 recvfrom(5, "<147> at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)", 8192, 0, {sa_family=AF_INET, sin_port=htons(41277), sin_addr=inet_addr("1.2.3.4")}, [16]) = 75 write(14, "Dec 10 17:52:04 1.2.3.4 at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)\n", 97) = 97
Let me know if you need tcpdump still (I started the cap but would like to anonymize it and/or send it to you off-list).
Ah, this means that there's no tab at the beginning of the message, but rather it is a space. syslog-ng skips spaces if the syslog header is not complete.
Try enabling setHeader(true) on your SyslogAppender class. This will probably make log4j emit a complete syslog header, in which case syslog-ng will not drop the preceding spaces.
Please let me know if this works for you.
Well, I tried setting Header=true but it did nothing. Actually, I think that it *is* setting the header for the first line of output, however as a stack trace is a log message spanning multiple lines it only sets the header for the first line.
hmm is log4j using TCP or UDP to send the message? I assume it is TCP. In TCP mode NL characters are the record terminator characters, so as it seems log4j generates a multiline message with only one header and several embedded NL characters. syslog-ng interprets this as a series of log messages, the first one with a syslog header the rest without one. Therefore space removal kicks in, the result is what you see. Skipping the white-space removal is a band-aid solution, the real problem is that log4j uses multiline messages which syslog-ng does not support. The syslog-ng 3.0 branch already supports multi-line messages if the underlying transport supports those: either UDP or the new syslog-protocol. Are these real alternatives for you? I guess UDP is out of question because it may easily lose messages. The new syslog-protocol is not yet supported by log4j, however I know someone who implemented initial support for that. Would that be ok for you to switch over to use syslog-ng 3.0 and syslog-protocol? -- Bazsi
-----Original Message----- From: syslog-ng-bounces@lists.balabit.hu [mailto:syslog-ng- bounces@lists.balabit.hu] On Behalf Of Balazs Scheidler Sent: Monday, December 29, 2008 6:01 AM To: Syslog-ng users' and developers' mailing list Subject: Re: [syslog-ng] Syslog-NG, log4j, tabs, whitespace, indent
-----Original Message----- From: syslog-ng-bounces@lists.balabit.hu [mailto:syslog-ng- bounces@lists.balabit.hu] On Behalf Of Balazs Scheidler Sent: Thursday, December 11, 2008 7:21 AM To: Syslog-ng users' and developers' mailing list Subject: Re: [syslog-ng] Syslog-NG, log4j, tabs, whitespace, indent
On Wed, 2008-12-10 at 16:55 -0600, John Skopis wrote:
Bazsi,
Strace: recvfrom(5, "<147> at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)", 8192, 0, {sa_family=AF_INET, sin_port=htons(41277), sin_addr=inet_addr("1.2.3.4")}, [16]) = 75 write(14, "Dec 10 17:52:04 1.2.3.4 at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)\n",
= 97
recvfrom(5, "<147> at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)", 8192, 0, {sa_family=AF_INET, sin_port=htons(41277), sin_addr=inet_addr("1.2.3.4")}, [16]) = 75 write(14, "Dec 10 17:52:04 1.2.3.4 at org.jgroups.util.TimeScheduler$Loop.run(TimeScheduler.java:141)\n",
On Thu, 2008-12-11 at 10:37 -0600, John Skopis wrote: 97) 97)
= 97
Let me know if you need tcpdump still (I started the cap but
would like to anonymize it and/or send it to you off-list).
Ah, this means that there's no tab at the beginning of the message, but rather it is a space. syslog-ng skips spaces if the syslog header is not complete.
Try enabling setHeader(true) on your SyslogAppender class. This will probably make log4j emit a complete syslog header, in which case syslog-ng will not drop the preceding spaces.
Please let me know if this works for you.
Well, I tried setting Header=true but it did nothing. Actually, I think that it *is* setting the header for the first line of output, however as a stack trace is a log message spanning multiple lines it only sets the header for the first line.
hmm is log4j using TCP or UDP to send the message? I assume it is TCP.
The log4j version we are using does not support tcp. There is a nlog4j project but I am not sure we would to, or even be able to switch. I am not sure if any of this has changed in JBoss 5 as we are still on 4.
In TCP mode NL characters are the record terminator characters, so as it seems log4j generates a multiline message with only one header and several embedded NL characters. syslog-ng interprets this as a series of log messages, the first one with a syslog header the rest without one.
Is this also true for UDP? Is the multi-line message sent as multiple syslog datagrams (1 line = 1 datagram) or is it a single datagram with embedded newlines? I will need to check a packet capture to confirm.
Therefore space removal kicks in, the result is what you see.
Skipping the white-space removal is a band-aid solution, the real problem is that log4j uses multiline messages which syslog-ng does not support.
Agreed. It works though, in this case at least. I still don't understand why exactly the whitespace needs to be trimmed at all but I trust it's for a good reason.
The syslog-ng 3.0 branch already supports multi-line messages if the underlying transport supports those: either UDP or the new syslog-protocol. Are these real alternatives for you? I guess UDP is out of question because it may easily lose messages. The new syslog- protocol is not yet supported by log4j, however I know someone who implemented initial support for that.
I have not used syslog-ng 3.0 yet and our version of JBoss is not using the log4j with tcp support. How easily might the syslog server lose messages? It will never receive more than ~10000pps which it should easily be able to handle. Besides at that point it’s a *BSD or Linux problem and not a syslog-ng problem...right? Of course *BSD would never have such a problem to begin with. ;]
Would that be ok for you to switch over to use syslog-ng 3.0 and syslog-protocol?
I will try syslog-ng 3.x with standard UDP syslog messages. If it can handle the incomplete syslog message with whitespace a little better and is marked stable we would certainly consider switching. Thanks, John
John Skopis wrote:
How easily might the syslog server lose messages? It will never receive more than ~10000pps which it should easily be able to handle. Besides at that point it’s a *BSD or Linux problem and not a syslog-ng problem...right? Of course *BSD would never have such a problem to begin with. ;]
Any congested router along the way might throw away a UDP datagram. /Per Jessen, Zürich
participants (3)
-
Balazs Scheidler
-
John Skopis
-
Per Jessen