[syslog-ng]Urgent: Mangled messages in pipe() destinations in versions >1.5.8

Andreas Schulze Andreas.Schulze@mediaWays.NET
Sat, 28 Dec 2002 00:05:12 +0100


Balazs Scheidler wrote:
 > On Fri, Sep 27, 2002 at 02:47:25PM +0200, Carlos Inacio wrote:
 >
 >>We are seeing exactly the same behaviour Andreas described. Actually
 >>it works fine until version 1.5.8, but if we use any version >1.5.8,
 >>including patched 1.5.19 and 1.5.20, we get lots of mangled messages.
 >
 > Without being able to reproduce the bug I _need_ more information.
 > I've checked out the difference between 1.5.20 and 1.5.8 at the time
 > I received the original bug report, but saw no reasons that could
 > cause this problem.
 >
 > What I'd need:
 > 1) the least complex config file that shows the problem
 > 2) strace dump of syslog-ng while receiving this message
 >    (only the relevant parts are needed, e.g. the place where
 >     syslog-ng receives this message)

Sorry Balazs,

I need a little bit time to set up a development environment to trace
and debug this problem. This was a bit more difficult, because the
buggy behaviour occurs only on heavy loaded log servers.
But now, I think, I know what the real problem is.

Let's remember. Using the following pipe() destinations:
(need more than one dest() for /tmp/pipe, to tag the
msgs based on filters())

> destination d_a { pipe("/tmp/pipe" template("@tag_a: | $MSG\n")); };
> destination d_b { pipe("/tmp/pipe" template("@tag_b: | $MSG\n")); };
 > destination d_c { pipe("/tmp/pipe" template("@tag_c: | $MSG\n")); };
 > destination d_d { pipe("/tmp/pipe" template("@tag_d: | $MSG\n")); };
> log { source(..); filter(fa); destination(d_a); };
> log { source(..); filter(fb); destination(d_b); };
 > log { source(..); filter(fb); destination(d_c); };
 > log { source(..); filter(fb); destination(d_d); };

we get mangled messages in /tmp/pipe, if at least one of the
log() paths is heavy loaded. This seems to occur in all versions >1.5.8

truss'ing the process gives the view below.
(Each message should begin with a tag <@--?LM-->.
This is only for easy get the pipe fd's in truss output.)

What happens:

write(48, " < @ - - I L M - - @ >  ".., 4096)	= 4096
[-> Try to write 4096 bytes. Success. No problem.
     (There are more than one message inside this 4KB)]
write(48, " c v / % l o s s   =   5".., 4096)	= 4096
[-> Try to write the next 4 KBytes. Success. No problem?
     (WARNING:
      Because we use a byte buffer instead a line buffer,
      there is NO tailing '\n' in the pipe now!!)]
write(48, " c m p d :   |   i c m p".., 508)	Err#11 EAGAIN
[-> Try to write the last bytes. Pipe is bussy. Drop.
     (Remenber, that there is NO tailing '\n' in the pipe now!!)]
write(12, " < @ - - S L M - - @ >  ".., 247)	Err#11 EAGAIN
[-> Try to write bytes from the other log path to the same pipe().
     Pipe is bussy. Drop. ]
[...]
write(20, " < @ - - C L M - - @ >  ".., 203)	= 203
[-> First successfull attempt to write bytes to the pipe, after
     the pipe becomes ready again. Success? Yes. No! HERE is the Bug.
     Here we produce a mangled message because we dropped
     some stuff and the trailing newline before!]

I think, you changed the internal buffering from a line based buffer
to byte based buffer between 1.5.8 and 1.5.9.
In 1.5.7 there isn't that problem.
Maybe, line based buffering seems a better idea in this context.

If you need better information, I could send you the complex config
and a full truss ouput in private.

Hope, this coveres the problem.

Because a solution becomes more and more important for us, I hope
that you can investigate this and maybe we can discuss a solution.

Thanks a lot --Andreas

[...]
write(48, " < @ - - I L M - - @ >  ".., 4096)	= 4096
write(48, " c v / % l o s s   =   5".., 4096)	= 4096
write(48, " i c m p d :   |   i c m".., 1553)	Err#11 EAGAIN
write(20, " < @ - - C L M - - @ >  ".., 203)	Err#11 EAGAIN
[...]
write(48, " < @ - - I L M - - @ >  ".., 4096)	= 4096
write(48, " c v / % l o s s   =   5".., 4096)	= 4096
write(48, " c m p d :   |   i c m p".., 508)	Err#11 EAGAIN
write(12, " < @ - - S L M - - @ >  ".., 247)	Err#11 EAGAIN
[...]
write(20, " < @ - - C L M - - @ >  ".., 203)	= 203
write(12, " < @ - - S L M - - @ >  ".., 161)	= 161
write(52, " < @ - - M L M - - @ >  ".., 201)	= 201
write(20, " < @ - - C L M - - @ >  ".., 259)	= 259
write(20, " < @ - - C L M - - @ >  ".., 242)	= 242
write(12, " < @ - - S L M - - @ >  ".., 176)	= 176
write(20, " < @ - - C L M - - @ >  ".., 204)	= 204
write(12, " < @ - - S L M - - @ >  ".., 135)	= 135
write(48, " < @ - - I L M - - @ >  ".., 4096)	= 4096
write(48, " n o   e r r o r s ]   <".., 4096)	= 4096
write(48, " s ]   < - - E O M - - >".., 4096)	Err#11 EAGAIN
write(48, " 9 . 7   |   A L I V E  ".., 4096)	Err#11 EAGAIN
write(48, " 2 / 2 0 0 2   2 2 : 3 9".., 4096)	Err#11 EAGAIN
write(48, "   |   i c m p . b o t .".., 4096)	Err#11 EAGAIN
write(48, " b o t . n e t :   2 7 /".., 4096)	Err#11 EAGAIN
write(48, " M - - @ >   @ i c m p d".., 4096)	Err#11 EAGAIN
write(48, "   @ i c m p d :   |   i".., 3436)	Err#11 EAGAIN
write(27, " < @ - - T L M - - @ >  ".., 436)	Err#11 EAGAIN
write(20, " < @ - - C L M - - @ >  ".., 194)	Err#11 EAGAIN
write(12, " < @ - - S L M - - @ >  ".., 2682)	Err#11 EAGAIN
[...]

-- 
Best regards --Andreas Schulze
                [phone: +49.5246.80.1275, fax: +49.5246.80.2275]

| I believe, it was Dennis Ritchie who said something like:
|   "C is rarely the best language for a given task,
|    but it's often the second-best".
| The implication being that: "[...]"
|
| sh# cat>$$.c<<EOT
| main(l,a,n,d)char**a;{for(d=atoi(a[1])/10*80-atoi(a[2])/5-596;n="@NK\
| ACLCCGZAAQBEAADAFaISADJABBA^SNLGAQABDAXIMBAACTBATAHDBANZcEMMCCCCAAhE\
| IJFAEAAABAfHJETBdFLDAANEfDNBPHdBcBBBEA_AL H E L L O,    W O R L D! "
| [l++-3];)for(;n-->64;)putchar(!d+++33^l&1);}
| EOT
| gcc -o$$ $$.c;clear;./$$ 52 8;rm -f $$*