[syslog-ng] advice/assistance with parsing attempt requested

Balazs Scheidler bazsi at balabit.hu
Wed Dec 8 21:47:48 CET 2010


Hi,

Although I really like the ideas floating around, the best way to
address this issue is to write a welf parser plugin to syslog-ng which
simply produces name-value pairs from the input, without having to pipe
them out to an external process.

The cost of the round-trip (pipe-write, pipe-read, process, pipe-write,
pipe-read) is simply enormous.

And 3.2 already has plugins in place, so we only need someone
volunteering to write a welf parser. :)

Something along the lines of:

parser p_welf { welf-parser(prefix(".welf")); };

This would turn every name-value pair found in the input into a syslog-ng
name-value pair prefixed with '.welf', e.g. name1=value1 would become an NV
pair in syslog-ng with the name ${.welf.name1} and the value "value1".
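
To make the intended mapping concrete, here is a minimal sketch in Python of
what such a parser would do with one message. It is only an illustration: the
function name parse_welf, the regular expression, and the sample message are
assumptions made up for this example, not syslog-ng code, and a real plugin
would be written in C against the 3.2 plugin interface.

```python
import re

# One key=value token. The value is either double-quoted (may contain
# spaces) or a run of non-whitespace characters. This pattern is an
# assumption about the input; it does not handle escaped quotes.
_PAIR_RE = re.compile(r'(\w[\w.-]*)=(?:"([^"]*)"|(\S*))')


def parse_welf(message, prefix=".welf"):
    """Return a dict of prefixed name-value pairs found in message.

    Hypothetical helper: name1=value1 becomes {".welf.name1": "value1"}.
    """
    pairs = {}
    for match in _PAIR_RE.finditer(message):
        key, quoted, bare = match.groups()
        # Exactly one of the two value groups participates in a match.
        value = quoted if quoted is not None else bare
        pairs[prefix + "." + key] = value
    return pairs


if __name__ == "__main__":
    sample = 'id=firewall time="2010-12-08 21:47:48" fw=10.0.0.1 pri=6'
    for name, value in parse_welf(sample).items():
        print(name, "=", value)
```

Because each pair is matched on its own, the order of the keys in the message
does not matter, which is the property the pattern-db loop below is after.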

Does that make sense? Or am I missing something?

On Mon, 2010-12-06 at 13:01 -0700, Bill Anderson wrote:
> On Dec 6, 2010, at 12:37 PM, Martin Holste wrote:
> 
> >> Agreed, Perl is plenty quick, hence my wondering about the actual volume. If it is too much for Perl I'd go w/C++.
> > 
> > From what I can tell, PCRE in Perl (or Python or whatever) is really
> > close to C/C++ speeds because they're essentially using the same
> > library and therefore mostly the same syscalls.  I'd be really
> > interested if anyone has benchmarks.  I'd expect something like 10%
> > better performance in C, but not much more, assuming that the vast
> > majority of CPU time is spent on PCRE.
> 
> Yeah I was thinking the overhead might be in what is done, as opposed to just the RE portion. Of course, the OP script might be implemented rather differently. ;)
> 
> 
> > 
> >> Personally, I'd make the last step routing back into syslog-ng with a source on a custom port and letting syslog-ng handle the writing to disk. That way you can still use macros such as timestamps, etc. Then again, that may be because I do that all the time. ;) A log statement that takes everything from the custom source and logs to a file should work beautifully; no need for filters, though you could still do additional processing if needed. That said, I'd also consider running a daemon that accepted all the input, formatted it, and then sent it to syslog-ng, pointing the clients at the custom daemon if that was possible.
> >> 
> >> One advantage to the daemon route is that it wouldn't *have* to reside on the same system.
> > 
> > Yep, you could definitely let Syslog-NG handle the last mile as well.
> > I was trying to keep the scope as narrow as possible in my example.
> > 
> > I wonder if you could build an NFA state machine by conditionally
> > looping output from a pattern-db parsed message into a source in
> > Syslog-NG with a different pattern-db, depending on the previous
> > output.  Something like a token parser pdb that does an ESTRING up
> > until " " and another one that only expects the key/val pair to be
> > sent to it as the message.  So it comes in as k1=v1 k2=v2 and the
> > first kv gets gobbled up and then sent to another pdb source with a
> > pdb which only matches if the message starts with certain terms.  Then
> > the rest of the original message is looped back to itself using
> > @ANYSTRING@ to capture the remainder, that is, minus the kv which was
> > sent to the kv pdb.  It would keep recursively looping like that until
> > there's no message left.  If that all worked, your pattern db would be
> > extremely simple as it would just be a pattern per key you were
> > looking for, and order would no longer be an issue.  
> 
> Maybe I'm nuts, but that sounds awesome to me. :D
> 
> > Of course there's
> > still the problem of demuxing the whole thing back into a coherent
> > message, but I think that could be done a number of ways by passing
> > the MSGID token with each part and using the new conditionals present
> > in OSE 3.2.  
> 
> Well, there is message correlation in 3.2.1 right?  muahahaha
> 
> > If OSE 3.3 can really do close to 1 million msgs/sec,
> > then the overhead of resubmitting the same log many times may be
> > bearable, especially with the threading.
> 
> True, the rate might be the downside to that mechanism. However, the terseness of the messages might make up for some of it.
> 
> 

-- 
Bazsi



