[syslog-ng] patterndb - user defined parsers

Mon Dec 19 17:47:34 CET 2011

> Also, instead of reinventing the wheel, I'd simply add a @REGEXP@
> parser, which if hit could of course become a performance bottleneck,
> but stuffing all arguments into a @@ expression is difficult to read and
> maintain.

One idea you can borrow from IDS with regard to regexp:  evaluate a
regexp only after the non-regexp parsers have matched, so that it is
not invoked on every message but is used only for clarification.  For
instance, I am having occasional difficulties with competing patterns
due to the almost CSV-like quality in the message patterns.
Specifically, messages sent by eventlog-to-syslog follow a pattern of
eventid: source: message
and so my pattern of @NUMBER:eventid:@: @ESTRING:source:@ @ANYSTRING@
basically matches anything with two colons in the message with a
leading number.  The program name changes, so you can't pre-filter
with that.  If you could change @ANYSTRING@ to @REGEXP@ that would
match and extract various parts of the message, but would only be
evaluated after NUMBER and ESTRING hit, you could have good
performance and easy-to-write patterns because you could invoke the
power of (often already available) regexps.
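
To make the ordering concrete, here is a rough sketch in Python rather
than patterndb syntax, since @REGEXP@ does not exist.  It assumes the
"eventid: source: message" layout above; DETAIL_RE and the logon
message text are made up for illustration.  The cheap string splits
stand in for NUMBER and ESTRING, and the regexp only runs on messages
that already passed them.

```python
import re

# Hypothetical second-stage regexp; stands in for the proposed @REGEXP@ parser.
DETAIL_RE = re.compile(r"user (?P<user>\S+) logged on from (?P<host>\S+)")


def cheap_prefix(msg):
    """Split 'eventid: source: rest' with plain string operations.

    Returns (fields, rest) on a hit, or None when the prefix does not match.
    """
    eventid, sep, rest = msg.partition(": ")
    if not sep or not eventid.isdigit():
        return None
    source, sep, rest = rest.partition(": ")
    if not sep:
        return None
    return {"eventid": eventid, "source": source}, rest


def match(msg):
    """Run the regexp only after the cheap prefix parsers have hit."""
    hit = cheap_prefix(msg)
    if hit is None:
        return None  # the regexp is never evaluated for this message
    fields, rest = hit
    m = DETAIL_RE.search(rest)
    if m is None:
        return None  # prefix matched, but the regexp rejected the remainder
    fields.update(m.groupdict())
    return fields
```

A message such as "foo: bar: baz" is rejected by the cheap check alone,
so the regexp cost is only paid for messages that already look like
eventlog-to-syslog output.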

Another idea would be to have sub-patterns.  These would take place
after a first pass of patterns are evaluated.  So in the above
example, after NUMBER and ESTRING extract their variables, the
remaining message block would be passed to another pattern set for
further evaluation, but with the parts NUMBER and ESTRING matched
removed.  The variables the prior patterns stored would still be
available.  This would allow sub-patterns to behave like programming
subclasses and be written without having to copy the prior field matches
each time.  Instead, that work would already have been completed and
they would instead "inherit" all of the prior match field extractions.
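
A rough Python sketch of the sub-pattern idea, again not patterndb
syntax.  The function names, the SUB_PATTERNS list and the service
message text are all hypothetical.  The first pass extracts eventid
and source once; each sub-pattern sees only the remaining message
block and its fields are merged with the inherited ones.

```python
def first_pass(msg):
    """First pass: extract eventid and source, return them plus the remainder."""
    eventid, sep, rest = msg.partition(": ")
    if not sep or not eventid.isdigit():
        return None
    source, sep, rest = rest.partition(": ")
    if not sep:
        return None
    return {"eventid": eventid, "source": source}, rest


def sub_service_state(rest):
    """Hypothetical sub-pattern: 'The <name> service entered the <state> state.'"""
    prefix, mid, suffix = "The ", " service entered the ", " state."
    if not (rest.startswith(prefix) and rest.endswith(suffix)):
        return None
    name, sep, state = rest[len(prefix):-len(suffix)].partition(mid)
    if not sep:
        return None
    return {"service": name, "state": state}


# Sub-patterns never repeat the eventid/source extraction.
SUB_PATTERNS = [sub_service_state]


def classify(msg):
    """Apply the first pass, then hand only the remainder to the sub-patterns."""
    hit = first_pass(msg)
    if hit is None:
        return None
    fields, rest = hit
    for sub in SUB_PATTERNS:
        extra = sub(rest)
        if extra is not None:
            return {**fields, **extra}  # sub-pattern "inherits" prior fields
    return fields  # no sub-pattern hit; keep the first-pass fields
```

Adding a new message type then means adding one function to
SUB_PATTERNS, without copying the eventid and source matches.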

If you combine the two ideas, you could have initial matching with
field extractions followed by sub-patterns which contain regexps to
do the fine-grained field extractions.  This would present a more
versatile and programmatic way of controlling match precedence.