[syslog-ng] pattern matching on xxx#
Balazs Scheidler
bazsi at balabit.hu
Sat Oct 16 06:37:43 CEST 2010
On Fri, 2010-10-15 at 12:48 -0600, Bill Anderson wrote:
> I have hostnames of the format xxxx# such as host1, hostb1, hostc1. I need to split that into two fields such as (host,1).
>
> Unfortunately, since @@ escapes the @, and STRING and its followers ALSO match digits, I've not found the obvious
> means to get that out. Conceptually something like @LETTER:host.name@@NUMBER:host.id@ would do it, save that
> LETTER doesn't exist and @@ escapes.
'@@' wouldn't escape in this situation.
The thing I'd like to understand before recommending a solution is where
these hostnames come into the picture. Usually the hostname portion is not
processed by db-parser. Or do you have these names inside the message
payload, and you want to extract them from there?
If that is the case, what I would propose is to use a regexp _after_ you
have parsed the hostname, and only on the hostname field.
That is, in patterndb you only parse the hostname and put the result in a
${hostname} name-value pair, then match on that pair, e.g.:
parser p_pdb { db-parser(); };
filter f_cluster_member { match("^([a-z]+)([0-9]+)$" value('hostname') flags(store-matches)); };
If you are using pcre, you could also parse groups right into name-value
pairs with named groups (from man pcresyntax):
(?<name>...) named capturing group (Perl)
(?'name'...) named capturing group (Perl)
(?P<name>...) named capturing group (Python)
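A minimal sketch of the named-group variant, assuming your build accepts
type(pcre) on match() and that the hostname is already in a name-value pair
called hostname. The group names cluster and member are made up for
illustration:

```
filter f_cluster_member {
    # Assumption: with store-matches, the named groups become the
    # name-value pairs ${cluster} and ${member}.
    match("^(?<cluster>[a-z]+)(?<member>[0-9]+)$"
          value("hostname") type(pcre) flags(store-matches));
};
```

For hostb1 this should give cluster=hostb and member=1.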
Also, it'd make sense to create a regexp parser, which doesn't currently
exist. Right now you only have that functionality in a filter, so if
you don't want to drop non-matching log messages, you'll have to use
some nasty hackery. In the example below, the trailing match('.') lets
messages through even when the hostname regexp fails:
parser p_pdb { db-parser(); };
filter f_cluster_member { match("^([a-z]+)([0-9]+)$" value('hostname') flags(store-matches)) or match('.'); };
>
> The end goal is as follows (pseudo-code):
> I need to have a destination for each (HOST). For example all files from hosta## go to /var/log/hosta/ and entries for hostb## go to /var/log/hostb/
Ahh, so it seems you don't want to parse out hostnames from the message
payload, but rather you'd like to use the $HOST name-value pair.
Then, definitely the regexp is the way to go and you don't need
db-parser() at all.
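A rough sketch of that approach, under a few assumptions: s_net stands in
for whatever source you already have, the file name messages is arbitrary,
and the match groups stored by the filter are still available as $1 and $2
when the destination file name is expanded:

```
filter f_cluster_member {
    # Split $HOST into letters ($1) and trailing digits ($2).
    match("^([a-z]+)([0-9]+)$" value("HOST") flags(store-matches));
};

destination d_per_cluster {
    # $1 is the letter part, e.g. "hosta" for hosta12.
    file("/var/log/$1/messages" create_dirs(yes));
};

log {
    source(s_net);              # placeholder source name
    filter(f_cluster_member);
    destination(d_per_cluster);
};
```

Messages whose $HOST does not match the regexp are dropped from this log
path, so they would need a separate log statement if you want to keep them.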
>
> I suppose I *might* be able to do a rewrite to add say, a hyphen, and then use csv-parser, but we're talking some heavy traffic and I suspect that doing rewrites on that much traffic would be a performance killer.
>
> I'm open to suggestions (that don't involve changing server names, preferably ;) ) as to how to accomplish this.
If regexp really becomes a performance bottleneck, a parser plugin would
probably be much faster, but that requires the 3.2 codebase.
--
Bazsi