[syslog-ng] pattern matching on xxx#

Balazs Scheidler bazsi at balabit.hu
Sat Oct 16 06:37:43 CEST 2010


On Fri, 2010-10-15 at 12:48 -0600, Bill Anderson wrote:
> I have hostnames of the format xxxx# such as host1, hostb1, hostc1. I need to split that into two fields such as (host,1).
> 
> Unfortunately, since @@ escapes the @, and STRING and its followers ALSO match digits, I've not found the obvious
> means to get that out. Conceptually something like @LETTER:host.name@@NUMBER:host.id@ would do it, save that
> LETTER doesn't exist and @@ escapes.

'@@' wouldn't escape in this situation.

The thing I'd like to understand before recommending a solution is where
these hostnames come into the picture. Usually the hostname portion is not
processed by db-parser(). Or do you have these names inside the message
payload and want to extract them from there?

If this is the case, what I would propose is to use a regexp _after_ you
have parsed the hostname, and only on the hostname field.

In patterndb you only parse the hostname and put the result in a
${hostname} name-value pair, then match on that pair in a filter, e.g.:

parser p_pdb { db-parser(); };
filter f_cluster_member { match("^([a-z]+)([0-9]+)$" value('hostname') flags(store-matches)); };

If you use pcre, you could also parse the groups right into name-value
pairs with named groups (from man pcresyntax):

         (?<name>...)   named capturing group (Perl)
         (?'name'...)   named capturing group (Perl)
         (?P<name>...)  named capturing group (Python)
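
A minimal sketch of the named-group variant, assuming a syslog-ng 3.x
build with PCRE support. The group names hostgroup and hostid are made up
for illustration, and whether named groups end up as name-value pairs
should be checked against the version you run:

```
# Assumption: ${hostname} was filled in earlier by db-parser().
# hostgroup and hostid are hypothetical names; PCRE group names may
# only contain letters, digits and underscores, so no dots here.
filter f_cluster_member {
    match("^(?<hostgroup>[a-z]+)(?<hostid>[0-9]+)$"
          value("hostname") type("pcre") flags("store-matches"));
};
```

For a value like hostb1 the intent is that ${hostgroup} holds hostb and
${hostid} holds 1.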

Also, it would make sense to create a regexp parser, which doesn't
currently exist. Right now you only have that functionality in a filter,
and if you don't want to filter out non-matching log messages, you
have to use some nasty hackery, e.g.:

parser p_pdb { db-parser(); };
filter f_cluster_member { match("^([a-z]+)([0-9]+)$" value('hostname') flags(store-matches)) or match('.'); };


> 
> The end goal is as follows (pseudo-code):
> I need to have a destination for each (HOST). For example all files from hosta## go to /var/log/hosta/ and entries for hostb## go to /var/log/hostb/

Ahh, so it seems you don't want to parse out hostnames from the message
payload, but rather you'd like to use the $HOST name-value pair.

Then the regexp is definitely the way to go, and you don't need
db-parser() at all.
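
A rough sketch of what that could look like, not a tested configuration.
The source s_net, the destination name d_cluster and the file name
"messages" are assumptions for illustration only:

```
# Assumed source; replace it with the one that receives your traffic.
source s_net { udp(); };

# Store the letter part of the sender's host name in $1 and the
# digits in $2. The match is case-sensitive and expects a short
# host name, not an FQDN.
filter f_cluster_member {
    match("^([a-z]+)([0-9]+)$" value("HOST") flags("store-matches"));
};

# $1 expands to the letter part (e.g. hosta), so every group of
# hosts gets its own directory under /var/log.
destination d_cluster {
    file("/var/log/$1/messages" create_dirs(yes));
};

log { source(s_net); filter(f_cluster_member); destination(d_cluster); };
```

Messages whose host name does not match the pattern are dropped from
this log path, so they would need a separate log statement if you want
to keep them.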

> 
> I suppose I *might* be able to do a rewrite to add say, a hyphen, and then use csv-parser, but we're talking some heavy traffic and I suspect that doing rewrites on that much traffic would be a performance killer.
> 
> I'm open to suggestions (that don't involve changing server names, preferably ;) ) as to how to accomplish this.

If the regexp really becomes a performance bottleneck, a parser plugin
would probably be much faster, but that requires the 3.2 codebase.

-- 
Bazsi



