[syslog-ng] patterndb & not describing the entire log message
Balazs Scheidler
bazsi at balabit.hu
Wed Sep 28 09:00:38 CEST 2011
On Thu, 2011-09-22 at 11:06 +0200, Christophe Brocas wrote:
> On 01/09/2011 17:34, Balint Kovacs wrote:
> > Hi Christophe,
> >
> > you could probably use ESTRING with a full stop string instead of a stop
> > character to do this. If you do
> >
> > @ESTRING::stop string@
> >
> > it is more or less equivalent to
> >
> > .*?stop string
> >
> > So you could match large portions of this message not storing it in a
> > variable, e.g. @ESTRING::File Nom de l’objet@ and then
> > @ESTRING:filename:ID du handle@ to get the file name.
> >
> > Make sure that you escape unicode chars properly, otherwise matching
> > will have problems. If you have larger volumes of these log messages, you
> > might want to give "pdbtool patternize" a shot, at least as a
> > starting point for your final pattern.
> Hello Balint,
>
> Thank you for your answer, and for Martin's too :)
>
> As you noticed, I currently have a problem with unicode chars (extended chars
> such as é, è and à).
>
> When you say "escape unicode chars", do you have some examples and/or
> documentation about it?
>
> Currently, the matching process stops at the first extended character (the è
> in "Système de fichiers" in the following example):
>
> Pattern file:
>
> <patterndb version="4" pub_date="2011-03-30">
> <ruleset name="WinSecAuditLog" id="1">
> <pattern>MSWinEventLog</pattern>
> <rules>
> <rule provider="cbrocas" class="windows-FS-security" id="11">
> <patterns>
> <pattern>@ESTRING:mois:.@ @ESTRING:jour:
> @@NUMBER:heure@:@NUMBER:minutes@:@NUMBER:secondes@ @NUMBER:toto@ 4663
> Microsoft-Windows-Security-Auditing @ESTRING:domaine:\@@ESTRING:user:$@
> N/A Success Audit @STRING:machine:.-_@ Système de fichiers
> @ANYSTRING@</pattern>
> </patterns>
> </rule>
> </rules>
> </ruleset>
> </patterndb>
>
>
> pdbtool match:
> ...
> Matching part:
> sept. 05 14:22:23 20 4663 Microsoft-Windows-Security-Auditing
> D16490101\SDSV2SEVEN$ N/A Success Audit
> sdsv2seven.d16490101.cpam-c.cnamts.fr Syst
> ...
>
> Thank you very much
> Christophe
The db-parser() code is 8-bit clean, but beyond that it does not really
care about character sets at all. If your pattern contains an accented
character, that becomes a two-byte sequence in UTF-8, and the same two
bytes have to be in the input for the literal string to match. That works
as long as both your input _and_ the patterndb file are in UTF-8.
If they aren't, they will not match.
If your input is not UTF-8, you can use the encoding() option of
your source driver, which converts from whatever encoding you specify to UTF-8.
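
To illustrate the byte-level mismatch, here is a minimal Python sketch. It is
not syslog-ng code, and it assumes (purely for illustration) that the pattern
file is UTF-8 while the incoming message is ISO-8859-1; the helper
matched_prefix() is a hypothetical name. Under those assumptions the literal
comparison stops right before the accented character, much like the
"Matching part" shown by pdbtool above.

```python
def matched_prefix(pattern: bytes, text: bytes) -> bytes:
    """Return the leading bytes of `pattern` that also start `text`."""
    n = 0
    for p, t in zip(pattern, text):
        if p != t:
            break
        n += 1
    return pattern[:n]


# Literal part of the pattern, as stored in a UTF-8 patterndb file
# (the accented character is two bytes here).
pattern = "Système de fichiers".encode("utf-8")

# Assumption: the log source sends ISO-8859-1, where the same character
# is a single byte.
latin1_input = "Système de fichiers".encode("iso-8859-1")

# Re-encoding the input to UTF-8 before matching; conceptually this is
# the kind of conversion the encoding() option is described as doing.
utf8_input = latin1_input.decode("iso-8859-1").encode("utf-8")

print(matched_prefix(pattern, latin1_input))           # b'Syst'
print(matched_prefix(pattern, utf8_input) == pattern)  # True
```

If the situation is reversed (input in UTF-8, pattern file saved in another
encoding), the same mismatch occurs and the fix is to save the pattern file
as UTF-8.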
--
Bazsi