On Thu, 2011-09-22 at 11:06 +0200, Christophe Brocas wrote:
On 01/09/2011 17:34, Balint Kovacs wrote:
Hi Christophe,
You could probably use ESTRING with a whole stop string instead of a single stop character to do this. If you write
@ESTRING::stop string@
it is more or less equivalent to
.*?stop string
So you could match large portions of this message without storing them in a variable, e.g. @ESTRING::File Nom de l’objet@, and then use @ESTRING:filename:ID du handle@ to get the file name.
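To illustrate the "more or less equivalent to .*?stop string" remark, here is a rough Python sketch. It is not how db-parser() is implemented; the helper name `estring` and the sample message are made up for illustration, and the real event text will differ.

```python
import re

def estring(text, stop):
    """Rough stand-in for @ESTRING:name:stop@ (hypothetical helper).

    Consumes text up to and including the first occurrence of `stop`
    and returns (captured_part, remaining_text), or None if `stop`
    does not occur.
    """
    m = re.match(r"(.*?)" + re.escape(stop), text, flags=re.DOTALL)
    if m is None:
        return None
    return m.group(1), text[m.end():]

# Hypothetical, shortened message; not taken from a real log.
msg = "Type d'objet: File Nom de l'objet: C:\\data\\report.txt ID du handle: 0x5c"

# Like an unnamed @ESTRING::...@: skip everything up to the stop string.
skipped, rest = estring(msg, "File Nom de l'objet: ")

# Like @ESTRING:filename:...@: keep what precedes the next stop string.
filename, rest = estring(rest, " ID du handle")

print(filename)
```

The first call discards its capture, which mirrors an ESTRING parser with an empty name; the second keeps it under the name filename.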
Make sure that you escape Unicode characters properly; otherwise matching will have problems. If you have larger volumes of these log messages, you might want to give "pdbtool patternize" a shot, at least as a starting point for your final pattern.

Hello Balint,
Thank you for your answer, and for Martin's too :)
As you noticed, I currently have a problem with Unicode characters (extended characters such as é, è and à).
When you say "escape unicode chars", do you have some examples and/or documentation about it?
Currently, the matching process stops at the first extended character (the è of "Système de fichiers" in the following example):
Pattern file:
<patterndb version="4" pub_date="2011-03-30">
  <ruleset name="WinSecAuditLog" id="1">
    <pattern>MSWinEventLog</pattern>
    <rules>
      <rule provider="cbrocas" class="windows-FS-security" id="11">
        <patterns>
          <pattern>@ESTRING:mois:.@ @ESTRING:jour: @@NUMBER:heure@:@NUMBER:minutes@:@NUMBER:secondes@ @NUMBER:toto@ 4663 Microsoft-Windows-Security-Auditing @ESTRING:domaine:\@@ESTRING:user:$@ N/A Success Audit @STRING:machine:.-_@ Système de fichiers @ANYSTRING@</pattern>
        </patterns>
      </rule>
    </rules>
  </ruleset>
</patterndb>
pdbtool match:
...
Matching part: sept. 05 14:22:23 20 4663 Microsoft-Windows-Security-Auditing D16490101\SDSV2SEVEN$ N/A Success Audit sdsv2seven.d16490101.cpam-c.cnamts.fr Syst
...
Thank you very much,
Christophe
The db-parser() code is 8-bit clean, but otherwise it doesn't really care about character sets at all. If your pattern contains an accented character, it becomes a two-byte sequence in UTF-8, and the same sequence should appear in the input, so the literal string matches as long as both your input _and_ the patterndb are in UTF-8. If they aren't, they will not match. If your input is not UTF-8, you could use the encoding() argument of your source driver, which converts from whatever you specify to UTF-8.

--
Bazsi
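The byte-level point above can be checked with a small Python sketch. It assumes, purely for illustration, that the pattern file is saved as UTF-8 while the incoming message is ISO-8859-1; the actual encoding of the Windows agent's output is not stated in this thread. The helper `common_prefix_len` is hypothetical, and the final re-encoding step only mimics the kind of conversion encoding() is described as doing.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that two byte strings share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

literal = "Syst\u00e8me de fichiers"  # "Système de fichiers"

pattern_utf8 = literal.encode("utf-8")    # è becomes the two bytes C3 A8
input_latin1 = literal.encode("latin-1")  # è is the single byte E8 (assumed input encoding)
input_utf8 = literal.encode("utf-8")

# Mismatched encodings: the bytes agree only up to "Syst".
print(common_prefix_len(pattern_utf8, input_latin1), len(pattern_utf8))

# Same encoding on both sides: the whole literal matches byte for byte.
print(common_prefix_len(pattern_utf8, input_utf8) == len(pattern_utf8))

# Converting the input to UTF-8 first makes the two sides identical again.
converted = input_latin1.decode("latin-1").encode("utf-8")
print(converted == pattern_utf8)
```

Under that assumption the shared prefix ends right after "Syst", which is consistent with where the pdbtool match output quoted earlier stops, although other causes are possible.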