[syslog-ng] Pattern extraction

majid as majid_groups at yahoo.com
Sun Aug 15 13:15:30 CEST 2010


Thanks, Mr. Holste, your mail was very useful.
I am a beginner in log parsing. I must extract several fields from each message, such as IP address (or hostname), user, port, date, protocol, and any other fields that can be extracted. The fields must then be normalized into IDMEF format. This must be done for every distinct syslog message type.
So, is syslog-ng a suitable tool for this task? And if so, how?


--- On Sat, 14/8/10, Martin Holste <mcholste at gmail.com> wrote:


From: Martin Holste <mcholste at gmail.com>
Subject: Re: [syslog-ng] Pattern extraction
To: "Syslog-ng users' and developers' mailing list" <syslog-ng at lists.balabit.hu>
Date: Saturday, 14 August, 2010, 7:32 PM


If you're looking to do never-wrong, full normalization, then yes,
you're looking at thousands of signatures.  However, if you're looking
to extract some common fields, it's actually not that much work to
grab things like IP addresses using regexp.  Since regexp is slow, I'm
thinking about writing some generic patterns that would match on IPs
using the fast pattern matcher.  I don't know if it'll work, but it
would look like "@ANYSTRING@@IPv4@@ANYSTRING@" and then maybe another
one to grep out two IPs, then another for three, etc.  I have no idea
if that will work; we'll see how it goes.
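
For comparison, here is a minimal sketch of the plain regexp approach
mentioned above, written in Python rather than as a syslog-ng pattern.
It is not the fast pattern matcher idea, only the slower baseline. The
function name extract_ipv4 and the sample message are made up for
illustration.

```python
import re

# One octet, restricted to 0-255.
_OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"

# Four octets joined by dots. The lookarounds reject matches that are
# glued to further digits or dotted groups (e.g. "1.2.3.4.5"), while
# still allowing a sentence-ending period after the address.
IPV4_RE = re.compile(
    r"(?<!\d)(?<!\d\.)" + r"\.".join([_OCTET] * 4) + r"(?!\d|\.\d)"
)

def extract_ipv4(message: str) -> list[str]:
    """Return every IPv4-looking token in a log message, in order."""
    return IPV4_RE.findall(message)

# Invented sample message, not taken from a real device.
sample = "Accepted password for alice from 192.168.1.10 port 22 ssh2"
print(extract_ipv4(sample))
```

Because findall returns all matches, the same function covers the
one-IP, two-IP and three-IP cases without separate patterns.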

I think that the pursuit of perfection in this field will be
discouraging, and may stifle efforts before they begin.  I urge you to
take it one pattern at a time.  Sure, we may need thousands of
patterns, but there are hundreds if not thousands on this mailing
list.  Everybody take two patterns ;)  And don't forget that the
patternize tool may be able to help by heuristically identifying
fields in messages.  Then it just comes down to a human naming the
fields instead of painstakingly writing the patterns themselves.

Something else to consider:  Even if you're only extracting the RFC
headers of the syslog but you have full-text search abilities of the
log messages, you can make some OLAP-style basic dimensional analysis
happen.  So, let's say you're going through router logs looking for an
OSPF adjacency change.  You search for "LOADING to FULL" and then
group by host.  You've just magically discovered all of the routers
that flapped during whatever incident caused the adjacency change.
Obviously this is very basic, but don't underestimate the immediate
value of being able to quickly pinpoint which hosts had which events
occur.  I would say that 70% of the total value you'd get from having
all messages perfectly parsed is already attained just by being able
to do free text searches and group by host.
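
The search-then-group-by-host idea can be sketched in a few lines of
Python. This assumes the logs are already available as (host, message)
pairs, for example from the parsed syslog headers; the function name
group_hits_by_host, the host names and the messages are all made up
for illustration.

```python
from collections import Counter

def group_hits_by_host(records, needle):
    """Count records whose message contains `needle`, keyed by host.

    `records` is an iterable of (host, message) tuples.
    """
    hits = Counter()
    for host, message in records:
        if needle in message:  # plain substring search, no parsing
            hits[host] += 1
    return hits

# Invented sample data standing in for router logs.
records = [
    ("rtr1", "OSPF neighbor state changed from LOADING to FULL"),
    ("rtr2", "OSPF neighbor state changed from LOADING to FULL"),
    ("rtr1", "OSPF neighbor state changed from LOADING to FULL"),
    ("sw1", "Interface up"),
]

for host, count in group_hits_by_host(records, "LOADING to FULL").most_common():
    print(host, count)
```

Only the host field is parsed here; everything else is free text.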

Lastly, not all logs are created equal!  I wrote parsers for Cisco
firewall connection teardowns and firewall denies, and now more than
half of my logs are neatly parsed.  That's because the vast majority
of Cisco logs at notification level are build/teardown messages.
(Something like four logs per flow per device).  Now if I'm looking
for something weird, I can easily take the majority of the hay out of
the haystack by excluding the already classified logs in my search.
It even helps with reporting, because a big jump in the number of
unclassified messages shows up on the radar.
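
A rough Python sketch of that workflow: tag each message with the
first pattern that matches, count messages per class, and keep the
leftovers for manual review. The class names and regexes below are
deliberately simplified placeholders, not the real Cisco message
formats or the parsers described above.

```python
import re
from collections import Counter

# Hypothetical classes with toy regexes; real patterns would be stricter.
PATTERNS = {
    "conn_teardown": re.compile(r"\bTeardown\b"),
    "conn_deny": re.compile(r"\bDeny\b"),
}

def classify(message):
    """Return the name of the first matching class, else 'unclassified'."""
    for name, pattern in PATTERNS.items():
        if pattern.search(message):
            return name
    return "unclassified"

def class_counts(messages):
    """Messages per class; a jump in 'unclassified' is worth a look."""
    return Counter(classify(m) for m in messages)

def unclassified(messages):
    """The remaining hay after the known classes are taken out."""
    return [m for m in messages if classify(m) == "unclassified"]

# Invented sample messages.
messages = [
    "Teardown TCP connection 1",
    "Deny tcp src outside",
    "Configured from console",
    "Teardown UDP connection 2",
]
print(class_counts(messages))
print(unclassified(messages))
```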

So to sum up, the benefit of creating log patterns is exponential.
Not having a pattern for every possible log isn't really a big deal,
but having patterns for certain logs is.

On Fri, Aug 13, 2010 at 8:00 PM, Anton Chuvakin <anton at chuvakin.org> wrote:
>> So, I must extract hundreds of patterns manually. :(
>
> Not really hundreds, try tens of thousands. If you sit and watch a
> busy syslog server for, say, 5 years,  some say you'd see a few
> thousand or more of unique messages. Personally, I have not tried it,
> but I trust the source.
>
>
>> Regards
>>
>> --- On Fri, 13/8/10, Anton Chuvakin <anton at chuvakin.org> wrote:
>>
>> From: Anton Chuvakin <anton at chuvakin.org>
>> Subject: Re: [syslog-ng] Pattern extraction
>> To: "Syslog-ng users' and developers' mailing list" <syslog-ng at lists.balabit.hu>
>> Date: Friday, 13 August, 2010, 7:18 PM
>>
>> > I don't know how I can extract patterns from logs. Must I check every log type separately? Using pattern recognition methods? Or using a
>> > pattern database (if one exists for all applications and devices)?
>>
>> Well, this is not just you - it is "you and the rest of the world."
>> The standard way is pretty much to manually (or with tools - but still
>> mostly manually) write regular expressions for every distinct log
>> message type.
>>
>> --
>> Dr. Anton Chuvakin
>> Site: http://www.chuvakin.org
>> Blog: http://www.securitywarrior.org
>> LinkedIn: http://www.linkedin.com/in/chuvakin
>> Consulting: http://www.securitywarriorconsulting.com
>> Twitter: @anton_chuvakin
>> Google Voice: +1-510-771-7106
>> ______________________________________________________________________________
>> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
>> Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
>> FAQ: http://www.campin.net/syslog-ng/faq.html
>>