[syslog-ng] [announce] patterndb project

Balazs Scheidler bazsi at balabit.hu
Fri Jun 25 17:23:16 CEST 2010


Hi,

By now probably most of you know about patterndb, a powerful framework
in syslog-ng that lets you extract structured information from log
messages and perform classification at a high speed:

http://www.balabit.com/dl/html/syslog-ng-ose-v3.1-guide-admin-en.html/concepts_pattern_databases.html

Until now, syslog-ng offered the feature, but no release-quality
patterns were produced by the syslog-ng developers. Some samples based
on the logcheck database were created, but otherwise every syslog-ng
user had to create her samples manually, possibly repeating work
performed by others.

Since this calls out to be a community project, I'm hereby starting one.

Goals
=====

Create release-quality pattern databases that can simply be deployed to
an existing syslog-ng installation. The goal of the patterns is to
extract structured information from free-form syslog messages, i.e. to
create name-value pairs based on the syslog message.
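
To make "name-value pairs" concrete, here is a minimal sketch in Python of
what extracting fields from a free-form message could look like. It only
illustrates the idea and does not reproduce patterndb's own pattern syntax
or matching engine. The sample message, the regular expression and the
field names (auth_method, username, client_ip, client_port) are all
assumptions made up for this example, not names taken from the project's
naming policy.

```python
import re

# Hypothetical pattern for an SSH-style login message. The field names are
# illustrative assumptions, not the project's official naming scheme.
LOGIN_PATTERN = re.compile(
    r"Accepted (?P<auth_method>\S+) for (?P<username>\S+) "
    r"from (?P<client_ip>\S+) port (?P<client_port>\d+)"
)


def extract_pairs(message):
    """Return the name-value pairs for a matching message, or None."""
    match = LOGIN_PATTERN.search(message)
    return match.groupdict() if match else None


# Made-up sample message, used only to exercise the sketch.
sample = "Accepted password for bazsi from 10.0.0.1 port 4242 ssh2"
pairs = extract_pairs(sample)
print(pairs)
```

Every extracted value stays a string here; deciding on field names and
types is exactly what a shared naming guideline is meant to settle.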

Since the key factor when doing something like this is the naming of
fields, we're going to create generic naming guidelines that can be
applied to any application in the industry.

It is not our goal to implement correlation or any other advanced form
of analysis, although we feel that with the results of this project,
event correlation and analysis can be performed much more easily than
without them.

Related projects
================

I know there are other efforts in the field, so why not simply join them?

CEF - is the log message format for a proprietary log analysis engine,
primarily meant to be used to hold IP security device logs (firewalls,
IPSs, virus gateways etc). The patterndb project aims to create patterns
for a wider range of device logs and be more generic in the approach. On
the other hand we feel that it might be useful to create a solution for
converting db-parser output to the CEF format.

CEE - the Common Event Expression project by Mitre focuses on creating a
name-value pair dictionary for all kinds of devices/log messages out there.
I might be missing something, but I haven't found any concrete
results so far, apart from a nice-looking white paper. If CEE
delivers something, then patterndb would probably adopt its
naming/taxonomy structure. But I guess not all devices will start
logging in the shiny new format, so existing devices would still need
their logs converted, and the patterndb work wouldn't be wasted.

Infrastructure
==============

Our original patterndb-related plans were to create an easy-to-use
web-based interface for editing patterns, but since that project is
progressing slowly, I'm calling for a minimalist approach: git-based
version control of simple plain text files. Of course, once the nice
web-based interface is finished, we're going to be ready to use it.

First steps
===========

I have created a git repository at:

http://git.balabit.hu/bazsi/syslog-ng-patterndb.git

This contains the initial version of the naming policy document, a
simple SIEM-style schema and a user login-logout naming schema.

If you are interested please read the file README.txt in the git
archive, or if you prefer a web browser, use this link:

http://git.balabit.hu/?p=bazsi/syslog-ng-patterndb.git;a=blob;f=README.txt;h=9bbfeaead0c21dcf6171e12e311ae8612f572bfc;hb=6061e22221a72d35238b35f82b04afd436341b5c

Licensing
=========

I have not decided yet, but this is certainly going to use one of
the open source licenses or Creative Commons. Let me know if you have a
preference in this area.

Getting involved
================

Join the syslog-ng mailing list, and start discussing! If you have
existing patterns, great. If you don't, it is not too late to join.

http://lists.balabit.hu/mailman/listinfo/syslog-ng


-- 
Bazsi


