On Tue, 2010-06-29 at 10:11 -0500, Martin Holste wrote:
This is awesome. As I've written about previously, I've used the pattern-db enough to know how powerful and efficient it is, and I am doing all my logging with it. My main use is for log classification and field parsing, which normalizes logs down to something that can easily be put in a database. The classification helps with not only quickly identifying types of logs, but also higher-level ideas like log retention (so I archive important logs) and permissions (so people like web developers can have access to certain logs). The field parsing is great for things like Snort and firewall logs, as well as web server logs.
If you use a NoSQL-style database, such as MongoDB or CouchDB, you don't have to worry about fitting fields into a rigid schema since there is no concept of "columns." That works out great for pattern-db because you can specify any field/value pairs in the pattern and then have Mongo write them as-is, so that some records will be {_id:1, program:"snort", srcip:x.x.x.x} and others will be {_id:2, program:"sendmail", to_address:"person@example.com"}. The key is that you don't have to know ahead of time what fields you will be parsing in order to design a db schema. That means when new patterns are released, the fields can be named anything without breaking your schema.
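To make the schema-less idea concrete, here is a minimal Python sketch. It assumes nothing about the real MongoDB or CouchDB APIs: the `collection` list and the `store` helper are hypothetical stand-ins for a document store, and the field names are the ones from the example records above (the source IP is a documentation address).

```python
import json

# In-memory stand-in for a schema-less collection.
# Hypothetical: this is NOT the MongoDB or CouchDB API.
collection = []


def store(parsed_fields):
    """Store a parsed log record as-is, assigning a sequential _id."""
    record = {"_id": len(collection) + 1, **parsed_fields}
    collection.append(record)
    return record


# Two patterns that extract completely different fields.
store({"program": "snort", "srcip": "192.0.2.10"})
store({"program": "sendmail", "to_address": "person@example.com"})

# Records with different field sets coexist; no schema change was needed.
snort_records = [r for r in collection if r.get("program") == "snort"]
all_fields = sorted({key for r in collection for key in r})

print(json.dumps(collection, indent=2))
```

A new pattern that introduces, say, a `dstport` field would simply produce records with one more key; nothing about the existing records or the storage layer has to change.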
Great to know. I have noted MongoDB/CouchDB support as a possible project for plugin development (hint: see the syslog-ng OSE 3.2 tree). This could perhaps be an alternative to my schema-based SQL destination, which is on the current roadmap.
My initial concern with the format of the pattern-db XML is with the CLSID-style IDs. I understand the advantages of CLSIDs, but it is very expensive to create database indexes on them because of their length. I would prefer to have an integer ID in the pattern XML somewhere. Other opinions?
I'm not too attached to UUIDs (I guess that's what you mean by CLSID), but if we are not using something like a UUID, then we need a central place to administer the IDs. Do you think that's acceptable?
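For a rough sense of the trade-off, here is a hedged Python sketch using only the standard `uuid` module. The UUID value is made up, and `LocalIdMap` is a hypothetical helper, not anything in syslog-ng or pattern-db. It shows the size of a UUID as text versus as raw bytes, and one possible way to index on small integers that are assigned locally, which are only meaningful within a single installation.

```python
import uuid

# Made-up example value; real pattern-db rule IDs will differ.
example = "123e4567-e89b-12d3-a456-426614174000"

text_size = len(example)                     # characters when stored as text
binary_size = len(uuid.UUID(example).bytes)  # bytes when stored in binary form


class LocalIdMap:
    """Hypothetical helper: hands out small integers for UUIDs on first sight.

    The integers are local to one database, so the pattern files can keep
    their UUIDs and no central ID registry is involved.
    """

    def __init__(self):
        self._ids = {}

    def id_for(self, pattern_uuid):
        key = uuid.UUID(pattern_uuid)  # validates and normalizes the string
        if key not in self._ids:
            self._ids[key] = len(self._ids) + 1
        return self._ids[key]


ids = LocalIdMap()
first = ids.id_for(example)
same = ids.id_for(example.upper())  # same UUID, different letter case
second = ids.id_for("00000000-0000-0000-0000-000000000001")
```

The catch is that such integers cannot be compared between sites, so this only addresses the index-size concern, not the question of who would administer shared integer IDs.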
On Fri, Jun 25, 2010 at 10:23 AM, Balazs Scheidler <bazsi@balabit.hu> wrote:
-- Bazsi