[syslog-ng] MongoDB destination driver

Gergely Nagy algernon at balabit.hu
Tue Jan 4 11:51:47 CET 2011


> "patterndb" : {
>      ".classifier.class" : "system",
>      ".classifier.rule_id" : "4dd5a329-da83-4876-a431-ddcb59c2858c",
>      "usracct.authmethod" : "publickey for algernon from ::1 port 59690 ssh2",
>      "usracct.username" : "algernon from ::1 port 59690 ssh2",
>      "usracct.device" : "::1 port 59690 ssh2",
>      "usracct.service" : "ssh2",
>      "usracct.type" : "login",
>      "usracct.sessionid" : "12674",
>      "usracct.application" : "sshd",
>      "secevt.verdict" : "ACCEPT"
>  }
> 
> should really look like this:
> 
> "patterndb" : {
>      "classifier": {
>         "class" : "system",
>         "rule_id" : "4dd5a329-da83-4876-a431-ddcb59c2858c"
>       },
>      "usracct": {
>        "authmethod" : "publickey for algernon from ::1 port 59690 ssh2",
>        "username" : "algernon from ::1 port 59690 ssh2",
>        "device" : "::1 port 59690 ssh2",
>        "service" : "ssh2",
>        "type" : "login",
>        "sessionid" : "12674",
>        "application" : "sshd"
>      },
>      "secevt":{
>        "verdict" : "ACCEPT"
>      }
>  }
> 
> I recognize, however, that this is not a trivial conversion.  As a
> start, just doing a simple substitution of "." for "_" on keys would
> probably work just fine.

For the time being, the current tip of my branch converts . and $ to _
in dynamic key names (which is stricter than what mongodb allows, but it
was simpler to implement it this way).
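
To illustrate the substitution, here is a minimal sketch in JavaScript
(the language of the mongo shell). This is not the C code in my branch,
and sanitizeKey is a made-up name used only for this example:

```javascript
// Hypothetical helper: replace every "." and "$" in a dynamic key name
// with "_", so the result is always acceptable as a mongodb field name.
function sanitizeKey(key) {
  return key.replace(/[.$]/g, "_");
}

console.log(sanitizeKey(".classifier.class")); // _classifier_class
console.log(sanitizeKey("usracct.username"));  // usracct_username
```

Note that a leading dot becomes a leading underscore with this scheme.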

I also have an idea about how to convert the stuff to a well structured
format. Actually, I have a few ideas, all with pros and cons:

#1: Insert the root document, update with dynamic values

We would insert the root document first, up to and including the
patterndb: {} sub document. Then we'd iterate over the keys, and use
mongodb's update method to add the rest of the stuff:

> db.logs.update({_id: <id>}, 
     {$set: {"patterndb.classifier.class": "system"}})

This has the upside of being almost trivial to implement, but has three
notable flaws: it results in more network traffic; inserting a log
message is no longer atomic, since the dynamic values are added one at a
time; and it has a good chance of fragmenting the database (though
mongodb is said to be clever enough to leave some padding space for
objects to grow, which might save us in this case).

It is also possible to do bulk updates, like this:

> db.logs.update({_id: <id>},
     {$set: {"patterndb.classifier.class": "system",
             "patterndb.classifier.rule_id": "4dd5a329-da83-4876-a431-ddcb59c2858c",
             "patterndb.secevt.verdict": "ACCEPT"}
     })

With this, we can reduce the whole operation to two steps: inserting the
static content first, then adding all the dynamic values in one update.
However, all of the flaws mentioned above remain; they are just less
serious than when the values are added one by one.
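
As a rough sketch of the bulk variant, the $set document could be
assembled from the flat dynamic values like this. The function name
buildSetUpdate and the way the leading dot is handled are assumptions
made for the example, not what the driver does:

```javascript
// Hypothetical helper: turn flat dynamic values into one bulk $set
// document. Each key is prefixed with the sub document name; a leading
// "." (as in ".classifier.class") is dropped so that the resulting path
// has no empty component.
function buildSetUpdate(prefix, values) {
  const set = {};
  for (const [key, value] of Object.entries(values)) {
    const path = key.startsWith(".") ? key.slice(1) : key;
    set[prefix + "." + path] = value;
  }
  return { $set: set };
}

const update = buildSetUpdate("patterndb", {
  ".classifier.class": "system",
  "secevt.verdict": "ACCEPT",
});
console.log(JSON.stringify(update));
```

The returned object would be passed as the second argument of
db.logs.update(), right after the {_id: <id>} selector.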

#2: Construct the whole document within syslog-ng

This has the upside of keeping network traffic to a minimum, and inserts
will remain atomic.

The downside is that I have no idea how to implement this properly and
reliably yet. My gut feeling is that whatever solution I end up with,
this method will be considerably slower and will require more
processing power.
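
Just to show what the conversion itself involves, here is a naive
sketch in JavaScript. It is not a proposal for the real implementation
(which would be in C and would have to be fast); nestDottedKeys is a
made-up name, and the conflict handling is the simplest thing that
could work:

```javascript
// Hypothetical helper: expand flat dotted keys into nested sub documents.
function nestDottedKeys(flat) {
  const has = (obj, name) => Object.prototype.hasOwnProperty.call(obj, name);
  const root = {};
  for (const [key, value] of Object.entries(flat)) {
    // Drop empty components, e.g. the leading "." in ".classifier.class".
    const parts = key.split(".").filter((part) => part.length > 0);
    if (parts.length === 0) continue; // nothing usable in this key
    let node = root;
    for (const part of parts.slice(0, -1)) {
      if (!has(node, part)) {
        node[part] = {}; // create intermediate levels on demand
      } else if (typeof node[part] !== "object" || node[part] === null) {
        // A plain value is already stored where a sub document is needed.
        throw new Error("key conflict at " + part);
      }
      node = node[part];
    }
    const leaf = parts[parts.length - 1];
    if (has(node, leaf) && typeof node[leaf] === "object") {
      // A sub document is already stored where a plain value is needed.
      throw new Error("key conflict at " + leaf);
    }
    node[leaf] = value;
  }
  return root;
}

const doc = nestDottedKeys({
  ".classifier.class": "system",
  "usracct.type": "login",
  "usracct.service": "ssh2",
  "secevt.verdict": "ACCEPT",
});
console.log(JSON.stringify(doc, null, 2));
```

The conflict case (the same name used both as a value and as a prefix)
is one of the things that makes a reliable version less trivial than it
looks; this sketch simply gives up when it happens.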

#3: Keep the status quo and leave it unstructured.

No extra work required on either side, and the values are still
reasonably easily queryable.

I'll implement #1 tonight, and make it so that one can choose between
that and #3, for example with a flag (dynamic_values_restructure) or
some such option. Gotta find a decent name for the flag, though.

-- 
|8]