[syslog-ng] tfjson: JSON-generating template function

Gergely Nagy algernon at balabit.hu
Fri Jan 21 17:27:47 CET 2011


Hi!

Looks like the list didn't like me sending the whole patch, so I'll have
to omit the patch part this time, and hope that the wall of text below
won't hit the size limit.

Based on the work of Balint Kovacs <blint at balabit.hu>, a $(format_json)
template function was implemented: with this function, one can hammer
log messages into JSON format.

While that is not immediately useful, we have a few things planned that
will - with a bit of luck - be able to use this function.

The function itself, without arguments, will reformat the log message
into JSON. For example, a loggen-generated message will look like this
(formatting added by me):

  { "HOST": "localhost", "HOST_FROM": "localhost", 
    "MESSAGE": "seq: 0000000000, thread: 0000, runid: 1295625842, stamp: 2011-01-21T17:04:02 PADD...",
    "PROGRAM": "prg00000", "PID": "1234", "SOURCE": "s_tcp",
    "LEGACY_MSGHDR": "prg00000[1234]: ", ".classifier.class": "unknown" }

While this is all cool, there's more! The function accepts parameters:
'--select <GLOB>' and '--exclude <GLOB>', where <GLOB> is a shell-like
glob pattern. One can have any number of selects and excludes. A key is
included only if it matches at least one of the select globs and none
of the exclude globs: if it matches no select glob, it is left out, and
if it matches a select glob but also an exclude glob, it is dropped.
The default select glob is "*", which is why everything is included by
default.

The globs are matched against the keys: macro names without the dollar
sign, patterndb keys, and so on.
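
To make the select/exclude rule concrete, here is a minimal sketch in
Python. It is not the C code from tfjson.c, only an illustration: it
assumes that Python's fnmatch globbing is close enough to the shell-like
globs the function uses, and is_included as well as the sample keys
(including the two ending in "id") are made up for the example.

```python
from fnmatch import fnmatchcase


def is_included(key, selects=("*",), excludes=()):
    """Return True if key passes the select/exclude rule described above."""
    # A key has to match at least one select glob...
    if not any(fnmatchcase(key, glob) for glob in selects):
        return False
    # ...and must not match any of the exclude globs.
    return not any(fnmatchcase(key, glob) for glob in excludes)


# Hypothetical name-value pairs, for illustration only.
pairs = {
    ".classifier.class": "system",
    ".classifier.rule_id": "made-up-rule-id",
    "usracct.username": "algernon",
    "usracct.sessionid": "31930",
    "HOST": "moria",
}

kept = {
    key: value
    for key, value in pairs.items()
    if is_included(key,
                   selects=[".classifier.*", "usracct.*"],
                   excludes=["*.*id"])
}
print(sorted(kept))
```

With no arguments at all, the default select of "*" lets every key
through, which mirrors the behaviour of the function without options.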

With this knowledge, we can tweak the template like this:

  $(format_json --select .classifier.* --select usracct.* --exclude *.*id)

Then, given the following message:

  <86>Jan 21 17:10:21 moria sshd[31930]: pam_unix(sshd:session): session closed for user algernon

We'll end up with this JSON:

  { ".classifier.class": "system", "usracct.username": "algernon", 
    "usracct.type": "logout", "usracct.application": "sshd" }

Now, that's not very useful, as we're lacking the HOST and a whole lot
of other things. Adding all that as selects would be a bit excessive,
therefore one can list macro names (without the dollar sign) along with
the select/exclude options, and they'll be included too.

Furthermore, one can list key=value pairs. For example, given a log
message much like the one above, and the following template (where -s
and -x are the short forms of --select and --exclude):

  $(format_json -s .classifier.* -s usracct.* -x *.*id HOST MESSAGE prg='$PROGRAM[$PID]')

We'll end up with:

  { ".classifier.class": "system", "usracct.username": "algernon",
    "usracct.type": "logout", "usracct.application": "sshd",
    "MESSAGE": "pam_unix(sshd:session): session closed for user algernon",
    "prg": "sshd[32216]", "HOST": "localhost" }
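
How the three kinds of arguments combine can be sketched roughly like
this in Python. Again, this is only an illustration and not how tfjson.c
does it: format_json_sketch and its parameters are hypothetical names,
fnmatch stands in for the real glob matcher, and the "usracct.sessionid"
key is invented to exercise the exclude glob. The sketch also assumes
that explicitly listed macro names bypass the select/exclude globs, and
it takes the key=value pairs already expanded, since template expansion
is out of its scope.

```python
import json
from fnmatch import fnmatchcase


def format_json_sketch(values, selects=(), excludes=(), names=(), extra=None):
    """Build a JSON object from glob-selected keys, named macros and extras."""
    selects = list(selects) or ["*"]  # the default select glob is "*"
    out = {}
    # 1. Keys picked by the select globs, minus those hit by an exclude glob.
    for key, value in values.items():
        selected = any(fnmatchcase(key, glob) for glob in selects)
        excluded = any(fnmatchcase(key, glob) for glob in excludes)
        if selected and not excluded:
            out[key] = value
    # 2. Explicitly listed macro names (assumed to bypass the globs).
    for name in names:
        if name in values:
            out[name] = values[name]
    # 3. key=value pairs, passed in already expanded.
    out.update(extra or {})
    return json.dumps(out)


values = {
    ".classifier.class": "system",
    "usracct.username": "algernon",
    "usracct.type": "logout",
    "usracct.application": "sshd",
    "usracct.sessionid": "32216",  # hypothetical key, dropped by "*.*id"
    "HOST": "localhost",
    "PROGRAM": "sshd",
    "PID": "32216",
    "MESSAGE": "pam_unix(sshd:session): session closed for user algernon",
}

output = format_json_sketch(
    values,
    selects=[".classifier.*", "usracct.*"],
    excludes=["*.*id"],
    names=["HOST", "MESSAGE"],
    # Stands in for prg='$PROGRAM[$PID]' after expansion.
    extra={"prg": values["PROGRAM"] + "[" + values["PID"] + "]"},
)
print(output)
```

All values are plain strings here, matching the strings-only limitation
of the function.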

At the moment, only strings are supported, and there is no way to
structure the JSON properly yet. We'll figure out a way to do that too,
eventually. There are a few more things in the pipeline for tfjson, so
treat this version as a beta.

The code is available on the integration/tfjson/base branch of my
repository at git://git.madhouse-project.org/syslog-ng/syslog-ng-3.3.git

Browsable online at:
http://git.madhouse-project.org/syslog-ng/syslog-ng-3.3/tree/modules/tfjson/tfjson.c?h=integration/tfjson/base

The bulk of the work was done by Balint; I just happened to accidentally
refactor it a tiny bit while reviewing his code.

-- 
|8]



