[syslog-ng] RFC/RFH: JSON status report and plans

Gergely Nagy algernon at balabit.hu
Fri Oct 21 15:32:32 CEST 2011


Hi!

To support CEE[1] better, syslog-ng's JSON support needs a few things.
I'd like to take this opportunity to give a heads-up on what's still
missing, and to present a few easy tasks that are low-hanging fruit for
aspiring syslog-ng hackers to grab.

Whitespace
----------

First of all, for compactness, the JSON we generate should be free of
any unnecessary whitespace. At the moment, we generate JSON like this:

{ "foo": "bar", "baz": "quux" }

The whitespace between key and value should be dropped, as should the
whitespace inside the curly brackets, so that we'd generate the
following instead:

{"foo":"bar","baz":"quux"}

This is fairly trivial to fix, at least when using json-c (and that's
the only case we need fixed, for other reasons):

* Each json_object struct has a ->_to_json_string function pointer, which
  is used to format the object as a string.
* The default implementation puts in whitespace.
* The fix is:
  + #include <json_object_private.h>
  + copy the json_object_object_to_json_string() function from
  json_object.c, changing it to not emit extra whitespace
  + Override the ->_to_json_string of the root JSON object in syslog-ng's
  modules/tfjson/tfjson.c's tf_json_append(), so that it points to the
  new function that doesn't do whitespace.
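
To make the target format concrete, here is a minimal, self-contained
sketch in C. It deliberately does not touch json-c: struct outbuf,
out_char(), out_str(), out_json_string() and format_compact() are names
made up for this illustration, and the fixed 256-byte buffer is an
arbitrary assumption. The real fix would live in the overridden json-c
serializer described above; this only shows what it should emit.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: fixed-size output buffer that records overflow. */
struct outbuf {
    char data[256];
    size_t len;
    int overflow;
};

/* Append one character, always keeping the buffer NUL-terminated. */
void out_char(struct outbuf *b, char c)
{
    if (b->len + 1 < sizeof(b->data)) {
        b->data[b->len++] = c;
        b->data[b->len] = '\0';
    } else {
        b->overflow = 1;
    }
}

void out_str(struct outbuf *b, const char *s)
{
    while (*s)
        out_char(b, *s++);
}

/* Emit s as a quoted JSON string, escaping quotes, backslashes and
   control characters. */
void out_json_string(struct outbuf *b, const char *s)
{
    char esc[8];

    out_char(b, '"');
    for (; *s; s++) {
        unsigned char c = (unsigned char)*s;

        if (c == '"' || c == '\\') {
            out_char(b, '\\');
            out_char(b, (char)c);
        } else if (c < 0x20) {
            snprintf(esc, sizeof(esc), "\\u%04x", (unsigned)c);
            out_str(b, esc);
        } else {
            out_char(b, (char)c);
        }
    }
    out_char(b, '"');
}

/* Format n key/value pairs as one compact object: no space after '{',
   ':' or ',', and none before '}'.
   Returns 0 on success, -1 if the buffer was too small. */
int format_compact(struct outbuf *b, const char *keys[],
                   const char *values[], size_t n)
{
    size_t i;

    b->len = 0;
    b->overflow = 0;
    b->data[0] = '\0';

    out_char(b, '{');
    for (i = 0; i < n; i++) {
        if (i > 0)
            out_char(b, ',');
        out_json_string(b, keys[i]);
        out_char(b, ':');
        out_json_string(b, values[i]);
    }
    out_char(b, '}');
    return b->overflow ? -1 : 0;
}
```

Calling format_compact() with the keys "foo", "baz" and the values "bar",
"quux" leaves the compact form shown above in b->data.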

This is a trivial change, and the steps above pretty much spell it out.
If someone can turn that into a patch, that would be grand!

Nested JSON
-----------

Another thing that would be very good to have, and not only for JSON,
but for the MongoDB driver as well, is the ability to generate nested
objects.

So that a key-value pair like "foo.bar"="quux" would end up as
{"foo":{"bar":"quux"}} when formatted as JSON.

That is, we'd need to split up keys by "." chars, and treat the first
piece as the parent object.
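
For a single pair, the splitting can be sketched roughly like this. It is
only an illustration: format_nested_pair() is a made-up name, it handles
one pair in isolation (merging several pairs under one parent is the
harder part), and it assumes that keys and values need no JSON escaping
and that keys contain no empty components.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format one dotted key and its value as nested
   JSON. Assumes key and value contain no characters that need JSON
   escaping, and that the key has no leading, trailing or doubled dots.
   Returns 0 on success, -1 if out is too small. */
int format_nested_pair(char *out, size_t cap, const char *key,
                       const char *value)
{
    size_t len = 0, depth = 0;
    const char *start = key;
    int n;

    for (;;) {
        const char *dot = strchr(start, '.');
        size_t part = dot ? (size_t)(dot - start) : strlen(start);

        /* Open one object per key component: {"component": */
        n = snprintf(out + len, cap - len, "{\"%.*s\":", (int)part, start);
        if (n < 0 || (size_t)n >= cap - len)
            return -1;
        len += (size_t)n;
        depth++;

        if (!dot)
            break;
        start = dot + 1;
    }

    /* The value goes into the innermost object, as a string. */
    n = snprintf(out + len, cap - len, "\"%s\"", value);
    if (n < 0 || (size_t)n >= cap - len)
        return -1;
    len += (size_t)n;

    /* Close every object that was opened. */
    while (depth-- > 0) {
        if (len + 1 >= cap)
            return -1;
        out[len++] = '}';
    }
    out[len] = '\0';
    return 0;
}
```

With the key "foo.bar" and the value "quux", this produces
{"foo":{"bar":"quux"}}; a key without dots still yields a flat object.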

The harder part is that keys are not ordered within syslog-ng, and there
might be conflicts (i.e., syslog-ng allows "foo" and "foo.bar" to
coexist, but we can't format that as JSON if we split by dots, because
foo would then be a string and an object at the same time), which need
to be handled one way or the other.

My idea is to sort the keys in descending order, so that if we have
"foo.aaa", "foo.bbb", and "foo", we'd end up with the list "foo.bbb",
"foo.aaa", "foo". We'd then split them up at the dots and pack the
values into a foo object. When we later reach the plain "foo" key, we'd
notice that we already have a foo object, so a foo string is a no-no,
and throw a hissy fit (an error, that is).

I'm not going to explain the whole algorithm here, but it's not too hard
to figure out.
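
The conflict check on its own can be sketched as follows. This is not the
whole algorithm, and not necessarily how it would end up in syslog-ng:
cmp_desc() and find_nesting_conflict() are made-up names, keys are
assumed to be plain C strings compared byte-wise, and the function only
reports a conflict instead of building any objects.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator: descending byte-wise order of C strings. */
int cmp_desc(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;

    return strcmp(*sb, *sa);
}

/* Sort keys in descending order (in place), then return a key that
   would have to be both a string and an object (for example "foo" next
   to "foo.bar"), or NULL if the set can be nested without conflict. */
const char *find_nesting_conflict(const char *keys[], size_t n)
{
    size_t i, j;

    qsort(keys, n, sizeof(keys[0]), cmp_desc);

    for (i = 1; i < n; i++) {
        size_t len = strlen(keys[i]);

        /* After a descending sort, every key that has keys[i] as a
           prefix sits in one contiguous run directly before keys[i]. */
        for (j = i; j > 0 && strncmp(keys[j - 1], keys[i], len) == 0; j--) {
            /* A longer key continuing with '.' means keys[i] is also
               needed as a parent object. */
            if (keys[j - 1][len] == '.')
                return keys[i];
        }
    }
    return NULL;
}
```

For "foo.aaa", "foo.bbb" and "foo", the array ends up ordered as
"foo.bbb", "foo.aaa", "foo", and the function returns "foo" as the
offending key.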

The question here is two-fold. First, is anyone up for implementing this
(I can provide assistance)? Second, should this nesting be enabled by
default? Should it be an option at all, or the only supported way?

(I tend to side with going with nesting and not providing an option to
disable it, but I can be persuaded otherwise)

JSON parser
-----------

Another thing that's reasonably easy to put together is a JSON parser:
something like the existing CSV and patterndb parsers, except that it
would parse JSON using the json-c library.

I have a few rough ideas, and am happy to provide help if someone feels
up to the task of tackling this and developing the feature. =)

Other pending issues
--------------------

There's also a case for types. It would be beneficial if numbers could
be represented as, well, numbers in the generated JSON (and similarly,
stored as numbers when using the MongoDB destination). This, however,
requires some deeper changes, and I'm not yet familiar with that part of
the code, so I'm not going to go into details.

Nevertheless, I felt I needed to mention this here, so that it doesn't
fall through the cracks.

 [1]: http://cee.mitre.org/

-- 
|8]


