[syslog-ng] RFC: value-pairs key rewrite framework, part N+1
Balazs Scheidler
bazsi at balabit.hu
Sat Oct 22 19:59:13 CEST 2011
On Tue, 2011-10-04 at 10:56 +0200, Gergely Nagy wrote:
> Hi!
>
> After my former mails (see the thread starting at
> http://thread.gmane.org/gmane.comp.syslog-ng/11355/focus=11421), I'd
> like to do a recap, and ask for comments, as I came up with a few new
> ideas.
>
> The purpose of the value-pairs key rewrite framework is to make it
> possible to apply various transformations to the keys we selected with
> VP. So that we can add or remove prefixes, replace parts of the string,
> and so on.
>
> As of this writing, code to do this exists on one of my branches, but it
> hasn't been touched for a while, and I planned to update it in the near
> future. And that's when it dawned on me, that perhaps the syntax isn't
> all that great.
>
> To show why I think that, let's see an example first:
>
> value-pairs(
>   scope("everything")
>   rekey(
>     add-prefix(".secevt" "events")
>     add-prefix(".classifier" "syslog-ng")
>     shift(".sdata.*" 1)
>     replace("." "_")
>   )
> );
>
> This will add an "events" prefix to each key that starts with ".secevt",
> so that ".secevt.verdict" becomes "events.secevt.verdict"; similarly,
> ".classifier.class" becomes "syslog-ng.classifier.class"; keys that
> match '.sdata.*' get shifted to the right, removing the dot. And all
> remaining dots at the beginning of a key will get replaced by an
> underscore instead.
>
> This kinda makes sense, and I could even massage the syntax into
> format-json: $(format-json --scope everything --rekey --add-prefix
> .secevt=events --add-prefix .classifier=syslog-ng --shift .sdata.*=1
> --replace .=_ --end-rekey)
>
> However, this syntax has the downside of transformations being global: I
> can't choose a subset of my keys, and apply a list of transformations on
> those and only those. Once I made a transformation, any transformations
> in the list afterwards will see the transformed key. So I can't easily
> say: "take all the keys starting with '.sdata.', shift them 6 chars,
> then replace any key names that start with 'win' with 'lose', and
> finally prefix them with 'whatever'". With the current syntax, that's
> next to impossible to do sanely.
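
A minimal Python sketch of the problem described above, assuming the
semantics implied by the example: shift drops leading characters from keys
matching a glob, and replace swaps a leading substring. The helper names
(shift, replace_prefix, rekey_global) and the sample keys are made up for
illustration and are not syslog-ng APIs; the shift count of 7 is chosen here
so that the whole ".sdata." prefix is dropped.

```python
from fnmatch import fnmatchcase


def shift(glob, count):
    """Hypothetical helper: drop the first `count` chars of keys matching `glob`."""
    return lambda key: key[count:] if fnmatchcase(key, glob) else key


def replace_prefix(old, new):
    """Hypothetical helper: swap a leading `old` for `new`."""
    return lambda key: new + key[len(old):] if key.startswith(old) else key


def rekey_global(keys, transforms):
    """Apply every transformation, in order, to every key (global rekey).

    Each transformation sees the output of the previous one, so a later
    step cannot tell which keys were selected by an earlier step.
    """
    result = {}
    for key in keys:
        new_key = key
        for transform in transforms:
            new_key = transform(new_key)
        result[key] = new_key
    return result


renamed = rekey_global(
    [".sdata.win.id", "win.host"],
    [shift(".sdata.*", 7), replace_prefix("win", "lose")],
)
print(renamed)
```

In this sketch "win.host" is renamed to "lose.host" as well, even though
only the ".sdata." keys were meant to be touched.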
>
> It's also not all that intuitive.
>
> So I came up with a different syntax: wiring rekey into the key() option
> of value-pairs! That way, we already selected a subset to work on, and
> the transformations would apply to only those.
>
> (This could be combined with the global syntax as well, though)
>
> So it'd look something like this:
>
> value-pairs(
>   scope("everything")
>   key(".secevt.*" rekey(add-prefix("events")))
>   key(".classifier.*" rekey(add-prefix("syslog-ng")))
>   key(".sdata.*" rekey(shift(1)))
>   key(".*" rekey(replace("." "_")))
> );
>
> This would achieve the exact same effect as the example above, but with
> a clearer syntax, perhaps. It would also mean that key rewriting can be
> described at the same place where the key is selected to begin with.
>
> The downside of this is that it'd be a bit harder to come up with a
> syntax for format-json that mimics the config file syntax.
>
> So, instead of trying to do that and end up with something horrible, I
> have another proposal: let's make value-pairs a top-level citizen, so
> that it joins the ranks of filter{} and rewrite{} and the like!
>
> That way, we could turn the following ugly thing:
>
> destination d_structured {
>   mongodb(
>     value-pairs(
>       scope("everything")
>       key(".secevt.*" rekey(add-prefix("events")))
>       key(".classifier.*" rekey(add-prefix("syslog-ng")))
>       key(".sdata.*" rekey(shift(1)))
>       key(".*" rekey(replace("." "_")))
>     )
>   );
>   file("/var/log/structured.json" template("$(format-json <repeat the above stuff, but with format-json syntax>)\n"));
> };
>
> Into this beauty:
>
> valuepairs vp_example {
>   scope(everything);
>   key(".secevt.*" rekey(add-prefix("events")));
>   key(".classifier.*" rekey(add-prefix("syslog-ng")));
>   key(".sdata.*" rekey(shift(1)));
>   key(".*" rekey(replace("." "_")));
> };
>
> destination d_structured {
>   mongodb(value-pairs(vp_example));
>   file("/var/log/structured.json" template("$(format-json --with-config vp_example)\n"));
> };
>
> This would have the nice consequence of not having to keep two parsers
> in sync: format-json would only have a --with-config option, and nothing
> else.
>
> So, in the end, this whole thing boils down to two questions:
>
> * What do you think about allowing (or even moving) rekey() inside key()
> options?
> * What do you think about introducing a top-level valuepairs element,
> and dropping the format-json argument parsing stuff?

We've discussed this IRL and came to the conclusion that it is very
handy to allow key-rewrite to be applied on a per-glob basis (i.e. to
associate the rewrite function with the set specified by --key).

We've decided against introducing the top-level value-pairs element in
the configuration, and instead came up with a possible command-line-like
syntax.

Something along the lines of:

$(format-json --key .cee.* --rewrite replace .cee=Event)
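
To illustrate what per-glob rewriting would mean, here is a rough Python
sketch. It is only a model of the idea, not syslog-ng code: the names
replace_prefix and rekey_scoped are hypothetical, the sample keys are made
up, and it assumes that a rule's rewrites apply to every key whose original
name matches that rule's glob.

```python
from fnmatch import fnmatchcase


def replace_prefix(old, new):
    """Hypothetical helper: swap a leading `old` for `new`."""
    return lambda key: new + key[len(old):] if key.startswith(old) else key


def rekey_scoped(keys, rules):
    """Apply each rule's rewrites only to keys selected by that rule's glob.

    `rules` is a list of (glob, [rewrite, ...]) pairs. Matching is done
    against the original key name (an assumption of this sketch), so keys
    outside the glob are never touched by the attached rewrites.
    """
    result = {}
    for key in keys:
        new_key = key
        for glob, rewrites in rules:
            if fnmatchcase(key, glob):
                for rewrite in rewrites:
                    new_key = rewrite(new_key)
        result[key] = new_key
    return result


# Models: --key .cee.* --rewrite replace .cee=Event
rules = [(".cee.*", [replace_prefix(".cee", "Event")])]
result = rekey_scoped([".cee.action", ".sdata.cee.x", "HOST"], rules)
print(result)
```

Only ".cee.action" is renamed (to "Event.action") in this model; the other
two keys fall outside the glob and keep their names.
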
--
Bazsi