[syslog-ng] [RFC] value-pairs and key rewriting
Martin Holste
mcholste at gmail.com
Tue May 10 22:24:19 CEST 2011
Yep, I think you're on the right track: some key rewriting will
definitely be necessary for Mongo. I'm a bit concerned about
performance, but Mongo will probably be the bottleneck anyway once
things don't fit in RAM.
On Tue, May 10, 2011 at 2:07 PM, Gergely Nagy <algernon at balabit.hu> wrote:
> Hi!
>
> Now that value-pairs() is in 3.3, it's time to dig up an idea Bazsi and
> I were discussing way back when we first talked about value-pairs: a way
> to change the keys in a value-pairs set, without the need to explicitly
> specify them all using pair().
>
> It's actually easier to explain this by explaining the need behind this
> feature: there's the MongoDB destination, and by default, SDATA goes
> under the "sdata" key, somewhat like this:
>
> {
>   "sdata": {
>     "test": "value"
>   }
> }
>
> Now, if I'd rather have those values under, say, "sd", I can't do that
> with the current driver, because I can't tell value-pairs() that "sdata"
> should be mapped to "sd" instead. The best I can do is exclude
> ".SDATA.*" and then either use "$SDATA" and post-process it, or list all
> the .SDATA.* keys explicitly. Neither of these is good enough.
>
> So, I propose that we should have a way to remedy this problem, and this
> remedy should be called "rekey()".
>
> The way I imagine it is something like this:
>
> value-pairs (
>   scope("selected-macros" "nv-pairs")
>   rekey(
>     regexp("^\.SDATA\.(.*)" "sd.$1")
>     prefix(".secevt.*" "events")
>     prefix("[A-Z]*" "syslog.")
>   )
> )
>
> This would do the following:
>
> - Any key that begins with ".SDATA." will have that part replaced with
> "sd."
> - Keys matching ".secevt.*" (shell glob, not regexp) will be prefixed
> with "events". Thus ".secevt.verdict" would become
> "events.secevt.verdict".
> - Keys that are all uppercase would be prefixed with "syslog.", thus
> "HOST" would become "syslog.HOST"
>
> The transformations would be applied to the raw set of keys, in the
> order they're listed in the configuration file. Initially, only regexp()
> and prefix() would be implemented, with the possibility of adding more
> if the need arises.
>
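To make sure I'm reading the proposed semantics correctly, here is a
rough Python sketch of what rekey() would do to a set of keys. This is
not syslog-ng code: the function names simply mirror the proposal, the
glob matching uses Python's fnmatch as a stand-in for "shell glob", and I
am assuming each transformation sees the output of the previous one.

```python
import fnmatch
import re


def regexp(pattern, replacement):
    """Transform that rewrites keys matching a regular expression."""
    compiled = re.compile(pattern)
    # The proposal writes back-references as $1; Python's re expects \1.
    py_replacement = re.sub(r"\$(\d+)", r"\\\1", replacement)

    def apply(key):
        return compiled.sub(py_replacement, key, count=1)

    return apply


def prefix(glob, text):
    """Transform that prepends text to keys matching a shell-style glob."""

    def apply(key):
        return text + key if fnmatch.fnmatchcase(key, glob) else key

    return apply


def rekey(pairs, transforms):
    """Apply the transforms to every key, in the order they are listed."""
    result = {}
    for key, value in pairs.items():
        for transform in transforms:
            key = transform(key)  # assumption: later rules see earlier output
        result[key] = value
    return result


# The three rules from the proposed configuration, in the same order.
TRANSFORMS = [
    regexp(r"^\.SDATA\.(.*)", "sd.$1"),
    prefix(".secevt.*", "events"),
    prefix("[A-Z]*", "syslog."),
]

if __name__ == "__main__":
    sample = {".SDATA.test": "value", ".secevt.verdict": "ACCEPT", "HOST": "localhost"}
    for new_key, value in rekey(sample, TRANSFORMS).items():
        print(new_key, "=", value)
```

One thing this surfaced: with fnmatch-style matching, "[A-Z]*" matches
any key that starts with an uppercase letter, not only keys that are
entirely uppercase. That may or may not be what the real implementation
should do.
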
> This would also solve another problem I encountered recently: if the
> value-pairs() result set contains both "$SDATA" and "$SDATA.*" (which is
> the case if one specifies scope("selected-macros" "nv-pairs") and the
> incoming message has structured data), then we'll have a key conflict in
> the MongoDB destination, because internally "foo.bar" gets translated to
> (using JSON notation):
>
> { "foo": { "bar": ... } }
>
> Now, in the case of SDATA, this translates to something like the
> following:
>
> {
>   SDATA: "[foo=bar]", // $SDATA
>   SDATA: {
>     "foo": "bar" // $SDATA.foo
>   }
> }
>
> This is because the MongoDB destination strips the leading dot at the
> moment (because that would be invalid too), and we end up with
> conflicting types: one string, and one object. The driver does not
> support overriding right now, so this is a problem.
>
> I could, of course, change the driver to replace the dot with an
> underscore, but that would be costlier than the current stripping, and
> would still be ugly, in my opinion.
>
> It's much nicer to allow the users to rewrite the keys instead, or
> prefix them.
>
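For anyone following along, the key conflict can be reproduced with a
small Python sketch. It only models the behaviour described above (strip
one leading dot, split on dots, build nested objects); the nest() helper
and the sample data are made up for illustration and are not the driver's
actual code.

```python
def nest(pairs):
    """Expand dotted keys into nested dicts, refusing to override a value."""
    root = {}
    for key, value in pairs.items():
        # Model the driver stripping a single leading dot from the key.
        parts = (key[1:] if key.startswith(".") else key).split(".")
        node = root
        for part in parts[:-1]:
            child = node.setdefault(part, {})
            if not isinstance(child, dict):
                raise ValueError(f"key conflict at {part!r}: string vs. object")
            node = child
        leaf = parts[-1]
        if leaf in node:
            raise ValueError(f"key conflict at {leaf!r}: already set")
        node[leaf] = value
    return root


# Both $SDATA and an SDATA sub-key in one result set: the conflicting case.
CONFLICTING = {"SDATA": "[foo=bar]", ".SDATA.foo": "bar"}

# The same data after a rename such as regexp("^\.SDATA\.(.*)" "sd.$1").
RENAMED = {"SDATA": "[foo=bar]", "sd.foo": "bar"}

if __name__ == "__main__":
    try:
        nest(CONFLICTING)
    except ValueError as err:
        print("conflict:", err)
    print(nest(RENAMED))
```

Once the sub-keys are moved under "sd", the string and the object no
longer compete for the same name, which is the case for doing the
rewrite in value-pairs() rather than in the driver.
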
> That's about how far I got with thinking for now. Critique, comments and
> ideas would be most appreciated.
>
> (PS: This is, of course, strictly 3.4 material, as 3.3 is in a feature
> freeze)
>
> --
> |8]