[syslog-ng] value-pairs and sdata
Gergely Nagy
algernon at balabit.hu
Thu May 19 11:33:24 CEST 2011
Balazs Scheidler <bazsi at balabit.hu> writes:
> hi,
>
> we had a conversation with Algernon a couple of weeks ago about how mongodb
> should handle value collisions which arise when both the SDATA and the
> .SDATA.<id>.<param> values are to be added to the document.
>
> As a reminder, mongodb strips the initial dot, and subsequent dots are
> used to break down values as subdocuments. This means that at the top
> level both SDATA as value and SDATA as a document are present.
>
> I came up with the following solution, which changes how mongodb currently works. I'd like to do this before releasing 3.3beta. Feedback is welcome.
>
> * the SDATA macro is not included in the rfc5424 and selected-macros scopes, but it can be specified explicitly with key() or pair(), and it's still included in all-macros
> * a new "sdata" scope is introduced, which expands to the .SDATA.<id>.<param> values but does not include the SDATA macro either.
> * nv-pairs is split: nv-pairs contains the values which have no leading dot, dot-nv-pairs contains only the ones that have a leading dot, and all-nv-pairs contains both.
>
> This way, SDATA macro is only included if someone really wants it,
> thus collision is much less likely.
Thus far, I'm all for it, and really like the proposed changes.
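For reference, the collision these scope changes are meant to avoid can be sketched roughly as below. This is only an illustration in Python, not the actual mongodb destination code: insert_value() and the sample SDATA names are made up for the example, and the only behaviour assumed is what is described above (leading dot stripped, remaining dots split into subdocuments).

```python
def insert_value(doc: dict, name: str, value: str) -> None:
    """Insert one name-value pair into a nested document.

    Hypothetical helper: a leading dot is stripped, and the remaining
    dots break the name down into subdocuments.
    """
    key = name[1:] if name.startswith(".") else name
    *parents, leaf = key.split(".")
    node = doc
    for part in parents:
        child = node.setdefault(part, {})
        if not isinstance(child, dict):
            # A plain value already sits where a subdocument is needed.
            raise KeyError(f"{part!r} already holds a plain value")
        node = child
    if isinstance(node.get(leaf), dict):
        # A subdocument already sits where a plain value is needed.
        raise KeyError(f"{leaf!r} already holds a subdocument")
    node[leaf] = value


doc = {}
insert_value(doc, ".SDATA.meta.sequenceId", "1")
try:
    insert_value(doc, "SDATA", '[meta sequenceId="1"]')
except KeyError as err:
    print("collision:", err)
```

With the SDATA macro left out of the default scopes, the second call simply never happens unless someone asks for it explicitly.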
> Also, I'd like to replace the initial dot in mongodb to initial
> underscore, this way eliminating the possibility of collisions
> completely.
However, this is something I don't particularly like... Stripping the
dot is a very cheap operation that does not involve any extra allocation
or copying: I just pass name+1 to bson_append_string() instead of name,
and that's about it.
Replacing the dot with an underscore would be a much more costly
operation, since it needs a modified copy of the name.
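To illustrate the difference in cost, here is a rough Python analogue. It is not the driver code (that is C, where stripping is just name+1); the function names are invented for the example, and memoryview is used only because it gives a zero-copy view comparable to a pointer offset.

```python
def strip_leading_dot(name: bytes) -> memoryview:
    """Return a view past a leading dot without copying the name."""
    view = memoryview(name)
    # Slicing a memoryview shares the original buffer: no new allocation
    # for the name's contents, much like passing name + 1 in C.
    return view[1:] if name.startswith(b".") else view


def underscore_leading_dot(name: bytes) -> bytes:
    """Return the name with a leading dot replaced by an underscore."""
    if not name.startswith(b"."):
        return name
    # The original must stay untouched, so a new buffer is built and the
    # rest of the name is copied into it.
    return b"_" + name[1:]


name = b".SDATA.meta.sequenceId"
print(bytes(strip_leading_dot(name)))
print(underscore_leading_dot(name))
```

The first variant only moves the starting point of the key; the second has to allocate and copy for every dotted value of every message.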
Furthermore, when I'm storing sdata in mongodb, in a structured manner,
then I explicitly want it to NOT have a leading dot or underscore: while
leading-dot stuff might be classified as internal syslog-ng stuff, when
I export them to a database, they're not internal anymore, and shouldn't
be distinguished from any other data, in my opinion.
This does mean that collisions are a bit more likely, but personally, I
can live with that. Especially since the $SDATA macro is only included
in the set if explicitly added. There is a possibility that other things
might conflict, but the chances are a lot lower.
In the long run, I think that key rewriting is the way forward (but
that's 3.4 material), and if we do go that way, I don't see much point
in introducing a conflicting idea in 3.3.
--
|8]