Balazs Scheidler <bazsi@balabit.hu> writes:
hi,
we had a conversation with Algernon a couple of weeks ago about how mongodb should handle the value collisions which arise when both the SDATA and the .SDATA.<id>.<param> values are to be added to the document.
As a reminder, mongodb strips the initial dot, and subsequent dots are used to break down values as subdocuments. This means that at the top level both SDATA as value and SDATA as a document are present.
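To make the collision concrete, here is a minimal sketch in C of the key mapping described above. It is not the actual mongodb destination code: the helper names strip_initial_dot() and value_vs_document_collision() are hypothetical, and the example parameter names are made up for illustration. It only checks whether two value names end up on the same top-level key, one as a plain value and the other as a subdocument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: skip the initial dot of a value name, if any. */
const char *strip_initial_dot(const char *name)
{
    assert(name != NULL);
    return (name[0] == '.') ? name + 1 : name;
}

/*
 * Hypothetical helper: returns 1 if the two names share a top-level key
 * where one is a plain value and the other is the root of a subdocument,
 * and 0 otherwise.
 */
int value_vs_document_collision(const char *a, const char *b)
{
    const char *ka = strip_initial_dot(a);
    const char *kb = strip_initial_dot(b);

    /* Length of the top-level key: everything up to the first remaining dot. */
    size_t la = strcspn(ka, ".");
    size_t lb = strcspn(kb, ".");

    /* A name with no remaining dots is stored as a plain value. */
    int a_is_value = (ka[la] == '\0');
    int b_is_value = (kb[lb] == '\0');

    /* Different top-level keys never collide. */
    if (la != lb || strncmp(ka, kb, la) != 0)
        return 0;

    return a_is_value != b_is_value;
}
```

With this sketch, "SDATA" and ".SDATA.meta.sequenceId" are reported as colliding, while two .SDATA.<id>.<param> names are not, since both simply become fields of the same subdocument.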
I came up with the following solution, which is a change in how mongodb currently works. I'd like to do this before releasing 3.3beta. Feedback is welcome.
* the SDATA macro is not included in rfc5424, selected-macros, but can be explicitly specified by key(), pair() and it's still included in all-macros
* a new "sdata" scope is introduced which expands to the .SDATA.<id>.<param> values, but the SDATA is not included either.
* nv-pairs is split: nv-pairs contains values which have no leading dot, dot-nv-pairs only contains the ones that have leading dots. all-nv-pairs contain both.
This way, the SDATA macro is only included if someone really wants it, so a collision is much less likely.
Thus far, I'm all for it, and really like the proposed changes.
Also, I'd like to replace the initial dot in mongodb with an initial underscore, thereby eliminating the possibility of collisions completely.
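The two ways of handling the leading dot can be sketched in C as follows. This is only an illustration under assumptions: key_strip_dot() and key_dot_to_underscore() are hypothetical names, the example value name is made up, and the actual BSON append call is left out. Stripping returns a pointer into the original string, while replacing needs a writable copy of the name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: strip the initial dot.
 * No allocation or copying, the result points into the original string.
 */
const char *key_strip_dot(const char *name)
{
    assert(name != NULL);
    return (name[0] == '.') ? name + 1 : name;
}

/*
 * Hypothetical helper: replace the initial dot with an underscore.
 * The name has to be copied first; the caller must free() the result.
 * Returns NULL if the allocation fails.
 */
char *key_dot_to_underscore(const char *name)
{
    assert(name != NULL);

    size_t len = strlen(name);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, name, len + 1);   /* includes the terminating NUL */
    if (copy[0] == '.')
        copy[0] = '_';

    return copy;
}
```

For a name such as ".SDATA.meta.sequenceId", the first helper yields "SDATA.meta.sequenceId" (the same top-level key as the SDATA macro), while the second yields "_SDATA.meta.sequenceId", which cannot clash with it.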
However, this is something I don't particularly like... Stripping the dot is a very cheap operation that does not involve any extra allocation or copying: I just pass name+1 to bson_append_string() instead of name, and that's about it. Replacing the dot with an underscore would be a much more costly operation.
Furthermore, when I'm storing sdata in mongodb in a structured manner, I explicitly want it NOT to have a leading dot or underscore: while leading-dot values might be classified as internal syslog-ng stuff, once I export them to a database they're not internal anymore, and shouldn't be distinguished from any other data, in my opinion.
This does mean that collisions are a bit more likely, but personally, I can live with that, especially since the $SDATA macro is only included in the set if explicitly added. There is a possibility that other things might conflict, but the chances are a lot lower.
In the long run, I think that key rewriting is the way forward (but that's 3.4 material), and if we do go that way, I don't see much point in introducing a conflicting idea in 3.3.
-- 
|8]