On Sat, 2010-09-04 at 20:40 -0500, Martin Holste wrote:
Multi-value N=V are evil. They kill log parsers and RDBMS :-) We did think a lot about this conundrum of src_IP="10.10.1.2,10.10.1.3" and might well recommend that it never happens. If we have to deaggregate logs (thus exploding the volume) the whole thing would be a mess...
Yes, they are evil. I was re-reading the recent thread "[syslog-ng] [announce] patterndb project," and I think we were in agreement that tags are still a good thing, though. So, how do we store the multi-value N=V but also have the flexibility of tags? My thought is maybe we go with a "primary" tag which is the class, and then the
What I'm thinking right now is to make it possible to create a "tagdb", independent of the patterndb database (although the two must go hand in hand). This tagdb would define the tag hierarchy (basically tags in bunches) and could perhaps also associate a type with each tag.

For example, Anton said that CEE is moving in the direction of providing OAS (=object, action, status) tag triplets for each log message. This type information could be represented either with the hierarchy or with a "type" field.

For example (representing tag types with a hierarchy):

<tagdb>
  <bunch name="object">
    <tag name="flowevt"/>
  </bunch>
  <bunch name="status">
  </bunch>
  <bunch name="action">
    <tag name="secevt"/>
  </bunch>
</tagdb>

For example (representing tag types explicitly):

<tagdb>
  <bunch name="security">
    <tag type="object" name="flowevt"/>
    <tag type="action" name="secevt"/>
  </bunch>
  <bunch name="storage">
    <tag type="object" name="file"/>
    <tag type="object" name="database"/>
  </bunch>
  <tag type="class" name="violation"/>
  <tag type="class" name="security"/>
  <tag type="class" name="system"/>
  <tag type="class" name="unknown"/>
  <tag name="just-a-simple-tag-without-type"/>
</tagdb>

The two are more or less equivalent if a single tag can belong to multiple bunches, which I guess it can. The difference is that the "type" property of the tag can be used more easily by syslog-ng itself.

The behaviour of syslog-ng would be (typed tags):

1) if a message is tagged with a tag of type=="class", it'd become .classifier.class
2) patterndb could easily validate that each message gets an object/status/action tag

The behaviour of syslog-ng would be (hierarchy based tags):

1) there would be built-in bunches that must exist
2) based on the built-in bunches, syslog-ng could enforce the same rules as with typed tags

For some reason I rather like typed tags, even though it is somewhat more bureaucratic: users/pattern authors should be free to create their tags without limitation. Opinions?
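The typed-tag behaviour proposed above can be sketched outside syslog-ng, for illustration only. This is a minimal Python sketch, not syslog-ng code: the helper names load_tag_types and classify are hypothetical, and it assumes that the first tag of type "class" wins and that every message should carry one object, one action and one status tag.

```python
import xml.etree.ElementTree as ET

# The explicitly typed tagdb example from the message above.
TAGDB_XML = """
<tagdb>
  <bunch name="security">
    <tag type="object" name="flowevt"/>
    <tag type="action" name="secevt"/>
  </bunch>
  <bunch name="storage">
    <tag type="object" name="file"/>
    <tag type="object" name="database"/>
  </bunch>
  <tag type="class" name="violation"/>
  <tag type="class" name="security"/>
  <tag type="class" name="system"/>
  <tag type="class" name="unknown"/>
  <tag name="just-a-simple-tag-without-type"/>
</tagdb>
"""

# Assumption: these are the types every message must be tagged with.
REQUIRED_TYPES = {"object", "action", "status"}


def load_tag_types(xml_text):
    """Map each tag name to its type (None for untyped tags)."""
    root = ET.fromstring(xml_text)
    return {tag.get("name"): tag.get("type") for tag in root.iter("tag")}


def classify(message_tags, tag_types):
    """Return (class_name, missing_types) for one tagged message.

    class_name is the first tag of type "class" (hypothetical rule);
    missing_types lists the required types no tag on the message has.
    """
    class_name = None
    seen = set()
    for name in message_tags:
        tag_type = tag_types.get(name)
        if tag_type == "class":
            if class_name is None:
                class_name = name
        elif tag_type is not None:
            seen.add(tag_type)
    return class_name, sorted(REQUIRED_TYPES - seen)


tag_types = load_tag_types(TAGDB_XML)
result = classify(["violation", "flowevt", "secevt"], tag_types)
print(result)
```

Note that the example tagdb defines no tag of type "status", so with this data every message would be reported as missing a status tag.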
<tags> can be output via macro $TAG. ($TAG will contain all values in <tags>, right?)
It is $TAGS, and it already exists in 3.1.2. It expands to a comma-separated list of tags without further escaping (so tags should not contain spaces if your storage is a text file; otherwise the files become really difficult to process later).
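To illustrate why unescaped spaces in tags hurt later processing, here is a small Python sketch. The log line layout (host, program, then the expanded $TAGS value, separated by single spaces) is an assumption made up for this example, as is the helper name split_tags.

```python
def split_tags(tags_field):
    """Split an expanded $TAGS value on commas, dropping empty entries."""
    return [tag for tag in tags_field.split(",") if tag]


# Hypothetical line written by a template such as "$HOST $PROGRAM $TAGS".
line = "host1 sshd violation,flowevt,secevt"
host, program, tags_field = line.split(" ")
tags = split_tags(tags_field)

# A tag containing a space adds an extra field, so the same
# whitespace-based parsing no longer finds exactly three columns.
bad_line = "host1 sshd violation,flow evt"
bad_fields = bad_line.split(" ")

print(tags)
print(len(bad_fields))
```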
So for the macro-based file name, you would only use file("/var/log/messages.${.classifier.class}.log") and do your tag grepping normally, where .classifier.class would be the primary tag. I think this would work out better in the long run than trying to concatenate tags for the class, because keeping track of the order would be complicated, and it would definitely be better than sticking to logcheck's very limited range of class selections.
-- Bazsi