[syslog-ng] [Bazsi's blog] syslog-ng name-value pair naming
Bazsi
bazsi77 at gmail.com
Fri Aug 6 21:03:48 CEST 2010
I have been giving a lot of thought recently to the topic of naming
name-value pairs in syslog-ng. Until now the only documented rule
states, somewhat vaguely, that whenever you use a parser you should
choose a name that has at least one dot in it, and this dot must not be
the initial character. This means that names like MSG
or .SDATA.meta.sequenceId are reserved for syslog-ng, and names like
APACHE.CLIENT_IP are reserved for users.
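To make the rule concrete, here is a minimal sketch in Python. It is not syslog-ng code; the function name `namespace` and the labels "user" and "syslog-ng" are made up for illustration, and it only encodes the rule as worded above.

```python
def namespace(name: str) -> str:
    """Classify a name-value pair name under the documented rule.

    Hypothetical helper for illustration only, not part of syslog-ng.
    """
    # User names contain at least one dot, and must not start with one.
    if "." in name and not name.startswith("."):
        return "user"
    # Everything else (no dot at all, or a leading dot) is reserved.
    return "syslog-ng"


if __name__ == "__main__":
    for n in ("MSG", ".SDATA.meta.sequenceId", "APACHE.CLIENT_IP"):
        print(n, "->", namespace(n))
```

Under this reading, both a dotless name and a name with a leading dot end up in the reserved namespace.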
However things became more complex with syslog-ng OSE 3.2. Let's see
what sources generate name-value pairs:
- traditional macros (e.g. $DATE); these are not name-value pairs
per-se, but behave much like them, except that they are read-only
- syslog message fields (e.g. $MSG) if the message is coming from a
syslog source
- filters whenever the 'store-matches' flag is set and the regexp
contains groups
- rewrite rules, whenever the rewrite rule specifies a thus far unknown
name-value pair, e.g. set("something" value("name-value.pair"));
- and of course parsers when you tell syslog-ng to parse an input as a
CSV, or use db-parser together with the patterns produced by the
patterndb project
The latest stuff generating name-value pairs is the
support for process accounting logs, in this case even the syslog
related fields are missing and only things like "pacct.ac_comm" (to
contain the program name) are defined.
So I was thinking whether it should be "pacct.ac_comm"
or ".pacct.ac_comm". With the quoted rule it should be simple: it is
generated by syslog-ng itself, thus it should be in the syslog-ng
namespace and should start with a dot. However in the era of syslog-ng
plugins, what constitutes syslog-ng at all?
First, I wanted to use "pacct.ac_comm" (i.e. without a leading dot),
because I liked this name better. I was trying to explain to myself why
it would not violate the rule above. The explanation I had for myself
was: I'm going to "register" names such as this in the patterndb
SCHEMAS.txt file. With this - not yet published - explanation, I
committed a patch to convert the pacctformat plugin to use a dotless
prefix.
Next, I figured that while it is true that process accounting creates
name-value pairs without going through patternization, nothing ensures
that these name-value pairs would be directly usable when trying to
analyse the logs. The patterndb concept uses tags and schemas to
convert the incoming unstructured data into a consistent structure.
However, pacct may not completely match what the user needs. And, in
the future, when SNMP traps or SQL table polling are going to be
supported, it is going to be even more true: these name-value pairs
may need a conversion from the SNMP/pacct structure to the structure
described by the patterndb schema, in order to handle these message
sources consistently with regular syslog (and to make it easy to
correlate these).
So in the end, I committed another patch, this time going back
to ".pacct" as a prefix and leaving the original naming rule intact.
The "pacct" prefix is up to the users to use: they may want the same
information in a "pacct" schema, but that may come from data not
directly tied to process accounting (e.g. from syslog messages).
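The split between the two prefixes can be illustrated with a small Python sketch. It is not how syslog-ng does it internally: the helper `to_user_schema` and the target name "pacct.program" are hypothetical, chosen only to show values generated under ".pacct" being copied into a user-chosen "pacct" schema.

```python
from typing import Dict

# Hypothetical mapping from syslog-ng generated names to user schema names.
# ".pacct.ac_comm" comes from the post; "pacct.program" is an invented target.
PACCT_MAPPING: Dict[str, str] = {
    ".pacct.ac_comm": "pacct.program",
}


def to_user_schema(pairs: Dict[str, str],
                   mapping: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of pairs with mapped values added under user names."""
    out = dict(pairs)  # keep the original syslog-ng generated pairs intact
    for src, dst in mapping.items():
        if src in pairs:
            out[dst] = pairs[src]
    return out


if __name__ == "__main__":
    record = {".pacct.ac_comm": "sshd"}
    print(to_user_schema(record, PACCT_MAPPING))
```

The same user-side name could just as well be filled from a parsed syslog message, which is the reason for keeping the dotless prefix free for users.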
So this post is about doing nothing with regard to the naming policy,
but I thought it'd be important to shed some light on what happens
behind the scenes. Giving such decisions enough thought and coming up
with a long-term plan makes our lives much easier in the future.
This post may be a bit more involved than the others, but feel free to
ask me to elaborate, if you are interested.
--
Posted By Bazsi to Bazsi's blog at 8/06/2010 08:26:00 PM