[RFC] value-pairs(), take #3

6 Feb 2011

      Hi!

Based on the feedback from this list, we've had a little discussion with
Bazsi on how to improve value-pairs(), and we came up with something
that is hopefully more consistent and easier to use than my last
proposal.

The Syntax
==========

We'd have two syntaxes: one for the configuration file itself (usable by
the drivers), and one for template functions (e.g., tfjson). They'll share
most properties; the difference will be in how they appear. See the
examples below:

config file:
------------

value-pairs(
  scope(nv_pairs core syslog all_macros selected_macros everything)
  exclude("R_*")
  exclude("S_*")
  key(".SDATA.meta.sequenceId")
  pair("MSGHDR" "$PROGRAM[$PID]: ")
)

template function:
------------------

$(format-json --scope nv_pairs,core,syslog,all_macros,selected_macros,everything \
  --exclude R_* --exclude S_* --key .SDATA.meta.sequenceId \
  --pair MSGHDR="$PROGRAM[$PID]: ")

Explanation
-----------

The above examples would start with a full set of name-value pairs (due
to having "everything" in the scope; we could start with selected_macros
instead [see below]). The scope can only be extended by subsequent calls
to scope(), but even then, the set will be built only once, at the
beginning. We'll likely end up throwing a syntax error during parsing
if more than one scope() statement is seen, or if it's not the first
statement within value-pairs().

However, explicitly specifying a key-value pair (either via key() or
pair()) will use the full set, regardless of which scope() was selected.
This might change if people find it too confusing, but changing it
would complicate the code quite a lot, and remove some of the
flexibility.

From this set, we exclude every pair where the key begins with "R_" or
"S_", then we explicitly include .SDATA.meta.sequenceId (though, in this
example, this is useless, as it's already included due to the scope, and
wasn't excluded). Finally, we add a custom key-value pair.
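
To make the order of operations concrete, here is a minimal Python
sketch of how I read the steps above. It is only a model, not syslog-ng
code: build_value_pairs() and its parameters are hypothetical names, the
patterns are assumed to be shell-style globs, and pair() values are
passed in already rendered rather than as templates.

```python
from fnmatch import fnmatchcase


def build_value_pairs(full_set, scope_keys, excludes, keys, pairs):
    """Model of the proposed evaluation order (hypothetical helper).

    full_set   -- dict of every name-value pair known for the message
    scope_keys -- keys contributed by scope()
    excludes   -- glob patterns from exclude()
    keys       -- names listed via key()
    pairs      -- dict of custom pair() entries, values already rendered
    """
    # 1. Build the starting set once, from the scope.
    result = {k: full_set[k] for k in scope_keys if k in full_set}

    # 2. Drop every pair whose key matches an exclude() pattern.
    result = {
        k: v
        for k, v in result.items()
        if not any(fnmatchcase(k, pattern) for pattern in excludes)
    }

    # 3. key() looks names up in the full set, regardless of the scope.
    for k in keys:
        if k in full_set:
            result[k] = full_set[k]

    # 4. pair() adds custom key-value pairs last.
    result.update(pairs)
    return result


message = {
    "HOST": "web1",
    "PROGRAM": "sshd",
    "PID": "42",
    "R_DATE": "Feb  6 12:00:00",
    "S_DATE": "Feb  6 11:59:59",
    ".SDATA.meta.sequenceId": "7",
}
result = build_value_pairs(
    message,
    scope_keys=list(message),
    excludes=["R_*", "S_*"],
    keys=[".SDATA.meta.sequenceId"],
    pairs={"MSGHDR": "sshd[42]: "},
)
```

Here result holds HOST, PROGRAM, PID, .SDATA.meta.sequenceId and MSGHDR;
the R_ and S_ keys are gone.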

Syntax Details
--------------

The starting name-value pair set will be defined by the scope()
statement, which can have the following values:

        * nv_pairs: The name-value pair database, including some
        frequently used builtins (currently: HOST, HOST_FROM, MESSAGE,
        PROGRAM, PID, MSGID, SOURCE and LEGACY_MSGHDR)
        * rfc3164, alias core, alias base: The basic pairs from RFC3164:
        $FACILITY, $SEVERITY (= $LEVEL), $DATE (= $S_DATE), $HOST,
        $PROGRAM, $PID, and $MSG.
        * rfc5424, alias syslog: The pairs from rfc3164 plus $SDATA and
        $MSGID.
        * all_macros: All macros known to syslog-ng (including all of
        the above, pretty much)
        * selected_macros: rfc5424 + $TAGS, $SOURCEIP, $SEQNUM
        * everything: all of the above, combined

Each key is added to the set only once, naturally.
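
As an illustration of how the scopes nest, here is a small Python
sketch. The key lists simply mirror the descriptions above; SCOPES and
resolve_scope() are hypothetical names, and nv_pairs, all_macros and
everything are left out because their contents depend on the running
syslog-ng.

```python
# Hypothetical model of scope(); not taken from the syslog-ng source.
RFC3164 = ["FACILITY", "SEVERITY", "DATE", "HOST", "PROGRAM", "PID", "MSG"]
RFC5424 = RFC3164 + ["SDATA", "MSGID"]
SELECTED_MACROS = RFC5424 + ["TAGS", "SOURCEIP", "SEQNUM"]

SCOPES = {
    "rfc3164": RFC3164,
    "core": RFC3164,   # alias
    "base": RFC3164,   # alias
    "rfc5424": RFC5424,
    "syslog": RFC5424,  # alias
    "selected_macros": SELECTED_MACROS,
}


def resolve_scope(names):
    """Return the union of the named scopes, each key listed only once."""
    seen = {}  # dicts keep insertion order, so this doubles as an ordered set
    for name in names:
        for key in SCOPES[name]:
            seen.setdefault(key, None)
    return list(seen)


keys = resolve_scope(["core", "syslog", "selected_macros"])
```

Listing overlapping scopes is harmless in this model: the result is the
same as asking for selected_macros alone.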

scope() was introduced as a replacement for builtins(), which was
unclear and inflexible. scope() does the job far better, and is - in my
opinion - a lot clearer too.

Apart from scope(), we have a few more statements:

        * select() / exclude(): We wanted to rename select() to
        include(), but syslog-ng already has an include() statement, and
        I ran into problems during the rename. It's undecided whether
        we'll remain with select() or adjust the parser to treat the two
        include statements differently (I'd opt for select()).

        The difference from the previous implementation's
        select()/exclude() is that in the new implementation, the first
        matching statement wins. This gets rid of the confusing priority
        stuff, and is still flexible enough (especially with the
        introduction of scope()) for all cases we could come up with.

        * key(): One can list macros with this statement. It does the
        same thing "$HOST" and friends did in the previous
        implementation, one just needs to use a statement this time, for
        clarity's sake.

        * pair(): Same thing as the previous implementation's ("key"
        "value") construct.
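
The first-match rule for select()/exclude() can be sketched in a few
lines of Python. This is a model under assumptions, not the planned
implementation: filter_keys() is a hypothetical name, patterns are
treated as shell-style globs, and a key that matches no statement is
assumed to stay in the set.

```python
from fnmatch import fnmatchcase


def filter_keys(keys, rules):
    """Apply ordered ("select" | "exclude", glob) rules; first match decides."""
    kept = []
    for key in keys:
        verdict = "select"  # assumption: unmatched keys stay in the set
        for action, pattern in rules:
            if fnmatchcase(key, pattern):
                verdict = action
                break  # later statements are never consulted for this key
        if verdict == "select":
            kept.append(key)
    return kept


rules = [
    ("select", "usracct.*"),
    ("select", ".SDATA.*"),
    ("exclude", "*"),
]
kept = filter_keys(
    ["usracct.username", ".SDATA.meta.sequenceId", "HOST", "R_DATE"],
    rules,
)
```

Only usracct.username and .SDATA.meta.sequenceId survive here: they hit
a select() before the catch-all exclude("*") is reached.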

Current shortcomings:

        * List separation: at the moment, list values need to be space
        separated, and the key-value pairs (see pair()) need a space
        separator too.

        In the long run, we'd like to allow commas as separators too.
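
For what it's worth, accepting both separators is cheap, at least for
unquoted list values. A rough Python sketch, where split_list() is a
hypothetical helper and quoted strings containing spaces or commas are
deliberately not handled:

```python
import re


def split_list(text):
    """Split a list value on commas and/or whitespace, dropping empty items."""
    return [item for item in re.split(r"[,\s]+", text.strip()) if item]


space_separated = split_list("nv_pairs core syslog")
comma_separated = split_list("nv_pairs,core, syslog")
```

Both calls yield the same three scope names.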

Another example
---------------

value-pairs(
  scope(selected_macros nv_pairs)
  select(".*")
  select("usracct.*")
  select("secevt.*")
  select(".SDATA.*")
  exclude("*")
  key("SEVERITY") key("HOST") key("PROGRAM") key("PID")
  key("MSG") key("TAGS")
  pair("timestamp" "$UNIXTIME")
);

This will start with a base set of selected_macros and nv_pairs, select
a few specified patterns, and exclude everything else. Then it will
explicitly add a few keys (which do not need to be part of the
original set!), and a custom key-value pair.

I hope this was understandable, and better than the previous proposal.

As soon as I start working on implementing this proposal, the code will
be available from the work/value-pairs/base branch of my git tree:

git://git.madhouse-project.org/syslog-ng/syslog-ng-3.3.git

(or browsable on the web at:
http://git.madhouse-project.org/syslog-ng/syslog-ng-3.3/log/?h=work/value-pa...)

And as always, your feedback is most appreciated! Nothing is set in
stone yet, and I'd love to hear your opinion.

-- 
|8]

Gergely Nagy
