[syslog-ng] [RFC (3.4) 0/5]: multiline & linux 3.5+ /dev/kmsg support

Gergely Nagy algernon at balabit.hu
Sat Oct 13 17:20:16 CEST 2012

The next few patches gradually introduce two - in my opinion -
interesting features for syslog-ng 3.4: support for one kind of
multiline messages, where continuation lines start with whitespace;
and support for the /dev/kmsg format introduced in linux 3.5.

Neither implementation is complete yet: there are subtle issues with
both (see "Known issues" below), but they're at a stage where taking
them for a test drive would be useful, and code review wouldn't hurt
either.

The patches are also available on the
feature/3.4/indented-multiline[1] branch of my git repository[2]. The
branch may get rebased in the future; I make no promise of keeping it
fast-forwardable.

 [1]: https://github.com/algernon/syslog-ng/commits/feature/3.4/indented-multiline
 [2]: git://github.com/algernon/syslog-ng.git

linux 3.5 /dev/kmsg

The linux 3.5+ /dev/kmsg support is the easier of the two to test: you
only need a 3.5+ kernel, and when using the system() source, things will
magically work! This includes parsing the relative timestamps in kmsg
messages and also parsing any additional key-value pairs.

This means that if we had an input line like this:

| 6,802,65338577;ATL1E 0000:02:00.0: eth0: NIC Link is Up <100 Mbps Full Duplex>
|  SUBSYSTEM=pci
|  DEVICE=+pci:0000:02:00.0

Then we'll get back something like this:

| {
|    "FILE_NAME" : "/dev/kmsg",
|    "MSGID" : "802",
|    "HOST" : "luthien",
|    "DATE" : "Oct 13 16:53:44",
|    "FACILITY" : "kern",
|    "SOURCEIP" : "",
|    "TAGS" : ".source.s_kmsg",
|    "kernel" : {
|       "SUBSYSTEM" : "pci",
|       "DEVICE" : {
|          "name" : "0000:02:00.0",
|          "type" : "pci"
|       }
|    },
|    "PRIORITY" : "info",
|    "HOST_FROM" : "luthien",
|    "MESSAGE" : "ATL1E 0000:02:00.0: eth0: NIC Link is Up <100 Mbps Full Duplex>",
|    "PROGRAM" : "kernel"
| }

For the above, the following template was used:

 $(format-json --scope selected-macros --scope nv-pairs --key .kernel.* --shift 1)

(Mind you, program_override() is currently broken in 3.4, but that is
unrelated to the /dev/kmsg support.)
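
To make the effect of the key pattern and --shift 1 a bit more
tangible, here is a small Python sketch. It is not syslog-ng code:
nest_pairs() and its parameters are made up for illustration. It
assumes that the shift cuts the given number of characters from the
front of names matching the pattern, and that the remaining dots in a
name become nested JSON objects, which is what the output above
suggests.

```python
import json
from fnmatch import fnmatchcase

def nest_pairs(pairs, pattern="*", shift=0):
    """Turn flat dotted names into nested dicts (hypothetical helper).

    Names matching `pattern` lose their first `shift` characters first,
    so ".kernel.SUBSYSTEM" with shift=1 becomes "kernel.SUBSYSTEM".
    """
    root = {}
    for name, value in pairs.items():
        if fnmatchcase(name, pattern):
            name = name[shift:]
        parts = name.split(".")
        node = root
        # Walk (and create) the intermediate objects for each dotted part.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return root

flat = {
    "HOST": "luthien",
    ".kernel.SUBSYSTEM": "pci",
    ".kernel.DEVICE.type": "pci",
    ".kernel.DEVICE.name": "0000:02:00.0",
}
print(json.dumps(nest_pairs(flat, pattern=".kernel.*", shift=1), indent=3))
```

Without the shift, the leading dot would leave an empty first name
component, so the kernel values would end up under an empty key
instead of under "kernel".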

Any name-value pairs the kernel supplies will be put into variables of
the same name, prefixed with ".kernel." (so SUBSYSTEM becomes
.kernel.SUBSYSTEM), and the value of DEVICE= will be further parsed,
following the rules written down in the kernel's /dev/kmsg
documentation (Documentation/ABI/testing/dev-kmsg).
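
The parsing described above can be sketched in a few lines of Python.
This is only an illustration, not the code from the patches:
parse_kmsg_record() and its boot_time parameter are hypothetical names,
the result keys other than the ".kernel." ones are chosen for the
example, and only the "+subsystem:name" form of DEVICE= is split up
(other forms are kept as-is). It assumes the usual syslog priority
encoding (facility * 8 + level) and a timestamp in microseconds since
boot.

```python
def parse_kmsg_record(record, boot_time=None):
    """Parse one /dev/kmsg record (header line plus indented key=value lines)."""
    lines = record.split("\n")
    header, _, message = lines[0].partition(";")
    fields = header.split(",")
    prio = int(fields[0])
    usec = int(fields[2])
    parsed = {
        "facility": prio >> 3,    # 0 is kern
        "level": prio & 7,        # 6 is info
        "MSGID": fields[1],       # kernel sequence number
        "usec_since_boot": usec,  # relative timestamp
        "MESSAGE": message,
    }
    if boot_time is not None:
        # Relative to absolute: boot time (seconds since the epoch) plus offset.
        parsed["unix_time"] = boot_time + usec / 1_000_000
    for line in lines[1:]:
        if not line.startswith(" "):
            continue  # only indented lines carry key=value pairs
        key, sep, value = line[1:].partition("=")
        if not sep:
            continue
        if key == "DEVICE" and value.startswith("+"):
            # "+pci:0000:02:00.0" -> type "pci", name "0000:02:00.0"
            subsystem, _, name = value[1:].partition(":")
            parsed[".kernel.DEVICE.type"] = subsystem
            parsed[".kernel.DEVICE.name"] = name
        else:
            parsed[".kernel." + key] = value
    return parsed

record = (
    "6,802,65338577;ATL1E 0000:02:00.0: eth0: NIC Link is Up <100 Mbps Full Duplex>\n"
    " SUBSYSTEM=pci\n"
    " DEVICE=+pci:0000:02:00.0"
)
for name, value in parse_kmsg_record(record).items():
    print(name, "=", value)
```

Note that the device name itself contains colons, so the sketch splits
the DEVICE= value only at the first one.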

Indented multiline

The indented multiline support enhances the file() and tcp() sources
with the ability to read multiline records (if the indented-multiline
flag is set), where continuation lines start with whitespace. Such
output is produced by the $(indent-multi-line) template function.

For example, given a source like this:

 source s_ml { tcp(port(12345) flags(indented-multiline, no-parse)); };

Then, you can test it like this:

(cat <<EOF
This is the first line
 - second
 - third
EOF
) | nc localhost 12345

Using a JSON output, this would result in:

{"MESSAGE":"This is the first line\n - second\n - third"}

A record can be terminated in two ways: by another line that does not
begin with whitespace, or by reaching EOF.

This means that if netcat didn't close the connection, we wouldn't get
the test record until another line arrives.
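
The termination rule can be illustrated with a short Python generator.
It is a sketch of the logic only, not how the patches implement it, and
assemble_records() is a made-up name: a line starting with a space or
tab is appended to the current record, anything else closes the current
record and starts a new one, and the end of input flushes whatever is
pending.

```python
def assemble_records(lines):
    """Group lines into records; continuation lines start with whitespace."""
    current = None
    for line in lines:
        line = line.rstrip("\n")
        if current is not None and line[:1] in (" ", "\t"):
            current.append(line)  # continuation of the pending record
        else:
            if current is not None:
                yield "\n".join(current)  # a new record closes the old one
            current = [line]
    if current is not None:
        yield "\n".join(current)  # EOF closes the last record

stream = [
    "This is the first line\n",
    " - second\n",
    " - third\n",
    "Another record\n",
]
for record in assemble_records(stream):
    print(repr(record))
```

The last record is only yielded once the loop over the input ends,
which mirrors the behaviour described above: nothing comes out for it
until either a non-indented line or EOF is seen.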

Unfortunately, there's not much I can do about this limitation, but if
anyone has a good idea how to work around or fix it, please let me
know.

Known issues

The /dev/kmsg handling is not reload/restart-safe: for example, if
syslog-ng gets reloaded or restarted, it will read the whole of
/dev/kmsg over again. There are possibly other subtle issues too, but
all of those will be ironed out before I submit the patches.

The indented-multiline support has a more serious limitation: if data
needs to be split (which happens in some circumstances when the
incoming data is bigger than the internal message size (8192 by
default)), it is not correctly reassembled, making indented-multiline
less than useful for some tasks.
