The next few patches gradually introduce two (in my opinion) interesting
features for syslog-ng 3.4: support for one kind of multiline messages,
where continuation lines start with whitespace; and support for the
/dev/kmsg format introduced in Linux 3.5.

Neither implementation is complete yet, and there are subtle issues with
both (see Known issues below), but they're at a stage where taking them
for a test drive would be useful, and code review wouldn't hurt either.

The patches are also available on the feature/3.4/indented-multiline[1]
branch of my git repository[2]. The branch may get rebased in the
future; I make no promise of keeping it fast-forwardable.

 [1]: https://github.com/algernon/syslog-ng/commits/feature/3.4/indented-multiline
 [2]: git://github.com/algernon/syslog-ng.git

Linux 3.5 /dev/kmsg
===================

The Linux 3.5+ /dev/kmsg support is the easier one to test: you only
need a 3.5+ kernel, and when using the system() source, things will
magically work! This includes parsing the relative timestamps in kmsg
messages and also parsing any additional key-value pairs.
This means that if we had an input line like this:

,----
| 6,802,65338577;ATL1E 0000:02:00.0: eth0: NIC Link is Up <100 Mbps Full Duplex>
|  SUBSYSTEM=pci
|  DEVICE=+pci:0000:02:00.0
`----

Then we'll get back something like this:

,----
| {
|    "FILE_NAME" : "/dev/kmsg",
|    "MSGID" : "802",
|    "HOST" : "luthien",
|    "DATE" : "Oct 13 16:53:44",
|    "FACILITY" : "kern",
|    "SOURCEIP" : "127.0.0.1",
|    "TAGS" : ".source.s_kmsg",
|    "kernel" : {
|       "SUBSYSTEM" : "pci",
|       "DEVICE" : {
|          "name" : "0000:02:00.0",
|          "type" : "pci"
|       }
|    },
|    "PRIORITY" : "info",
|    "HOST_FROM" : "luthien",
|    "MESSAGE" : "ATL1E 0000:02:00.0: eth0: NIC Link is Up <100 Mbps Full Duplex>",
|    "PROGRAM" : "kernel"
| }
`----

For the above, the following template was used:

  $(format-json --scope selected-macros --scope nv-pairs --key .kernel.* --shift 1)

(Mind you, program_override() is currently broken in 3.4, but that is
unrelated to the /dev/kmsg support.)

Any name-value pairs the kernel supplies will be put into variables of
the same name, prefixed with ".kernel." (so SUBSYSTEM becomes
.kernel.SUBSYSTEM), and the value of DEVICE= will be further parsed,
based on the rules written down in the appropriate kernel docs.

Indented multiline
==================

The indented multiline support enhances the file() and tcp() sources
with the ability to read multiline records (if the indented-multiline
flag is set), where continuation lines start with whitespace. Such
output is produced by the $(indent-multi-line) template function.

Usage:

  source s_ml {
    tcp(port(12345) flags(indented-multiline, no-parse));
  };

Then, you can test it like this:

  (cat <<EOF
  This is the first line
   - second
   - third
  EOF
  ) | nc localhost 12345

Using a JSON output, this would result in:

  {"MESSAGE":"This is the first line\n - second\n - third"}

A record can be terminated in two ways: by another line that does not
begin with whitespace, or by reaching EOF.
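
To make the grouping rule concrete, here is a minimal Python sketch of
it. This is only an illustration and not the syslog-ng implementation
(which is written in C); the function name group_indented is made up
for this example.

```python
def group_indented(lines):
    """Group lines into records: a line that starts with whitespace
    continues the previous record, any other line starts a new one."""
    current = None
    for line in lines:
        line = line.rstrip("\n")
        if current is not None and line[:1] in (" ", "\t"):
            # Continuation line: append it to the record being built.
            current += "\n" + line
        else:
            # A non-indented line terminates the previous record, if any.
            if current is not None:
                yield current
            current = line
    # End of input (EOF) terminates the last record.
    if current is not None:
        yield current


if __name__ == "__main__":
    sample = ["This is the first line\n", " - second\n", " - third\n"]
    for record in group_indented(sample):
        print(repr(record))
```

Note that in this sketch, just as described above, the last record is
only handed over once either the input ends or another non-indented
line shows up.
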
This means that if netcat didn't close the connection, we wouldn't get
the test record until another line arrives. Unfortunately, there's not
much I can do about this limitation, but if anyone has a good idea how
to work around or fix it, please let me know!

Known issues
============

The /dev/kmsg handling is not reload/restart-safe: for example, if
syslog-ng gets reloaded or restarted, it will read the whole of
/dev/kmsg all over again. There are possibly other subtle issues too,
but all of those will be ironed out before I submit the patches for
merging.

The indented-multiline support has a more serious limitation: if data
needs to be split (which happens in some circumstances when the
incoming data is bigger than the internal message size, 8192 by
default), it is not correctly reassembled, making indented-multiline
less than useful for some tasks.
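
For anyone reviewing the /dev/kmsg part, the record format described in
the kmsg section can be illustrated with a small Python sketch. It is
not the code from the patches: the function names and the
timestamp_usec key are made up for this example, the timestamp is
assumed to be in microseconds, and only the "+subsystem:name" form of
DEVICE seen in the sample is handled.

```python
# Syslog severity names, indexed by the low three bits of the priority.
SEVERITIES = ["emerg", "alert", "crit", "err",
              "warning", "notice", "info", "debug"]


def parse_device(value):
    """Split a '+subsystem:name' DEVICE value into its type and name."""
    if value.startswith("+") and ":" in value:
        dev_type, name = value[1:].split(":", 1)
        return {"type": dev_type, "name": name}
    # Other DEVICE forms are left as-is in this sketch.
    return value


def parse_kmsg_record(record):
    """Parse one record: 'prio,seq,timestamp;message' followed by
    optional continuation lines of the form ' KEY=value'."""
    lines = record.split("\n")
    header, _, message = lines[0].partition(";")
    fields = header.split(",")
    priority, seq, usec = int(fields[0]), int(fields[1]), int(fields[2])
    result = {
        "PRIORITY": SEVERITIES[priority & 7],
        "MSGID": str(seq),
        "timestamp_usec": usec,  # assumed: microseconds, relative
        "MESSAGE": message,
        "kernel": {},
    }
    for line in lines[1:]:
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # not a KEY=value pair, skip it
        result["kernel"][key] = parse_device(value) if key == "DEVICE" else value
    return result


if __name__ == "__main__":
    sample = ("6,802,65338577;ATL1E 0000:02:00.0: eth0: NIC Link is Up "
              "<100 Mbps Full Duplex>\n SUBSYSTEM=pci\n DEVICE=+pci:0000:02:00.0")
    print(parse_kmsg_record(sample))
```

The nested "kernel" dictionary mirrors the .kernel. prefix used for the
name-value pairs in the JSON output shown earlier.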