[syslog-ng] Feedback for GSoC project - Riak Destination for Syslog-ng
    Gergely Nagy 
    algernon at madhouse-project.org
       
    Wed May  6 14:59:23 CEST 2015
    
    
  
Hi!
I'm the mentor for the Riak destination for syslog-ng project, please
allow me to answer the questions below:
>>>>> "Fred" == Fred Dushin <fdushin at basho.com> writes:
    Fred> As far as I understand, you're talking about a mapping from keys to
    Fred> sets, but I'm unclear on a few things.
The idea is to map a set of log messages to a Riak Set, where both the
key used for the set and its contents are configurable by the
user. There are no plans for a default at this time.
There are many ways to configure a syslog-ng=>Riak setup with a
destination like the one planned. One is to turn each log message (after
parsing) into a Riak Map, and push those maps into a Riak Set. Another way
is to format the parsed log messages (with all the extracted fields, if
any) as JSON, and push those into a set.
So, for example, given the following syslog line:
May  6 14:42:18 eowyn avahi-daemon[27812]: Invalid response packet from host fe80::5d0f:d53a:7b6:3680.
We'd end up with a JSON document like this:
{"timestamp": "2015-05-06T14:42:18+02:00",
 "host": "eowyn",
 "program": "avahi-daemon",
 "pid": 27812,
 "message": "Invalid response packet from host fe80::5d0f:d53a:7b6:3680.",
 "avahi-daemon": {
   "type": "warning",
   "message": "Invalid response packet",
   "host": "fe80::5d0f:d53a:7b6:3680"
 }
}
We could either add that to a Riak set as-is, or turn it into a Riak map
first.
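To make the JSON variant a bit more concrete, here is a rough sketch in
Python (used only for illustration, this is not syslog-ng or Riak code).
The names LINE_RE and to_json_doc are made up for this example, and the
year and UTC offset are passed in by the caller because the sample line
carries neither. The nested "avahi-daemon" part of the document above
would come from further parsing and is left out here.

```python
import json
import re
from datetime import datetime, timedelta, timezone

# Hypothetical pattern for the classic "Mon D HH:MM:SS host prog[pid]: msg" layout.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"(?P<program>[^\[:]+)\[(?P<pid>\d+)\]: (?P<message>.*)$"
)


def to_json_doc(line, year, tz):
    """Turn one syslog line into a JSON string with the basic fields."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError("not a BSD-style syslog line")
    # Collapse the padding in "May  6" before handing it to strptime.
    ts = " ".join(m.group("ts").split())
    stamp = datetime.strptime(f"{year} {ts}", "%Y %b %d %H:%M:%S")
    stamp = stamp.replace(tzinfo=tz)
    return json.dumps({
        "timestamp": stamp.isoformat(),
        "host": m.group("host"),
        "program": m.group("program"),
        "pid": int(m.group("pid")),  # explicit string -> integer conversion
        "message": m.group("message"),
    })


line = ("May  6 14:42:18 eowyn avahi-daemon[27812]: "
        "Invalid response packet from host fe80::5d0f:d53a:7b6:3680.")
doc = to_json_doc(line, 2015, timezone(timedelta(hours=2)))
```

The resulting string is what would be added to the set as-is in the
first variant.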
    Fred> What are the keys you are thinking about? Time stamps? If
    Fred> timestamps, these are presumably the timestamps of the syslog
    Fred> event?
Whatever the user configures. They may be time stamps (rounded, for
predictable keys), or a combination of program name + current date (day
granularity).
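As a sketch of what "rounded, for predictable keys" could mean, the
following Python snippet floors a timestamp to the start of a fixed-size
bucket. The function name rounded_key and the choice of epoch seconds as
the key format are assumptions made for the example only; the real key
would come from whatever template the user configures.

```python
from datetime import datetime, timezone


def rounded_key(stamp, bucket_seconds):
    """Floor a timezone-aware timestamp to the start of its bucket."""
    epoch = int(stamp.timestamp())
    return str(epoch - epoch % bucket_seconds)


# Two messages from the same minute share a key, a later one does not.
a = datetime(2015, 5, 6, 12, 42, 18, tzinfo=timezone.utc)
b = datetime(2015, 5, 6, 12, 42, 59, tzinfo=timezone.utc)
c = datetime(2015, 5, 6, 12, 43, 0, tzinfo=timezone.utc)
same_bucket = rounded_key(a, 60) == rounded_key(b, 60)
next_bucket = rounded_key(a, 60) != rounded_key(c, 60)
```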
    Fred> Just a word of warning, if so. You might find a lot of
    Fred> variation in timestamp formats and granularity. Perhaps you
    Fred> can get something reliable out of syslog-ng,
We get something sensible out of syslog-ng. But in the end, it is up to
the user to configure the template used for keys. There may - and
probably will - be examples, but no default.
    Fred> but that won't help you in the case where syslog-ng is
    Fred> functioning as a syslog relay, and you want to preserve the
    Fred> timestamp of the originator, which you should, if you want to
    Fred> preserve integrity of the logs (e.g, for compliance).
In the case of syslog-ng, we actually have access to a few kinds of
timestamps: the timestamp from the log message (if any), the timestamp
of receipt, and the current time. The granularity of timestamps is
configurable to some extent.
    Fred> Or are you talking about a key being a (course grained)
    Fred> timestamp, say, an integral value in UTC seconds, for example?
    Fred> And the value(s) being all logs in that interval? Is that your
    Fred> motivation for sets?
That's one way, yes. One could also use something like
$PROGRAM/$YEAR-$MONTH-$DAY as the key, if the program doesn't produce more
than a megabyte of logs a day. So with the example above, the key for
that log message would be avahi-daemon/2015-05-06, and the message would
be an element of the set stored under that key.
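The key scheme can be sketched like this in Python. This is only an
illustration: string.Template happens to use the same $NAME syntax, the
in-memory dictionary of sets merely stands in for Riak Sets, and the
names render_key, store and sets_by_key are invented for the example.

```python
from collections import defaultdict
from string import Template

# Same shape as the key template discussed above.
KEY_TEMPLATE = Template("$PROGRAM/$YEAR-$MONTH-$DAY")

# Stand-in for Riak Sets: one Python set per key. Not a Riak client.
sets_by_key = defaultdict(set)


def render_key(fields):
    """Fill the key template from the fields of one log message."""
    return KEY_TEMPLATE.substitute(fields)


def store(fields, element):
    """Add one formatted message to the set that its key selects."""
    key = render_key(fields)
    sets_by_key[key].add(element)
    return key


fields = {"PROGRAM": "avahi-daemon", "YEAR": "2015", "MONTH": "05", "DAY": "06"}
key = store(fields, "Invalid response packet from host fe80::5d0f:d53a:7b6:3680.")
# key == "avahi-daemon/2015-05-06"
```

Every further avahi-daemon message from the same day would land in the
same set.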
    Fred> How much of the syslog payload are you planning to parse?
The destination itself is not going to do any parsing. Other parts of
syslog-ng do that, and it is up to the user to set up a pipeline that
feeds the destination. The source may be syslog, HTTP logs, the Journal,
or any of the other sources syslog-ng supports. How much parsing is
done, and what gets extracted, is of no concern to the destination plugin.
    Fred> Another interesting problem is that the STRUCTURED-DATA element of
    Fred> 5424 uses OIDs to discriminate different data types that are encoded
    Fred> in the header. And while there is a kind of loosely coupled authority
    Fred> for OIDs, there is no infrastructure for determining a parsing
    Fred> strategy for these fields. They could really be anything, in the worst
    Fred> case.
As far as I remember, syslog-ng treats all STRUCTURED-DATA elements as
strings. There are tools within syslog-ng for converting them to
other data types, but that must be done explicitly.
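The general shape of "strings by default, converted only on request"
looks roughly like this in Python. It is not how syslog-ng implements
it; the function name convert and the field names are placeholders for
the example.

```python
def convert(fields, converters):
    """Return a copy of fields with only the listed names converted."""
    out = dict(fields)
    for name, conv in converters.items():
        if name in out:
            out[name] = conv(out[name])
    return out


# Everything arrives as a string; only "sequence" is converted on request.
raw = {"sequence": "42", "origin": "eowyn"}
typed = convert(raw, {"sequence": int})
# typed == {"sequence": 42, "origin": "eowyn"}
```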
    Fred> But regardless of the deeply structured data, you could get some very
    Fred> interesting traction by just taking standard headers and indexing them
    Fred> through Yokozuna. Certainly, indexing the body of a syslog message is
    Fred> a great idea, as these messages are generally unstructured and fodder
    Fred> for lucene. This is something that Logstash/ElasticSearch can do
    Fred> pretty effectively today, and it would be cool to see the same in Riak
    Fred> + some syslog provider.
Yep! When I proposed the idea, using Yokozuna was something I had in
mind: combine the parsing abilities of syslog-ng, Riak for archival
purposes, and Yokozuna for searching. That sounds like a match made in heaven.
    Fred> Finally, it would be really nice if you could structure your plugin in
    Fred> such a way that they could eventually be ported to rsyslog [2]. The
    Fred> rsyslogd daemon is deployed by default on certain Linux favors and
    Fred> enjoys fairly widespread distribution. You might be able to get it
    Fred> supported in that community, as well.
Part of the project is writing a small library to send data to Riak
from C, with just enough functionality for syslog-ng's needs. That
library could be used by rsyslog, too (just as the MongoDB library
originally written for syslog-ng's purposes is used by rsyslog). But
sharing more code than that is not practical, because the two daemons
work in widely different ways.
-- 
|8]