[syslog-ng] Bazsi's blog: syslog-ng flexibility improvements

Mon Jan 16 18:04:04 CET 2012

Patrick Hemmer wrote:
>   Sent: Mon Jan 16 2012 08:47:26 GMT-0500 (EST)
> From: Balazs Scheidler <bazsi at balabit.hu>
> To: syslog-ng at lists.balabit.hu
> Subject: [syslog-ng] Bazsi's blog: syslog-ng flexibility improvements
>>
>>
>>   *syslog-ng flexibility improvements
>>   <http://bazsi.blogs.balabit.com/2012/01/syslog-ng-flexibility-improvements/>*
>>
>>
>> syslog-ng is often referred to as a very flexible application when it 
>> comes to processing logs. Over the years however, I began to feel that 
>> some things are a bit more difficult to achieve in the configuration 
>> language than they should be. For instance it is sometimes too rigid 
>> when you need a combination of parsers (patterndb with db-parser) and 
>> rewrite rules to achieve the goal you wanted. Parsers and rewrite 
>> rules are distinct parts of the configuration, and it is not possible 
>> to combine them into a single functionality. Also, declaring objects 
>> first and then referencing them later makes the configuration easy to 
>> read, but it is sometimes quite cumbersome, for example when you only 
>> need to invert the result of an already existing filter.
>>
>> To solve this situation, I’ve set out to implement an idea I have had 
>> in mind for some time now. It is quite difficult to describe the 
>> feature in clear and concise words, as it is a combination of various 
>> changes that together make syslog-ng configuration more flexible and 
>> easier to use, without sacrificing readability. Curious? Please read on.
>>
>> *In-line objects*
>>
>> Perhaps the simplest of all features is that you can now define the 
>> contents of a given object right on the spot, without having to use a 
>> separate statement. For example, earlier you had to write:
>>
>> log {
>>   source(s_local);
>>   filter(f_postfix);
>>   destination(d_postfix);
>> };
>>
>> Sometimes the f_postfix filter is only used once and is trivial. This 
>> can now be written as:
>>
>> log {
>>  source(s_local);
>>  filter { program("^postfix/"); };
>>  destination(d_postfix);
>> };
>>
>> Furthermore, both the source() and destination() options can be 
>> written in-line: you simply use braces instead of parentheses. The 
>> same functionality applies to everything: sources, destinations, 
>> filters, parsers and rewrite rules.
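>>
>> As a rough sketch of both points (the file paths and the d_other 
>> destination are made-up names for illustration, not part of any real 
>> setup), a fully in-line log statement, and one that inverts the 
>> existing f_postfix filter on the spot, might look like this:
>>
>> # everything defined in-line; the paths are illustrative assumptions
>> log {
>>   source { file("/var/log/mail.log"); };
>>   filter { program("^postfix/"); };
>>   destination { file("/var/log/postfix.log"); };
>> };
>>
>> # invert an already declared filter without declaring a new one;
>> # d_other is a hypothetical destination
>> log {
>>   source(s_local);
>>   filter { not filter(f_postfix); };
>>   destination(d_other);
>> };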
>>
>> *Junctions*
>>
>> A limited form of junctions has been supported since syslog-ng 3.0 in 
>> the form of “embedded log statements”, which has been generalized now. 
>> Within syslog-ng, when a message is received it is dispatched to a log 
>> processing path or pipeline, which carries out the task at hand. A 
>> junction is a point in the log processing path where the processing is 
>> performed on multiple independent branches, each doing its own 
>> specific thing with the message.
>>
>> The limited functionality in 3.0 only allowed the processing tree to 
>> split (or fork) into independent branches; each of the branches was a 
>> “sink”, where processing also ended. Configuration example:
>>
>> log {
>>   source(s_all); filter(f);
>>   log { filter(f1); destination(d1); };
>>   log { filter(f2); destination(d2); };
>> };
>>
>> This sample forks the processing path into two branches starting with 
>> the “log” keyword within the top-level log statement. The first branch 
>> evaluates the filter f1 and then writes matching messages to the d1 
>> destination, effectively sending all messages that match (f AND f1) to 
>> d1. Likewise, d2 receives all messages that match (f AND f2).
>>
>> The limitation of the embedded log statement concept was simple: it 
>> could only be listed at the very end of a log statement, and the 
>> end-result of the branches couldn’t be processed further. Effectively 
>> the message at the end of each branch “fell off”. Junctions, on the 
>> other hand, make it possible to do things to messages once the 
>> branches converge to the same point again. Repeating the sample above, 
>> it is now possible to write:
>>
>> log {
>>   source(s_all); filter(f);
>>   junction {
>>     log { filter(f1); destination(d1); };
>>     log { filter(f2); destination(d2); };
>>   };
>>   destination(d_all);
>> };
>>
>> The new thing is that you can now add processing *after* the branches 
>> finish their processing. A bit more useful example would be:
>>
>> log {
>>   source(s_apache_files);
>>   source(s_syslog);
>>   junction {
>>     log { filter(f_apache_files); rewrite(r_apache_remove_file_header); parser(p_apache); flags(final); };
>>     log { filter(f_apache_syslog); parser(p_apache); flags(final); };
>>   };
>>   destination(d_files);
>> };
>>
>> This example processes incoming logs differently depending on where 
>> the message came from.
>>
>> *Everything is a log expression*
>>
>> This feature is probably the most complicated, however it brings very 
>> nice properties and expressiveness to the configuration. From now on, 
>> it is not just the well-known log statement that allows the 
>> specification of log processing rules: all the objects in the 
>> syslog-ng configuration file can use the same expressive power.
>>
>> It is now possible to use embedded log statements, junctions and 
>> in-line object definitions within source, destination, filter, rewrite 
>> and parser definitions. Huh, you could ask: what benefit does it bring 
>> me? Well, until now, objects of different types were separate 
>> entities, connected using log statements. With this change a source 
>> can also specify a rewrite rule, and that combination can be used as a 
>> log source in a log statement.
>>
>> For instance, a usual source definition looked like this:
>>
>> source s_apache {
>>   file("/var/log/apache/error.log");
>> };
>>
>> If you wanted to process this log file in a specific way, you needed 
>> to define the accompanying processing rules (parsers and rewrite 
>> expressions) and combine them in a log statement. But how about this:
>>
>> source s_apache {
>>   log {
>>     source { file("/var/log/apache/error.log"); };
>>     parser(p_apache_parser);
>>   };
>> };
>>
>> log { source(s_apache); ... };
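>>
>> The same pattern should work for the rewrite case mentioned above. A 
>> sketch, assuming the r_apache_remove_file_header rewrite rule from the 
>> junction example is declared elsewhere (s_apache_clean is a made-up 
>> name):
>>
>> source s_apache_clean {
>>   log {
>>     source { file("/var/log/apache/error.log"); };
>>     # rewrite rule referenced by name; assumed to be declared elsewhere
>>     rewrite(r_apache_remove_file_header);
>>   };
>> };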
>>
> 
> This just doesn't feel right. I'm not quite sure how else to put it :-)
> I mean that I think of `log` statements as output handlers for a 
> message. They control how the message leaves syslog-ng, whether it be to 
> a file, database, pipe, whatever. To me it seems to make more sense if 
> the `log` statement is called something else here (inside the `source` 
> block), though what, I don't know. Maybe at the least an alias, so that 
> log and the alias are the exact same thing, but reading the config 
> would look more logical. I don't know if I'm understanding the 
> distinction between `log` and `junction` properly, but it seems as if 
> `log`s are a serial execution of the statements within, and `junction`s 
> are a parallel execution of the statements within, so maybe names to 
> better reflect this?

I agree that this does not "feel" right. If the log statement had a destination
that could be used as a source, then the second log statement could source
the destination of the first log statement. That would make all log statements
have a "source" and a "destination".

Just my $0.02

> 
> 
>> Can you see? The s_apache source uses a file source and a reference 
>> to a specific parser, so all messages read from the Apache error log 
>> file are processed by that parser. The log statement is just as 
>> simple as if s_apache were a “normal” source definition. This 
>> feature allows pairing the essential log preprocessing functionality 
>> very closely with the source itself, making it very easy to write and 
>> read the log statements. As an added bonus, it becomes very easy to 
>> distribute application-specific source & parser definitions as an SCL 
>> configuration snippet.
>>
>> *Where?*
>>
>> This stuff is available in the syslog-ng 3.4 git tree 
>> <http://github.com/bazsi/syslog-ng-3.4/>, on master. It passes the 
>> included regression test, so it is at least dogfoodable. The nice 
>> thing about the implementation is that it only slightly increased the 
>> code size, but brought a lot of new features. If you have trouble 
>> getting the code from git, let me know, I’m willing to create an alpha 
>> release, so that it becomes easier to play with it.
>>
>> *Feedback*
>>
>> I see a lot of potential in this functionality, however my examples 
>> may not have been the best ones. I would really appreciate any kind of 
>> feedback, please be sure to send it to the syslog-ng mailing list 
>> <http://lists.balabit.hu/mailman/listinfo/syslog-ng/> or to me in a 
>> private email.
> 
> This does sound like it's going to be some useful stuff, and I will 
> definitely be keeping an eye on it.
> 
> Thanks :-)
> 
> -Patrick

-- 
Evan Rempel                               erempel at uvic.ca
Senior Systems Administrator                 250.721.7691
Unix Services, University Systems, University of Victoria