[syslog-ng] are nested quotes possible in parser

Balazs Scheidler bazsi at balabit.hu
Wed Jan 14 12:40:05 CET 2009


On Tue, 2009-01-13 at 11:10 -0500, Michael Hocke wrote:
> Hi there,
> 
> my Avocent console servers are sending console port output via syslog  
> to my syslog server running syslog-ng 3.0.1. Console output looks  
> like this:
> 
> Jan 13 00:18:39 sysl at cyc2 Buffering: S39.gwa [Jan 13 00:14:04.379: %EARL_NETFLOW-SP-4-TCAM_THRLD: Netflow TCAM threshold exceeded, TCAM Utilization [97%]]
> Jan 13 00:18:53 sysl at cyc2 Buffering: S42.sw2f [2009 Jan 13 00:18:53 EST -05:00 %ETHC-5-PORTTOSTP:Port 3/27 joined bridge port 3/27]
> Jan 13 00:18:53 sysl at cyc2 Buffering: S42.sw2f [2009 Jan 13 00:18:53 EST -05:00 %DTP-7-PORTLINKDOWN:Port 3/27 Link down]
> Jan 13 00:18:53 sysl at cyc2 Buffering: S42.sw2f [2009 Jan 13 00:18:53 EST -05:00 %ETHC-5-PORTFROMSTP:Port 3/27 left bridge port 3/27]
> 
> The goal is to store the console output within square brackets into  
> separate files named after the server that created this output. The  
> first line of the example above should go into the file "gwa" while  
> the others go into "sw2f". This is what I have so far:
> 
> > source s_udp { udp (); };
> >
> > # --- parse console server output
> > # separate port description from message
> > parser p_console_output {
> >        csv-parser (columns ("CONSOLE.SOURCE", "CONSOLE.MSG")
> >                    delimiters (" ")
> >                    quote-pairs ("[]")
> >                    template ("${MSGONLY}"));
> > };
> >
> > # extract port label from port description
> > parser p_console_source {
> >        csv-parser (columns ("CONSOLE.PORT", "CONSOLE.LABEL")
> >                    delimiters (".")
> >                    template ("${CONSOLE.SOURCE}"));
> > };
> >
> > # --- destination of console output
> > destination d_console_output {
> >         file ("/usr/local/var/log/remote/${HOST_FROM}/console/${CONSOLE.LABEL}"
> >               template ("${CONSOLE.MSG}\n"));
> > };
> >
> > # --- filter console output
> > filter f_console_output {
> >        facility (local7) and host ("^sysl at cyc.*");
> > };
> >
> > # --- log console output
> > log {
> >        source (s_udp);
> >        filter (f_console_output);
> >        parser (p_console_output);
> >        parser (p_console_source);
> >        destination (d_console_output);
> > };
> 
> 
> This works just fine with the last three lines of my example data  
> above. The problem I am having is that if the console output (the  
> text between square brackets) contains its own square brackets the  
> message will cut off right after the first occurrence of the closing  
> bracket. The first line of my example data will look like this:
> 
> Jan 13 00:14:04.379: %EARL_NETFLOW-SP-4-TCAM_THRLD: Netflow TCAM  
> threshold exceeded, TCAM Utilization [97%
> 
> instead of
> 
> Jan 13 00:14:04.379: %EARL_NETFLOW-SP-4-TCAM_THRLD: Netflow TCAM  
> threshold exceeded, TCAM Utilization [97%]
> 
> I could probably get around this by using a rewrite rule using PCRE  
> but considering the amount of data that needs to be looked at this  
> solution is going to be very expensive.
> 
> Is there a way to make syslog-ng aware of nested quotes? If not, is  
> there something in the pipeline to support this in future releases?  
> If not, I will be willing to take a shot and implement this.
> 
> Any pointers or suggestions are welcome.

Hmm... it should be doable when the opening and closing quote characters
differ, but syslog-ng does not currently support it. It could be
implemented in log_csv_parser_process(), but please note that there are
currently two implementations of CSV parsing:
  * a fast path, used when no escaping mechanism is configured
  * a slower path, used when any of the currently supported escaping
    mechanisms is set

I'd guess that, given that the opening and closing quotes differ, both
paths should be able to support embedded quotes without additional
user-specified flags, which would only make the configuration more
complex.

E.g. the no-escaping path could always look for another opening quote
before looking for the closing quote, and keep a count of the nesting
depth. I don't think this would be a big performance penalty.
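To make the depth-counting idea concrete, here is a minimal sketch in
Python. It is only an illustration, not syslog-ng code: the function name
split_fields() and its parameters are hypothetical, and it assumes a
single delimiter character and one pair of distinct quote characters.

```python
def split_fields(line, delimiter=" ", open_quote="[", close_quote="]"):
    """Split a line into fields, treating nested quote pairs as one field.

    Hypothetical helper for illustration only; it does not mirror the
    real log_csv_parser_process() implementation.
    """
    fields = []
    pos = 0
    while pos < len(line):
        if line[pos] == open_quote:
            # Quoted field: count nesting depth until the matching close.
            depth = 1
            end = pos + 1
            while end < len(line) and depth > 0:
                if line[end] == open_quote:
                    depth += 1
                elif line[end] == close_quote:
                    depth -= 1
                end += 1
            # If the quote was never closed, keep the rest of the line.
            stop = end - 1 if depth == 0 else end
            fields.append(line[pos + 1:stop])
            pos = end
            # Skip the delimiter that follows the closing quote, if any.
            if pos < len(line) and line[pos] == delimiter:
                pos += 1
        else:
            # Unquoted field: runs up to the next delimiter.
            end = line.find(delimiter, pos)
            if end == -1:
                end = len(line)
            fields.append(line[pos:end])
            pos = end + 1
    return fields
```

With the first example message, the inner "[97%]" stays inside the
second field, because the scan only stops once the depth counter returns
to zero. The extra cost over the current behaviour is one comparison per
character inside a quoted field.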

The escape-supporting path is slower anyway, and it would not be
difficult to add support for this there either.
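For the escape-aware path, the same depth counter could sit next to the
escape handling. The sketch below is again hypothetical Python rather
than syslog-ng code: read_quoted() is an invented name, and backslash
escaping is assumed purely as an example of an escaping mechanism.

```python
def read_quoted(text, start=0, open_quote="[", close_quote="]", escape="\\"):
    """Read one quoted field beginning at text[start].

    Returns (field, next_pos). Nested quote pairs are kept in the field;
    an escaped character is copied literally and never changes the depth.
    Hypothetical helper for illustration only.
    """
    if start >= len(text) or text[start] != open_quote:
        raise ValueError("text[start] is not an opening quote")
    out = []
    depth = 1
    pos = start + 1
    while pos < len(text):
        ch = text[pos]
        if ch == escape and pos + 1 < len(text):
            # Escaped character: take it as-is and skip depth counting.
            out.append(text[pos + 1])
            pos += 2
            continue
        if ch == open_quote:
            depth += 1
        elif ch == close_quote:
            depth -= 1
            if depth == 0:
                # Matching close found; do not include it in the field.
                return "".join(out), pos + 1
        out.append(ch)
        pos += 1
    # Unterminated quote: return whatever was collected.
    return "".join(out), pos
```

An escaped closing quote is treated as ordinary text here, so it neither
ends the field nor unbalances the counter.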

I'd appreciate it if you could come up with code, but please note the
'contributory license agreement' on our website. It basically allows any
contribution to be merged into both the OSE and PE versions. (The
reason is to keep the two from diverging too much, because that would
prevent me from keeping them in sync.)

-- 
Bazsi


