consecutive pattern parsers, and some other pattern matching questions


Valentijn Sessink

10 Feb 2011

1:54 p.m.

Hello list,

I'm trying to set up a pattern DB for Postfix, and I'm running into a couple of problems.

1) One of the log messages I would like to catch is:

    connect from smtp.example.com[xxxx:xxxx:300:40c1::23]

Now I'd like to catch both hostname and IP address, and maybe I'd like to feed the IP address into some sort of program later. So I thought I'd better put these in individual variables. However, the matching rule

    <pattern>connect from @ESTRING:postfix.remotehost:[@@IPvANY:postfix.remoteip@]</pattern>

doesn't work, because of the double "@@" - which is handled as an escaped "@", instead of two consecutive pattern parsers. So my first question is: how can I have two consecutive pattern parsers in a pattern?

2) Other messages say things like:

    64A7F3001E7: from=<something@example.com> .....

The syslog-ng OSE admin guide tells me to use @QSTRING:<>@ to match the mail address; but this shows an error. Is @QSTRING:<>@ the correct way to proceed? (Or is this impossible with the current implementation?)

3) My third question boils down to: is it possible to correlate *one* single message into *two* separate trails? Would that just work by adding the same pattern to two different contexts?

The question comes from Postfix using a bunch of small, interconnected programs, so the log trail of a single mail message will change characteristics during its journey. For example, when a message comes in, smtpd will log:

    connect from smtp.example.com[xxxx:xxxx:300:40c1::23]

... it could then hand over the message with:

    599903001E7: client=smtp.example.com[xxxx:xxxx:300:40c1::23]

... and then other programs continue the log trail using this queue ID 599903001E7 as a marker. So here are basically two events intertwined: a connection (that just says "connect..." and "... disconnect" and has a context-scope of "program"); and a longer event that starts with the "connect..." and ends with delivery - or even forwarding - of the mail message - and probably has a "host" context-scope.

4) And finally: is there a good way to immediately end a certain context-scope? For example, after "disconnect from smtp.example.com[xxxx:xxxx:300:40c1::23]", the smtp phase is over, so there's no use keeping this context in memory anymore. Would adding something like "context-timeout=0" to the "disconnect" pattern work?

Best regards,
Valentijn


Matthew Hall

10 Feb

6:52 p.m.


On Thu, Feb 10, 2011 at 01:54:52PM +0100, Valentijn Sessink wrote:

...

Hello list,

I'm trying to set up a pattern DB for Postfix, and I'm running into a couple of problems.

1) One of the log messages I would like to catch is: connect from smtp.example.com[xxxx:xxxx:300:40c1::23] Now I'd like to catch both hostname and IP-address, and maybe I'd like to feed the IP address into some sort of program later. So I thought I'd better put these in individual variables. However, the matching rule <pattern>connect from @ESTRING:postfix.remotehost:[@@IPvANY:postfix.remoteip@]</pattern>

doesn't work, because of the double "@@" - which is handled as an escaped "@", instead of two consecutive pattern parsers.

So my first question is: how can I have two consecutive pattern parsers in a pattern?

One possible workaround. Capture it all together. Then make a rewrite rule to break it into two vars, when the .classifier.id matches the id you have for this rule.
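To illustrate the "capture it all together, then split" idea outside of syslog-ng, here is a minimal Python sketch of the splitting step. It is not a syslog-ng rewrite rule; the function name `split_remote` and the sample address (taken from the 2001:db8::/32 documentation prefix, standing in for the masked address above) are assumptions made for this example.

```python
import ipaddress
import re

# Assumed shape of the combined capture: "hostname[address]", i.e. everything
# that follows "connect from " in the smtpd log line.
COMBINED = re.compile(r"^(?P<host>[^\[\]]+)\[(?P<ip>[^\[\]]+)\]$")


def split_remote(combined):
    """Split 'host[ip]' into (host, ip); return None if the text does not fit."""
    match = COMBINED.match(combined)
    if match is None:
        return None
    try:
        # ip_address() accepts both IPv4 and IPv6 text forms.
        ip = ipaddress.ip_address(match.group("ip"))
    except ValueError:
        return None
    return match.group("host"), str(ip)


# Hypothetical sample value; the real address is masked in the thread.
result = split_remote("smtp.example.com[2001:db8:300:40c1::23]")
```

Validating the bracketed part with `ipaddress` keeps anything that merely looks like an address from being passed on to a later program.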

...

2) Other messages say things like: 64A7F3001E7: from=<something@example.com> ..... The syslog-ng OSE admin guide tells me to use @QSTRING:<>@ to match the mail address; but this shows an error. Is @QSTRING:<>@ the correct way to proceed? (Or is this impossible with the current implementation?)

That's correct... XML escape the characters. If you can run the output through W3C XML Tidy utility that helps hugely to make sure everything is right, and nicely indented to be readable. http://packages.debian.org/sid/tidy (Also available in many distros, I even use it in OS X Ports tree)

...

3) My third question boils down to: is it possible to correlate *one* single message into *two* separate trails?

I don't see why it wouldn't be possible if you had a syslog-ng source listening to mail.* and feeding through a single parser. Correlation is only available in syslog-ng 3.2 and up and I'm not doing correlation yet because I'm doing that work with higher level language code.

...

4) and finally: is there a good way to immediately end a certain context-scope? For example, after "disconnect from smtp.example.com[xxxx:xxxx:300:40c1::23]", the smtp phase is over, so there's no use keeping this context in memory anymore. Would adding something like "context-timeout=0" to the "disconnect" pattern work?

There probably is a way, but I'm not sure what it is. Maybe one of the others who has done the SNG correlation could help.

...

Best regards, Valentijn

HTH, Matthew.

Valentijn Sessink

11 Feb

2:14 p.m.


All right, replying to myself:

Valentijn Sessink wrote:

...

1) @ESTRING:postfix.remotehost:[@@IPvANY:postfix.remoteip@]</pattern>

I probably had a typo in the original pattern; as far as I can see, it does work with two consecutive pattern parsers.

...

2) The syslog-ng OSE admin guide tells me to use @QSTRING:<>@ to match the

This is a bit unclear in the documentation. The documentation just mentions the QSTRING:<> match, while naturally, the < and > need to be escaped (&lt; and &gt;).
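As a quick check of the escaping described above, the following Python sketch escapes a pattern for inclusion in an XML file and confirms that an XML parser reads back the original text. The variable name `postfix.from` inside the parser is hypothetical; only the `<` and `>` handling is the point here.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Pattern as one would like to write it (the name "postfix.from" is made up).
raw_pattern = "from=@QSTRING:postfix.from:<>@"

# escape() replaces &, < and > with their XML entities.
escaped = escape(raw_pattern)

# Wrap the escaped text in a <pattern> element and parse it back.
rule_xml = "<pattern>" + escaped + "</pattern>"
parsed_text = ET.fromstring(rule_xml).text
```

Here `escaped` is the form that belongs in the pattern database file, and `parsed_text` is what an XML reader recovers from it, which is the unescaped pattern again.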

...

3) My third question boils down to: is it possible to correlate *one* single message into *two* separate trails?

Yes, you can, but at a cost. To match one message with two patterns, you will need two different pattern databases:

    parser db1 {db_parser(file("/var/lib/syslog-ng/db1.xml"));};
    parser db2 {db_parser(file("/var/lib/syslog-ng/db2.xml"));};

Then, in the log {} entry, specify parser(db1) for the first pattern; and parser(db2) for the second. This seems to work as expected. Trying to match with identical patterns in one database won't work (for technical reasons).

Valentijn

Balazs Scheidler

20 Feb

2:25 p.m.


Hi,

Thanks for summarizing your experience and results.

On Fri, 2011-02-11 at 14:14 +0100, Valentijn Sessink wrote:

...

All right, replying to myself:

Valentijn Sessink wrote:

...
1) @ESTRING:postfix.remotehost:[@@IPvANY:postfix.remoteip@]</pattern>

I probably had a typo in the original pattern; as far as I can see, it does work with two consecutive pattern parsers.

Yes, it should. These are explicitly tested by the unit tests, but probably should be mentioned in the admin guide explicitly, as it comes up every now and then.

...

...
2) The syslog-ng OSE admin guide tells me to use @QSTRING:<>@ to match the

This is a bit unclear in the documentation. The documentation just mentions the QSTRING:<> match, while naturally, the < and > need to be escaped (&lt; and &gt;).

Again, a note would be useful that patterndb is in XML format, and thus XML special characters need to be escaped.

...

...
3) My third question boils down to: is it possible to correlate *one* single message into *two* separate trails?

Yes, you can, but at a cost. To match one message with two patterns, you will need two different pattern databases:

    parser db1 {db_parser(file("/var/lib/syslog-ng/db1.xml"));};
    parser db2 {db_parser(file("/var/lib/syslog-ng/db2.xml"));};

Can you explain why you needed this? Why couldn't you do all processing in your single rule?

...

Then, in the log {} entry, specify parser(db1) for the first pattern; and parser(db2) for the second. This seems to work as expected.

Trying to match with identical patterns in one database won't work (for technical reasons).

That's right, since rules are not evaluated sequentially.

-- Bazsi

Valentijn Sessink

5 p.m.


On 20-02-11 14:25, Balazs Scheidler wrote:

...

...
Yes, you can, but at a cost. To match one message with two patterns, you will need two different pattern databases:

    parser db1 {db_parser(file("/var/lib/syslog-ng/db1.xml"));};
    parser db2 {db_parser(file("/var/lib/syslog-ng/db2.xml"));};

Can you explain why you needed this? Why couldn't you do all processing in your single rule?

My question came from Postfix, where I tried correlating the smtpd "connect" and "disconnect" messages - which is quite trivial; but I would also like a larger correlation that includes the whole mail delivery.

The connect/disconnect trail is simple: context-id="postfix-smtpd" context-scope="process" and off you go.

The mail delivery trail is trickier: you cannot get the full trail with just a "process" scope, you need to look for the "queueid". This queueid starts with smtpd, so there you go: a single message from smtpd that has a meaning in two different contexts.

Please note that the queue ID is not available in all smtpd messages, so it is not possible to add trail 1 to trail 2.

(I hope my explanation is clear; if not, please say so. I have a couple of patterns and also a Postfix log trail that I could include.)

Best regards,
Valentijn
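The two-trail situation described above can be sketched in Python, independent of how syslog-ng implements correlation. This is only an illustration of one message landing in two contexts: the class name `Trails`, the host name `mx1`, the process IDs, and the sample address are all made up, and the queue ID regex assumes the short uppercase-hex form seen in this thread.

```python
import re
from collections import defaultdict

# Assumption: a queue ID is a run of uppercase hex digits followed by ": "
# at the start of the message, as in "599903001E7: client=...".
QUEUE_ID = re.compile(r"^(?P<qid>[0-9A-F]{6,}): ")


class Trails:
    """Keeps two independent correlation states for one message stream."""

    def __init__(self):
        self.by_process = defaultdict(list)  # key: (host, program, pid)
        self.by_queue = defaultdict(list)    # key: (host, queue_id)

    def add(self, host, program, pid, message):
        """File the message under every trail it belongs to."""
        if program == "postfix/smtpd":
            self.by_process[(host, program, pid)].append(message)
        match = QUEUE_ID.match(message)
        if match:
            self.by_queue[(host, match.group("qid"))].append(message)


# Hypothetical log entries: (host, program, pid, message).
trails = Trails()
for entry in [
    ("mx1", "postfix/smtpd", 4242, "connect from smtp.example.com[2001:db8::23]"),
    ("mx1", "postfix/smtpd", 4242, "599903001E7: client=smtp.example.com[2001:db8::23]"),
    ("mx1", "postfix/qmgr", 77, "599903001E7: from=<something@example.com>"),
    ("mx1", "postfix/smtpd", 4242, "disconnect from smtp.example.com[2001:db8::23]"),
]:
    trails.add(*entry)
```

The "client=" line is the only one stored in both dictionaries, which is exactly the message that needs two correlation states; the "connect" and "disconnect" lines carry no queue ID and so cannot be folded into the delivery trail.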

Balazs Scheidler

1 Mar

8:21 p.m.


On Sun, 2011-02-20 at 17:00 +0100, Valentijn Sessink wrote:

...

On 20-02-11 14:25, Balazs Scheidler wrote:

...
...
Yes, you can, but at a cost. To match one message with two patterns, you will need two different pattern databases:

    parser db1 {db_parser(file("/var/lib/syslog-ng/db1.xml"));};
    parser db2 {db_parser(file("/var/lib/syslog-ng/db2.xml"));};

Can you explain why you needed this? Why couldn't you do all processing in your single rule?

My question came from Postfix, where I tried correlating the smtpd "connect" and "disconnect" messages - which is quite trivial; but I would also like a larger correlation that includes the whole mail delivery.

The connect/disconnect trail is simple: context-id="postfix-smtpd" context-scope="process" and off you go.

The mail delivery trail is trickier: you cannot get the full trail with just a "process" scope, you need to look for the "queueid". This queueid starts with smtpd, so there you go: a single message from smtpd that has a meaning in two different contexts.

Please note that the queue-id is not available in all smtpd messages, so it is not possible to add trail 1 to trail 2.

(I hope my explanation is clear, if not, please say so; I have a couple of patterns and also a postfix log trail that I could include).

That really is a problem: you basically need two correlation states for the same message, while I originally envisioned one. In fact the first designs permitted this scenario as well, but the final design doesn't.

Do you use the same pattern in this case? E.g. are your name-value pairs the same in the two rules? If this is the case, then this could be supported by simply associating two rules with the same pattern (which is internally a separate 'object'). Something like this:

    <rule id="1" context-id='foo' context-scope='process'>
      <pattern>postfix pattern</pattern>
    </rule>
    <rule id="2" context-id='foo.$queue_id' context-scope='host'>
      <pattern>postfix pattern</pattern>
    </rule>

Although this would cause some problems, because syslog-ng currently assumes that each message either matches a rule or it doesn't. It currently doesn't have the notion of multiple matches. Also, I'm not sure this would be very intuitive. Currently we display an error message on pattern collisions.

Marci, what do you think?

-- Bazsi

Valentijn Sessink

8:56 p.m.


Balázs, Marci, list,

On 01-03-11 20:21, Balazs Scheidler wrote:

...

On Sun, 2011-02-20 at 17:00 +0100, Valentijn Sessink wrote:

...
On 20-02-11 14:25, Balazs Scheidler wrote:

...
...
Yes, you can, but at a cost. To match one message with two patterns, you will need two different pattern databases:

    parser db1 {db_parser(file("/var/lib/syslog-ng/db1.xml"));};
    parser db2 {db_parser(file("/var/lib/syslog-ng/db2.xml"));};

Can you explain why you needed this? Why couldn't you do all processing in your single rule?

My question came from Postfix [cut: explanation about double message trails]

That really is a problem, you basically need two correlation states for the same message, while I originally envisioned one. In fact the first designs permitted this scenario as well, but the final design doesn't.

I don't understand your findings. If I have two db_parser entries in syslog-ng.conf, it does work, or doesn't it?

But before you all start programming, please note that the very question came because I was interested in syslog-ng as a message parser for the very reason Balázs has put on his blog: getting syslog-ng to spit out evil IP addresses.

The rest of my tests were only conducted to see if I understood the pattern matching; I'm not currently using an smtpd correlating state machine, and I'm pretty sure I won't need the extra complexity of an *additional* smtpd connect/disconnect correlating pattern.

So, to summarize:

1) I'm pretty sure I can do without the double matching. I asked out of curiosity; I got things to work "sort-of". However...

2) If you think my solution (two db_pattern files with the same patterns) doesn't work in practice, then you might warn that there's a "collision" between these two databases.

As said, having two separate db_parser files did seem to solve the problem - but I could be wrong, of course. (In which case you might want to enlighten me ;)

Best regards,
Valentijn

Balazs Scheidler

9:31 p.m.


On Tue, 2011-03-01 at 20:56 +0100, Valentijn Sessink wrote:

...

Balázs, Marci, list,

On 01-03-11 20:21, Balazs Scheidler wrote:

...
On Sun, 2011-02-20 at 17:00 +0100, Valentijn Sessink wrote:

...
On 20-02-11 14:25, Balazs Scheidler wrote:

...
...
Yes, you can, but at a cost. To match one message with two patterns, you will need two different pattern databases:

    parser db1 {db_parser(file("/var/lib/syslog-ng/db1.xml"));};
    parser db2 {db_parser(file("/var/lib/syslog-ng/db2.xml"));};

Can you explain why you needed this? Why couldn't you do all processing in your single rule?

My question came from Postfix [cut: explanation about double message trails]

That really is a problem, you basically need two correlation states for the same message, while I originally envisioned one. In fact the first designs permitted this scenario as well, but the final design doesn't.

I don't understand your findings. If I have two db_parser entries in syslog-ng.conf, it does work, or doesn't it?

Yes, that works, and will continue to. The original intention of db-parser(), however, is to do all kinds of log -> event translation with the same database. Having to configure two of them is possible, but doesn't fit the vision :)

...

But before you all start programming, please note that the very question came because I was interested in syslog-ng as a message parser for the very reason Balázs has put on his blog: getting syslog-ng to spit out evil IP addresses.

The rest of my tests were only conducted to see if I understood the pattern matching; I'm not currently using an smtpd correlating state machine, and I'm pretty sure I won't need the extra complexity of an *additional* smtpd connect/disconnect correlating pattern.

So, to summarize:

1) I'm pretty sure I can do without the double matching. I asked out of curiosity; I got things to work "sort-of". However...

2) If you think my solution (two db_pattern files with the same patterns) doesn't work in practice, then you might warn that there's a "collision" between these two databases.

As said, having two separate db_parser files did seem to solve the problem - but I could be wrong, of course. (In which case you might want to enlighten me ;)

It's just that your use case seems to be a perfectly valid scenario. We can rule it out for now, but I guess it'll come back later :)

-- Bazsi

Valentijn Sessink

9:54 p.m.


On 01-03-11 21:31, Balazs Scheidler wrote:

...

It's just your use-case seems to be a perfectly valid scenario. We can rule it out for now, but I guess it'll come back later :)

... in which case you *do* need a double database, because the patterns are identical. Personally, I'd vote for the extra-patterndb-scenario. V.
