Debugging Pattern Match Failures
Hello list,
Recently I created a series of blasphemous scripts which convert some large collections of recorded log messages in my environment into pattern DB XML files. At first there were some syntax errors but I fixed all of these and the XML files are loading successfully.
However I am running into some problems with the next step: getting the patterns to match against the incoming log messages. I suspect I am not properly stripping the headers off of the disk files of recorded messages I am using to generate the pattern DB XML files.
I am wondering how I can enable the right debugging capabilities to get more detailed debug output from the pattern DB parser where I can see what strings are being processed so that I can fix this right instead of guessing repeatedly and incorrectly.
Thanks, Matthew Hall.
Did you try the patternize utility? It can automate a lot of the pattern creating. It's on git and written about here: http://gyp.blogs.balabit.com/2010/01/introducing-pdbtool-patternize.html
Also, are you using the pdbtool to test the messages? See this blog post for more info: http://marci.blogs.balabit.com/2010/07/pdbtool-test-and-pattern-database.htm...
--Martin
Hello Martin,
On Mon, Aug 02, 2010 at 10:07:36PM -0500, Martin Holste wrote:
> Did you try the patternize utility? It can automate a lot of the pattern creating.
First of all, thank you very much for pointing out patternize; I did see many of the patterndb related blogs but missed this one. I will certainly investigate this in detail and make as much use of it as possible.
> Also, are you using the pdbtool to test the messages? See this blog post for more info:
I thought about pdbtool, but the problem there was that I needed to know exactly which string the daemon would receive, how it would look when the daemon stripped the headers, and what it would send into the patterndb for matching. This is because the messages on the socket have different headers from the headers which are used in the disk files of messages I am using as the source of raw material for creating the patterns. Thus I end up with the same problem I started with, unless I'm missing something here.
> --Martin
Cheers, Matthew.
______________________________________________________________________________ Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng FAQ: http://www.campin.net/syslog-ng/faq.html
On Mon, 2010-08-02 at 22:29 -0700, Matthew Hall wrote:
> Hello Martin,
> On Mon, Aug 02, 2010 at 10:07:36PM -0500, Martin Holste wrote:
>> Did you try the patternize utility? It can automate a lot of the pattern creating.
> First of all thank you very much for pointing out patternize; I did see many of the patterndb related blogs but missed this one. I will certainly investigate this in detail and make as much use of it as possible.
Integrating patternize functionality is on my todo list, so hopefully you won't have to look too far, since it's going to be part of syslog-ng itself :)
>> Also, are you using the pdbtool to test the messages? See this blog post for more info:
> I thought about pdbtool but the problem there was that I needed to know exactly which string the daemon would receive, how it would look when the daemon stripped the headers, and what it would send into the patterndb for matching.
> This is because the messages on the socket have different headers from the headers which are used in the disk files of messages I am using as the source of raw material for creating the patterns. Thus I end up with the same problem I started with, unless I'm missing something here.
Well, if you want to look at the result of the message parsing exactly as done by syslog-ng, you could use a noop rewrite rule and enable debugging (though it is not recommended to be done in a production server):
rewrite r_noop { set("$MESSAGE"); };
This would set $MESSAGE to $MESSAGE, but at the end of the rewrite rule, syslog-ng would emit a debug message about the contents of the MESSAGE name-value pair.
Alternatively, you may still be able to use "pdbtool match" which can read a log file, parse it with syslog-ng's message parser and report the results per name-value pair.
$ pdbtool match -f /var/log/auth.log -p access/sshd.pdb | head -10
HOST=bzorp
MESSAGE=pam_unix(cron:session): session opened for user root by (uid=0)
PROGRAM=CRON
PID=7362
LEGACY_MSGHDR=CRON[7362]:
.classifier.class=unknown
...
This uses the normal BSD syslog parser to read the file (thus if you are using the no-parse flag, or RFC5424 format log files, the result may differ).
-- Bazsi
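To compare what the parser extracted against what a pattern expects, the name-value report can be loaded into a dictionary. The following is only a minimal Python sketch: it assumes the report contains one NAME=value pair per line, as in the excerpt above, and the function name parse_name_value_lines is hypothetical. It does not invoke pdbtool itself.

```python
# Sample text copied from the excerpt above (one NAME=value pair per line
# is an assumption about the report layout).
SAMPLE = """\
HOST=bzorp
MESSAGE=pam_unix(cron:session): session opened for user root by (uid=0)
PROGRAM=CRON
PID=7362
LEGACY_MSGHDR=CRON[7362]:
.classifier.class=unknown
"""


def parse_name_value_lines(text):
    """Return a dict of name-value pairs found in the given text."""
    pairs = {}
    for line in text.splitlines():
        # Split on the first "=" only, because values such as "(uid=0)"
        # may contain "=" themselves.
        name, sep, value = line.partition("=")
        if sep:
            pairs[name.strip()] = value
    return pairs


pairs = parse_name_value_lines(SAMPLE)
print(pairs["MESSAGE"])
```

The MESSAGE entry is the interesting one here, since it shows the text that remains after the syslog header has been parsed off.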
On Tue, Aug 03, 2010 at 02:39:38PM +0200, Balazs Scheidler wrote:
> Well, if you want to look at the result of the message parsing exactly as done by syslog-ng, you could use a noop rewrite rule and enable debugging (though it is not recommended to be done in a production server):
> rewrite r_noop { set("$MESSAGE"); };
> This would set $MESSAGE to $MESSAGE, but at the end of the rewrite rule, syslog-ng would emit a debug message about the contents of the MESSAGE name-value pair.
Unfortunately I can't even get that far because the beginning of my message patterns is not matching up against whatever syslog-ng is using to do the pattern match, so I am not going to get any name-value pairs out.
> Alternatively, you may still be able to use "pdbtool match" which can read a log file, parse it with syslog-ng's message parser and report the results per name-value pair.
> $ pdbtool match -f /var/log/auth.log -p access/sshd.pdb | head -10
> HOST=bzorp
> MESSAGE=pam_unix(cron:session): session opened for user root by (uid=0)
> PROGRAM=CRON
> PID=7362
> LEGACY_MSGHDR=CRON[7362]:
> .classifier.class=unknown
> ...
> This uses the normal BSD syslog parser to read the file (thus if you are using no-parse flag, or RFC5424 format log files, that may differ)
How do I create a file in this BSD format the pdbtool expects? Right now I am using syslog-ng output files as input to my patternizing scripts, but I think I am not stripping off the right things at the beginning of the lines in these files (either too much or too little).
Is there some option I can use to store just the part it would send to the pattern matcher, so that I can have input to my patternizer which looks exactly like what the daemon is going to match during the pattern match for each message?
> -- Bazsi
Thanks, Matthew.
I believe the matching is done against the $MSGONLY macro, so you can put another log destination in to write that out only and have a look to see what the parser is seeing. Do you have an example log you can show?
On Tue, Aug 03, 2010 at 06:53:13PM -0500, Martin Holste wrote:
> I believe the matching is done against the $MSGONLY macro, so you can put another log destination in to write that out only and have a look to see what the parser is seeing. Do you have an example log you can show?
Here is an example of what would be appearing in the disk log file:
Jul 1 00:00:00 <local1.notice> 172.16.0.1 0000001: Jul 1 00:00:00.000 UTC: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/1, changed state to down
There are many more types of message coming from many more devices, some of which are BSD compliant and some of which are not, and I think that is part of my problem.
The unclear part is how much of the front part needs to be stripped off before making the patterns in the XML file. Hopefully I will be able to figure that out now that you have clarified how I can make a raw message file without extraneous strings prepended.
Thanks for helping me understand how this works and what I can do to get my patterns right. I definitely owe you a beer.
Regards, Matthew.
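One way to experiment with how much of the front part to strip is a small script that removes a leading header from each recorded line. This is only a sketch in Python, under the assumption that the header written to the disk file always has the shape "month day time <facility.level> host " seen in the example above; HEADER_RE and strip_file_header are hypothetical names, and lines from devices that log in a different layout would need their own expression.

```python
import re

# Assumed header layout, taken from the example line in this thread:
#   "Jul 1 00:00:00 <local1.notice> 172.16.0.1 "
HEADER_RE = re.compile(
    r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+"  # month, day, time
    r"<[a-z0-9]+\.[a-z]+>\s+"                            # <facility.level>
    r"\S+\s+"                                            # host
)


def strip_file_header(line):
    """Remove the assumed header; lines that do not match are returned unchanged."""
    return HEADER_RE.sub("", line.rstrip("\n"), count=1)


recorded = (
    "Jul 1 00:00:00 <local1.notice> 172.16.0.1 0000001: "
    "Jul 1 00:00:00.000 UTC: %LINEPROTO-5-UPDOWN: Line protocol on "
    "Interface GigabitEthernet0/1, changed state to down"
)
print(strip_file_header(recorded))
```

Whether the stripped remainder is really what the pattern matcher sees still has to be confirmed against the daemon, for example by writing out $MSGONLY as suggested above.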
I did some experimentation, using the following log setup:
rewrite r_raw { set("$MSGONLY"); };
destination d_u_raw_local1 { file("/logs/raw/local1" dir_owner("root") owner("root") group("root") perm(0640) dir_perm(0755) create_dirs(yes) template(t_default) suppress(3) ); };
But I am still getting messages like this:
Aug 1 00:00:00 <local1.notice> 172.16.0.2 from: 172.16.0.1: 000001: Aug 1 00:00:00.000: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet5/16, changed state to up
Thus it seems that I am not successfully stripping all headers normally added by the file writer off of the message using this configuration. What did I miss here in my rewrite rule? Without some way to make sure I have a raw file with no weird headers added, it's hard to make decent patterns.
Thanks, Matthew.
No worries, happy to help.
Ok, try this destination:
template t_test { template("$MSGONLY\n"); };
destination d_test { file("/logs/raw/test" dir_owner("root") owner("root") group("root") perm(0640) dir_perm(0755) create_dirs(yes) template(t_test) ); };
And don't use the "r_raw" rewriter with this destination. I'm pretty sure this will yield what you've got already, but I want to take as many variables out of it as possible.
If the messages do indeed look the same in /logs/raw/test, then it means that you need to write your patterns so that they start matching from whatever is printed. That shouldn't be too bad as long as you can get a decent delimiter in there. The colons look like a good anchor.
Ok, this pattern seems to work in pdbtool test:
<ruleset>
  <rules>
    <rule class="test" id="test">
      <patterns>
        <pattern>@ESTRING:month: @@NUMBER:mday:@ @NUMBER:hour:@:@NUMBER:minute:@:@NUMBER:second:@ @QSTRING:priority:&lt;&gt;@ @IPv4:host1:@ from: @IPv4:host2:@: @NUMBER::@: @ESTRING:: @@NUMBER::@ @NUMBER::@:@NUMBER::@:@NUMBER::@.@NUMBER::@: @ESTRING:program::@ @ANYSTRING:message:@</pattern>
      </patterns>
      <examples>
        <example>
          <test_message program="">Aug 1 00:00:00 &lt;local1.notice&gt; 172.16.0.2 from: 172.16.0.1: 000001: Aug 1 00:00:00.000: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet5/16, changed state to up</test_message>
          <test_values>
            <test_value name="month">Aug</test_value>
            <test_value name="mday">1</test_value>
            <test_value name="hour">00</test_value>
            <test_value name="minute">00</test_value>
            <test_value name="second">00</test_value>
            <test_value name="priority">local1.notice</test_value>
            <test_value name="host1">172.16.0.2</test_value>
            <test_value name="host2">172.16.0.1</test_value>
            <test_value name="program">%LINEPROTO-5-UPDOWN</test_value>
            <test_value name="message">Line protocol on Interface FastEthernet5/16, changed state to up</test_value>
          </test_values>
        </example>
      </examples>
    </rule>
  </rules>
</ruleset>
So give that a shot with the tool and then it's just a matter of getting your $MSGONLY macro to line up.
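Angle brackets such as those in <local1.notice> have to be written as &lt; and &gt; when they appear inside the text of an XML element, which is easy to get wrong in a script that builds pattern DB files by string concatenation. A minimal Python sketch of letting an XML library do the escaping follows; build_example is a hypothetical helper, and only the element names already used in this thread (example, test_message, test_values, test_value) are assumed.

```python
import xml.etree.ElementTree as ET


def build_example(message, values):
    """Build an <example> element as XML text, escaping special characters."""
    example = ET.Element("example")
    test_message = ET.SubElement(example, "test_message", program="")
    test_message.text = message  # ElementTree escapes <, > and & on output
    test_values = ET.SubElement(example, "test_values")
    for name, value in values.items():
        test_value = ET.SubElement(test_values, "test_value", name=name)
        test_value.text = value
    return ET.tostring(example, encoding="unicode")


xml_text = build_example(
    "Aug 1 00:00:00 <local1.notice> 172.16.0.2 from: 172.16.0.1: 000001: "
    "Aug 1 00:00:00.000: %LINEPROTO-5-UPDOWN: Line protocol on Interface "
    "FastEthernet5/16, changed state to up",
    {"priority": "local1.notice", "host1": "172.16.0.2"},
)
print(xml_text)
```

Parsing the generated text back with the same library is a cheap check that the file will at least load as well-formed XML before it is handed to the pattern DB tools.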
Hi Martin, I think I've not explained my problem very well yet and I have to improve the explanation based on what you've been showing me so far. The part which confuses me is the header that seems to be getting prepended to the message: Aug 1 00:00:00 <local1.notice> 172.16.0.2 I am pretty sure that syslog-ng must be adding this. I doubt it arrives that way off the socket based on the packet captures. If that part is prepended before the pattern DB parser is invoked. If it's added before the parser is invoked and is not present on the original messages off the socket, I'd say it's buggy behavior or should at least be configurable. If that part is prepended later on when it's time to send the message to the file writer, then I really would want a way to see exactly what's passed to the parser to make the proper patterns, it seems silly my patterns should need to reparse fields added by the daemon itself, especially if there could be several different output formats I'd need separate patterns for each even though the device only sent one type of message. Regards, Matthew. On Tue, Aug 03, 2010 at 10:18:14PM -0500, Martin Holste wrote:
Ok, this pattern seems to work in the pdbtest tool:
<ruleset> <rules> <rule class="test" id="test"> <patterns> <pattern>@ESTRING:month: @@NUMBER:mday:@ @NUMBER:hour:@:@NUMBER:minute:@:@NUMBER:second:@ @QSTRING:priority:<>@ @IPv4:host1:@ from: @IPv4:host2:@: @NUMBER::@: @ESTRING:: @@NUMBER::@ @NUMBER::@:@NUMBER::@:@NUMBER::@.@NUMBER::@: @ESTRING:program::@ @ANYSTRING:message:@</pattern> </patterns> <examples> <example> <test_message program="">Aug 1 00:00:00 <local1.notice> 172.16.0.2 from: 172.16.0.1: 000001: Aug 1 00:00:00.000: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet5/16, changed state to up</test_message> <test_values> <test_value name="month">Aug</test_value> <test_value name="mday">1</test_value> <test_value name="hour">00</test_value> <test_value name="minute">00</test_value> <test_value name="second">00</test_value> <test_value name="priority">local1.notice</test_value> <test_value name="host1">172.16.0.2</test_value> <test_value name="host2">172.16.0.1</test_value> <test_value name="program">%LINEPROTO-5-UPDOWN</test_value> <test_value name="message">Line protocol on Interface FastEthernet5/16, changed state to up</test_value> </test_values> </example> </examples> </rule> </rules> </ruleset>
So give that a shot with the tool and then it's just a matter of getting your $MSGONLY macro to line up.
On Tue, Aug 3, 2010 at 9:11 PM, Martin Holste <mcholste@gmail.com> wrote:
No worries, happy to help.
Ok, try this destination:
template t_test { template("$MSGONLY\n"); };
destination d_test { file("/logs/raw/test" dir_owner("root") owner("root") group("root") perm(0640) dir_perm(0755) create_dirs(yes) template(t_test) ); };
And don't use the "r_raw" rewriter with this destination. I'm pretty sure this will yield what you've got already, but I want to take as many variables out of it as possible.
If the messages do indeed look the same in /logs/raw/test, then it means that you need to write your patterns so that they start matching from whatever is printed. That shouldn't be too bad as long as you can get a decent delimiter in there. The colons look like a good anchor.
On Tue, Aug 3, 2010 at 8:35 PM, Matthew Hall <mhall@mhcomputing.net> wrote:
I did some experimentation, using the following log setup:
rewrite r_raw { set("$MSGONLY"); };

destination d_u_raw_local1 {
    file("/logs/raw/local1"
        dir_owner("root") owner("root") group("root")
        perm(0640) dir_perm(0755) create_dirs(yes)
        template(t_default) suppress(3)
    );
};
But I am still getting messages like this:
Aug 1 00:00:00 <local1.notice> 172.16.0.2 from: 172.16.0.1: 000001: Aug 1 00:00:00.000: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet5/16, changed state to up
Thus it seems that this configuration does not strip all of the headers the file writer normally adds to the message. What did I miss in my rewrite rule? Without some way to make sure I have a raw file with no extra headers added, it's hard to make decent patterns.
Thanks, Matthew.
On Tue, Aug 03, 2010 at 05:18:10PM -0700, Matthew Hall wrote:
On Tue, Aug 03, 2010 at 06:53:13PM -0500, Martin Holste wrote:
I believe the matching is done against the $MSGONLY macro, so you can put another log destination in to write that out only and have a look to see what the parser is seeing. Do you have an example log you can show?
Here is an example of what would be appearing in the disk log file:
Jul 1 00:00:00 <local1.notice> 172.16.0.1 0000001: Jul 1 00:00:00.000 UTC: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/1, changed state to down
There are many more types of message coming from many more devices, some of which are BSD compliant and some of which are not, and I think that is part of my problem.
The unclear part is how much of the front part needs to be stripped off, before making the patterns in the XML file. Hopefully I will be able to figure that out now that you have clarified how I can make a raw message file without extraneous strings appended.
Thanks for helping me understand how this works and what I can do to get my patterns right. I definitely owe you a beer.
Regards, Matthew.
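One way to experiment with how much of the front to strip is a small filter over the recorded files. The sketch below assumes the header has the layout seen in the sample lines in this thread ("Mon D HH:MM:SS <facility.level> host "); that layout, and the names HEADER and strip_header, are assumptions. Whether the remainder is really what the parser sees still has to be confirmed against a destination that writes only $MSGONLY.

```python
import re

# Assumed header layout, inferred only from the sample lines in this thread:
#   "Jul 1 00:00:00 <local1.notice> 172.16.0.1 "
# Other templates or devices may produce something different.
HEADER = re.compile(r"^[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} <[^>]+> \S+ ")


def strip_header(line):
    """Remove the assumed header; lines without it are returned unchanged."""
    return HEADER.sub("", line, count=1)


if __name__ == "__main__":
    recorded = (
        "Jul 1 00:00:00 <local1.notice> 172.16.0.1 0000001: "
        "Jul 1 00:00:00.000 UTC: %LINEPROTO-5-UPDOWN: Line protocol on "
        "Interface GigabitEthernet0/1, changed state to down"
    )
    print(strip_header(recorded))
```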
On Tue, Aug 3, 2010 at 12:10 PM, Matthew Hall <mhall@mhcomputing.net> wrote:
On Tue, Aug 03, 2010 at 02:39:38PM +0200, Balazs Scheidler wrote:
> Well, if you want to look at the result of the message parsing exactly
> as done by syslog-ng, you could use a noop rewrite rule and enable
> debugging (though it is not recommended to be done in a production
> server):
>
> rewrite r_noop { set("$MESSAGE"); };
>
> This would set $MESSAGE to $MESSAGE, but at the end of the rewrite rule,
> syslog-ng would emit a debug message about the contents of the MESSAGE
> name-value pair.
Unfortunately I can't even get that far because the beginning of my message patterns is not matching up against whatever syslog-ng is using to do the pattern match, so I am not going to get any name value pairs out.
> Alternatively, you may still be able to use "pdbtool match" which can
> read a log file, parse it with syslog-ng's message parser and report the
> results per name-value pair.
>
> $ pdbtool match -f /var/log/auth.log -p access/sshd.pdb | head -10
> HOST=bzorp
> MESSAGE=pam_unix(cron:session): session opened for user root by (uid=0)
> PROGRAM=CRON
> PID=7362
> LEGACY_MSGHDR=CRON[7362]:
> .classifier.class=unknown
>
> ...
>
> This uses the normal BSD syslog parser to read the file (thus if you are
> using no-parse flag, or RFC5424 format log files, that may differ)
How do I create a file in this BSD format the pdbtool expects? Right now I am using syslog-ng output files as input to my patternizing scripts, but I think I am not stripping off the right things at the beginning of the lines in these files (either too much or too little).
Is there some option I can use to store just the part it would send to the pattern matcher so that I can have input to my patternizer which looks exactly like what the daemon is going to match during the pattern match for each message?
> --
> Bazsi
Thanks, Matthew.
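The name-value output that Bazsi quotes from "pdbtool match" can be loaded into a script to check which part of a line ended up in MESSAGE. The following is a minimal sketch that assumes the output is one NAME=VALUE pair per line, as in the quoted excerpt; how pdbtool separates successive records is not shown in this thread, so the sketch handles a single record only. The names parse_name_values and SAMPLE_OUTPUT are hypothetical.

```python
def parse_name_values(text):
    """Parse NAME=VALUE lines into a dict, splitting on the first '=' only."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and anything that is not a pair
        name, _, value = line.partition("=")
        pairs[name] = value
    return pairs


# Output excerpt as quoted in this thread.
SAMPLE_OUTPUT = """\
HOST=bzorp
MESSAGE=pam_unix(cron:session): session opened for user root by (uid=0)
PROGRAM=CRON
PID=7362
LEGACY_MSGHDR=CRON[7362]:
.classifier.class=unknown
"""

if __name__ == "__main__":
    record = parse_name_values(SAMPLE_OUTPUT)
    print(record["MESSAGE"])
```

Splitting on the first "=" only matters here because the MESSAGE value itself contains "uid=0".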
Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng FAQ: http://www.campin.net/syslog-ng/faq.html
Ok, it definitely shouldn't be adding fields before the parser. Did you do the test destination with $MSGONLY? That should show exactly what syslog-ng is passing to db-parser(). What does your t_default look like?

On Wed, Aug 4, 2010 at 2:08 AM, Matthew Hall <mhall@mhcomputing.net> wrote:
Hi Martin,
I think I've not explained my problem very well yet and I have to improve the explanation based on what you've been showing me so far. The part which confuses me is the header that seems to be getting prepended to the message:
Aug 1 00:00:00 <local1.notice> 172.16.0.2
I am pretty sure that syslog-ng must be adding this. I doubt it arrives that way off the socket based on the packet captures.
If that part is prepended before the pattern DB parser is invoked, and it is not present on the original messages off the socket, I'd say it's buggy behavior or should at least be configurable.

If that part is prepended later on, when it's time to send the message to the file writer, then I really want a way to see exactly what's passed to the parser so I can make the proper patterns. It seems silly that my patterns should need to reparse fields added by the daemon itself, especially since there could be several different output formats and I'd need separate patterns for each, even though the device only sent one type of message.
Regards, Matthew.
Brilliant advice Martin. It seems that this new output template has indeed resolved my issue. I will collect some new log files and rerun my patternizer.

172.16.0.1: 000001: Aug 1 00:00:00.000: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet0/1, changed state to up

Best Regards, Matthew.
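For raw lines of the shape Matthew reports from the $MSGONLY template, Martin's suggestion of anchoring on the colons can be tried out in a patternizing script. The sketch below assumes every line has exactly the five ": "-separated fields seen in that one sample; the function name split_raw and the field names are hypothetical.

```python
def split_raw(line):
    """Split a raw line on the first four ': ' anchors.

    Returns a dict of the five assumed fields, or None if the line has
    fewer anchors than expected.
    """
    parts = line.split(": ", 4)
    if len(parts) != 5:
        return None
    keys = ("host", "sequence", "timestamp", "program", "message")
    return dict(zip(keys, parts))


if __name__ == "__main__":
    raw = (
        "172.16.0.1: 000001: Aug 1 00:00:00.000: %LINEPROTO-5-UPDOWN: "
        "Line protocol on Interface FastEthernet0/1, changed state to up"
    )
    print(split_raw(raw))
```

The colons inside the timestamp are not followed by a space, so splitting on ": " rather than ":" keeps the timestamp in one piece.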
participants (3)
- Balazs Scheidler
- Martin Holste
- Matthew Hall