Hello,
Does anyone have an explanation for why a "pdbtool patternize" generated pattern db indicates it is version '3'? I'm running the latest version of syslog-ng (3.5.4.1) so I was expecting that this would produce a version '4' pattern db. Easy enough to change in the generated XML, just wondering why the latest generator wouldn't create the latest version.
Also, what is the nominal format for the log messages that the 'patternize' command is able to process (i.e., would this be logs that contain the nominally formatted syslog-ng output - e.g., via the default template: template("$ISODATE $HOST $MSGHDR$MSG\n");). I've seen some output that appears to suggest there's some nominal decoding of the input log messages.
Thanks, -David
Hi,
it seems that the version number is hardwired into the code of patternize, and hasn't been updated. Changing the version number is trivial, but I don't know if anything else should be updated.
Robert
On 04/16/2014 01:40 AM, David Hauck wrote:
______________________________________________________________________________
Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
FAQ: http://www.balabit.com/wiki/syslog-ng-faq
Hi David,
Robert is right, the pattern version is hardcoded. Taking a glimpse at the patterndb v3 and v4 XSDs, I think the update should indeed be trivial: the format is upwards compatible. I'll send a pull request for this change in a minute.
Regarding the formatting: it uses the parsing mechanism of syslog-ng internally. It works just as if you specified a file() source for syslog-ng with flags(syslog-protocol) added. You can also give "--no-parse" to the tool, which makes it parse logs just like a file() source with flags(no-parse). It wouldn't be too complicated to make it possible to use all available file source flags, but I never got around to doing it.
cheers, Peter
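Since the version is only an attribute on the root element and, per Péter, the v4 format is upwards compatible with v3, the manual fix David mentions could be scripted. The following is a minimal Python sketch, not part of pdbtool: the function name bump_patterndb_version and the SAMPLE document are made up for illustration, and the code only assumes that the generated file has a <patterndb> root element carrying a version attribute.

```python
import xml.etree.ElementTree as ET


def bump_patterndb_version(xml_text: str, new_version: str = "4") -> str:
    """Return xml_text with the root <patterndb> version attribute replaced."""
    root = ET.fromstring(xml_text)
    if root.tag != "patterndb":
        raise ValueError(f"expected <patterndb> root, got <{root.tag}>")
    root.set("version", new_version)
    return ET.tostring(root, encoding="unicode")


# Hypothetical stand-in for patternize output; a real file has more content.
SAMPLE = (
    "<patterndb version='3' pub_date='2014-04-16'>"
    "<ruleset name='patternize' id='example-id'><rules/></ruleset>"
    "</patterndb>"
)

if __name__ == "__main__":
    print(bump_patterndb_version(SAMPLE))
```

Note that ElementTree re-serializes the whole document, so comments and the original quoting style are not preserved; for a one-off change, editing the attribute by hand works just as well.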
Hi Péter,
Thanks to you and Robert for the extra information.
Cheers, -David
On Wednesday, April 16, 2014 3:13 AM, Péter Gyöngyösi <gyp@balabit.hu> wrote:
Hi Péter,
Another couple questions regarding 'patternize'.
Why does the 'patternize' output not include additionally relevant parts of the schema? In particular the 'program pattern' is not output as part of the result? It's my understanding that this is key matching criteria when determining matches and I'm unsure what would happen with the pattern db that contains rulesets with no program pattern specifiers (note: the documentation does talk about the matching behaviour when ${PROGRAM} is empty, but this is different - i.e., I assume rules with empty program patterns don't get matched/looked at when ${PROGRAM} is non-empty).
Also, where is the actual schema (the xsd file) that defines the pattern db format (and the semantics of each element)? I've found the admin guide documentation lacking in terms of explicit description of the pattern db format (the brief section that attempts to describe this is very thin).
Thanks, -David
On Wednesday, April 16, 2014 3:13 AM, Péter Gyöngyösi <gyp@balabit.hu> wrote:
Hi, On Wed, Apr 16, 2014 at 6:15 PM, David Hauck <davidh@netacquire.com> wrote:
Another couple questions regarding 'patternize'.
Why does the 'patternize' output not include additionally relevant parts of the schema? In particular the 'program pattern' is not output as part of the result? It's my understanding that this is key matching criteria when determining matches and I'm unsure what would happen with the pattern db that contains rulesets with no program pattern specifiers (note: the documentation does talk about the matching behaviour when ${PROGRAM} is empty, but this is different - i.e., I assume rules with empty program patterns don't get matched/looked at when ${PROGRAM} is non-empty).
That's because the clustering algorithm used within patternize itself does not take the program field into account, so including that in the pattern database would create erroneous results. It wouldn't be that difficult to update the algorithm to use the program field and only group logs together if they have the same value there, but I won't have time to get to it in the upcoming weeks. It's low-hanging fruit if you are willing to code; I am happy to help if you get stuck :)
If ${PROGRAM} is non-empty but there's no "program" entry defined in the pattern, the message does get matched. I am pretty sure that the patterns where the "program" entry is specified are stronger, but I am not 100% sure about that priority order. Actually, that's what happens if you run "pdbtool test" on an XML generated by patternize: as you can see, it contains examples in which the program field is set to the bogus "patternize" value manually, and the patterns match those examples nevertheless. Probably the documentation should be updated to describe that scenario, too.
Also, where is the actual schema (the xsd file) that defines the pattern db format (and the semantics of each element)? I've found the admin guide documentation lacking in terms of explicit description of the pattern db format (the brief section that attempts to describe this is very thin).
Well, a human-readable description can indeed never be as precise as a formal definition :) I don't know how the version you are using is packaged, but in the source tree these XSDs are in "/doc/xsd": https://github.com/balabit/syslog-ng/tree/master/doc/xsd
These are pretty well annotated XSDs which should be quite self-explanatory when it comes to the semantics, too.
greets, Peter
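Péter's suggestion above (only group logs together if they share the same program value) can be sketched roughly as follows. This is a hedged Python illustration, not the patternize implementation: group_by_program, cluster_per_program and toy_cluster are hypothetical names, the input is assumed to be already split into (program, message) pairs, and toy_cluster is a deliberately naive placeholder that merely groups messages by word count.

```python
from collections import defaultdict


def group_by_program(messages):
    """Split (program, text) pairs into one list of texts per program."""
    groups = defaultdict(list)
    for program, text in messages:
        groups[program].append(text)
    return dict(groups)


def cluster_per_program(messages, cluster):
    """Run the given clustering function separately for each program."""
    return {
        program: cluster(texts)
        for program, texts in group_by_program(messages).items()
    }


def toy_cluster(texts):
    """Placeholder clustering: messages with equal word count share a cluster."""
    by_length = defaultdict(list)
    for text in texts:
        by_length[len(text.split())].append(text)
    return list(by_length.values())


if __name__ == "__main__":
    logs = [
        ("sshd", "session opened for user root"),
        ("login", "session opened for user root"),
        ("sshd", "Connection closed"),
    ]
    for program, clusters in cluster_per_program(logs, toy_cluster).items():
        print(program, clusters)
```

With the clusters kept per program like this, each group could in principle be emitted as its own ruleset with a matching program pattern, which is the piece the generated output currently lacks.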
Hi Péter, On Wednesday, April 16, 2014 10:11 AM, syslog-ng-bounces@lists.balabit.hu wrote:
OK, I get the gist of all of the above and so my remaining question is then: "what's the point of the 'program pattern' in the ruleset definitions"?
Great, thx - I'll take a look at the XSDs (maybe it will help to clarify my remaining question above ;)).
Thanks for this, -David
Hi, On 16 Apr 2014 19:17, David Hauck <davidh@netacquire.com> wrote:
OK, I get the gist of all of the above and so my remaining question is then: "what's the point of the 'program pattern' in the ruleset definitions"?
It enables you to match similar messages with different $PROGRAM names. A good example is pam: the program can be any application using the authentication module e.g. sshd, vsftpd, login, etc. but the message is the same. Cheers
Hi, On Thursday, April 17, 2014 1:08 AM, Fabien Wernli wrote:
Sorry, I'm still not sure this clarifies it for me (and yes, the SSH example is a good one here). For example, if I have the following:
<patterndb ...>
  <ruleset ...>
    <pattern>sshd</pattern> <!-- this is the first 'program pattern' -->
    <rules> ... </rules>
  </ruleset>
  <ruleset ...>
    <pattern>login</pattern> <!-- this is the second 'program pattern' -->
    <rules> ... </rules>
  </ruleset>
I would expect only the rules defined in each 'program pattern' block would be inspected for a match given a particular 'program pattern' match against $PROGRAM. For example, incoming messages from 'sshd' would be compared against rules in the first ruleset (and not the second) and incoming messages from 'login' would be compared against rules in the second ruleset (and not the first).
Do I have this right?
Thanks, -David
On Thu, Apr 17, 2014 at 02:57:32PM +0000, David Hauck wrote:
I would expect only the rules defined in each 'program pattern' block would be inspected for a match given a particular 'program pattern' match against $PROGRAM. For example, incoming messages from 'sshd' would be compared against rules in the first ruleset (and not the second) and incoming messages from 'login' would be compared against rules in the second ruleset (and not the first).
Do I have this right?
Yes, you do. In my example, where many programs have the same logs, you could implement it the following way:
<ruleset ...>
  <patterns>
    <pattern>login</pattern>
    <pattern>sshd</pattern>
    <pattern>pam_afs</pattern>
    <pattern>vsftpd</pattern>
    ...
  </patterns>
  <rules>
    ... insert common rules but with specific examples here ...
  </rules>
</ruleset>
Hi Fabien, On Thursday, April 17, 2014 8:47 AM, you wrote:
Great, thanks for clarifying this. I'd asked this originally because I thought that I'd seen that this wasn't happening (I must have mistaken the result for something else). By extension then I guess that rulesets without 'program pattern' elements provide default rules for *any* incoming message with a non-zero $PROGRAM value (right?).
Cheers, -David
Hello David, On Thu, Apr 17, 2014 at 04:21:34PM +0000, David Hauck wrote:
Great, thanks for clarifying this. I'd asked this originally because I thought that I'd seen that this wasn't happening (I must have mistaken the result for something else). By extension then I guess that rulesets without 'program pattern' elements provide default rules for *any* incoming message with a non-zero $PROGRAM value (right?).
Quoting from the "syslog-ng-ose-v3.5-guide-admin", §13.5.3 "The syslog-ng pattern database format":
"If the <pattern> element of a ruleset is not specified, syslog-ng OSE will use this ruleset as a fallback ruleset: it will apply the ruleset to messages that have an empty PROGRAM header, or if none of the program patterns matched the PROGRAM header of the incoming message."
That seems pretty crystal clear to me :)
cheers
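The selection behaviour described in the quoted guide sentence, together with the multi-pattern ruleset from earlier in the thread, can be modelled in a few lines. This is a simplified Python sketch for illustration only: Ruleset and candidate_rulesets are hypothetical names, program patterns are compared by plain string equality rather than by syslog-ng's real pattern matcher, and the relative priority of program-specific and fallback rulesets (which Péter was unsure about) is not modelled beyond what the quoted sentence states.

```python
from dataclasses import dataclass, field


@dataclass
class Ruleset:
    name: str
    # Program patterns of this ruleset; an empty list marks a fallback ruleset.
    programs: list = field(default_factory=list)


def candidate_rulesets(rulesets, program):
    """Pick the rulesets whose rules would be tried for a given PROGRAM value."""
    matched = [r for r in rulesets if program and program in r.programs]
    if matched:
        return matched
    # Empty PROGRAM, or no program pattern matched: use the fallback rulesets.
    return [r for r in rulesets if not r.programs]


if __name__ == "__main__":
    db = [
        Ruleset("pam", ["login", "sshd", "pam_afs", "vsftpd"]),
        Ruleset("fallback"),
    ]
    for program in ("sshd", "cron", ""):
        names = [r.name for r in candidate_rulesets(db, program)]
        print(repr(program), "->", names)
```

A database generated by patternize has no program patterns at all, so in this model every message ends up in the fallback branch, which matches the "pdbtool test" observation earlier in the thread.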
Hi Fabien, On Friday, April 18, 2014 1:10 AM, you wrote:
Yes, of course :)! Not sure how I missed that, sorry for the noise ;)! Cheers, -David
Hi,
There's an issue on GH or BZ I opened some time ago, in case you missed it. https://github.com/balabit/syslog-ng/issues/109
Cheers
participants (5)
- David Hauck
- Fabien WERNLI
- Fabien Wernli
- Fekete Robert
- Péter Gyöngyösi