<div dir="ltr"><div>Hi,</div><br><div class="gmail_extra"><div class="gmail_quote">On Wed, Apr 16, 2014 at 6:15 PM, David Hauck <span dir="ltr"><<a href="mailto:davidh@netacquire.com" target="_blank">davidh@netacquire.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Another couple questions regarding 'patternize'.<br>
<br>
Why does the 'patternize' output not include additionally relevant parts of the schema? In particular the 'program pattern' is not output as part of the result? It's my understanding that this is key matching criteria when determining matches and I'm unsure what would happen with the pattern db that contains rulesets with no program pattern specifiers (note: the documentation does talk about the matching behaviour when ${PROGRAM} is empty, but this is different - i.e., I assume rules with empty program patterns don't get matched/looked at when ${PROGRAM} is non-empty).<br>
</blockquote><div><br></div><div>That's because the clustering algorithm used by patternize does not take the program field into account, so including it in the generated pattern database would produce erroneous results. It would not be difficult to update the algorithm to use the program field and only group logs together when they have the same value there, but I won't have time to get to it in the upcoming weeks. It's low-hanging fruit if you are willing to code, and I am happy to help if you get stuck :)</div>
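<div><br></div><div>To illustrate the idea, here is a rough Python sketch of grouping by program before clustering. It is not the patternize code (that lives in the C sources), and the clustering step is a deliberately simplified stand-in: a word is kept if it shows up at the same position in at least "support" messages of the same program, anything else becomes a wildcard. The function name and the "support" parameter are made up for this example.</div>

```python
from collections import Counter, defaultdict


def cluster_by_program(logs, support=2):
    """Group (program, message) pairs by program, then cluster each group.

    Simplified stand-in for the clustering done by patternize, not its
    real algorithm: a word is kept if it appears at the same position in
    at least `support` messages of the same program; every other word
    becomes the wildcard "*".
    """
    by_program = defaultdict(list)
    for program, message in logs:
        by_program[program].append(message.split())

    clusters = defaultdict(list)
    for program, messages in by_program.items():
        # Count (position, word) pairs only within this program's messages.
        counts = Counter(
            (pos, word) for words in messages for pos, word in enumerate(words)
        )
        for words in messages:
            pattern = tuple(
                word if counts[(pos, word)] >= support else "*"
                for pos, word in enumerate(words)
            )
            # The program is part of the cluster key, so identical patterns
            # coming from different programs are never merged.
            clusters[(program, pattern)].append(" ".join(words))
    return dict(clusters)
```

<div>Because the program is part of the cluster key, each resulting cluster has exactly one program value, which is what a generated rule would need in order to carry a meaningful program pattern.</div>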
<div><br></div><div>If ${PROGRAM} is non-empty but there's no "program" entry defined in the pattern, the message does get matched. I am pretty sure that patterns where the "program" entry is specified take precedence, but I am not 100% sure about that priority order. This is in fact what happens if you run "pdbtool test" on an XML generated by patternize: the file contains examples in which the program field is manually set to the bogus "patternize" value, and the patterns match those examples nevertheless. The documentation should probably be updated to describe that scenario, too.</div>
<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
Also, where is the actual schema (the xsd file) that defines the pattern db format (and the semantics of each element)? I've found the admin guide documentation lacking in terms of explicit description of the pattern db format (the brief section that attempts to describe this is very thin).<br>
</blockquote><div><br></div><div>Well, a human-readable description can indeed never be as precise as a formal definition :) I don't know how the version you are using is packaged, but in the source tree the XSDs are in "/doc/xsd": <a href="https://github.com/balabit/syslog-ng/tree/master/doc/xsd">https://github.com/balabit/syslog-ng/tree/master/doc/xsd</a> They are pretty well annotated, so they should be quite self-explanatory when it comes to the semantics, too.</div>
<div><br></div><div>greets,</div><div>Peter</div></div></div></div>