Hello,

As the patterndb project is starting to gain some momentum, I thought it was the right time to port my patternize tool to the new, plugin-based 3.2 codebase. This is the first step towards getting it integrated, and it also makes it possible to use the fancy new pdbtool features along with patternize. For those unfamiliar with it, patternize is an addition to pdbtool that automatically generates a pattern database from raw logs using statistical data clustering methods. You can read more about it in this blog post: http://gyp.blogs.balabit.com/2010/01/introducing-pdbtool-patternize/

Besides the port to the new codebase, it’s received some fixes and new features since my original post:

 * multiple small internal bugfixes to get rid of weird errors
 * added the option "--named-parsers" that names the found @ESTRING@s like ".dict.string0", ".dict.string1", ".dict.string2", …
 * Balint Kovacs has sent three contributions: support for reading the logfile from the standard input, escaping of special characters in the output, and examples in the XML that can be used for self-testing.
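
To illustrate the naming scheme of the named-parsers option, here is a minimal Python sketch. It is not patternize's actual code: the function name name_estrings, the sample pattern, and the assumption that unnamed parsers appear as "@ESTRING:: @" in the generated patterns are all made up for the example. Only the ".dict.string" prefix with a running number comes from the description above.

```python
import re

# Assumption: an unnamed parser looks like "@ESTRING::<delimiter>@",
# i.e. an ESTRING parser with an empty name field.
UNNAMED_ESTRING = re.compile(r"@ESTRING::([^@]*)@")


def name_estrings(pattern, prefix=".dict.string"):
    """Give every unnamed @ESTRING@ in `pattern` a numbered name.

    Names are built as prefix + running index, starting from 0,
    and the delimiter of each parser is kept unchanged.
    """
    counter = 0

    def replace(match):
        nonlocal counter
        named = f"@ESTRING:{prefix}{counter}:{match.group(1)}@"
        counter += 1
        return named

    return UNNAMED_ESTRING.sub(replace, pattern)


# Hypothetical pattern, made up for this illustration.
example = "Accepted password for @ESTRING:: @from @ESTRING:: @port @ESTRING:: @ssh2"
print(name_estrings(example))
```

Each parser keeps its delimiter and only gains a name, so the matched values can be referred to later by name instead of being dropped.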

It can be found in my public syslog-ng 3.2 tree: http://git.balabit.hu/?p=gyp/syslog-ng-3.2.git;a=summary

If you're already using it (I've received some feedback, so I guess some of you are), please note that fixes and new features will most probably land in this 3.2-based branch from now on.

It has only received a basic sanity check, although the unit tests do pass, so as usual, handle it with care. All feedback is welcome.

greets,
Peter

ps.: the branch also contains a patch that fixes a wrong section name in pdbtool's man page. I'll also try to update the whole manpage a bit soon, when adding a section for patternize. Bazsi, you might want to pull those to the mainline.