On Mon, 2010-03-22 at 12:54 -0500, Martin Holste wrote:
That's definitely some heavy-duty regex. You'd be a good candidate for the pattern db, as the pattern matching engine is orders of magnitude faster than PCRE because it uses trie-based pattern searching. It also allows for extracting the matches and using them in the output macros, so you wouldn't have to sacrifice any functionality. I would estimate that it would drop your CPU usage down to around 25-30% while doing all of the work in a single thread.
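To illustrate why a trie lookup can beat running many regexes in sequence, here is a minimal Python sketch. It is not the pattern db's actual implementation or syntax: the `PatternTrie` class, the `<name>` placeholder notation, and the rule names are all made up for this example. Patterns that share a prefix share trie nodes, so the cost of matching a message follows the message length rather than the number of rules.

```python
class TrieNode:
    def __init__(self):
        self.literal = {}    # exact token -> child node
        self.capture = None  # (field name, child node) for a <name> placeholder
        self.rule = None     # rule id if a pattern ends at this node


class PatternTrie:
    """Token-level trie; hypothetical <name> placeholders capture one token."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, rule, pattern):
        node = self.root
        for token in pattern.split():
            if token.startswith("<") and token.endswith(">"):
                # Simplification: one capture slot per node, first name wins.
                if node.capture is None:
                    node.capture = (token[1:-1], TrieNode())
                node = node.capture[1]
            else:
                node = node.literal.setdefault(token, TrieNode())
        node.rule = rule

    def match(self, message):
        return self._walk(self.root, message.split(), 0, {})

    def _walk(self, node, tokens, i, fields):
        if i == len(tokens):
            return (node.rule, fields) if node.rule is not None else None
        # Prefer an exact literal edge, then fall back to the capture edge.
        child = node.literal.get(tokens[i])
        if child is not None:
            hit = self._walk(child, tokens, i + 1, fields)
            if hit is not None:
                return hit
        if node.capture is not None:
            name, child = node.capture
            return self._walk(child, tokens, i + 1, {**fields, name: tokens[i]})
        return None


trie = PatternTrie()
trie.add("ssh_accept", "Accepted password for <user> from <ip>")
trie.add("ssh_fail", "Failed password for <user> from <ip>")
print(trie.match("Failed password for root from 10.0.0.5"))
# ('ssh_fail', {'user': 'root', 'ip': '10.0.0.5'})
```

The captured fields play the role of the extracted matches mentioned above: a real engine would expose them as values usable in output templates.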
@Balabit: You know what goes great with pattern matching? CUDA support with Nvidia cards for GPU-based pattern matching acceleration. There is preliminary support for it in the Open Information Security Foundation's (OISF) Suricata IDS engine. That project is GPL, so you could port most of that code directly into the pattern db matcher for the OSE version of syslog-ng. $500 USD will buy a GPU with 480 stream processors, so you could match 480 patterns simultaneously, per card. You can link up to four cards together, so you could match 4 x 480 = 1920 patterns in parallel, offloaded from the CPU, on commodity hardware. So, a server costing under $5,000 could probably process (maybe not store) 250,000+ messages per second. Even if there weren't much of a speed increase, the CPU offload alone would probably be worth it for busy log servers.
Thanks for the hint. I've downloaded the docs and I'll definitely look into it.

--
Bazsi