[syslog-ng] Buffering AF_UNIX Destination, Batch Post Processing Messages

Matthew Hall mhall at mhcomputing.net
Wed Sep 8 03:05:26 CEST 2010


Hello All,

I want to configure an AF_UNIX SOCK_DGRAM syslog-ng destination which 
sends certain log messages to an external program for further processing 
and analysis. This program should batch up the messages into 60 second 
batches for processing.

Currently I am running into an architectural challenge: how should I 
process each 60-second batch without slowing down the select() loop that 
collects the messages from the destination?

In the past, when creating a similar kind of application in Java, I 
handled this by creating a huge dynamic array to store objects created 
from each incoming message, then passed the array reference to another 
background thread for processing, and began building a new array in the 
select thread.
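
For illustration, here is a minimal sketch of that array hand-off. It is 
written in Python rather than Java or Perl purely for brevity, and the 
names BatchCollector, add, rotate, close and process_batch are 
hypothetical, not taken from any existing program. It assumes a single 
receiving thread that calls add() and rotate().

```python
import queue
import threading


class BatchCollector:
    """Fill one list in the receiving thread; hand full lists to a worker."""

    def __init__(self, process_batch):
        self._current = []               # batch being filled right now
        self._handoff = queue.Queue()    # finished batches awaiting processing
        self._process_batch = process_batch
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def add(self, message):
        # Called from the receiving (select) thread for every message.
        self._current.append(message)

    def rotate(self):
        # Swap in a fresh list and pass the old one by reference;
        # the messages themselves are not copied.
        batch, self._current = self._current, []
        self._handoff.put(batch)

    def close(self):
        # Flush the last partial batch, then stop the worker.
        self.rotate()
        self._handoff.put(None)
        self._worker.join()

    def _run(self):
        # Worker thread: process batches in the order they were handed off.
        while True:
            batch = self._handoff.get()
            if batch is None:
                break
            self._process_batch(batch)
```

The receiving thread would call rotate() once per 60-second interval, so 
slow processing only delays the worker, never the receive loop.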

Currently I am trying to solve this same basic problem in Perl, which 
has poor threading support. I am investigating a few different options:

* use threads anyway-- not recommended by more expert Perl devs I asked

* prefork a process which listens to the AF_UNIX socket from syslog-ng, 
and writes to some kind of buffered non-blocking pipe with a really big 
buffer-- not sure if such a pipe device actually exists, many pipes 
block when full

* postfork a worker process which handles the 60-second batch-- the 
problem here is that a whole lot of long-term state data has to be 
maintained between batches to help separate the needles from the 
haystacks, and you could get weird behavior from the duplicated FDs 
which are copied into the child while they are still being select()ed 
in the parent process.

* use some kind of message or job queue to copy things from a producer 
process to a consumer process-- gearman, theschwartz, beanstalk, 
rabbitmq, activemq, and poe::component::mq have been suggested-- this 
would probably cause a lot of context switching and unwanted buffer copies

* see if there is something existing in syslog-ng that can help with 
this situation. Can it somehow be convinced to buffer messages 
internally while my process is busy with a 60-second batch, or to send 
them in 60-second batches? Whatever other clever ideas people can dream 
up are welcome too.
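
Whichever option is chosen, the receiving side needs a loop that cuts a 
batch at each deadline without ever waiting on the processing. Below is a 
rough sketch of such a loop, in Python rather than Perl for brevity. The 
function receive_batches and its parameters (interval, hand_off, 
max_batches) are hypothetical names, and hand_off is assumed to return 
quickly, for example by queueing the batch for another thread or process.

```python
import select
import socket
import time


def receive_batches(sock, interval, hand_off, max_batches=None):
    """Read datagrams from sock; call hand_off(batch) every interval seconds."""
    batch = []
    deadline = time.monotonic() + interval
    emitted = 0
    while max_batches is None or emitted < max_batches:
        # Wait for a datagram, but never past the end of the current interval.
        timeout = max(0.0, deadline - time.monotonic())
        readable, _, _ = select.select([sock], [], [], timeout)
        if readable:
            batch.append(sock.recv(65536))
        if time.monotonic() >= deadline:
            # hand_off must not block, or the socket buffer will back up.
            hand_off(batch)
            batch = []
            deadline += interval
            emitted += 1
```

For the real destination, sock would be an AF_UNIX SOCK_DGRAM socket bound 
to the path that syslog-ng writes to, interval would be 60, and 
max_batches would be left as None so the loop runs indefinitely.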

Is this a problem other people have dealt with before? What did you do 
about this one? I want to get this right and avoid making a big mess or 
reinventing the wheel.

Matthew.

