feature request (parallel processing)
So, after months of work, we finally turned on our production environment for syslog collection. However, we hit one immediate snag. Currently we're writing to the database, and the way the database works is that it collects enough data to fill a single block, and then it flushes that block out. Every time it goes to flush the block out, the insert takes an extra couple of milliseconds. When I'm doing about 220,000 inserts a second, that millisecond delay is significant. So basically syslog has to pause on that log statement while it waits for the database to flush. (1 out of 10 messages was getting dropped.)

Now I tried to solve this by writing multiple destination drivers so that a second database thread could be processing while the first was flushing, but that didn't work, as it appears syslog waits for the destination driver to complete before it hands data off to the second driver.

Instead I managed to solve the problem by creating yet more syslog processes. So basically the master process listens for data from all the hosts. It then runs a match on the $PID and sends all even-numbered PIDs to one syslog process, and all odd-numbered PIDs to a second syslog process. This way both processes can be inserting into the database at the same time. It effectively cuts the amount of work each database thread does in half, so that when it has to pause to flush, it doesn't cause the syslog buffer to fill up.

Ultimately my request is this: allow multiple destination drivers to work at the same time. I realize this is probably not a simple change, but it seems like it would be a significant speed enhancement.
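A rough sketch of the front-end split described above (not the actual configuration from this setup): the master instance listens for everything, filters on the last digit of ${PID}, and forwards each half to a separate worker syslog-ng process that does the database inserts. The ports, addresses and names here are illustrative assumptions.

# Front-end instance: receive from all hosts, split on the last digit of ${PID}.
source s_net { udp(ip(0.0.0.0) port(514)); };

filter f_pid_even { match("[02468]$" value("PID")); };
filter f_pid_odd  { match("[13579]$" value("PID")); };

# Assumed: each worker is a separate syslog-ng process listening on a local
# port (5141/5142 are placeholders) and holding its own database connection.
destination d_worker_even { tcp("127.0.0.1" port(5141)); };
destination d_worker_odd  { tcp("127.0.0.1" port(5142)); };

log { source(s_net); filter(f_pid_even); destination(d_worker_even); };
log { source(s_net); filter(f_pid_odd);  destination(d_worker_odd);  };

# Messages without a PID match neither filter; a real setup would need a
# catch-all log path for those.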
What backend database were you using to get a single box to do 220k inserts/sec sustained? The fastest I've ever seen is a little over 100k/sec with LOAD DATA INFILE in MySQL, though I haven't used particularly beefy boxes. If your tablespace is RAM-based, I guess I could believe that, but that's a lot of RAM to allocate to long-term log storage. In my setups, I write files out to disk (via a Perl program) and then do an import of the data file, which is the fastest method I've seen so far. What method are you using?

On Thu, Sep 2, 2010 at 7:01 PM, <syslogng@feystorm.net> wrote:
> ...
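As a rough illustration of the file-then-bulk-import approach Martin describes (his actual pipeline goes through a Perl program), a syslog-ng file destination can write delimiter-separated lines for a loader to pick up later. The path, field list, table name and column names below are placeholders, and s_net stands for whatever network source is in use:

# Write tab-separated records, rotated hourly, suitable for bulk loading.
destination d_bulkfile {
    file("/var/log/bulk/msgs-${YEAR}${MONTH}${DAY}${HOUR}.tsv"
         template("${ISODATE}\t${HOST}\t${PROGRAM}\t${PID}\t${MSGONLY}\n")
         template_escape(no));
};
log { source(s_net); destination(d_bulkfile); };

# Once a file has rotated out, import it with something along these lines
# (MySQL shown; table and column names are made up):
#   LOAD DATA INFILE '/var/log/bulk/msgs-2010090219.tsv' INTO TABLE logs
#   FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
#   (log_time, host, program, pid, msg);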
Ditto, I'd really like to know how you're getting that rate, please share :-)

______________________________________________________________
Clayton Dukes
______________________________________________________________

On Thu, Sep 2, 2010 at 8:22 PM, Martin Holste <mcholste@gmail.com> wrote:
> ...
Enterprise-level equipment: Oracle 11g on an HP DL360 backed by an EMC SAN array over Fibre Channel.

Sent: Thursday, September 02, 2010 7:00:41 PM
From: Clayton Dukes <cdukes@gmail.com>
> ...
I'm on an IBM BladeCenter (granted, an older 4-core 2.2 GHz) with an EMC SAN, running MySQL 5.1 with a large bulk_insert_buffer, and bulk loading is only around 100k/sec, which is about the maximum you get writing raw text to a filehandle (because that's pretty much what it's doing). You're saying Oracle 11g on one mid-range server will do a sustained 220k/sec for hours at a time?

On Thu, Sep 2, 2010 at 8:04 PM, <syslogng@feystorm.net> wrote:
> ...
No, we don't usually do that for hours at a time. Our normal rate is probably half that, but we'll frequently burst up to that for periods of 5-20 minutes (we're an email provider; these are incoming and outgoing email logs).

Sent: Thursday, September 02, 2010 7:21:07 PM
From: Martin Holste <mcholste@gmail.com>
> ...
On Thu, 2010-09-02 at 18:01 -0600, syslogng@feystorm.net wrote:
...
Ultimately my request is this: allow multiple destination drivers to work at the same time. I realize this is probably not a simple change, but it seems like it would be a significant speed enhancement.
All SQL destinations were running in the same thread, that's right. But this was changed in 3.2, where each SQL destination gets a dedicated thread.

At least this is what you are after, right?

-- Bazsi
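If 3.2 works that way, the even/odd split from earlier in the thread could presumably collapse into a single instance with two sql() destinations, each writing on its own thread. This is only a sketch: the Oracle connection details, table name and column names are placeholders, and f_pid_even/f_pid_odd are the PID filters from the earlier sketch.

destination d_sql_even {
    sql(type(oracle) host("dbhost") port("1521")
        username("syslog") password("secret") database("LOGS")
        table("messages")
        columns("log_time", "host", "program", "pid", "msg")
        values("${ISODATE}", "${HOST}", "${PROGRAM}", "${PID}", "${MSGONLY}"));
};

# A second, identical destination: with a dedicated thread per SQL destination,
# one connection can keep inserting while the other sits in a block flush.
destination d_sql_odd {
    sql(type(oracle) host("dbhost") port("1521")
        username("syslog") password("secret") database("LOGS")
        table("messages")
        columns("log_time", "host", "program", "pid", "msg")
        values("${ISODATE}", "${HOST}", "${PROGRAM}", "${PID}", "${MSGONLY}"));
};

log { source(s_net); filter(f_pid_even); destination(d_sql_even); };
log { source(s_net); filter(f_pid_odd);  destination(d_sql_odd);  };

Whether two connections into the same table actually avoid the flush stall is an Oracle-side question; the sketch only shows the syslog-ng wiring.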
Sent: Friday, September 03, 2010 5:15:45 AM
From: Balazs Scheidler <bazsi@balabit.hu>
On Thu, 2010-09-02 at 18:01 -0600, syslogng@feystorm.net wrote:
...
Ultimately my request is this: allow multiple destination drivers to work at the same time. I realize this is probably not a simple change, but it seems like it would be a significant speed enhancement.
All SQL destinations were running in the same thread, that's right. But this was changed in 3.2, where each SQL destination gets a dedicated thread.
At least this is what you are after, right?
Yup. Sounds like we'll be switching to 3.2 as soon as it's available.
participants (4)
- Balazs Scheidler
- Clayton Dukes
- Martin Holste
- syslogng@feystorm.net