[syslog-ng] sql scaling commits idea

Patrick Hemmer syslogng at stormcloud9.net
Sun Jan 22 01:12:52 CET 2012


The point of the idea wasn't to get more performance out of the database; 
that can already be done by increasing the number of inserts per 
transaction. The point was to use as few inserts per transaction as 
possible without overwhelming the database.

-Patrick


Sent: Sat Jan 21 2012 18:48:36 GMT-0500 (EST)
From: Martin Holste <mcholste at gmail.com>
To: Syslog-ng users' and developers' mailing list 
<syslog-ng at lists.balabit.hu>
Subject: Re: [syslog-ng] sql scaling commits idea
> In general, you should not expect SQL servers to be able to handle
> more than 3-5k inserts per second, because the CPU time spent parsing
> each query becomes the bottleneck.
>
> There are two ways to drastically improve insert efficiency to achieve
> better than 5k inserts/sec:
>
> Multi-value inserts:
> INSERT INTO table (col1,col2) VALUES (val1,val2), (val3,val4), ... (valM, valN)
> This will bundle all inserts into a single transaction and provides
> the fastest possible insert for multiple rows.  Batches of 1000 work
> very well for busy servers.
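[As a concrete sketch of the multi-value form, here is a minimal Python
example. sqlite3 stands in for MySQL purely so the snippet is self-contained
(the VALUES (...),(...) syntax is the same); the "logs" table and its columns
are invented for illustration.]

```python
import sqlite3

# sqlite3 stands in for MySQL here; the multi-row VALUES syntax is the
# same. Table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (host TEXT, msg TEXT)")

rows = [("web1", "started"), ("web2", "stopped"), ("db1", "restarted")]

# One INSERT with a (?,?) group per row: one parse, one statement,
# instead of one round trip per row.
placeholders = ",".join(["(?,?)"] * len(rows))
sql = "INSERT INTO logs (host, msg) VALUES " + placeholders
flat = [value for row in rows for value in row]
conn.execute(sql, flat)
conn.commit()

print(conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0])  # prints 3
```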
>
> Batch loads:
> LOAD DATA INFILE
> This is the absolute fastest mechanism possible and will yield 100k
> rows/sec for most MySQL servers.  PgSQL has an equivalent (COPY) as
> does MS-SQL (BULK INSERT).
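[One way to stage rows for such a bulk load might look like the sketch
below. The "logs" table name and tab delimiter are assumptions, and the
LOAD DATA statement is only built as a string, not executed, since that
would require a live MySQL server.]

```python
import csv
import tempfile

# Sketch: stage rows into a tab-delimited temp file, then hand the file
# to MySQL's LOAD DATA INFILE (PgSQL COPY and MS-SQL BULK INSERT are
# analogous). The "logs" table name is illustrative only.
rows = [("web1", "started"), ("web2", "stopped")]

with tempfile.NamedTemporaryFile("w", suffix=".tsv",
                                 delete=False, newline="") as f:
    csv.writer(f, delimiter="\t", lineterminator="\n").writerows(rows)
    path = f.name

# The statement a loader script would issue (not executed here):
sql = ("LOAD DATA INFILE '%s' INTO TABLE logs "
       "FIELDS TERMINATED BY '\\t'" % path)
print(sql)
```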
>
> An easy way to achieve either is to use the program() destination and
> have a small script handle the batching for you.  This is essentially
> what I use for ELSA.
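[A small batching helper for the program() destination could be sketched
like this. The flush() body is a stand-in; a real script would issue a
multi-value INSERT or write a LOAD DATA staging file there, and the batch
size is an arbitrary example value.]

```python
import io
import sys

BATCH_SIZE = 3  # per-flush row limit; tune for your server

def flush(batch):
    # Stand-in for a real multi-value INSERT of the batch.
    if batch:
        print("flushing %d rows" % len(batch))

def run(stream):
    """Read one log message per line, flushing every BATCH_SIZE lines
    and once more for the final partial batch at EOF."""
    batch = []
    for line in stream:
        batch.append(line.rstrip("\n"))
        if len(batch) >= BATCH_SIZE:
            flush(batch)
            batch = []
    flush(batch)

# Demo with an in-memory stream; a real helper would call run(sys.stdin),
# since syslog-ng feeds a program() destination one message per line.
run(io.StringIO("msg1\nmsg2\nmsg3\nmsg4\n"))
# prints: flushing 3 rows
#         flushing 1 rows
```

[syslog-ng would launch such a helper with something along the lines of
program("/usr/local/bin/batcher.py") in the destination block; the path is
hypothetical.]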
>
> On Sat, Jan 21, 2012 at 12:11 PM, Patrick Hemmer
> <syslogng at stormcloud9.net>  wrote:
>> A technique from an in-house application I wrote -- one that writes
>> log entries to a database -- would, I think, be very useful for
>> syslog-ng in high-performance setups.
>>
>> The idea is basically to commit as frequently as the backend database
>> can handle: you commit whenever your queue is completely empty. With a
>> low volume of logs coming in, it commits after every log entry. That
>> is quite demanding on the database at high volumes, so as the database
>> loses the ability to keep up, the queue in syslog-ng starts to build
>> and commits become less frequent. As more log entries go into each
>> commit/transaction, database performance increases. You end up writing
>> to the database as fast as the database can handle, with as little
>> data loss as possible were syslog-ng to die somehow. You would also
>> need some sort of max-transaction-size so you don't end up with too
>> many uncommitted log entries if the database is being really slow.
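[The policy described above can be sketched as follows. insert() and
commit() are placeholders for real database calls, MAX_TXN_SIZE is the
suggested cap, and a single drain pass over a pre-filled queue stands in
for messages arriving continuously.]

```python
from collections import deque

MAX_TXN_SIZE = 1000  # cap on uncommitted rows, per the idea above

def drain(queue, insert, commit):
    """Insert everything queued, committing whenever the queue runs
    empty (low volume -> one commit per entry as messages trickle in)
    or the open transaction reaches MAX_TXN_SIZE (very slow database)."""
    pending = 0
    while queue:
        insert(queue.popleft())
        pending += 1
        if not queue or pending >= MAX_TXN_SIZE:
            commit()
            pending = 0

# Demo: a backlog of 2500 queued messages yields 3 commits
# (1000 + 1000 + 500) instead of 2500 one-row transactions.
commits = []
drain(deque(range(2500)), insert=lambda row: None,
      commit=lambda: commits.append(1))
print(len(commits))  # prints 3
```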
>>
>> As mentioned, I've been using this technique in our in-house
>> application for a couple of years now and it performs beautifully:
>> very low latency between syslog-ng receiving a message and it being
>> available (to a SELECT query) in the database, without overwhelming
>> the database.
>>
>>
>> In theory you could use the same principle for file destinations as
>> well, but I don't think it'd be as useful there.
>>
>>
>> But anyway, just an idea :-)
>> ______________________________________________________________________________
>> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
>> Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
>> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>>

