[syslog-ng] sql scaling commits idea

Patrick Hemmer syslogng at stormcloud9.net
Tue Jan 24 03:43:11 CET 2012


Not necessarily, especially if you have more CPU than disk bandwidth, 
and especially especially (yes, two especiallys :-P ) in Oracle. Oracle 
has the ability to recognize an insert statement that is the same as a 
previous insert, just with different values. Thus it doesn't have to 
redo permission checking, table lookups, etc.
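
For illustration, a minimal sketch of what keeps the statement text 
identical across inserts (this assumes the python-oracledb driver and a 
made-up "logs" table, not our actual code):

    import oracledb  # assumption: python-oracledb driver

    conn = oracledb.connect(user="app", password="secret",
                            dsn="dbhost/orcl")
    cur = conn.cursor()
    batch = [("web1", "login ok"), ("web2", "disk full")]
    # Bind variables keep the SQL text byte-identical on every call, so
    # Oracle can reuse the already-parsed cursor (a "soft parse")
    # instead of re-parsing and re-checking permissions each time.
    for host, msg in batch:
        cur.execute("INSERT INTO logs (host, msg) VALUES (:1, :2)",
                    (host, msg))
    conn.commit()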

As for how many inserts per transaction we use: it varies, a lot. Off 
hours it can be around 2 or 3 inserts per transaction. During peak it'll 
be around 50 or so. We prefer not to lose any messages; that's why we 
use this method. Otherwise we'd have to hard-set it at 100 or something 
higher than we'll ever need to handle peak times. That said, the 
application does keep a 1-minute buffer on disk just in case the s**t 
hits the fan (but we've never had that happen).

-Patrick

Sent: Mon Jan 23 2012 09:44:20 GMT-0500 (EST)
From: Martin Holste <mcholste at gmail.com>
To: Patrick Hemmer <syslogng at stormcloud9.net>, "Syslog-ng users' and 
developers' mailing list" <syslog-ng at lists.balabit.hu>
Subject: Re: [syslog-ng] sql scaling commits idea
> You described backing off the commits to increase volume if the
> database starts to get overwhelmed, and I was trying to point out that
> it's not really the commits that are the bottleneck at high volumes, it's
> the parsing itself.  Either way, you raise a good point that a dynamic
> interval for commits would be beneficial, especially in low-volume
> situations.
>
> In your in-house app, what's the optimal queue length?  Is the data
> considered somewhat trivial such that your org is willing to lose
> some, or do you have another mechanism in place to ensure fault
> tolerance?
>
> On Sat, Jan 21, 2012 at 6:12 PM, Patrick Hemmer
> <syslogng at stormcloud9.net>  wrote:
>> The point of the idea wasn't to get more performance out of the database;
>> that can be done already by increasing the number of inserts per
>> transaction. The point was to have as few inserts per transaction as
>> possible without overwhelming the database.
>>
>> -Patrick
>>
>>
>> Sent: Sat Jan 21 2012 18:48:36 GMT-0500 (EST)
>> From: Martin Holste <mcholste at gmail.com>
>> To: Syslog-ng users' and developers' mailing list
>> <syslog-ng at lists.balabit.hu>
>> Subject: Re: [syslog-ng] sql scaling commits idea
>>
>>> In general, you should not expect SQL servers to be able to handle
>>> more than 3-5k inserts per second, because the CPU time spent parsing
>>> each query becomes the bottleneck.
>>>
>>> There are two ways to drastically improve insert efficiency to achieve
>>> better than 5k inserts/sec:
>>>
>>> Multi-value inserts:
>>> INSERT INTO table (col1,col2) VALUES (val1,val2), (val3,val4), ..., (valM,
>>> valN)
>>> This bundles all of the rows into a single statement (and thus a single
>>> transaction) and gives the fastest possible insert for multiple rows.
>>> Batches of 1000 work very well for busy servers.
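>>>
>>> For instance, a quick sketch of building such a statement in Python
>>> (driver, table, and column names are illustrative only; I'm using the
>>> MySQLdb %s paramstyle here):
>>>
>>>     import MySQLdb  # assumption: MySQLdb/mysqlclient driver
>>>
>>>     conn = MySQLdb.connect(db="logs")
>>>     cur = conn.cursor()
>>>     rows = [("web1", "login ok"), ("web2", "disk full")]
>>>     # One statement, one transaction: expand a (%s,%s) group per row
>>>     # and flatten the values to match the placeholders.
>>>     placeholders = ",".join(["(%s,%s)"] * len(rows))
>>>     flat = [v for row in rows for v in row]
>>>     cur.execute("INSERT INTO entries (host,msg) VALUES " + placeholders,
>>>                 flat)
>>>     conn.commit()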
>>>
>>> Batch loads:
>>> LOAD DATA INFILE
>>> This is the absolute fastest mechanism possible and will yield 100k
>>> rows/sec for most MySQL servers.  PgSQL has an equivalent (COPY) as
>>> does MS-SQL (BULK INSERT).
>>>
>>> An easy way to achieve either is to use the program() destination and
>>> have a small script handle the batching for you.  This is essentially
>>> what I use for ELSA.
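>>>
>>> For example, the skeleton of such a script (field layout and table
>>> name are hypothetical; program() feeds it one formatted log message
>>> per line on stdin):
>>>
>>>     #!/usr/bin/env python
>>>     import sys
>>>     import MySQLdb  # assumption: MySQLdb driver, made-up "entries" table
>>>
>>>     conn = MySQLdb.connect(db="logs")
>>>     cur = conn.cursor()
>>>     batch = []
>>>     for line in sys.stdin:
>>>         batch.append((line.rstrip("\n"),))
>>>         if len(batch) >= 1000:  # batches of ~1000, as noted above
>>>             cur.executemany("INSERT INTO entries (msg) VALUES (%s)",
>>>                             batch)
>>>             conn.commit()
>>>             batch = []
>>>     if batch:  # flush the remainder when syslog-ng closes the pipe
>>>         cur.executemany("INSERT INTO entries (msg) VALUES (%s)", batch)
>>>         conn.commit()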
>>>
>>> On Sat, Jan 21, 2012 at 12:11 PM, Patrick Hemmer
>>> <syslogng at stormcloud9.net>    wrote:
>>>> A technique we use in an in-house application I wrote -- one that
>>>> writes log entries to a database -- would, I think, be very useful for
>>>> syslog-ng in high-performance setups.
>>>>
>>>> The idea is basically to commit as frequently as the backend database
>>>> can handle. The way it works is that you commit whenever your queue is
>>>> completely empty. So if you have a low volume of logs coming in, it
>>>> would commit after every log entry. This is quite demanding on the
>>>> database for high volumes, so as the database loses the ability to
>>>> keep up, the queue in syslog-ng starts to build and syslog-ng starts
>>>> to commit less often. With more log entries per commit/transaction,
>>>> performance on the database side increases. So you end up writing to
>>>> the database as fast as the database can handle, with as little data
>>>> loss as possible were syslog-ng to die somehow. You would also need
>>>> some sort of max-transaction-size so you don't end up with too many
>>>> uncommitted log entries if the database is being really, really slow.
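>>>>
>>>> A rough sketch of the loop, to make the idea concrete (made-up queue
>>>> and DB-API handles with the %s paramstyle, not our actual
>>>> implementation):
>>>>
>>>>     import queue
>>>>
>>>>     MAX_TXN = 1000  # the max-transaction-size safety cap
>>>>
>>>>     def writer(q, conn, cur):
>>>>         pending = 0
>>>>         while True:
>>>>             msg = q.get()  # q is a queue.Queue fed by the receiver
>>>>             cur.execute("INSERT INTO entries (msg) VALUES (%s)",
>>>>                         (msg,))
>>>>             pending += 1
>>>>             # Commit when the queue drains (or the cap is hit). At
>>>>             # low volume that's one entry per commit; as the DB falls
>>>>             # behind, the queue grows and more rows share each commit.
>>>>             if q.empty() or pending >= MAX_TXN:
>>>>                 conn.commit()
>>>>                 pending = 0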
>>>>
>>>> As mentioned, I've been using this technique in our in-house
>>>> application for a couple of years now and it performs beautifully:
>>>> very low latency between syslog-ng receiving the message and it being
>>>> available (to a SELECT query) in the database, without overwhelming
>>>> the database.
>>>>
>>>>
>>>> In theory you could also use this same principle for file destinations,
>>>> but I don't think it'd be as useful there.
>>>>
>>>>
>>>> But anyway, just an idea :-)
>>>>

