[syslog-ng] MongoDB destination driver

Gergely Nagy algernon at madhouse-project.org
Sat Jan 1 15:30:41 CET 2011


A little update on the state of the driver: last night, I arrived at a
state where I consider it good enough for my own purposes (I'm already
using it in production). Today I did some benchmarking (completely
unscientific, mind you) to see if and where I can improve the driver.

A standard setup, logging to a file, resulted in 24k messages/sec; we'll
use that for comparison. Logging the same data to a capped (at 1000
messages) mongodb collection netted 18k messages/sec, while logging to
an uncapped and unindexed mongodb collection got around 13k
messages/sec.
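
To put those figures side by side, here is a quick back-of-the-envelope
calculation. It is only arithmetic on the numbers quoted above, written
in Python for illustration; it is not part of the driver or of the
benchmark itself, and the dictionary keys are just labels I made up.

```python
# Throughput figures quoted above, in messages per second.
rates = {
    "file": 24_000,
    "mongodb (capped)": 18_000,
    "mongodb (uncapped, unindexed)": 13_000,
}


def relative_to_file(rates):
    """Express every rate as a percentage of the file baseline."""
    baseline = rates["file"]
    return {name: round(100 * rate / baseline, 1) for name, rate in rates.items()}


relative = relative_to_file(rates)
for name, pct in relative.items():
    print(f"{name}: {pct}% of file throughput")
```

So the capped collection runs at about three quarters of the file
destination's speed, and the uncapped one at a bit more than half.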

All tests were run on the same computer, using the same loggen
command line; the only change was the destination in the syslog-ng
config. Each test ran for 10 minutes.

The numbers could probably be upped with suitable configuration and a
more appropriate test environment, but I'm not really into that stuff;
the current performance fits my needs perfectly well.

I haven't tested an SQL destination, but my gut feeling is, mongodb's
a lot faster already.

And there are obviously a lot of cases I haven't tested: query speed
while writes are flowing in, how indexing affects it all, and so on,
since those scenarios are either not part of my use case, or I don't
feel knowledgeable enough to draw the proper conclusions. I'll let
someone else do proper benchmarking, I'll stick to coding :)

Now, the next thing I explored is whether I can speed things up easily.
To that end, I had a look at callgrind's output, and concluded that
most of the CPU time is spent outside of the mongodb driver. Speeding
up the driver would be possible, but it'd need some nasty tricks I'm
not too keen on implementing.

For the record, most of the time was spent in template resolution
(resolving the collection name and the values to log) - there's not
much I can do to speed those up.

Another way to speed things up, especially when network speed starts
to matter, would be to push the syslog-ng<->mongodb communication into
a writer thread, much like the SQL driver is doing.

I attempted to do that, but ran into a few blocking problems: the
original idea was to collect a set amount of log messages and insert
them in bulk. MongoDB has support for this, so that part is trivial.
The problem is that even with bulk insert, I can only insert into a
single collection at a time.

Since the collection name can contain macros, in order to do bulk
inserts, I'd have to store the queue on a per-collection basis, and
that would make things trickier, and more than likely would negate all
the benefits of inserting in bulk. It would also be an option to
disable macro support in collection(), but that has quite a few
negative consequences, and I really like this functionality anyway.
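
To illustrate why macros in the collection name complicate bulk
inserts, here is a minimal sketch of the per-collection queueing
described above. It is Python rather than driver code, and everything
in it is an assumption made for illustration: group_by_collection() is
a hypothetical helper, messages are plain dicts, and Python's
string.Template stands in for syslog-ng's template resolution.

```python
from collections import defaultdict
from string import Template


def group_by_collection(collection_template, messages):
    """Resolve the collection name per message and batch messages by it.

    Each resulting batch could then go out as one bulk insert, but only
    into its own collection.
    """
    batches = defaultdict(list)
    for msg in messages:
        # Stand-in for macro expansion: "$HOST" is replaced by msg["HOST"].
        name = Template(collection_template).safe_substitute(msg)
        batches[name].append(msg)
    return dict(batches)


messages = [
    {"HOST": "web1", "MESSAGE": "started"},
    {"HOST": "db1", "MESSAGE": "checkpoint"},
    {"HOST": "web1", "MESSAGE": "stopped"},
]
batches = group_by_collection("logs_$HOST", messages)
for name, batch in sorted(batches.items()):
    print(name, len(batch))
```

With many distinct collection names, each batch tends to stay small,
which is where I'd expect the benefit of bulk inserts to evaporate.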

However, splitting the writing out to a thread, while skipping the
bulk insert part, is still preferable, since mongo_insert() is a
blocking call. Implementing this is my current plan for today.
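
The shape of that plan can be sketched as a queue feeding a single
writer thread that inserts one message at a time. This is a Python
illustration under stated assumptions, not the driver's actual code:
start_writer() and the insert callable are hypothetical, and the
callable merely stands in for a blocking call such as mongo_insert().

```python
import queue
import threading


def start_writer(insert, maxsize=1000):
    """Start a writer thread; return the queue feeding it and the thread.

    `insert` is a hypothetical blocking callable (collection, doc).
    Putting None on the queue tells the writer to stop.
    """
    q = queue.Queue(maxsize=maxsize)

    def worker():
        while True:
            item = q.get()
            if item is None:
                break
            collection, doc = item
            # The blocking call now stalls only this thread,
            # not the code that produces the messages.
            insert(collection, doc)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return q, thread


stored = []
q, thread = start_writer(lambda collection, doc: stored.append((collection, doc)))
q.put(("logs_web1", {"MESSAGE": "started"}))
q.put(("logs_db1", {"MESSAGE": "checkpoint"}))
q.put(None)  # ask the writer to finish
thread.join()
print(stored)
```

Since each message carries its own resolved collection name, macros in
collection() keep working; the price is one insert per message.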

So, to sum it up, the current state is like this:

* The driver works reasonably well, at an - in my opinion - good speed
* It handles error cases reasonably gracefully: it detects network
errors, and will try to reconnect after time_reopen seconds. In the
meantime, messages are dropped, though.
* Supports authentication
* The key-value pairs to log can be configured (and the values can
contain macros, obviously)
* The collection name can contain macros as well
* Empty values are not stored in the database
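
The key-value and empty-value items in the list above can be sketched
like this. Again, this is an illustrative Python sketch and not the
driver's code: build_document() is a hypothetical helper, the message
is a plain dict, and string.Template stands in for syslog-ng's macro
expansion.

```python
from string import Template


def build_document(pairs, msg):
    """Resolve each configured value template and skip empty results."""
    doc = {}
    for key, template in pairs.items():
        value = Template(template).safe_substitute(msg)
        if value:  # empty values are not stored
            doc[key] = value
    return doc


# Hypothetical configuration: document keys mapped to value templates.
pairs = {"host": "$HOST", "program": "$PROGRAM", "message": "$MESSAGE"}
msg = {"HOST": "web1", "PROGRAM": "", "MESSAGE": "started"}

doc = build_document(pairs, msg)
print(doc)
```

Here the "program" key is left out of the document because its macro
resolved to an empty string.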

My TODO list for the driver at the moment is something along these lines:

* Better error handling, preferably with no message dropping
* mongodb communication moved to a separate thread
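
For reference, the drop-and-reconnect behaviour that the first TODO
item wants to improve on looks roughly like the sketch below. It is a
Python illustration only: the Destination class, its connect callable
and the injected clock are all hypothetical, and OSError merely stands
in for whatever network error the real driver detects.

```python
class Destination:
    """Drops messages while disconnected; retries after time_reopen seconds."""

    def __init__(self, connect, time_reopen, clock):
        self.connect = connect      # returns an insert callable, or raises OSError
        self.time_reopen = time_reopen
        self.clock = clock          # injected so the example needs no real waiting
        self.insert = None
        self.next_attempt = 0.0
        self.dropped = 0

    def _fail(self):
        self.insert = None
        self.next_attempt = self.clock() + self.time_reopen
        self.dropped += 1
        return False

    def send(self, doc):
        if self.insert is None:
            if self.clock() < self.next_attempt:
                self.dropped += 1   # still waiting: the message is lost
                return False
            try:
                self.insert = self.connect()
            except OSError:
                return self._fail()
        try:
            self.insert(doc)
            return True
        except OSError:
            return self._fail()


now = [0.0]
server_up = [False]
stored = []


def connect():
    if not server_up[0]:
        raise OSError("connection refused")
    return stored.append


dest = Destination(connect, time_reopen=60, clock=lambda: now[0])
dest.send({"n": 1})   # connect fails: dropped, next attempt at t=60
server_up[0] = True
now[0] = 30.0
dest.send({"n": 2})   # inside the wait window: dropped
now[0] = 60.0
dest.send({"n": 3})   # reconnects and stores
print(stored, dest.dropped)
```

Avoiding the drops would mean holding on to the messages in the two
failure paths instead of counting them as lost.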

I have set up a small project page with some rudimentary documentation
at http://asylum.madhouse-project.org/projects/syslog-ng/mongodb/ if
anyone's interested in trying out the driver.

