[syslog-ng] MongoDB destination driver

Gergely Nagy algernon at balabit.hu
Sun Jan 16 22:05:44 CET 2011


Most of the mentioned issues are fixed in git, one way or the other
(details below). However, I'm seeing some weird problems from time to
time...

Namely, when I shut down the mongodb server to test reconnect and
perhaps a few other things: with the mongodb server down, when I do an
msg_error() from the writer thread, syslog-ng ends up segfaulting, due
to indirectly calling cached_g_current_time() (via the log_msg_init()) -
which, as far as I experienced so far, shouldn't be called from anywhere
but the main thread.

I'm not quite sure how to get around that, except for not generating
messages in the worker thread, which would be a bit painful.

(This issue was present for a while, I think, will check if I can bisect
it)

I probably screwed up something around the mutexes too, as I threw
loggen at syslog-ng and after a while it just stopped, probably in a
neat little deadlock.

Current status as of this writing:

> > * start/stop_thread() creates/frees mutexes/condvars, is there a reason
> > not to do this once in dd_new() and free them in free() ? [RC]
> 
> No reason that I can think of, will move them over.
 
Moved the mutex/condvar stuff to dd_init() and dd_deinit() respectively.
The reason they weren't moved to _new/_free() is that the worker thread
depends on stuff that is set up by dd_init(), so it can't start at
dd_new() time. 

I might be missing something though, in which case please bonk me on the
head.

(One option would be to have a wakeup cond for the thread: I would set
up the cond vars and mutexes from _new(), but only issue the wakeup from
_init() - problem solved.

At the moment I'm having a few issues grasping the process of module
starting & shutdown. I'll have another go at it tomorrow.) 

> > * is there a reason mongodb_dd_connect() is protected by a mutex? it is
> > only protected from one code path, but it is only called from the writer
> > thread, so it should be single threaded anyway. [RC]
> 
> Good catch, will fix, thanks!
 
Fixed in git.

> > * worker_insert():
> >   * it queries the time for each invocation. time(NULL) is dreadfully
> > slow and we're trying to phase it out, especially in slow paths.
> > cached_g_current_time_sec() should  be used instead. [RC]
> 
> Partially-fixed: The OID generation was moved to the mongo client
> library. Though, that still calls time() for now. I'll add a way to
> supply our own time, and then the mongodb destination can pass the
> cached time.

Fixed in git. I added a self->last_msg_stamp, which is updated under a
lock in dd_queue(), and passed to the OID generator in the writer
thread. 

> * [oid generation] srand() should not be called here, if you
> > absolutely must rely on srand/rand, then srand() should be called once
> > at startup. maybe it would be better to use RAND_bytes(), although that
> > would pull in a libcrypto as a requirement for mongodb as well (non-RC
> > if well hidden behind a function)
> 
> This is partially fixed too, as OID generation was moved to the mongo
> client (I just haven't pushed those changes yet, wanted to keep my HEAD
> stable for a few days).
> 
> The rand() is used to generate the machine ID... I think I have an
> acceptable way to solve that. 

Fixed in git. The client library now has a mongo_util_oid_init() method
which will set up the machine id and the pid. This is called from
dd_new(), neither srand, nor rand, nor getpid, nor time gets called from
afmongodb anymore.

And getpid/srand/rand only gets indirectly called one time during
dd_new(). 

> > * [formatting collection name] g_string_prepend() is slow, add the
> > literal to the beginning, and then use log_template_append_format() to
> > append to the end (no need to move strings around). [RC]
> 
> D'oh. Suits me for not checking the headers: didn't know about
> log_template_append_format().

Fixed in git.

> * mongo-client.c:
> >   * perhaps a mongo state struct wouldn't hurt even if currently it
> > would be an fd there. 
> 
> Partially fixed: it's on my TODO list for the client lib for today.

Fixed in git.

> * mongo_packet_send(): perhaps you could use writev() to combine the
> > header/data parts, instead of sending them in two chunks and two syscall
> > roundtrips
> 
> Was on my TODO list, it climbed a bit higher.

Fixed in git.

> * mongo-wire.c:
> >   * struct _mongo_packet, please don't use #pragma there, just add a
> > comment that new fields should be added with care. You perfectly filled
> > all padding bytes with fields anyway.
> 
> Noted, will fix.

Fixed in git. Though, the client library grew another struct (which is
not used by syslog-ng yet) which still uses #pragma, to avoid padding.

I'll see if I can get around that easily. 

> * perhaps using GByteArray here is an overkill, you size the array by
> > calculating the required space anyway, this way you could hang the data
> > array right the end of struct mongodb_packet. [non-RC]
> 
> Noted, will fix.

Fixed in git.

> * [av_foreach_buffer] please don't define a struct for this as its
> > name is just not covering what it does, I usually use an array of
> > pointers in this case. [non-RC]
> 
> Noted, will fix.

Fixed in git.

> > * [value, coll variables] if at all possible, please reuse GString
> > based buffers in @self, especially if you don't need to protect them via
> > a mutex. allocation heavy code can really affect performance (and
> > although with SQL the numbers are quite low, the numbers you mentioned
> > with mongodb, this may not be the case). This may apply to
> > bson_new_sized() too (e.g. reuse the bson object header after a call
> > similar to g_string_truncate(0)) (non-RC, but preferable)
> 
> Noted, will look into it.
 
value & coll moved to self (as self->current_value and
self->current_namespace), they're allocated by the worker thread upon
startup, and freed when shutting down. 

And since the "db." prefix doesn't change, ever, I also store the
namespace prefix length in self, and truncate to that point during
insert. 

The bson objects will be moved there too later tonight - I will have to
update the client lib for that, and am knee-deep in syslog-ng at the
moment, and don't want to switch out.

> > * when naming/defining mutexes, please try to name it according to the
> > data it protects, and not code (writer_thread_mutex should probably be
> > named something else. In SQL this is using a similar name, but that has
> > historic roots). [applies to SQL, not RC]
> 
> Ok.

(Partly) fixed in git. I removed the writer_thread_mutex, and added
suspend_mutex (held when checking/setting suspend state) and queue_mutex
(held when checking queue length).

Also got rid of explicit mutex locking around log_queue_push & _pop
calls.

Once I figure out how to use log_queue_check_items() and ivykis threads,
queue_mutex will be gone too.

> > * in syslog-ng 3.3, LogQueue is taking care of its own locking, so no
> > need to separately lock it, at least for _push/pop operations. I see
> > that you are using SQL-like wakeup mechanism, which requires you to know
> > the length of the queue atomicaly, which still requires this external
> > lock. A better solution would probably be to use log_queue_check_items()
> > API and standard ivykis threads for the writer [applies to SQL, not RC]
> 
> Noted, I'll have a closer look at ivykis.

I think I have a rough idea how this should work. If all goes well, I'll
push this out tonight aswell.

> > * afmongodb_dd_free() doesn't free keys/values [RC]
> 
> Off the top of my head, I don't see which key/values should be free'd
> there: self->fields is cleaned up as far as I see.

Haven't looked further, I'll have a look at it tomorrow.

-- 
|8]





More information about the syslog-ng mailing list