You're almost certainly overflowing the internal queue in the server, the clients, or both. Try logging the messages on the clients to a file as well as to the network, and see whether the local syslog-ng instances are dropping messages before the server ever sees them.
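A duplicate-destination setup along these lines would do it (a sketch only; the source/destination names, file path, and loghost address are placeholders, not taken from this thread):

```
source s_local { unix-stream("/dev/log"); internal(); };
destination d_copy { file("/var/log/local-copy.log"); };
destination d_net { tcp("loghost.example.com" port(514)); };
log { source(s_local); destination(d_copy); destination(d_net); };
```

If messages are missing from the local file copy as well, the loss is happening on the client before the network is ever involved.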
I can give this a try on the server side, but on the client side I have a custom program doing the sending. The buffers could be overflowing there, but I am having a hard time telling. If the problem were on the client side, though, then at least some runs should deliver all of the messages, since the OS scheduler schedules the sending threads differently each time; so I suspect it must be somewhere on the server side.
You can put some printf statements into the libol code to trace what is happening with each log entry and the output queue. I did this to mine and I can see in the output that syslog-ng sometimes (often) simply throws messages away. The interesting parts are in pkt_buffer.c and queue.c in libol:
    if (self->queue_size == self->queue_max) {  /* fifo full   <== oops */
            ol_string_free(string);             /* <== this tosses your message */
            return ST_FAIL | ST_OK;             /* <== return code is ignored */
    }
Okay, I set log_fifo_size to something like 72,000. Does that value not feed into queue_max? I will give this a try, though, and see if this is where I am losing messages.
I've raised this issue on this list before, but was ignored. Regardless of how high your fifo size is, syslog-ng will lose messages if the sources generate messages faster than the destination can consume them. Raising the fifo size only masks transients; it does not help in the steady state.
The symptom is easily seen if you send very small messages, such as a three-digit sequence number and a newline. This puts as much stress as possible on syslog-ng. If you are suffering from this problem, you will notice in your logs that large blocks of messages are missing. You can generate these messages very quickly with a perl script writing to a named pipe.
-jwb