[syslog-ng] Fwd: Multithreaded inputs with same output file : is it safe ?

László Várady (lvarady) Laszlo.Varady at oneidentity.com
Fri Aug 2 11:11:46 UTC 2019


> 1/ Tcp connections are sent to different thread when multithreading is
> enabled, even if there is a single source enabled in config with a
> single tcp port listening

That is correct. A single source object in the configuration can handle at most `max-connections()` connections; the default is 10.
This does not mean that 10 long-running threads are created: syslog-ng is non-blocking and uses a thread pool to schedule I/O jobs.
Nevertheless, TCP connections within a single source object are "multithreaded" and scale across cores.
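
As a rough sketch of what this looks like in a configuration (the source name, port and numbers are made up for illustration; `log-iw-size()` is raised only because, as far as I recall from the Admin Guide, the input window is shared among the connections of a source):

source s_tcp {
        # one source object, one listening port, many client connections
        network(
                transport("tcp")
                port(6514)
                max-connections(100)    # default is 10
                log-iw-size(10000)      # assumed value, shared among the connections
        );
};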

> 2/ UDP datagrams are NOT sent to different thread if there is a single
> source even if there are multiple listening ports in the source config

Yes. Every UDP source object in the configuration can be scheduled to the I/O work pool
separately, which makes it possible to receive UDP messages faster if you can load-balance them
across 2 or more separate UDP ports. The datagrams of a single UDP source are processed sequentially,
meaning a single UDP source object can't scale across cores.

>3/ UDP datagrams are sent to different threads if there are different
>sources in config (1 source -> 1 thread), so basically to benefit from
> multithreading with UDP one needs to configure several sources.

Correct. A UDP source is not bound to a specific thread, but a single UDP source won't scale to more than 1 thread, so there is no parallelism in this case.
Please note that using the same UDP source object in multiple log paths still counts as one source.
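
A minimal sketch of the "several sources" approach, assuming your senders can be split between two ports (the source names and port numbers are hypothetical):

source s_udp_a { network(transport("udp") port(5140)); };
source s_udp_b { network(transport("udp") port(5141)); };

Each of the two source objects can be scheduled independently, so they may be processed on different cores at the same time, while the datagrams within each one are still handled sequentially.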

Besides the possibilities mentioned above, you have another option: UDP sources have a config option called `so-reuseport()`.

I'm quoting the syslog-ng Admin Guide:
Enables SO_REUSEPORT on systems that support it. When enabled, the kernel allows multiple UDP sockets to be bound to the same port, and the kernel load-balances incoming UDP datagrams to the sockets. The sockets are distributed based on the hash of (srcip, dstip, srcport, dstport), so the same listener should be receiving packets from the same endpoint. For example:

source {
        udp(so-reuseport(1) port(2000) persist-name("udp1"));
        udp(so-reuseport(1) port(2000) persist-name("udp2"));
        udp(so-reuseport(1) port(2000) persist-name("udp3"));
        udp(so-reuseport(1) port(2000) persist-name("udp4"));
};

> Is there an issue if several sources (=
> from different threads) end up being output to the same file.

This is not a problem: our destinations are decoupled from their sources through a memory or disk-based queue.
There is no threading issue that users need to care about.
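
For illustration only, a log path along these lines is fine (all names, ports and the file path are hypothetical):

source s_net_tcp { network(transport("tcp") port(6514)); };
source s_net_udp { network(transport("udp") port(5140)); };

destination d_all { file("/var/log/all.log"); };

log {
        # two sources, possibly processed on different threads
        source(s_net_tcp);
        source(s_net_udp);
        # a single file destination behind its own queue
        destination(d_all);
};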

>Another way to turn the question is: are all the data from the source
>thread then sent to some common thread so that output to a file is
>always handled by the same thread even if log is received from
>different source threads ?

Something like that. Non-blocking destination jobs are scheduled to the I/O pool, where they run in parallel with their sources.
"Threaded destinations" such as HTTP, Redis, Python, Java, etc. have a dedicated thread, but it is not important in this context.
Synchronization between sources and destinations is done through memory queues (or optionally: disk-based queues), so you have the freedom to create any log pipeline/path you want.
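
If you want to size the memory queue of a destination, a sketch could look like this (the destination name, path and number are arbitrary examples, not a recommendation):

destination d_file {
        # log-fifo-size() sets how many messages the output queue can hold
        file("/var/log/all.log" log-fifo-size(20000));
};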

László Várady
