On 5/28/20 7:46 AM, László Várady (lvarady) wrote:

Hi,

> 1. The OS UDP buffer seems to be 128MB in size, but the configured so_rcvbuf is 64M. Is that because the syslog-ng so_rcvbuf setting is in characters while the OS buffer is in bytes?

This is because the kernel doubles the value set by syslog-ng (to allow space for bookkeeping overhead), and this doubled value is returned by getsockopt(2) and other tools.
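
The doubling can be observed directly on a throwaway socket. The following is a minimal sketch, not taken from syslog-ng: probe_rcvbuf is a hypothetical helper name, and the 64 KiB request is an arbitrary small value chosen to stay under the usual net.core.rmem_max limit. On Linux the reported value is typically twice the request; other systems may report the request unchanged.

```python
import socket
from typing import Tuple


def probe_rcvbuf(requested: int) -> Tuple[int, int]:
    """Set SO_RCVBUF on a temporary UDP socket and read it back.

    Returns (requested, effective). On Linux the effective value is
    normally double the request, after capping at net.core.rmem_max.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return requested, effective


if __name__ == "__main__":
    requested, effective = probe_rcvbuf(64 * 1024)
    print(f"requested={requested} bytes, kernel reports={effective} bytes")
```

If the reported value is less than twice the request, the request was probably capped by net.core.rmem_max before the doubling was applied.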


Thanks.



> 3. Increasing the log_iw_size or the log_iw_size actually seems to make things worse.

These 2 values already seem high enough.
Disabling flow-control is also a good idea IMO, when using UDP sources.


Flow control is not enabled. Or, more precisely, it is not specified in the configuration. It occurs to me that perhaps it is enabled by default?



> All suggestions that help me understand this and help to minimize the drops are welcome.

Could you share how incoming packets are distributed across the 8 sockets?

The default SO_REUSEPORT mechanism distributes packets based on the hash of (peer IP address, port) and (local IP address, port).
Hash collisions are also likely to happen [1], so if you encounter this problem, there are other possible resolutions. The commercial syslog-ng version has, for example, a udp-balancer() driver that uses custom BPF programs to achieve an even distribution of packets.


It is a single device, single IP, single port, so there is just one socket that is overwhelmed. We would probably see the same failure to keep up if we moved to a TCP transport.
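
To illustrate why a single sender ends up on a single socket, here is a small simulation of hash-based socket selection. It is only a sketch under stated assumptions: CRC32 stands in for the kernel's real hash function, pick_socket and distribution are hypothetical helper names, and the addresses and ports are made-up documentation values.

```python
import zlib
from collections import Counter


def pick_socket(src_ip, src_port, dst_ip, dst_port, n_sockets):
    """Map a flow 4-tuple to a socket index.

    CRC32 is an illustrative stand-in, not the hash the kernel uses.
    """
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % n_sockets


def distribution(flows, n_sockets=8):
    """Count how many packets land on each socket index."""
    return Counter(pick_socket(*flow, n_sockets) for flow in flows)


if __name__ == "__main__":
    # One device, one source IP, one source port: every packet has the
    # same 4-tuple, so every packet is hashed to the same socket.
    single = [("192.0.2.10", 5140, "192.0.2.1", 514)] * 10000
    print("single flow:", dict(distribution(single)))

    # Many source ports: packets spread out, but not necessarily evenly.
    many = [("192.0.2.10", 40000 + i, "192.0.2.1", 514) for i in range(100)]
    print("100 flows:  ", dict(distribution(many)))
```

With one flow the counter always has exactly one key, whatever the number of sockets. With many flows the spread depends on the hash, which is why collisions can still leave some sockets idle.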



[1] https://blog.cloudflare.com/how-to-receive-a-million-packets/


I will read through this blog.


-- 
Evan Rempel