On 5/28/20 7:46 AM, László Várady (lvarady) wrote:
Hi,
1. The OS UDP buffer seems to be 128MB in size and the configured so_rcvbuf is 64M in size. Is that because the syslog-ng configuration of so_rcvbuf is in characters but the OS buffer is in bytes?
This is because the kernel doubles the value set by syslog-ng (to allow space for bookkeeping overhead), and this doubled value is returned by getsockopt(2) and other tools.
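The doubling can be observed directly from a small script. This is only a minimal sketch, assuming a Linux host and a requested size below net.core.rmem_max (the kernel silently caps larger requests); the helper name effective_rcvbuf is made up for illustration and is not part of syslog-ng.

```python
import socket


def effective_rcvbuf(requested: int) -> int:
    """Set SO_RCVBUF on a fresh UDP socket and return what the kernel reports back.

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Ask for `requested` bytes of receive buffer.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        # On Linux, getsockopt(2) returns the doubled value (capped by rmem_max).
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    finally:
        sock.close()


if __name__ == "__main__":
    requested = 16384
    print("requested:", requested, "reported:", effective_rcvbuf(requested))
```

On Linux the reported number would normally be twice the requested one, which matches the 64M versus 128MB observation; other operating systems may simply report the requested value.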
Thanks.
3. Increasing the log_iw_size or the log_iw_size actually seems to make things worse.
These 2 values already seem high enough. Disabling flow-control is also a good idea IMO, when using UDP sources.
Flow control is completely disabled. Or more precisely unspecified. It occurs to me that perhaps the default is enabled?
All suggestions that help me understand this and help to minimize the drops are welcome.
Could you share how incoming packets are distributed across the 8 sockets?
The default SO_REUSEPORT mechanism distributes packets based on the hash of (peer IP address, port) and (local IP address, port). Hash collisions are also likely to happen [1], so if you encounter this problem, there are other possible resolutions. The commercial syslog-ng version has, for example, a udp-balancer() driver that uses custom BPF programs to achieve an even distribution of packets.
It is a single device, single IP, single port, so there is just one socket that is overwhelmed. This will probably be the same failure to process fast enough if we move to a TCP transport.
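The single-sender case can be reproduced on loopback with a few lines of Python. This is a rough sketch, assuming Linux SO_REUSEPORT semantics; the function name count_per_socket and the socket and packet counts are arbitrary choices for illustration, not anything syslog-ng does internally.

```python
import select
import socket


def count_per_socket(n_sockets: int = 4, n_packets: int = 50) -> list:
    """Bind several UDP sockets to one port with SO_REUSEPORT and count
    how many datagrams from a single sender each of them receives."""
    receivers = []
    port = 0  # let the kernel pick a free port for the first socket
    for _ in range(n_sockets):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind(("127.0.0.1", port))
        port = s.getsockname()[1]  # reuse the same port for the rest
        s.setblocking(False)
        receivers.append(s)

    # One sender socket means one (peer IP, peer port) pair, i.e. one flow.
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for i in range(n_packets):
        sender.sendto(b"msg %d" % i, ("127.0.0.1", port))

    counts = [0] * n_sockets
    while True:
        readable, _, _ = select.select(receivers, [], [], 0.5)
        if not readable:
            break  # nothing arrived within the timeout, assume we are done
        for s in readable:
            try:
                while True:
                    s.recv(2048)
                    counts[receivers.index(s)] += 1
            except BlockingIOError:
                pass  # this socket is drained for now

    sender.close()
    for s in receivers:
        s.close()
    return counts


if __name__ == "__main__":
    print(count_per_socket())
```

With a single sender, all datagrams should end up on one of the sockets while the others stay idle, so adding more sockets does not help a lone high-volume device.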
[1] https://blog.cloudflare.com/how-to-receive-a-million-packets/
I will read through this blog.
-- 
Evan Rempel