udp tuning - where to look.
Hi,
I am working on an issue with UDP drops in a high-EPS environment (several thousand events per second, steady), and I am struggling to make any change in the loss rate (as reported by `netstat -su | grep error`).
This is happening on two nodes (one running an older syslog-ng 3.2 OSE, the other running a Splunk forwarder), but I don't think the traffic is even making it out of the kernel to either application.
On the syslog-ng side, `syslog-ng-ctl stats` shows *no* drops at all.
Increasing net.core.rmem_max and so_rcvbuf together all the way to 64 MB did not seem to make any significant difference.
This is a RHEL 6 box with 16 GB of RAM and 4 cores (virtual, running in an ESX environment).
Are there other parameters or things I should be looking at?
Thanks,
Jim
Hi Jim,
On Tue, Jan 10, 2017 at 04:20:02PM -0500, Jim Hendrick wrote:
> loss rate. (according to netstat -su | grep error). […] On the syslog-ng side syslog-ng-ctl stats shows *no* drops at all.
This means that syslog-ng isn't reading the packets off the socket fast enough: the kernel buffers them, the receive buffer fills up, and further packets are dropped, which is what increments the kernel counters (see `/proc/net/snmp`).
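As an illustration of reading those kernel counters directly, here is a minimal Python sketch that parses the `Udp:` block of `/proc/net/snmp`. The function names `parse_snmp` and `udp_deltas` are made up for this example, and the exact set of counters depends on the kernel version, so treat the column names as whatever your kernel prints rather than a fixed list.

```python
import os


def parse_snmp(text, proto="Udp"):
    """Return {counter_name: value} for one protocol block of /proc/net/snmp.

    Each protocol appears as two lines with the same prefix: a header line
    with the counter names, then a line with the values.
    """
    prefix = proto + ":"
    rows = [line.split()[1:] for line in text.splitlines()
            if line.startswith(prefix)]
    if len(rows) != 2:
        raise ValueError("expected a header and a value line for " + proto)
    names, values = rows
    return dict(zip(names, (int(v) for v in values)))


def udp_deltas(before, after):
    """Difference between two snapshots, for counters present in both."""
    return {name: after[name] - before[name]
            for name in after if name in before}


if __name__ == "__main__":
    path = "/proc/net/snmp"  # Linux only
    if os.path.exists(path):
        with open(path) as handle:
            print(parse_snmp(handle.read()))
```

Taking one snapshot, waiting a while, and passing both to `udp_deltas` gives the drops over that interval. On kernels that expose it, `RcvbufErrors` rising in step with `InErrors` points at a full socket receive buffer rather than, say, checksum failures.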
> Increasing net.core.rmem_max and so_rcvbuf together all the way to 64 MB did not seem to make any significant difference.
I'm afraid these are the values I was going to suggest.
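One thing that may be worth ruling out is whether a buffer of that size is actually being granted. On Linux, a `SO_RCVBUF` request above net.core.rmem_max is silently capped, and the cap is applied when the socket option is set, so a process started before the sysctl change keeps its old buffer until it is restarted. The sketch below is only a quick check under those assumptions; `effective_rcvbuf` is a name invented for this example.

```python
import socket


def effective_rcvbuf(requested_bytes):
    """Request a UDP receive buffer and return what the kernel reports back.

    On Linux the reported value is double the stored request (the kernel
    adds room for bookkeeping), so a result well below 2 * requested_bytes
    suggests the request was capped by net.core.rmem_max.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


if __name__ == "__main__":
    wanted = 64 * 1024 * 1024  # the 64 MB mentioned in the thread
    try:
        print("requested", wanted, "granted", effective_rcvbuf(wanted))
    except OSError as exc:
        # Some non-Linux systems reject oversized requests instead of capping.
        print("request rejected:", exc)
```

If the granted size is far below what was requested, the sysctl change has not taken effect for new sockets, and enlarging so_rcvbuf in the application alone will not help.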
> This is a RHEL 6 box with 16 GB and 4 cores (virtual - running in an ESX environment)
FWIW I've had many problems with dropped UDP packets on virtual machines. Where the hypervisor is the cause, it's easy to correlate the `steal` CPU state with the drop events.
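To make that correlation concrete, here is a rough Python sketch that computes the share of CPU time spent in `steal` between two readings of the aggregate `cpu` line in `/proc/stat`. The helper names are hypothetical. Whether the hypervisor reports steal time to the guest at all depends on the platform, so a flat zero is not proof that there is no contention.

```python
import os

# First eight fields of the aggregate "cpu" line, in kernel order.
FIELDS = ("user", "nice", "system", "idle",
          "iowait", "irq", "softirq", "steal")


def parse_cpu_times(stat_text):
    """Return the aggregate CPU counters from the text of /proc/stat."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def steal_percent(before, after):
    """Percentage of elapsed CPU time counted as steal between snapshots."""
    total = sum(after.values()) - sum(before.values())
    if total <= 0:
        return 0.0
    return 100.0 * (after["steal"] - before["steal"]) / total


if __name__ == "__main__":
    if os.path.exists("/proc/stat"):  # Linux only
        with open("/proc/stat") as handle:
            now = parse_cpu_times(handle.read())
        since_boot = steal_percent(dict.fromkeys(now, 0), now)
        print("steal since boot: %.2f%%" % since_boot)
```

Sampling this at the same interval as the UDP error counters and lining the two series up is one way to see whether drop bursts coincide with steal spikes.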
> Are there other parameters, things I should be looking at?
I'm also curious whether anything else can be done (apart from switching to TCP).
Hi Jim,
While not a direct answer, and following on from Fabien's suggestion: since you are in a virtual environment, as a workaround you could create a few instances running syslog-ng with a UDP source and TCP destinations, enable FIFO or disk buffering, and balance the load over the new instances. Maybe explore a round-robin DNS configuration, if your environment permits?
Kr,
James
______________________________________________________________________________ Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng FAQ: http://www.balabit.com/wiki/syslog-ng-faq
-- Sent from my Android device with K-9 Mail. Please excuse my brevity.
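The relay idea above (UDP in, in-memory FIFO, TCP out) can be sketched in a few lines of Python. This is only an illustration of the architecture, not a replacement for syslog-ng and not tuned for thousands of events per second; all function names, the queue size, and the newline framing are assumptions made for the example.

```python
import queue
import socket
import threading


def udp_reader(udp_sock, fifo, stop, dropped):
    """Drain the UDP socket into the FIFO without waiting on the TCP side."""
    udp_sock.settimeout(0.2)  # wake up regularly to notice `stop`
    while not stop.is_set():
        try:
            data, _addr = udp_sock.recvfrom(65535)
        except socket.timeout:
            continue
        try:
            fifo.put_nowait(data)
        except queue.Full:
            dropped.append(1)  # counted here instead of lost in the kernel


def tcp_writer(tcp_sock, fifo, stop):
    """Forward queued messages over TCP, one newline-terminated line each."""
    while not (stop.is_set() and fifo.empty()):
        try:
            data = fifo.get(timeout=0.2)
        except queue.Empty:
            continue
        tcp_sock.sendall(data.rstrip(b"\n") + b"\n")


def start_relay(udp_sock, tcp_sock, fifo_size=100000):
    """Start both threads; returns (stop_event, threads, dropped_list)."""
    fifo = queue.Queue(maxsize=fifo_size)
    stop = threading.Event()
    dropped = []
    threads = [
        threading.Thread(target=udp_reader,
                         args=(udp_sock, fifo, stop, dropped), daemon=True),
        threading.Thread(target=tcp_writer,
                         args=(tcp_sock, fifo, stop), daemon=True),
    ]
    for thread in threads:
        thread.start()
    return stop, threads, dropped
```

The point of the design is that the reader thread does nothing but move datagrams from the socket into the queue, so a slow destination fills the FIFO (where drops can be counted) rather than the kernel receive buffer. To use it, bind a UDP socket, connect a TCP socket to the downstream collector, and pass both to `start_relay`.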
Jim,
You can try some of the steps outlined here: http://demo.logzilla.net/help/performance_tuning/udp_buffer_tuning
Since these are VMs, you may also want to look at the type of NIC you are using for the VM, for example: https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2019944
And of course, make sure the VMware drivers are installed in the OS.
Thanks everyone for the quick feedback!
I am asking our VMware folks to take a look from the "outside" for things that might help (we may be having a shared-resource issue, for example).
I am working on a test environment to look into further tuning of kernel settings. I am also looking at application changes, but since this is seen on both a syslog-ng "collector" and a Splunk "forwarder", I am looking at the kernel/OS side first.
One possibility is that we could stand up a load balancer in front and spread the load across multiple boxes.
Thanks. If and when I have any brilliant success I will pass it along :-)
Jim
participants (4)
- Clayton Dukes
- Fabien Wernli
- James Elstone
- Jim Hendrick