I am currently using the stock syslog daemon from RedHat, but it appears unable to keep up, so I am looking at syslog-ng to improve things. The data below provides a baseline of what I am currently seeing and what I have attempted so far. If anyone could let me know whether syslog-ng would improve performance, and what measures I can take to achieve that, that would be great.

Logs have to be rotated each hour due to the amount of traffic. On average I am successfully logging 25,888 events per minute. That goes higher during the early morning login times.

I have set the following sysctl params:

    net.core.rmem_max = 33554432
    net.core.wmem_max = 33554432
    net.core.rmem_default = 65536
    net.core.wmem_default = 65536
    net.ipv4.tcp_rmem = 4096 87380 33554432
    net.ipv4.tcp_wmem = 4096 65536 33554432
    net.ipv4.tcp_mem = 33554432 33554432 33554432

I have also set the ring params on the Intel 1000 NIC (using Intel's latest driver, e1000-7.1.9, not the default kernel one) to 512.
Pre-set maximums:
    RX:       4096
    RX Mini:  0
    RX Jumbo: 0
    TX:       4096
Current hardware settings:
    RX:       512
    RX Mini:  0
    RX Jumbo: 0
    TX:       512

Based on the NIC statistics, the bottleneck is not at the NIC:

NIC statistics:
    rx_packets: 148314357
    tx_packets: 24906469
    rx_bytes: 3023662070
    tx_bytes: 3216438764
    rx_errors: 0
    tx_errors: 0
    tx_dropped: 0
    multicast: 8531
    collisions: 0
    rx_length_errors: 0
    rx_over_errors: 0
    rx_crc_errors: 0
    rx_frame_errors: 0
    rx_no_buffer_count: 0
    rx_missed_errors: 0
    tx_aborted_errors: 0
    tx_carrier_errors: 0
    tx_fifo_errors: 0
    tx_heartbeat_errors: 0
    tx_window_errors: 0
    tx_abort_late_coll: 0
    tx_deferred_ok: 0
    tx_single_coll_ok: 0
    tx_multi_coll_ok: 0
    tx_timeout_count: 0
    rx_long_length_errors: 0
    rx_short_length_errors: 0
    rx_align_errors: 0
    tx_tcp_seg_good: 10699059
    tx_tcp_seg_failed: 0
    rx_flow_control_xon: 0
    rx_flow_control_xoff: 0
    tx_flow_control_xon: 0
    tx_flow_control_xoff: 0
    rx_long_byte_count: 76038106102
    rx_csum_offload_good: 148139283
    rx_csum_offload_errors: 0
    rx_header_split: 0
    alloc_rx_buff_failed: 0

The sar -B output shows a lot of paging going on, which is probably causing some loss:

    06:10:01 AM  pgpgin/s  pgpgout/s  fault/s  majflt/s
    06:20:01 AM      0.00     160.46     3.02      0.00
    06:30:01 AM      0.00     162.02     3.04      0.00
    06:40:01 AM      0.00     164.49     4.14      0.00
    06:50:01 AM      0.02     161.51     3.03      0.00
    07:00:01 AM      0.00     174.21     3.04      0.00
    07:10:01 AM      0.11     198.15     6.02      0.00
    07:20:01 AM      0.01     193.53     3.03      0.00
    07:30:01 AM   1278.66    1593.51    29.30      0.04
    Average:        67.76     189.06    19.12      0.02

Strange though (or maybe I am misreading this), sar -W shows all zeros for pswpin/s and pswpout/s, which says there is no swapping going on.

sar -n shows no errors either, but the udpsck value is locked at 5. I am thinking this is hardcoded somewhere, or maybe compiled into the syslog SRPMs, as I cannot find a syslog setting. This is one reason for wanting to use syslog-ng: from what I read, I can change the number of allocated sockets via the configuration file.

sar -P shows the CPU is on average 95% idle, so I see no issue here.
I have re-niced syslog to -10 to increase its priority.

netstat -su shows what might be data loss:

    Udp:
        131725715 packets received
        16642 packets to unknown port received.
        4859684 packet receive errors
        31571 packets sent

I have no way to tell whether the UDP errors are related to syslog data, and with so much syslog data arriving, setting up tcpdump to see what the errors relate to is going to be rather difficult.

What I think is going on is that the stock syslog daemon is simply unable to buffer enough to keep up with the syslog stream. Therefore I want to try syslog-ng to see if the error rate drops.

I was also thinking of using TCP for this, but at the data rate I am seeing I think this could cause a denial of service on both the syslog transmitters and the syslog receiving server.

Any thoughts, ideas?

Thanks
Greg
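To put the netstat -su figures above in proportion, here is a minimal Python sketch that reads the UDP counters in the layout of /proc/net/snmp and computes what share of datagrams ended in a receive error. It assumes that netstat's "packets received" and "packet receive errors" correspond to the kernel's InDatagrams and InErrors fields; the function names and the SAMPLE text are made up for illustration, with the numbers copied from the message above.

```python
"""Estimate UDP receive loss from the kernel's cumulative counters."""


def parse_udp_counters(snmp_text):
    """Return the 'Udp:' counters from /proc/net/snmp-style text as a dict."""
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Udp:")]
    if len(rows) < 2:
        raise ValueError("expected a Udp: header line and a Udp: value line")
    header, values = rows[0][1:], rows[1][1:]
    return dict(zip(header, (int(v) for v in values)))


def receive_error_fraction(counters):
    """Fraction of datagrams reaching UDP that were counted as errors.

    Assumption: delivered and errored datagrams are counted separately,
    so the total that arrived is InDatagrams + InErrors.
    """
    delivered = counters["InDatagrams"]
    errors = counters["InErrors"]
    return errors / (delivered + errors)


# Hypothetical input: the counters quoted in the message, arranged the
# way /proc/net/snmp presents them (a header line, then a value line).
SAMPLE = """\
Udp: InDatagrams NoPorts InErrors OutDatagrams
Udp: 131725715 16642 4859684 31571
"""

counters = parse_udp_counters(SAMPLE)
loss = receive_error_fraction(counters)
print(f"receive errors: {loss:.2%} of datagrams that reached UDP")
```

With the quoted counters this works out to roughly 3.6 percent. The counters are cumulative since boot and cover every UDP socket on the host, so a more telling check would be to read /proc/net/snmp twice, some minutes apart, and compare the growth of InErrors against the growth of InDatagrams during a busy period.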
Both TCP and UDP have risks and limitations. If avoiding message loss or spoofing is important to you, TCP is the way to go. (One key exception being logs from PIX firewalls :)

Is the log server handling other services as well?

Kevin
On Tue, 2006-08-22 at 08:04 -0500, King, John (Greg) (LMIT-HOU) wrote:
> I am currently using the stock syslog daemon from RedHat but it appears to not be able to keep up so I am looking at syslog-ng to improve things. The data below is to provide a baseline of what I am currently seeing and what I have attempted to do. Then if anyone would let me know if syslog-ng would be able to improve the performance and what measures I can take to achieve the improved performance that would be great.
>
> Logs have to be rotated each hour due to the amount of traffic. On average I am successfully logging 25,888 events per minute. That goes higher during the early morning login times.
>
> I have set the following sysctl params:
>
> net.core.rmem_max = 33554432
> net.core.wmem_max = 33554432
> net.core.rmem_default = 65536
> net.core.wmem_default = 65536
> net.ipv4.tcp_rmem = 4096 87380 33554432
> net.ipv4.tcp_wmem = 4096 65536 33554432
> net.ipv4.tcp_mem = 33554432 33554432 33554432
syslog-ng is more complex than plain syslogd, especially when it comes to complex regexp-based filtering. The 2.0.x branch should perform considerably better than the 1.6.x series.

What I spotted in your settings is that rmem_default, set at 64k, is probably a bit small. You can increase the value used by syslog-ng with its so_rcvbuf() option (available in 2.0.x only). With your message rate I'd suggest a receive buffer of about 512k-1MB for the UDP receiver.

-- 
Bazsi
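As a rough illustration of what a larger receive buffer means at the socket level, the Python sketch below requests a 1 MB SO_RCVBUF on a UDP socket and reads back what the kernel granted. It assumes that syslog-ng's so_rcvbuf() ends up as this same socket option, and it binds to an ephemeral loopback port rather than 514 so it runs without root. The helper names and the 512-byte average message size are assumptions for illustration, not figures measured from the syslog stream.

```python
import socket


def open_udp_receiver(requested_bytes, host="127.0.0.1", port=0):
    """Open a UDP socket, ask for a receive buffer, report what was granted."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Request the buffer size before binding so it applies from the start.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
    sock.bind((host, port))  # port 0 lets the OS pick a free port
    granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, granted


def seconds_of_headroom(buffer_bytes, packets_per_second, bytes_per_packet):
    """How long a stalled reader can fall behind before the buffer fills.

    Simplification: ignores the kernel's per-packet bookkeeping overhead,
    so the real headroom is shorter than this estimate.
    """
    return buffer_bytes / (packets_per_second * bytes_per_packet)


sock, granted = open_udp_receiver(1024 * 1024)
print(f"requested 1048576 bytes, kernel reports {granted}")
sock.close()

# 25,888 events per minute is the average rate quoted in the thread;
# 512 bytes per message is a hypothetical size chosen for illustration.
average_rate = 25888 / 60
for size in (64 * 1024, 512 * 1024, 1024 * 1024):
    headroom = seconds_of_headroom(size, average_rate, 512)
    print(f"{size:>8} byte buffer: about {headroom:.1f} s at the average rate")
```

On Linux the value read back is normally double the request, because the kernel reserves room for bookkeeping, and the request is capped by net.core.rmem_max, so the 33554432 limit already set in the sysctl list leaves room for a 1 MB buffer. The headroom figures are only as good as the assumed message size, and peak-hour rates would shrink them further.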
Participants (3): Balazs Scheidler; Kevin; King, John (Greg) (LMIT-HOU)