syslog-ng stops accepting new connections every 100-110 minutes
For the last 24 hours, on versions 2.0.4, 2.1.4, & 3.0.3, syslog-ng will stop taking new connections via a listening port every 100-110 minutes (i.e. it will hang up immediately). It will never recover on its own and has to be restarted. I haven't figured out the exact interval but hopefully that will be close enough to work with (note that the traffic is fairly low -- 10 msgs/sec -- 500K-600K data/min).

I had a program logging data locally via /dev/log into a named directory and then moved this program to a remote server. That remote server does not seem to be having an issue. I have observed this issue on two separate servers (RHEL4.8) that were taking this data feed.

I have tried with flush_lines/sync & time_reopen commented out, with no difference, as well as with log_fifo_size, log_msg_size, and so_rcvbuf commented out. There are no obvious messages about why syslog-ng stops working (even with debug and verbose enabled). Note that these two servers (that stop working) are behind an Alteon 2424 switch (although I have other feeds to other servers working fine behind this switch).

Ideas? Need more data?

==syslog will stop accepting connections==
[root@server]# telnet localhost 514
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
Connection closed by foreign host.

==top section of 3.0 syslog-ng.conf==
options {
    flush_lines (0);
    time_reopen (10);
    log_fifo_size (10000);
    long_hostnames (off);
    use_dns (no);
    use_fqdn (no);
    create_dirs (no);
    dir_perm (0755);
    perm (0644);
    chain_hostnames(no);
    keep_hostname (yes);
    stats_freq (3600);
    log_msg_size(65536);
};

source remote {
    udp(ip(0.0.0.0) port(514) so_rcvbuf(1048576));
    tcp(ip(0.0.0.0) port(514) max-connections(50) so_rcvbuf(1048576));
};

==logging data like this===
filter f_data { match("Data:"); };
destination d_data { file("/var/log/data/data-$R_MONTH$R_DAY$R_HOUR$R_MIN"); };
log { source(remote); filter(f_data); destination(d_data); };
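The telnet check in the post can also be scripted, which makes it easier to repeat on a schedule and note when the listener starts hanging up. What follows is only a minimal sketch in Python: the function name `closes_immediately` and the 2 second wait are illustrative assumptions, not part of syslog-ng or of the poster's setup. It relies on the fact that a syslog listener normally sends nothing back, so a connection that stays open and silent is treated as accepted.

```python
import socket


def closes_immediately(host, port, wait=2.0):
    """Return True if the peer accepts the TCP connection and then closes it
    without sending any data within `wait` seconds."""
    with socket.create_connection((host, port), timeout=wait) as sock:
        sock.settimeout(wait)
        try:
            data = sock.recv(1)
        except socket.timeout:
            # Nothing arrived and the socket is still open: connection was kept.
            return False
        except ConnectionResetError:
            # The peer reset the connection right after accepting it.
            return True
        # recv() returning an empty bytes object means the peer closed cleanly.
        return data == b""
```

Calling `closes_immediately("localhost", 514)` on the affected server should return True in the failing state shown by the telnet session, and False while the listener is still keeping connections.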
You should check the internal logs of syslog-ng. Your config allows only 50 concurrent TCP connections, so my guess is that you're simply over this limit. On Wed, Jul 29, 2009 at 5:22 PM, Matt Pinkham<westphalia@gmail.com> wrote:
For the last 24 hours on versions 2.0.4, 2.1.4, & 3.0.3 syslog-ng will stop taking new connections via a listening port every 100-110 minutes (aka it will hang up immediately). [...]
______________________________________________________________________________ Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng FAQ: http://www.campin.net/syslog-ng/faq.html
No, it is not a concurrent TCP limit issue as this data stream is a single connection. I do not see the max-connections message in the messages log either. On Wed, Jul 29, 2009 at 11:22 AM, Matt Pinkham <westphalia@gmail.com> wrote:
For the last 24 hours on versions 2.0.4, 2.1.4, & 3.0.3 syslog-ng will stop taking new connections via a listening port every 100-110 minutes (aka it will hang up immediately). [...]
-- Some men see things as they are and ask why. I see things that never were and ask for initiative rolls.
You should double-check the connections using lsof or netstat. Firewalls and other "smart" network devices could cause strange issues. If there is really just a single connection, then strace syslog-ng while it refuses the connection and show the output.
On Wed, Jul 29, 2009 at 5:34 PM, Matt Pinkham <westphalia@gmail.com> wrote:
No, it is not a concurrent TCP limit issue as this data stream is a single connection. I do not see the max-connections message in the messages log either.
On Wed, 2009-07-29 at 11:22 -0400, Matt Pinkham wrote:
For the last 24 hours on versions 2.0.4, 2.1.4, & 3.0.3 syslog-ng will stop taking new connections via a listening port every 100-110 minutes (aka it will hang up immediately). [...]
does this mean that syslog-ng is closing the connection immediately? I see only one reason that causes this: max_connections() limit is reached.
try increasing max-connections()
Although this case is logged in syslog-ng's log.
-- Bazsi
I haven't seen the max-connections message, but the number of ESTABLISHED connections (from the same source) keeps incrementing every couple of minutes on the target (even though the sender only ever shows one connection). The only other point I had forgotten to mention (and it shouldn't matter) is that this traffic runs through a Radware (formerly Nortel) Application Switch 2424 (I previously had a similar syslog config but different data stream running an Alteon 180e with no issues). The IP 10.10.10.41 is the load balance IP (VIP).

I upgraded both source and target to 3.0.3 in case that would help (it hasn't).

SENDER (10.10.10.227)
(syslog-ng.conf snippet)
options {
    time_reopen (2);
    log_fifo_size (10000);
    long_hostnames (off);
    use_dns (no);
    use_fqdn (no);
    create_dirs (yes);
    dir_perm (0755);
    perm (0644);
    chain_hostnames (no);
    keep_hostname (yes);
    stats_freq (3600);
    log_msg_size (65535);
    log_fifo_size (65536);
};
destination d_data { tcp("10.10.10.41" so_sndbuf(2094752) so_keepalive(yes)); };
(netstat)
tcp  0  0  10.10.10.227:38370  10.10.10.41:514  ESTABLISHED  2067/syslog-ng

RECEIVER (10.10.10.31)
(syslog-ng.conf snippet)
source remote {
    udp(ip(0.0.0.0) port(514) so_rcvbuf(1048576));
    tcp(ip(0.0.0.0) port(514) max-connections(500) so_rcvbuf(1048576) so_keepalive(yes));
};
(netstat)
tcp  0  0  0.0.0.0:514      0.0.0.0:*          LISTEN       2086/syslog-ng
tcp  0  0  10.10.10.31:514  10.10.10.227:9501  ESTABLISHED  2086/syslog-ng
tcp  0  0  10.10.10.31:514  10.10.10.227:9503  ESTABLISHED  2086/syslog-ng
tcp  0  0  10.10.10.31:514  10.10.10.227:9499  ESTABLISHED  2086/syslog-ng
tcp  0  0  10.10.10.31:514  10.10.10.227:9509  ESTABLISHED  2086/syslog-ng
tcp  0  0  10.10.10.31:514  10.10.10.227:9511  ESTABLISHED  2086/syslog-ng
tcp  0  0  10.10.10.31:514  10.10.10.227:9505  ESTABLISHED  2086/syslog-ng
tcp  0  0  10.10.10.31:514  10.10.10.227:9507  ESTABLISHED  2086/syslog-ng
tcp  0  0  10.10.10.31:514  10.10.10.227:9513  ESTABLISHED  2086/syslog-ng

On Thu, Jul 30, 2009 at 3:25 AM, Balazs Scheidler <bazsi@balabit.hu> wrote:
does this mean that syslog-ng is closing the connection immediately? I see only one reason that causes this: max_connections() limit is reached.
try increasing max-connections()
Although this case is logged in syslog-ng's log.
-- Bazsi
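The netstat comparison in this message (one connection on the sender, a growing number on the receiver) can be tracked over time by counting ESTABLISHED connections per remote address. The following is a rough Python sketch, not a tool from the thread: the function name `established_per_peer` is made up for illustration, and it assumes the whitespace-separated column order seen in the netstat lines quoted in this thread (protocol, two queue counters, local address, foreign address, state). The sample text is copied from the receiver output in the message.

```python
from collections import Counter


def established_per_peer(netstat_text, local_port):
    """Count ESTABLISHED TCP connections to `local_port`, keyed by remote IP."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed columns: proto, recv-q, send-q, local, foreign, state, ...
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, foreign, state = fields[3], fields[4], fields[5]
        if state != "ESTABLISHED":
            continue
        if local.rsplit(":", 1)[-1] != str(local_port):
            continue
        counts[foreign.rsplit(":", 1)[0]] += 1
    return counts


# Receiver-side netstat lines as posted in the thread.
SAMPLE = """\
tcp 0 0 0.0.0.0:514 0.0.0.0:* LISTEN 2086/syslog-ng
tcp 0 0 10.10.10.31:514 10.10.10.227:9501 ESTABLISHED 2086/syslog-ng
tcp 0 0 10.10.10.31:514 10.10.10.227:9503 ESTABLISHED 2086/syslog-ng
tcp 0 0 10.10.10.31:514 10.10.10.227:9499 ESTABLISHED 2086/syslog-ng
tcp 0 0 10.10.10.31:514 10.10.10.227:9509 ESTABLISHED 2086/syslog-ng
tcp 0 0 10.10.10.31:514 10.10.10.227:9511 ESTABLISHED 2086/syslog-ng
tcp 0 0 10.10.10.31:514 10.10.10.227:9505 ESTABLISHED 2086/syslog-ng
tcp 0 0 10.10.10.31:514 10.10.10.227:9507 ESTABLISHED 2086/syslog-ng
tcp 0 0 10.10.10.31:514 10.10.10.227:9513 ESTABLISHED 2086/syslog-ng
"""

print(established_per_peer(SAMPLE, 514))
```

For the posted sample this counts 8 connections from 10.10.10.227. Running the same count every minute or so would show how quickly the number approaches the configured max-connections() value.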
On Thu, 2009-07-30 at 09:52 -0400, Matt Pinkham wrote:
I haven't seen the max-connections message but the ESTABLISHED connections (from the same source) keeps incrementing every couple of minutes on the target (even though the sender only ever shows one connection). [...]
hmm.. if syslog-ng closes the connection immediately, one of the following may apply:
1) max-connections limit
2) tcp wrapper (e.g. /etc/hosts.allow and /etc/hosts.deny if enabled)
3) fd limit
you should try running strace on the running syslog-ng process and see what it does when it rejects an incoming connection.
-- Bazsi
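As a sanity check on the first item in that list, the numbers reported earlier in the thread can be put together. This is a back-of-the-envelope sketch in Python, not a diagnosis: it assumes "every couple of minutes" means roughly one leftover connection every 2 minutes, and that stale connections are never released on the receiver. Neither assumption is confirmed in the thread, and the helper name `minutes_until_limit` is invented for illustration.

```python
def minutes_until_limit(max_connections, minutes_per_new_connection, already_open=0):
    """Minutes until the listener holds `max_connections` connections, if one
    extra connection is left open every `minutes_per_new_connection` minutes."""
    remaining = max(max_connections - already_open, 0)
    return remaining * minutes_per_new_connection


# max-connections(50) from the original config, assumed 2 minutes per connection.
print(minutes_until_limit(50, 2))    # 100
# max-connections(500) from the later receiver config, same assumed rate.
print(minutes_until_limit(500, 2))   # 1000
```

Under those assumptions, a limit of 50 would be reached after about 100 minutes, which is in the same range as the 100-110 minute interval from the original report. With the limit raised to 500, the same rate would take about 1000 minutes, so the failure would be delayed rather than removed if accumulating connections are the cause.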
participants (3)
- Balazs Scheidler
- Matt Pinkham
- Sandor Geller