Get local log files written immediately even if remote log server is unresponsive
We are running syslog-ng 3.16.1 on CentOS 7.4.1708 on a central logging host. We have a large number (nearly 1,000) of servers also running the same version of syslog-ng on the same CentOS release. The servers are configured to log locally and also forward logs to the central logging host.

This morning we encountered a problem: syslog-ng was running on the logging host, but was not processing incoming logs or locally generated ones. The servers forwarding to the central host did not write anything to their local log files. A small but significant portion of them had syslog crash, after which it was restarted by systemd, but still no logs were written until syslog-ng was forcibly stopped on the central server and then restarted. Connections to the central server weren't failing in the sense of TCP close or reset, but logs were accumulating on all the servers, including the central one, in the cache file for buffering logs.

For our purposes, we need to have up-to-the-moment logs available on the individual servers, so that an admin who only has console access and goes in to troubleshoot a server still has recent logs to consult if needed. Is there a way to tell syslog-ng to write local logs immediately even if it's currently buffering logs for sending to a non-responsive remote server?
I noticed that this issue was introduced somewhere between 3.9 and 3.14. Buffering to any destination seems to block all destinations of the message. I am sure it did not work this way in 3.9 and earlier. I didn't report this because I haven't had the time to verify and test the behaviour.

Evan.
________________________________________
From: syslog-ng [syslog-ng-bounces@lists.balabit.hu] on behalf of Jim Segrave [jes@j-e-s.net]
Sent: Friday, July 5, 2019 6:58 AM
To: syslog-ng@lists.balabit.hu
Subject: [syslog-ng] Get local log files written immediately even if remote log server is unresponsive
Hi, On Fri, Jul 05, 2019 at 03:00:48PM +0000, Evan Rempel wrote:
I noticed that between 3.9 and 3.14 this issue was introduced. Buffering to any destination seems to block all destinations of the message.
This seems pretty bad to me. Could you open an issue on github for this please?
We've mostly resolved the issue (a combination of flow control and feeding syslog-ng from the journald file). Disabling flow control and rate limiting and setting forward to syslog true almost cures things. However, there's a serious bug in journald which will cause some number of logs to disappear without any report of missed forwarding of logs. I've raised an issue on the GitHub master for systemd: https://github.com/systemd/systemd/issues/13078 (Possible bug in journald-syslog.c #13078). I fear that this oversight in systemd/journald is not the only occurrence of this bug, just the one we're seeing.

On 7/10/19 10:53 AM, Fabien Wernli wrote:
This seems pretty bad to me. Could you open an issue on github for this please?
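For readers who want to see what the changes described above might look like on a forwarding server, here is a minimal syslog-ng configuration sketch. It is not the configuration used by anyone in this thread. The host name loghost.example.com, the port, the file path and the buffer sizes are placeholders, and the choice of systemd-syslog() as the source is an assumption about what "setting forward to syslog true" implies on the syslog-ng side. Whether this avoids the blocking Evan describes in 3.14 and later is not confirmed here.

```
# Sketch only: host name, port, path and sizes below are placeholders.

source s_local {
    # Read what journald forwards to the syslog socket. This assumes
    # ForwardToSyslog=yes in journald.conf and syslog-ng started by systemd.
    systemd-syslog();
    # syslog-ng's own internal messages.
    internal();
};

destination d_local {
    file("/var/log/messages");
};

destination d_central {
    network(
        "loghost.example.com"
        transport("tcp")
        port(514)
        # Non-reliable disk buffer: once it is full and flow control is off,
        # further messages for this destination are dropped.
        disk-buffer(
            reliable(no)
            mem-buf-length(10000)
            disk-buf-size(1073741824)
        )
    );
};

# Neither log path sets flags(flow-control), so a stalled d_central
# is not meant to throttle s_local.
log { source(s_local); destination(d_local); };
log { source(s_local); destination(d_central); };
```

The trade-off is the one implied in the message above: without flow control, messages bound for the central host can be lost once the buffer fills, in exchange for local files that keep being written. The journald rate-limiting settings mentioned in the thread live in journald.conf and are not shown here.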
participants (3): Evan Rempel, Fabien Wernli, Jim Segrave