[syslog-ng] syslog-ng suddenly stops logging
richard.vigeant at vantrix.com
Fri Jun 27 18:17:19 CEST 2008
On 27-Jun-08, at 4:13 AM, Balazs Scheidler wrote:
> On Wed, 2008-06-25 at 08:49 -0400, Richard Vigeant wrote:
>> On 24-Jun-08, at 3:37 AM, Balazs Scheidler wrote:
>>> On Thu, 2008-06-19 at 15:54 -0400, Richard Vigeant wrote:
>>>> I have a configuration where several nodes send all log messages to a
>>>> central server. The applications on remote nodes send their logs
>>>> locally via either UDP or a unix socket. The syslog-ng running on each
>>>> remote node simply picks up all log messages from all sources, i.e.
>>>> TCP, UDP, /proc/kmsg, /dev/log and internal, and transmits all
>>>> messages to the central server using TCP. The remote node's config
>>>> file follows.
>>>> We've been having intermittent problems where the central server
>>>> suddenly stops logging messages from certain nodes. We noticed that
>>>> very often restarting syslog-ng on the central server would fix the
>>>> condition and logging would carry on.
>>>> However, I discovered a new, rare case where restarting the central
>>>> syslog-ng didn't work. I found out by doing a tcpdump that the remote
>>>> syslog-ng was not sending the messages. I did an strace on the remote
>>>> syslog-ng and it shows that nothing happens after a message has been
>>>> received via "recvfrom()" or "read()". Then I restarted syslog-ng and
>>>> things went back to normal. In the 2nd strace we can see that there is
>>>> a "write()" after the "read()".
>>> I might be guessing here as I don't really know which fd is which,
>>> but I think you've run into an issue that some others have
>>> experienced.
>>> In the case when the traffic does not work, syslog-ng is correctly
>>> polling fd 8 for output. I assumed that fd 8 is the fd of the
>>> connection to the server (it is in the 2nd strace dump).
>>> So syslog-ng is polling for writing out on fd 8, but the poll system
>>> call does not indicate writability. This usually means that the TCP
>>> window is full, i.e. the server does not accept new data.
>>> State-based firewalls often drop inactive connections after a period
>>> of time, and when packets arrive for a connection for which no state
>>> exists, those packets are dropped.
>>> Do you have a firewall between the client and the server?
>> No firewall. Clients and server are all on the same LAN. This is one
>> of our local QA environments.
>> Note that I have seen similar cases where the problem occurred on the
>> server and the output is a file. However I can't currently
>> reproduce it.
> Hmmm, and neither the clients nor the server is running connection
> tracking, right?
> If my initial analysis is correct (an lsof output should confirm this)
> then the problem is that syslog-ng is unable to send to the TCP
> connection and it is the TCP stack of the OS that tells this to
> syslog-ng.
> If this is a QA network, can you run tcpdump to sniff the packets and
> see what the on-wire traffic looks like?
Well I had done netstat on both server and client and it showed the
TCP connection between the server:514 and client as ESTABLISHED.
I had run some tcpdump and all traffic seemed normal except for the
absence of syslog-ng traffic. Traffic was not particularly heavy and
everything else worked normally.
Unfortunately I cannot get any more info because since then I had to
re-enable syslog-ng on the QA system. All I did was restart syslog-ng
on the client node and things went back to normal.