Repeated log messages?
Recently, we've noticed a few machines are filling up their log filesystems with duplicate log entries. At first, I thought this behavior was caused by running out of disk space (i.e., the machine runs out of disk and syslog-ng does some sort of buffering, and as disk space oscellates between a few hundred bytes available and completely full, syslog-ng is writing this buffered log data out to disk but never removing the log entries from its buffer), but I can't reliably reproduce it. I also noticed that sometimes this would happen when the remote syslog sub was unavailable, but I can't reliably reproduce this behavior by blocking UDP syslog traffic directed at the remote syslog hub. I'd also wonder how syslog-ng would know that UDP syslog traffic is being dropped, unless the nature of the traffic block is such that an ICMP message (host/port unreachable, etc.) is returned to the sending host. I've poked and prodded at syslog-ng, attempting to reliably reproduce this behavior, but haven't been able to. I'm not sure if either of these two events (out of disk space, loss of network access to the syslog hub) are simple coincidences or actually cause/contribute to the behavior. The odd part is that the duplicate log entries seem to be logged forever, such as if syslog-ng was in an infinite loop. Additionally, each duplicate log entry has an additional space each time it's duplicated. For example: Original log entries: Feb 12 06:28:12 rdr01 su[21942]: Successful su for nobody by root Feb 12 06:28:12 rdr01 su[21942]: + ??? root:nobody First round of duplicates, with a single trailing space: Feb 12 06:28:12 localhost su[21942]: Successful su for nobody by root Feb 12 06:28:12 localhost su[21942]: + ??? root:nobody Second round of duplicates, with two trailing spaces: Feb 12 06:28:12 localhost su[21942]: Successful su for nobody by root Feb 12 06:28:12 localhost su[21942]: + ??? root:nobody Third round of duplicates, with three trailing spaces: Feb 12 06:28:12 localhost su[21942]: Successful su for nobody by root Feb 12 06:28:12 localhost su[21942]: + ??? root:nobody [and so on] This is with syslog-ng 2.0.0-1etch1. john -- John Morrissey _o /\ ---- __o jwm@horde.net _-< \_ / \ ---- < \, www.horde.net/ __(_)/_(_)________/ \_______(_) /_(_)__
On Fri, 2008-02-29 at 12:59 -0500, John Morrissey wrote:
Recently, we've noticed a few machines are filling up their log filesystems with duplicate log entries. At first, I thought this behavior was caused by running out of disk space (i.e., the machine runs out of disk and syslog-ng does some sort of buffering, and as disk space oscellates between a few hundred bytes available and completely full, syslog-ng is writing this buffered log data out to disk but never removing the log entries from its buffer), but I can't reliably reproduce it.
I also noticed that sometimes this would happen when the remote syslog sub was unavailable, but I can't reliably reproduce this behavior by blocking UDP syslog traffic directed at the remote syslog hub. I'd also wonder how syslog-ng would know that UDP syslog traffic is being dropped, unless the nature of the traffic block is such that an ICMP message (host/port unreachable, etc.) is returned to the sending host.
I've poked and prodded at syslog-ng, attempting to reliably reproduce this behavior, but haven't been able to. I'm not sure if either of these two events (out of disk space, loss of network access to the syslog hub) are simple coincidences or actually cause/contribute to the behavior.
The odd part is that the duplicate log entries seem to be logged forever, such as if syslog-ng was in an infinite loop. Additionally, each duplicate log entry has an additional space each time it's duplicated. For example:
Original log entries: Feb 12 06:28:12 rdr01 su[21942]: Successful su for nobody by root Feb 12 06:28:12 rdr01 su[21942]: + ??? root:nobody
First round of duplicates, with a single trailing space: Feb 12 06:28:12 localhost su[21942]: Successful su for nobody by root Feb 12 06:28:12 localhost su[21942]: + ??? root:nobody
Second round of duplicates, with two trailing spaces: Feb 12 06:28:12 localhost su[21942]: Successful su for nobody by root Feb 12 06:28:12 localhost su[21942]: + ??? root:nobody
Third round of duplicates, with three trailing spaces: Feb 12 06:28:12 localhost su[21942]: Successful su for nobody by root Feb 12 06:28:12 localhost su[21942]: + ??? root:nobody
[and so on]
Hm... is it possible that syslog-ng somehow loops back to itself? I remember a problem which caused syslog-ng to resolve a destination host name as 0.0.0.0 if DNS was unavailable, effectively causing it to send messages to localhost. -- Bazsi
On Fri, Feb 29, 2008 at 07:08:14PM +0100, Balazs Scheidler wrote:
On Fri, 2008-02-29 at 12:59 -0500, John Morrissey wrote:
Recently, we've noticed a few machines are filling up their log filesystems with duplicate log entries. At first, I thought this behavior was caused by running out of disk space (i.e., the machine runs out of disk and syslog-ng does some sort of buffering, and as disk space oscellates between a few hundred bytes available and completely full, syslog-ng is writing this buffered log data out to disk but never removing the log entries from its buffer), but I can't reliably reproduce it.
Hm... is it possible that syslog-ng somehow loops back to itself? I remember a problem which caused syslog-ng to resolve a destination host name as 0.0.0.0 if DNS was unavailable, effectively causing it to send messages to localhost.
That's exactly it; syslog-ng has a UDP socket open and is sending data to its own UDP listener socket. I'd like to backport a targeted patch if possible, and haven't looked through the git repo yet; do you have an idea when this was fixed, or know the git revision it was fixed in? john -- John Morrissey _o /\ ---- __o jwm@horde.net _-< \_ / \ ---- < \, www.horde.net/ __(_)/_(_)________/ \_______(_) /_(_)__
John Morrissey wrote:
On Fri, Feb 29, 2008 at 07:08:14PM +0100, Balazs Scheidler wrote:
On Fri, 2008-02-29 at 12:59 -0500, John Morrissey wrote:
Recently, we've noticed a few machines are filling up their log filesystems with duplicate log entries. At first, I thought this behavior was caused by running out of disk space (i.e., the machine runs out of disk and syslog-ng does some sort of buffering, and as disk space oscellates between a few hundred bytes available and completely full, syslog-ng is writing this buffered log data out to disk but never removing the log entries from its buffer), but I can't reliably reproduce it.
Hm... is it possible that syslog-ng somehow loops back to itself? I remember a problem which caused syslog-ng to resolve a destination host name as 0.0.0.0 if DNS was unavailable, effectively causing it to send messages to localhost.
That's exactly it; syslog-ng has a UDP socket open and is sending data to its own UDP listener socket. I'd like to backport a targeted patch if possible, and haven't looked through the git repo yet; do you have an idea when this was fixed, or know the git revision it was fixed in?
This is related to the tcp destination not being followed if it changes IP address, but maintains its IP name. If you can't resolve the hostname, the retry mechanism should should start from the beginning, resolve the name and attempt to open. If the resolve fails, then the retry delay is invoked. If the connection can not established, then the retry delay is invoked. It should never fail to resolve the address, but return a valid IP address (0.0.0.0 or 127.0.0.1). We go around this by binding to port 0.0.0.0:1514 for local stuff. Who would send stuff to the local syslog via UDP? This may not work for your situation. Evan Rempel
On Wed, 2008-03-12 at 14:07 -0400, John Morrissey wrote:
On Fri, Feb 29, 2008 at 07:08:14PM +0100, Balazs Scheidler wrote:
On Fri, 2008-02-29 at 12:59 -0500, John Morrissey wrote:
Recently, we've noticed a few machines are filling up their log filesystems with duplicate log entries. At first, I thought this behavior was caused by running out of disk space (i.e., the machine runs out of disk and syslog-ng does some sort of buffering, and as disk space oscellates between a few hundred bytes available and completely full, syslog-ng is writing this buffered log data out to disk but never removing the log entries from its buffer), but I can't reliably reproduce it.
Hm... is it possible that syslog-ng somehow loops back to itself? I remember a problem which caused syslog-ng to resolve a destination host name as 0.0.0.0 if DNS was unavailable, effectively causing it to send messages to localhost.
That's exactly it; syslog-ng has a UDP socket open and is sending data to its own UDP listener socket. I'd like to backport a targeted patch if possible, and haven't looked through the git repo yet; do you have an idea when this was fixed, or know the git revision it was fixed in?
It was fixed here, e.g. it was included in 2.0.6 Author: Balazs Scheidler <bazsi@balabit.hu> 2007-10-23 17:04:46 Committer: Balazs Scheidler <bazsi@balabit.hu> 2007-10-23 17:04:46 Parent: 2288151086e2557c8a30e37c1b4a37a7b373fffb (fixed implicit declaration of strtol warning) Child: 1661ca746f13eacf3c5e0d08640dd00c71548d11 (fixed ipv6 address resolution as only parts of the sockaddr_in6 structure were copied) Follows: v2.0.5 unset destination/bind address if a name resolution error occurs (fixes: #12393) 2007-10-23 Balazs Scheidler <bazsi@balabit.hu> * src/afinet.c (afinet_resolve_name): changed prototype to require a pointer to GSockAddr *, set the passed reference to NULL whenever a resolve error occurs (fixes: #12393), (various functions): changed all afinet_resolve_name invocations to use the new prototype -- Bazsi
participants (3)
-
Balazs Scheidler
-
Evan Rempel
-
John Morrissey