On Fri, 2007-06-29 at 01:19 -0700, Christopher Cowart wrote:
Hello,
I've had a recent problem where my dhcpd logs appear to be arriving and rearriving until /var fills up.
My log traffic looks like this (all over udp/syslog and IPSec):
------------+ all logs +----------+ dhcpd logs +------------+ dhcp servers|---------->|log server|------------>|other server|----| ------------+ +----------+ +------------+ | ^ | | all logs except dhcpd | -------------------------------------|
This configuration has been in effect and working as expected for about 11 months now.
Approximately 2 weeks ago, I was reviewing my morning cronspam and discovered that /var had filled up on my log server. The dhcpd log was the obvious culprit, expanded to fill my entire log partition. I'm attaching sample entries so you can take a look if you are so inclined; I apologize for sanitizing out the data in the messages, but they're both protected data and not important to the discussion at hand.
There should be *no* dhcpd log messages originating on the log server. When logging is working properly, the host field in the entries properly reflect their dhcp servers of origin. When logging becomes fubar'd, I see a lot of messages with localhost in that field, all with increasing amounts of whitespace in them. syslog-ng spins the CPU to 100% and proceeds to fill up /var with these "feedback" entries.
This has happened 3 times in the past 2 weeks. It always begins at midnight. I checked my crons -- I have an hourly execution of logrotate. It is configured to rotate most of my logs (dhcpd included) once a day, which happens at midnight. It politely runs `pkill -HUP syslog-ng' after truncating the log files; this works successfully for all the other logfiles on the log server and has been working fine for dhcpd for the previous 11 months as well.
The other likely candidate for trouble is the "other server" from the diagram above that logs back to the log server. I'm relatively certain that this (alone) is not the cause of the problem. I was painstakingly careful to not generate any feedback loops in its configuration. The incoming udp messages have a dedicated source, which is written to a dedicated destination.
I'm having trouble piecing two and two together here. I don't think either of these things are causing the problem, but they're the only obvious culprits. Neither of them, if misconfigured, would explain why the duplicates all appear to come from the localhost. Why would the udp source on the logserver be receiving the messages it sends, but only recently and only immediately after midnight?
I haven't had the chance to do a tcpdump yet to see the actual network traffic; twice now I've gotten there after /var was already full and tonight when I did finally catch it in the act, I was too concerned about stopping it to troubleshoot it. Stopping and starting syslog-ng returns my logging into a normal state of affairs.
$ uname -a | FreeBSD pitfall.rescomp.berkeley.edu 6.1-RELEASE-p3 FreeBSD | 6.1-RELEASE-p3 #0: Mon Aug 21 15:42:31 PDT 2006 | root@new-pitfall.rescomp.berkeley.edu:/usr/obj/usr/src/sys/RCBSD_1 | i386 $ syslog-ng -V | syslog-ng 2.0.0
First time-stamp in fubar'd logfile: Jun 29 00:00:07 First localhost entry: Jun 29 00:02:19 (Line #256) Last time-stamp in fubar'd logfile: Jun 29 00:08:46 Number of "good" entries: 1266 Number of localhost entries: 1123083 Total number of entries: 1124349
If anyone has tips for continuing my troubleshooting process, please let me know. I'd be tempted to upgrade to 2.0.2 (it's in ports), but given the changelogs didn't mention anything about random feedback loops, I didn't figure it would help much.
Thanks in advance for your time,
I did not fix "random feedback loops" and never even encountered one. So if it is indeed a syslog-ng problem, then upgrading will probably not help. Can you post your configuration file? -- Bazsi