[syslog-ng] Re: syslog-ng memory leak patch

Fri, 7 Jul 2000 10:21:50 -0400

Sorry for the delay in getting this reply out.

Balazs Scheidler on Thu 29/06 19:38 +0200:
> > I noticed your post on the syslog-ng archives concerning memory
> > leaks in the software.  I was wondering if you had met any success
> > reports with this patch.  I just discovered my DMZ loghost has a
> > 70MB syslog-ng image in core after running only a couple of weeks;
> > there is most *definitely* a memory leak.  I was going to track it
> > down myself, but thought I would check the list archives
> > first...indeed, your patch was posted.
> > 
> > Before I implement the patch, I was wondering if you've had success
> > running some syslog-ngs for more than a week, with at least half a
> > dozen hosts logging to it, and kept the memory image within
> > reasonable limits, with this patch.  If it's not a complete leak
> > preventer, I'd just as soon do a full audit of the source, since I
> > simply must use syslog-ng in production.  Either that, or we could
> > split the work or something...
> 
> That patch fixed some leak bugs, but as it seems not all of them. I'm
> now trying to reproduce the problem.

I just found my syslog-ng loghost at 99% CPU utilization with a 50MB
VSZ after running for only one week.

It's easy to reproduce.  Here is my configuration file for the master
loghost that displays the problems:

        options {
                chain_hostnames(yes);
        };

        source "self" {
                internal();
        };

        source "local" {
                unix-stream("/dev/log");
        };

        source "intranet" {
                udp();
        };

        source "dmz" {
                tcp();
        };

        destination "file" {
                file("/var/log/syslog/$HOST/$PROGRAM/$YEAR.$MONTH.$DAY"
                     create_dirs(yes)
                     dir_perm(0555)
                     perm(0660)
                     owner(root)
                     group(qdn));
        };

        log {
                source("self");
                source("local");
                source("intranet");
                source("dmz");

                destination("file");
        };

Each of about 12 internal machines runs a conventional UDP syslogd whose
configuration file simply sends *.debug to loghost (the machine with the
configuration above).

Then in the DMZ, and in each remote office separated by a WAN, there is
one relaying loghost running syslog-ng with this configuration:

        options {
                chain_hostnames(yes);
        };

        source "self" {
                internal();
        };

        source "local" {
                unix-stream("/dev/log");
        };

        source "dmz" {
                udp();
        };

        destination "intranet" {
                tcp(intraloghost.roc);
        };

        log {
                source("self");
                source("local");
                source("dmz");

                destination("intranet");
        };

A few machines log to each relay, and the relay passes the logs over TCP
to the main loghost, which is the one that displays the problems.

Note that this never occurs on the relaying machines; they have pretty
much constant memory and CPU usage (very low).
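
To compare the relays against the central loghost over time, something
like the following could sample the process's VSZ at intervals.  This is
only a rough sketch: it assumes a ps that accepts "-o vsz= -p PID" (as on
Linux and the BSDs), and the helper names vsz_kib and grew are made up
for illustration; they are not part of syslog-ng.

```python
import os
import subprocess


def vsz_kib(pid):
    """Return the VSZ of `pid` in KiB as reported by ps, or None if unavailable."""
    try:
        result = subprocess.run(
            ["ps", "-o", "vsz=", "-p", str(pid)],
            capture_output=True,
            text=True,
        )
    except OSError:
        # ps is not installed or could not be executed.
        return None
    out = result.stdout.strip()
    # Empty or non-numeric output means the process is gone or ps differs.
    return int(out) if out.isdigit() else None


def grew(samples, factor=2.0):
    """True if the last sample is at least `factor` times the first one."""
    return len(samples) >= 2 and samples[-1] >= factor * samples[0]


if __name__ == "__main__":
    # Demonstration on this script's own PID; substitute syslog-ng's PID.
    print(vsz_kib(os.getpid()))
```

Run from cron with syslog-ng's PID and append the values to a file; a
steadily climbing series on the central loghost next to a flat one on a
relay would confirm the pattern described above.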

Also note that on the central loghost, even though I have a destination
like this:

        destination "file" {
                file("/var/log/syslog/$HOST/$PROGRAM/$YEAR.$MONTH.$DAY"
                     create_dirs(yes)
                     dir_perm(0555)
                     perm(0660)
                     owner(root)
                     group(qdn));
        };

I still get files like this:

        /var/log/syslog/loghost/2000.06.25
        /var/log/syslog/loghost/2000.06.26
        /var/log/syslog/loghost/2000.06.27
        /var/log/syslog/loghost/2000.06.28
        /var/log/syslog/loghost/2000.06.29
        /var/log/syslog/loghost/2000.06.30

(They are missing a $PROGRAM component.  In traces I see it writing to
/var/log/syslog//loghost/file; there should probably be code to handle
the case where $PROGRAM is empty.)  The files invariably contain only
strings like this:

        Jun 25 00:20:52 local@loghost
        Jun 25 00:50:58 local@loghost
        Jun 25 01:21:04 local@loghost
        Jun 25 01:51:10 local@loghost
        Jun 25 02:21:16 local@loghost
        Jun 25 02:51:21 local@loghost
        Jun 25 03:21:27 local@loghost

I have no idea where these are coming from, but they're only on the
central loghost.  This may or may not be related to the memory leak
problem.
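
To illustrate the empty $PROGRAM case: the sketch below is NOT
syslog-ng's code, just a stand-in that uses Python's string.Template to
mimic the macro expansion.  It shows how an empty $PROGRAM leaves a
doubled slash, which the filesystem treats as a single one, so the file
lands directly under the $HOST directory as in the listing above.  The
"unknown" fallback is my own suggestion for how the case might be
handled, not existing behaviour.

```python
import posixpath
from string import Template

# Same path template as in the destination above.
TEMPLATE = "/var/log/syslog/$HOST/$PROGRAM/$YEAR.$MONTH.$DAY"


def expand(template, macros):
    """Substitute macro values into the template as-is."""
    return Template(template).substitute(macros)


def expand_with_fallback(template, macros, fallback="unknown"):
    """Substitute macros, replacing empty values with a placeholder."""
    filled = {name: (value if value else fallback) for name, value in macros.items()}
    return Template(template).substitute(filled)


macros = {"HOST": "loghost", "PROGRAM": "", "YEAR": "2000", "MONTH": "06", "DAY": "25"}

raw = expand(TEMPLATE, macros)
print(raw)                      # /var/log/syslog/loghost//2000.06.25
print(posixpath.normpath(raw))  # /var/log/syslog/loghost/2000.06.25
print(expand_with_fallback(TEMPLATE, macros))
# /var/log/syslog/loghost/unknown/2000.06.25
```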