[syslog-ng] syslog-ng 3.3.1 quits at reload
Balazs Scheidler
bazsi at balabit.hu
Wed Dec 21 15:06:27 CET 2011
On Wed, 2011-11-30 at 15:21 -0500, Michael Hocke wrote:
> On Nov 29, 2011, at 1:10 PM, Michael Hocke wrote:
>
> > Sorry for following up on my own posting. I just checked the OpenSolaris sources and it is definitely something specific to Solaris. The /dev/log device can only be cloned up to LOG_NUMCLONES times, which is defined as 16 in <sys/log.h>. Every open call on /dev/log clones the device, and since /dev/log does not seem to be closed when a HUP is received, the clones accumulate until, after the 16th HUP signal, syslog-ng tries to execute open64("/dev/log", O_RDONLY|O_NONBLOCK|O_NOCTTY), which fails with an ENXIO error.
>
> I put in a quick and dirty fix for the problem I am seeing. I made sure that the /dev/log device is being closed in afstreams_sd_deinit(). I am very sure this is not the right place and it should probably be closer to LogTransport but that would probably require some extra flags and methods since there is no log_transport_deinit() and a closing of the fd for all kinds of transports is probably not desired. Anyway, here is the "fix" I put in place:
>
> # diff afstreams.c afstreams.c.orig
> 37d36
> < gint log_fd;
> 166a166
> > gint fd;
> 173,174c173,174
> < self->log_fd = open(self->dev_filename->str, O_RDONLY | O_NOCTTY | O_NONBLOCK);
> < if (self->log_fd != -1)
> ---
> > fd = open(self->dev_filename->str, O_RDONLY | O_NOCTTY | O_NONBLOCK);
> > if (fd != -1)
> 178c178
> < g_fd_set_cloexec(self->log_fd, TRUE);
> ---
> > g_fd_set_cloexec(fd, TRUE);
> 181c181
> < if (ioctl(self->log_fd, I_STR, &ioc) < 0)
> ---
> > if (ioctl(fd, I_STR, &ioc) < 0)
> 187c187
> < close(self->log_fd);
> ---
> > close(fd);
> 190,191c190,191
> < g_fd_set_nonblock(self->log_fd, TRUE);
> < self->reader = log_reader_new(log_proto_dgram_server_new(log_transport_streams_new(self->log_fd), self->reader_options.msg_size, 0));
> ---
> > g_fd_set_nonblock(fd, TRUE);
> > self->reader = log_reader_new(log_proto_dgram_server_new(log_transport_streams_new(fd), self->reader_options.msg_size, 0));
> 207c207
> < evt_tag_int("fd", self->log_fd),
> ---
> > evt_tag_int("fd", fd),
> 211c211
> < close(self->log_fd);
> ---
> > close(fd);
> 239,240d238
> < if (self->log_fd != -1)
> < close (self->log_fd);
>
> I pretty much store the fd of the log device in AFStreamsSourceDriver and use that in afstreams_sd_deinit().
Hi,
Having checked the code, the fd leak can only happen if the
LogTransport instance doesn't get freed.
LogTransport is freed by LogProto, LogProto by LogReader, and LogReader
by AFStreamsSourceDriver,
i.e. the structure is AFStreamsSourceDriver->reader->proto->transport
So along with the fd leak, this seems to be a memory leak too.
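To illustrate the ownership chain, here is a minimal sketch in C. The
Transport, Proto and Reader types and all function names are simplified,
hypothetical stand-ins, not the real syslog-ng structures or API. The
point is only that the fd gets closed as a side effect of freeing the
top of the chain, so if the reader is never freed, the fd stays open.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical, simplified stand-ins for LogTransport, LogProto, LogReader. */
typedef struct { int fd; } Transport;
typedef struct { Transport *transport; } Proto;
typedef struct { Proto *proto; } Reader;

/* Build the chain reader->proto->transport around an already open fd. */
static Reader *reader_new(int fd)
{
  Transport *t = malloc(sizeof(*t));
  Proto *p = malloc(sizeof(*p));
  Reader *r = malloc(sizeof(*r));

  assert(t && p && r);
  t->fd = fd;
  p->transport = t;
  r->proto = p;
  return r;
}

/* Each level frees the level below it; only the transport closes the fd. */
static void transport_free(Transport *t) { close(t->fd); free(t); }
static void proto_free(Proto *p) { transport_free(p->transport); free(p); }
static void reader_free(Reader *r) { proto_free(r->proto); free(r); }

/* Returns 1 if fd is still open in this process, 0 otherwise. */
static int fd_is_open(int fd) { return fcntl(fd, F_GETFD) != -1; }
```

In this sketch, calling reader_free() is the only way the fd is ever
closed, which matches the observation that a leaked fd implies a leaked
reader.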
Just by the looks of it, the reader instance is deinited properly in
afstreams_sd_deinit() and then unrefed in afstreams_sd_free().
Can you perhaps check whether the log_pipe_unref() call in
afstreams_sd_free() is invoked?
Once there, can you also check whether the ref count of the reader
actually goes down to zero and log_reader_free() is invoked as well?
Looking at the code further, the culprit seems to be the
LogReader->control member, which holds a reference to the source driver,
i.e. there's a circular reference between the source driver and the
reader.
As a result, neither the source driver nor the reader is freed, which
should explain the fd leak.
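The effect of such a cycle can be sketched in C as follows. This is a
toy model under stated assumptions: Driver, Reader, live_objects and all
functions here are hypothetical and only mimic the refcounting pattern,
they are not syslog-ng code. driver_deinit() shows one possible way to
break the cycle (dropping the back reference at deinit time), which is
an assumption for illustration and not necessarily the right fix for
syslog-ng.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Driver Driver;
typedef struct Reader Reader;

/* Hypothetical stand-ins: the reader keeps a counted back reference. */
struct Reader { int ref; Driver *control; };
struct Driver { int ref; Reader *reader; };

static int live_objects = 0;   /* objects allocated but not yet freed */

static void driver_unref(Driver *d);

static void reader_unref(Reader *r)
{
  assert(r->ref > 0);
  if (--r->ref == 0)
    {
      if (r->control)
        driver_unref(r->control);
      free(r);
      live_objects--;
    }
}

static void driver_unref(Driver *d)
{
  assert(d->ref > 0);
  if (--d->ref == 0)
    {
      if (d->reader)
        reader_unref(d->reader);
      free(d);
      live_objects--;
    }
}

/* Create a driver and a reader that hold references to each other. */
static Driver *driver_new_with_reader(void)
{
  Driver *d = calloc(1, sizeof(*d));
  Reader *r = calloc(1, sizeof(*r));

  assert(d && r);
  d->ref = 1;          /* the caller's reference */
  r->ref = 1;          /* the driver's reference to the reader */
  d->reader = r;
  r->control = d;
  d->ref++;            /* the reader's back reference to the driver */
  live_objects += 2;
  return d;
}

/* Break the cycle: the reader gives up its back reference.
 * Must be called while the caller still holds its own reference. */
static void driver_deinit(Driver *d)
{
  if (d->reader && d->reader->control)
    {
      d->reader->control = NULL;
      driver_unref(d);
    }
}
```

Without driver_deinit(), the caller's driver_unref() only takes the
driver's count from 2 to 1, so neither object is freed. With the back
reference dropped first, the same driver_unref() reaches zero and frees
both.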
I have to go now, but I'll think a bit more about the issue.
--
Bazsi