syslog-ng blocking on read() from /proc/kmsg
I'm running the Debian 1.6.9-1 package with the 'read from each file descriptor n times or until EAGAIN' patch backported. There seems to be a problem with this behavior when it's applied to /proc/kmsg.

Here's the source we use:

  source s_kmsg { file("/proc/kmsg" log_prefix("kernel: ")); };

What happens is:

1. syslog-ng calls poll(2) and waits.
2. Data appears on /proc/kmsg and causes poll() to return.
3. syslog-ng read(2)s from /proc/kmsg, grabbing some data.
4. Since it hasn't received EAGAIN or read n times from this file descriptor, syslog-ng happily read()s again. Since it's exhausted the available data, the read() blocks. Forever.
5. Meanwhile, other syslog traffic to /dev/log piles up. Eventually, whatever buffer there is on Unix domain sockets fills up and causes other applications on the machine to block, waiting to syslog.
6. The box eventually runs out of file descriptors since other processes (cron is a big culprit because it runs jobs so often) are piling up, holding onto file descriptors and blocking on writes to /dev/log.

The blocking behavior on /proc/kmsg is interesting, since if I'm reading the source correctly, file sources are set O_NONBLOCK:

affile.c:

  static int do_init_affile_source(struct log_handler *c, struct syslog_con
  {
    CAST(affile_source, self, c);
    int fd, flags;

    if (self->flags & AFFILE_PIPE)
      flags = O_RDWR | O_NOCTTY | O_NONBLOCK | O_LARGEFILE;
    else
      flags = O_RDONLY | O_NOCTTY | O_NONBLOCK | O_LARGEFILE;

Any thoughts?

thanks,
john
--
John Morrissey
jwm@horde.net
www.horde.net/
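For reference, steps 3 and 4 above rely on the usual contract of a non-blocking descriptor: once the available data is gone, read(2) fails with EAGAIN instead of sleeping. The following is only a minimal sketch of that contract, using a pipe as a stand-in because it does honor O_NONBLOCK; the helper names `fd_is_nonblocking` and `errno_from_empty_nonblocking_read` are made up for illustration and are not part of syslog-ng. The report above suggests that /proc/kmsg does not behave like this pipe.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: 1 if O_NONBLOCK is set on fd, 0 if not, -1 on error. */
int fd_is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}

/*
 * Hypothetical helper: read from an empty pipe whose read end is
 * non-blocking. Returns the errno set by read() (EAGAIN is expected),
 * 0 if read() unexpectedly did not fail, or -1 if the setup failed.
 */
int errno_from_empty_nonblocking_read(void)
{
    int p[2];
    char buf[64];
    int flags, err;
    ssize_t n;

    if (pipe(p) == -1)
        return -1;

    flags = fcntl(p[0], F_GETFL);
    if (flags == -1 || fcntl(p[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    assert(fd_is_nonblocking(p[0]) == 1);

    /* Nothing has been written, so a blocking fd would sleep here. */
    n = read(p[0], buf, sizeof buf);
    err = (n == -1) ? errno : 0;

    close(p[0]);
    close(p[1]);
    return err;
}
```

Calling `errno_from_empty_nonblocking_read()` on Linux should yield EAGAIN. Running `fd_is_nonblocking()` on the descriptor opened for /proc/kmsg would only confirm that the flag was requested, not that the kernel honors it.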
On Thu, 2006-04-27 at 15:48 -0400, John Morrissey wrote:
> I'm running the Debian 1.6.9-1 package with the 'read from each file descriptor n times or until EAGAIN' patch backported. There seems to be a problem with this behavior when it's applied to /proc/kmsg. Here's the source we use:
>
> source s_kmsg { file("/proc/kmsg" log_prefix("kernel: ")); };
/proc/kmsg does not support nonblocking mode, I already encountered this earlier. This could be called a kernel bug, however a workaround needs to be implemented.

Backing out the change will make the performance improvements that we implemented void. A possible solution is to avoid this multi-read behaviour for file sources.

Can you check if this patch fixes it?

Index: affile.c
===================================================================
RCS file: /var/cvs/syslog-ng/syslog-ng/src/affile.c,v
retrieving revision 1.61.4.5
diff -u -r1.61.4.5 affile.c
--- affile.c 5 Nov 2005 14:53:12 -0000 1.61.4.5
+++ affile.c 30 Apr 2006 20:06:49 -0000
@@ -162,7 +162,7 @@
   io_set_nonblocking(fd);
   lseek(fd, 0, SEEK_END);
   self->src = io_read(make_io_fd(cfg->backend, fd, ol_string_use(self->name)),
-    make_log_reader(0, self->prefix, cfg->log_msg_size, self->pad_size, cfg->check_hostname ? LF_CHECK_HOSTNAME : 0, cfg->bad_hostname, c),
+    make_log_reader(0, self->prefix, cfg->log_msg_size, self->pad_size, LF_NO_MULTI_READ | (cfg->check_hostname ? LF_CHECK_HOSTNAME : 0), cfg->bad_hostname, c),
     NULL);
   REMEMBER_RESOURCE(cfg->resources, &self->src->super.super);
   return ST_OK | ST_GOON;
Index: log.h
===================================================================
RCS file: /var/cvs/syslog-ng/syslog-ng/src/log.h,v
retrieving revision 1.19
diff -u -r1.19 log.h
--- log.h 8 Jan 2003 09:31:37 -0000 1.19
+++ log.h 30 Apr 2006 20:06:50 -0000
@@ -57,6 +57,8 @@
 #define LF_INTERNAL 0x0001
 #define LF_MARK 0x0002
 #define LF_LOCAL 0x0004
+/* hack, this is piggybacked to msg_flags instead of using a separate argument of make_log_reader */
+#define LF_NO_MULTI_READ 0x0008
 #define LF_CHECK_HOSTNAME 0x0100
 #define LF_USER_FLAGS 0xff00
Index: sources.c
===================================================================
RCS file: /var/cvs/syslog-ng/syslog-ng/src/sources.c,v
retrieving revision 1.37.4.9
diff -u -r1.37.4.9 sources.c
--- sources.c 13 Mar 2006 23:30:26 -0000 1.37.4.9
+++ sources.c 30 Apr 2006 20:06:50 -0000
@@ -87,8 +87,9 @@
   char sabuf[256];
   socklen_t salen = sizeof(sabuf);
   int fetch_count = 0;
+  int fetch_max = (closure->msg_flags & LF_NO_MULTI_READ) ? 1 : 30;
-  while (fetch_count < 30) {
+  while (fetch_count < fetch_max) {
     if (!closure->dgram) {
       if (closure->pad_size)

--
Bazsi
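The effect of the fetch_max change in sources.c can be sketched in isolation. This is not the syslog-ng reader itself, just a rough stand-alone illustration under stated assumptions: `fetch_messages` and `demo_fetch` are hypothetical names, the 4-byte chunk size is arbitrary, and a non-blocking pipe stands in for a well-behaved source. Only the LF_NO_MULTI_READ value and the limit of 30 are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#define LF_NO_MULTI_READ 0x0008 /* flag value from the log.h hunk */
#define FETCH_LIMIT 30          /* multi-read cap from the sources.c hunk */

/*
 * Hypothetical stand-in for the reader loop: issue at most fetch_max
 * read() calls of `chunk` bytes each. Stops early on EAGAIN or EOF.
 * Returns the number of reads that delivered data, or -1 on error.
 */
int fetch_messages(int fd, unsigned msg_flags, size_t chunk)
{
    char buf[256];
    int fetch_count = 0;
    int fetch_max = (msg_flags & LF_NO_MULTI_READ) ? 1 : FETCH_LIMIT;

    assert(chunk > 0 && chunk <= sizeof buf);

    while (fetch_count < fetch_max) {
        ssize_t n = read(fd, buf, chunk);

        if (n > 0) {
            fetch_count++;          /* got data, maybe try again */
            continue;
        }
        if (n == 0)
            break;                  /* end of file */
        if (errno == EINTR)
            continue;               /* interrupted, retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;                  /* drained; only seen if fd honors O_NONBLOCK */
        return -1;
    }
    return fetch_count;
}

/* Hypothetical demo: queue 12 bytes in a non-blocking pipe, then fetch. */
int demo_fetch(unsigned msg_flags)
{
    int p[2];
    int flags, got;

    if (pipe(p) == -1)
        return -1;

    flags = fcntl(p[0], F_GETFL);
    if (flags == -1 ||
        fcntl(p[0], F_SETFL, flags | O_NONBLOCK) == -1 ||
        write(p[1], "aaaabbbbcccc", 12) != 12) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    got = fetch_messages(p[0], msg_flags, 4);
    close(p[0]);
    close(p[1]);
    return got;
}
```

With no flags, `demo_fetch(0)` performs three successful reads and stops on the fourth when the pipe reports EAGAIN. With `demo_fetch(LF_NO_MULTI_READ)` the loop returns after the first read, so it never issues the follow-up read() that would block on a descriptor that ignores O_NONBLOCK. The trade-off is that such sources go back through poll() once per read.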
Hi Bazsi--

On Sun, Apr 30, 2006 at 10:08:36PM +0200, Balazs Scheidler wrote:
> /proc/kmsg does not support nonblocking mode, I already encountered this earlier.
> [snip]
> A possible solution is to avoid this multi-read behaviour for file sources. Can you check if this patch fixes it?
Yup, that works perfectly. Thanks!

john
--
John Morrissey
jwm@horde.net
www.horde.net/
participants (2)
- Balazs Scheidler
- John Morrissey