Thanks for tracking down this issue. The problem might be a difference between libc or kernel versions. Earlier libcs (glibc 2.0) emulated poll using select, but that is not the case here, since strace reports the call as poll. RH 6.2 and 7.1 may, however, ship different kernel versions that behave differently.
The problem is that RH 6.2 returns only POLLERR without POLLHUP, while syslog-ng expects POLLHUP for closed sessions. This patch may fix this problem and create new ones; however, at 22:43 this is the best I can do:
Well, I gave this patch a try, but it doesn't seem to fix the problem. I haven't walked through it with gdb with the patch in place yet, but the messages indicating a reconnect attempt in 10 seconds only flashed by once, which is how it was behaving before. I will take another look at it tomorrow morning and see if I can figure out more of what is happening.
Matthew M. Copeland
Index: io.c
===================================================================
RCS file: /var/cvs/libol/src/io.c,v
retrieving revision 1.25
diff -u -r1.25 io.c
--- io.c 2001/08/26 21:28:18 1.25
+++ io.c 2001/09/05 20:39:02
@@ -231,7 +231,7 @@
         if (!fd->super.alive)
             continue;
 
-        if (fds[i].revents & POLLHUP) {
+        if (fds[i].revents & (POLLHUP|POLLERR|POLLNVAL)) {
             if (fd->want_read && fd->read)
                 READ_FD(fd);
             else if (fd->want_write && fd->write)
@@ -246,10 +246,12 @@
             close_fd(fd, CLOSE_PROTOCOL_FAILURE);
             continue;
         }
+        /*
         if (fds[i].revents & (POLLNVAL | POLLERR)) {
             close_fd(fd, CLOSE_POLL_FAILED);
             continue;
         }
+        */
         if (fds[i].revents & POLLOUT)
             if (fd->want_write && fd->write)
                 WRITE_FD(fd);
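For anyone who wants to check what their own libc/kernel combination reports, here is a minimal standalone sketch of the test the patch switches to. It is not libol code: revents_is_dead() and poll_once() are hypothetical helpers written only for this illustration, and the sketch assumes nothing beyond the standard poll() interface.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper (not part of libol): nonzero when revents carries
 * any of the three flags the patched condition treats as "session gone". */
static int revents_is_dead(short revents)
{
    return (revents & (POLLHUP | POLLERR | POLLNVAL)) != 0;
}

/* Hypothetical helper: poll a single descriptor for readability without
 * blocking. Returns the raw revents bits, or -1 if poll() itself failed. */
static int poll_once(int fd)
{
    struct pollfd p;

    p.fd = fd;
    p.events = POLLIN;
    p.revents = 0;
    if (poll(&p, 1, 0) < 0)
        return -1;
    return p.revents;
}
```

Calling poll_once() on a descriptor whose peer has gone away and printing the result should show directly whether a given system sets POLLHUP, POLLERR, or both, which is the difference suspected between RH 6.2 and 7.1.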
-- You may be sure that when a man begins to call himself a "realist," he is preparing to do something he is secretly ashamed of doing. -- Sydney Harris