[syslog-ng]preparing syslog-ng 1.6.6

Roberto Nibali syslog-ng@lists.balabit.hu
Thu, 03 Feb 2005 11:31:49 +0100


Hi,

> I know about this problem but I'm afraid it is very difficult to solve
> in the 1.6.x branch, and should be solved by the new development branch.

Maybe there is a stupid workaround which is not difficult. Just spitting 
out some ideas, as I clearly haven't read the code in question (to be 
honest, I couldn't find it).

> The problem is that syslog-ng polls the destination TCP sockets for
> writing only,

That's the issue with poll(). Dumb question: Why not using select()?

> and whenever the remote endpoint closes the TCP
> connection, it is not indicated in any way (as closing a socket triggers
> readability and not writability). Whenever a message is to be written to
> this socket, the first write() syscall succeeds, and only the next
> write() will return EPIPE, so syslog-ng is able to detect the broken
> connection.

Also a signal SIGPIPE is invoked. Again, stupid question: Couldn't you 
use SIGPIPE to inject the write request off the remaining buffer? The 
way I see syslog-ng functionally working (extremely simplified, please 
correct), is:

o syslog-ng polls on read_fds for incoming syslog messages
o syslog-ng maintains a queue or linked list of messages where the newly 
arrived messages get queued up for delivery. This queue is also used in 
case a destination is down and needs to be reprobed (reopened) for a 
connection.
o syslog-ng polls on write_fds for outgoing possibilities and if 
success, sends out in FIFO the queued messages.
o TCP close -> eof reaches the socket and gets passed up to syslog-ng 
which has already sent (write()) one line _but_ not yet lost the buffer 
it has written.
o A new write will return EPIPE and a SIGPIPE signal.

The idea is to either pass the EPIPE back to the caller function sending 
the syslog message chunk or to invoke a signal handler that signals the 
caller to resend that message again. Or use select() to poll for 
readability? Or create a thread within the calling stack (same process 
with access to the write buffer) of the poll function and have it wait 
on a condition variable which is set upon EPIPE. The thread waits in 
pthread_cond_wait() and will write the last successfully written buffer 
again.

> The message aa4 is sent still on the old connection:
> read(5, "<5>Feb  3 09:42:01 bazsi: aa4\0", 2048) = 30
> time(NULL)                              = 1107420121
> time(NULL)                              = 1107420121
> time(NULL)                              = 1107420121
> poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLOUT, revents=POLLOUT}, {fd=3, events=POLLIN}], 3, 100) = 2
> read(5, "", 2048)                       = 0
> write(2, "Read EOF on fd 5.\n", 18)     = 18
> write(2, "Marking fd 5 for closing.\n", 26) = 26
> write(4, "<5>Feb  3 09:42:01 src@bzorp bazsi: aa4\n", 40) = 40
> time(NULL)                              = 1107420121
> poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=0}, {fd=3, events=POLLIN}], 3, 100) = 1
> time(NULL)                              = 1107420121
> 
> Then I terminated netcat, and sent another message:
> read(5, "<5>Feb  3 09:42:05 bazsi: aa5\0", 2048) = 30
> time(NULL)                              = 1107420125
> time(NULL)                              = 1107420125
> time(NULL)                              = 1107420125
> poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLOUT, revents=POLLOUT}, {fd=3, events=POLLIN}], 3, 100) = 2
> read(5, "", 2048)                       = 0
> write(2, "Read EOF on fd 5.\n", 18)     = 18
> write(2, "Marking fd 5 for closing.\n", 26) = 26
> write(4, "<5>Feb  3 09:42:05 src@bzorp bazsi: aa5\n", 40) = 40

I see.

> time(NULL)                              = 1107420125
> poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=0, revents=POLLERR|POLLHUP}, {fd=3, events=POLLIN}], 3, 100) = 2
> 
> Even though the connection was broken the write() syscall returned
> success, and only the next poll() call indicates some problems.

Yes, because the revents in the pollfd struct indicates events that 
occured. Kinda looking into the past ;).

> time(NULL)                              = 1107420125
> poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=0, revents=POLLERR|POLLHUP}, {fd=3, events=POLLIN}], 3, 100) =
> 2
> write(2, "io_iter(): POLLHUP on inactive fd!\n", 35) = 35
> write(2, "Marking fd 4 for closing.\n", 26) = 26
> time(NULL)                              = 1107420125
> write(2, "Closing fd 5.\n", 14)         = 14
> close(5)                                = 0
> write(2, "Connection broken to AF_INET(127.0.0.1:2000), reopening in 1 seconds\n", 69) = 69
> time(NULL)                              = 1107420125
> write(2, "Closing fd 4.\n", 14)         = 14
> close(4)                                = 0
> 
> The current solution in version 1.9 is to poll the fd for reading as 
> well as writing, so we get immediate notification when the connection is
> broken. The same problem could still occur, though only within a very
> small timeframe.

Could you point me to the code in question in 1.6.x so I could check it 
out for myself, please?

Thanks for your patience and excellent work,
Roberto Nibali, ratz
-- 
echo 
'[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc