Re: [syslog-ng]preparing syslog-ng 1.6.6

3 Feb 2005

      Hi,
...
I know about this problem but I'm afraid it is very difficult to solve
in the 1.6.x branch, and should be solved by the new development branch.
Maybe there is a stupid workaround which is not difficult. Just spitting 
out some ideas, as I clearly haven't read the code in question (to be 
honest, I couldn't find it).
...
The problem is that syslog-ng polls the destination TCP sockets for
writing only,
That's the issue with poll(). Dumb question: Why not using select()?
...
and whenever the remote endpoint closes the TCP
connection, it is not indicated in any way (as closing a socket triggers
readability and not writability). Whenever a message is to be written to
this socket, the first write() syscall succeeds, and only the next
write() will return EPIPE, so syslog-ng is able to detect the broken
connection.
Also a signal SIGPIPE is invoked. Again, stupid question: Couldn't you 
use SIGPIPE to inject the write request off the remaining buffer? The 
way I see syslog-ng functionally working (extremely simplified, please 
correct), is:

o syslog-ng polls on read_fds for incoming syslog messages
o syslog-ng maintains a queue or linked list of messages where the newly 
arrived messages get queued up for delivery. This queue is also used in 
case a destination is down and needs to be reprobed (reopened) for a 
connection.
o syslog-ng polls on write_fds for outgoing possibilities and if 
success, sends out in FIFO the queued messages.
o TCP close -> eof reaches the socket and gets passed up to syslog-ng 
which has already sent (write()) one line _but_ not yet lost the buffer 
it has written.
o A new write will return EPIPE and a SIGPIPE signal.

The idea is to either pass the EPIPE back to the caller function sending 
the syslog message chunk or to invoke a signal handler that signals the 
caller to resend that message again. Or use select() to poll for 
readability? Or create a thread within the calling stack (same process 
with access to the write buffer) of the poll function and have it wait 
on a condition variable which is set upon EPIPE. The thread waits in 
pthread_cond_wait() and will write the last successfully written buffer 
again.
...
The message aa4 is sent still on the old connection:
read(5, "<5>Feb  3 09:42:01 bazsi: aa4\0", 2048) = 30
time(NULL)                              = 1107420121
time(NULL)                              = 1107420121
time(NULL)                              = 1107420121
poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLOUT, revents=POLLOUT}, {fd=3, events=POLLIN}], 3, 100) = 2
read(5, "", 2048)                       = 0
write(2, "Read EOF on fd 5.\n", 18)     = 18
write(2, "Marking fd 5 for closing.\n", 26) = 26
write(4, "<5>Feb  3 09:42:01 src@bzorp bazsi: aa4\n", 40) = 40
time(NULL)                              = 1107420121
poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=0}, {fd=3, events=POLLIN}], 3, 100) = 1
time(NULL)                              = 1107420121
Then I terminated netcat, and sent another message:
read(5, "<5>Feb  3 09:42:05 bazsi: aa5\0", 2048) = 30
time(NULL)                              = 1107420125
time(NULL)                              = 1107420125
time(NULL)                              = 1107420125
poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLOUT, revents=POLLOUT}, {fd=3, events=POLLIN}], 3, 100) = 2
read(5, "", 2048)                       = 0
write(2, "Read EOF on fd 5.\n", 18)     = 18
write(2, "Marking fd 5 for closing.\n", 26) = 26
write(4, "<5>Feb  3 09:42:05 src@bzorp bazsi: aa5\n", 40) = 40
I see.
...
time(NULL)                              = 1107420125
poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=0, revents=POLLERR|POLLHUP}, {fd=3, events=POLLIN}], 3, 100) = 2
Even though the connection was broken the write() syscall returned
success, and only the next poll() call indicates some problems.
Yes, because the revents in the pollfd struct indicates events that 
occured. Kinda looking into the past ;).
...
time(NULL)                              = 1107420125
poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=0, revents=POLLERR|POLLHUP}, {fd=3, events=POLLIN}], 3, 100) =
2
write(2, "io_iter(): POLLHUP on inactive fd!\n", 35) = 35
write(2, "Marking fd 4 for closing.\n", 26) = 26
time(NULL)                              = 1107420125
write(2, "Closing fd 5.\n", 14)         = 14
close(5)                                = 0
write(2, "Connection broken to AF_INET(127.0.0.1:2000), reopening in 1 seconds\n", 69) = 69
time(NULL)                              = 1107420125
write(2, "Closing fd 4.\n", 14)         = 14
close(4)                                = 0
The current solution in version 1.9 is to poll the fd for reading as 
well as writing, so we get immediate notification when the connection is
broken. The same problem could still occur, though only within a very
small timeframe.
Could you point me to the code in question in 1.6.x so I could check it 
out for myself, please?

Thanks for your patience and excellent work,
Roberto Nibali, ratz
-- 
echo 
'[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc