preparing syslog-ng 1.6.6

Balazs Scheidler

31 Jan 2005 31 Jan '05

10:40 a.m.

Hi, As there are some changes since syslog-ng 1.6.5, I'd like to release a new version. Before doing so I'd like to ask you to test the latest snapshot a little bit. The most important and risky change seems to be this one: 2005-01-20 Balazs Scheidler <bazsi@bzorp.balabit> * src/main.c: fix possible log message losing during HUP as syslog-ng was not accepting messages for 1 second and these messages were dropped, processing changed so we still wait 1 second to flush buffers but also accept messages in this interval, so the window of losing messages becomes much smaller. (fixes: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=282771) Everything else is minor, and although they might cause build problems on some platforms, I doubt they'd cause real problems. The latest snapshot is here: http://www.balabit.com/downloads/syslog-ng/1.6/src-snapshot/ The summary of changes can be found in the file named ChangeLog within the tarball. If no show-stopper bugs are found, I'm going to release syslog-ng 1.6.6 this week. Thanks in advance. -- Bazsi

Show replies by date

José Pedro Oliveira

31 Jan 31 Jan

9:33 p.m.

Balazs,

...

As there are some changes since syslog-ng 1.6.5, I'd like to release a new version. Before doing so I'd like to ask you to test the latest snapshot a little bit. The most important and risky change seems to be this one:

2005-01-20 Balazs Scheidler <bazsi@bzorp.balabit>

* src/main.c: fix possible log message losing during HUP as syslog-ng was not accepting messages for 1 second and these messages were dropped, processing changed so we still wait 1 second to flush buffers but also accept messages in this interval, so the window of losing messages becomes much smaller. (fixes: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=282771)

Everything else is minor, and although they might cause build problems on some platforms, I doubt they'd cause real problems.

The above patch corrected the out of order messages when handling the SIGHUP signal: ... Sep 30 20:35:58 pateta syslog-ng[1858]: STATS: dropped 0 Sep 30 20:45:58 pateta syslog-ng[1858]: STATS: dropped 0

...

...
Sep 30 20:51:06 pateta syslog-ng[1858]: new configuration initialized Sep 30 20:51:05 pateta syslog-ng[1858]: SIGHUP received, restarting syslog-ng Sep 30 21:01:06 pateta syslog-ng[1858]: STATS: dropped 0 ...

Previously reported in: https://lists.balabit.hu/pipermail/syslog-ng/2004-September/006416.html

...

The latest snapshot is here:

http://www.balabit.com/downloads/syslog-ng/1.6/src-snapshot/

The summary of changes can be found in the file named ChangeLog within the tarball. If no show-stopper bugs are found, I'm going to release syslog-ng 1.6.6 this week.

There is a small error in the Changelog. The 2004-08-05 entry reports that syslog-ng 1.6.5 needs libol version 0.3.15. It should be libol version 0.3.14 (not 0.3.15). Regards, jpo -- José Pedro Oliveira * mailto: jpo@di.uminho.pt * http://gsd.di.uminho.pt/~jpo * * gpg fingerprint = F9B6 8D87 859D 1C94 48F0 84C0 9749 9EB5 91BD 851B *

Balazs Scheidler

1 Feb 1 Feb

9:37 a.m.

On Mon, 2005-01-31 at 20:33 +0000, José Pedro Oliveira wrote:

...

...
As there are some changes since syslog-ng 1.6.5, I'd like to release a new version. Before doing so I'd like to ask you to test the latest snapshot a little bit. The most important and risky change seems to be this one:

2005-01-20 Balazs Scheidler <bazsi@bzorp.balabit>

* src/main.c: fix possible log message losing during HUP as syslog-ng was not accepting messages for 1 second and these messages were dropped, processing changed so we still wait 1 second to flush buffers but also accept messages in this interval, so the window of losing messages becomes much smaller. (fixes: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=282771)

Everything else is minor, and although they might cause build problems on some platforms, I doubt they'd cause real problems.

The above patch corrected the out of order messages when handling the SIGHUP signal:

... Sep 30 20:35:58 pateta syslog-ng[1858]: STATS: dropped 0 Sep 30 20:45:58 pateta syslog-ng[1858]: STATS: dropped 0

...
...
Sep 30 20:51:06 pateta syslog-ng[1858]: new configuration initialized Sep 30 20:51:05 pateta syslog-ng[1858]: SIGHUP received, restarting syslog-ng Sep 30 21:01:06 pateta syslog-ng[1858]: STATS: dropped 0 ...

Previously reported in: https://lists.balabit.hu/pipermail/syslog-ng/2004-September/006416.html

Yes, the "SIGHUP received" message was delayed until the new configuration was initialized.

...

...
The latest snapshot is here:

http://www.balabit.com/downloads/syslog-ng/1.6/src-snapshot/

The summary of changes can be found in the file named ChangeLog within the tarball. If no show-stopper bugs are found, I'm going to release syslog-ng 1.6.6 this week.

There is a small error in the Changelog. The 2004-08-05 entry reports that syslog-ng 1.6.5 needs libol version 0.3.15. It should be libol version 0.3.14 (not 0.3.15).

Fixed. -- Bazsi

Roberto Nibali

2 Feb 2 Feb

10:14 a.m.

Hello,

...

2005-01-20 Balazs Scheidler <bazsi@bzorp.balabit>

* src/main.c: fix possible log message losing during HUP as syslog-ng was not accepting messages for 1 second and these messages were dropped, processing changed so we still wait 1 second to flush buffers but also accept messages in this interval, so the window of losing messages becomes much smaller. (fixes: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=282771)

We have a similar bug report in this department in our internal bugzilla db. I will now check if this fixes our issue as well or if we need another round of chasing bugs. So please hold of at least until tonight before releasing. I will initiate a test round over here.

...

Everything else is minor, and although they might cause build problems on some platforms, I doubt they'd cause real problems.

I can test static linking here.

...

The latest snapshot is here:

http://www.balabit.com/downloads/syslog-ng/1.6/src-snapshot/

The summary of changes can be found in the file named ChangeLog within the tarball. If no show-stopper bugs are found, I'm going to release syslog-ng 1.6.6 this week.

Excellent, thanks. Best regards, Roberto Nibali, ratz -- ------------------------------------------------------------- addr://Rathausgasse 31, CH-5001 Aarau tel://++41 62 823 9355 http://www.terreactive.com fax://++41 62 823 9356 ------------------------------------------------------------- terreActive AG Wir sichern Ihren Erfolg -------------------------------------------------------------

Roberto Nibali

11:53 a.m.

Hello, [if this email is too confusing, I'll split it up into the 3 subjects]

...

The summary of changes can be found in the file named ChangeLog within the tarball. If no show-stopper bugs are found, I'm going to release syslog-ng 1.6.6 this week.

Ok, the following patch has been applied and works (it's not in the Changelog however), so I can verify this to be fixed: --- macros.gprf 6 May 2004 07:37:10 -0000 1.1.4.1 +++ macros.gprf 13 Dec 2004 18:16:37 -0000 @@ -7,6 +7,7 @@ PRIORITY, M_LEVEL LEVEL, M_LEVEL TAG, M_TAG +PRI, M_PRI DATE, M_DATE FULLDATE, M_FULLDATE ISODATE, M_ISODATE Please have a look at following patch which I'd like to submit: diff -ur syslog-ng-1.6.5+20050202/src/sources.c syslog-ng-1.6.5+20050202-fixed/s rc/sources.c --- syslog-ng-1.6.5+20050202/src/sources.c 2004-08-05 13:35:12.000000000 +0 200 +++ syslog-ng-1.6.5+20050202-fixed/src/sources.c 2005-02-02 11:40:04.0000 00000 +0100 @@ -128,6 +128,9 @@ closure->buffer[closure->pos - 1] == '\0')) closure->pos--; } + if (closure->dgram && !eol && closure->pos == closure->max_log_l ine) { + werror("Message length overflow (ignored the rest)\n"); + } do_handle_line(closure, closure->pos, closure->buffer, salen ? ( abstract_addr *) &sabuf : NULL, salen); closure->pos = 0; return ST_OK | ST_GOON; The issue here is that we would like to get notified somehow that we've exceeded the max_log_line. This is for tuning reasons and also a reason for our support team not to debug log streams for possible corruption. Please consider applying this patch, it adds only minor computing time to the fast path. I see that at about the same line the following code has been added: if (closure->dgram) { /* strip one trailing LF or NUL character */ if (closure->pos > 0 && (closure->buffer[closure->pos - 1] == '\n' || closure->buffer[closure->pos - 1] == '\0')) closure->pos--; } I've dug out a patch in our CVS which somehow tries to achieve roughly the same but using a different manner (more intrusive). Any comment on this: diff -u -r1.37.4.1 sources.c --- sources.c 13 Jan 2004 18:08:02 -0000 1.37.4.1 +++ sources.c 24 Feb 2004 10:37:43 -0000 @@ -115,7 +115,7 @@ closure->pos = 0; return ST_OK | ST_GOON; } - if (!eol && (closure->dgram || closure->pos == closure->max_log_line)) { + if (closure->dgram || (!eol && closure->pos == closure->max_log_line)) { /* we don't have a terminating nl nor \0, and our buffer is full or we are a datagram receiver, when the message is in its own packet. Our bug report regarding this patchlet can be summarised as follows: "syslog-ng assumes that NL or NUL terminates messages regardless of the transport medium used." Please verify and report back if you have time. It helps us cleaning up our local patch repository :). Best regards, Roberto Nibali, ratz -- ------------------------------------------------------------- addr://Rathausgasse 31, CH-5001 Aarau tel://++41 62 823 9355 http://www.terreactive.com fax://++41 62 823 9356 ------------------------------------------------------------- terreActive AG Wir sichern Ihren Erfolg -------------------------------------------------------------

Balazs Scheidler

3 Feb 3 Feb

11:20 a.m.

Hi Roberto, On Wed, 2005-02-02 at 11:53 +0100, Roberto Nibali wrote:

...

Hello,

[if this email is too confusing, I'll split it up into the 3 subjects]

...
The summary of changes can be found in the file named ChangeLog within the tarball. If no show-stopper bugs are found, I'm going to release syslog-ng 1.6.6 this week.

Ok, the following patch has been applied and works (it's not in the Changelog however), so I can verify this to be fixed:

added the changelog entry.

...

Please have a look at following patch which I'd like to submit:

diff -ur syslog-ng-1.6.5+20050202/src/sources.c syslog-ng-1.6.5+20050202-fixed/s rc/sources.c --- syslog-ng-1.6.5+20050202/src/sources.c 2004-08-05 13:35:12.000000000 +0 200 +++ syslog-ng-1.6.5+20050202-fixed/src/sources.c 2005-02-02 11:40:04.0000 00000 +0100 @@ -128,6 +128,9 @@ closure->buffer[closure->pos - 1] == '\0')) closure->pos--; } + if (closure->dgram && !eol && closure->pos == closure->max_log_l ine) { + werror("Message length overflow (ignored the rest)\n"); + } do_handle_line(closure, closure->pos, closure->buffer, salen ? ( abstract_addr *) &sabuf : NULL, salen); closure->pos = 0; return ST_OK | ST_GOON;

The issue here is that we would like to get notified somehow that we've exceeded the max_log_line. This is for tuning reasons and also a reason for our support team not to debug log streams for possible corruption. Please consider applying this patch, it adds only minor computing time to the fast path.

are you sure you want to add the condition as displayed above? that would mean that only dgram sockets will have this check, however it is more common to exceed line lengths with stream oriented sockets. I think you wanted something like if (!eol && closure->pos == closure->max_log_line) { werror("Message length overflow (ignored the rest)\n"); } Please confirm, then I could add the patch.

...

I see that at about the same line the following code has been added:

if (closure->dgram) { /* strip one trailing LF or NUL character */ if (closure->pos > 0 && (closure->buffer[closure->pos - 1] == '\n' || closure->buffer[closure->pos - 1] == '\0')) closure->pos--; }

I've dug out a patch in our CVS which somehow tries to achieve roughly the same but using a different manner (more intrusive). Any comment on this:

diff -u -r1.37.4.1 sources.c --- sources.c 13 Jan 2004 18:08:02 -0000 1.37.4.1 +++ sources.c 24 Feb 2004 10:37:43 -0000 @@ -115,7 +115,7 @@ closure->pos = 0; return ST_OK | ST_GOON; } - if (!eol && (closure->dgram || closure->pos == closure->max_log_line)) { + if (closure->dgram || (!eol && closure->pos == closure->max_log_line)) { /* we don't have a terminating nl nor \0, and our buffer is full or we are a datagram receiver, when the message is in its own packet.

Our bug report regarding this patchlet can be summarised as follows:

"syslog-ng assumes that NL or NUL terminates messages regardless of the transport medium used."

Please verify and report back if you have time. It helps us cleaning up our local patch repository :).

I think the same patch is already applied. At least this is what I see here: if (closure->dgram || (!eol && closure->pos == closure->max_log_line)) { /* we don't have a terminating nl nor \0, and our buffer is full or we are a datagram receiver, when the message is in its own packet. */ if (closure->dgram) { /* strip one trailing LF or NUL character */ if (closure->pos > 0 && (closure->buffer[closure->pos - 1] == '\n' || closure->buffer[closure->pos - 1] == '\0')) closure->pos--; } do_handle_line(closure, closure->pos, closure->buffer, salen ? (abstract_addr *) &sabuf : NULL, salen); closure->pos = 0; return ST_OK | ST_GOON; } -- Bazsi

Roberto Nibali

11:53 a.m.

Hi Bazsi,

...

...
Ok, the following patch has been applied and works (it's not in the Changelog however), so I can verify this to be fixed:

added the changelog entry.

Thanks, it's not for the fame :), just for QA and synchronisation with our internal bugzilla.

...

are you sure you want to add the condition as displayed above? that would mean that only dgram sockets will have this check, however it is more common to exceed line lengths with stream oriented sockets.

I think you wanted something like

if (!eol && closure->pos == closure->max_log_line) { werror("Message length overflow (ignored the rest)\n"); }

Please confirm, then I could add the patch.

Absolutely. I was mislead by the enclosing if (closure->dgram ...) into thinking do_read_line() only handles dgrams. If you would be so kind as to include your version of this patch, I'd be much delighted.

...

I think the same patch is already applied. At least this is what I see here:

if (closure->dgram || (!eol && closure->pos == closure->max_log_line)) { /* we don't have a terminating nl nor \0, and our buffer is full or we are a datagram receiver, when the message is in its own packet. */

Indeed. I wonder what I was looking at ... so no need to further look into this. Excellent: -2 patches on our side :). Thanks for you cooperation, Roberto Nibali, ratz -- echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc

Balazs Scheidler

12:09 p.m.

On Thu, 2005-02-03 at 11:53 +0100, Roberto Nibali wrote:

...

...
are you sure you want to add the condition as displayed above? that would mean that only dgram sockets will have this check, however it is more common to exceed line lengths with stream oriented sockets.

I think you wanted something like

if (!eol && closure->pos == closure->max_log_line) { werror("Message length overflow (ignored the rest)\n"); }

Please confirm, then I could add the patch.

Absolutely. I was mislead by the enclosing if (closure->dgram ...) into thinking do_read_line() only handles dgrams. If you would be so kind as to include your version of this patch, I'd be much delighted.

Added. Though the snapshot I've uploaded 5 minutes ago still does not contain it. Look for it tomorrow (clarified the message somewhat, and moved the check above the trailing NL and NUL character check) -- Bazsi

Roberto Nibali

2 Feb 2 Feb

5:38 p.m.

Hello,

...

As there are some changes since syslog-ng 1.6.5, I'd like to release a new version. Before doing so I'd like to ask you to test the latest snapshot a little bit.

Ok, while browsing through our internal bugzilla DB I've found another bug which is still reproduceable with the latest drop of your sources. The problem is that with a TCP reset from the receiving end of syslog-ng messages we lose one message line while syslog-ng tries to reconnect. Here is a brief instruction on how to reproduce this use case: o You need two nodes, A and B (TARGET_IP). o On node A add the following lines to syslog-ng.conf for the use case options { sync (0); time_reopen (10); log_fifo_size (1000); long_hostnames (on); use_dns (no); use_fqdn (no); create_dirs (yes); keep_hostname (yes); }; source s_int { internal(); unix-stream("/dev/log" max-connections(256)); udp(ip(127.0.0.1) port(514)); file("/proc/kmsg"); }; filter f_test { facility(news) and level(emerg); }; destination d_net { tcp("<TARGET_IP>" port(1513) template("$DATE testpf $MSG\n")); }; log { source(s_int); filter(f_test); destination(d_net); flags(final); } o fire up syslog-ng with this configuration o on node B (TARGET_IP) run the following: netcat -l -p 1513 o on node A type the following logger -p news.emerg "test message1" logger -p news.emerg "test message2" logger -p news.emerg "test message3" logger -p news.emerg "test message4" you should see the messages coming on TARGET_IP: o now stop the netcat (CTRL-C): foo@bar:~ > netcat -l -p 1513 Feb 2 10:44:25 testpf root: test message1 Feb 2 10:44:26 testpf root: test message2 Feb 2 10:44:28 testpf root: test message3 Feb 2 10:44:30 testpf root: test message4 punt! o and generate some more messages on node A: logger -p news.emerg "test message5" logger -p news.emerg "test message6" logger -p news.emerg "test message7" logger -p news.emerg "test message8" o and afterwards restart the netcat on TARGET_IP foo@bar:~ > netcat -l -p 1513 Feb 2 10:44:39 testpf root: test message6 Feb 2 10:44:41 testpf root: test message7 Feb 2 10:44:49 testpf root: test message8 All the queued messages appear except number 5 which has been eaten by some evil bug. I suspect an off-by-one error somewhere but in my life I've already suspected a lot of things ;). I'd be grateful to help debugging this case. Cheers, Roberto Nibali, ratz -- ------------------------------------------------------------- addr://Rathausgasse 31, CH-5001 Aarau tel://++41 62 823 9355 http://www.terreactive.com fax://++41 62 823 9356 ------------------------------------------------------------- terreActive AG Wir sichern Ihren Erfolg -------------------------------------------------------------

Balazs Scheidler

3 Feb 3 Feb

9:53 a.m.

On Wed, 2005-02-02 at 17:38 +0100, Roberto Nibali wrote:

...

Hello,

...
As there are some changes since syslog-ng 1.6.5, I'd like to release a new version. Before doing so I'd like to ask you to test the latest snapshot a little bit.

Ok, while browsing through our internal bugzilla DB I've found another bug which is still reproduceable with the latest drop of your sources. The problem is that with a TCP reset from the receiving end of syslog-ng messages we lose one message line while syslog-ng tries to reconnect. Here is a brief instruction on how to reproduce this use case: [detailed error description removed]

I know about this problem but I'm afraid it is very difficult to solve in the 1.6.x branch, and should be solved by the new development branch. The problem is that syslog-ng polls the destination TCP sockets for writing only, and whenever the remote endpoint closes the TCP connection, it is not indicated in any way (as closing a socket triggers readability and not writability). Whenever a message is to be written to this socket, the first write() syscall succeeds, and only the next write() will return EPIPE, so syslog-ng is able to detect the broken connection. Here's the strace that shows this behaviour: The message aa4 is sent still on the old connection: read(5, "<5>Feb 3 09:42:01 bazsi: aa4\0", 2048) = 30 time(NULL) = 1107420121 time(NULL) = 1107420121 time(NULL) = 1107420121 poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLOUT, revents=POLLOUT}, {fd=3, events=POLLIN}], 3, 100) = 2 read(5, "", 2048) = 0 write(2, "Read EOF on fd 5.\n", 18) = 18 write(2, "Marking fd 5 for closing.\n", 26) = 26 write(4, "<5>Feb 3 09:42:01 src@bzorp bazsi: aa4\n", 40) = 40 time(NULL) = 1107420121 poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=0}, {fd=3, events=POLLIN}], 3, 100) = 1 time(NULL) = 1107420121 Then I terminated netcat, and sent another message: read(5, "<5>Feb 3 09:42:05 bazsi: aa5\0", 2048) = 30 time(NULL) = 1107420125 time(NULL) = 1107420125 time(NULL) = 1107420125 poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLOUT, revents=POLLOUT}, {fd=3, events=POLLIN}], 3, 100) = 2 read(5, "", 2048) = 0 write(2, "Read EOF on fd 5.\n", 18) = 18 write(2, "Marking fd 5 for closing.\n", 26) = 26 write(4, "<5>Feb 3 09:42:05 src@bzorp bazsi: aa5\n", 40) = 40 time(NULL) = 1107420125 poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=0, revents=POLLERR|POLLHUP}, {fd=3, events=POLLIN}], 3, 100) = 2 Even though the connection was broken the write() syscall returned success, and only the next poll() call indicates some problems. time(NULL) = 1107420125 poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=0, revents=POLLERR|POLLHUP}, {fd=3, events=POLLIN}], 3, 100) = 2 write(2, "io_iter(): POLLHUP on inactive fd!\n", 35) = 35 write(2, "Marking fd 4 for closing.\n", 26) = 26 time(NULL) = 1107420125 write(2, "Closing fd 5.\n", 14) = 14 close(5) = 0 write(2, "Connection broken to AF_INET(127.0.0.1:2000), reopening in 1 seconds\n", 69) = 69 time(NULL) = 1107420125 write(2, "Closing fd 4.\n", 14) = 14 close(4) = 0 The current solution in version 1.9 is to poll the fd for reading as well as writing, so we get immediate notification when the connection is broken. The same problem could still occur, though only within a very small timeframe. -- Bazsi

Roberto Nibali

11:31 a.m.

Hi,

...

I know about this problem but I'm afraid it is very difficult to solve in the 1.6.x branch, and should be solved by the new development branch.

Maybe there is a stupid workaround which is not difficult. Just spitting out some ideas, as I clearly haven't read the code in question (to be honest, I couldn't find it).

...

The problem is that syslog-ng polls the destination TCP sockets for writing only,

That's the issue with poll(). Dumb question: Why not using select()?

...

and whenever the remote endpoint closes the TCP connection, it is not indicated in any way (as closing a socket triggers readability and not writability). Whenever a message is to be written to this socket, the first write() syscall succeeds, and only the next write() will return EPIPE, so syslog-ng is able to detect the broken connection.

Also a signal SIGPIPE is invoked. Again, stupid question: Couldn't you use SIGPIPE to inject the write request off the remaining buffer? The way I see syslog-ng functionally working (extremely simplified, please correct), is: o syslog-ng polls on read_fds for incoming syslog messages o syslog-ng maintains a queue or linked list of messages where the newly arrived messages get queued up for delivery. This queue is also used in case a destination is down and needs to be reprobed (reopened) for a connection. o syslog-ng polls on write_fds for outgoing possibilities and if success, sends out in FIFO the queued messages. o TCP close -> eof reaches the socket and gets passed up to syslog-ng which has already sent (write()) one line _but_ not yet lost the buffer it has written. o A new write will return EPIPE and a SIGPIPE signal. The idea is to either pass the EPIPE back to the caller function sending the syslog message chunk or to invoke a signal handler that signals the caller to resend that message again. Or use select() to poll for readability? Or create a thread within the calling stack (same process with access to the write buffer) of the poll function and have it wait on a condition variable which is set upon EPIPE. The thread waits in pthread_cond_wait() and will write the last successfully written buffer again.

...

The message aa4 is sent still on the old connection: read(5, "<5>Feb 3 09:42:01 bazsi: aa4\0", 2048) = 30 time(NULL) = 1107420121 time(NULL) = 1107420121 time(NULL) = 1107420121 poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLOUT, revents=POLLOUT}, {fd=3, events=POLLIN}], 3, 100) = 2 read(5, "", 2048) = 0 write(2, "Read EOF on fd 5.\n", 18) = 18 write(2, "Marking fd 5 for closing.\n", 26) = 26 write(4, "<5>Feb 3 09:42:01 src@bzorp bazsi: aa4\n", 40) = 40 time(NULL) = 1107420121 poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=0}, {fd=3, events=POLLIN}], 3, 100) = 1 time(NULL) = 1107420121

Then I terminated netcat, and sent another message: read(5, "<5>Feb 3 09:42:05 bazsi: aa5\0", 2048) = 30 time(NULL) = 1107420125 time(NULL) = 1107420125 time(NULL) = 1107420125 poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLOUT, revents=POLLOUT}, {fd=3, events=POLLIN}], 3, 100) = 2 read(5, "", 2048) = 0 write(2, "Read EOF on fd 5.\n", 18) = 18 write(2, "Marking fd 5 for closing.\n", 26) = 26 write(4, "<5>Feb 3 09:42:05 src@bzorp bazsi: aa5\n", 40) = 40

I see.

...

time(NULL) = 1107420125 poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=0, revents=POLLERR|POLLHUP}, {fd=3, events=POLLIN}], 3, 100) = 2

Even though the connection was broken the write() syscall returned success, and only the next poll() call indicates some problems.

Yes, because the revents in the pollfd struct indicates events that occured. Kinda looking into the past ;).

...

time(NULL) = 1107420125 poll([{fd=5, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=0, revents=POLLERR|POLLHUP}, {fd=3, events=POLLIN}], 3, 100) = 2 write(2, "io_iter(): POLLHUP on inactive fd!\n", 35) = 35 write(2, "Marking fd 4 for closing.\n", 26) = 26 time(NULL) = 1107420125 write(2, "Closing fd 5.\n", 14) = 14 close(5) = 0 write(2, "Connection broken to AF_INET(127.0.0.1:2000), reopening in 1 seconds\n", 69) = 69 time(NULL) = 1107420125 write(2, "Closing fd 4.\n", 14) = 14 close(4) = 0

The current solution in version 1.9 is to poll the fd for reading as well as writing, so we get immediate notification when the connection is broken. The same problem could still occur, though only within a very small timeframe.

Could you point me to the code in question in 1.6.x so I could check it out for myself, please? Thanks for your patience and excellent work, Roberto Nibali, ratz -- echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc

Balazs Scheidler

12:02 p.m.

On Thu, 2005-02-03 at 11:31 +0100, Roberto Nibali wrote:

...

...
The problem is that syslog-ng polls the destination TCP sockets for writing only,

That's the issue with poll(). Dumb question: Why not using select()?

the same, in the kernel internally both poll and select use the same interfaces, e.g. the same rules apply how they will indicate readability.

...

...
and whenever the remote endpoint closes the TCP connection, it is not indicated in any way (as closing a socket triggers readability and not writability). Whenever a message is to be written to this socket, the first write() syscall succeeds, and only the next write() will return EPIPE, so syslog-ng is able to detect the broken connection.

Also a signal SIGPIPE is invoked. Again, stupid question: Couldn't you use SIGPIPE to inject the write request off the remaining buffer?

I think SIGPIPE is issued at the same time write() returns EPIPE, so I think it also happens at the second write, and the kernel already acknowledged the previous message. I was curious whether this was true, but even when I disabled SIG_IGNing SIGPIPE, it did not occur for some reason. tcp(7) states that SIGPIPEs are only triggered for SO_KEEPALIVE-d sockets, I enabled SO_KEEPALIVE still no SIGPIPEs. Anyway I don't think SIGPIPE would help us here. The real problem is that the kernel returns success for the write() system call, while the connection was already broken. I hackish solution would be to buffer the last line written, and in case of failure push it back to the FIFO queue. This is ugly but could work.

...

The way I see syslog-ng functionally working (extremely simplified, please correct), is:

o syslog-ng polls on read_fds for incoming syslog messages o syslog-ng maintains a queue or linked list of messages where the newly arrived messages get queued up for delivery. This queue is also used in case a destination is down and needs to be reprobed (reopened) for a connection. o syslog-ng polls on write_fds for outgoing possibilities and if success, sends out in FIFO the queued messages.

yes, it is more or less correct, though the same loop is used for read/write polling.

...

o TCP close -> eof reaches the socket and gets passed up to syslog-ng which has already sent (write()) one line _but_ not yet lost the buffer it has written. o A new write will return EPIPE and a SIGPIPE signal.

The idea is to either pass the EPIPE back to the caller function sending the syslog message chunk or to invoke a signal handler that signals the caller to resend that message again. Or use select() to poll for readability? Or create a thread within the calling stack (same process with access to the write buffer) of the poll function and have it wait on a condition variable which is set upon EPIPE. The thread waits in pthread_cond_wait() and will write the last successfully written buffer again.

Again the solution I outlined above might be ok, though the fact that syslog-ng might coalesce outgoing TCP writes it is not very simple. [snip]

...

Could you point me to the code in question in 1.6.x so I could check it out for myself, please?

it is implemented in libol/src/pkt_buffer.c, packet buffers can operate in two modes packet and stream mode. packet mode makes the output routines write a single message at a time, stream mode enables write coalescing. These two modes have two independent flush functions: static int do_flush_stream(struct abstract_buffer *c, struct abstract_write *w) static int do_flush_pkt(struct abstract_buffer *c, struct abstract_write *w) Solving the first would mean to save the last coalesced buffer and push it back to the buffer in case of EPIPE, but in fact this can also introduce platform dependence, I'm not sure all IP stacks behave identically. The other option is to add reading the socket to the poll loop like it is done in 1.9.x (can be done using io_read_write instead of io_write, and drop the connection from the read callback when the socket is readable and read() returns 0 bytes.) -- Bazsi

7551

Age (days ago)

7554

Last active (days ago)

List overview

Download

11 comments

4 participants

participants (4)

Balazs Scheidler
José Pedro Oliveira
Roberto Nibali
Roberto Nibali