Question about reopening sockets
Hi all,

I'm writing a small script that will be fed data through a pipe by syslog-ng 2.0.9. The script itself is responsible for creating the socket that syslog-ng writes to. I'm looking for a syslog-ng option that reopens the socket if the script dies or quits, or if the socket gets removed for some reason (like an epic admin fail).

According to my testing with lsof, syslog-ng opens the socket only when it first writes to that destination, which is good. After the socket gets removed, it shows up in lsof as:

syslog-ng 31666 root 17u FIFO 253,2 24624 /tmp/pw.sock (deleted)

The deleted flag does not go away after the socket is recreated (e.g. when the script is restarted) and a test message is sent to the destination. Is there a config option for reopening sockets?

Thanks,
Zoltan HERPAI
Hi Zoltan,

syslog-ng doesn't detect the broken connection until it tries to write to the socket. When the write fails, it tries to reconnect to the socket and keeps trying until it succeeds. You can control how many seconds it waits between retries with the time_reopen option.

You should consider migrating to 2.1 or, even better, to 3.0; both are backward-compatible with the configuration file syntax of syslog-ng 2.0. AFAIK syslog-ng 2.0 isn't maintained anymore, and syslog-ng 2.1 will slowly get "abandoned" as well.

Regards,
Sandor

(In reply to Zoltan HERPAI's message of Fri, Jul 3, 2009 at 10:16 PM.)
______________________________________________________________________________ Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng FAQ: http://www.campin.net/syslog-ng/faq.html
Hi Sandor,

According to the lsof output, syslog-ng does detect immediately when the socket gets removed (it shows up as deleted), even before I send another test log message to the destination. Also, even if I lower time_reopen to 1, the socket never gets reopened, no matter how long I wait. time_reap is set, and I can see that, for example, user.log gets closed and reopened with a new file handle, so some kind of close/open does happen.

(This syslog-ng is the one provided by Ubuntu, 2.0.9-4.1, if that is of any help.)

I agree that I should move up to syslog-ng 2.1 or 3.0. But most distributions ship this version (2.0.7 or 2.0.9), and this app will be released into the wild at some point, so I would rather not depend on syslog-ng features or versions that are not generally available.

Thanks,
Zoltan HERPAI

(In reply to Sandor Geller's message of Sat, 4 Jul 2009.)
Hi,

When lsof shows something as (deleted), that doesn't mean that the app holding the fd open knows that the file/socket/whatever has been deleted. Quite the contrary. If the app listening on the socket is still running, nothing happens: syslog-ng forwards the logs and the app receives them. If the listening app exits, syslog-ng detects it and logs messages like these:

EOF occurred while idle; fd='17'
Connection broken; time_reopen='120'

So my guess is that you simply deleted/recreated the socket but didn't terminate the listener.

Ah, I've just realised that your app isn't listening on a socket at all. You're using a named pipe instead. This makes things somewhat more complicated, because pipes are handled and backed by the kernel. syslog-ng opens pipes in RW mode, so on Linux it can read from and write to a pipe even when nothing is writing or reading on the other end. Since syslog-ng itself then holds a read end open, the kernel never raises SIGPIPE for these writes, and syslog-ng ignores SIGPIPE anyway. This means that syslog-ng doesn't detect problems with pipes. If the pipe gets removed and the reading side vanishes, then the pipe destination has just become a sinkhole.

time_reap could help when there is no activity; however, in syslog-ng 2.0 and 2.1 time_reap works only for real files, not for named pipes. The cause is that cfg-grammar.y sets the AFFILE_NO_EXPAND flag for pipes:

    dest_afpipe_params
        : string
          {
            last_driver = affile_dd_new($1, AFFILE_NO_EXPAND | AFFILE_PIPE);
            free($1);
            last_writer_options = &((AFFileDestDriver *) last_driver)->writer_options;
            last_writer_options->flush_lines = 0;
          }
          dest_afpipe_options { $$ = last_driver; }
        ;

and affile.c contains a conditional in the reap setup code:

    if ((self->flags & AFFILE_NO_EXPAND) == 0) {
        self->reap_timer = g_timeout_add_full(G_PRIORITY_DEFAULT, self->time_reap * 1000 / 2, affile_dd_reap, self, NULL);
        self->writer_hash = persist_config_fetch(persist, affile_dd_format_persist_name(self));
        if (self->writer_hash)
            g_hash_table_foreach(self->writer_hash, affile_dd_reuse_writer, self);
    } else {
        self->writer = persist_config_fetch(persist, affile_dd_format_persist_name(self));
        if (self->writer) {
            affile_dw_set_owner(self->writer, self);
            log_pipe_init(&self->writer->super, NULL, NULL);
        }
    }

So although I think pipes should get reaped as well, this won't happen. BTW this has been fixed in syslog-ng 3: the AFFILE_NO_EXPAND flag has been removed from the pipe setup code to allow pipe names containing macros. A side effect of this change is that pipes can get reaped as well.

So, for now I'd recommend using a socket instead of a named pipe, or sending a HUP signal to syslog-ng after the pipe has changed.

Regards,
Sandor

(In reply to Zoltan HERPAI's message of Sat, Jul 4, 2009 at 3:01 PM.)
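The "(deleted)" entry in lsof can be reproduced outside syslog-ng. The following is a minimal Python sketch, assuming Linux FIFO semantics; the helper name fifo_inodes_after_recreate() is hypothetical and the language is chosen only for illustration, since the thread does not say what the reader script is written in. It opens a FIFO read-write, removes and recreates the name, and compares the inode held by the open descriptor with the inode now found at the path.

```python
import os
import tempfile


def fifo_inodes_after_recreate(path):
    """Open a FIFO read-write, delete and recreate it, and report inode numbers.

    Returns (inode_held_by_fd, inode_now_at_path).
    """
    os.mkfifo(path)
    # On Linux, O_RDWR on a FIFO succeeds even with no peer on the other end.
    fd = os.open(path, os.O_RDWR | os.O_NONBLOCK)
    try:
        os.unlink(path)   # lsof would now list this fd as "(deleted)"
        os.mkfifo(path)   # a restarted reader recreates the same name
        held = os.fstat(fd).st_ino
        current = os.stat(path).st_ino
        # The write still succeeds, but it lands in the old, unreachable pipe.
        os.write(fd, b"test message\n")
        return held, current
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        held, current = fifo_inodes_after_recreate(os.path.join(tmp, "pw.sock"))
        print("inode held by open fd:", held)
        print("inode at path now:    ", current)
```

The two inode numbers differ: the writer keeps its descriptor on the original pipe, so a reader that opens the recreated path never sees the data.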
On Sun, 2009-07-05 at 01:20 +0200, Sandor Geller wrote:
> Hi,
>
> When lsof shows something as (deleted), that doesn't mean that the app holding the fd open knows that the file/socket/whatever has been deleted. Quite the contrary. If the app listening on the socket is still running, nothing happens: syslog-ng forwards the logs and the app receives them. If the listening app exits, syslog-ng detects it and logs messages like these:
>
> EOF occurred while idle; fd='17'
> Connection broken; time_reopen='120'
>
> So my guess is that you simply deleted/recreated the socket but didn't terminate the listener.
>
> Ah, I've just realised that your app isn't listening on a socket at all. You're using a named pipe instead. This makes things somewhat more complicated, because pipes are handled and backed by the kernel. syslog-ng opens pipes in RW mode, so on Linux it can read from and write to a pipe even when nothing is writing or reading on the other end. Since syslog-ng itself then holds a read end open, the kernel never raises SIGPIPE for these writes, and syslog-ng ignores SIGPIPE anyway. This means that syslog-ng doesn't detect problems with pipes.

The assumptions syslog-ng makes about named pipes are as follows:

1) the named pipe can be created either by syslog-ng or by the application reading it
2) the named pipe exists continuously, i.e. it does not get recreated
3) it does not matter if the reader gets stopped; the contents of the named pipe remain in kernel memory (until the reader is restarted)
4) if the pipe buffer in the kernel becomes full, syslog-ng will either enable flow control or start dropping messages.

So the basic model is that you create the named pipe once, and your application never removes it; it just opens it in O_RDONLY mode. Recent versions of syslog-ng (3.0) can even create named pipes on their own, so you don't have to create them before starting syslog-ng.
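The reader side of that model might look like the following Python sketch. It is an illustration under assumptions, not code from the thread: ensure_fifo() and read_lines() are hypothetical names, and the max_lines parameter exists only so the loop can be stopped in a demonstration. The FIFO is created only if it is missing, is never removed, and is reopened read-only each time the writer goes away.

```python
import os
import stat


def ensure_fifo(path):
    """Create the FIFO only if it is missing; never remove an existing one."""
    try:
        os.mkfifo(path, 0o600)
    except FileExistsError:
        if not stat.S_ISFIFO(os.stat(path).st_mode):
            raise RuntimeError(f"{path} exists but is not a FIFO")


def read_lines(path, handle, max_lines=None):
    """Pass each line from the FIFO to handle(), reopening after every EOF."""
    ensure_fifo(path)
    seen = 0
    while max_lines is None or seen < max_lines:
        # Opening read-only blocks until some writer has the FIFO open.
        with open(path, "r") as fifo:
            for line in fifo:
                handle(line.rstrip("\n"))
                seen += 1
                if max_lines is not None and seen >= max_lines:
                    return
        # EOF: every writer has closed. Loop and reopen the same FIFO,
        # which stays in place the whole time.
```

A call such as read_lines("/tmp/pw.sock", print) would then run until interrupted; restarting the script reuses the existing FIFO instead of deleting and recreating it.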
> If the pipe gets removed and the reading side vanishes, then the pipe destination has just become a sinkhole.

Of course, only after the kernel buffer fills up. Basically, you only have a limited time to restart your reader: if the traffic is high and the reader is not present, syslog-ng may start flow-controlling or dropping messages.
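How much data fits before that point can be measured directly. The sketch below is a rough illustration, assuming Linux; bytes_until_full() is a hypothetical helper, and O_NONBLOCK is used here only so the demonstration stops instead of hanging (the thread does not say which flags syslog-ng itself sets). It writes to a FIFO that has no separate reader until the kernel refuses more data.

```python
import os
import tempfile


def bytes_until_full(path, chunk=4096):
    """Write to a reader-less FIFO until the kernel buffer is full.

    Returns the number of bytes accepted before a write would block.
    """
    os.mkfifo(path)
    # O_RDWR lets the open succeed without a reader on the other end.
    fd = os.open(path, os.O_RDWR | os.O_NONBLOCK)
    total = 0
    try:
        while True:
            try:
                total += os.write(fd, b"x" * chunk)
            except BlockingIOError:
                # Buffer full: a writer must now wait (flow control) or drop.
                return total
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print("bytes buffered:", bytes_until_full(os.path.join(tmp, "demo.fifo")))
```

The number printed is the window available for restarting the reader, expressed in bytes of log traffic; the exact value depends on the kernel's pipe capacity.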
> time_reap could help when there is no activity; however, in syslog-ng 2.0 and 2.1 time_reap works only for real files, not for named pipes. The cause is that cfg-grammar.y sets the AFFILE_NO_EXPAND flag for pipes:
>
>     dest_afpipe_params
>         : string
>           {
>             last_driver = affile_dd_new($1, AFFILE_NO_EXPAND | AFFILE_PIPE);
>             free($1);
>             last_writer_options = &((AFFileDestDriver *) last_driver)->writer_options;
>             last_writer_options->flush_lines = 0;
>           }
>           dest_afpipe_options { $$ = last_driver; }
>         ;
>
> and affile.c contains a conditional in the reap setup code:
>
>     if ((self->flags & AFFILE_NO_EXPAND) == 0) {
>         self->reap_timer = g_timeout_add_full(G_PRIORITY_DEFAULT, self->time_reap * 1000 / 2, affile_dd_reap, self, NULL);
>         self->writer_hash = persist_config_fetch(persist, affile_dd_format_persist_name(self));
>         if (self->writer_hash)
>             g_hash_table_foreach(self->writer_hash, affile_dd_reuse_writer, self);
>     } else {
>         self->writer = persist_config_fetch(persist, affile_dd_format_persist_name(self));
>         if (self->writer) {
>             affile_dw_set_owner(self->writer, self);
>             log_pipe_init(&self->writer->super, NULL, NULL);
>         }
>     }
>
> So although I think pipes should get reaped as well, this won't happen. BTW this has been fixed in syslog-ng 3: the AFFILE_NO_EXPAND flag has been removed from the pipe setup code to allow pipe names containing macros. A side effect of this change is that pipes can get reaped as well.

In syslog-ng 2.x, pipes didn't need reaping, as pipe names couldn't contain macros.

> So, for now I'd recommend using a socket instead of a named pipe, or sending a HUP signal to syslog-ng after the pipe has changed.

Unix domain sockets work similarly to tcp() destinations, so if those are the semantics you want, please use them instead of pipe().

--
Bazsi
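For the socket route, the listening side could be sketched as below. This is a minimal Python illustration under assumptions: make_listener() and serve_one() are hypothetical names, and it presumes syslog-ng is configured with a stream-type unix socket destination pointing at the same path. Unlike a FIFO, the socket file can be recreated on every start, because the writer notices the disconnect and reconnects.

```python
import os
import socket


def make_listener(path):
    """Bind a fresh unix-stream listening socket at path."""
    # Remove a stale socket file from a previous run; bind() fails otherwise.
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(1)
    return srv


def serve_one(srv, handle):
    """Accept one connection and pass each received line to handle()."""
    conn, _ = srv.accept()
    with conn, conn.makefile("r", encoding="utf-8", errors="replace") as stream:
        for line in stream:
            handle(line.rstrip("\n"))
    # Reaching this point means the writer disconnected; call serve_one()
    # again to accept the next connection.
```

A long-running reader would call make_listener() once at startup and then serve_one() in a loop. How quickly the writer reconnects after a restart is governed on the syslog-ng side by time_reopen, as described earlier in the thread.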
participants (3)
- Balazs Scheidler
- Sandor Geller
- Zoltan HERPAI