[syslog-ng] down network during reload leads to blocking syslog() calls

Scheidler, Balázs balazs.scheidler at balabit.com
Tue Apr 18 06:33:37 UTC 2017


Hi,

DNS resolution happens in the main thread during a reload, so for now the
only workarounds are indeed to:

   1. use an IP address in the config
   2. hard-code the name in /etc/hosts

We used to resolve the name only during startup, but another use-case is to
pick up IP address changes on reload, which conflicts with your use-case. It
would also be possible to add asynchronous DNS resolution to syslog-ng and
resolve names after initialization has finished, but that's far from a
trivial effort, and I can't undertake it right now.
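
As a rough illustration only (Python, not syslog-ng's actual C code),
resolving off the main thread with a bounded wait could look like the sketch
below. The helper name resolve_with_timeout, the pool size, and the
one-second timeout are assumptions made up for this example.

```python
import socket
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Hypothetical helper, not part of syslog-ng. The lookup runs on a worker
# thread, so the caller waits at most `timeout` seconds.
_resolver_pool = ThreadPoolExecutor(max_workers=2)


def resolve_with_timeout(host, port, timeout=1.0):
    """Return an (address, port) tuple, or None if the lookup fails or is too slow."""
    future = _resolver_pool.submit(
        socket.getaddrinfo, host, port, socket.AF_INET, socket.SOCK_DGRAM
    )
    try:
        infos = future.result(timeout=timeout)
    except (FutureTimeout, socket.gaierror):
        # The caller gives up here; a slow lookup keeps running on its
        # worker thread until the system resolver itself times out.
        return None
    # Each entry is (family, type, proto, canonname, sockaddr).
    return infos[0][4]
```

A caller would treat None as "destination unavailable for now" and retry
later instead of blocking. Note that abandoned lookups still occupy a worker
until the resolver returns, so a real implementation would need to bound or
cancel them.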

-- 
Bazsi

On Tue, Apr 18, 2017 at 12:31 AM, Nathan Parrish <nparrish at purestorage.com>
wrote:

> Hi there,
> we are using syslog-ng 3.6.4 on Linux.  we had an incident where the
> network port used for remote logging was inadvertently disabled for a
> couple hours, and during this time a critical process which logged via
> syslog() calls experienced threads hanging for seconds to minutes at a
> time.
>
> after some investigation and reproduction efforts, it looks like the
> problem is a combination of:
> - remote logging to server specified by hostname (vs. IP)
> - loss of management interface (e.g. that used for syslog traffic and DNS
> resolution)
> - log rotate triggering syslog-ng reload
>
> when I reproduce the problem (seems to take some considerable amount of
> logging load and a minute or so), I can see that writes to /dev/log hang:
> root@escort-ct0:/etc# strace -ttT logger test
> ^C
> 19:38:46.064830 execve("/usr/bin/logger", ["logger", "test"], [/* 26 vars */]) = 0 <0.000135>
> ...
> 19:38:46.069757 socket(PF_LOCAL, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 1 <0.000015>
> 19:38:46.069806 connect(1, {sa_family=AF_LOCAL, sun_path="/dev/log"}, 110) = 0 <0.000010>
> 19:38:46.069850 sendto(1, "<13>Mar 20 19:38:46 root: test", 30, MSG_NOSIGNAL, NULL, 0) = 30 <11.627190>
> 19:38:57.697110 close(1) = 0 <0.000490>
>
> meanwhile I see the main thread timing out name resolution:
> 19:48:18.214182 stat("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=148, ...}) = 0 <0.000008>
> 19:48:18.214234 socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 31 <0.000012>
> 19:48:18.214279 connect(31, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.15.83.11")}, 16) = 0 <0.000024>
> 19:48:18.214332 poll([{fd=31, events=POLLOUT}], 1, 0) = 1 ([{fd=31, revents=POLLOUT}]) <0.000006>
> 19:48:18.214370 sendto(31, "\343\235\1\0\0\1\0\0\0\0\0\0\rescdev-syslog\3dev\vpurestorage\3com\0\0\1\0\1", 51, MSG_NOSIGNAL, NULL, 0) = 51 <0.000047>
> 19:48:18.214444 poll([{fd=31, events=POLLIN}], 1, 5000) = 0 (Timeout) <5.005040>
>
> this repeats with the other DNS servers configured.
>
> I cannot reproduce the issue if I configure an IP instead of hostname for
> the syslog server.
>
> we are using UDP and no flow-control configuration, with the expectation
> that syslog() will never block.  and, indeed, until the reload, it works as
> expected.  however, after reload I guess we re-establish the socket for the
> remote connection, requiring us to resolve the hostname; I don’t pretend to
> understand how this ultimately backs up processing of /dev/log (note that
> internal() and kernel messages are coming through just fine during this
> time).
>
> I guess my question is whether this is a known/expected issue, and/or if
> there’s a resolution other than specifying remote syslog servers by IP or
> hardcoding the name resolution in /etc/hosts and pointing to that with
> dns-cache-hosts().  basically, I’d like syslog-ng to simply give up if it
> can’t resolve remote syslog server hostnames, rather than allow this to
> interfere with servicing of /dev/log, with ramifications to callers.
>
>
> thanks in advance,
> nathan