"Error resolving hostname" for UDP Destination
Hi,

I've experienced problems starting syslog-ng related to hostname lookup on UDP destinations for a while now and am finally getting back to looking into some kind of mitigation for the exceptions.

Here's what I see: if I have a destination configured with an endpoint that requires name lookup and the name lookup fails, I see the following error when attempting to start (or reload) syslog-ng:

    Starting syslog-ng: Error resolving hostname; host='test.nacc.netacquire.dom'
    Error initializing message pipeline;

Unfortunately, this results in the entire process failing to start. If syslog-ng is otherwise running and I attempt to reload the configuration (with the problematic name present) I don't see any errors logged, but I do see that the process is now defunct (and is no longer running):

    root 18674     1 0 10:03 ? 00:00:00 supervising syslog-ng
    root 18701 18674 0 10:04 ? 00:00:00 [syslog-ng] <defunct>

Here's the simple destination in question:

    destination d_NAaudit_Prio {
        file("/var/log/netacquire/audit_log" template(t_NAFormat_Prio));
        udp("test.nacc.netacquire.dom" port(514) template(t_NAFormat_Prio));
    };

I'm wondering if there might be some way to mitigate this exception. For example:

1. Log the error and allow syslog-ng to otherwise continue to start/function without the destination being configured/used.
2. As above, and suspend the destination but allow it to continue to attempt to probe the name lookup so that if the resolution does eventually succeed the destination can be resumed.

Are either of the above options feasible/available? Are there any other ideas for how to mitigate this exception so that syslog-ng can be started/reloaded successfully (other than detecting this event a priori and removing the destination from the configuration before attempts to start/reload)?

Thanks in advance!
-David

PS: Here's my version information:

    # syslog-ng --version
    syslog-ng 3.5.4.1
    Installer-Version: 3.5.4.1
    Revision: ssh+git://algernon@git.balabit/var/scm/git/syslog-ng/syslog-ng-ose--mainline--3.5#master#4090ee62163780ae68a0c83cfdc23998c904fe97
    Compile-Date: May 19 2015 17:04:36
    Available-Modules: confgen,afstomp,afmongodb,linux-kmsg-format,cryptofuncs,afsocket,csvparser,dbparser,afamqp,afsocket-tls,syslogformat,afsocket-notls,afprog,basicfuncs,afuser,affile,system-source
    Enable-Debug: off
    Enable-GProf: off
    Enable-Memtrace: off
    Enable-IPv6: on
    Enable-Spoof-Source: off
    Enable-TCP-Wrapper: on
    Enable-Linux-Caps: off
    Enable-Pcre: on
Hi David,

On Fri, Jun 12, 2015 at 05:09:03PM +0000, David Hauck wrote:
> Starting syslog-ng: Error resolving hostname; host='test.nacc.netacquire.dom'
> Error initializing message pipeline;
>
> Unfortunately, this results in the entire process failing to start.

This looks a hell of a lot like a resolved issue [1] on GitHub.

[1] https://github.com/balabit/syslog-ng/issues/318
Fabien Wernli (via syslog-ng-bounces@lists.balabit.hu) wrote:
> This looks a hell of a lot like a resolved issue [1] on GitHub.

Yes, indeed! And this looks to have been included in v3.6.3 - I'll give this a try.

Thanks,
-David
Hi Fabien,

On Monday, June 15, 2015 7:08 AM, I wrote:
> Yes, indeed! And this looks to have been included in v3.6.3 - I'll give this a try.

I've finally had a chance to test this and see that it indeed fixes the outright error. However, I now see the following messages appear every 10s:

    20150821 15:54:37.994 err syslog(syslog-ng):Error resolving hostname; host='tester'
    20150821 15:54:37.994 err syslog(syslog-ng):Initiating connection failed, reconnecting; time_reopen='10'

Is there a way to change the timeout? Is this the time-reopen global option? Besides DNS lookup retries, what other operations are subject to this timeout? Finally, the default (3.7) OSE documentation indicates the time-reopen default is 60s (not 10s like I'm seeing).

Thanks,
-David
Earlier, syslog-ng exited immediately at startup; now it treats DNS resolution errors just like connection failures, so time-reopen applies.

I also remember time-reopen defaulting to 60 seconds, and I can't recall any patch that would have changed it.
______________________________________________________________________________ Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng FAQ: http://www.balabit.com/wiki/syslog-ng-faq
Scheidler (via syslog-ng-bounces@lists.balabit.hu) wrote:
> Earlier, syslog-ng exited immediately at startup; now it treats DNS resolution errors just like connection failures, so time-reopen applies.

OK, thx. BTW, I was reading about the DNS resolver blocking the logger (during resolutions, which, in situations where the lookup fails, could take significant time). What does this mean exactly? Are all destinations/sources blocked during this time?

> I also remember time-reopen defaulting to 60 seconds, and I can't recall any patch that would have changed it.

Thx also - I did locate my configuration setting for this and see that the distribution I'm using resets this default to 10s (so everything's working fine here).
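For anyone landing on this thread with the same question, the retry interval seen in the log lines above is the time-reopen() global option. The fragment below is only a sketch of how that setting looks in syslog-ng.conf; the value 60 is the documented default mentioned in the thread, and the comment about name lookups reflects what was reported here for v3.6.3 rather than anything checked against other releases.

```
options {
    # Seconds to wait before a failed connection is retried. According to
    # this thread, from v3.6.3 on a failed name lookup of a destination
    # host is retried on the same interval.
    time-reopen(60);
};
```

A distribution-supplied configuration may already set this option (to 10 in the case described above), so it is worth searching the shipped syslog-ng.conf before adding a second options block.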
Syslog-ng doesn't use an asynchronous DNS resolver; it uses the libc one because it wants to keep the ordering of messages.

However, it uses an in-process DNS cache, which should mitigate most DNS latency issues, as hosts that generate logs should already be in the cache anyway.

If you can't trust that DNS will work, just disable DNS resolution, e.g. use-dns(no), or use persist-only DNS caching and populate /etc/hosts with the hosts you want to see with names.

https://www.balabit.com/sites/default/files/documents/syslog-ng-ose-latest-g...

As for DNS stalling all sources: no, it doesn't, as long as syslog-ng is running in threaded mode. Only the affected worker is stalled; the others will continue.
On Saturday, August 22, 2015 10:29 AM, syslog-ng-bounces@lists.balabit.hu wrote:
> Syslog-ng doesn't use an asynchronous DNS resolver; it uses the libc one because it wants to keep the ordering of messages.
>
> However, it uses an in-process DNS cache, which should mitigate most DNS latency issues, as hosts that generate logs should already be in the cache anyway.
>
> If you can't trust that DNS will work, just disable DNS resolution, e.g. use-dns(no), or use persist-only DNS caching and populate /etc/hosts with the hosts you want to see with names.
>
> https://www.balabit.com/sites/default/files/documents/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/example-local-dns.html

Great, the above all sounds fine. And, yes, the edge condition I'm considering is a UDP destination with a failing DNS lookup.

> As for DNS stalling all sources: no, it doesn't, as long as syslog-ng is running in threaded mode. Only the affected worker is stalled; the others will continue.

By "affected worker" do you mean the source or the destination? Hopefully it is the latter, as I have many "log" definitions (all with the same source and all tied to various destinations). Only some of these are configured with udp destinations that fail DNS lookup. What exactly does/doesn't get blocked here?
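The DNS cache advice quoted at the top of this message could look roughly like the fragment below. It is a sketch only: use-dns(persist_only) and dns-cache-hosts() are the global options behind the "persist-only DNS caching" wording, and /etc/hosts is simply the file named in the advice.

```
options {
    # Resolve names only from the persistent cache below, never through
    # a live DNS query that could stall on a timeout.
    use-dns(persist_only);
    # File used to populate the persistent cache.
    dns-cache-hosts(/etc/hosts);
};
```

Note that the advice concerns lookups of the hosts that send logs. Nothing in this thread says that these options change how the hostname given to a udp() destination is resolved, so they should not be assumed to help with the failing destination lookup.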
Sources/destinations are worked on by a set of worker threads, which are not dedicated to a source or destination.

DNS resolution happens at the input side, so if you have multiple log statements, it will only happen once, right after reception, on the input side.

However, if you only have one udp() source, that will only use one worker at a time, so if you have multiple threads the others will not be affected.

Hope this helps.

-- Bazsi
On Monday, August 24, 2015 7:31 AM, syslog-ng-bounces@lists.balabit.hu wrote:
> Sources/destinations are worked on by a set of worker threads, which are not dedicated to a source or destination.
>
> DNS resolution happens at the input side, so if you have multiple log statements, it will only happen once, right after reception, on the input side.
>
> However, if you only have one udp() source, that will only use one worker at a time, so if you have multiple threads the others will not be affected.
>
> Hope this helps.

;) Not sure. First off, my UDP DNS resolution concern is in relation to a *destination* definition.

    destination d_NAaudit_Prio {
        file("/var/log/zzz/audit_log" template(t_NAFormat_Prio));
        udp("testing" port(514) template(t_NAFormat_Prio));
    };

This same destination is used in several log statements, the main one of which is a fairly complex log statement with multiple junction definitions (see the genesis of this in the following mailing list thread: https://lists.balabit.hu/pipermail/syslog-ng/2014-April/021330.html).

So it isn't entirely clear to me how a statement definition like this results in a specific thread breakdown... Would isolating this destination to its own thread be as simple as adding "threaded" to the flags option for this destination (and then any of the referenced "log" statements would be running in their own threads), or does this happen by default (in v3.6.x)?

Apologies if I'm missing something obvious here ;).
The destinations don't have DNS resolution problems. They resolve their target name once and at every reconnect. On the source side you have a potential name lookup for every message (if uncached, of course).

Or do you mean that the target server cannot be resolved? Why not add it to /etc/hosts?

Initiating a reconnect is handled in the main thread, thus name resolution of the target server would block other threads as well.

The basic problem of name resolution is on the source side though: there, each incoming message can have an associated DNS cache miss, but that's delegated to the workers.
On Monday, August 24, 2015 10:10 AM, syslog-ng-bounces@lists.balabit.hu wrote:
> The destinations don't have DNS resolution problems. They resolve their target name once and at every reconnect. On the source side you have a potential name lookup for every message (if uncached, of course).

Understood, but see below.

> Or do you mean that the target server cannot be resolved?

Correct. These target destinations are configured by administrators and the DNS failures may or may not be seen in a timely fashion (so the continual reconnects will result in frequent blockage).

> Why not add it to /etc/hosts?

This isn't an option when the operator simply gets the name wrong (independent of whether the host is actually online and DNS resolvable) ;).

> Initiating a reconnect is handled in the main thread, thus name resolution of the target server would block other threads as well.

By other do you mean all other threads? If not, which ones? What operations are suspended during the time of the blocked DNS resolution (note: typical resolver configuration is to retry 2-3 times with retry timeouts set to ~10s - so the total blockage time on a failed DNS lookup could be significant)?

> The basic problem of name resolution is on the source side though: there, each incoming message can have an associated DNS cache miss, but that's delegated to the workers.

OK, understood for the source, but I'm still trying to determine the exact effect on destinations that fail to resolve...
Ah, for this use case, having to wait for the DNS timeout sucks. :(

Adding an asynchronous DNS resolver would be possible, but I am afraid it won't make it onto our schedule soon enough to solve your problem. I see two solutions:

1) what about you or a colleague submitting a patch? :)
2) validate the hostname before generating it into the configuration
3) hack the syslog-ng startup script and validate it before syslog-ng is launched with the erroneous hostname

Oops, that was three, not two, but you get the idea.
On Monday, August 24, 2015 1:50 PM, syslog-ng-bounces@lists.balabit.hu wrote:
Ah, for this use case, having to wait for dns timeout sucks. :(
;)
Adding an asynchronous DNS resolver would be possible, but I am afraid it won't make it to our schedule soon enough to solve your problem. I see two solutions:
1) how about you or a colleague submitting a patch? :)
2) validate the hostname before generating it into the configuration
3) hack the syslog-ng startup script and validate it before syslog-ng is launched with the erroneous hostname
Oops, that was three, not two, but you get the idea.
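Options 2 and 3 above could be approached with a small pre-flight check run before the configuration is generated or before start/reload. The following is only a sketch under stated assumptions: the helper names (`find_udp_hosts`, `can_resolve`, `unresolvable_hosts`) are hypothetical, and the regular expression assumes hostnames appear as double-quoted first arguments of `udp(...)`, as in the destination quoted in this thread. It does not handle every form a syslog-ng configuration can take.

```python
import re
import socket

# Matches the first quoted argument of udp("..."), as in the destination shown in this thread.
# Assumption: hostnames are always written as double-quoted string literals.
UDP_HOST_RE = re.compile(r'\budp\(\s*"([^"]+)"')


def find_udp_hosts(config_text):
    """Return hostnames used in udp("...") statements, in order of appearance."""
    return UDP_HOST_RE.findall(config_text)


def can_resolve(host, resolver=socket.getaddrinfo):
    """Return True if `host` resolves; `resolver` is injectable so the check can be tested."""
    try:
        resolver(host, None)
        return True
    except socket.gaierror:
        return False


def unresolvable_hosts(config_text, resolver=socket.getaddrinfo):
    """List udp() hostnames in the config text that currently fail to resolve."""
    return [h for h in find_udp_hosts(config_text) if not can_resolve(h, resolver)]
```

A wrapper around the startup script might read the configuration file, call `unresolvable_hosts()`, and refuse to start or reload (or log a warning) if the returned list is non-empty. Note that the check itself blocks for as long as the system resolver takes to give up.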
I get the idea :). I'll ponder this a bit more, thanks for all your thoughts up to this point...
On Aug 24, 2015 7:27 PM, "David Hauck" <davidh@netacquire.com> wrote:
On Monday, August 24, 2015 10:10 AM, syslog-ng-bounces@lists.balabit.hu wrote:
The destinations don't have dns resolution problems. They resolve their target name once and at every reconnect. On the source side you have a potential name lookup for every message (if uncached of course).
Understood, but see below.
Or you mean that the target server cannot be resolved?
Correct. These target destinations are configured by administrators and the DNS failures may or may not be seen in a timely fashion (so the continual reconnects will result in frequent blockage).
Why not add it to /etc/hosts?
This isn't an option when the operator simply gets the name wrong (independent of whether the host is actually online and DNS resolvable) ;).
Initiating a reconnect is handled in the main thread, thus name resolution of the target server would block other threads as well.
By other do you mean all other threads? If not, which ones? What operations are suspended during the time of the blocked DNS resolution (note: typical resolver configuration is to retry 2-3 times with retry timeouts set to ~10s - so the total blockage time on a failed DNS lookup could be significant)?
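To put a rough number on "significant": a simple upper-bound model, assuming every attempt waits out its full timeout and using the figures mentioned above (2-3 attempts, ~10s each), gives 20 to 30 seconds per failed lookup. The function name and the `nameservers` multiplier are illustrative assumptions; real resolvers may apply backoff or search-domain expansion that changes the total.

```python
def worst_case_block_seconds(attempts, timeout_s, nameservers=1):
    """Upper bound if every attempt against every nameserver waits the full timeout."""
    return attempts * timeout_s * nameservers


# Figures from the message above: 2-3 attempts at roughly 10 seconds each.
for attempts in (2, 3):
    print(attempts, "attempts:", worst_case_block_seconds(attempts, 10), "seconds")
```

If reconnects are attempted repeatedly, each one could incur this delay again.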
The basic problem of name resolution is on the source side though, there each incoming message can have an associated dns cache miss but that's delegated to the workers.
OK, understood for the source, but I'm still trying to determine the exact effect on destinations that fail to resolve...
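As a general illustration of the blocking concern (this is not how syslog-ng is implemented, and not the asynchronous resolver mentioned earlier in the thread), a blocking lookup can be handed to a helper thread so that the caller waits no longer than a chosen limit. The function name `resolve_with_timeout` and its parameters are hypothetical.

```python
import concurrent.futures
import socket


def resolve_with_timeout(host, timeout_s, resolver=socket.getaddrinfo):
    """Run a blocking lookup in a helper thread.

    Returns the resolver's result, or None if the lookup fails or does not
    finish within `timeout_s` seconds.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(resolver, host, None)
    try:
        return future.result(timeout=timeout_s)
    except (concurrent.futures.TimeoutError, socket.gaierror):
        return None
    finally:
        # Do not wait for a lookup that is still running; the helper thread
        # finishes on its own (Python still joins it at interpreter exit).
        executor.shutdown(wait=False)
```

The caller gets control back after at most `timeout_s`, while the slow lookup continues in the background. The trade-off is that an abandoned lookup still occupies a thread until the system resolver gives up.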
participants (3)
- David Hauck
- Fabien Wernli
- Scheidler, Balázs