[Bug 212] New: strange hostnames from /etc/hosts with threaded(yes);
https://bugzilla.balabit.com/show_bug.cgi?id=212

           Summary: strange hostnames from /etc/hosts with threaded(yes);
           Product: syslog-ng
           Version: 3.3.x
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: unspecified
         Component: syslog-ng
        AssignedTo: bazsi@balabit.hu
        ReportedBy: bpkroth@gmail.com
Type of the Report: ---
   Estimated Hours: 0.0

Created an attachment (id=67) --> (https://bugzilla.balabit.com/attachment.cgi?id=67) syslog-ng.conf

Hi,

I've backported syslog-ng 3.3.5 to Debian squeeze in order to get the threaded feature for our main syslog server. However, with threaded(yes) set and logs written to per-host directories (by $FROM_FULLHOST), I get a few directories whose names are weird mashups of entries in /etc/hosts (e.g. fqdn{tab}shortname{newline}, where {tab} is an actual tab character, and similarly for newlines). As an example:

13 lager.$domain lager loghost 2.208 ldap1.$domain ldap1

Here, 13 and 2.208 are the last part of the IP address listed for that host, but the rest is missing.

This seems similar to Bug #183, except that I'm not getting segfaults, just goofy log file and directory names, which in turn cause some other problems on our end.

For now, I've disabled threaded (and was happy to find that the CPU load on the machine is still quite reduced, so good work on whatever other scalability improvements you've done - epoll?), but I'd like to take advantage of the slew of cores the machine has.

use_dns(yes) and dns_cache(yes) are set, but no persistent file is used; instead, the system resolver is set to 127.0.0.1 and a local BIND named slave is running (which has all of the necessary reverse lookup information).

Attached is my full conf file. Please let me know if you need any more info.

Thanks,
Brian

--
Configure bugmail: https://bugzilla.balabit.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching all bug changes.
https://bugzilla.balabit.com/show_bug.cgi?id=212

Gergely Nagy <algernon@balabit.hu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |algernon@balabit.hu

--- Comment #1 from Gergely Nagy <algernon@balabit.hu> 2012-12-10 19:40:11 ---

This seems strange, but #183 might be an explanation nevertheless. Could you try with syslog-ng 3.3.7? There are pre-built Debian packages available for squeeze at http://asylum.madhouse-project.org/projects/debian/
I had the same problem even with 3.3.7.

-----Original Message-----
From: syslog-ng-bounces@lists.balabit.hu [mailto:syslog-ng-bounces@lists.balabit.hu] On Behalf Of bugzilla@bugzilla.balabit.com
Sent: Monday, 10 December 2012 19:40
To: syslog-ng@lists.balabit.hu
Subject: [syslog-ng] [Bug 212] strange hostnames from /etc/hosts with threaded(yes);

--- Comment #1 from Gergely Nagy <algernon@balabit.hu> 2012-12-10 19:40:11 ---

This seems strange, but #183 might be an explanation nevertheless. Could you try with syslog-ng 3.3.7? There are pre-built debian packages available for squeeze at http://asylum.madhouse-project.org/projects/debian/
https://bugzilla.balabit.com/show_bug.cgi?id=212

Gergely Nagy <algernon@balabit.hu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |3.3.8
         AssignedTo|bazsi@balabit.hu            |algernon@balabit.hu

https://bugzilla.balabit.com/show_bug.cgi?id=212

Gergely Nagy <algernon@balabit.hu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
https://bugzilla.balabit.com/show_bug.cgi?id=212

--- Comment #2 from Brian Kroth <bpkroth@gmail.com> 2012-12-11 14:06:23 ---

I was in the middle of testing 3.3.6 already, so I thought I'd wait to see what I found there. It also had the problem, so I'll try your packages later on today.

Thanks,
Brian

Also, FYI, I got a mail delivery error when I tried to reply to your previous message:

This is the mail system at host lists.balabit.hu.

I'm sorry to have to inform you that your message could not be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can delete your own text from the attached returned message.

The mail system

<bugzilla@bugzilla.balabit.com>: mail for bugzilla.balabit.com loops back to myself

Reporting-MTA: dns; lists.balabit.hu
X-Postfix-Queue-ID: B070E39D9CE
X-Postfix-Sender: rfc822; bpkroth@gmail.com
Arrival-Date: Tue, 11 Dec 2012 13:53:00 +0100 (CET)

Final-Recipient: rfc822; bugzilla@bugzilla.balabit.com
Original-Recipient: rfc822;bugzilla@bugzilla.balabit.com
Action: failed
Status: 5.4.6
Diagnostic-Code: X-Postfix; mail for bugzilla.balabit.com loops back to myself
https://bugzilla.balabit.com/show_bug.cgi?id=212

--- Comment #3 from Gergely Nagy <algernon@balabit.hu> 2012-12-11 14:46:16 ---

If 3.3.6 has this issue, so will 3.3.7 - thanks for the confirmation!

As for the bounce: there's a reply-to set, bugzilla itself doesn't accept mail replies (but comments get forwarded to the list).
https://bugzilla.balabit.com/show_bug.cgi?id=212

--- Comment #4 from Brian Kroth <bpkroth@gmail.com> 2012-12-11 16:41:04 ---

Aww, that's too bad. I'm used to just emailing my ticket system.

Hmm, I'm not seeing the Reply-To header. There's an In-Reply-To, but that's different. The full headers are below.

Also, I installed the 3.3.7 packages for squeeze from your repo a moment ago. I'll let it run for the day and let you know if I see any changes.

Thanks,
Brian
From bugzilla@bugzilla.balabit.com Tue Dec 11 07:46:17 2012
Delivered-To: bpkroth@gmail.com
Received: by 10.14.176.65 with SMTP id a41csp286012eem; Tue, 11 Dec 2012 05:46:17 -0800 (PST)
Received: by 10.204.8.143 with SMTP id h15mr6234802bkh.115.1355233577371; Tue, 11 Dec 2012 05:46:17 -0800 (PST)
Return-Path: <bugzilla@bugzilla.balabit.com>
Received: from lists.balabit.hu (brother.balabit.com. [195.70.62.219]) by mx.google.com with ESMTP id jd13si30678752bkc.44.2012.12.11.05.46.17; Tue, 11 Dec 2012 05:46:17 -0800 (PST)
Received-SPF: pass (google.com: domain of bugzilla@bugzilla.balabit.com designates 195.70.62.219 as permitted sender) client-ip=195.70.62.219;
Authentication-Results: mx.google.com; spf=pass (google.com: domain of bugzilla@bugzilla.balabit.com designates 195.70.62.219 as permitted sender) smtp.mail=bugzilla@bugzilla.balabit.com
Received: by lists.balabit.hu (Postfix, from userid 33) id D069939DB0E; Tue, 11 Dec 2012 14:46:16 +0100 (CET)
From: bugzilla@bugzilla.balabit.com
To: bpkroth@gmail.com
Subject: [Bug 212] strange hostnames from /etc/hosts with threaded(yes);
X-Bugzilla-Reason: Reporter
X-Bugzilla-Type: newchanged
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: syslog-ng
X-Bugzilla-Component: syslog-ng
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: algernon@balabit.hu
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: unspecified
X-Bugzilla-Assigned-To: algernon@balabit.hu
X-Bugzilla-Target-Milestone: 3.3.8
X-Bugzilla-Changed-Fields:
In-Reply-To: <bug-212-181@https.bugzilla.balabit.com/>
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Message-Id: <20121211134616.D069939DB0E@lists.balabit.hu>
Date: Tue, 11 Dec 2012 14:46:16 +0100 (CET)
https://bugzilla.balabit.com/show_bug.cgi?id=212

--- Comment #5 from Brian Kroth <bpkroth@gmail.com> 2012-12-11 19:44:14 ---

3.3.7 had the same problem. I'm trying now with dns_cache(no) set. Any other thoughts?

Thanks,
Brian
https://bugzilla.balabit.com/show_bug.cgi?id=212 --- Comment #6 from Brian Kroth <bpkroth@gmail.com> 2012-12-11 20:38:05 --- (In reply to comment #5)
3.3.7 had the same problem. I'm trying now with dns_cache(no) set. Any other thoughts?
Thanks, Brian
That didn't help either. If anything, the broken host names appeared more quickly. I'm flipping back to threaded(no) for now.

Brian
https://bugzilla.balabit.com/show_bug.cgi?id=212

--- Comment #7 from Gergely Nagy <algernon@balabit.hu> 2012-12-11 22:25:58 ---

Aha, I think I know what's up. In resolve_sockaddr(), we're calling gethostbyaddr(), which is not reentrant, so threads go and overwrite each other's data.

Similar to how we use getaddrinfo() when available, we should use getnameinfo() when available, instead of gethostbyaddr().

Thanks for the report, I'll prepare a patch as soon as possible.
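To illustrate the diagnosis in comment #7: gethostbyaddr() returns a pointer to storage shared by all callers, whereas getnameinfo() writes into a buffer supplied by the caller, so concurrent threads do not share the result. The following is only a minimal sketch of that idea, not the code from syslog-ng's resolve_sockaddr(); the helper name reverse_lookup_ipv4() and its IPv4-only signature are assumptions made for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of syslog-ng): resolve a dotted-quad IPv4
 * address into the caller's own buffer. Each thread passes its own buffer,
 * so no result storage is shared between threads.
 *
 * flags is handed straight to getnameinfo(), e.g. 0 for a normal reverse
 * lookup, NI_NAMEREQD to fail when no name exists, or NI_NUMERICHOST to
 * skip the lookup and just format the address.
 *
 * Returns 0 on success or a non-zero EAI_* error code.
 */
static int reverse_lookup_ipv4(const char *ip, char *host, socklen_t hostlen, int flags)
{
    struct sockaddr_in sa;

    assert(ip != NULL && host != NULL && hostlen > 0);

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    if (inet_pton(AF_INET, ip, &sa.sin_addr) != 1)
        return EAI_FAIL; /* input was not a valid IPv4 address */

    return getnameinfo((const struct sockaddr *)&sa, sizeof(sa),
                       host, hostlen, NULL, 0, flags);
}
```

A caller would keep a per-thread buffer (for example a local char array) and pass it in; on failure, gai_strerror() turns the returned code into a readable message.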
https://bugzilla.balabit.com/show_bug.cgi?id=212

Balazs Scheidler <bazsi@balabit.hu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bazsi@balabit.hu

--- Comment #8 from Balazs Scheidler <bazsi@balabit.hu> 2012-12-12 07:24:27 ---

But how did we escape noticing this bug so far? I understand that DNS caching can hide this bug.

Brian, do you have lots of hosts to resolve? Or did you change the DNS caching parameters?

Thanks for the response.
https://bugzilla.balabit.com/show_bug.cgi?id=212 --- Comment #9 from Brian Kroth <bpkroth@gmail.com> 2012-12-12 16:35:28 --- (In reply to comment #8)
but how did we escape noticing this bug so far? I understand that DNS caching can hide this bug. Brian, do you have lots of hosts to resolve?
About 1200 unique hosts have checked in within the last log retention period, but I think it's more like ~1000 daily. In the past we'd bumped the DNS cache size up to account for that, but that hasn't changed in a long time.
Or did you change the DNS caching parameters?
Nothing changed there recently (years). The only thing I did was try to backport the 3.3 series of syslog-ng specifically for the threaded option to try and work around some scaling issues we were seeing.

The reentrant problem mentioned earlier strikes me as familiar from other past projects, though I couldn't say for sure off the top of my head.

I'm happy to test it if you've got some packages for me somewhere.

Thanks,
Brian
https://bugzilla.balabit.com/show_bug.cgi?id=212

--- Comment #10 from Balazs Scheidler <bazsi@balabit.hu> 2012-12-14 01:06:32 ---

Created an attachment (id=68) --> (https://bugzilla.balabit.com/attachment.cgi?id=68) patch to use reentrant versions of name lookup functions

This is a completely untested patch, I was not able to start it even once, but it does compile and should fix the problem if the initial diagnosis is right.

I'm not sure how we couldn't find this one so far, probably the DNS cache has a high enough hit rate that it doesn't get hit.

I'd appreciate feedback and will probably work on the patch when I return, but I gotta go now.
https://bugzilla.balabit.com/show_bug.cgi?id=212 --- Comment #11 from Gergely Nagy <algernon@balabit.hu> 2012-12-14 11:05:30 --- (In reply to comment #10)
Created an attachment (id=68) --> (https://bugzilla.balabit.com/attachment.cgi?id=68) patch to use reentrant versions of name lookup functions
Looks fine to me, the locks around the non-reentrant paths are good to have too.
This is a completely untested patch, I was not able to start it even once, but it does compile and should fix the problem if the initial diagnosis is right.
I ran a few tests, and the patch works as one would expect (in my case, I was testing the getnameinfo() branch, but I will test the other branches too, for safety's sake).
I'm not sure how we couldn't find this one so far, probably the DNS cache has a high enough hit rate that it doesn't get hit.
Indeed. I've been running threaded(yes) for a loooong time now, and never saw anything like this, not even on those hosts where I have a few dozen network source-destination combos :/
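Comment #11 mentions locks around the non-reentrant paths, i.e. the fallback used when a reentrant lookup function is not available. A rough sketch of that pattern follows, assuming a single process-wide pthread mutex; the names resolver_lock and locked_reverse_lookup_ipv4() are made up for this example and are not taken from the attached patch.

```c
#include <assert.h>
#include <netdb.h>
#include <pthread.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Assumed for this sketch: one lock serialising every gethostbyaddr() call. */
static pthread_mutex_t resolver_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical helper (not from the syslog-ng patch): look up an IPv4
 * address with the non-reentrant gethostbyaddr() and copy the name into
 * the caller's buffer before releasing the lock.
 *
 * Returns 0 on success, -1 on bad input, failed lookup, or short buffer.
 */
static int locked_reverse_lookup_ipv4(const char *ip, char *host, size_t hostlen)
{
    struct in_addr addr;
    struct hostent *he;
    int rc = -1;

    assert(ip != NULL && host != NULL && hostlen > 0);

    if (inet_pton(AF_INET, ip, &addr) != 1)
        return -1; /* not a valid IPv4 address */

    pthread_mutex_lock(&resolver_lock);
    he = gethostbyaddr(&addr, sizeof(addr), AF_INET);
    if (he != NULL && he->h_name != NULL && strlen(he->h_name) < hostlen) {
        /*
         * he points at static storage that the next gethostbyaddr() call
         * may overwrite, so the copy has to happen while the lock is held.
         */
        memcpy(host, he->h_name, strlen(he->h_name) + 1);
        rc = 0;
    }
    pthread_mutex_unlock(&resolver_lock);

    return rc;
}
```

The lock only helps if every caller of the non-reentrant function in the process goes through it; holding it across the copy is the important part, since releasing it earlier would reintroduce the overwrite described in comment #7.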
https://bugzilla.balabit.com/show_bug.cgi?id=212 --- Comment #12 from Brian Kroth <bpkroth@gmail.com> 2012-12-14 19:41:20 --- (In reply to comment #10)
Created an attachment (id=68) --> (https://bugzilla.balabit.com/attachment.cgi?id=68) patch to use reentrant versions of name lookup functions
This is a completely untested patch, I was not able to start it even once, but it does compile and should fix the problem if the initial diagnosis is right.
I'm not sure how we couldn't find this one so far, probably the DNS cache has a high enough hit rate that it doesn't get hit.
I'd appreciate feedback and will probably work on the patch when I return, but I gotta go now.
I tried to apply this to the 3.3.6 and 3.3.7 deb-src packages I was working with, but it didn't go down cleanly. Do you have a pkg somewhere I can just use or some git references I should pull from instead?

Thanks,
Brian
https://bugzilla.balabit.com/show_bug.cgi?id=212 --- Comment #13 from Gergely Nagy <algernon@balabit.hu> 2012-12-17 16:19:59 --- (In reply to comment #12)
I tried to apply this to the 3.3.6 and 3.3.7 deb-src packages I was working with, but it didn't go down cleanly. Do you have a pkg somewhere I can just use or some git references I should pull from instead?
Thanks, Brian
This should apply cleanly to the 3.3.7 deb-src package: https://github.com/balabit/syslog-ng-3.3/commit/11b20b28f7586b2bf10c281328f2...

I'll be preparing new Debian packages this Friday or thereabouts, if all goes well. Until then, the fix is on the 3.3 git tree: https://github.com/balabit/syslog-ng-3.3
https://bugzilla.balabit.com/show_bug.cgi?id=212 --- Comment #14 from Brian Kroth <bpkroth@gmail.com> 2012-12-18 17:48:21 --- (In reply to comment #13)
This should apply cleanly to the 3.3.7 deb-src package: https://github.com/balabit/syslog-ng-3.3/commit/11b20b28f7586b2bf10c281328f2...
I'll be preparing new Debian packages this friday or thereabouts, if all goes well. Until then, the fix is on the 3.3 git tree: https://github.com/balabit/syslog-ng-3.3
Thanks. I've rebuilt 3.3.7-1~mhp1 with this patch and am testing it now. I'll let you know if I see any weird hostnames again.

Thanks,
Brian
https://bugzilla.balabit.com/show_bug.cgi?id=212 --- Comment #15 from Brian Kroth <bpkroth@gmail.com> 2012-12-21 16:54:38 --- (In reply to comment #14)
Thanks. I've rebuilt 3.3.7-1~mhp1 with this patch and am testing it now. I'll let you know if I see any weird hostnames again.
Thanks, Brian
I haven't seen any weird hostnames yet, so I think this is working.

It is a little interesting that, going from 3.3.7 threaded(no) to threaded(yes), I actually see an increase in load average on the machine (.4 vs .8). It's still below 1, which is better than we were getting with 3.1 from stock Debian, but still kinda curious. My only guess is that, with multiple threads now reading and writing, the disk I/O pattern is somewhat different, but then again I'd expect the OS to buffer most of that and stream it out later.

Anyways, thanks for the help.

Brian
https://bugzilla.balabit.com/show_bug.cgi?id=212

Gergely Nagy <algernon@balabit.hu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|                            |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #16 from Gergely Nagy <algernon@balabit.hu> 2012-12-21 17:22:59 --- (In reply to comment #15)
I haven't seen any weird hostnames yet, so I think this is working.
Wonderful! I'm closing the issue then. Thanks for the report & testing!
participants (2)
- bugzilla@bugzilla.balabit.com
- Daniel Neubacher