[syslog-ng] ERROR:lib/dnscache.c:112:dns_cache_key_hash: code should not be reached

Balazs Scheidler bazsi77 at gmail.com
Sat Oct 1 18:17:07 CEST 2016


Hi,

And here comes the fix:
https://github.com/balabit/syslog-ng/pull/1218


On Sat, Oct 1, 2016 at 5:10 PM, Balazs Scheidler <bazsi77 at gmail.com> wrote:

> Hi,
>
> I could reproduce the issue in a localized environment. Thanks for your
> efforts, hopefully the fix is underway, once I can diagnose it properly.
>
>
> On Fri, Sep 30, 2016 at 8:47 AM, Claus Albøge <ca at tdchosting.dk> wrote:
>
>> Hi,
>>
>> Just a quick update, after changing "dns-cache-size” to 10000, syslog-ng
>> has been running for 3 days, without any restarts due to the SIGABRT.
>>
>> I guess the issues can be easily reproduced by setting “dns-cache-size”
>> to 1.
>>
>> This issues is by the way related to the “Listen backlog issue” reported
>> in: https://lists.balabit.hu/pipermail/syslog-ng/2016-September/
>> 023228.html (and https://github.com/balabit/syslog-ng/pull/1211). The
>> sudden surge in SYN packets was caused because syslog-ng restarted due to
>> the SIGABRT, causing all connected clients to reconnect. With a couple of
>> thousand clients, it was easy to trigger a situation, where more than 255
>> clients reconnected at once, causing the hardcoded listen backlog to run
>> full.
>>
>>
>>
>> /Claus A
>>
>>
>> > On 26. sep. 2016, at 13.53, Claus Albøge <ca at tdchosting.dk> wrote:
>> >
>> > Hi,
>> >
>> > I tried to minimize my config as much as possible, but was unable to
>> reproduce - had it run for a couple of hours.
>> >
>> > Looking at the source code pointed my towards
>> “effective_dns_cache_options" and dns-cache-size (
>> https://www.balabit.com/documents/syslog-ng-ose-latest-
>> guides/en/syslog-ng-ose-guide-admin/html/examples-dns.html)
>> >
>> > Setting dns-cache-size to 1 on my minimized config triggered the
>> SIGABRT within a few seconds. (Default value seems to be 1007!, which is
>> higher that the number of hosts sending logs to my syslog-ng instance with
>> the minimized config)
>> >
>> > Setting dns-cache-size to 10000 on a syslog-instance with the “full”
>> number of log sources, seems to work around the issue, as syslog-ng has
>> been running for almost a hour now without any restarts - normally it
>> restarts every few minutes.
>> >
>> > So it does indeed seems like an issue with the dnscache logic. You
>> might be able to reproduce by setting dns-cache-size to a very low number.
>> >
>> > If you still need a core dump to debug this further, I might need a bit
>> of help creating the coredump correctly. After the SIGABRT, running gcore
>> in gdb creates a core, but if I try to read this core again using gdb, I’m
>> unable to do a backtrace. Do I need to do anything special?
>> >
>> >
>> > /Claus A
>> >
>> >
>> >> On 23. sep. 2016, at 08.37, Claus Albøge <ca at tdchosting.dk> wrote:
>> >>
>> >> Hi,
>> >>
>> >> I’ll try to reproduce with a minimal configuration, without ssl-certs,
>> etc.
>> >>
>> >> Syslog-ng binaries are from: https://copr-be.cloud.fedorapr
>> oject.org/results/czanik/syslog-ng38/epel-7-x86_64/00449829-syslog-ng/
>> >>
>> >>
>> >> /Claus A
>> >>
>> >>> On 23. sep. 2016, at 08.25, Balazs Scheidler <bazsi77 at gmail.com>
>> wrote:
>> >>>
>> >>> Is there a possibility for me to receive the core file and the
>> binaries? (Syslog-ng plus dependent libs)
>> >>>
>> >>> This seems to be some kind of dns cache corruption, which is
>> difficult to diagnose without those.
>> >>>
>> >>> Bazsi
>> >>>
>> >>>
>> >>> On Sep 22, 2016 3:29 PM, "Claus Albøge" <ca at tdchosting.dk> wrote:
>> >>> Hi,
>> >>>
>> >>> # gdb --args /usr/sbin/syslog-ng -F -p /var/run/syslogd.pid
>> >>> GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-80.el7
>> >>> Copyright (C) 2013 Free Software Foundation, Inc.
>> >>> License GPLv3+: GNU GPL version 3 or later <
>> http://gnu.org/licenses/gpl.html>
>> >>> This is free software: you are free to change and redistribute it.
>> >>> There is NO WARRANTY, to the extent permitted by law.  Type "show
>> copying"
>> >>> and "show warranty" for details.
>> >>> This GDB was configured as "x86_64-redhat-linux-gnu".
>> >>> For bug reporting instructions, please see:
>> >>> <http://www.gnu.org/software/gdb/bugs/>...
>> >>> Reading symbols from /usr/sbin/syslog-ng...Reading symbols from
>> /usr/lib/debug/usr/sbin/syslog-ng.debug...done.
>> >>> done.
>> >>> (gdb) set pagination off
>> >>> (gdb) run
>> >>> Starting program: /usr/sbin/syslog-ng -F -p /var/run/syslogd.pid
>> >>> [Thread debugging using libthread_db enabled]
>> >>> Using host libthread_db library "/lib64/libthread_db.so.1".
>> >>> [New Thread 0x7ffff1e8a700 (LWP 12861)]
>> >>> [New Thread 0x7ffff1689700 (LWP 12862)]
>> >>> [New Thread 0x7ffff0e88700 (LWP 12863)]
>> >>> [New Thread 0x7fffe3fff700 (LWP 13103)]
>> >>> [New Thread 0x7fffe37fe700 (LWP 13108)]
>> >>> [New Thread 0x7fffe2ffd700 (LWP 13109)]
>> >>> [New Thread 0x7fffe27fc700 (LWP 13110)]
>> >>> [New Thread 0x7fffe1ffb700 (LWP 13111)]
>> >>> [Thread 0x7ffff1689700 (LWP 12862) exited]
>> >>> [Thread 0x7ffff1e8a700 (LWP 12861) exited]
>> >>> [New Thread 0x7ffff1e8a700 (LWP 13395)]
>> >>> [New Thread 0x7ffff1689700 (LWP 13396)]
>> >>> [Thread 0x7fffe37fe700 (LWP 13108) exited]
>> >>> [Thread 0x7fffe2ffd700 (LWP 13109) exited]
>> >>> [New Thread 0x7fffe37fe700 (LWP 13646)]
>> >>> [New Thread 0x7fffe2ffd700 (LWP 13647)]
>> >>> [Thread 0x7ffff1689700 (LWP 13396) exited]
>> >>> [Thread 0x7fffe3fff700 (LWP 13103) exited]
>> >>> [New Thread 0x7fffe3fff700 (LWP 13959)]
>> >>> [New Thread 0x7ffff1689700 (LWP 13960)]
>> >>> [Thread 0x7ffff1e8a700 (LWP 13395) exited]
>> >>> [Thread 0x7fffe1ffb700 (LWP 13111) exited]
>> >>> [New Thread 0x7fffe1ffb700 (LWP 14040)]
>> >>> [New Thread 0x7ffff1e8a700 (LWP 14041)]
>> >>> [Thread 0x7fffe3fff700 (LWP 13959) exited]
>> >>> [Thread 0x7ffff0e88700 (LWP 12863) exited]
>> >>> [New Thread 0x7ffff0e88700 (LWP 14182)]
>> >>> [New Thread 0x7fffe3fff700 (LWP 14183)]
>> >>> **
>> >>> ERROR:lib/dnscache.c:112:dns_cache_key_hash: code should not be
>> reached
>> >>>
>> >>> Program received signal SIGABRT, Aborted.
>> >>> [Switching to Thread 0x7ffff0e88700 (LWP 14182)]
>> >>> 0x00007ffff5fb05f7 in raise () from /lib64/libc.so.6
>> >>> Missing separate debuginfos, use: debuginfo-install
>> bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.163-3.el7.x86_64
>> elfutils-libs-0.163-3.el7.x86_64 eventlog-0.2.13-4.el7.x86_64
>> glib2-2.42.2-5.el7.x86_64 glibc-2.17-106.el7_2.8.x86_64
>> ivykis-0.36.2-2.el7.x86_64 json-c-0.11-4.el7_0.x86_64
>> keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.13.2-12.el7_2.x86_64
>> libattr-2.4.46-12.el7.x86_64 libcap-2.22-8.el7.x86_64
>> libcom_err-1.42.9-7.el7.x86_64 libgcc-4.8.5-4.el7.x86_64
>> libgcrypt-1.5.3-12.el7_1.1.x86_64 libgpg-error-1.12-3.el7.x86_64
>> libnet-1.1.6-7.el7.x86_64 libselinux-2.2.2-6.el7.x86_64
>> libuuid-2.23.2-26.el7_2.3.x86_64 openssl-libs-1.0.1e-51.el7_2.5.x86_64
>> pcre-8.32-15.el7_2.1.x86_64 systemd-libs-219-19.el7_2.13.x86_64
>> tcp_wrappers-libs-7.6-77.el7.x86_64 xz-libs-5.1.2-12alpha.el7.x86_64
>> zlib-1.2.7-15.el7.x86_64
>> >>> (gdb) bt full
>> >>> #0  0x00007ffff5fb05f7 in raise () from /lib64/libc.so.6
>> >>> No symbol table info available.
>> >>> #1  0x00007ffff5fb1ce8 in abort () from /lib64/libc.so.6
>> >>> No symbol table info available.
>> >>> #2  0x00007ffff70426d5 in g_assertion_message () from
>> /lib64/libglib-2.0.so.0
>> >>> No symbol table info available.
>> >>> #3  0x00007ffff704276a in g_assertion_message_expr () from
>> /lib64/libglib-2.0.so.0
>> >>> No symbol table info available.
>> >>> #4  0x00007ffff7b62cc4 in dns_cache_key_hash (e=<optimized out>) at
>> lib/dnscache.c:112
>> >>>       e = <optimized out>
>> >>> #5  0x00007ffff700c4fc in g_hash_table_remove_internal () from
>> /lib64/libglib-2.0.so.0
>> >>> No symbol table info available.
>> >>> #6  0x00007ffff7b67584 in resolve_sockaddr_to_inet_or_inet6_hostname
>> (host_resolve_options=0x651754, saddr=0x183d350,
>> result_len=0x7ffff0e85370) at lib/host-resolve.c:332
>> >>>       hname = 0x7ffff0e864b8 "213.174.76.60"
>> >>>       hname_len = 5
>> >>>       positive = 0
>> >>>       dnscache_key = 0x183d368
>> >>> #7  resolve_sockaddr_to_hostname (result_len=result_len at entry=0x7ffff0e85370,
>> saddr=0x183d350, host_resolve_options=0x651754) at lib/host-resolve.c:343
>> >>> No locals.
>> >>> #8  0x00007ffff7b6c03a in log_source_mangle_hostname (self=self at entry=0x1868c10,
>> msg=msg at entry=0x7fffe87cdd40) at lib/logsource.c:156
>> >>>       resolved_name = <optimized out>
>> >>>       resolved_name_len = 140737093885344
>> >>> #9  0x00007ffff7b6c280 in log_source_queue (s=0x1868c10,
>> msg=0x7fffe87cdd40, path_options=0x7ffff0e85550, user_data=<optimized out>)
>> at lib/logsource.c:266
>> >>>       self = 0x1868c10
>> >>>       i = <optimized out>
>> >>>       __func__ = "log_source_queue"
>> >>> #10 0x00007ffff7b6ca0a in log_pipe_queue (path_options=<optimized
>> out>, msg=<optimized out>, s=<optimized out>) at lib/logpipe.h:339
>> >>> No locals.
>> >>> #11 log_pipe_forward_msg (self=<optimized out>,
>> path_options=<optimized out>, msg=<optimized out>) at lib/logpipe.h:302
>> >>> No locals.
>> >>> #12 log_pipe_queue (path_options=<optimized out>, msg=<optimized
>> out>, s=<optimized out>) at lib/logpipe.h:343
>> >>> No locals.
>> >>> #13 log_pipe_forward_msg (self=<optimized out>,
>> path_options=<optimized out>, msg=<optimized out>) at lib/logpipe.h:302
>> >>> No locals.
>> >>> #14 log_pipe_queue (path_options=<optimized out>, msg=0x7fffe87cdd40,
>> s=0x7fffe87cdd40) at lib/logpipe.h:343
>> >>> No locals.
>> >>> #15 log_source_post (self=self at entry=0x1868c10, msg=msg at entry=0x7fffe87cdd40)
>> at lib/logsource.c:249
>> >>>       path_options = {ack_needed = 1, flow_control_requested = 0,
>> matched = 0x0}
>> >>>       old_window_size = <optimized out>
>> >>>       __func__ = "log_source_post"
>> >>> #16 0x00007ffff7b6b179 in log_reader_handle_line (aux=0x7ffff0e85600,
>> length=<optimized out>, line=<optimized out>, self=0x1868c10) at
>> lib/logreader.c:320
>> >>>       m = 0x7fffe87cdd40
>> >>> #17 log_reader_fetch_log (self=self at entry=0x1868c10) at
>> lib/logreader.c:381
>> >>>       bookmark = <optimized out>
>> >>>       msg = 0x7fffe445cef5 "<14>1 2016-09-22T10:51:07.178337+02:00
>> pafodev02.int.patientforsikringen.dk Microsoft-Windows-Security-Auditing
>> 608 - [NXLOG at 47121 AssetID=\"105939\" CustomerID=\"0002592\"
>> Location=\"Hyper-V - Farm 01 (ps"...
>> >>>       msg_len = 4245
>> >>>       status = <optimized out>
>> >>>       msg_count = 1
>> >>>       may_read = 1
>> >>>       aux = {peer_addr = 0x0, data = "\000\207\350\360\377\177\000\
>> 000)\000\000\000\000\000\000\000\304\335U\366\377\177\000\00
>> 0\000Y\350\360\377\177\000\000\230\235\332\001\000\000\000\000@
>> \245\350\377\177\000\000h\313U\366\377\177\000\000 at V\350\
>> 360\377\177\000\000 at V\350\360\377\177\000\000\000Y\350\360\
>> 377\177\000\000\340\037\245\350\377\177\000\000\b\000\000\
>> 000\000\000\000\000\250\037\245\350\377\177\000\000\200V\
>> 350\360\377\177\000\000\223\341U\366\377\177\000\000\001\
>> 000\000\000\000\000\000\000\006Y\350\360\377\177\000\000\
>> aY\350\360\377\177\000\000\000Y\350\360\377\177\000\000\
>> aY\350\360\377\177\000\000gk\266\367\377\177\000\000\220", '\000'
>> <repeats 15 times>, "\240 E\001\000\000\000\000\301\326\003\367\377\177\000\000"...,
>> end_ptr = 0}
>> >>>       __func__ = "log_reader_fetch_log"
>> >>> #18 0x00007ffff7b6b299 in log_reader_work_perform (s=0x1868c10) at
>> lib/logreader.c:88
>> >>>       self = 0x1868c10
>> >>> #19 0x00007ffff7b7307d in _work (self=<optimized out>) at
>> lib/mainloop-io-worker.c:52
>> >>> No locals.
>> >>> #20 0x00007ffff655dc67 in iv_work_thread_do_work () from
>> /lib64/libivykis.so.0
>> >>> No symbol table info available.
>> >>> #21 0x00007ffff655cfc3 in iv_run_tasks () from /lib64/libivykis.so.0
>> >>> No symbol table info available.
>> >>> #22 0x00007ffff655f7dc in iv_main () from /lib64/libivykis.so.0
>> >>> No symbol table info available.
>> >>> #23 0x00007ffff655dab4 in iv_work_thread () from /lib64/libivykis.so.0
>> >>> No symbol table info available.
>> >>> #24 0x00007ffff6560270 in iv_thread_handler () from
>> /lib64/libivykis.so.0
>> >>> No symbol table info available.
>> >>> #25 0x00007ffff6344dc5 in start_thread () from /lib64/libpthread.so.0
>> >>> No symbol table info available.
>> >>> #26 0x00007ffff6071ced in clone () from /lib64/libc.so.6
>> >>> No symbol table info available.
>> >>> (gdb) frame 6
>> >>> #6  0x00007ffff7b67584 in resolve_sockaddr_to_inet_or_inet6_hostname
>> (host_resolve_options=0x651754, saddr=0x183d350,
>> result_len=0x7ffff0e85370) at lib/host-resolve.c:332
>> >>> 332         dns_caching_store(saddr->sa.sa_family, dnscache_key,
>> hname, positive);
>> >>> (gdb) print *saddr
>> >>> $1 = {refcnt = {counter = 3}, flags = 0, sa_funcs = 0x7ffff7dd7060
>> <inet_sockaddr_funcs>, salen = 16, sa = {sa_family = 2, sa_data =
>> "\225\327ծL<\000\000\000\000\000\000\000"}}
>> >>> (gdb) print *saddr->sa
>> >>> Structure has no component named operator*.
>> >>> (gdb) print saddr
>> >>> $2 = (GSockAddr *) 0x183d350
>> >>>
>> >>>
>> >>> /Claus A
>> >>>
>> >>>
>> >>>> On 21. sep. 2016, at 15.37, Szalai, Attila <
>> Attila.Szalai at morganstanley.com> wrote:
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> If it is convenient to you, just for the sake of my curiosity, can
>> you run some more command in the gdb session, after you reproduced the
>> issue?
>> >>>>
>> >>>> * bt full
>> >>>> * frame 6 (you should search for the line with
>> resolve_sockaddr_to_inet_or_inet6_hostname and write the number written
>> in that line. In the case of your previous email, it was #6)
>> >>>> * print *saddr
>> >>>> * print *saddr->sa
>> >>>>
>> >>>>
>> >>
>> >>
>> >>
>> >> ____________________________________________________________
>> __________________
>> >> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
>> >> Documentation: http://www.balabit.com/support/documentation/?product=
>> syslog-ng
>> >> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>> >>
>> >
>> >
>> > ____________________________________________________________
>> __________________
>> > Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
>> > Documentation: http://www.balabit.com/support/documentation/?product=
>> syslog-ng
>> > FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>> >
>>
>>
>> ____________________________________________________________
>> __________________
>> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
>> Documentation: http://www.balabit.com/support/documentation/?product=
>> syslog-ng
>> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>>
>>
>
>
> --
> Bazsi
>



-- 
Bazsi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.balabit.hu/pipermail/syslog-ng/attachments/20161001/5e05ab32/attachment-0001.htm 


More information about the syslog-ng mailing list