[syslog-ng]syslog-ng dying randomly?

Jim Gifford maillist@jg555.com
Tue, 14 May 2002 20:40:23 -0700


I have also noticed this with version 1.5.17. I just thought it was my
setup.

----- Original Message -----
From: "John Morrissey" <jwm@horde.net>
To: <syslog-ng@lists.balabit.hu>
Sent: Tuesday, May 14, 2002 4:36 PM
Subject: [syslog-ng]syslog-ng dying randomly?


> [warning: really long post follows]
>
> I'm running syslog-ng 1.4.14 on a number of Red Hat 7.1/7.2 systems. It's
> configured fairly simply: it writes logs to
> /var/log/$PROGRAM.$MONTH$DAY$YEAR and also forwards them via UDP to a
> central syslog-ng hub.
>
> client config:
> options {
>         use_fqdn(on);
>         chain_hostnames(off);
>         sync(0);
> };
>
> source s_local {
>         internal();
>         unix-stream("/dev/log");
> };
> source s_kmsg {
>         file("/proc/kmsg");
> };
>
> destination d_sort_program {
>         file("/var/log/$PROGRAM.$MONTH$DAY$YEAR"
>                 group("logs")
>                 perm(0640)
>                 create_dirs(yes)
>                 dir_group("logs")
>                 dir_perm(0750)
>         );
> };
> destination d_syslog_hub {
>         udp(172.16.1.4 port(514));
> };
>
> log {
>         source(s_local);
>         source(s_kmsg);
>         destination(d_sort_program);
>         destination(d_syslog_hub);
> };
>
>
> The problem is that syslog-ng seems to die on a fairly regular basis. I
> can't say for sure if it's because our syslog hub wasn't available for an
> extended period of time or if it's due to some sort of problem with the
> code. I'm doing some testing to see if the availability of the hub makes
> any difference and I'll make a follow-up in a few days.
>
> Anyway, I was curious if anyone had an idea on what might be going on. An
> strace of syslog-ng on one of the machines before it died is very
> interesting.
>
> poll([{fd=6, events=0}, {fd=5, events=0, revents=POLLERR}, {fd=2, events=POLLIN}, {fd=4, events=POLLIN}, {fd=3, events=POLLIN}], 5, 100) = 1
> time(NULL)                              = 1021313163
> getpid()                                = 17061
> time(NULL)                              = 1021313163
> open("/var/log/syslog-ng.05132002", O_WRONLY|O_NONBLOCK|O_APPEND|O_CREAT|O_NOCTTY, 0640) = 0
> chown32("/var/log/syslog-ng.05132002", 0, 6000) = 0
> fcntl64(0, F_GETFL)                     = 0xc01 (flags O_WRONLY|O_NONBLOCK|O_APPEND)
> fcntl64(0, F_SETFL, O_WRONLY|O_NONBLOCK|O_APPEND) = 0
> fcntl64(0, F_SETFD, FD_CLOEXEC)         = 0
> time(NULL)                              = 1021313163
> time(NULL)                              = 1021313163
> time(NULL)                              = 1021313163
> close(5)                                = 0
> poll([{fd=0, events=0}, {fd=6, events=0}, {fd=2, events=POLLIN}, {fd=4, events=POLLIN}, {fd=3, events=POLLIN}], 5, 100) = 0
> getpid()                                = 17061
> time(NULL)                              = 1021313163
> getpid()                                = 17061
> time(NULL)                              = 1021313163
> getpid()                                = 17061
> time(NULL)                              = 1021313163
> getpid()                                = 17061
> time(NULL)                              = 1021313163
>
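
The open() call in the strace above can be cross-checked against the
destination template. A minimal sketch, assuming the time() return value is
read as UTC (the host's time zone is not stated in the post) and using a
hypothetical helper name, log_suffix:

```python
from datetime import datetime, timezone

def log_suffix(epoch_seconds):
    # Render month, day, year in the order used by $MONTH$DAY$YEAR
    # in the d_sort_program destination.
    # Assumption: UTC; the host's local time zone is not given.
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%m%d%Y")

# 1021313163 is the value time(NULL) returned in the strace.
print(log_suffix(1021313163))  # 05132002
```

That matches the /var/log/syslog-ng.05132002 file being opened, so the file
is the per-program log for syslog-ng's own internal() messages.
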
> And so on, alternating between getpid() and time() until it finally
> segfaults and dies. A backtrace is also interesting; it seems like
> syslog-ng might be getting caught in an infinite loop:
>
> #0  0x400b49c9 in chunk_alloc (ar_ptr=0x0, nb=0) at malloc.c:2843
> #1  0x400b4818 in __libc_malloc (bytes=88) at malloc.c:2811
> #2  0x08053b22 in debug_malloc ()
> #3  0x08053bdd in xalloc ()
> #4  0x08053da0 in ol_string_alloc ()
> #5  0x08051203 in c_format_cstring ()
> #6  0x0804b98b in make_internal_message ()
> #7  0x080503e6 in write_internal_message ()
> #8  0x08053291 in msg_vformat ()
> #9  0x0805339c in fatal ()
> #10 0x08053e3d in ol_object_check ()
> #11 0x0804a7c6 in do_add_source_name ()
> #12 0x080503fc in write_internal_message ()
> #13 0x08053291 in msg_vformat ()
> #14 0x0805339c in fatal ()
> #15 0x08053e3d in ol_object_check ()
> #16 0x0804a7c6 in do_add_source_name ()
> #17 0x080503fc in write_internal_message ()
> #18 0x08053291 in msg_vformat ()
> #19 0x0805339c in fatal ()
> #20 0x08053e3d in ol_object_check ()
> #21 0x0804a7c6 in do_add_source_name ()
> #22 0x080503fc in write_internal_message ()
> #23 0x08053291 in msg_vformat ()
> #24 0x0805339c in fatal ()
>
> Frames #20 - #24 repeat over and over. The entire backtrace is huge, almost
> 35,000 frames. Here's where it begins (the outermost frames):
>
> #34475 0x08053e3d in ol_object_check ()
> #34476 0x0804a7c6 in do_add_source_name ()
> #34477 0x080503fc in write_internal_message ()
> #34478 0x08053291 in msg_vformat ()
> #34479 0x0805339c in fatal ()
> #34480 0x08053e3d in ol_object_check ()
> #34481 0x0804a7c6 in do_add_source_name ()
> #34482 0x080503fc in write_internal_message ()
> #34483 0x08053291 in msg_vformat ()
> #34484 0x0805339c in fatal ()
> #34485 0x080544d7 in dont_free_live_resource ()
> #34486 0x080543f3 in do_resource_free ()
> #34487 0x08053961 in gc_sweep ()
> #34488 0x08053a5b in gc ()
> #34489 0x08053aea in gc_maybe ()
> #34490 0x080515df in io_iter ()
> #34491 0x0804a095 in main_loop ()
> #34492 0x0804a3d1 in main ()
> #34493 0x40050647 in __libc_start_main (main=0x804a1c8 <main>, argc=2, ubp_av=0xbffff754,
>     init=0x8049664 <_init>, fini=0x8054620 <_fini>, rtld_fini=0x4000dcd4 <_dl_fini>,
>     stack_end=0xbffff74c) at ../sysdeps/generic/libc-start.c:129
>
> It looks like it's getting caught in an infinite loop which finally
> exhausts its stack space and causes it to segfault(?).
>
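
The cycle in the backtrace (fatal -> msg_vformat -> write_internal_message ->
do_add_source_name -> ol_object_check -> fatal) can be modelled in a few
lines. This is only a toy sketch, not syslog-ng code: the Logger class, its
methods, and the in_fatal guard are hypothetical, and the guard is just one
possible way to break such a cycle, not a fix taken from the project.

```python
class Logger:
    """Toy model: reporting a fatal error runs a check that fails again."""

    def __init__(self, guarded):
        self.guarded = guarded    # whether the re-entrancy guard is active
        self.in_fatal = False     # True while a fatal error is being reported
        self.depth = 0            # number of internal messages attempted

    def object_check(self):
        # Always fails here, standing in for a check on a bad object.
        self.fatal("object check failed")

    def write_internal_message(self, text):
        self.depth += 1
        self.object_check()

    def fatal(self, text):
        if self.guarded and self.in_fatal:
            return  # already reporting a fatal error; do not recurse
        self.in_fatal = True
        try:
            self.write_internal_message(text)
        finally:
            self.in_fatal = False


def run(guarded):
    log = Logger(guarded)
    try:
        log.fatal("resource freed while live")
    except RecursionError:
        # Python raises an exception where a C program would overflow
        # its stack and segfault.
        return "stack exhausted"
    return "stopped at depth %d" % log.depth


print(run(guarded=False))  # stack exhausted
print(run(guarded=True))   # stopped at depth 1
```

Without the guard, each attempt to log the fatal error triggers the failing
check again, which is the shape of the repeating frames above.
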
> If you need any more information, let me know. I also still have the core
> if anybody wants it. Any ideas would be much appreciated!
>
> thanks,
> john
> --
> John Morrissey          _o            /\         ----  __o
> jwm@horde.net        _-< \_          /  \       ----  <  \,
> www.horde.net/    __(_)/_(_)________/    \_______(_) /_(_)__
>
> _______________________________________________
> syslog-ng maillist  -  syslog-ng@lists.balabit.hu
> https://lists.balabit.hu/mailman/listinfo/syslog-ng
>