[syslog-ng] syslog-ng on solaris locks up after a while

Balazs Scheidler bazsi at balabit.hu
Thu Nov 12 17:02:04 CET 2009


Hi,

Not really. Are there multiple threads in the same core file?

For example, what does "info threads" print?

It would be nice to have the backtrace for all threads, like this:

(gdb) thread 1
(gdb) bt
(gdb) thread 2
(gdb) bt

and so on, for each thread ID that "info threads" lists.
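
As a shortcut, and assuming the gdb on that box is recent enough to
support it, "thread apply all bt full" prints the backtrace of every
thread in one go. A minimal sketch for capturing that non-interactively
follows. The file names all-threads.gdb and all-threads.txt are made up
for illustration, and the binary path is the one from your ps output:

```shell
# Put the gdb commands in a file so they can be reviewed before running.
# (all-threads.gdb is a hypothetical file name.)
cat > all-threads.gdb <<'EOF'
info threads
thread apply all bt full
EOF

# Run gdb in batch mode against the core file, if both are available.
# The binary path and the core file name "core" are assumptions taken
# from the earlier messages in this thread.
if [ -f core ] && command -v gdb >/dev/null 2>&1; then
    gdb -batch -x all-threads.gdb /usr/local/sbin/syslog-ng core \
        > all-threads.txt 2>&1
fi
```

If that works, attaching all-threads.txt to your reply would be enough.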


On Fri, 2009-11-06 at 11:41 -0800, Igor Manassypov wrote:
> Would this one make more sense?
> 
> 
> 
> bash-3.00# ps -eaf | grep syslog 
>     root 22562 22561   0   Nov 04 ?           0:30 /usr/local/sbin/syslog-ng
>     root 22561     1   0   Nov 04 ?           0:00 /usr/local/sbin/syslog-ng
> 
> bash-3.00# truss -f -p 22562 
> 22562/2:        door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)
> 22562/1:        lwp_park(0x00000000, 0)         (sleeping....)
> 22562/1:            Received signal #11, SIGSEGV, in lwp_park() [default]
> 22562/1:              siginfo: SIGSEGV pid=12717 uid=0
> 22562/1:        lwp_park(0x00000000, 0)                         Err#4 EINTR
> 
> Core was generated by `/usr/local/sbin/syslog-ng'. 
> Program terminated with signal 11, Segmentation fault. 
> [New process 88098    ] 
> [New process 153634    ] 
> #0  0xfed46df0 in __lwp_park () from /lib/libc.so.1 
> #0  0xfed46df0 in __lwp_park () from /lib/libc.so.1
> 
> bash-3.00# gdb syslog-ng core
> 
> Core was generated by `/usr/local/sbin/syslog-ng'. 
> Program terminated with signal 11, Segmentation fault. 
> [New process 88098    ] 
> [New process 153634    ] 
> #0  0xfed46df0 in __lwp_park () from /lib/libc.so.1 
> (gdb) 
> 
> --- On Tue, 11/3/09, Balazs Scheidler <bazsi at balabit.hu> wrote:
>         
>         From: Balazs Scheidler <bazsi at balabit.hu>
>         Subject: Re: [syslog-ng] syslog-ng on solaris locks up after a while
>         To: imanassypov at rogers.com, "Syslog-ng users' and developers'
>         mailing list" <syslog-ng at lists.balabit.hu>
>         Cc: "Pallagi Zoltán" <pzolee at balabit.hu>, network at ci.com
>         Date: Tuesday, November 3, 2009, 2:11 PM
>         
>         Hi,
>         
>         The problem is that you killed the supervisor process, which
>         restarts syslog-ng in case it crashes. However, the hang is not
>         in this part, but in its child.
>         
>         So by looking at the ps output, I'd say that in this situation
>         you should have trussed 13621 and not its parent.
>         
>         On Tue, 2009-11-03 at 08:54 -0800, Igor Manassypov wrote:
>         > Hi Zoltan,
>         > 
>         > 
>         > Here are the traces:
>         > 
>         > bash-3.00# ps -eaf | grep syslog
>         >     root 12694 12616   0 11:37:07 pts/1       0:00 grep syslog
>         >     root 13012     1   0   Oct 21 ?           0:00 syslog-ng -v
>         >     root 13013 13012   0   Oct 21 ?           0:41 syslog-ng -v
>         >     root 13620     1   0   Oct 08 ?           0:00 /usr/local/sbin/syslog-ng
>         >     root 13621 13620   0   Oct 08 ?           6:16 /usr/local/sbin/syslog-ng
>         > bash-3.00# truss -f -p "13620"
>         > 13620:  waitid(P_PID, 13621, 0xFFBFF468, WEXITED|WTRAPPED) (sleeping...)
>         > 
>         > 13620:      Received signal #11, SIGSEGV, in waitid() [default]
>         > 13620:        siginfo: SIGSEGV pid=12717 uid=0
>         > 13620:  waitid(P_PID, 13621, 0xFFBFF468, WEXITED|WTRAPPED) Err#4 EINTR
>         > 
>         > Core was generated by `/usr/local/sbin/syslog-ng'.
>         > Program terminated with signal 11, Segmentation fault.
>         > [New process 79156    ]
>         > #0  0xfed4ad80 in _waitid () from /lib/libc.so.1
>         > (gdb) bt full
>         > #0  0xfed4ad80 in _waitid () from /lib/libc.so.1
>         > No symbol table info available.
>         > #1  0xfecee038 in _waitpid () from /lib/libc.so.1
>         > No symbol table info available.
>         > #2  0xfed3a70c in waitpid () from /lib/libc.so.1
>         > No symbol table info available.
>         > #3  0x0003017c in g_process_start () at gprocess.c:1042
>         >         rc = 0
>         >         deadlock = 0
>         >         pid = 13621
>         >         __PRETTY_FUNCTION__ = "g_process_start"
>         > #4  0x0001c214 in main (argc=1, argv=0xffbffd14) at main.c:371
>         >         cfg = (GlobalConfig *) 0x10034
>         >         rc = 310272
>         >         ctx = (GOptionContext *) 0x76030
>         >         error = (GError *) 0x0
>         > 
>         > Please let me know if I can provide you with more information,
>         > 
>         > Thanks!
>         > 
>         > --- On Tue, 11/3/09, Pallagi Zoltán <pzolee at balabit.hu> wrote:
>         >         
>         >         From: Pallagi Zoltán <pzolee at balabit.hu>
>         >         Subject: Re: [syslog-ng] syslog-ng on solaris locks up after a while
>         >         To: imanassypov at rogers.com, "Syslog-ng users' and developers'
>         >         mailing list" <syslog-ng at lists.balabit.hu>
>         >         Received: Tuesday, November 3, 2009, 11:10 AM
>         >         
>         >         Hi Igor,
>         >         
>         >         Can you show me the truss output or a backtrace of the stuck
>         >         syslog-ng?
>         >         
>         >         truss:
>         >         
>         >         truss -f -p "syslog-ng pid"
>         >         
>         >         backtrace:
>         >         
>         >         kill -11 "syslog-ng pid" (syslog-ng will drop a core file)
>         >         gdb syslog-ng core
>         >         bt full
>         >         
>         >         Igor Manassypov wrote:
>         >         > Hello,
>         >         > 
>         >         > 
>         >         > I am having an issue with a Solaris installation of
>         >         > syslog-ng. It is configured so that all the logs are
>         >         > stored in different per-IP folders. This is my centralized
>         >         > logging device, so it is fairly heavily loaded with
>         >         > receiving logs from a few dozen hosts. The syslog-ng process
>         >         > locks up every two to three weeks, with no messages logged
>         >         > to any of the files. The only way of getting it back is to
>         >         > kill -9 the process and restart it.
>         >         > 
>         >         > Is there any known issue of this sort, and is there any
>         >         > way around it other than recycling the daemon every
>         >         > night?
>         >         > 
>         >         > 
>         >         > here is the version info:
>         >         > 
>         >         > bash-3.00# syslog-ng --version
>         >         > syslog-ng 3.0.4
>         >         > Revision: ssh+git://bazsi@git.balabit//var/scm/git/syslog-ng/syslog-ng-ose--mainline--3.0#master#1b5d618e301ad94aa20e692ffba16469dece8d10
>         >         > Compile-Date: Aug 11 2009 10:44:17
>         >         > Enable-Threads: on
>         >         > Enable-Debug: off
>         >         > Enable-GProf: off
>         >         > Enable-Memtrace: off
>         >         > Enable-Sun-STREAMS: on
>         >         > Enable-Sun-Door: on
>         >         > Enable-IPv6: off
>         >         > Enable-Spoof-Source: on
>         >         > Enable-TCP-Wrapper: off
>         >         > Enable-SSL: on
>         >         > Enable-SQL: on
>         >         > Enable-Linux-Caps: off
>         >         > Enable-Pcre: on
>         >         > 
>         >         > bash-3.00# uname -a
>         >         > SunOS prelude 5.10 Generic_137137-09 sun4v sparc SUNW,T5240
>         >         > Thanks!
>         >         > 
>         >         > -igor
>         >         > 
>         >         > Igor Manassypov., M.Eng, P.Eng, CCIE 23032, CCVP
>         >         > Network Architect
>         >         > 
>         >         >
>         ____________________________________________________________
>         >         > 
>         >         >
>         ______________________________________________________________________________
>         >         > Member info:
>         https://lists.balabit.hu/mailman/listinfo/syslog-ng
>         >         > Documentation:
>         http://www.balabit.com/support/documentation/?product=syslog-ng
>         >         > FAQ: http://www.campin.net/syslog-ng/faq.html
>         >         > 
>         >         >   
>         >         
>         >         
>         > 
>         -- 
>         Bazsi
>         
>         
>         
-- 
Bazsi


