[syslog-ng] syslog-ng on solaris locks up after a while

Igor Manassypov imanassypov at rogers.com
Tue Nov 10 14:09:28 CET 2009


(gdb) bt full
#0  0xfed46df0 in __lwp_park () from /lib/libc.so.1
No symbol table info available.
#1  0xfed40c44 in cond_sleep_queue () from /lib/libc.so.1
No symbol table info available.
#2  0xfed40e08 in cond_wait_queue () from /lib/libc.so.1
No symbol table info available.
#3  0xfed41350 in cond_wait () from /lib/libc.so.1
No symbol table info available.
#4  0xfed4138c in pthread_cond_wait () from /lib/libc.so.1
No symbol table info available.
#5  0xff119d80 in g_async_queue_pop_intern_unlocked (queue=0x757e0, try=0, end_time=0x75618) at gasyncqueue.c:359
        retval = (gpointer) 0xa15b8
        __PRETTY_FUNCTION__ = "g_async_queue_pop_intern_unlocked"
#6  0xff119e80 in g_async_queue_pop (queue=0x757e0) at gasyncqueue.c:398
        retval = (gpointer) 0x757e0
        __PRETTY_FUNCTION__ = "g_async_queue_pop"
#7  0x0003e984 in afinter_source_dispatch (source=0x8d260, callback=0x3e9dc <afinter_source_dispatch_msg>, user_data=0x8d1e0) at afinter.c:112
        msg = (LogMessage *) 0xa0dc0
        path_options = {flow_control = -1, matched = 0x0}
        tv = {tv_sec = 1257363112, tv_usec = 441817}
#8  0xff143564 in g_main_context_dispatch (context=0x8d158) at gmain.c:2144
No locals.
#9  0xff1459a4 in g_main_context_iterate (context=0x8d158, block=1, dispatch=1, self=0x76030) at gmain.c:2778
        max_priority = 2147483647
        timeout = 4000
        some_ready = 1
        nfds = 4
        allocated_nfds = 1
        fds = (GPollFD *) 0x788c8
        __PRETTY_FUNCTION__ = "g_main_context_iterate"
#10 0xff146050 in g_main_context_iteration (context=0x8d158, may_block=1) at gmain.c:2841
        retval = 1
#11 0x0001bc20 in main_loop_run (cfg=0xffbffbc8) at main.c:149
        iters = 0
        stats_timer_id = 0
#12 0x0001c260 in main (argc=1, argv=0xffbffd44) at main.c:394
        cfg = (GlobalConfig *) 0x794d0
        rc = 0
        ctx = (GOptionContext *) 0x76030
        error = (GError *) 0x0

Igor M., M.Eng, P.Eng Network Architect

--- On Mon, 11/9/09, Pallagi Zoltán <pzolee at balabit.hu> wrote:

From: Pallagi Zoltán <pzolee at balabit.hu>
Subject: Re: [syslog-ng] syslog-ng on solaris locks up after a while
To: imanassypov at rogers.com, "Syslog-ng users' and developers' mailing list" <syslog-ng at lists.balabit.hu>
Date: Monday, November 9, 2009, 11:35 AM

Igor Manassypov wrote:

        Would this one make more sense?

        bash-3.00# ps -eaf | grep syslog
            root 22562 22561   0   Nov 04 ?           0:30 /usr/local/sbin/syslog-ng
            root 22561     1   0   Nov 04 ?           0:00 /usr/local/sbin/syslog-ng

        bash-3.00# truss -f -p 22562
        22562/2:        door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)
        22562/1:        lwp_park(0x00000000, 0)         (sleeping....)
        22562/1:            Received signal #11, SIGSEGV, in lwp_park() [default]
        22562/1:              siginfo: SIGSEGV pid=12717 uid=0
        22562/1:        lwp_park(0x00000000, 0)                         Err#4 EINTR

        Core was generated by `/usr/local/sbin/syslog-ng'.
        Program terminated with signal 11, Segmentation fault.
        [New process 88098    ]
        [New process 153634    ]
        #0  0xfed46df0 in __lwp_park () from /lib/libc.so.1
        #0  0xfed46df0 in __lwp_park () from /lib/libc.so.1

        bash-3.00# gdb syslog-ng core
        Core was generated by `/usr/local/sbin/syslog-ng'.
        Program terminated with signal 11, Segmentation fault.
        [New process 88098    ]
        [New process 153634    ]
        #0  0xfed46df0 in __lwp_park () from /lib/libc.so.1
        (gdb)

Please show us the output of "bt full" too.

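For anyone following along, here is a minimal sketch of capturing that backtrace non-interactively so it can be pasted into a reply. The helper name `capture_bt_full` and the output file name are hypothetical. It assumes a gdb recent enough to support `-batch` and `-ex`. It also uses `thread apply all bt full` rather than plain `bt full`, which goes slightly beyond the request above: since the truss output shows two LWPs, this dumps every thread rather than only the current one.

```shell
# Hypothetical helper: dump a full backtrace of every thread in a core file.
# $1 = path to the binary, $2 = path to the core file.
# GDB may be overridden; it defaults to whatever `gdb` is on PATH (an assumption).
capture_bt_full() {
    "${GDB:-gdb}" -batch -ex "thread apply all bt full" "$1" "$2"
}

# Usage: capture_bt_full /usr/local/sbin/syslog-ng core > bt-full.txt
```

The binary passed in should be the same build that produced the core, otherwise the frames and locals will not resolve properly.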
--- On Tue, 11/3/09, Balazs Scheidler <bazsi at balabit.hu> wrote:

From: Balazs Scheidler <bazsi at balabit.hu>
Subject: Re: [syslog-ng] syslog-ng on solaris locks up after a while
To: imanassypov at rogers.com, "Syslog-ng users' and developers' mailing list" <syslog-ng at lists.balabit.hu>
Cc: "Pallagi Zoltán" <pzolee at balabit.hu>, network at ci.com
Date: Tuesday, November 3, 2009, 2:11 PM

Hi,

The problem is that you killed the supervisor process, which restarts
syslog-ng in case it crashes. However, the hang is not in this part but
in its child.

So by looking at the ps output, I'd say that in this situation you
should have trussed 13621 and not its parent.
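
As an illustration of the supervisor/child distinction, here is a minimal sketch that picks the worker PIDs out of `ps -eaf` output. The helper name `syslog_ng_workers` is hypothetical, and it assumes the PID and PPID are the second and third columns, as in the listing quoted below.

```shell
# Hypothetical helper: read `ps -eaf` output on stdin and print the PID of every
# syslog-ng process whose parent is also a syslog-ng process (the workers).
syslog_ng_workers() {
    awk '
        # Remember PID and PPID of each syslog-ng line, skipping the grep itself.
        /syslog-ng/ && !/grep/ { pid[NR] = $2; ppid[NR] = $3; is_ng[$2] = 1 }
        END {
            # A worker is a syslog-ng process whose parent is a syslog-ng process.
            for (n = 1; n <= NR; n++)
                if ((n in pid) && (ppid[n] in is_ng))
                    print pid[n]
        }
    '
}

# Usage: ps -eaf | syslog_ng_workers
```

Fed the listing from this thread, it would print 13013 and 13621. The second is the PID to hand to `truss -f -p`, since the first belongs to the separate `syslog-ng -v` instance.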

          

On Tue, 2009-11-03 at 08:54 -0800, Igor Manassypov wrote:
> Hi Zoltan,
>
> Here are the traces:
>
> bash-3.00# ps -eaf | grep syslog
>     root 12694 12616   0 11:37:07 pts/1       0:00 grep syslog
>     root 13012     1   0   Oct 21 ?           0:00 syslog-ng -v
>     root 13013 13012   0   Oct 21 ?           0:41 syslog-ng -v
>     root 13620     1   0   Oct 08 ?           0:00 /usr/local/sbin/syslog-ng
>     root 13621 13620   0   Oct 08 ?           6:16 /usr/local/sbin/syslog-ng
> bash-3.00# truss -f -p "13620"
> 13620:  waitid(P_PID, 13621, 0xFFBFF468, WEXITED|WTRAPPED) (sleeping...)
>
> 13620:      Received signal #11, SIGSEGV, in waitid() [default]
> 13620:        siginfo: SIGSEGV pid=12717 uid=0
> 13620:  waitid(P_PID, 13621, 0xFFBFF468, WEXITED|WTRAPPED) Err#4 EINTR
>
> Core was generated by `/usr/local/sbin/syslog-ng'.
> Program terminated with signal 11, Segmentation fault.
> [New process 79156    ]
> #0  0xfed4ad80 in _waitid () from /lib/libc.so.1
> (gdb) bt full
> #0  0xfed4ad80 in _waitid () from /lib/libc.so.1
> No symbol table info available.
> #1  0xfecee038 in _waitpid () from /lib/libc.so.1
> No symbol table info available.
> #2  0xfed3a70c in waitpid () from /lib/libc.so.1
> No symbol table info available.
> #3  0x0003017c in g_process_start () at gprocess.c:1042
>         rc = 0
>         deadlock = 0
>         pid = 13621
>         __PRETTY_FUNCTION__ = "g_process_start"
> #4  0x0001c214 in main (argc=1, argv=0xffbffd14) at main.c:371
>         cfg = (GlobalConfig *) 0x10034
>         rc = 310272
>         ctx = (GOptionContext *) 0x76030
>         error = (GError *) 0x0
>
> Please let me know if I can provide you with more information,
>
> Thanks!
>
> --- On Tue, 11/3/09, Pallagi Zoltán <pzolee at balabit.hu> wrote:
>
>         From: Pallagi Zoltán <pzolee at balabit.hu>
>         Subject: Re: [syslog-ng] syslog-ng on solaris locks up after a while
>         To: imanassypov at rogers.com, "Syslog-ng users' and developers' mailing list" <syslog-ng at lists.balabit.hu>
>         Received: Tuesday, November 3, 2009, 11:10 AM
>
>         Hi Igor,
>
>         Can you show me the truss output or a backtrace of the stuck
>         syslog-ng?
>
>         truss:
>
>         truss -f -p "syslog-ng pid"
>
>         backtrace:
>
>         kill -11 "syslog-ng pid" (syslog-ng will drop a core file)
>         gdb syslog-ng core
>         bt full
>
>         Igor Manassypov wrote:
>         > Hello,
>         >
>         > I am having an issue with a Solaris installation of
>         > syslog-ng. It is configured such that all the logs are
>         > stored in different per-IP folders. This is my centralized
>         > logging device, so it is fairly heavily loaded with
>         > receiving logs from a few dozen hosts. The syslog-ng process
>         > locks up every two to three weeks, with no messages logged
>         > to any of the files. The only way of getting it back is to
>         > kill -9 the process and restart it.
>         >
>         > Is there any known issue of this sort, and is there any
>         > way around it other than recycling the daemon every
>         > night?
>         >
>         > Here is the version info:
>         >
>         > bash-3.00# syslog-ng --version
>         > syslog-ng 3.0.4
>         > Revision: ssh+git://bazsi@git.balabit//var/scm/git/syslog-ng/syslog-ng-ose--mainline--3.0#master#1b5d618e301ad94aa20e692ffba16469dece8d10
>         > Compile-Date: Aug 11 2009 10:44:17
>         > Enable-Threads: on
>         > Enable-Debug: off
>         > Enable-GProf: off
>         > Enable-Memtrace: off
>         > Enable-Sun-STREAMS: on
>         > Enable-Sun-Door: on
>         > Enable-IPv6: off
>         > Enable-Spoof-Source: on
>         > Enable-TCP-Wrapper: off
>         > Enable-SSL: on
>         > Enable-SQL: on
>         > Enable-Linux-Caps: off
>         > Enable-Pcre: on
>         >
>         > bash-3.00# uname -a
>         > SunOS prelude 5.10 Generic_137137-09 sun4v sparc SUNW,T5240
>         >
>         > Thanks!
>         >
>         > -igor
>         >
>         > Igor Manassypov, M.Eng, P.Eng, CCIE 23032, CCVP Network
>         > Architect
>         >
>         > ______________________________________________________________________________
>         > Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
>         > Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
>         > FAQ: http://www.campin.net/syslog-ng/faq.html

-- 
Bazsi