[syslog-ng] deadlock when logging to SQL destination (PostgreSQL)

Tomáš Novosad tomas.novosad at linuxbox.cz
Mon Jan 5 11:24:01 CET 2015


Hello,

and thanks for reply.

However that is not my case.
The problem is not becouse of TCP conns limit, but because SSH
daemon tries to log a message via syslog-ng and it is blocket on the
attempt, because syslog is blocked.
Syslog is just blocked and does nothing.
And this completely block all services on the server, which tries to log
a message via syslog.

I can reproduce the problem on virtual machine, so i have acces to
blocked system.
Even command:
# logger "bla bli"
just hangs and blocks.

My colleague used a debbuger on a blocked syslog and found it locked in this
position (this does not say much to me):
#0  0x00007f7948ff6654 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f7948ff1f4a in _L_lock_1034 () from /lib64/libpthread.so.0
#2  0x00007f7948ff1e0c in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007f79471ef3da in afsql_dd_message_became_available_in_the_queue (user_data=0x2778340) at afsql.c:889
#4  0x00007f794a52dfea in log_queue_push_notify (self=0x259f0f0) at logqueue.c:58
#5  0x00007f794a52f029 in log_queue_fifo_push_tail (s=0x259f0f0,msg=0x2725cf0, path_options=0x7fffc5c92660) at logqueue-fifo.c:263
#6  0x00007f79471efbc4 in log_queue_push_tail (s=0x2778340,msg=0x2725cf0, path_options=0x7fffc5c92660, user_data=0x0) at ../../lib/logqueue.h:84
#7  afsql_dd_queue (s=0x2778340, msg=0x2725cf0, path_options=0x7fffc5c92660, user_data=0x0) at afsql.c:1198
#8  0x00007f794a527a55 in log_pipe_queue (s=0x25a3c60, msg=0x2725cf0,path_options=0x7fffc5c927b0, user_data=<value optimized out>) at logpipe.h:320
#9  log_multiplexer_queue (s=0x25a3c60, msg=0x2725cf0,path_options=0x7fffc5c927b0, user_data=<value optimized out>) at logmpx.c:106
#10 0x00007f794a527a55 in log_pipe_queue (s=0x25ff210, msg=0x2725cf0,path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logpipe.h:320
#11 log_multiplexer_queue (s=0x25ff210, msg=0x2725cf0,path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logmpx.c:106
#12 0x00007f794a5311d5 in log_pipe_forward_msg (s=0x277b1c0,msg=0x2725cf0, path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logpipe.h:320
#13 log_rewrite_queue (s=0x277b1c0, msg=0x2725cf0,path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logrewrite.c:68
#14 0x00007f794a5311d5 in log_pipe_forward_msg (s=0x277a980,msg=0x2725cf0, path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logpipe.h:320
#15 log_rewrite_queue (s=0x277a980, msg=0x2725cf0,path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logrewrite.c:68
#16 0x00007f794a5311d5 in log_pipe_forward_msg (s=0x277a1a0,msg=0x2725cf0, path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logpipe.h:320
#17 log_rewrite_queue (s=0x277a1a0, msg=0x2725cf0, path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logrewrite.c:68
#18 0x00007f794a5311d5 in log_pipe_forward_msg (s=0x2779bc0,msg=0x2725cf0, path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logpipe.h:320
#19 log_rewrite_queue (s=0x2779bc0, msg=0x2725cf0,path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logrewrite.c:68
#20 0x00007f794a52ac7c in log_pipe_forward_msg (s=0x2779460,msg=0x2689d00, path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logpipe.h:320
#21 log_parser_queue (s=0x2779460, msg=0x2689d00,path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logparser.c:81
#22 0x00007f794a522fcb in log_pipe_forward_msg (s=0x2788480,msg=0x26b4230, path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logpipe.h:320
#23 log_filter_pipe_queue (s=0x2788480, msg=0x26b4230,path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at filter.c:731
#24 0x00007f794a527744 in log_pipe_queue (s=<value optimized out>,msg=0x26b4230, path_options=0x0) at logpipe.h:320
#25 log_pipe_forward_msg (s=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:289
#26 log_pipe_queue (s=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:324
#27 log_pipe_forward_msg (s=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:289
#28 log_pipe_queue (s=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:324
#29 0x00007f794a527af1 in log_pipe_queue (s=0x275bbe0, msg=0x26b4230,path_options=0x7fffc5c93160, user_data=<value optimized out>) at logpipe.h:289
#30 log_multiplexer_queue (s=0x275bbe0, msg=0x26b4230,path_options=0x7fffc5c93160, user_data=<value optimized out>) at logmpx.c:106
#31 0x00007f794a531dbe in log_pipe_queue (self=<value optimized out>,msg=0x26b4230, path_options=0x0) at logpipe.h:320
#32 log_pipe_forward_msg (self=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:289
#33 log_pipe_queue (self=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:324
#34 log_pipe_forward_msg (self=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:289
#35 log_pipe_queue (self=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:324
#36 log_pipe_forward_msg (self=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:289
#37 0x00007f794a531e2d in log_pipe_queue (self=<value optimized out>,msg=0x26b4230, path_options=0x0) at logpipe.h:324
#38 log_pipe_forward_msg (self=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:289

I think i'll have to report this to Balabit.

Thanks for help
-- 
Tomáš Novosad
LinuxBox.cz, s.r.o.
28. října 168, 709 00 Ostrava

tel.:   +420 591 166 221
mobil:  +420 737 238 655
email:  tomas.novosad at linuxbox.cz
jabber: novosad at linuxbox.cz
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis at linuxbox.cz

On  2. 1. 10:08, Jim Hendrick wrote:
> Hi -
> 
>   Not directly similar, but somewhat analogous behaviour was observed
> using syslog-ng to log to a remote elasticsearch destination.
> 
> The system exhibited the same symptoms - no ability to connect or login.
> 
> We have no logs or data to directly support this - but my suspicion is
> that it had to do with TCP connections - possibly "maxing out" the
> system and leaving it impossible to establish any new connections.
> 
> We could not (and cannot) conduct any real diagnostics on this system,
> since it is a production server (kinda painful when it locks up)
> 
> But I would suggest looking at system level measurements & settings
> (memory, TCP connections)
> 
> Also - you don't mention load - do you have the ability to control the
> load and monitor the system as it increases?
> 
> You might try holding a small load for some time (minutes? hours?) while
> measuring system behaviour and ramping up in a "step function" to see if
> you can identify what is going on.
> 
> Good luck,
> Jim
> 


More information about the syslog-ng mailing list