<div dir="ltr"><div>Hi,<br><br></div>Checking the backtrace again, it might actually be a genuine deadlock. Can you send a backtrace of all the threads while syslog-ng is stalled like that?<br><br></div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature"><div dir="ltr">-- <br>Bazsi<br></div></div></div>
<br><div class="gmail_quote">On Mon, Jan 5, 2015 at 12:09 PM, Scheidler, Balázs <span dir="ltr"><<a href="mailto:balazs.scheidler@balabit.com" target="_blank">balazs.scheidler@balabit.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div><div>Hi,<br><br></div>You probably have flow-control enabled, in which case, if the destination stalls, syslog-ng will stop reading its inputs as well. Workarounds:<br><br></div>1) increase the window size (log-iw-size at the source) and the destination buffer size (log-fifo-size option at the destination); this gives more leeway before syslog-ng blocks when its destination blocks<br></div><div>2) have postgres log to a file, and then read that file as a file source.<br></div>3) last but not least, disable flow-control<br><br></div>Hope this helps,<span class="HOEnZb"><font color="#888888"><br><br></font></span></div><div class="gmail_extra"><span class="HOEnZb"><font color="#888888"><br clear="all"><div><div><div dir="ltr">-- <br>Bazsi<br></div></div></div></font></span><div><div class="h5">
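<div>As a rough illustration, the first and third workarounds listed above might look like this in a syslog-ng configuration. This is only a sketch: the source and destination names, the connection settings, the table layout and the sizes are all illustrative assumptions, not values from this thread, and need to be adapted to the real setup.<br></div>

```
# Sketch only: every name, credential and size below is a made-up placeholder.
source s_local {
    # Workaround 1: a larger input window at the source.
    unix-stream("/dev/log" log-iw-size(10000));
};

destination d_pgsql {
    sql(
        type(pgsql)
        host("localhost") username("syslog") password("secret")
        database("logs") table("messages")
        columns("datetime", "host", "program", "message")
        values("$R_DATE", "$HOST", "$PROGRAM", "$MSG")
        # Workaround 1: a larger output buffer at the destination.
        log-fifo-size(100000)
    );
};

log {
    source(s_local);
    destination(d_pgsql);
    # Workaround 3: leave out flags(flow-control) so that a stalled
    # destination no longer stops the source from being read.
    # flags(flow-control);
};
```

<div>Larger sizes only buy time: with flow-control on, the source still stops once the bigger buffers fill up. With flow-control off, the source keeps being read, but messages can be dropped when the destination queue is full.<br></div>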
<br><div class="gmail_quote">On Mon, Jan 5, 2015 at 11:24 AM, Tomáš Novosad <span dir="ltr"><<a href="mailto:tomas.novosad@linuxbox.cz" target="_blank">tomas.novosad@linuxbox.cz</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hello,<br>
<br>
and thanks for the reply.<br>
<br>
However, that is not my case.<br>
The problem is not caused by a TCP connection limit. The SSH<br>
daemon tries to log a message via syslog-ng and blocks on the<br>
attempt, because syslog-ng itself is blocked.<br>
syslog-ng is simply blocked and does nothing.<br>
This completely blocks every service on the server that tries to log<br>
a message via syslog.<br>
<br>
I can reproduce the problem on a virtual machine, so I have access to<br>
the blocked system.<br>
Even the command:<br>
# logger "bla bli"<br>
just hangs and blocks.<br>
<br>
My colleague used a debugger on the blocked syslog-ng and found it stuck at this<br>
position (this does not tell me much):<br>
#0 0x00007f7948ff6654 in __lll_lock_wait () from /lib64/libpthread.so.0<br>
#1 0x00007f7948ff1f4a in _L_lock_1034 () from /lib64/libpthread.so.0<br>
#2 0x00007f7948ff1e0c in pthread_mutex_lock () from /lib64/libpthread.so.0<br>
#3 0x00007f79471ef3da in afsql_dd_message_became_available_in_the_queue (user_data=0x2778340) at afsql.c:889<br>
#4 0x00007f794a52dfea in log_queue_push_notify (self=0x259f0f0) at logqueue.c:58<br>
#5 0x00007f794a52f029 in log_queue_fifo_push_tail (s=0x259f0f0,msg=0x2725cf0, path_options=0x7fffc5c92660) at logqueue-fifo.c:263<br>
#6 0x00007f79471efbc4 in log_queue_push_tail (s=0x2778340,msg=0x2725cf0, path_options=0x7fffc5c92660, user_data=0x0) at ../../lib/logqueue.h:84<br>
#7 afsql_dd_queue (s=0x2778340, msg=0x2725cf0, path_options=0x7fffc5c92660, user_data=0x0) at afsql.c:1198<br>
#8 0x00007f794a527a55 in log_pipe_queue (s=0x25a3c60, msg=0x2725cf0,path_options=0x7fffc5c927b0, user_data=<value optimized out>) at logpipe.h:320<br>
#9 log_multiplexer_queue (s=0x25a3c60, msg=0x2725cf0,path_options=0x7fffc5c927b0, user_data=<value optimized out>) at logmpx.c:106<br>
#10 0x00007f794a527a55 in log_pipe_queue (s=0x25ff210, msg=0x2725cf0,path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logpipe.h:320<br>
#11 log_multiplexer_queue (s=0x25ff210, msg=0x2725cf0,path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logmpx.c:106<br>
#12 0x00007f794a5311d5 in log_pipe_forward_msg (s=0x277b1c0,msg=0x2725cf0, path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logpipe.h:320<br>
#13 log_rewrite_queue (s=0x277b1c0, msg=0x2725cf0,path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logrewrite.c:68<br>
#14 0x00007f794a5311d5 in log_pipe_forward_msg (s=0x277a980,msg=0x2725cf0, path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logpipe.h:320<br>
#15 log_rewrite_queue (s=0x277a980, msg=0x2725cf0,path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logrewrite.c:68<br>
#16 0x00007f794a5311d5 in log_pipe_forward_msg (s=0x277a1a0,msg=0x2725cf0, path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logpipe.h:320<br>
#17 log_rewrite_queue (s=0x277a1a0, msg=0x2725cf0, path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logrewrite.c:68<br>
#18 0x00007f794a5311d5 in log_pipe_forward_msg (s=0x2779bc0,msg=0x2725cf0, path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logpipe.h:320<br>
#19 log_rewrite_queue (s=0x2779bc0, msg=0x2725cf0,path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logrewrite.c:68<br>
#20 0x00007f794a52ac7c in log_pipe_forward_msg (s=0x2779460,msg=0x2689d00, path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logpipe.h:320<br>
#21 log_parser_queue (s=0x2779460, msg=0x2689d00,path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logparser.c:81<br>
#22 0x00007f794a522fcb in log_pipe_forward_msg (s=0x2788480,msg=0x26b4230, path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at logpipe.h:320<br>
#23 log_filter_pipe_queue (s=0x2788480, msg=0x26b4230,path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at filter.c:731<br>
#24 0x00007f794a527744 in log_pipe_queue (s=<value optimized out>,msg=0x26b4230, path_options=0x0) at logpipe.h:320<br>
#25 log_pipe_forward_msg (s=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:289<br>
#26 log_pipe_queue (s=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:324<br>
#27 log_pipe_forward_msg (s=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:289<br>
#28 log_pipe_queue (s=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:324<br>
#29 0x00007f794a527af1 in log_pipe_queue (s=0x275bbe0, msg=0x26b4230,path_options=0x7fffc5c93160, user_data=<value optimized out>) at logpipe.h:289<br>
#30 log_multiplexer_queue (s=0x275bbe0, msg=0x26b4230,path_options=0x7fffc5c93160, user_data=<value optimized out>) at logmpx.c:106<br>
#31 0x00007f794a531dbe in log_pipe_queue (self=<value optimized out>,msg=0x26b4230, path_options=0x0) at logpipe.h:320<br>
#32 log_pipe_forward_msg (self=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:289<br>
#33 log_pipe_queue (self=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:324<br>
#34 log_pipe_forward_msg (self=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:289<br>
#35 log_pipe_queue (self=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:324<br>
#36 log_pipe_forward_msg (self=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:289<br>
#37 0x00007f794a531e2d in log_pipe_queue (self=<value optimized out>,msg=0x26b4230, path_options=0x0) at logpipe.h:324<br>
#38 log_pipe_forward_msg (self=<value optimized out>, msg=0x26b4230,path_options=0x0) at logpipe.h:289<br>
<br>
I think I'll have to report this to Balabit.<br>
<br>
Thanks for the help<br>
<span>--<br>
Tomáš Novosad<br>
LinuxBox.cz, s.r.o.<br>
28. října 168, 709 00 Ostrava<br>
<br>
tel.: <a href="tel:%2B420%20591%20166%20221" value="+420591166221" target="_blank">+420 591 166 221</a><br>
mobil: <a href="tel:%2B420%20737%20238%20655" value="+420737238655" target="_blank">+420 737 238 655</a><br>
email: <a href="mailto:tomas.novosad@linuxbox.cz" target="_blank">tomas.novosad@linuxbox.cz</a><br>
jabber: <a href="mailto:novosad@linuxbox.cz" target="_blank">novosad@linuxbox.cz</a><br>
<a href="http://www.linuxbox.cz" target="_blank">www.linuxbox.cz</a><br>
<br>
mobil servis: <a href="tel:%2B420%20737%20238%20656" value="+420737238656" target="_blank">+420 737 238 656</a><br>
email servis: <a href="mailto:servis@linuxbox.cz" target="_blank">servis@linuxbox.cz</a><br>
<br>
</span><span>On 2. 1. 10:08, Jim Hendrick wrote:<br>
> Hi -<br>
><br>
> Not directly similar, but somewhat analogous behaviour was observed<br>
> using syslog-ng to log to a remote elasticsearch destination.<br>
><br>
> The system exhibited the same symptoms - no ability to connect or login.<br>
><br>
> We have no logs or data to directly support this - but my suspicion is<br>
> that it had to do with TCP connections - possibly "maxing out" the<br>
> system and leaving it impossible to establish any new connections.<br>
><br>
> We could not (and cannot) conduct any real diagnostics on this system,<br>
> since it is a production server (kinda painful when it locks up)<br>
><br>
> But I would suggest looking at system level measurements & settings<br>
> (memory, TCP connections)<br>
><br>
> Also - you don't mention load - do you have the ability to control the<br>
> load and monitor the system as it increases?<br>
><br>
> You might try holding a small load for some time (minutes? hours?) while<br>
> measuring system behaviour and ramping up in a "step function" to see if<br>
> you can identify what is going on.<br>
><br>
> Good luck,<br>
> Jim<br>
><br>
</span><div><div>
</div></div></blockquote></div><br></div></div></div>
</blockquote></div><br></div>