[syslog-ng] deadlock when logging to SQL destination (PostgreSQL)

Scheidler, Balázs balazs.scheidler at balabit.com
Mon Jan 5 12:11:44 CET 2015


Hi,

checking the backtrace again, it might actually be a genuine deadlock. can
you give a backtrace of all the threads as syslog-ng stalls like that?


-- 
Bazsi

On Mon, Jan 5, 2015 at 12:09 PM, Scheidler, Balázs <
balazs.scheidler at balabit.com> wrote:

> Hi,
>
> You probably have flow-control enabled, in which case if the destination
> stalls, syslog-ng will stop reading its inputs as well. Workarounds:
>
> 1) increase the window-size (log-iw-size at the source) & destination
> buffer size (log-fifo-size option at the destination), this will let more
> leeway, until syslog-ng blocks if its destination blocks
> 2) have postgres log to a file, and then read that file as a file source.
> 3) last, but not least disable flow-control
>
> Hope this helps,
>
>
> --
> Bazsi
>
> On Mon, Jan 5, 2015 at 11:24 AM, Tomáš Novosad <tomas.novosad at linuxbox.cz>
> wrote:
>
>> Hello,
>>
>> and thanks for reply.
>>
>> However that is not my case.
>> The problem is not becouse of TCP conns limit, but because SSH
>> daemon tries to log a message via syslog-ng and it is blocket on the
>> attempt, because syslog is blocked.
>> Syslog is just blocked and does nothing.
>> And this completely block all services on the server, which tries to log
>> a message via syslog.
>>
>> I can reproduce the problem on virtual machine, so i have acces to
>> blocked system.
>> Even command:
>> # logger "bla bli"
>> just hangs and blocks.
>>
>> My colleague used a debbuger on a blocked syslog and found it locked in
>> this
>> position (this does not say much to me):
>> #0  0x00007f7948ff6654 in __lll_lock_wait () from /lib64/libpthread.so.0
>> #1  0x00007f7948ff1f4a in _L_lock_1034 () from /lib64/libpthread.so.0
>> #2  0x00007f7948ff1e0c in pthread_mutex_lock () from
>> /lib64/libpthread.so.0
>> #3  0x00007f79471ef3da in afsql_dd_message_became_available_in_the_queue
>> (user_data=0x2778340) at afsql.c:889
>> #4  0x00007f794a52dfea in log_queue_push_notify (self=0x259f0f0) at
>> logqueue.c:58
>> #5  0x00007f794a52f029 in log_queue_fifo_push_tail
>> (s=0x259f0f0,msg=0x2725cf0, path_options=0x7fffc5c92660) at
>> logqueue-fifo.c:263
>> #6  0x00007f79471efbc4 in log_queue_push_tail (s=0x2778340,msg=0x2725cf0,
>> path_options=0x7fffc5c92660, user_data=0x0) at ../../lib/logqueue.h:84
>> #7  afsql_dd_queue (s=0x2778340, msg=0x2725cf0,
>> path_options=0x7fffc5c92660, user_data=0x0) at afsql.c:1198
>> #8  0x00007f794a527a55 in log_pipe_queue (s=0x25a3c60,
>> msg=0x2725cf0,path_options=0x7fffc5c927b0, user_data=<value optimized out>)
>> at logpipe.h:320
>> #9  log_multiplexer_queue (s=0x25a3c60,
>> msg=0x2725cf0,path_options=0x7fffc5c927b0, user_data=<value optimized out>)
>> at logmpx.c:106
>> #10 0x00007f794a527a55 in log_pipe_queue (s=0x25ff210,
>> msg=0x2725cf0,path_options=0x7fffc5c92fe0, user_data=<value optimized out>)
>> at logpipe.h:320
>> #11 log_multiplexer_queue (s=0x25ff210,
>> msg=0x2725cf0,path_options=0x7fffc5c92fe0, user_data=<value optimized out>)
>> at logmpx.c:106
>> #12 0x00007f794a5311d5 in log_pipe_forward_msg
>> (s=0x277b1c0,msg=0x2725cf0, path_options=0x7fffc5c92fe0, user_data=<value
>> optimized out>) at logpipe.h:320
>> #13 log_rewrite_queue (s=0x277b1c0,
>> msg=0x2725cf0,path_options=0x7fffc5c92fe0, user_data=<value optimized out>)
>> at logrewrite.c:68
>> #14 0x00007f794a5311d5 in log_pipe_forward_msg
>> (s=0x277a980,msg=0x2725cf0, path_options=0x7fffc5c92fe0, user_data=<value
>> optimized out>) at logpipe.h:320
>> #15 log_rewrite_queue (s=0x277a980,
>> msg=0x2725cf0,path_options=0x7fffc5c92fe0, user_data=<value optimized out>)
>> at logrewrite.c:68
>> #16 0x00007f794a5311d5 in log_pipe_forward_msg
>> (s=0x277a1a0,msg=0x2725cf0, path_options=0x7fffc5c92fe0, user_data=<value
>> optimized out>) at logpipe.h:320
>> #17 log_rewrite_queue (s=0x277a1a0, msg=0x2725cf0,
>> path_options=0x7fffc5c92fe0, user_data=<value optimized out>) at
>> logrewrite.c:68
>> #18 0x00007f794a5311d5 in log_pipe_forward_msg
>> (s=0x2779bc0,msg=0x2725cf0, path_options=0x7fffc5c92fe0, user_data=<value
>> optimized out>) at logpipe.h:320
>> #19 log_rewrite_queue (s=0x2779bc0,
>> msg=0x2725cf0,path_options=0x7fffc5c92fe0, user_data=<value optimized out>)
>> at logrewrite.c:68
>> #20 0x00007f794a52ac7c in log_pipe_forward_msg
>> (s=0x2779460,msg=0x2689d00, path_options=0x7fffc5c92fe0, user_data=<value
>> optimized out>) at logpipe.h:320
>> #21 log_parser_queue (s=0x2779460,
>> msg=0x2689d00,path_options=0x7fffc5c92fe0, user_data=<value optimized out>)
>> at logparser.c:81
>> #22 0x00007f794a522fcb in log_pipe_forward_msg
>> (s=0x2788480,msg=0x26b4230, path_options=0x7fffc5c92fe0, user_data=<value
>> optimized out>) at logpipe.h:320
>> #23 log_filter_pipe_queue (s=0x2788480,
>> msg=0x26b4230,path_options=0x7fffc5c92fe0, user_data=<value optimized out>)
>> at filter.c:731
>> #24 0x00007f794a527744 in log_pipe_queue (s=<value optimized
>> out>,msg=0x26b4230, path_options=0x0) at logpipe.h:320
>> #25 log_pipe_forward_msg (s=<value optimized out>,
>> msg=0x26b4230,path_options=0x0) at logpipe.h:289
>> #26 log_pipe_queue (s=<value optimized out>,
>> msg=0x26b4230,path_options=0x0) at logpipe.h:324
>> #27 log_pipe_forward_msg (s=<value optimized out>,
>> msg=0x26b4230,path_options=0x0) at logpipe.h:289
>> #28 log_pipe_queue (s=<value optimized out>,
>> msg=0x26b4230,path_options=0x0) at logpipe.h:324
>> #29 0x00007f794a527af1 in log_pipe_queue (s=0x275bbe0,
>> msg=0x26b4230,path_options=0x7fffc5c93160, user_data=<value optimized out>)
>> at logpipe.h:289
>> #30 log_multiplexer_queue (s=0x275bbe0,
>> msg=0x26b4230,path_options=0x7fffc5c93160, user_data=<value optimized out>)
>> at logmpx.c:106
>> #31 0x00007f794a531dbe in log_pipe_queue (self=<value optimized
>> out>,msg=0x26b4230, path_options=0x0) at logpipe.h:320
>> #32 log_pipe_forward_msg (self=<value optimized out>,
>> msg=0x26b4230,path_options=0x0) at logpipe.h:289
>> #33 log_pipe_queue (self=<value optimized out>,
>> msg=0x26b4230,path_options=0x0) at logpipe.h:324
>> #34 log_pipe_forward_msg (self=<value optimized out>,
>> msg=0x26b4230,path_options=0x0) at logpipe.h:289
>> #35 log_pipe_queue (self=<value optimized out>,
>> msg=0x26b4230,path_options=0x0) at logpipe.h:324
>> #36 log_pipe_forward_msg (self=<value optimized out>,
>> msg=0x26b4230,path_options=0x0) at logpipe.h:289
>> #37 0x00007f794a531e2d in log_pipe_queue (self=<value optimized
>> out>,msg=0x26b4230, path_options=0x0) at logpipe.h:324
>> #38 log_pipe_forward_msg (self=<value optimized out>,
>> msg=0x26b4230,path_options=0x0) at logpipe.h:289
>>
>> I think i'll have to report this to Balabit.
>>
>> Thanks for help
>> --
>> Tomáš Novosad
>> LinuxBox.cz, s.r.o.
>> 28. října 168, 709 00 Ostrava
>>
>> tel.:   +420 591 166 221
>> mobil:  +420 737 238 655
>> email:  tomas.novosad at linuxbox.cz
>> jabber: novosad at linuxbox.cz
>> www.linuxbox.cz
>>
>> mobil servis: +420 737 238 656
>> email servis: servis at linuxbox.cz
>>
>> On  2. 1. 10:08, Jim Hendrick wrote:
>> > Hi -
>> >
>> >   Not directly similar, but somewhat analogous behaviour was observed
>> > using syslog-ng to log to a remote elasticsearch destination.
>> >
>> > The system exhibited the same symptoms - no ability to connect or login.
>> >
>> > We have no logs or data to directly support this - but my suspicion is
>> > that it had to do with TCP connections - possibly "maxing out" the
>> > system and leaving it impossible to establish any new connections.
>> >
>> > We could not (and cannot) conduct any real diagnostics on this system,
>> > since it is a production server (kinda painful when it locks up)
>> >
>> > But I would suggest looking at system level measurements & settings
>> > (memory, TCP connections)
>> >
>> > Also - you don't mention load - do you have the ability to control the
>> > load and monitor the system as it increases?
>> >
>> > You might try holding a small load for some time (minutes? hours?) while
>> > measuring system behaviour and ramping up in a "step function" to see if
>> > you can identify what is going on.
>> >
>> > Good luck,
>> > Jim
>> >
>>
>> ______________________________________________________________________________
>> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
>> Documentation:
>> http://www.balabit.com/support/documentation/?product=syslog-ng
>> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.balabit.hu/pipermail/syslog-ng/attachments/20150105/b10201e7/attachment-0001.htm 


More information about the syslog-ng mailing list