[syslog-ng] Fwd: Memleak in syslog-ng v3.19.1

Vidya Balakrishnan vidyabalakrishnan96 at gmail.com
Tue Jun 9 14:04:57 UTC 2020


Thanks again for the response.

I will look into implementing your suggestions and will observe the memory
trend of syslog-ng. Will keep you posted.

Regards,
Vidya Balakrishnan

On Tue, Jun 9, 2020 at 7:20 PM Balazs Scheidler <bazsi77 at gmail.com> wrote:

> Memory is allocated from the system in large chunks (by the libc
> allocator), which is then allocated by syslog-ng in smaller blocks. Some of
> these small blocks may still be in use in the large block which prevents
> the libc allocator to return it to the system. Also, returning these large
> chunks is done lazily and that is outside of the context of syslog-ng.
>
> There are alternative allocators (jemalloc for instance but there might be
> others), that exhibit this behavior to a lesser extent.
>
> Also, like I mentioned, using multiple threads increases memory usage in a
> number of ways, so disable them completely, that can save you a few megs.
>
>
>
> On Tue, Jun 9, 2020, 13:09 Vidya Balakrishnan <
> vidyabalakrishnan96 at gmail.com> wrote:
>
>> Greetings,
>>
>> Thanks for the quick response.
>>
>> I quickly looked into the output of smaps before and after memory
>> increase.
>> There are actually two scenarios were we see an increase in the memory
>> consumption as seen from the PS command as below
>>
>> *#ps | grep syslog*
>>
>> * PID    USER       VSZ    STAT  COMMAND*
>> *23760  admin      26700    S           /sbin/syslog-ng
>> --cfgfile=/var/syslog-ng.conf*
>>
>> *where VSZ -> virtual size*
>>
>> *Scenario 1:*
>> We run syslog-ng in a home router which pushes the logs to a Graylog
>> server periodically. Whenever there is a WAN connectivity issue in the
>> router, it generates a log file which contains almost upto 4000 lines of
>> logs. When the connectivity is up, these logs are pushed at once to the
>> queue of syslog-ng as seen from the output of syslog-ng-ctl stats and then
>> written to Graylog server. So when bulk log message containing upto 3000
>> lines are pushed at once, we see an increase in memory from 26MB to 28MB
>> and this value remains at 28MB even after the logs are pushed to the server
>> successfully.
>>
>> *Output of syslog-ng-ctl stats before WAN connectivity is lost in the
>> router:*
>>
>> dst.network;d_Log_Server#0;tcp,abcd.com:1514;a;dropped;0
>>
>> dst.network;d_Log_Server#0;tcp,abcd.com:1514;a;processed*;613*
>>
>> dst.network;d_Log_Server#0;tcp,abcd.com:1514;a;queued;*0*
>>
>> dst.network;d_Log_Server#0;tcp,abcd.com:1514;a;written;*613*
>>
>>
>> *After WAN connection is restored to the router, WAN DOWN logs get
>> written to the queue and queue size increases also memory consumption of
>> syslog-ng increases by 2MB:*
>>
>> dst.network;d_Log_Server#0;tcp, abcd.com :1514;a;dropped;0
>>
>> dst.network;d_Log_Server#0;tcp, abcd.com :1514;a;processed;4538
>>
>> dst.network;d_Log_Server#0;tcp, abcd.com :1514;a;queued;*3877*
>>
>> dst.network;d_Log_Server#0;tcp, abcd.com :1514;a;written;661
>>
>>
>> *After logs are written from queue to server, queue size comes back to 0,
>> but the memory consumption which increased to 28MB does not come down to
>> 26MB(which was the initial value at launch) :*
>>
>> dst.network;d_Log_Server#0;tcp, abcd.com :1514;a;dropped;0
>>
>> dst.network;d_Log_Server#0;tcp, abcd.com :1514;a;processed;5250
>>
>> dst.network;d_Log_Server#0;tcp, abcd.com :1514;a;queued;*0*
>>
>> dst.network;d_Log_Server#0;tcp, abcd.com :1514;a;written;5250
>>
>>
>> *Scenario 2:  *
>>
>> When syslog-ng is continuously in use for about 3 days, we see an
>> increase in memory from 26MB to 37MB all at once (please note that this is
>> not a gradual increase, but an increase that occurs all of a sudden).
>>
>> During this scenario, the server is reachable and the processed and
>> written message count is equal and queue size is 0.
>>
>>
>> dst.network;d_Log_Server#0;tcp, abcd.com :1514;a;dropped;0
>>
>> dst.network;d_Log_Server#0;tcp, abcd.com :1514;a;processed;5000
>>
>> dst.network;d_Log_Server#0;tcp, abcd.com :1514;a;queued;*0*
>>
>> dst.network;d_Log_Server#0;tcp, abcd.com :1514;a;written;5000
>>
>>
>> *Queries:*
>>
>> 1.From scenario 1, memory usage increases by 2MB (once in 5 tries) when
>> 3000 log messages are dumped to the queue at once. Is this a potential
>> memory leak ? Or is this behaviour expected ?
>>
>> 2.From scenario 2 even when the setup has proper WAN connectivity( no
>> bulk logs are pushed at once to the queue)  there is an increase seen in
>> memory usage of syslog-ng (from the output of PS command that has been
>> pasted above). Is it because of memory leak in the code of syslog-ng or is
>> this behaviour expected on running syslog-ng continuously for a few days?
>>
>>
>> Could you please help resolve my confusion regarding the above two
>> scenarios that have been described.
>>
>>
>> Regards,
>>
>> Vidya Balakrishnan
>>
>> On Tue, Jun 9, 2020 at 2:40 PM Balazs Scheidler <bazsi77 at gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Unfortunately I am not sure this is a leak from what you have posted.
>>> The issues valgrind reported were one-off allocations caused by plugins
>>> being loaded into syslog-ng.
>>>
>>> The numbers (26, 35 and even 45mb) is not conclusive either. Syslog-ng
>>> uses internal data structures that are dynamically growing up to a point
>>> and then stabilize. Also, syslog-ng uses threads and per thread instances
>>> of these data structures, not to mention the libc's heap allocator that
>>> also tends to use per thread areas. Similarly stacks used by threads also
>>> get an allocation.
>>>
>>> You can get a glimpse of chunks being allocated to the process by
>>> checking /proc/PID/maps, and syslog-ng also has some stats counters that
>>> reflect its internal memory usage. (Scratch-buffers, etc).
>>>
>>> You can try to disable multi-threaded operation of syslog-ng and see if
>>> that decreases usage.
>>>
>>> On Tue, Jun 9, 2020, 09:31 Vidya Balakrishnan <
>>> vidyabalakrishnan96 at gmail.com> wrote:
>>>
>>>> Greetings,
>>>>
>>>> This mail is regarding Memory leak in Syslog-ng v3.19.1. The process
>>>> gets launched at 26MB and then grows to 37MB or even upto 45MB.
>>>>
>>>> Pasting the configurations used below:
>>>> @version: 3.19
>>>> @include "scl.conf"
>>>> source s_local {
>>>>     #system();
>>>>     #internal();
>>>>     syslog(port(514) transport("udp"));
>>>> };
>>>> rewrite tag_SN {
>>>>     set("ABC-XXXXXXX", value("HOST"));
>>>> };
>>>>
>>>> destination d_Log_Server {
>>>>     network("abcd.com"
>>>>             port(514)
>>>>             transport(tls)
>>>>             tls( ca-dir("/etc/ssl") peer-verify("required-trusted") )
>>>>             time-zone("+00:00")
>>>>
>>>> template("${R_YEAR}-${R_MONTH}-${R_DAY}T${R_HOUR}:${R_MIN}:${R_SEC}.${R_MSEC}${TZOFFSET}
>>>> ${HOST} ${MSGHDR}${MESSAGE}\n")
>>>>             );
>>>> };
>>>> log {
>>>>     source(s_local);
>>>>     rewrite(tag_SN);
>>>>     destination(d_Log_Server);
>>>> };
>>>>
>>>> Also, tried debugging using valgrind but did not resolve the issue.
>>>>
>>>> ==19671== Memcheck, a memory error detector
>>>>
>>>> ==19671== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et
>>>> al.
>>>>
>>>> ==19671== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright
>>>> info
>>>>
>>>> ==19671== Command: /sbin/syslog-ng --cfgfile=/var/syslog-ng.conf
>>>>
>>>> ==19671== Parent PID: 2190
>>>>
>>>> ==19671==
>>>>
>>>> ==19671== Conditional jump or move depends on uninitialised value(s)
>>>>
>>>> ==19671==    at 0x401A8D4: index (in /lib/ld-linux.so.3)
>>>>
>>>> ==19671==    by 0x4007F33: expand_dynamic_string_token (in
>>>> /lib/ld-linux.so.3)
>>>>
>>>> ==19671==
>>>>
>>>> ==19671== Conditional jump or move depends on uninitialised value(s)
>>>>
>>>> ==19671==    at 0x401A8D8: index (in /lib/ld-linux.so.3)
>>>>
>>>> ==19671==    by 0x4007F33: expand_dynamic_string_token (in
>>>> /lib/ld-linux.so.3)
>>>>
>>>> ==19671==
>>>>
>>>> ==19671== Conditional jump or move depends on uninitialised value(s)
>>>>
>>>> ==19671==    at 0x4007F38: expand_dynamic_string_token (in
>>>> /lib/ld-linux.so.3)
>>>>
>>>> ==19671==
>>>>
>>>> disInstr(arm): unhandled instruction: 0xEC510F1E
>>>>
>>>>                  cond=14(0xE) 27:20=197(0xC5) 4:4=1 3:0=14(0xE)
>>>>
>>>> ==19671== valgrind: Unrecognised instruction at address 0x4cd6e68.
>>>>
>>>> ==19671==    at 0x4CD6E68: _armv7_tick (in /lib/libcrypto.so.1.1)
>>>>
>>>> ==19671== Your program just tried to execute an instruction that
>>>> Valgrind
>>>>
>>>> ==19671== did not recognise.  There are two possible reasons for this.
>>>>
>>>> ==19671== 1. Your program has a bug and erroneously jumped to a non-code
>>>>
>>>> ==19671==    location.  If you are running Memcheck and you just saw a
>>>>
>>>> ==19671==    warning about a bad jump, it's probably your program's
>>>> fault.
>>>>
>>>> ==19671== 2. The instruction is legitimate but Valgrind doesn't handle
>>>> it,
>>>>
>>>> ==19671==    i.e. it's Valgrind's fault.  If you think this is the case
>>>> or
>>>>
>>>> ==19671==    you are not sure, please let us know and we'll try to fix
>>>> it.
>>>>
>>>> ==19671== Either way, Valgrind will now raise a SIGILL signal which will
>>>>
>>>> ==19671== probably kill your program.
>>>>
>>>> ==19700== Invalid read of size 4
>>>>
>>>> ==19700==    at 0x401A8D4: index (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==    by 0x40145BB: dl_open_worker (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==  Address 0x4f86bb4 is 3 bytes after a block of size 41 alloc'd
>>>>
>>>> ==19700==    at 0x4847654: malloc (vg_replace_malloc.c:299)
>>>>
>>>> ==19700==    by 0x49B3BC3: g_malloc (gmem.c:94)
>>>>
>>>> ==19700==    by 0x49CEA17: g_strdup (gstrfuncs.c:363)
>>>>
>>>> ==19700==    by 0x49417BF: g_module_open (gmodule.c:543)
>>>>
>>>> ==19700==    by 0x489AFDB: plugin_dlopen_module_as_filename
>>>> (plugin.c:252)
>>>>
>>>> ==19700==    by 0x489B8B3: plugin_dlopen_module_as_dir_and_filename
>>>> (plugin.c:271)
>>>>
>>>> ==19700==    by 0x489B8B3: plugin_load_candidate_modules (plugin.c:457)
>>>>
>>>> ==19700==    by 0x4887017: cfg_lexer_preprocess (cfg-lexer.c:1068)
>>>>
>>>> ==19700==    by 0x4887017: cfg_lexer_lex (cfg-lexer.c:1110)
>>>>
>>>> ==19700==    by 0x48A876F: main_parse (cfg-grammar.c:3143)
>>>>
>>>> ==19700==    by 0x4887D8F: cfg_parser_parse (cfg-parser.c:369)
>>>>
>>>> ==19700==    by 0x48845BF: cfg_run_parser (cfg.c:504)
>>>>
>>>> ==19700==    by 0x48846B3: cfg_read_config (cfg.c:556)
>>>>
>>>> ==19700==    by 0x489766B: main_loop_read_and_init_config
>>>> (mainloop.c:552)
>>>>
>>>> ==19700==
>>>>
>>>> ==19700== Conditional jump or move depends on uninitialised value(s)
>>>>
>>>> ==19700==    at 0x401A8D4: index (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==    by 0x40145BB: dl_open_worker (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==
>>>>
>>>> ==19700== Conditional jump or move depends on uninitialised value(s)
>>>>
>>>> ==19700==    at 0x401A8D8: index (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==    by 0x40145BB: dl_open_worker (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==
>>>>
>>>> ==19700== Conditional jump or move depends on uninitialised value(s)
>>>>
>>>> ==19700==    at 0x40145C0: dl_open_worker (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==
>>>>
>>>> ==19700== Conditional jump or move depends on uninitialised value(s)
>>>>
>>>> ==19700==    at 0x401AD9C: strcpy (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==    by 0x400A8D3: _dl_lookup_symbol_x (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==
>>>>
>>>> ==19700== Invalid read of size 4
>>>>
>>>> ==19700==    at 0x401AC60: strcpy (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==    by 0x400A8D3: _dl_lookup_symbol_x (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==  Address 0x5185254 is 0 bytes after a block of size 20 alloc'd
>>>>
>>>> ==19700==    at 0x4849EB4: realloc (vg_replace_malloc.c:785)
>>>>
>>>> ==19700==    by 0x49B3C9B: g_realloc (gmem.c:159)
>>>>
>>>> ==19700==    by 0x49FCAF3: _g_gnulib_vasnprintf (vasnprintf.c:5533)
>>>>
>>>> ==19700==    by 0x49FD263: _g_gnulib_vasprintf (printf.c:142)
>>>>
>>>> ==19700==    by 0x49F6B0B: g_vasprintf (gprintf.c:310)
>>>>
>>>> ==19700==    by 0x49CEB7F: g_strdup_vprintf (gstrfuncs.c:514)
>>>>
>>>> ==19700==    by 0x49CEBAB: g_strdup_printf (gstrfuncs.c:540)
>>>>
>>>> ==19700==    by 0x489AF7F: plugin_get_module_init_name (plugin.c:234)
>>>>
>>>> ==19700==    by 0x489B363: plugin_load_module (plugin.c:364)
>>>>
>>>> ==19700==    by 0x48ABCAB: pragma_parse (pragma-grammar.y:443)
>>>>
>>>> ==19700==    by 0x4887D8F: cfg_parser_parse (cfg-parser.c:369)
>>>>
>>>> ==19700==    by 0x4886E9B: cfg_lexer_parse_pragma (cfg-lexer.c:996)
>>>>
>>>> ==19700==    by 0x4886E9B: cfg_lexer_preprocess (cfg-lexer.c:1040)
>>>>
>>>> ==19700==    by 0x4886E9B: cfg_lexer_lex (cfg-lexer.c:1110)
>>>>
>>>> ==19700==
>>>>
>>>> ==19700== Conditional jump or move depends on uninitialised value(s)
>>>>
>>>> ==19700==    at 0x401AD80: strcpy (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==    by 0x400A8D3: _dl_lookup_symbol_x (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==
>>>>
>>>> ==19700== Use of uninitialised value of size 4
>>>>
>>>> ==19700==    at 0x401ACB8: strcpy (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==    by 0x400A8D3: _dl_lookup_symbol_x (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==
>>>>
>>>> ==19700== Invalid read of size 4
>>>>
>>>> ==19700==    at 0x401ADAC: strcpy (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==    by 0x400A8D3: _dl_lookup_symbol_x (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==  Address 0x533bdac is 3 bytes after a block of size 25 alloc'd
>>>>
>>>> ==19700==    at 0x4849EB4: realloc (vg_replace_malloc.c:785)
>>>>
>>>> ==19700==    by 0x49B3C9B: g_realloc (gmem.c:159)
>>>>
>>>> ==19700==    by 0x49FCAF3: _g_gnulib_vasnprintf (vasnprintf.c:5533)
>>>>
>>>> ==19700==    by 0x49FD263: _g_gnulib_vasprintf (printf.c:142)
>>>>
>>>> ==19700==    by 0x49F6B0B: g_vasprintf (gprintf.c:310)
>>>>
>>>> ==19700==    by 0x49CEB7F: g_strdup_vprintf (gstrfuncs.c:514)
>>>>
>>>> ==19700==    by 0x49CEBAB: g_strdup_printf (gstrfuncs.c:540)
>>>>
>>>> ==19700==    by 0x489AF7F: plugin_get_module_init_name (plugin.c:234)
>>>>
>>>> ==19700==    by 0x489B363: plugin_load_module (plugin.c:364)
>>>>
>>>> ==19700==    by 0x489B6BB: plugin_find (plugin.c:199)
>>>>
>>>> ==19700==    by 0x489921B: msg_format_options_init (msg-format.c:81)
>>>>
>>>> ==19700==    by 0x4891A9B: log_reader_options_init (logreader.c:728)
>>>>
>>>> ==19700==
>>>>
>>>> ==19700== Conditional jump or move depends on uninitialised value(s)
>>>>
>>>> ==19700==    at 0x401ADC4: strcpy (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==    by 0x400A8D3: _dl_lookup_symbol_x (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==
>>>>
>>>> ==19700== Conditional jump or move depends on uninitialised value(s)
>>>>
>>>> ==19700==    at 0x401A8D4: index (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==    by 0x4008BFB: _dl_map_object (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==
>>>>
>>>> ==19700== Conditional jump or move depends on uninitialised value(s)
>>>>
>>>> ==19700==    at 0x401A8D8: index (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==    by 0x4008BFB: _dl_map_object (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==
>>>>
>>>> ==19700== Conditional jump or move depends on uninitialised value(s)
>>>>
>>>> ==19700==    at 0x4008C00: _dl_map_object (in /lib/ld-linux.so.3)
>>>>
>>>> ==19700==
>>>>
>>>> ==19700== Syscall param epoll_pwait(sigmask) points to unaddressable
>>>> byte(s)
>>>>
>>>> ==19700==    at 0x4BCC73C: epoll_pwait (in /lib/libc.so.6)
>>>>
>>>> ==19700==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
>>>>
>>>> ==19700==
>>>>
>>>> ==19671==
>>>>
>>>> ==19671== HEAP SUMMARY:
>>>>
>>>> ==19671==     in use at exit: 11,980 bytes in 55 blocks
>>>>
>>>> ==19671==   total heap usage: 179 allocs, 124 frees, 49,157 bytes
>>>> allocated
>>>>
>>>> ==19671==
>>>>
>>>> ==19671== 419 (44 direct, 375 indirect) bytes in 1 blocks are
>>>> definitely lost in loss record 44 of 46
>>>>
>>>> ==19671==    at 0x4847654: malloc (vg_replace_malloc.c:299)
>>>>
>>>> ==19671==    by 0x49B3BC3: g_malloc (gmem.c:94)
>>>>
>>>> ==19671==    by 0x488BD07: g_process_set_argv_space (gprocess.c:569)
>>>>
>>>> ==19671==    by 0x1127B: main (main.c:216)
>>>>
>>>> ==19671==
>>>>
>>>> ==19671== LEAK SUMMARY:
>>>>
>>>> ==19671==    definitely lost: 44 bytes in 1 blocks
>>>>
>>>> ==19671==    indirectly lost: 375 bytes in 10 blocks
>>>>
>>>> ==19671==      possibly lost: 0 bytes in 0 blocks
>>>>
>>>> ==19671==    still reachable: 11,561 bytes in 44 blocks
>>>>
>>>> ==19671==         suppressed: 0 bytes in 0 blocks
>>>>
>>>> ==19671== Reachable blocks (those to which a pointer was found) are not
>>>> shown.
>>>>
>>>> ==19671== To see them, rerun with: --leak-check=full
>>>> --show-leak-kinds=all
>>>>
>>>>
>>>>
>>>> The memory usage keeps increasing. There are no local disks used to
>>>> store the logs messages. It is directly sent to the Graylog server.
>>>>
>>>> Could you please take a look at this issue and help resolving the
>>>> same.
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Vidya Balakrishnan
>>>>
>>>> ______________________________________________________________________________
>>>> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
>>>> Documentation:
>>>> http://www.balabit.com/support/documentation/?product=syslog-ng
>>>> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>>>>
>>>>
>>> ______________________________________________________________________________
>>> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
>>> Documentation:
>>> http://www.balabit.com/support/documentation/?product=syslog-ng
>>> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>>>
>>>
>> ______________________________________________________________________________
>> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
>> Documentation:
>> http://www.balabit.com/support/documentation/?product=syslog-ng
>> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>>
>>
> ______________________________________________________________________________
> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
> Documentation:
> http://www.balabit.com/support/documentation/?product=syslog-ng
> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.balabit.hu/pipermail/syslog-ng/attachments/20200609/43e5baff/attachment-0001.html>


More information about the syslog-ng mailing list