[syslog-ng] 3.3.0beta1 leaking memory (Re: syslog-ng 3.3.0beta & ESX crashes)

Gergely Nagy algernon at balabit.hu
Fri Jul 22 11:42:28 CEST 2011


Hendrik Visage <hvjunk at gmail.com> writes:

>  I've mailed Gergely the Valgrind output to analyze 3.3.0beta1 memory
> leak(s), so if anybody else is interested in it, please contact me
> for it.

I had a little time to look into the memory leak issue, and I can
confirm: the leak happens every time with the latest git.

What happens is that log_reader_work_perform() calls
log_reader_fetch_log(), which in turn calls log_reader_handle_line(),
which allocates a LogMessage.

This LogMessage is never freed, only forgotten, thus we end up with
stuff like this:

==13955== 39,240,752 (15,004,656 direct, 24,236,096 indirect) bytes in 19,953 blocks are definitely lost in loss record 623 of 623
==13955==    at 0x4C274A8: malloc (vg_replace_malloc.c:236)
==13955==    by 0x5941754: g_malloc (in /lib/libglib-2.0.so.0.2400.1)
==13955==    by 0x4E5EABD: log_msg_alloc (logmsg.c:925)
==13955==    by 0x4E5EB3F: log_msg_new (logmsg.c:950)
==13955==    by 0x4E6765C: log_reader_handle_line (logreader.c:494)
==13955==    by 0x4E6787D: log_reader_fetch_log (logreader.c:572)
==13955==    by 0x4E66A17: log_reader_work_perform (logreader.c:116)
==13955==    by 0x4E66B74: log_reader_io_process_input (logreader.c:191)
==13955==    by 0x4E972A5: iv_run_tasks (iv_task.c:53)
==13955==    by 0x4E95F7A: iv_main (iv_main.c:253)
==13955==    by 0x4E6F8F8: main_loop_run (mainloop.c:670)
==13955==    by 0x401AE8: main (main.c:263)
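
To make the pattern concrete, here is a minimal, self-contained C sketch
of this kind of leak. It is not the syslog-ng code: Msg, msg_new(),
msg_unref(), the handle_line_*() functions and run_reader() are all
hypothetical names made up for illustration, and the sketch does not
claim to show where the real reference gets lost.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted message.
 * This is NOT syslog-ng's LogMessage, just an illustration. */
typedef struct Msg
{
  int ref_cnt;
} Msg;

/* Number of messages that are allocated but not yet freed. */
static int live_msgs = 0;

/* Allocate a message; the caller owns the initial reference. */
static Msg *
msg_new(void)
{
  Msg *m = malloc(sizeof(*m));

  if (!m)
    abort();
  m->ref_cnt = 1;
  live_msgs++;
  return m;
}

/* Drop one reference; free the message when the last one is gone. */
static void
msg_unref(Msg *m)
{
  if (--m->ref_cnt == 0)
    {
      live_msgs--;
      free(m);
    }
}

/* Leaky variant: the only reference goes out of scope without an
 * unref, so the message is "never freed, only forgotten". */
static void
handle_line_leaky(void)
{
  Msg *m = msg_new();

  (void) m;
}

/* Fixed variant: whoever holds the last reference releases it. */
static void
handle_line_fixed(void)
{
  Msg *m = msg_new();

  msg_unref(m);
}

/* Feed `lines` lines through a handler and return how many messages
 * are still allocated afterwards, i.e. how many were leaked. */
static int
run_reader(void (*handle_line)(void), int lines)
{
  int before = live_msgs;
  int i;

  for (i = 0; i < lines; i++)
    handle_line();
  return live_msgs - before;
}
```

With this sketch, run_reader(handle_line_leaky, n) leaves n messages
allocated, while run_reader(handle_line_fixed, n) leaves none.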

There's also another leak somewhere around patterndb (but I haven't
looked into this one yet):

==13955== 24,265,280 bytes in 19,955 blocks are indirectly lost in loss record 622 of 623
==13955==    at 0x4C275A2: realloc (vg_replace_malloc.c:525)
==13955==    by 0x594158E: g_realloc (in /lib/libglib-2.0.so.0.2400.1)
==13955==    by 0x4E7333A: nv_table_realloc (nvtable.c:685)
==13955==    by 0x4E5DBC1: log_msg_set_value_indirect (logmsg.c:491)
==13955==    by 0x8826CF6: pdb_rule_set_lookup (patterndb.c:1254)
==13955==    by 0x88273E3: pattern_db_process (patterndb.c:1458)
==13955==    by 0x881F408: log_db_parser_process (dbparser.c:217)
==13955==    by 0x4E5FE15: log_parser_queue (logparser.c:49)
==13955==    by 0x4E5C971: log_pipe_queue (logpipe.h:288)
==13955==    by 0x4E5CBF8: log_multiplexer_queue (logmpx.c:125)
==13955==    by 0x4E771A7: log_pipe_queue (logpipe.h:288)
==13955==    by 0x4E77149: log_pipe_forward_msg (logpipe.h:275)

The config to trigger this is pretty simple:

| @version: 3.3
| @include "scl.conf"
| @module tfjson
| 
| source s_local { internal(); };
| source s_network { tcp(port(10514) tags("tcp-tag")); };
| destination d_local {
|  file("/tmp/ose-messages"
|       template("$(format-json --scope all-nv-pairs --scope core)\n"));
| };
| parser p_loggen {
|  db_parser(file("/home/algernon/install/ose/syslog-ng-3.3/etc/loggen.pdb"));
| };
| 
| log { source(s_local); source(s_network); parser(p_loggen);
| destination(d_local); };

There are also a few other leaks, but those seem reasonably minor (e.g.,
internal messages seem to be leaking as well), and are probably related
to one of the above leaks.

I have no fix yet, but I thought I'd share my findings nevertheless.

The loggen.pdb file is attached for reference; I triggered the bug
simply by throwing a ton of logs at syslog-ng with loggen.

-- 
|8]

-------------- next part --------------
A non-text attachment was scrubbed...
Name: loggen.pdb
Type: text/xml
Size: 928 bytes
Desc: not available
Url : http://lists.balabit.hu/pipermail/syslog-ng/attachments/20110722/88fa2c00/attachment.bin 

