Hendrik Visage <hvjunk@gmail.com> writes:
> I've mailed Gergely the Valgrind output to analyze 3.3.0beta1 memory
> leak(s), so if anybody else are interested in it, please contact me
> for them.

I had a little time to look into the memory leak issue, and I can
confirm: the leak happens all the time with latest git.

What happens is that log_reader_work_perform() calls
log_reader_fetch_log(), which in turn calls log_reader_handle_line(),
which allocates a LogMessage. This LogMessage is never freed, only
forgotten, thus we end up with stuff like this:

==13955== 39,240,752 (15,004,656 direct, 24,236,096 indirect) bytes in 19,953 blocks are definitely lost in loss record 623 of 623
==13955==    at 0x4C274A8: malloc (vg_replace_malloc.c:236)
==13955==    by 0x5941754: g_malloc (in /lib/libglib-2.0.so.0.2400.1)
==13955==    by 0x4E5EABD: log_msg_alloc (logmsg.c:925)
==13955==    by 0x4E5EB3F: log_msg_new (logmsg.c:950)
==13955==    by 0x4E6765C: log_reader_handle_line (logreader.c:494)
==13955==    by 0x4E6787D: log_reader_fetch_log (logreader.c:572)
==13955==    by 0x4E66A17: log_reader_work_perform (logreader.c:116)
==13955==    by 0x4E66B74: log_reader_io_process_input (logreader.c:191)
==13955==    by 0x4E972A5: iv_run_tasks (iv_task.c:53)
==13955==    by 0x4E95F7A: iv_main (iv_main.c:253)
==13955==    by 0x4E6F8F8: main_loop_run (mainloop.c:670)
==13955==    by 0x401AE8: main (main.c:263)

There's also another leak somewhere around patterndb (but I haven't
looked into this one yet):

==13955== 24,265,280 bytes in 19,955 blocks are indirectly lost in loss record 622 of 623
==13955==    at 0x4C275A2: realloc (vg_replace_malloc.c:525)
==13955==    by 0x594158E: g_realloc (in /lib/libglib-2.0.so.0.2400.1)
==13955==    by 0x4E7333A: nv_table_realloc (nvtable.c:685)
==13955==    by 0x4E5DBC1: log_msg_set_value_indirect (logmsg.c:491)
==13955==    by 0x8826CF6: pdb_rule_set_lookup (patterndb.c:1254)
==13955==    by 0x88273E3: pattern_db_process (patterndb.c:1458)
==13955==    by 0x881F408: log_db_parser_process (dbparser.c:217)
==13955==    by 0x4E5FE15: log_parser_queue (logparser.c:49)
==13955==    by 0x4E5C971: log_pipe_queue (logpipe.h:288)
==13955==    by 0x4E5CBF8: log_multiplexer_queue (logmpx.c:125)
==13955==    by 0x4E771A7: log_pipe_queue (logpipe.h:288)
==13955==    by 0x4E77149: log_pipe_forward_msg (logpipe.h:275)

The config to trigger this is pretty simple:

| @version: 3.3
| @include "scl.conf"
| @module tfjson
|
| source s_local { internal(); };
| source s_network { tcp(port(10514) tags("tcp-tag")); };
| destination d_local {
|   file("/tmp/ose-messages"
|        template("$(format-json --scope all-nv-pairs --scope core)\n"));
| };
| parser p_loggen {
|   db_parser(file("/home/algernon/install/ose/syslog-ng-3.3/etc/loggen.pdb"));
| };
|
| log { source(s_local); source(s_network); parser(p_loggen);
|       destination(d_local); };

There are also a few other leaks, but those seem reasonably minor (e.g.
internal messages seem to be leaking as well), and are probably related
to one of the above leaks.

I have no fix yet, but I thought I'd share my findings nevertheless.

The loggen.pdb file is attached, for reference. Triggering the bug was
simply a matter of throwing a ton of logs at syslog-ng with loggen.

-- 
|8]