On Thursday 22 of March 2012, Peter Czanik wrote:
On 03/22/2012 02:46 PM, Jakub Jankowski wrote:
I know of at least one memory leak that was fixed once but then reintroduced between 3.3.2 and 3.3.3, and I think it has been present ever since. I mentioned it on this list in February: https://lists.balabit.hu/pipermail/syslog-ng/2012-February/018334.html
It would be really nice if the developers could take a look at the code again. :)
Check this: https://bugzilla.balabit.com/show_bug.cgi?id=160 The mentioned patch is available at http://git.balabit.hu/?p=bazsi/syslog-ng-3.3.git;a=commit;h=3c14a264ab4c76b9...
Unfortunately, this doesn't fix the memleak I'm seeing. I've just recreated my testbed and tried current 3.3 HEAD (using Gergely's tarball as source: http://packages.madhouse-project.org/syslog-ng/3.3/3.3.4/syslog-ng-3.3.4-201...) and can confirm "my" leak is still happening. Below are the steps to reproduce it.

The client is loggen, started as:

$ loggen --read-file access.random.10000.munged --active-connections 1 --interval 600 --syslog-proto <SERVERIP> 514

where access.random.10000.munged is a 10000-line file containing munged fragments of an Apache access log:

$ head access.random.10000.munged
Sep 15 23:59:56 www142 accesslog: exyrqpxwva a.b.c.d - - [15/Sep/2011:23:59:56 +0200]
Sep 15 23:59:57 www142 accesslog: mnufil a.b.c.d - - [15/Sep/2011:23:59:57 +0200]
Sep 15 23:59:57 www142 accesslog: bvbgdknlvxibais a.b.c.d - - [15/Sep/2011:23:59:57 +0200]
Sep 15 23:59:57 www83 accesslog: ttmpmlgf a.b.c.d - - [15/Sep/2011:23:59:57 +0200]
Sep 15 23:59:57 www146 accesslog: igncmpwudj a.b.c.d - - [15/Sep/2011:23:59:57 +0200]
Sep 15 23:59:57 www146 accesslog: pjecagrmuywulbro a.b.c.d - - [15/Sep/2011:23:59:57 +0200]
Sep 15 23:59:57 www147 accesslog: ghjwjejsetvq a.b.c.d - - [15/Sep/2011:23:59:57 +0200]
Sep 15 23:59:57 www22 accesslog: zcnalgemoowpunhtj a.b.c.d - - [15/Sep/2011:23:59:57 +0200]
Sep 15 23:59:57 www146 accesslog: houclrfiukhbfofueill a.b.c.d - - [15/Sep/2011:23:59:57 +0200]
Sep 15 23:59:57 www146 accesslog: kwqplgsulzkryhdnr a.b.c.d - - [15/Sep/2011:23:59:57 +0200]
$

The most important part is the random strings (which in a real scenario are vhost names). Destination filenames on the server are based on those strings.
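For anyone who wants to reproduce this without the original data, a comparable input file can be generated. The following is only a minimal Python sketch: the original post does not say how access.random.10000.munged was produced, so the token length range, the www host number range, the function names and the seed are all assumptions made for illustration. It only imitates the line shape shown in the head output above and keeps every random string unique, so that each line maps to its own destination file.

```python
import random
import string
from datetime import datetime, timedelta


def random_token(rng, min_len=6, max_len=20):
    """Return a random lowercase string standing in for a vhost name.

    The length range is an assumption, chosen to resemble the sample lines.
    """
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def generate_lines(count, seed=0):
    """Build `count` log lines shaped like the munged access-log sample."""
    rng = random.Random(seed)
    when = datetime(2011, 9, 15, 23, 59, 56)
    seen = set()
    lines = []
    while len(lines) < count:
        vhost = random_token(rng)
        if vhost in seen:
            continue  # keep every vhost unique: one destination file per line
        seen.add(vhost)
        host = f"www{rng.randint(1, 150)}"  # hypothetical host number range
        syslog_ts = f"{when:%b} {when.day:2d} {when:%H:%M:%S}"
        apache_ts = f"{when:%d/%b/%Y:%H:%M:%S} +0200"
        lines.append(
            f"{syslog_ts} {host} accesslog: {vhost} a.b.c.d - - [{apache_ts}]"
        )
        when += timedelta(seconds=rng.randint(0, 1))
    return lines


def write_test_file(path, count=10000, seed=0):
    """Write the generated lines to `path`, one per line."""
    with open(path, "w") as handle:
        handle.write("\n".join(generate_lines(count, seed)) + "\n")
```

Calling write_test_file("access.random.10000.munged") would produce a file that can be passed to loggen --read-file as in the command above. Whether such synthetic data triggers the leak the same way has not been verified here.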
The server is configured like this:

$ cat /etc/syslog-ng/syslog-ng.conf
@version: 3.3

options {
    time_reap (3);
    log_fifo_size (100000);
    use_dns (no);
    use_fqdn (no);
    create_dirs (yes);
    keep_hostname (yes);
    log_msg_size(16384);
    # doesn't matter, with threaded(yes) it fails as well
    threaded (no);
    owner ("root");
    group ("root");
    perm (0644);
    dir_owner ("root");
    dir_group ("root");
    dir_perm (0755);
};

source s_web {
    syslog(ip(0.0.0.0) port(514) flags(no-multi-line) log_iw_size(6000) max-connections(30));
};

parser p_apache_split {
    csv-parser(
        columns("APACHE.ACCESS.VHOST", "APACHE.ACCESS.LOG_MESSAGE")
        flags(greedy)
        delimiters(" ")
    );
};

destination d_test {
    file("/logs/${APACHE.ACCESS.VHOST}-${R_YEAR}${R_MONTH}${R_DAY}" template("${APACHE.ACCESS.LOG_MESSAGE}\n"));
};

log { source(s_web); parser(p_apache_split); destination(d_test); flags(final); };
$

Running it under valgrind like this:

# G_SLICE=always-malloc valgrind --log-file=s3.3.4-fd3b-1.log \
    --leak-check=full -v --show-reachable=yes --track-origins=yes \
    syslog-ng -F --no-caps -p /var/run/syslog-ng.pid

produces the following log file: http://toxcorp.com/stuff/syslog-ng-leak/s3.3.4-fd3b-1.log

I'm pretty sure this particular memory leak was fixed in 3.3.2, but then got reintroduced in this commit: http://git.balabit.hu/?p=bazsi/syslog-ng-3.3.git;a=commitdiff;h=c7070e2a6f1c... which was made to fix a tcp() destination related crash after 3.3.2 was released.
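To make the role of the random strings in the server configuration concrete, here is a rough Python sketch of how the p_apache_split parser and the d_test file() template are expected to turn one message into a destination path and a line of output. It is not syslog-ng's implementation. It assumes the parser sees only the message part after the "accesslog:" tag, and that flags(greedy) gives the remainder of the message to the last column. The function names are made up for illustration.

```python
from datetime import date


def split_greedy(message, columns=2, delimiter=" "):
    """Split `message` into `columns` fields; the last field keeps the rest.

    Imitates csv-parser with flags(greedy) and delimiters(" ").
    """
    parts = message.split(delimiter, columns - 1)
    # Pad with empty strings if the message has fewer fields than columns.
    parts += [""] * (columns - len(parts))
    return parts


def destination_for(message, received):
    """Return (file path, file content) for one message.

    `received` plays the role of the R_YEAR/R_MONTH/R_DAY macros,
    which are based on the time the server received the message.
    """
    vhost, log_message = split_greedy(message)
    path = f"/logs/{vhost}-{received:%Y%m%d}"
    return path, log_message + "\n"


path, content = destination_for(
    "exyrqpxwva a.b.c.d - - [15/Sep/2011:23:59:56 +0200]",
    date(2012, 3, 25),
)
print(path)     # /logs/exyrqpxwva-20120325
print(content)  # a.b.c.d - - [15/Sep/2011:23:59:56 +0200]
```

Since every input line carries a different random string, every message opens a new destination file, which is why a 10000-line input is expected to produce 10000 files under /logs/.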
So, Bazsi, yes - it does reintroduce the leak :) (in reply to your comment in https://lists.balabit.hu/pipermail/syslog-ng/2011-November/017697.html )

Apart from the leak you can see in the valgrind log file, there is another symptom - some destination files are created empty:

$ find /logs/ -type f | wc -l
10000
$ find /logs/ -type f -size 0 | wc -l
4410
$ find /logs/ -type f -size 0 | head -n 1
/logs/hcxjscbcnyoktilm-20120325

even though there is a perfectly valid source record for them:

$ grep hcxjscbcnyoktilm access.random.10000.munged
Sep 16 00:00:05 www147 accesslog: hcxjscbcnyoktilm a.b.c.d - - [16/Sep/2011:00:00:05 +0200]
$

The number of empty files corresponds to the number of errors reported by valgrind:

==17931== 4410 errors in context 1 of 6:
==17931== Invalid read of size 8
==17931==    at 0x4E4BA05: log_dest_driver_deinit_method (driver.c:172)
==17931==    by 0x6CFE559: affile_dd_deinit (affile.c:1001)
==17931==    by 0x4E4AD33: log_dest_group_deinit (logpipe.h:254)
==17931==    by 0x4E4650A: log_center_deinit (logpipe.h:254)
==17931==    by 0x4E5F63F: main_loop_exit_finish (mainloop.c:585)
==17931==    by 0x4E791A0: iv_run_timers (iv_timer.c:345)
==17931==    by 0x4E777A5: iv_main (iv_main.c:252)
==17931==    by 0x4E6023D: main_loop_run (mainloop.c:731)
==17931==    by 0x40175D: main (main.c:260)

I hope this can help you diagnose and fix this issue.

HTH
-- 
Jakub Jankowski|shasta@toxcorp.com|http://toxcorp.com/
GPG: FCBF F03D 9ADB B768 8B92 BB52 0341 9037 A875 942D