On Mon, 2011-10-03 at 00:14 +0200, Jakub Jankowski wrote:
Hi,
On Tue, 27 Sep 2011 20:16:29 +0200, Balazs Scheidler wrote:
Thanks for the detailed problem description, Jakub, and the diagnosis, Gergely. I've committed this patch to master:
commit 2ed7f92153aec5e4e666ea17dcd8d2232a8e76f9 [...] commit e224d45da2ecad68f7f9c44895849a0fc795aae5
Sorry it took me so long to report back.
Since I had issues building an el5 package directly from git (see my other mails in this thread), I chose to apply only commits 2ed7f92153aec5e4e666ea17dcd8d2232a8e76f9 and e224d45da2ecad68f7f9c44895849a0fc795aae5 to the source I'm building from, which is essentially 0a3d844ff94a14d770bcdfa993f02e87e58a81f2.
Unfortunately, those two patches do not seem to help; I'm still getting this in the valgrind output on my test machine:
==17806== 419,076 bytes in 9,978 blocks are indirectly lost in loss record 581 of 582
==17806==    at 0x4022B83: malloc (vg_replace_malloc.c:195)
==17806==    by 0x40FF615: g_malloc (in /lib/libglib-2.0.so.0.1200.3)
==17806==    by 0x4112BC8: g_strdup (in /lib/libglib-2.0.so.0.1200.3)
==17806==    by 0x40521D9: log_queue_init_instance (logqueue.c:188)
==17806==    by 0x40528BA: log_queue_fifo_new (logqueue-fifo.c:440)
==17806==    by 0x4045FF7: log_dest_driver_acquire_queue_method (driver.c:153)
==17806==    by 0x4708219: affile_dw_init (driver.h:185)
==17806==    by 0x4707C41: affile_dd_open_writer (logpipe.h:239)
==17806==    by 0x405C268: main_loop_call (mainloop.c:145)
==17806==    by 0x4707A6B: affile_dd_queue (affile.c:1127)
==17806==    by 0x40451C8: log_dest_group_queue (logpipe.h:288)
==17806==    by 0x404B9ED: log_multiplexer_queue (logpipe.h:288)
==17806==
==17806== 4,092,084 (3,673,008 direct, 419,076 indirect) bytes in 9,981 blocks are definitely lost in loss record 582 of 582
==17806==    at 0x4021EC2: calloc (vg_replace_malloc.c:418)
==17806==    by 0x40FF57D: g_malloc0 (in /lib/libglib-2.0.so.0.1200.3)
==17806==    by 0x40528A9: log_queue_fifo_new (logqueue-fifo.c:438)
==17806==    by 0x4045FF7: log_dest_driver_acquire_queue_method (driver.c:153)
==17806==    by 0x4708219: affile_dw_init (driver.h:185)
==17806==    by 0x4707C41: affile_dd_open_writer (logpipe.h:239)
==17806==    by 0x405C268: main_loop_call (mainloop.c:145)
==17806==    by 0x4707A6B: affile_dd_queue (affile.c:1127)
==17806==    by 0x40451C8: log_dest_group_queue (logpipe.h:288)
==17806==    by 0x404B9ED: log_multiplexer_queue (logpipe.h:288)
==17806==    by 0x404EC26: log_parser_queue (logpipe.h:288)
==17806==    by 0x4047483: log_filter_pipe_queue (logpipe.h:288)
==17806==
==17806== LEAK SUMMARY:
==17806==    definitely lost: 3,673,208 bytes in 9,984 blocks
==17806==    indirectly lost: 425,595 bytes in 10,033 blocks
==17806==      possibly lost: 20,425 bytes in 145 blocks
==17806==    still reachable: 77,296 bytes in 3,288 blocks
==17806==         suppressed: 0 bytes in 0 blocks
So, either 1) those two patches are not sufficient to fix this bug, 2) those patches do not fix *this* bug, or 3) PEBKAC, and I'm doing something wrong (very likely!).
The best way to see which one is true would be to build a package from git HEAD and run my tests again. Unfortunately, this is tricky on el5, so I think I have to wait for Gergely to create a dist tarball like he promised :)
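When comparing test runs with and without the patches, the LEAK SUMMARY totals can be pulled out of each valgrind log mechanically instead of being read by eye. What follows is only a minimal sketch in Python, not part of syslog-ng or valgrind: the function name parse_leak_summary and the sample string are made up for illustration, and the sample reuses the figures from the summary quoted in this mail.

```python
import re

# One LEAK SUMMARY entry, e.g. "definitely lost: 3,673,208 bytes in 9,984 blocks".
# \s+ also matches line breaks, so wrapped or flattened logs parse the same way.
SUMMARY_RE = re.compile(
    r"(definitely lost|indirectly lost|possibly lost|still reachable|suppressed):"
    r"\s+([\d,]+)\s+bytes\s+in\s+([\d,]+)\s+blocks"
)


def parse_leak_summary(log_text):
    """Return {category: (bytes, blocks)} for the text after 'LEAK SUMMARY:'.

    Hypothetical helper; returns an empty dict if no summary is present.
    """
    _, marker, tail = log_text.partition("LEAK SUMMARY:")
    if not marker:
        return {}
    summary = {}
    for category, nbytes, nblocks in SUMMARY_RE.findall(tail):
        # Valgrind prints thousands separators; strip them before converting.
        summary[category] = (
            int(nbytes.replace(",", "")),
            int(nblocks.replace(",", "")),
        )
    return summary


# Sample input: the summary from this mail, including the wrapped last line.
sample = (
    "==17806== LEAK SUMMARY: "
    "==17806== definitely lost: 3,673,208 bytes in 9,984 blocks "
    "==17806== indirectly lost: 425,595 bytes in 10,033 blocks "
    "==17806== possibly lost: 20,425 bytes in 145 blocks "
    "==17806== still reachable: 77,296 bytes in 3,288 blocks "
    "==17806== suppressed: 0 bytes in 0 \nblocks"
)

for category, (nbytes, nblocks) in parse_leak_summary(sample).items():
    print(f"{category}: {nbytes} bytes in {nblocks} blocks")
```

Running this over the log of a patched build and an unpatched build and comparing the "definitely lost" entries would show whether the patches change anything on a given machine.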
Strange, I've successfully reproduced the leak without the patches, and it was fixed with them applied. Let's see if 3.3.1 solves the issue or not.
-- Bazsi