[syslog-ng] Duplicate configuration items causing memory leak

Mon Oct 17 01:03:39 CEST 2011

>>> On 10/15/2011 at 12:36 PM, in message
<1318635416.18436.11.camel at monolith.infinux.org>, Christopher Barry
<christopher.barry at rackwareinc.com> wrote: 
> On Thu, 2011-10-13 at 09:47 +1300, anthony lineham wrote:
>> Hi Everyone,
>> 
>> I've recently found a memory leak in version 3.0.4 that occurs if the 
> configuration file contains any duplicate entries for the following items: 
> sources, destinations, filters, rewrites, parsers. The leak occurs when the 
> configuration is loaded or reloaded due to a SIGHUP.
>> The config items are stored in a hash table (hash is based on the name of 
> the filter). It seems that in the hashing utility if a request is made to add 
> an item with the same key as a previous item the old item is replaced by the 
> new item (libglib2-2.24.1/glib/ghash.c, g_hash_table_insert()). This is where 
> the memory gets lost. Syslog-ng is not notified that the item has been 
> discarded. 
>> 
>> I considered several possibilities to fix this issue. There are other modes 
> for the hash utility where destroy functions can be setup to deal with the 
> discarded items, but these are orientated around discarding the old item. I 
> thought this was risky as the item may already be referenced by other items. 
> I'm not sure if this case??? In the end I decided to check for the presence 
> of a duplicate in the table before adding the new item. If an existing item 
> is found I discard the new item.
>> 
>> While I realise 3.0.4 is quite old now and perhaps not supported, it looks 
> like the same basic mechanism is still used in OSE 3.3.1, so I'm submitting 
> my code
>  changes for consideration. I haven't tried reproducing the issue on 3.3.1.
>> 
>> Any comments?
>> 
>> Regards
>> Anthony Lineham
>> 
> 
> comment on figuring out the bug:
> Awesome! :)
> 
> comment on your solution to test first:
> seems correct to me
> 
> comment on the code:
> Would it be better coding form to create a parameterized function from
> the code block, and then just call it six times with 'source',
> 'rewrite', etc., rather than cut and paste essentially the same code
> block six times in it's entirety? Seems like it might be a bit cleaner
> and easier to maintain. Or was there a reason I'm not aware of as to why
> it needs to be this way?

Hi Christopher,

Thanks for the comments. You're right, there is some code replication going on there. However, there are actually 3 different types involved: pipes, process rules and templates. So, to boil it down to one function might be a bit messy (open to suggestions!). Another option would be to create a function for each of the 3 types. This would certainly reduce the replication and I think it would be worthwhile - for the reasons you've stated. For some reason I like the ability to easily report the type of item that is a duplicate, but I've attached a patch which sacrifices that ability in the interests of reducing the replication.

Note: I've compiled this code but I haven't re-run my tests on it.

Regards
Anthony

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: reload_leak_refactor.patch
Url: http://lists.balabit.hu/pipermail/syslog-ng/attachments/20111017/a28d429f/attachment.txt