On 10/15/2011 at 12:36 PM, in message <1318635416.18436.11.camel@monolith.infinux.org>, Christopher Barry <christopher.barry@rackwareinc.com> wrote:

On Thu, 2011-10-13 at 09:47 +1300, anthony lineham wrote:

Hi Everyone,
I've recently found a memory leak in version 3.0.4 that occurs when the configuration file contains duplicate entries for any of the following items: sources, destinations, filters, rewrites, parsers. The leak occurs when the configuration is loaded, or reloaded due to a SIGHUP. The config items are stored in a hash table keyed by the item's name. In the hashing utility, if a request is made to add an item with the same key as an existing item, the old item is replaced by the new one (libglib2-2.24.1/glib/ghash.c, g_hash_table_insert()). This is where the memory gets lost: syslog-ng is never notified that the old item has been discarded, so it never frees it.
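To make the mechanism concrete, here's a minimal standalone example (this is not syslog-ng code; the allocations just stand in for config items):

#include <glib.h>

int
main(void)
{
  /* No value-destroy function, as in the tables described above. */
  GHashTable *table = g_hash_table_new(g_str_hash, g_str_equal);

  gpointer first  = g_malloc(128);   /* first config item named "s_local"  */
  gpointer second = g_malloc(128);   /* duplicate entry with the same name */

  g_hash_table_insert(table, (gpointer) "s_local", first);
  g_hash_table_insert(table, (gpointer) "s_local", second);
  /* 'first' is now unreachable: the table replaced it without freeing it. */

  g_hash_table_destroy(table);       /* frees the table itself, not the values */
  return 0;
}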
I considered several ways to fix this. The hash utility has other modes where destroy functions can be set up to deal with discarded items, but those are oriented around destroying the old item. That seemed risky to me, since the old item may already be referenced by other items, though I'm not certain whether that can actually happen. In the end I decided to check for the presence of a duplicate in the table before adding the new item; if an existing item is found, I discard the new item instead.
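Roughly, the approach looks like this (a sketch only, with made-up names; it is not the actual patch):

#include <glib.h>

/* Sketch of a check-before-insert helper; the name, the discard callback
 * and the g_warning() call are illustrative assumptions. */
static gboolean
add_named_item(GHashTable *table, gchar *name, gpointer item,
               GDestroyNotify discard_item)
{
  if (g_hash_table_lookup(table, name) != NULL)
    {
      /* The name is already taken: keep the original item and free the new
         one ourselves, since g_hash_table_insert() would otherwise overwrite
         the old value without freeing it. */
      g_warning("Duplicate configuration item '%s', ignoring it", name);
      discard_item(item);
      return FALSE;
    }

  g_hash_table_insert(table, name, item);
  return TRUE;
}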
While I realise 3.0.4 is quite old now and perhaps no longer supported, the same basic mechanism appears to still be used in OSE 3.3.1, so I'm submitting my code changes for consideration. I haven't tried to reproduce the issue on 3.3.1.
Any comments?
Regards,
Anthony Lineham
comment on figuring out the bug: Awesome! :)
comment on your solution to test first: seems correct to me
comment on the code: Would it be better coding form to create a parameterized function from the code block, and then just call it six times with 'source', 'rewrite', etc., rather than cutting and pasting essentially the same code block six times in its entirety? Seems like it might be a bit cleaner and easier to maintain. Or is there a reason I'm not aware of as to why it needs to be this way?
Hi Christopher,

Thanks for the comments. You're right, there is some code replication going on there. However, there are actually three different types involved: pipes, process rules and templates, so boiling it all down to one function might be a bit messy (open to suggestions!). Another option would be to create a function for each of the three types. That would certainly reduce the replication, and I think it would be worthwhile for the reasons you've stated. For some reason I like the ability to easily report the type of item that is a duplicate, but I've attached a patch which sacrifices that ability in the interests of reducing the replication.

Note: I've compiled this code but I haven't re-run my tests on it.

Regards,
Anthony
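For what it's worth, if the type reporting turns out to be worth keeping, a shared helper could also take the type name as a parameter and be wrapped once per type. A rough, untested sketch along those lines (the names are made up for illustration and do not reflect the attached patch):

#include <glib.h>

/* Shared duplicate check that still reports what kind of item clashed. */
static gboolean
cfg_add_named(GHashTable *table, gchar *name, gpointer item,
              const gchar *type, GDestroyNotify discard_item)
{
  if (g_hash_table_lookup(table, name) != NULL)
    {
      g_warning("Duplicate %s '%s' in configuration, ignoring it", type, name);
      discard_item(item);
      return FALSE;
    }
  g_hash_table_insert(table, name, item);
  return TRUE;
}

/* One thin wrapper per item type, e.g. for sources: */
static gboolean
cfg_add_source(GHashTable *sources, gchar *name, gpointer pipe,
               GDestroyNotify unref_pipe)
{
  return cfg_add_named(sources, name, pipe, "source", unref_pipe);
}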