On Thu, 2011-07-14 at 21:56 -0400, JP Vossen wrote:
On 07/14/2011 03:14 PM, Balazs Scheidler wrote:
On 07/14/2011 03:16 AM, Balazs Scheidler wrote:
On Sat, 2011-06-25 at 04:49 -0400, JP Vossen wrote:
On Mon, 2011-06-20 at 20:59 +0100, Jose Pedro Oliveira wrote:
> There is a problem with the hash table implementation of glib2
> version 2.12.3-4 (version that ships in RHEL 5.x).
Is this hash problem going to cause critical failures? Under what circumstances? Or is it, well, it'd be nice if that hash problem didn't happen, but it's not a big deal...
Well, it mostly depends on why the hash table collides in that glib version. This hash is a global hash that maps name-value pairs to their own unique IDs, which are then used to track name-value pairs in log messages.
Sorry if I am being dense. Which name-value pairs are used for what? Would this impact a basic syslog-ng config that emulates the sysklogd config? What syslog-ng features need to be in use to trigger this?
For syslog-ng, a log message is a set of name-value pairs. Basic syslog properties like $HOST or $MSG are name-value pairs as well. The only exceptions are the $PRI and $DATE fields.
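To make the description above concrete, here is a minimal sketch in Python (not syslog-ng's actual C code) of a global registry that hands out one ID per name, with each message storing its values under those IDs. `NameRegistry`, `LogMessage` and their methods are hypothetical names invented for this illustration.

```python
class NameRegistry:
    """Global table mapping each name to a small integer ID (hypothetical)."""

    def __init__(self):
        self._ids = {}

    def handle(self, name):
        # Return the existing ID, or allocate the next free one on first use.
        return self._ids.setdefault(name, len(self._ids) + 1)


class LogMessage:
    """A message modelled as a set of values keyed by registry ID."""

    def __init__(self, registry):
        self._registry = registry
        self._values = {}  # ID -> value

    def set(self, name, value):
        self._values[self._registry.handle(name)] = value

    def get(self, name):
        return self._values.get(self._registry.handle(name))


registry = NameRegistry()
msg = LogMessage(registry)
msg.set("HOST", "web1")
msg.set("MSG", "disk full")
print(msg.get("HOST"), "|", msg.get("MSG"))
```

The point of the sketch is only that every lookup goes through the name-to-ID table, so the correctness of that table matters for every field stored this way.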
But syslog-ng uses hash tables for a number of other things, so this can cause other bugs as well.
Ouch, that sounds like a show-stopper for using 3.2.x on un-fixed RHEL-5/CentOS-5. So I wondered: what can I use safely? Or am I over-reacting?
First I tried looking for any '*nvtable*' file, and 3.0.9 has none while 3.1.4 and 3.2.4 have some. But then I looked at https://bugzilla.redhat.com/show_bug.cgi?id=716447 and started looking for 'g_hash_table' and that's not too good.
I found lots of hits for 'g_hash_table' [1] in every version of syslog-ng I had lying around: syslog-ng-2.1.4, syslog-ng-3.0.9, syslog-ng-3.1.4 and syslog-ng-3.2.4.
[1] $ grep -Rc 'g_hash_table' syslog-ng-2.1.4/* syslog-ng-3.0.9/* syslog-ng-3.1.4/* syslog-ng-3.2.4/* | grep -v ':0$'
Does anyone know what the latest syslog-ng is that can safely be used on un-fixed RHEL-5/CentOS-5? What is a valid test to look for this problem (if the presence of nvtable files or g_hash_table isn't)?
If the hash table returns non-matching elements, two (or more) different name-value pairs will map to the same ID, so one effectively overwrites the other. Whether this happens in practice depends on what the exact bug in glib is.
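A minimal Python sketch of that failure mode, assuming the faulty lookup simply hands back an ID that already belongs to another name. The `bad` mapping below is a made-up stand-in for such a wrong answer, not output from glib, and `build_message` and `read` are hypothetical helpers.

```python
def build_message(ids, pairs):
    """Store each value under the ID its name maps to."""
    values = {}
    for name, value in pairs:
        values[ids[name]] = value
    return values


def read(ids, values, name):
    """Fetch the value stored under the ID of the given name."""
    return values[ids[name]]


pairs = [("HOST", "web1"), ("MSG", "disk full")]

good = {"HOST": 1, "MSG": 2}  # every name has its own ID
bad = {"HOST": 1, "MSG": 1}   # assumption: two names wrongly share one ID

print(read(good, build_message(good, pairs), "HOST"))  # web1
print(read(bad, build_message(bad, pairs), "HOST"))    # disk full
```

With the shared ID, the value written for MSG silently replaces the one written for HOST, which is the kind of misbehaviour that would be hard to spot from the outside.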
Given how old CentOS-5 is, I'm surprised that this hasn't been noticed and reported before now. Perhaps that means it's rare to hit it in practice? Or maybe it's just really hard to identify the root cause.
I think the misbehaviour is difficult to notice, and the root cause is not easy to find either.
I've checked the glib history, but I've found no patch that jumped out.
The bug is still open (but also brand new): https://bugzilla.redhat.com/show_bug.cgi?id=716447
Well, syslog-ng uses hash tables in a number of ways, but so do a lot of other programs. I think there's a different issue here.

I've tried recompiling syslog-ng against the glib src.rpm from CentOS 5. The tests passed correctly.

The program in the quoted bug report, however, is not completely correct. It formats the hash key in a static variable, inserts it into the hash, and then changes the key in place, effectively modifying the hash key. This is different from the unit test, where the failure showed up in the first place. There the key is duplicated in nv_registry_alloc_handle(), so the same corruption wouldn't happen.

I don't have more time to look into this issue right now; any further investigation would be appreciated. Thanks in advance.

-- Bazsi
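The difference between reusing one key buffer and duplicating the key can be sketched as follows. This is a Python stand-in, not glib's implementation and not the program from the bug report: `TinyTable` is a hypothetical chained table that keeps a reference to whatever key object it is given (much as a C table keeps the key pointer), and a `bytearray` plays the role of the static buffer.

```python
class TinyTable:
    """Chained hash table that stores key references, not copies (hypothetical)."""

    def __init__(self, nbuckets=8):
        self.buckets = [[] for _ in range(nbuckets)]

    def _index(self, key):
        # Toy hash: sum of the byte values, reduced to a bucket index.
        return sum(key) % len(self.buckets)

    def insert(self, key, value):
        self.buckets[self._index(key)].append((key, value))

    def lookup(self, key):
        for stored_key, value in self.buckets[self._index(key)]:
            if bytes(stored_key) == bytes(key):
                return value
        return None


def fill(copy_keys):
    """Insert name-0..name-2, formatting each key into one shared buffer."""
    table = TinyTable()
    buf = bytearray()  # stands in for a C static buffer
    for i in range(3):
        buf[:] = b"name-%d" % i  # rewrite the key in place
        key = bytes(buf) if copy_keys else buf  # duplicate, or share the buffer
        table.insert(key, i)
    return table


shared = fill(copy_keys=False)
copied = fill(copy_keys=True)
print(shared.lookup(b"name-0"), copied.lookup(b"name-0"))  # None 0
```

In the shared-buffer case every stored key ends up reading "name-2", so earlier entries can no longer be found even though the table itself behaves as designed. Duplicating the key before insertion avoids this, which is why a test that copies its keys would not be expected to show the same corruption.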