Patrick Hemmer <syslogng@stormcloud9.net> writes:
I think I may have solved this. It was driving me insane as the problem had gotten even worse. I couldn't go 5 minutes without syslog-ng hanging. When using flush_timeout, if there were messages pending commit and the flush_timeout was reached, it wasn't releasing the lock before restarting the loop. Then, immediately after the loop restarted, it tried to get the lock again, but since it hadn't released the last one, it just hung there. Now, I'm not sure if this is really it or not. It seems to have solved it, but it just seems a little bit too obvious, which makes me feel like I don't understand what the code is doing there. But as mentioned, it does seem to have solved the issue, as I've gone a few hours now where before I couldn't go a few minutes.
--- syslog-ng-3.3.4.orig/modules/afsql/afsql.c 2011-11-12 07:48:47.000000000 -0500
+++ syslog-ng-3.3.4/modules/afsql/afsql.c 2012-02-09 23:25:02.544892824 -0500
@@ -890,6 +890,7 @@
             {
               afsql_dd_disconnect(self);
               afsql_dd_suspend(self);
+              g_mutex_unlock(self->db_thread_mutex);
               continue;
             }
         }
Nice catch. I was just about to investigate this issue when I saw your fix, and I can tell you that it is indeed correct. (Apparently it was also found internally by one of our developers, but for some reason I overlooked it and didn't forward it to the list. Apologies!) I'll turn it into a patch that can be applied directly in a bit. Thanks for the fix!
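For anyone following along, here is a minimal sketch of the failure mode, not the actual afsql code. It assumes a plain pthread mutex instead of the GLib one, and worker_t, worker_init and run_loop are hypothetical names made up for illustration. It uses trylock so that the demo counts the passes that would have hung instead of hanging itself.

```c
#include <pthread.h>

/* Hypothetical stand-in for the destination driver's state. */
typedef struct {
    pthread_mutex_t lock;
    int pending;              /* messages waiting for COMMIT */
} worker_t;

static void worker_init(worker_t *w, int pending)
{
    pthread_mutex_init(&w->lock, NULL);
    w->pending = pending;
}

/* Runs `iterations` passes of a simplified worker loop.
 * Returns how many passes could not take the lock because an earlier
 * pass left it held. The real code blocks on the lock, so each such
 * pass corresponds to a hang. */
static int run_loop(worker_t *w, int iterations, int unlock_before_continue)
{
    int stuck = 0;

    for (int i = 0; i < iterations; i++) {
        /* trylock instead of lock: report the problem rather than hang */
        if (pthread_mutex_trylock(&w->lock) != 0) {
            stuck++;
            continue;
        }

        if (w->pending > 0) {
            /* Simulated flush timeout with messages pending commit:
             * the loop restarts early. */
            w->pending = 0;
            if (unlock_before_continue)
                pthread_mutex_unlock(&w->lock);   /* the one-line fix */
            continue;
        }

        pthread_mutex_unlock(&w->lock);
    }
    return stuck;
}
```

Calling run_loop() with unlock_before_continue set to 0 leaves the mutex held after the first early restart, so every later pass fails to take it. With the flag set to 1, which mirrors the added g_mutex_unlock() call, no pass gets stuck.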
Note, this does not solve the issue I was getting where it would complain with "error='5: database is locked', query='COMMIT'". This still happens every now and then, but it does seem to recover eventually.
I may have an idea for this, too - will test in a bit, and get back to you as soon as I find something. -- |8]