On Thu, 2010-08-05 at 22:05 +0200, Marius Tomaschewski wrote:
Am Dienstag, 3. August 2010 16:01:39 schrieb Ilas, Yann:
Thank you for your answer.
When I start the syslog-ng with the option "mark" sets to 0, no MARKS messages are sent (as written in the OSE 3.1 Administrator Guide). This problem appears only when the option "mark" is set from a value <> 0 to a value = 0 and when I reload the configuration with a HUP signal.
One of our customers run into this problem too, but it seems to be not easy to reproduce it clearly and it has been usually resolved as WORKSFORME.
Your hint, that it is required to change from <>0 to ==0 + reload may help to reproduce it, thanks!
See also: https://bugzilla.balabit.com/show_bug.cgi?id=10 http://bugs.gentoo.org/228601 https://bugzilla.novell.com/show_bug.cgi?id=617111
I think it happens when one function fetching time gets another second value than the another one (just a change in tv_usec) and this statement results in a 0 timeout:
*timeout = MAX((next_mark_target - tv.tv_sec) * 1000, 0);
Changing it to *timeout = MAX((next_mark_target - tv.tv_sec) * 1000, 250);
limits the number of the messages that may happen inside of 1 second when the event runs with timeout 0, what may fill /var partition with mark messages.
The attached patch makes more than this (maybe a little bit too much). It also disables the watch function completely when mark_freq==0, what saves a lot of CPU ticks and makes sure, that the time is not fetched twice in one function with 2 compares.
Balazs, please review it. Thanks!
This patch seems to solve the issue for me, I've pushed it to 3.2 OSE branch, but can trivially be backported to earlier branches. diff --git a/src/afinter.c b/src/afinter.c index d631c3d..31435f7 100644 --- a/src/afinter.c +++ b/src/afinter.c @@ -154,6 +154,7 @@ afinter_source_watch_new(LogSource *afinter_source, gint mark_freq) { AFInterWatch *self = (AFInterWatch *) g_source_new(&afinter_source_watch_funcs, sizeof(AFInterWatch)); + next_mark_target = -1; self->mark_freq = mark_freq; self->afinter_source = afinter_source; g_source_set_callback(&self->super, (GSourceFunc) afinter_source_dispatch_msg, log_pipe_ref(&afinter_source->super), (GDestroyNotify) log_pipe_unref);
--- syslog-ng-2.0.9/src/afinter.c +++ syslog-ng-2.0.9/src/afinter.c 2010/07/12 08:38:35 @@ -56,20 +56,17 @@ afinter_source_prepare(GSource *source,
*timeout = -1;
- if (self->mark_freq > 0 && next_mark_target == -1) + if (self->mark_freq > 0) { g_source_get_current_time(source, &tv); - next_mark_target = tv.tv_sec + self->mark_freq; - } - - if (next_mark_target != -1) - { - g_source_get_current_time(source, &tv); - *timeout = MAX((next_mark_target - tv.tv_sec) * 1000, 0); - } - else - { - *timeout = -1; + if(next_mark_target == -1) + { + next_mark_target = tv.tv_sec + self->mark_freq; + } + if (next_mark_target != -1) + { + *timeout = MAX((next_mark_target - tv.tv_sec) * 1000, 250); + } }
If timeout is negative or zero, we really need to emit a MARK message here. The root cause was that next_mark_target wasn't reinitialized when SIGHUP is received.
return !g_queue_is_empty(internal_msg_queue); } @@ -78,10 +75,12 @@ static gboolean afinter_source_check(GSource *source) { GTimeVal tv; + AFInterWatch *self = (AFInterWatch *) source;
g_source_get_current_time(source, &tv);
- if (next_mark_target != -1 && next_mark_target <= tv.tv_sec) + if (self->mark_freq > 0 && + next_mark_target != -1 && next_mark_target <= tv.tv_sec) return TRUE; return !g_queue_is_empty(internal_msg_queue); }
if next_mark_target becomes -1 when being reinitialized, this is not needed.
@@ -94,13 +93,15 @@ afinter_source_dispatch(GSource *source, LogMessage *msg; gint path_flags = 0; GTimeVal tv; - + AFInterWatch *self = (AFInterWatch *) source; + g_source_get_current_time(source, &tv);
- if (next_mark_target != -1 && next_mark_target <= tv.tv_sec) + if (self->mark_freq > 0 && next_mark_target != -1 && next_mark_target <= tv.tv_sec) { msg = log_msg_new_mark(); path_flags = PF_FLOW_CTL_OFF; + next_mark_target = tv.tv_sec + self->mark_freq; } else { @@ -154,9 +155,12 @@ afinter_source_init(LogPipe *s, GlobalCo AFInterSource *self = (AFInterSource *) s;
/* the source added below references this logreader, it will be unref'd - when the source is destroyed */ - self->watch = afinter_source_watch_new(&self->super.super, cfg->mark_freq); - g_source_attach(self->watch, NULL); + when the source is destroyed */ + if(cfg->mark_freq > 0) + { + self->watch = afinter_source_watch_new(&self->super.super, cfg->mark_freq); + g_source_attach(self->watch, NULL); + }
This breaks the internal() source. The watch must always be enabled, because it is also responsible for generating the callbacks needed for the internal source. Anyway, thanks for the proposed patch, you can probably trivially backport the fix I've posted to 2.0 as well. -- Bazsi