Hi,

I'm using syslog-ng's match filter to extract parts of messages for use in parts of the destination; specifically in the log file name and template sections.
The messages sent to syslog-ng for this purpose are all of the format:
"<filename> <message>" (plus whatever sl-ng prepends).

.conf extract:
  filter f_logsplitter { match ("^.*?: ([^ ]*) (.*)$"); };
  destination d_logfiles {
    file ("/our/logs/$1"
      template ("$2\n")
      ...
  etc.

Eg. For the message "root: foo/bar.log some text", the log file /our/logs/foo/bar.log gets "some text" appended to it.

Now this seems to work fine for low traffic, but if I send 10k messages in a tight loop - either locally or remotely via TCP - $2 contains the wrong string for many/most of them, eg:
  07/03/2006 16:56:34 SEEK #3
  07/03/2006 16:56:35 SEEK #3
  07/03/2006 16:56:35 SEEK #3
  07/03/2006 16:56:35 SEEK #7
  07/03/2006 16:56:35 SEEK #7
  07/03/2006 16:56:35 SEEK #7
  07/03/2006 16:56:35 SEEK #7
  07/03/2006 16:56:35 SEEK #8
  07/03/2006 16:56:35 SEEK #11
  07/03/2006 16:56:35 SEEK #11
  07/03/2006 16:56:35 SEEK #11
...where the #number should be consecutive.
If I use $MSGONLY, or such, the correct message line is used.

It seems to me like the subpattern macros - $1, $2, etc - aren't maintained on a per-message basis, but are global..?
Given that they aren't documented (that I can find), are the presence of $1, etc. an unintended side-effect of using a regex? Am I doing something I shouldn't by using them?
Or did I find me a bug? ;o)

Takk,
- Mel C
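For readers who want to see what $1 and $2 correspond to, here is a minimal sketch using POSIX regcomp()/regexec(), the same family of calls the patches further down this thread show the filter using. The function name split_line and the buffer handling are hypothetical, and the pattern is a simplified assumption: "[^:]*" stands in for the original ".*?", since a non-greedy quantifier is not something POSIX extended regular expressions define.

```c
#include <regex.h>
#include <string.h>

/*
 * Hypothetical helper: split "<tag>: <filename> <message>" into the
 * filename and the message text.
 * Returns 0 on success, -1 if the line does not match or a buffer is
 * too small.
 */
int split_line(const char *line, char *file, size_t file_len,
               char *text, size_t text_len)
{
    regex_t re;
    regmatch_t m[3];   /* m[0] = whole match, m[1] = "$1", m[2] = "$2" */
    int rc = -1;

    /* Simplified pattern (assumption): "[^:]*" replaces the ".*?" prefix. */
    if (regcomp(&re, "^[^:]*: ([^ ]*) (.*)$", REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, line, 3, m, 0) == 0) {
        size_t n1 = (size_t)(m[1].rm_eo - m[1].rm_so);
        size_t n2 = (size_t)(m[2].rm_eo - m[2].rm_so);

        if (n1 < file_len && n2 < text_len) {
            /* Copy each capture out of the input and terminate it. */
            memcpy(file, line + m[1].rm_so, n1);
            file[n1] = '\0';
            memcpy(text, line + m[2].rm_so, n2);
            text[n2] = '\0';
            rc = 0;
        }
    }
    regfree(&re);
    return rc;
}
```

Calling split_line() on "root: foo/bar.log some text" should leave "foo/bar.log" in the first buffer and "some text" in the second, which are the two values the destination above uses for the file name and the template.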
On Tuesday 07 Mar 2006 18:18, Mel Collins wrote:
I'm using syslog-ng's match filter to extract parts of messages for use in parts of the destination; specifically in the log file name and template sections.
I've just been going through the source, and it's been a long while since I last used C, but I think I can answer my own question:
It seems to me like the subpattern macros - $1, $2, etc - aren't maintained on a per-message basis, but are global..?
Yes, the subpattern matches are stored globally, in the array re_matches (defined in filter.c/h, used in macros.c). I can't think of any logical reason why it would be designed this way intentionally, so I suspect it's an oversight/bug. Unless there's something I've missed? Is there any official way to report bugs, or is this as official as it gets? ;) Takk, - Mel C
On Tue Mar 7 18:18:08 2006, Mel Collins wrote:
| It seems to me like the subpattern macros - $1, $2, etc - aren't maintained
| on a per-message basis, but are global..?
| Given that they aren't documented (that I can find), are the presence of $1,
| etc. an unintended side-effect of using a regex? Am I doing something I
| shouldn't by using them?
| Or did I find me a bug? ;o)

In fact it looks like the same problem I raised about syslog-ng missing messages. I have the exact same problem except that I am not using pattern matching....

Vincent.
--
 .~.   Vincent Haverlant -- Galadril -- #ICQ: 35695155
 /V\   MSN: vincent_msn@haverlant.org -- http://www.haverlant.org/
/( )\  Parinux member: http://www.parinux.org/
^^-^^  GPG: 8FEA 52C2 5C54 A201 2375 0FA5 AF2E 1881 92D0 EE84
On Wed Mar 8 17:26:23 2006, Vincent Haverlant wrote:
| In fact it looks like the same problem I raised about syslog-ng missing
| messages. I have the exact same problem except that I am not using
| pattern matching....

Well, after all maybe not, but it could be related....

Vincent.
On Tue, 2006-03-07 at 18:18 +0000, Mel Collins wrote:
It seems to me like the subpattern macros - $1, $2, etc - aren't maintained on a per-message basis, but are global..? Given that they aren't documented (that I can find), are the presence of $1, etc. an unintended side-effect of using a regex? Am I doing something I shouldn't by using them? Or did I find me a bug? ;o)
You probably found a bug. I assumed that processing of a message is always finished before a new one is started, which is why the pattern matches are global. This is however not true when the message is queued; in that case the scenario you described happens: the global match space is overwritten before the queued message is written out.

Let me fix that buglet... Can you check if this compile-tested patch fixes it?

M  src/filter.c
M  src/filter.h
M  src/logmsg.h
M  src/macros.c

* modified files

--- orig/src/filter.c
+++ mod/src/filter.c
@@ -30,11 +30,8 @@
 
 #include <string.h>
 
-gchar *re_matches[RE_MAX_MATCHES];
-
 static void log_filter_rule_free(LogFilterRule *self);
 
-
 gboolean
 log_filter_rule_eval(LogFilterRule *self, LogMessage *msg)
 {
@@ -304,7 +301,7 @@ filter_re_compile(const char *re, regex_
 }
 
 static gboolean
-filter_re_eval(FilterRE *self, gchar *str)
+filter_re_eval(FilterRE *self, LogMessage *msg, gchar *str)
 {
   regmatch_t matches[RE_MAX_MATCHES];
   gboolean rc;
@@ -312,8 +309,9 @@ filter_re_eval(FilterRE *self, gchar *st
 
   for (i = 0; i < RE_MAX_MATCHES; i++)
     {
-      g_free(re_matches[i]);
-      re_matches[i] = NULL;
+      if (msg->re_matches[i])
+        g_free(msg->re_matches[i]);
+      msg->re_matches[i] = NULL;
     }
   rc = !regexec(&self->regex, str, RE_MAX_MATCHES, matches, 0);
   if (rc)
@@ -321,9 +319,9 @@ filter_re_eval(FilterRE *self, gchar *st
       for (i = 0; i < RE_MAX_MATCHES && matches[i].rm_so != -1; i++)
         {
           gint length = matches[i].rm_eo - matches[i].rm_so;
-          re_matches[i] = g_malloc(length + 1);
-          memcpy(re_matches[i], &str[matches[i].rm_so], length);
-          re_matches[i][length] = 0;
+          msg->re_matches[i] = g_malloc(length + 1);
+          memcpy(msg->re_matches[i], &str[matches[i].rm_so], length);
+          msg->re_matches[i][length] = 0;
         }
     }
   return rc ^ self->super.comp;
@@ -341,7 +339,7 @@ filter_re_free(FilterExprNode *s)
 static gboolean
 filter_prog_eval(FilterExprNode *s, LogMessage *msg)
 {
-  return filter_re_eval((FilterRE *) s, msg->program->str);
+  return filter_re_eval((FilterRE *) s, msg, msg->program->str);
 }
 
 FilterExprNode *
@@ -363,7 +361,7 @@ filter_prog_new(gchar *prog)
 static gboolean
 filter_host_eval(FilterExprNode *s, LogMessage *msg)
 {
-  return filter_re_eval((FilterRE *) s, msg->host->str);
+  return filter_re_eval((FilterRE *) s, msg, msg->host->str);
 }
 
 FilterExprNode *
@@ -385,7 +383,7 @@ filter_host_new(gchar *host)
 static gboolean
 filter_match_eval(FilterExprNode *s, LogMessage *msg)
 {
-  return filter_re_eval((FilterRE *) s, msg->msg->str);
+  return filter_re_eval((FilterRE *) s, msg, msg->msg->str);
 }
 
 FilterExprNode *

--- orig/src/filter.h
+++ mod/src/filter.h
@@ -31,10 +31,6 @@
 struct _LogFilterRule;
 struct _GlobalConfig;
 
-/* regex substitutions from the last match */
-#define RE_MAX_MATCHES 10
-extern gchar *re_matches[RE_MAX_MATCHES];
-
 typedef struct _FilterExprNode
 {
   gboolean comp;

--- orig/src/logmsg.h
+++ mod/src/logmsg.h
@@ -72,6 +72,8 @@ typedef struct _LogStamp
 
 void log_stamp_format(LogStamp *stamp, GString *target, gint ts_format, glong zone_offset, gint frac_digits);
 
+#define RE_MAX_MATCHES 10
+
 typedef struct _LogMessage
 {
   guint ref_cnt;
@@ -87,6 +89,7 @@ typedef struct _LogMessage
   LogStamp stamp;
   LogStamp recvd;
   GString *date, *host, *host_from, *program, *msg;
+  gchar *re_matches[RE_MAX_MATCHES];
 } LogMessage;
 
 LogMessage *log_msg_ref(LogMessage *m);

--- orig/src/macros.c
+++ mod/src/macros.c
@@ -142,8 +142,8 @@ log_macro_expand(GString *result, gint i
         {
           gint ndx = id - M_MATCH_REF_OFS;
           /* match reference */
-          if (re_matches[ndx])
-            result_append(result, re_matches[ndx], strlen(re_matches[ndx]), !!(flags & MF_ESCAPE_RESULT));
+          if (msg->re_matches[ndx])
+            result_append(result, msg->re_matches[ndx], strlen(msg->re_matches[ndx]), !!(flags & MF_ESCAPE_RESULT));
 
           return TRUE;
         }

--
Bazsi
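The overwrite described above can be reproduced in a few lines. The following is only a sketch under stated assumptions: Msg, filter_stage and expand_after_queuing are hypothetical names, not syslog-ng code, and the "queue" is simply an array whose messages are all filtered before any of them is written. It compares a single shared match slot with a copy kept on each message.

```c
#include <stdio.h>
#include <string.h>

#define QUEUE_LEN 3

/* Hypothetical stand-in for a log message; not syslog-ng's LogMessage. */
typedef struct {
    const char *text;       /* payload as received */
    const char *own_match;  /* capture remembered on the message itself */
} Msg;

/* One slot shared by all messages, like an entry of a global match array. */
static const char *global_match;

/* "Filter" stage: remember the capture for this message. */
static void filter_stage(Msg *m)
{
    global_match = m->text;  /* overwritten when the next message is filtered */
    m->own_match = m->text;  /* stays with the message while it is queued */
}

/*
 * Filter every message first, then expand a "$2"-style macro for each one,
 * as happens when output is queued. The expansions are written to out as a
 * comma-separated list.
 */
void expand_after_queuing(int use_global, char *out, size_t out_len)
{
    Msg queue[QUEUE_LEN] = {
        { "SEEK #1", NULL }, { "SEEK #2", NULL }, { "SEEK #3", NULL }
    };
    size_t i;

    for (i = 0; i < QUEUE_LEN; i++)
        filter_stage(&queue[i]);

    out[0] = '\0';
    for (i = 0; i < QUEUE_LEN; i++) {
        const char *value = use_global ? global_match : queue[i].own_match;
        size_t used = strlen(out);

        if (used >= out_len)
            break;
        snprintf(out + used, out_len - used, "%s%s", i ? "," : "", value);
    }
}
```

With use_global set, every queued message expands to the capture of the last message filtered, which mirrors the repeated SEEK numbers in the report; with the per-message copy, each message keeps its own text.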
The previous patch is crap (it leaks all memory used by substitution space), use this one:

--- orig/src/filter.c
+++ mod/src/filter.c
@@ -30,11 +30,8 @@
 
 #include <string.h>
 
-gchar *re_matches[RE_MAX_MATCHES];
-
 static void log_filter_rule_free(LogFilterRule *self);
 
-
 gboolean
 log_filter_rule_eval(LogFilterRule *self, LogMessage *msg)
 {
@@ -304,26 +301,22 @@ filter_re_compile(const char *re, regex_
 }
 
 static gboolean
-filter_re_eval(FilterRE *self, gchar *str)
+filter_re_eval(FilterRE *self, LogMessage *msg, gchar *str)
 {
   regmatch_t matches[RE_MAX_MATCHES];
   gboolean rc;
   gint i;
-
-  for (i = 0; i < RE_MAX_MATCHES; i++)
-    {
-      g_free(re_matches[i]);
-      re_matches[i] = NULL;
-    }
+
+  log_msg_clear_matches(msg);
   rc = !regexec(&self->regex, str, RE_MAX_MATCHES, matches, 0);
   if (rc)
     {
       for (i = 0; i < RE_MAX_MATCHES && matches[i].rm_so != -1; i++)
         {
           gint length = matches[i].rm_eo - matches[i].rm_so;
-          re_matches[i] = g_malloc(length + 1);
-          memcpy(re_matches[i], &str[matches[i].rm_so], length);
-          re_matches[i][length] = 0;
+          msg->re_matches[i] = g_malloc(length + 1);
+          memcpy(msg->re_matches[i], &str[matches[i].rm_so], length);
+          msg->re_matches[i][length] = 0;
         }
     }
   return rc ^ self->super.comp;
@@ -341,7 +334,7 @@ filter_re_free(FilterExprNode *s)
 static gboolean
 filter_prog_eval(FilterExprNode *s, LogMessage *msg)
 {
-  return filter_re_eval((FilterRE *) s, msg->program->str);
+  return filter_re_eval((FilterRE *) s, msg, msg->program->str);
 }
 
 FilterExprNode *
@@ -363,7 +356,7 @@ filter_prog_new(gchar *prog)
 static gboolean
 filter_host_eval(FilterExprNode *s, LogMessage *msg)
 {
-  return filter_re_eval((FilterRE *) s, msg->host->str);
+  return filter_re_eval((FilterRE *) s, msg, msg->host->str);
 }
 
 FilterExprNode *
@@ -385,7 +378,7 @@ filter_host_new(gchar *host)
 static gboolean
 filter_match_eval(FilterExprNode *s, LogMessage *msg)
 {
-  return filter_re_eval((FilterRE *) s, msg->msg->str);
+  return filter_re_eval((FilterRE *) s, msg, msg->msg->str);
 }
 
 FilterExprNode *

--- orig/src/filter.h
+++ mod/src/filter.h
@@ -31,10 +31,6 @@
 struct _LogFilterRule;
 struct _GlobalConfig;
 
-/* regex substitutions from the last match */
-#define RE_MAX_MATCHES 10
-extern gchar *re_matches[RE_MAX_MATCHES];
-
 typedef struct _FilterExprNode
 {
   gboolean comp;

--- orig/src/logmsg.c
+++ mod/src/logmsg.c
@@ -417,6 +417,19 @@ log_msg_parse(LogMessage *self, gchar *d
   g_string_assign_len(self->msg, src, left);
 }
 
+void
+log_msg_clear_matches(LogMessage *self)
+{
+  gint i;
+
+  for (i = 0; i < RE_MAX_MATCHES; i++)
+    {
+      if (self->re_matches[i])
+        g_free(self->re_matches[i]);
+      self->re_matches[i] = NULL;
+    }
+}
+
 /**
  * log_msg_free:
  * @self: LogMessage instance
@@ -432,6 +445,7 @@ log_msg_free(LogMessage *self)
   g_string_free(self->host_from, TRUE);
   g_string_free(self->program, TRUE);
   g_string_free(self->msg, TRUE);
+  log_msg_clear_matches(self);
   g_free(self);
 }
 

--- orig/src/logmsg.h
+++ mod/src/logmsg.h
@@ -72,6 +72,8 @@ typedef struct _LogStamp
 
 void log_stamp_format(LogStamp *stamp, GString *target, gint ts_format, glong zone_offset, gint frac_digits);
 
+#define RE_MAX_MATCHES 10
+
 typedef struct _LogMessage
 {
   guint ref_cnt;
@@ -87,6 +89,7 @@ typedef struct _LogMessage
   LogStamp stamp;
   LogStamp recvd;
   GString *date, *host, *host_from, *program, *msg;
+  gchar *re_matches[RE_MAX_MATCHES];
 } LogMessage;
 
 LogMessage *log_msg_ref(LogMessage *m);

--- orig/src/macros.c
+++ mod/src/macros.c
@@ -142,8 +142,8 @@ log_macro_expand(GString *result, gint i
         {
           gint ndx = id - M_MATCH_REF_OFS;
           /* match reference */
-          if (re_matches[ndx])
-            result_append(result, re_matches[ndx], strlen(re_matches[ndx]), !!(flags & MF_ESCAPE_RESULT));
+          if (msg->re_matches[ndx])
+            result_append(result, msg->re_matches[ndx], strlen(msg->re_matches[ndx]), !!(flags & MF_ESCAPE_RESULT));
 
           return TRUE;
         }

--
Bazsi
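The ownership rule behind this second patch (captures belong to the message, so destroying the message must release them) can be shown in isolation. This is a minimal sketch using plain malloc()/free() rather than GLib; the type Msg and the functions msg_new, msg_set_match, msg_clear_matches and msg_free are hypothetical and only model the match storage, not the real LogMessage.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_MATCHES 10

/* Hypothetical message type; only the match storage is modelled. */
typedef struct {
    char *matches[MAX_MATCHES];
} Msg;

/* Allocate a message with every match slot empty. */
Msg *msg_new(void)
{
    Msg *m = malloc(sizeof *m);
    int i;

    if (!m)
        return NULL;
    for (i = 0; i < MAX_MATCHES; i++)
        m->matches[i] = NULL;
    return m;
}

/* Free every stored capture and reset the slots; returns how many were freed. */
int msg_clear_matches(Msg *m)
{
    int i, freed = 0;

    for (i = 0; i < MAX_MATCHES; i++) {
        if (m->matches[i]) {
            free(m->matches[i]);
            freed++;
        }
        m->matches[i] = NULL;
    }
    return freed;
}

/* Store a copy of the first len bytes of src as capture ndx; 0 on success. */
int msg_set_match(Msg *m, int ndx, const char *src, size_t len)
{
    char *copy;

    if (ndx < 0 || ndx >= MAX_MATCHES)
        return -1;
    copy = malloc(len + 1);
    if (!copy)
        return -1;
    memcpy(copy, src, len);
    copy[len] = '\0';
    free(m->matches[ndx]);   /* release any previous capture in this slot */
    m->matches[ndx] = copy;
    return 0;
}

/* Destroying the message releases its captures; omitting this is the leak. */
void msg_free(Msg *m)
{
    if (!m)
        return;
    msg_clear_matches(m);
    free(m);
}
```

The design point is that the clearing routine is called from two places: before a new match is stored and when the message itself is destroyed. The first patch only had the former, which is why captures of freed messages were never released.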
On Monday 13 Mar 2006 17:28, Balazs Scheidler wrote:
The previous patch is crap (it leaks all memory used by substitution space), use this one:
That patch seems to've been the ticket! Now that I've gotten some time to come back to it, I ran the test which duplicated records previously, and it works as expected. :) Takk, - Mel C
participants (3): Balazs Scheidler, Mel Collins, Vincent Haverlant