[PATCH] basicfuncs: Implement a $(substr START LEN STR) template function.
This implements the $(substr) template function - a fairly primitive one: it does error checking, but there's no error reporting present, therefore invalid function calls will just get ignored. Usage should be straightforward. Reported-By: Jakub Jankowski <shasta@toxcorp.com> Signed-off-by: Gergely Nagy <algernon@balabit.hu> --- modules/basicfuncs/basic-funcs.c | 46 ++++++++++++++++++++++++++++++++++++++ 1 files changed, 46 insertions(+), 0 deletions(-) diff --git a/modules/basicfuncs/basic-funcs.c b/modules/basicfuncs/basic-funcs.c index 44b8af3..542c2ec 100644 --- a/modules/basicfuncs/basic-funcs.c +++ b/modules/basicfuncs/basic-funcs.c @@ -4,6 +4,9 @@ #include "filter-expr-parser.h" #include "cfg.h" +#include <stdlib.h> +#include <errno.h> + static void tf_echo(LogMessage *msg, gint argc, GString *argv[], GString *result) { @@ -19,6 +22,48 @@ tf_echo(LogMessage *msg, gint argc, GString *argv[], GString *result) TEMPLATE_FUNCTION_SIMPLE(tf_echo); +static gboolean +tf_substr_parse_int(const gchar *s, long *d) +{ + gchar *endptr; + glong val; + + errno = 0; + val = strtoll(s, &endptr, 10); + + if ((errno == ERANGE && (val == LONG_MAX || val == LONG_MIN)) + || (errno != 0 && val == 0)) + return FALSE; + + if (endptr == s || *endptr != '\0') + return FALSE; + + *d = val; + return TRUE; +} + +static void +tf_substr(LogMessage *msg, gint argc, GString *argv[], GString *result) +{ + glong start, len; + + if (argc != 3) + return; + + if (!tf_substr_parse_int (argv[0]->str, &start)) + return; + if (!tf_substr_parse_int (argv[1]->str, &len)) + return; + + if (start > argv[2]->len || + start + len >= argv[2]->len) + return; + + g_string_append_len (result, argv[2]->str + start - 1, len); +} + +TEMPLATE_FUNCTION_SIMPLE(tf_substr); + typedef struct _TFCondState { FilterExprNode *filter; @@ -158,6 +203,7 @@ static Plugin basicfuncs_plugins[] = TEMPLATE_FUNCTION_PLUGIN(tf_echo, "echo"), TEMPLATE_FUNCTION_PLUGIN(tf_grep, "grep"), TEMPLATE_FUNCTION_PLUGIN(tf_if, "if"), + TEMPLATE_FUNCTION_PLUGIN(tf_substr, "substr"), }; gboolean -- 1.7.2.5
Gergely Nagy <algernon@balabit.hu> writes:
This implements the $(substr) template function - a fairly primitive one: it does error checking, but there's no error reporting present, therefore invalid function calls will just get ignored.
The patch was made against syslog-ng 3.3 HEAD, but should apply to 3.2 aswell. Either cleanly or with minor tweaks only. -- |8]
Saturday 09 of April 2011 13:05:38 Gergely Nagy napisał(a):
This implements the $(substr) template function - a fairly primitive one: it does error checking, but there's no error reporting present, therefore invalid function calls will just get ignored.
Thanks! I'll be testing this later this week and report back issues (if any). Regards, -- Jakub Jankowski|shasta@toxcorp.com|http://toxcorp.com/ GPG: FCBF F03D 9ADB B768 8B92 BB52 0341 9037 A875 942D
Tuesday 12 of April 2011 10:50:05 Jakub Jankowski napisał(a):
Saturday 09 of April 2011 13:05:38 Gergely Nagy napisał(a):
This implements the $(substr) template function - a fairly primitive one: it does error checking, but there's no error reporting present, therefore invalid function calls will just get ignored.
Thanks! I'll be testing this later this week and report back issues (if any).
This is what I came up with, based on your patch: - it changes arguments order to $(substr <string> <offsetstart> [length]) - so string is first - offset = 0 means beginning of the string (and not 1) - called without third (length) argument returns substring from desired start offset to the end of the string - works with negative offsets/lengths in a manner similar to other implementations: - negative offset means "start that many characters counting from the end of the string", with -1 being the last character - negative length means "return substring from desired start up to that many characters from the end of the string" - adds some msg_error()s on argument parsing failures I'm not entirely sure if there are no off-by-one (or even off-by-many :) errors, I'd be grateful if someone could check that. I also had issues comparing negative glongs (signed) with GString's structure len field, which is gsize (so unsigned), hence that ugly type change at the beginning. Can anyone point me to a *right* direction there? There are some comments inline to clarify my intentions. Constructive criticism welcome. :) This is against 3.2.2, but should apply against 3.3 HEAD, I think. Signed-off-by: Jakub Jankowski <shasta@toxcorp.com> ============== --- syslog-ng-3.2.2/modules/basicfuncs/basic-funcs.c +++ syslog-ng-3.2.2/modules/basicfuncs/basic-funcs.c @@ -4,6 +4,9 @@ #include "filter-expr-parser.h" #include "cfg.h" +#include <stdlib.h> +#include <errno.h> + static void tf_echo(LogMessage *msg, gint argc, GString *argv[], GString *result) { @@ -19,6 +22,129 @@ TEMPLATE_FUNCTION_SIMPLE(tf_echo); +static gboolean +tf_substr_parse_int(const gchar *s, long *d) +{ + gchar *endptr; + glong val; + + errno = 0; + val = strtoll(s, &endptr, 10); + + if ((errno == ERANGE && (val == LONG_MAX || val == LONG_MIN)) + || (errno != 0 && val == 0)) + return FALSE; + + if (endptr == s || *endptr != '\0') + return FALSE; + + *d = val; + return TRUE; +} + +static void +tf_substr(LogMessage *msg, gint argc, GString *argv[], GString *result) +{ + glong start, len; + glong argv0len; + + /* + * We need to cast argv[0]->len, which is gsize (so unsigned) type, to a + * signed type, so we can compare negative offsets/lengths with string + * length. But if argv[0]->len is bigger than G_MAXLONG, this can lead to + * completely wrong calculations, so we'll just return nothing (alternative + * would be to return original string and perhaps print an error...) + */ + if (argv[0]->len >= G_MAXLONG) { + /* msg_error("$(substr) error: string length is too big to fit into glong type", NULL); */ + return; + } + /* K&R probably hate me */ + argv0len = argv[0]->len; + + /* check number of arguments */ + if (argc < 2 || argc > 3) + return; + + /* get offset position from second argument */ + if (!tf_substr_parse_int(argv[1]->str, &start)) { + msg_error("$(substr) parsing failed, start could not be parsed", evt_tag_str("start", argv[1]->str), NULL); + return; + } + + /* if we were called with >2 arguments, third was desired length */ + if (argc > 2) { + if (!tf_substr_parse_int(argv[2]->str, &len)) { + msg_error("$(substr) parsing failed, length could not be parsed", evt_tag_str("length", argv[2]->str), NULL); + return; + } + } else + len = argv0len; + + /* + * if requested substring length is negative and beyond string start, do nothing; + * also if it is greater than string length, limit it to string length + */ + if (len < 0 && -(len) > argv0len) + return; + else if (len > argv0len) + len = argv0len; + + /* + * if requested offset is beyond string length, do nothing; + * also, if offset is negative and "before" string beginning, do nothing + * (or should we perhaps align start with the beginning instead?) + */ + if (start >= argv0len) + return; + else if (start < 0 && -(start) > argv0len) + return; + + /* with negative length, see if we don't end up with start > end */ + if (len < 0 && ((start < 0 && start > len) || + (start >= 0 && (len + argv0len - start) < 0))) + return; + + /* if requested offset is negative, move start it accordingly */ + if (start < 0) { + start = argv0len + start; + /* + * this shouldn't actually happen, as earlier we tested for + * (start < 0 && -start > argv0len), but better safe than sorry + */ + if (start < 0) + start = 0; + } + + /* + * if requested length is negative, "resize" len to include exactly as many + * characters as needed from the end of the string, given our start position. + * (start is always non-negative here already) + */ + if (len < 0) { + len = argv0len - start + len; + /* this also shouldn't happen, but - again - better safe than sorry */ + if (len < 0) + return; + } + + /* if we're beyond string end, do nothing */ + if (start >= argv0len) + return; + + /* if we want something beyond string end, do it only till the end */ + if (start + len > argv0len) + len = argv0len - start; + + /* skip g_string_append_len if we ended up having to extract 0 chars */ + if (len == 0) + return; + + g_string_append_len(result, argv[0]->str + start, len); +} + +TEMPLATE_FUNCTION_SIMPLE(tf_substr); + typedef struct _TFCondState { FilterExprNode *filter; @@ -157,6 +283,7 @@ TEMPLATE_FUNCTION_PLUGIN(tf_echo, "echo"), TEMPLATE_FUNCTION_PLUGIN(tf_grep, "grep"), TEMPLATE_FUNCTION_PLUGIN(tf_if, "if"), + TEMPLATE_FUNCTION_PLUGIN(tf_substr, "substr"), }; gboolean ============== Regards, -- Jakub Jankowski|shasta@toxcorp.com|http://toxcorp.com/ GPG: FCBF F03D 9ADB B768 8B92 BB52 0341 9037 A875 942D
Jakub Jankowski <shasta@toxcorp.com> writes:
Tuesday 12 of April 2011 10:50:05 Jakub Jankowski napisał(a):
Saturday 09 of April 2011 13:05:38 Gergely Nagy napisał(a):
This implements the $(substr) template function - a fairly primitive one: it does error checking, but there's no error reporting present, therefore invalid function calls will just get ignored.
Thanks! I'll be testing this later this week and report back issues (if any).
This is what I came up with, based on your patch: - it changes arguments order to $(substr <string> <offsetstart> [length]) - so string is first - offset = 0 means beginning of the string (and not 1) - called without third (length) argument returns substring from desired start offset to the end of the string - works with negative offsets/lengths in a manner similar to other implementations: - negative offset means "start that many characters counting from the end of the string", with -1 being the last character - negative length means "return substring from desired start up to that many characters from the end of the string" - adds some msg_error()s on argument parsing failures
All of these sound good and reasonable, thank you!
I'm not entirely sure if there are no off-by-one (or even off-by-many :) errors, I'd be grateful if someone could check that. I also had issues comparing negative glongs (signed) with GString's structure len field, which is gsize (so unsigned), hence that ugly type change at the beginning. Can anyone point me to a *right* direction there?
I think you're on the right track there, but see my comments on the patch itself.
+static void +tf_substr(LogMessage *msg, gint argc, GString *argv[], GString *result) +{ + glong start, len; + glong argv0len; + + /* + * We need to cast argv[0]->len, which is gsize (so unsigned) type, to a + * signed type, so we can compare negative offsets/lengths with string + * length. But if argv[0]->len is bigger than G_MAXLONG, this can lead to + * completely wrong calculations, so we'll just return nothing (alternative + * would be to return original string and perhaps print an error...) + */ + if (argv[0]->len >= G_MAXLONG) { + /* msg_error("$(substr) error: string length is too big to fit into glong type", NULL); */ + return; + } + /* K&R probably hate me */ + argv0len = argv[0]->len; +
The intent here is good, but it could - in my opinion - be made a little more efficient: if (argv[0]->len >= G_MAXLONG) { msg_error("$(substr) error: string is too long", NULL); return; } And then just cast argv[0]->len to glong if so need be (see later). I'd say the error message can be simple too, mentioning the exact type is in my opinion, not needed. For those who'd understand what it means, it doesn't add much extra value, as they can just look at the source. For others, it's confusing. Short & to the point. Also, if we return due to a string being too long, I believe we should print at least a warning (but an error seems more appropriate). Returning the original string in this case sounds non-intuitive, so - as you did in your patch - it's better to do nothing instead.
+ /* if we were called with >2 arguments, third was desired length */ + if (argc > 2) { + if (!tf_substr_parse_int(argv[2]->str, &len)) { + msg_error("$(substr) parsing failed, length could not be parsed", evt_tag_str("length", argv[2]->str), NULL); + return; + } + } else + len = argv0len;
len = (glong)argv[0]->len And then you saved a variable, accomplished the same thing, and noone hates you! :]
+ /* + * if requested substring length is negative and beyond string start, do nothing; + * also if it is greater than string length, limit it to string length + */ + if (len < 0 && -(len) > argv0len) + return; + else if (len > argv0len) + len = argv0len;
See above. You can cast argv[0]->len to glong within the if, to avoid compiler warnings - and it's safe too, since you checked the length at the top. This goes for the rest of the len<->argv0len stuff too. However, explicit casting is merely my preference, using a variable is perfectly good too. Just name it something different then, like... "stringlen" - so one reading the code will have an idea what it is, without having to scroll up and check how it is initialized. Other than these, your patch looks fine on the first read, good work! I'll have another look at it as soon as possible, and apply it to my 3.3 branch too. -- |8]
Hi, I would want to integrate this into 3.4, since I've opened it. Could one of you please prepare a combined patch, that contains both Gergely's and Jakub's work? Thanks. On Wed, 2011-04-13 at 16:13 +0200, Gergely Nagy wrote:
Jakub Jankowski <shasta@toxcorp.com> writes:
Tuesday 12 of April 2011 10:50:05 Jakub Jankowski napisał(a):
Saturday 09 of April 2011 13:05:38 Gergely Nagy napisał(a):
This implements the $(substr) template function - a fairly primitive one: it does error checking, but there's no error reporting present, therefore invalid function calls will just get ignored.
Thanks! I'll be testing this later this week and report back issues (if any).
This is what I came up with, based on your patch: - it changes arguments order to $(substr <string> <offsetstart> [length]) - so string is first - offset = 0 means beginning of the string (and not 1) - called without third (length) argument returns substring from desired start offset to the end of the string - works with negative offsets/lengths in a manner similar to other implementations: - negative offset means "start that many characters counting from the end of the string", with -1 being the last character - negative length means "return substring from desired start up to that many characters from the end of the string" - adds some msg_error()s on argument parsing failures
All of these sound good and reasonable, thank you!
I'm not entirely sure if there are no off-by-one (or even off-by-many :) errors, I'd be grateful if someone could check that. I also had issues comparing negative glongs (signed) with GString's structure len field, which is gsize (so unsigned), hence that ugly type change at the beginning. Can anyone point me to a *right* direction there?
I think you're on the right track there, but see my comments on the patch itself.
+static void +tf_substr(LogMessage *msg, gint argc, GString *argv[], GString *result) +{ + glong start, len; + glong argv0len; + + /* + * We need to cast argv[0]->len, which is gsize (so unsigned) type, to a + * signed type, so we can compare negative offsets/lengths with string + * length. But if argv[0]->len is bigger than G_MAXLONG, this can lead to + * completely wrong calculations, so we'll just return nothing (alternative + * would be to return original string and perhaps print an error...) + */ + if (argv[0]->len >= G_MAXLONG) { + /* msg_error("$(substr) error: string length is too big to fit into glong type", NULL); */ + return; + } + /* K&R probably hate me */ + argv0len = argv[0]->len; +
The intent here is good, but it could - in my opinion - be made a little more efficient:
if (argv[0]->len >= G_MAXLONG) { msg_error("$(substr) error: string is too long", NULL); return; }
And then just cast argv[0]->len to glong if so need be (see later).
I'd say the error message can be simple too, mentioning the exact type is in my opinion, not needed. For those who'd understand what it means, it doesn't add much extra value, as they can just look at the source. For others, it's confusing.
Short & to the point.
Also, if we return due to a string being too long, I believe we should print at least a warning (but an error seems more appropriate). Returning the original string in this case sounds non-intuitive, so - as you did in your patch - it's better to do nothing instead.
+ /* if we were called with >2 arguments, third was desired length */ + if (argc > 2) { + if (!tf_substr_parse_int(argv[2]->str, &len)) { + msg_error("$(substr) parsing failed, length could not be parsed", evt_tag_str("length", argv[2]->str), NULL); + return; + } + } else + len = argv0len;
len = (glong)argv[0]->len
And then you saved a variable, accomplished the same thing, and noone hates you! :]
+ /* + * if requested substring length is negative and beyond string start, do nothing; + * also if it is greater than string length, limit it to string length + */ + if (len < 0 && -(len) > argv0len) + return; + else if (len > argv0len) + len = argv0len;
See above. You can cast argv[0]->len to glong within the if, to avoid compiler warnings - and it's safe too, since you checked the length at the top.
This goes for the rest of the len<->argv0len stuff too. However, explicit casting is merely my preference, using a variable is perfectly good too. Just name it something different then, like... "stringlen" - so one reading the code will have an idea what it is, without having to scroll up and check how it is initialized.
Other than these, your patch looks fine on the first read, good work!
I'll have another look at it as soon as possible, and apply it to my 3.3 branch too.
-- Bazsi
Balazs Scheidler <bazsi@balabit.hu> writes:
Hi,
I would want to integrate this into 3.4, since I've opened it. Could one of you please prepare a combined patch, that contains both Gergely's and Jakub's work?
I'll do that in a few hours - I'll also prep the v3 CAP_SYSLOG patch, so you can pull it easily. -- |8]
This is an implementation of a $(substr <string> <start> [<len>]) template function, where the first argument is the string we want to extract a part from, the second is the starting offset, and the third, optional is the length of the substring we want to extract. If no length is specified, then the substring will be extracted from the desired offset until the end of the original string. Offset 0 means the start of the string. Negative offsets/lengths works in a manner similar to other implementations: - negative offset means "start that many characters counting from the end of the string", with -1 being the last character - negative length means "return substring from desired start up to that many characters from the end of the string" Signed-off-by: Jakub Jankowski <shasta@toxcorp.com> Signed-off-by: Gergely Nagy <algernon@balabit.hu> --- modules/basicfuncs/basic-funcs.c | 124 ++++++++++++++++++++++++++++++++++++++ 1 files changed, 124 insertions(+), 0 deletions(-) diff --git a/modules/basicfuncs/basic-funcs.c b/modules/basicfuncs/basic-funcs.c index 44b8af3..a03e221 100644 --- a/modules/basicfuncs/basic-funcs.c +++ b/modules/basicfuncs/basic-funcs.c @@ -4,6 +4,9 @@ #include "filter-expr-parser.h" #include "cfg.h" +#include <stdlib.h> +#include <errno.h> + static void tf_echo(LogMessage *msg, gint argc, GString *argv[], GString *result) { @@ -19,6 +22,126 @@ tf_echo(LogMessage *msg, gint argc, GString *argv[], GString *result) TEMPLATE_FUNCTION_SIMPLE(tf_echo); +static gboolean +tf_substr_parse_int(const gchar *s, long *d) +{ + gchar *endptr; + glong val; + + errno = 0; + val = strtoll(s, &endptr, 10); + + if ((errno == ERANGE && (val == LONG_MAX || val == LONG_MIN)) + || (errno != 0 && val == 0)) + return FALSE; + + if (endptr == s || *endptr != '\0') + return FALSE; + + *d = val; + return TRUE; +} + +static void +tf_substr(LogMessage *msg, gint argc, GString *argv[], GString *result) +{ + glong start, len; + + /* + * We need to cast argv[0]->len, which is gsize (so unsigned) type, to a + * signed type, so we can compare negative offsets/lengths with string + * length. But if argv[0]->len is bigger than G_MAXLONG, this can lead to + * completely wrong calculations, so we'll just return nothing (alternative + * would be to return original string and perhaps print an error...) + */ + if (argv[0]->len >= G_MAXLONG) { + msg_error("$(substr) error: string is too long", NULL); + return; + } + + /* check number of arguments */ + if (argc < 2 || argc > 3) + return; + + /* get offset position from second argument */ + if (!tf_substr_parse_int(argv[1]->str, &start)) { + msg_error("$(substr) parsing failed, start could not be parsed", evt_tag_str("start", argv[1]->str), NULL); + return; + } + + /* if we were called with >2 arguments, third was desired length */ + if (argc > 2) { + if (!tf_substr_parse_int(argv[2]->str, &len)) { + msg_error("$(substr) parsing failed, length could not be parsed", evt_tag_str("length", argv[2]->str), NULL); + return; + } + } else + len = (glong)argv[0]->len; + + /* + * if requested substring length is negative and beyond string start, do nothing; + * also if it is greater than string length, limit it to string length + */ + if (len < 0 && -(len) > (glong)argv[0]->len) + return; + else if (len > (glong)argv[0]->len) + len = (glong)argv[0]->len; + + /* + * if requested offset is beyond string length, do nothing; + * also, if offset is negative and "before" string beginning, do nothing + * (or should we perhaps align start with the beginning instead?) + */ + if (start >= (glong)argv[0]->len) + return; + else if (start < 0 && -(start) > (glong)argv[0]->len) + return; + + /* with negative length, see if we don't end up with start > end */ + if (len < 0 && ((start < 0 && start > len) || + (start >= 0 && (len + ((glong)argv[0]->len) - start) < 0))) + return; + + /* if requested offset is negative, move start it accordingly */ + if (start < 0) { + start = start + (glong)argv[0]->len; + /* + * this shouldn't actually happen, as earlier we tested for + * (start < 0 && -start > argv0len), but better safe than sorry + */ + if (start < 0) + start = 0; + } + + /* + * if requested length is negative, "resize" len to include exactly as many + * characters as needed from the end of the string, given our start position. + * (start is always non-negative here already) + */ + if (len < 0) { + len = ((glong)argv[0]->len) - start + len; + /* this also shouldn't happen, but - again - better safe than sorry */ + if (len < 0) + return; + } + + /* if we're beyond string end, do nothing */ + if (start >= (glong)argv[0]->len) + return; + + /* if we want something beyond string end, do it only till the end */ + if (start + len > (glong)argv[0]->len) + len = ((glong)argv[0]->len) - start; + + /* skip g_string_append_len if we ended up having to extract 0 chars */ + if (len == 0) + return; + + g_string_append_len(result, argv[0]->str + start, len); +} + +TEMPLATE_FUNCTION_SIMPLE(tf_substr); + typedef struct _TFCondState { FilterExprNode *filter; @@ -158,6 +281,7 @@ static Plugin basicfuncs_plugins[] = TEMPLATE_FUNCTION_PLUGIN(tf_echo, "echo"), TEMPLATE_FUNCTION_PLUGIN(tf_grep, "grep"), TEMPLATE_FUNCTION_PLUGIN(tf_if, "if"), + TEMPLATE_FUNCTION_PLUGIN(tf_substr, "substr"), }; gboolean -- 1.7.2.5
On Sun, 2011-05-01 at 10:12 +0200, Gergely Nagy wrote:
This is an implementation of a $(substr <string> <start> [<len>]) template function, where the first argument is the string we want to extract a part from, the second is the starting offset, and the third, optional is the length of the substring we want to extract.
If no length is specified, then the substring will be extracted from the desired offset until the end of the original string. Offset 0 means the start of the string.
Negative offsets/lengths works in a manner similar to other implementations:
- negative offset means "start that many characters counting from the end of the string", with -1 being the last character - negative length means "return substring from desired start up to that many characters from the end of the string"
Signed-off-by: Jakub Jankowski <shasta@toxcorp.com> Signed-off-by: Gergely Nagy <algernon@balabit.hu>
Applied, thanks. I've added some unit testcases & they all passed flawlessly. -- Bazsi
participants (3)
-
Balazs Scheidler
-
Gergely Nagy
-
Jakub Jankowski