[syslog-ng] [PATCH] basicfuncs: Implement a $(substr STRING START LEN) template function.

Jakub Jankowski shasta at toxcorp.com
Wed Apr 13 15:47:02 CEST 2011


Tuesday 12 of April 2011 10:50:05 Jakub Jankowski napisał(a):
> Saturday 09 of April 2011 13:05:38 Gergely Nagy napisał(a):
> > This implements the $(substr) template function - a fairly primitive
> > one: it does error checking, but there's no error reporting present,
> > therefore invalid function calls will just get ignored.
>
> Thanks! I'll be testing this later this week and report back issues (if
> any).

This is what I came up with, based on your patch:
 - it changes arguments order to $(substr <string> <offsetstart> [length]) - so
   string is first
 - offset = 0 means beginning of the string (and not 1)
 - called without third (length) argument returns substring from desired start 
   offset to the end of the string
 - works with negative offsets/lengths in a manner similar to other implementations:
   - negative offset means "start that many characters counting from the 
     end of the string", with -1 being the last character
   - negative length means "return substring from desired start up to that many 
     characters from the end of the string"
 - adds some msg_error()s on argument parsing failures

I'm not entirely sure if there are no off-by-one (or even off-by-many :) errors,
I'd be grateful if someone could check that. I also had issues comparing negative
glongs (signed) with GString's structure len field, which is gsize (so unsigned),
hence that ugly type change at the beginning. Can anyone point me to a *right*
direction there?

There are some comments inline to clarify my intentions. Constructive criticism
welcome. :)

This is against 3.2.2, but should apply against 3.3 HEAD, I think.

Signed-off-by: Jakub Jankowski <shasta at toxcorp.com>

==============
--- syslog-ng-3.2.2/modules/basicfuncs/basic-funcs.c
+++ syslog-ng-3.2.2/modules/basicfuncs/basic-funcs.c
@@ -4,6 +4,9 @@
 #include "filter-expr-parser.h"
 #include "cfg.h"
 
+#include <stdlib.h>
+#include <errno.h>
+
 static void
 tf_echo(LogMessage *msg, gint argc, GString *argv[], GString *result)
 {
@@ -19,6 +22,129 @@
 
 TEMPLATE_FUNCTION_SIMPLE(tf_echo);
 
+static gboolean
+tf_substr_parse_int(const gchar *s, long *d)
+{
+  gchar *endptr;
+  glong val;
+
+  errno = 0;
+  val = strtoll(s, &endptr, 10);
+
+  if ((errno == ERANGE && (val == LONG_MAX || val == LONG_MIN))
+      || (errno != 0 && val == 0))
+    return FALSE;
+
+  if (endptr == s || *endptr != '\0')
+    return FALSE;
+
+  *d = val;
+  return TRUE;
+}
+
+static void
+tf_substr(LogMessage *msg, gint argc, GString *argv[], GString *result)
+{
+  glong start, len;
+  glong argv0len;
+
+  /*
+   * We need to cast argv[0]->len, which is gsize (so unsigned) type, to a
+   * signed type, so we can compare negative offsets/lengths with string
+   * length. But if argv[0]->len is bigger than G_MAXLONG, this can lead to
+   * completely wrong calculations, so we'll just return nothing (alternative
+   * would be to return original string and perhaps print an error...)
+   */
+  if (argv[0]->len >= G_MAXLONG) {
+    /* msg_error("$(substr) error: string length is too big to fit into glong type", NULL); */
+    return;
+  }
+  /* K&R probably hate me */
+  argv0len = argv[0]->len;
+
+  /* check number of arguments */
+  if (argc < 2 || argc > 3)
+    return;
+
+  /* get offset position from second argument */
+  if (!tf_substr_parse_int(argv[1]->str, &start)) {
+    msg_error("$(substr) parsing failed, start could not be parsed", evt_tag_str("start", argv[1]->str), NULL);
+    return;
+  }
+
+  /* if we were called with >2 arguments, third was desired length */
+  if (argc > 2) {
+    if (!tf_substr_parse_int(argv[2]->str, &len)) {
+      msg_error("$(substr) parsing failed, length could not be parsed", evt_tag_str("length", argv[2]->str), NULL);
+      return;
+    }
+  } else
+    len = argv0len;
+
+  /* 
+   * if requested substring length is negative and beyond string start, do nothing;
+   * also if it is greater than string length, limit it to string length
+   */
+  if (len < 0 && -(len) > argv0len)
+    return;
+  else if (len > argv0len)
+    len = argv0len;
+
+  /*
+   * if requested offset is beyond string length, do nothing;
+   * also, if offset is negative and "before" string beginning, do nothing 
+   * (or should we perhaps align start with the beginning instead?)
+   */
+  if (start >= argv0len)
+    return;
+  else if (start < 0 && -(start) > argv0len)
+    return;
+
+  /* with negative length, see if we don't end up with start > end */
+  if (len < 0 && ((start < 0 && start > len) || 
+      (start >= 0 && (len + argv0len - start) < 0)))
+    return;
+
+  /* if requested offset is negative, move start it accordingly */
+  if (start < 0) {
+    start = argv0len + start;
+    /*
+     * this shouldn't actually happen, as earlier we tested for 
+     * (start < 0 && -start > argv0len), but better safe than sorry
+     */
+    if (start < 0)
+      start = 0;
+  }
+
+  /*
+   * if requested length is negative, "resize" len to include exactly as many
+   * characters as needed from the end of the string, given our start position.
+   * (start is always non-negative here already)
+   */
+  if (len < 0) {
+    len = argv0len - start + len;
+    /* this also shouldn't happen, but - again - better safe than sorry */
+    if (len < 0)
+      return;
+  }
+
+  /* if we're beyond string end, do nothing */
+  if (start >= argv0len)
+    return;
+
+  /* if we want something beyond string end, do it only till the end */
+  if (start + len > argv0len)
+    len = argv0len - start;
+
+  /* skip g_string_append_len if we ended up having to extract 0 chars */
+  if (len == 0)
+    return;
+
+  g_string_append_len(result, argv[0]->str + start, len);
+}
+
+TEMPLATE_FUNCTION_SIMPLE(tf_substr);
+
 typedef struct _TFCondState
 {
   FilterExprNode *filter;
@@ -157,6 +283,7 @@
   TEMPLATE_FUNCTION_PLUGIN(tf_echo, "echo"),
   TEMPLATE_FUNCTION_PLUGIN(tf_grep, "grep"),
   TEMPLATE_FUNCTION_PLUGIN(tf_if, "if"),
+  TEMPLATE_FUNCTION_PLUGIN(tf_substr, "substr"),
 };
 
 gboolean
==============


Regards,

-- 
Jakub Jankowski|shasta at toxcorp.com|http://toxcorp.com/
GPG: FCBF F03D 9ADB B768 8B92 BB52 0341 9037 A875 942D


More information about the syslog-ng mailing list