Following this mail, a bigger patch will be sent, that implements a JSON parser - a fairly simple one, for now. I did not want to explain every little detail in the commit message, so this umbrella mail is sent first, with a few details that cannot be found in the commit message (there will be some overlap, though). This patch (and any further developments I make in the future, until the parser is merged) will be available on the feature/3.3/json/parser branch of my new and shiny git tree[1], as soon as I push it out. [1]: git://github.com/algernon/syslog-ng.git First of all, here's a little code snippet that shows how to use the module: ,----[ syslog-ng.conf ] | @module tfjson | @module jsonparser | | source s_json { tcp(port(12345) flags(no-parse)); }; | destination d_json { | file("/var/log/messages.json" | template("$(format-json --scope dot-nv-pairs)\n")); | }; | parser p_json { json-parser(prefix(".json.")); }; | log { source(s_json); parser(p_json); destination (d_json); }; `---- With this config, we can send logs to this source as follows: ,----[ shell ] | $ echo '{"string": "this is a string!\\n\\nmultiline!", "int": 42, "double": 3.14 }' | nc localhost 12345 `---- And 'lo and behold, the destination looks like this: ,----[ /var/log/messages.json ] | {".json.string":"this is a string!\n\nmultiline!",".json.int":"42",".json.double":"3.140000"} `---- Since syslog-ng stores every value as a string, they will all be converted to strings. At some point in the future, when we can also store types, this information will be transmitted through, and format-json will preserve the types. For now, the parser has only one optional parameter: prefix(). I think you can figure out what it does. I think this is pretty darn cool as it is! Now, lets get down to the gory implementation details, as I'm sure you're all very interested in that! At the core, we have a LogJSONParser structure, which holds the configuration (hi, gchar *prefix!), and two GStrings, that are used over and over again during processing. The reason these two GStrings (serialized.key and serialized.value) are there, is to have as few memory allocations as possible: so we allocate the strings once, and reuse them over and over again. We also have a JSON tokener here, for similar reasons. Parsing itself is delegated to json-c, and all the parser does is iterate over the parsed object's keys, coerce them into strings, prepare a key (based on prefix and the parsed key), and store them in a LogMessage. Then we're done. Yes, really. The rest is boilerplate. Of course, life ain't that simple. For now, there's a bug where the JSON can't be parsed, we honour that with a tasty segmentation fault. This will be fixed at a later time.
The new parser supports json-c only, it does not work with json-glib. For this reason, the configure script was modified a little, so that it uses --with-json only. When json-c is found or selected, then both the parser and the template function will be built. If json-glib is found or selected, then only the template function will be available. The summary after configure now lists the determined settings. The new parser is - so far - fairly simple: it parses a JSON with json-c (int, double and string values are supported so far, booleans, arrays and sub-objects aren't yet), and sets the appropriate name-value pair in the LogMessage. Users can configure a prefix to prepend before each key. Usage is as simple as: source s_json { tcp(port(21514) flags(no-parse)); }; destination d_json { file("/tmp/test.json" template("$(format-json --scope dot-nv-pairs)\n")); }; parser p_json { json-parser (prefix(".json.")); }; log { source(s_json); parser(p_json); destination(d_json); }; Signed-off-by: Gergely Nagy <algernon@balabit.hu> --- configure.in | 34 ++++--- modules/Makefile.am | 2 +- modules/jsonparser/Makefile.am | 22 ++++ modules/jsonparser/jsonparser-grammar.ym | 86 ++++++++++++++++ modules/jsonparser/jsonparser-parser.c | 49 +++++++++ modules/jsonparser/jsonparser-parser.h | 34 +++++++ modules/jsonparser/jsonparser-plugin.c | 53 ++++++++++ modules/jsonparser/jsonparser.c | 158 ++++++++++++++++++++++++++++++ modules/jsonparser/jsonparser.h | 33 ++++++ modules/tfjson/Makefile.am | 2 +- 10 files changed, 455 insertions(+), 18 deletions(-) create mode 100644 modules/jsonparser/Makefile.am create mode 100644 modules/jsonparser/jsonparser-grammar.ym create mode 100644 modules/jsonparser/jsonparser-parser.c create mode 100644 modules/jsonparser/jsonparser-parser.h create mode 100644 modules/jsonparser/jsonparser-plugin.c create mode 100644 modules/jsonparser/jsonparser.c create mode 100644 modules/jsonparser/jsonparser.h diff --git a/configure.in b/configure.in index 0d97bde..e9b0148 100644 --- a/configure.in +++ b/configure.in @@ -178,10 +178,6 @@ AC_ARG_WITH(json, Use the JSON implementation specified] ,,with_json="auto") -AC_ARG_ENABLE(json, - [ --enable-json Enable support for JSON template formatting (default: auto)] - ,,enable_json="auto") - AC_ARG_ENABLE(systemd, [ --enable-systemd Enable systemd support (default: auto)] ,,enable_systemd="auto") @@ -607,8 +603,12 @@ fi dnl *************************************************************************** dnl json headers/libraries dnl *************************************************************************** -PKG_CHECK_MODULES(JSON_C, json >= $JSON_C_MIN_VERSION,, JSON_C_LIBS="") -PKG_CHECK_MODULES(JSON_GLIB, json-glib-1.0 >= $JSON_GLIB_MIN_VERSION,, JSON_GLIB_LIBS="") +if test "x$with_json" = "xauto" || test "x$with_json" = "json-c"; then + PKG_CHECK_MODULES(JSON_C, json >= $JSON_C_MIN_VERSION,, JSON_C_LIBS="") +fi +if test "x$with_json" = "xauto" || test "x$with_json" = "json-glib"; then + PKG_CHECK_MODULES(JSON_GLIB, json-glib-1.0 >= $JSON_GLIB_MIN_VERSION,, JSON_GLIB_LIBS="") +fi dnl *************************************************************************** dnl pcre headers/libraries @@ -887,14 +887,14 @@ elif test "x$with_json" = "xjson-c"; then AC_MSG_ERROR([Cannot find json-c version >= $JSON_C_MIN_VERSION: is pkg-config in path?]) fi -if test "x$enable_json" = "xauto"; then - if test "x$with_json" = "xno"; then - enable_json="no" - else - enable_json="yes" - fi -elif test "x$enable_json" = "xyes" -a "x$with_json" = "xno"; then - AC_MSG_ERROR([Cannot find json-c version >= $JSON_C_MIN_VERSION or json-glib-1.0 >= $JSON_GLIB_MIN_VERSION: is pkg-config in path?]) +if test "x$with_json" = "xjson-c"; then + enable_json_parse="yes" + enable_json_format="yes" +fi + +if test "x$with_json" = "xjson-glib"; then + enable_json_parse="no" + enable_json_format="yes" fi if test "x$enable_systemd" = "xauto"; then @@ -1077,7 +1077,8 @@ AM_CONDITIONAL(ENABLE_SQL, [test "$enable_sql" = "yes"]) AM_CONDITIONAL(ENABLE_SUN_STREAMS, [test "$enable_sun_streams" = "yes"]) AM_CONDITIONAL(ENABLE_PACCT, [test "$enable_pacct" = "yes"]) AM_CONDITIONAL(ENABLE_MONGODB, [test "$enable_mongodb" = "yes"]) -AM_CONDITIONAL(ENABLE_JSON, [test "$enable_json" = "yes"]) +AM_CONDITIONAL(ENABLE_JSON_FORMAT, [test "$enable_json_format" = "yes"]) +AM_CONDITIONAL(ENABLE_JSON_PARSE, [test "$enable_json_parse" = "yes"]) AM_CONDITIONAL(WITH_LIBSYSTEMD, [test "$with_libsystemd" = "yes"]) # substitution into manual pages @@ -1145,6 +1146,7 @@ AC_OUTPUT(dist.conf modules/basicfuncs/Makefile modules/convertfuncs/Makefile modules/tfjson/Makefile + modules/jsonparser/Makefile scripts/Makefile scripts/update-patterndb doc/Makefile @@ -1194,6 +1196,6 @@ echo " SSL support (module) : ${enable_ssl:=no}" echo " SQL support (module) : ${enable_sql:=no}" echo " PACCT module (EXPERIMENTAL) : ${enable_pacct:=no}" echo " MongoDB destination (module): ${enable_mongodb:=no}" -echo " JSON support (module) : ${enable_json:=no} (using ${with_json})" +echo " JSON support (module) : parser=${enable_json_parse:=no}, formatter=${enable_json_format:=no} (using ${with_json})" diff --git a/modules/Makefile.am b/modules/Makefile.am index 46de2b8..9f67920 100644 --- a/modules/Makefile.am +++ b/modules/Makefile.am @@ -1 +1 @@ -SUBDIRS = afsocket afsql afstreams affile afprog afuser afmongodb csvparser confgen syslogformat pacctformat basicfuncs convertfuncs dbparser tfjson dummy +SUBDIRS = afsocket afsql afstreams affile afprog afuser afmongodb csvparser confgen syslogformat pacctformat basicfuncs convertfuncs dbparser tfjson jsonparser dummy diff --git a/modules/jsonparser/Makefile.am b/modules/jsonparser/Makefile.am new file mode 100644 index 0000000..0925f98 --- /dev/null +++ b/modules/jsonparser/Makefile.am @@ -0,0 +1,22 @@ +moduledir = @moduledir@ +AM_CPPFLAGS = -I$(top_srcdir)/lib -I../../lib +export top_srcdir + +if ENABLE_JSON_PARSE +module_LTLIBRARIES := libjsonparser.la +libjsonparser_la_SOURCES = \ + jsonparser.c jsonparser.h \ + jsonparser-grammar.y \ + jsonparser-parser.c jsonparser-parser.h \ + jsonparser-plugin.c + +libjsonparser_la_CPPFLAGS = $(AM_CPPFLAGS) +libjsonparser_la_CFLAGS = $(JSON_CFLAGS) +libjsonparser_la_LIBADD = $(MODULE_DEPS_LIBS) +libjsonparser_la_LDFLAGS = $(MODULE_LDFLAGS) $(JSON_LIBS) + +BUILT_SOURCES = jsonparser-grammar.y jsonparser-grammar.c jsonparser-grammar.h +EXTRA_DIST = $(BUILT_SOURCES) jsonparser-grammar.ym +endif + +include $(top_srcdir)/build/lex-rules.am diff --git a/modules/jsonparser/jsonparser-grammar.ym b/modules/jsonparser/jsonparser-grammar.ym new file mode 100644 index 0000000..2fe2f3c --- /dev/null +++ b/modules/jsonparser/jsonparser-grammar.ym @@ -0,0 +1,86 @@ +/* + * Copyright (c) 2011 BalaBit IT Ltd, Budapest, Hungary + * Copyright (c) 2011 Gergely Nagy <algernon@balabit.hu> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + * + * As an additional exemption you are allowed to compile & link against the + * OpenSSL libraries as published by the OpenSSL project. See the file + * COPYING for details. + */ + +%code top { +#include "jsonparser-parser.h" + +} + + +%code { + +#include "jsonparser.h" +#include "cfg-parser.h" +#include "jsonparser-grammar.h" +#include "syslog-names.h" +#include "messages.h" + +extern LogParser *last_parser; + +} + +%name-prefix "jsonparser_" + +/* this parameter is needed in order to instruct bison to use a complete + * argument list for yylex/yyerror */ + +%lex-param {CfgLexer *lexer} +%parse-param {CfgLexer *lexer} +%parse-param {LogParser **instance} +%parse-param {gpointer arg} + +/* INCLUDE_DECLS */ + +%token KW_JSON_PARSER +%token KW_PREFIX + +%type <ptr> parser_expr_json + +%% + +start + : LL_CONTEXT_PARSER parser_expr_json { YYACCEPT; } + ; + + +parser_expr_json + : KW_JSON_PARSER '(' + { + last_parser = *instance = (LogParser *) log_json_parser_new(); + } + parser_json_opts + ')' { $$ = last_parser; } + ; + +parser_json_opts + : parser_json_opt parser_json_opts + | + ; + +parser_json_opt + : KW_PREFIX '(' string ')' { log_json_parser_set_prefix(last_parser, $3); free ($3); } + | + ; + +/* INCLUDE_RULES */ + +%% diff --git a/modules/jsonparser/jsonparser-parser.c b/modules/jsonparser/jsonparser-parser.c new file mode 100644 index 0000000..a7fe1a6 --- /dev/null +++ b/modules/jsonparser/jsonparser-parser.c @@ -0,0 +1,49 @@ +/* + * Copyright (c) 2011 BalaBit IT Ltd, Budapest, Hungary + * Copyright (c) 2011 Gergely Nagy <algernon@balabit.hu> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + * + * As an additional exemption you are allowed to compile & link against the + * OpenSSL libraries as published by the OpenSSL project. See the file + * COPYING for details. + */ + +#include "jsonparser.h" +#include "cfg-parser.h" +#include "jsonparser-grammar.h" + +extern int jsonparser_debug; + +int jsonparser_parse(CfgLexer *lexer, LogParser **instance, gpointer arg); + +static CfgLexerKeyword jsonparser_keywords[] = +{ + { "json_parser", KW_JSON_PARSER, }, + { "prefix", KW_PREFIX, }, + { NULL } +}; + +CfgParser jsonparser_parser = +{ +#if ENABLE_DEBUG + .debug_flag = &jsonparser_debug, +#endif + .name = "jsonparser", + .keywords = jsonparser_keywords, + .parse = (gint (*)(CfgLexer *, gpointer *, gpointer)) jsonparser_parse, + .cleanup = (void (*)(gpointer)) log_pipe_unref, +}; + +CFG_PARSER_IMPLEMENT_LEXER_BINDING(jsonparser_, LogParser **) diff --git a/modules/jsonparser/jsonparser-parser.h b/modules/jsonparser/jsonparser-parser.h new file mode 100644 index 0000000..22ceb84 --- /dev/null +++ b/modules/jsonparser/jsonparser-parser.h @@ -0,0 +1,34 @@ +/* + * Copyright (c) 2011 BalaBit IT Ltd, Budapest, Hungary + * Copyright (c) 2011 Gergely Nagy <algernon@balabit.hu> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + * + * As an additional exemption you are allowed to compile & link against the + * OpenSSL libraries as published by the OpenSSL project. See the file + * COPYING for details. + */ + +#ifndef JSONPARSER_PARSER_H_INCLUDED +#define JSONPARSER_PARSER_H_INCLUDED + +#include "cfg-parser.h" +#include "cfg-lexer.h" +#include "logparser.h" + +extern CfgParser jsonparser_parser; + +CFG_PARSER_DECLARE_LEXER_BINDING(jsonparser_, LogParser **) + +#endif diff --git a/modules/jsonparser/jsonparser-plugin.c b/modules/jsonparser/jsonparser-plugin.c new file mode 100644 index 0000000..15c4c91 --- /dev/null +++ b/modules/jsonparser/jsonparser-plugin.c @@ -0,0 +1,53 @@ +/* + * Copyright (c) 2011 BalaBit IT Ltd, Budapest, Hungary + * Copyright (c) 2011 Gergely Nagy <algernon@balabit.hu> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + * + * As an additional exemption you are allowed to compile & link against the + * OpenSSL libraries as published by the OpenSSL project. See the file + * COPYING for details. + */ + +#include "cfg-parser.h" +#include "plugin.h" +#include "jsonparser.h" + +extern CfgParser jsonparser_parser; + +static Plugin jsonparser_plugins[] = +{ + { + .type = LL_CONTEXT_PARSER, + .name = "json-parser", + .parser = &jsonparser_parser, + }, +}; + +gboolean +jsonparser_module_init(GlobalConfig *cfg, CfgArgs *args) +{ + plugin_register(cfg, jsonparser_plugins, G_N_ELEMENTS(jsonparser_plugins)); + return TRUE; +} + +const ModuleInfo module_info = +{ + .canonical_name = "jsonparser", + .version = VERSION, + .description = "The jsonparser module provides JSON parsing support for syslog-ng.", + .core_revision = SOURCE_REVISION, + .plugins = jsonparser_plugins, + .plugins_len = G_N_ELEMENTS(jsonparser_plugins), +}; diff --git a/modules/jsonparser/jsonparser.c b/modules/jsonparser/jsonparser.c new file mode 100644 index 0000000..f42766d --- /dev/null +++ b/modules/jsonparser/jsonparser.c @@ -0,0 +1,158 @@ +/* + * Copyright (c) 2011 BalaBit IT Ltd, Budapest, Hungary + * Copyright (c) 2011 Gergely Nagy <algernon@balabit.hu> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + * + * As an additional exemption you are allowed to compile & link against the + * OpenSSL libraries as published by the OpenSSL project. See the file + * COPYING for details. + */ + +#include "jsonparser.h" +#include "logparser.h" + +#include <string.h> + +#include <json.h> +#include <json_object_private.h> + +struct _LogJSONParser +{ + LogParser super; + gchar *prefix; + + json_tokener *tokener; + struct + { + GString *key; + GString *value; + } serialized; +}; + +void +log_json_parser_set_prefix (LogParser *p, const gchar *prefix) +{ + LogJSONParser *self = (LogJSONParser *)p; + + g_free (self->prefix); + self->prefix = g_strdup (prefix); +} + +static gboolean +log_json_parser_process (LogParser *s, LogMessage *msg, const gchar *input) +{ + LogJSONParser *self = (LogJSONParser *) s; + struct json_object *jso; + struct json_object_iter itr; + + json_tokener_reset (self->tokener); + jso = json_tokener_parse_ex (self->tokener, input, strlen (input)); + + json_object_object_foreachC (jso, itr) + { + gboolean parsed = FALSE; + + switch (json_object_get_type (itr.val)) + { + case json_type_boolean: + msg_info ("JSON parser does not support boolean types yet, skipping", + evt_tag_str ("key", itr.key), NULL); + break; + case json_type_double: + parsed = TRUE; + g_string_printf (self->serialized.value, "%f", + json_object_get_double (itr.val)); + break; + case json_type_int: + parsed = TRUE; + g_string_printf (self->serialized.value, "%i", + json_object_get_int (itr.val)); + break; + case json_type_string: + parsed = TRUE; + g_string_assign (self->serialized.value, + json_object_get_string (itr.val)); + break; + case json_type_object: + case json_type_array: + msg_error ("JSON parser does not support objects and arrays yet, " + "skipping", + evt_tag_str ("key", itr.key), NULL); + break; + default: + msg_error ("JSON parser encountered an unknown type, skipping", + evt_tag_str ("key", itr.key), NULL); + break; + } + + if (parsed) + { + if (self->prefix) + g_string_printf (self->serialized.key, "%s%s", + self->prefix, itr.key); + else + g_string_assign (self->serialized.key, itr.key); + log_msg_set_value (msg, + log_msg_get_value_handle (self->serialized.key->str), + self->serialized.value->str, + self->serialized.value->len); + } + } + json_object_put (jso); + + return TRUE; +} + +static LogPipe * +log_json_parser_clone (LogProcessPipe *s) +{ + LogJSONParser *self = (LogJSONParser *) s; + LogJSONParser *cloned; + + cloned = (LogJSONParser *) log_json_parser_new (); + log_json_parser_set_prefix ((LogParser *)cloned, self->prefix); + + return &cloned->super.super.super; +} + +static void +log_json_parser_free (LogPipe *s) +{ + LogJSONParser *self = (LogJSONParser *)s; + + g_free (self->prefix); + g_string_free (self->serialized.key, TRUE); + g_string_free (self->serialized.value, TRUE); + + json_tokener_free (self->tokener); + log_parser_free_method (s); +} + +LogJSONParser * +log_json_parser_new (void) +{ + LogJSONParser *self = g_new0 (LogJSONParser, 1); + + log_parser_init_instance (&self->super); + self->super.super.super.free_fn = log_json_parser_free; + self->super.super.clone = log_json_parser_clone; + self->super.process = log_json_parser_process; + + self->tokener = json_tokener_new (); + self->serialized.key = g_string_new (NULL); + self->serialized.value = g_string_new (NULL); + + return self; +} diff --git a/modules/jsonparser/jsonparser.h b/modules/jsonparser/jsonparser.h new file mode 100644 index 0000000..3f6796a --- /dev/null +++ b/modules/jsonparser/jsonparser.h @@ -0,0 +1,33 @@ +/* + * Copyright (c) 2011 BalaBit IT Ltd, Budapest, Hungary + * Copyright (c) 2011 Gergely Nagy <algernon@balabit.hu> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + * + * As an additional exemption you are allowed to compile & link against the + * OpenSSL libraries as published by the OpenSSL project. See the file + * COPYING for details. + */ + +#ifndef JSONPARSER_H_INCLUDED +#define JSONPARSER_H_INCLUDED + +#include "logparser.h" + +typedef struct _LogJSONParser LogJSONParser; + +void log_json_parser_set_prefix(LogParser *p, const gchar *prefix); +LogJSONParser *log_json_parser_new(void); + +#endif diff --git a/modules/tfjson/Makefile.am b/modules/tfjson/Makefile.am index 5d80564..2387366 100644 --- a/modules/tfjson/Makefile.am +++ b/modules/tfjson/Makefile.am @@ -1,7 +1,7 @@ moduledir = @moduledir@ export top_srcdir -if ENABLE_JSON +if ENABLE_JSON_FORMAT AM_CPPFLAGS = -I$(top_srcdir)/lib -I../../lib $(JSON_CFLAGS) module_LTLIBRARIES = libtfjson.la -- 1.7.7.1
This is cool stuff, but again doesn't compile with a stock json-c. Hmm.. this time I've checked, newer json-c versions do install the referenced headers. Then I guess it's it needs an updated version number in configure.in. On Fri, 2011-10-28 at 16:04 +0200, Gergely Nagy wrote:
The new parser supports json-c only, it does not work with json-glib. For this reason, the configure script was modified a little, so that it uses --with-json only. When json-c is found or selected, then both the parser and the template function will be built. If json-glib is found or selected, then only the template function will be available.
The summary after configure now lists the determined settings.
The new parser is - so far - fairly simple: it parses a JSON with json-c (int, double and string values are supported so far, booleans, arrays and sub-objects aren't yet), and sets the appropriate name-value pair in the LogMessage.
Users can configure a prefix to prepend before each key.
Usage is as simple as:
source s_json { tcp(port(21514) flags(no-parse)); }; destination d_json { file("/tmp/test.json" template("$(format-json --scope dot-nv-pairs)\n")); }; parser p_json { json-parser (prefix(".json.")); }; log { source(s_json); parser(p_json); destination(d_json); };
Signed-off-by: Gergely Nagy <algernon@balabit.hu> --- configure.in | 34 ++++--- modules/Makefile.am | 2 +- modules/jsonparser/Makefile.am | 22 ++++ modules/jsonparser/jsonparser-grammar.ym | 86 ++++++++++++++++ modules/jsonparser/jsonparser-parser.c | 49 +++++++++ modules/jsonparser/jsonparser-parser.h | 34 +++++++ modules/jsonparser/jsonparser-plugin.c | 53 ++++++++++ modules/jsonparser/jsonparser.c | 158 ++++++++++++++++++++++++++++++ modules/jsonparser/jsonparser.h | 33 ++++++ modules/tfjson/Makefile.am | 2 +- 10 files changed, 455 insertions(+), 18 deletions(-) create mode 100644 modules/jsonparser/Makefile.am create mode 100644 modules/jsonparser/jsonparser-grammar.ym create mode 100644 modules/jsonparser/jsonparser-parser.c create mode 100644 modules/jsonparser/jsonparser-parser.h create mode 100644 modules/jsonparser/jsonparser-plugin.c create mode 100644 modules/jsonparser/jsonparser.c create mode 100644 modules/jsonparser/jsonparser.h
diff --git a/configure.in b/configure.in index 0d97bde..e9b0148 100644 --- a/configure.in +++ b/configure.in @@ -178,10 +178,6 @@ AC_ARG_WITH(json, Use the JSON implementation specified] ,,with_json="auto")
-AC_ARG_ENABLE(json, - [ --enable-json Enable support for JSON template formatting (default: auto)] - ,,enable_json="auto") - AC_ARG_ENABLE(systemd, [ --enable-systemd Enable systemd support (default: auto)] ,,enable_systemd="auto") @@ -607,8 +603,12 @@ fi dnl *************************************************************************** dnl json headers/libraries dnl *************************************************************************** -PKG_CHECK_MODULES(JSON_C, json >= $JSON_C_MIN_VERSION,, JSON_C_LIBS="") -PKG_CHECK_MODULES(JSON_GLIB, json-glib-1.0 >= $JSON_GLIB_MIN_VERSION,, JSON_GLIB_LIBS="") +if test "x$with_json" = "xauto" || test "x$with_json" = "json-c"; then + PKG_CHECK_MODULES(JSON_C, json >= $JSON_C_MIN_VERSION,, JSON_C_LIBS="") +fi +if test "x$with_json" = "xauto" || test "x$with_json" = "json-glib"; then + PKG_CHECK_MODULES(JSON_GLIB, json-glib-1.0 >= $JSON_GLIB_MIN_VERSION,, JSON_GLIB_LIBS="") +fi
dnl *************************************************************************** dnl pcre headers/libraries @@ -887,14 +887,14 @@ elif test "x$with_json" = "xjson-c"; then AC_MSG_ERROR([Cannot find json-c version >= $JSON_C_MIN_VERSION: is pkg-config in path?]) fi
-if test "x$enable_json" = "xauto"; then - if test "x$with_json" = "xno"; then - enable_json="no" - else - enable_json="yes" - fi -elif test "x$enable_json" = "xyes" -a "x$with_json" = "xno"; then - AC_MSG_ERROR([Cannot find json-c version >= $JSON_C_MIN_VERSION or json-glib-1.0 >= $JSON_GLIB_MIN_VERSION: is pkg-config in path?]) +if test "x$with_json" = "xjson-c"; then + enable_json_parse="yes" + enable_json_format="yes" +fi + +if test "x$with_json" = "xjson-glib"; then + enable_json_parse="no" + enable_json_format="yes" fi
if test "x$enable_systemd" = "xauto"; then @@ -1077,7 +1077,8 @@ AM_CONDITIONAL(ENABLE_SQL, [test "$enable_sql" = "yes"]) AM_CONDITIONAL(ENABLE_SUN_STREAMS, [test "$enable_sun_streams" = "yes"]) AM_CONDITIONAL(ENABLE_PACCT, [test "$enable_pacct" = "yes"]) AM_CONDITIONAL(ENABLE_MONGODB, [test "$enable_mongodb" = "yes"]) -AM_CONDITIONAL(ENABLE_JSON, [test "$enable_json" = "yes"]) +AM_CONDITIONAL(ENABLE_JSON_FORMAT, [test "$enable_json_format" = "yes"]) +AM_CONDITIONAL(ENABLE_JSON_PARSE, [test "$enable_json_parse" = "yes"]) AM_CONDITIONAL(WITH_LIBSYSTEMD, [test "$with_libsystemd" = "yes"])
# substitution into manual pages @@ -1145,6 +1146,7 @@ AC_OUTPUT(dist.conf modules/basicfuncs/Makefile modules/convertfuncs/Makefile modules/tfjson/Makefile + modules/jsonparser/Makefile scripts/Makefile scripts/update-patterndb doc/Makefile @@ -1194,6 +1196,6 @@ echo " SSL support (module) : ${enable_ssl:=no}" echo " SQL support (module) : ${enable_sql:=no}" echo " PACCT module (EXPERIMENTAL) : ${enable_pacct:=no}" echo " MongoDB destination (module): ${enable_mongodb:=no}" -echo " JSON support (module) : ${enable_json:=no} (using ${with_json})" +echo " JSON support (module) : parser=${enable_json_parse:=no}, formatter=${enable_json_format:=no} (using ${with_json})"
diff --git a/modules/Makefile.am b/modules/Makefile.am index 46de2b8..9f67920 100644 --- a/modules/Makefile.am +++ b/modules/Makefile.am @@ -1 +1 @@ -SUBDIRS = afsocket afsql afstreams affile afprog afuser afmongodb csvparser confgen syslogformat pacctformat basicfuncs convertfuncs dbparser tfjson dummy +SUBDIRS = afsocket afsql afstreams affile afprog afuser afmongodb csvparser confgen syslogformat pacctformat basicfuncs convertfuncs dbparser tfjson jsonparser dummy diff --git a/modules/jsonparser/Makefile.am b/modules/jsonparser/Makefile.am new file mode 100644 index 0000000..0925f98 --- /dev/null +++ b/modules/jsonparser/Makefile.am @@ -0,0 +1,22 @@ +moduledir = @moduledir@ +AM_CPPFLAGS = -I$(top_srcdir)/lib -I../../lib +export top_srcdir + +if ENABLE_JSON_PARSE +module_LTLIBRARIES := libjsonparser.la +libjsonparser_la_SOURCES = \ + jsonparser.c jsonparser.h \ + jsonparser-grammar.y \ + jsonparser-parser.c jsonparser-parser.h \ + jsonparser-plugin.c + +libjsonparser_la_CPPFLAGS = $(AM_CPPFLAGS) +libjsonparser_la_CFLAGS = $(JSON_CFLAGS) +libjsonparser_la_LIBADD = $(MODULE_DEPS_LIBS) +libjsonparser_la_LDFLAGS = $(MODULE_LDFLAGS) $(JSON_LIBS) + +BUILT_SOURCES = jsonparser-grammar.y jsonparser-grammar.c jsonparser-grammar.h +EXTRA_DIST = $(BUILT_SOURCES) jsonparser-grammar.ym +endif + +include $(top_srcdir)/build/lex-rules.am diff --git a/modules/jsonparser/jsonparser-grammar.ym b/modules/jsonparser/jsonparser-grammar.ym new file mode 100644 index 0000000..2fe2f3c --- /dev/null +++ b/modules/jsonparser/jsonparser-grammar.ym @@ -0,0 +1,86 @@ +/* + * Copyright (c) 2011 BalaBit IT Ltd, Budapest, Hungary + * Copyright (c) 2011 Gergely Nagy <algernon@balabit.hu> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + * + * As an additional exemption you are allowed to compile & link against the + * OpenSSL libraries as published by the OpenSSL project. See the file + * COPYING for details. + */ + +%code top { +#include "jsonparser-parser.h" + +} + + +%code { + +#include "jsonparser.h" +#include "cfg-parser.h" +#include "jsonparser-grammar.h" +#include "syslog-names.h" +#include "messages.h" + +extern LogParser *last_parser; + +} + +%name-prefix "jsonparser_" + +/* this parameter is needed in order to instruct bison to use a complete + * argument list for yylex/yyerror */ + +%lex-param {CfgLexer *lexer} +%parse-param {CfgLexer *lexer} +%parse-param {LogParser **instance} +%parse-param {gpointer arg} + +/* INCLUDE_DECLS */ + +%token KW_JSON_PARSER +%token KW_PREFIX + +%type <ptr> parser_expr_json + +%% + +start + : LL_CONTEXT_PARSER parser_expr_json { YYACCEPT; } + ; + + +parser_expr_json + : KW_JSON_PARSER '(' + { + last_parser = *instance = (LogParser *) log_json_parser_new(); + } + parser_json_opts + ')' { $$ = last_parser; } + ; + +parser_json_opts + : parser_json_opt parser_json_opts + | + ; + +parser_json_opt + : KW_PREFIX '(' string ')' { log_json_parser_set_prefix(last_parser, $3); free ($3); } + | + ; + +/* INCLUDE_RULES */ + +%% diff --git a/modules/jsonparser/jsonparser-parser.c b/modules/jsonparser/jsonparser-parser.c new file mode 100644 index 0000000..a7fe1a6 --- /dev/null +++ b/modules/jsonparser/jsonparser-parser.c @@ -0,0 +1,49 @@ +/* + * Copyright (c) 2011 BalaBit IT Ltd, Budapest, Hungary + * Copyright (c) 2011 Gergely Nagy <algernon@balabit.hu> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + * + * As an additional exemption you are allowed to compile & link against the + * OpenSSL libraries as published by the OpenSSL project. See the file + * COPYING for details. + */ + +#include "jsonparser.h" +#include "cfg-parser.h" +#include "jsonparser-grammar.h" + +extern int jsonparser_debug; + +int jsonparser_parse(CfgLexer *lexer, LogParser **instance, gpointer arg); + +static CfgLexerKeyword jsonparser_keywords[] = +{ + { "json_parser", KW_JSON_PARSER, }, + { "prefix", KW_PREFIX, }, + { NULL } +}; + +CfgParser jsonparser_parser = +{ +#if ENABLE_DEBUG + .debug_flag = &jsonparser_debug, +#endif + .name = "jsonparser", + .keywords = jsonparser_keywords, + .parse = (gint (*)(CfgLexer *, gpointer *, gpointer)) jsonparser_parse, + .cleanup = (void (*)(gpointer)) log_pipe_unref, +}; + +CFG_PARSER_IMPLEMENT_LEXER_BINDING(jsonparser_, LogParser **) diff --git a/modules/jsonparser/jsonparser-parser.h b/modules/jsonparser/jsonparser-parser.h new file mode 100644 index 0000000..22ceb84 --- /dev/null +++ b/modules/jsonparser/jsonparser-parser.h @@ -0,0 +1,34 @@ +/* + * Copyright (c) 2011 BalaBit IT Ltd, Budapest, Hungary + * Copyright (c) 2011 Gergely Nagy <algernon@balabit.hu> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + * + * As an additional exemption you are allowed to compile & link against the + * OpenSSL libraries as published by the OpenSSL project. See the file + * COPYING for details. + */ + +#ifndef JSONPARSER_PARSER_H_INCLUDED +#define JSONPARSER_PARSER_H_INCLUDED + +#include "cfg-parser.h" +#include "cfg-lexer.h" +#include "logparser.h" + +extern CfgParser jsonparser_parser; + +CFG_PARSER_DECLARE_LEXER_BINDING(jsonparser_, LogParser **) + +#endif diff --git a/modules/jsonparser/jsonparser-plugin.c b/modules/jsonparser/jsonparser-plugin.c new file mode 100644 index 0000000..15c4c91 --- /dev/null +++ b/modules/jsonparser/jsonparser-plugin.c @@ -0,0 +1,53 @@ +/* + * Copyright (c) 2011 BalaBit IT Ltd, Budapest, Hungary + * Copyright (c) 2011 Gergely Nagy <algernon@balabit.hu> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + * + * As an additional exemption you are allowed to compile & link against the + * OpenSSL libraries as published by the OpenSSL project. See the file + * COPYING for details. + */ + +#include "cfg-parser.h" +#include "plugin.h" +#include "jsonparser.h" + +extern CfgParser jsonparser_parser; + +static Plugin jsonparser_plugins[] = +{ + { + .type = LL_CONTEXT_PARSER, + .name = "json-parser", + .parser = &jsonparser_parser, + }, +}; + +gboolean +jsonparser_module_init(GlobalConfig *cfg, CfgArgs *args) +{ + plugin_register(cfg, jsonparser_plugins, G_N_ELEMENTS(jsonparser_plugins)); + return TRUE; +} + +const ModuleInfo module_info = +{ + .canonical_name = "jsonparser", + .version = VERSION, + .description = "The jsonparser module provides JSON parsing support for syslog-ng.", + .core_revision = SOURCE_REVISION, + .plugins = jsonparser_plugins, + .plugins_len = G_N_ELEMENTS(jsonparser_plugins), +}; diff --git a/modules/jsonparser/jsonparser.c b/modules/jsonparser/jsonparser.c new file mode 100644 index 0000000..f42766d --- /dev/null +++ b/modules/jsonparser/jsonparser.c @@ -0,0 +1,158 @@ +/* + * Copyright (c) 2011 BalaBit IT Ltd, Budapest, Hungary + * Copyright (c) 2011 Gergely Nagy <algernon@balabit.hu> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + * + * As an additional exemption you are allowed to compile & link against the + * OpenSSL libraries as published by the OpenSSL project. See the file + * COPYING for details. + */ + +#include "jsonparser.h" +#include "logparser.h" + +#include <string.h> + +#include <json.h> +#include <json_object_private.h> + +struct _LogJSONParser +{ + LogParser super; + gchar *prefix; + + json_tokener *tokener; + struct + { + GString *key; + GString *value; + } serialized; +}; + +void +log_json_parser_set_prefix (LogParser *p, const gchar *prefix) +{ + LogJSONParser *self = (LogJSONParser *)p; + + g_free (self->prefix); + self->prefix = g_strdup (prefix); +} + +static gboolean +log_json_parser_process (LogParser *s, LogMessage *msg, const gchar *input) +{ + LogJSONParser *self = (LogJSONParser *) s; + struct json_object *jso; + struct json_object_iter itr; + + json_tokener_reset (self->tokener); + jso = json_tokener_parse_ex (self->tokener, input, strlen (input)); + + json_object_object_foreachC (jso, itr) + { + gboolean parsed = FALSE; + + switch (json_object_get_type (itr.val)) + { + case json_type_boolean: + msg_info ("JSON parser does not support boolean types yet, skipping", + evt_tag_str ("key", itr.key), NULL); + break; + case json_type_double: + parsed = TRUE; + g_string_printf (self->serialized.value, "%f", + json_object_get_double (itr.val)); + break; + case json_type_int: + parsed = TRUE; + g_string_printf (self->serialized.value, "%i", + json_object_get_int (itr.val)); + break; + case json_type_string: + parsed = TRUE; + g_string_assign (self->serialized.value, + json_object_get_string (itr.val)); + break; + case json_type_object: + case json_type_array: + msg_error ("JSON parser does not support objects and arrays yet, " + "skipping", + evt_tag_str ("key", itr.key), NULL); + break; + default: + msg_error ("JSON parser encountered an unknown type, skipping", + evt_tag_str ("key", itr.key), NULL); + break; + } + + if (parsed) + { + if (self->prefix) + g_string_printf (self->serialized.key, "%s%s", + self->prefix, itr.key); + else + g_string_assign (self->serialized.key, itr.key); + log_msg_set_value (msg, + log_msg_get_value_handle (self->serialized.key->str), + self->serialized.value->str, + self->serialized.value->len); + } + } + json_object_put (jso); + + return TRUE; +} + +static LogPipe * +log_json_parser_clone (LogProcessPipe *s) +{ + LogJSONParser *self = (LogJSONParser *) s; + LogJSONParser *cloned; + + cloned = (LogJSONParser *) log_json_parser_new (); + log_json_parser_set_prefix ((LogParser *)cloned, self->prefix); + + return &cloned->super.super.super; +} + +static void +log_json_parser_free (LogPipe *s) +{ + LogJSONParser *self = (LogJSONParser *)s; + + g_free (self->prefix); + g_string_free (self->serialized.key, TRUE); + g_string_free (self->serialized.value, TRUE); + + json_tokener_free (self->tokener); + log_parser_free_method (s); +} + +LogJSONParser * +log_json_parser_new (void) +{ + LogJSONParser *self = g_new0 (LogJSONParser, 1); + + log_parser_init_instance (&self->super); + self->super.super.super.free_fn = log_json_parser_free; + self->super.super.clone = log_json_parser_clone; + self->super.process = log_json_parser_process; + + self->tokener = json_tokener_new (); + self->serialized.key = g_string_new (NULL); + self->serialized.value = g_string_new (NULL); + + return self; +} diff --git a/modules/jsonparser/jsonparser.h b/modules/jsonparser/jsonparser.h new file mode 100644 index 0000000..3f6796a --- /dev/null +++ b/modules/jsonparser/jsonparser.h @@ -0,0 +1,33 @@ +/* + * Copyright (c) 2011 BalaBit IT Ltd, Budapest, Hungary + * Copyright (c) 2011 Gergely Nagy <algernon@balabit.hu> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + * + * As an additional exemption you are allowed to compile & link against the + * OpenSSL libraries as published by the OpenSSL project. See the file + * COPYING for details. + */ + +#ifndef JSONPARSER_H_INCLUDED +#define JSONPARSER_H_INCLUDED + +#include "logparser.h" + +typedef struct _LogJSONParser LogJSONParser; + +void log_json_parser_set_prefix(LogParser *p, const gchar *prefix); +LogJSONParser *log_json_parser_new(void); + +#endif diff --git a/modules/tfjson/Makefile.am b/modules/tfjson/Makefile.am index 5d80564..2387366 100644 --- a/modules/tfjson/Makefile.am +++ b/modules/tfjson/Makefile.am @@ -1,7 +1,7 @@ moduledir = @moduledir@ export top_srcdir
-if ENABLE_JSON +if ENABLE_JSON_FORMAT AM_CPPFLAGS = -I$(top_srcdir)/lib -I../../lib $(JSON_CFLAGS) module_LTLIBRARIES = libtfjson.la
-- Bazsi
On Sat, 2011-10-29 at 13:38 +0200, Balazs Scheidler wrote:
This is cool stuff, but again doesn't compile with a stock json-c.
Hmm.. this time I've checked, newer json-c versions do install the referenced headers. Then I guess it's it needs an updated version number in configure.in.
I've integrated this, but I may have been a bit too fast handed. There's no locking in json-parser(), which will cause things to go bad if threaded(yes) is enabled in the configuration. And while looking further, $(format-json) is not either, as I've noticed that value-pairs() is not thread safe either, because it uses a similar GString instance that can be used from multiple threads. Can you look into making them thread safe? One way to do that is to allocate a new GString instance at every invocation, which is not nice as that increases malloc() load. The alternative idea I was playing with was to create a new API for managing per-thread scratch buffers that could be reused even between multiple independent portions of syslog-ng. The way it would work: - scratch buffers are a numbered array of GString buffers, allocated for _each_ thread. - as the message is processed various parts of syslog-ng allocate such buffers, do their thing with it. - in order to make it possible for a thread to work with scratch buffers, scratch_buffers_reset() would need to be called, once for each independent task (e.g. log message). GString *scratch_buffer_acquire() Allocate a new scratch buffer, used to hold a temp value, that _will_ automatically be freed in the future, when the current function call returns. The returned GString instance should not be saved to non-local variables. void scratch_buffer_release(GString *scratch) Free the specified scratch buffer to be useful further in the call chain. May not be needed at all, I guesstimate that processing a single message would need 4-8 scratch buffers, which get reset at the end of the current message processing. Probably complicated free-alloc mechanism is not needed. scratch_buffers_reset() Free all per-thread scratch buffers previously allocated. With this API, the required number of GString buffers would be allocated on-demand, but still they would not increase the malloc() load too much, as the scratch buffers subsystem can keep GString instances around. The code using GString buffers become simpler too, as the allocation to store it somewhere vanishes. What do you think? -- Bazsi
Balazs Scheidler <bazsi@balabit.hu> writes:
On Sat, 2011-10-29 at 13:38 +0200, Balazs Scheidler wrote:
This is cool stuff, but again doesn't compile with a stock json-c.
Hmm.. this time I've checked, newer json-c versions do install the referenced headers. Then I guess it's it needs an updated version number in configure.in.
I've integrated this, but I may have been a bit too fast handed. There's no locking in json-parser(), which will cause things to go bad if threaded(yes) is enabled in the configuration.
Note to self: turn on threaded(yes) already.
And while looking further, $(format-json) is not either, as I've noticed that value-pairs() is not thread safe either, because it uses a similar GString instance that can be used from multiple threads.
Can you look into making them thread safe?
One way to do that is to allocate a new GString instance at every invocation, which is not nice as that increases malloc() load.
I could do that, and probably will do so temporarily (patches will be sent to the list, and I'll try a git pull request aswell, too, to check how my setup can handle that).
The alternative idea I was playing with was to create a new API for managing per-thread scratch buffers that could be reused even between multiple independent portions of syslog-ng. [...] With this API, the required number of GString buffers would be allocated on-demand, but still they would not increase the malloc() load too much, as the scratch buffers subsystem can keep GString instances around. The code using GString buffers become simpler too, as the allocation to store it somewhere vanishes.
What do you think?
Sounds good. I'll play with it during the long weekend, and see how far I get. -- |8]
Balazs Scheidler <bazsi@balabit.hu> writes:
The alternative idea I was playing with was to create a new API for managing per-thread scratch buffers that could be reused even between multiple independent portions of syslog-ng.
The way it would work: - scratch buffers are a numbered array of GString buffers, allocated for _each_ thread.
Any reason we want per-thread scratch buffers? Apart from not needing to use a lock in this case.. As a first step in implementing this, I had something like the following in mind: static GTrashStack *scratch_buffers; GString *scratch_buffer_acquite (void) { GString *s; s = g_trash_stack_pop (scratch_buffers); if (!s) s = g_string_new (NULL); else g_string_set_size (s, 0); return s; } void scratch_buffer_release (GString *s) { g_trash_stack_push (s); } void scratch_buffers_reset (void) { GString *s; while ((s = g_trash_stack_pop (scratch_buffers)) != NULL) g_string_free (s, TRUE); } The expected usage is to acquire a buffer when needed, and release it whenever we don't need it anymore. Spice it up with some __tls_deref & similar, and we're pretty much done, I think. What do you think? (Meanwhile, I'll go ahead and see if this works) -- |8]
On Mon, 2011-10-31 at 10:16 +0100, Gergely Nagy wrote:
Balazs Scheidler <bazsi@balabit.hu> writes:
The alternative idea I was playing with was to create a new API for managing per-thread scratch buffers that could be reused even between multiple independent portions of syslog-ng.
The way it would work: - scratch buffers are a numbered array of GString buffers, allocated for _each_ thread.
Any reason we want per-thread scratch buffers? Apart from not needing to use a lock in this case..
that's the point, and to keep GString instances around.
As a first step in implementing this, I had something like the following in mind:
static GTrashStack *scratch_buffers;
GString *scratch_buffer_acquite (void)
a typo in the function name.
{ GString *s;
s = g_trash_stack_pop (scratch_buffers); if (!s) s = g_string_new (NULL); else g_string_set_size (s, 0); return s; }
are you sure that once you get your GString instance back from GTrashStack, it'll be a valid GString? GTrashStack seems to corrupt the first pointer of each data block pushed to it (which is the "str" pointer in this case).
void scratch_buffer_release (GString *s) { g_trash_stack_push (s); }
void scratch_buffers_reset (void) { GString *s;
while ((s = g_trash_stack_pop (scratch_buffers)) != NULL) g_string_free (s, TRUE); }
I wouldn't want to free the GString instances, but reuse them. This is the other point: to avoid having to allocate new GString instances all the time. The implementation I had in mind was: - dynamically sized array of GString instances (per-thread) - a stack pointer that contains the index within the array where we are at. Items below the pointer are used, above are not. - acquire: if there are unused items in the array, return it and increase the pointer. if there are no unused items, allocate new and store it in the array. - release: if the item to be freed is the one handed out last, decrement the pointer, else do nothing. - reset: reset the pointer to 0. This way: - the number of GString instances depend on their usage. - GStrings are only allocated rarely, if the parallel number of GString buffers increase for some reason. - no need for locking as scratch buffers are per-thread Of course it only shows its potential if all the similar cases already in the code are converted to scratch buffers (log_template_format, template functions, value-pairs, etc).
The expected usage is to acquire a buffer when needed, and release it whenever we don't need it anymore.
Spice it up with some __tls_deref & similar, and we're pretty much done, I think.
What do you think? (Meanwhile, I'll go ahead and see if this works)
-- Bazsi
On Mon, 2011-10-31 at 21:38 +0100, Balazs Scheidler wrote:
On Mon, 2011-10-31 at 10:16 +0100, Gergely Nagy wrote:
Balazs Scheidler <bazsi@balabit.hu> writes:
The alternative idea I was playing with was to create a new API for managing per-thread scratch buffers that could be reused even between multiple independent portions of syslog-ng.
The way it would work: - scratch buffers are a numbered array of GString buffers, allocated for _each_ thread.
Any reason we want per-thread scratch buffers? Apart from not needing to use a lock in this case..
that's the point, and to keep GString instances around.
As a first step in implementing this, I had something like the following in mind:
static GTrashStack *scratch_buffers;
GString *scratch_buffer_acquite (void)
a typo in the function name.
{ GString *s;
s = g_trash_stack_pop (scratch_buffers); if (!s) s = g_string_new (NULL); else g_string_set_size (s, 0); return s; }
are you sure that once you get your GString instance back from GTrashStack, it'll be a valid GString?
GTrashStack seems to corrupt the first pointer of each data block pushed to it (which is the "str" pointer in this case).
void scratch_buffer_release (GString *s) { g_trash_stack_push (s); }
void scratch_buffers_reset (void) { GString *s;
while ((s = g_trash_stack_pop (scratch_buffers)) != NULL) g_string_free (s, TRUE); }
I wouldn't want to free the GString instances, but reuse them. This is the other point: to avoid having to allocate new GString instances all the time.
The implementation I had in mind was: - dynamically sized array of GString instances (per-thread) - a stack pointer that contains the index within the array where we are at. Items below the pointer are used, above are not. - acquire: if there are unused items in the array, return it and increase the pointer. if there are no unused items, allocate new and store it in the array. - release: if the item to be freed is the one handed out last, decrement the pointer, else do nothing. - reset: reset the pointer to 0.
This way: - the number of GString instances depend on their usage. - GStrings are only allocated rarely, if the parallel number of GString buffers increase for some reason. - no need for locking as scratch buffers are per-thread
Of course it only shows its potential if all the similar cases already in the code are converted to scratch buffers (log_template_format, template functions, value-pairs, etc).
After I've posted this I noticed the implementation in follow-up patches. Those comments came after this, so read them too :) -- Bazsi
Balazs Scheidler <bazsi@balabit.hu> writes:
As a first step in implementing this, I had something like the following in mind:
static GTrashStack *scratch_buffers;
GString *scratch_buffer_acquite (void)
a typo in the function name.
Aye, these were written without any kind of testing. I just looked at the glib docs, found GTrashStack, and typed in a few functions.
void scratch_buffer_release (GString *s) { g_trash_stack_push (s); }
void scratch_buffers_reset (void) { GString *s;
while ((s = g_trash_stack_pop (scratch_buffers)) != NULL) g_string_free (s, TRUE); }
I wouldn't want to free the GString instances, but reuse them. This is the other point: to avoid having to allocate new GString instances all the time.
Well, in my implementation, GStrings are allocated on-demand, and free'd when a thread exits. They're not kept around for use between threads. I can do that, but that will involve some locking.
The implementation I had in mind was: - dynamically sized array of GString instances (per-thread) - a stack pointer that contains the index within the array where we are at. Items below the pointer are used, above are not. - acquire: if there are unused items in the array, return it and increase the pointer. if there are no unused items, allocate new and store it in the array. - release: if the item to be freed is the one handed out last, decrement the pointer, else do nothing. - reset: reset the pointer to 0.
This way: - the number of GString instances depend on their usage. - GStrings are only allocated rarely, if the parallel number of GString buffers increase for some reason. - no need for locking as scratch buffers are per-thread
I think my implementation achieves these goals too, just a bit differently. Instead of a dynamically sized array, I use a stack, and leave the administration stuff to the callers. The stack-way has the advantage of not having to resize an array, nor does it have to do any bookkeeping. The disadvantage is, that if a caller forgets to release a string instance, that will be lost forever. While in the case of an array-based implementation, we could free these leaked strings too. As for reset: I don't really see the use for resetting the pointer to 0. That's basically a give back everything thingy, which might be handy, but I prefer giving back one by one instead: it makes the flow more understandable, and involves less magic.
Of course it only shows its potential if all the similar cases already in the code are converted to scratch buffers (log_template_format, template functions, value-pairs, etc).
I already converted jsonparser & value-pairs (and thus format-json). Will have a stab at the rest once we agreed on how scratch buffers should work :) -- |8]
participants (2)
-
Balazs Scheidler
-
Gergely Nagy