Hi,

This is my first post here, so I have to start by thanking all the contributors for an awesome product :)

My question is about adding an array to a JSON document. What I'm trying to do is to send a message like this:

@cee: {"message": "test message", "tags":["test", "message"]}

My template looks like this:

template("@cee: $(format-json --pair message=\"$MSG\" --pair tags=\"test\")\n")

This works fine for a single tag, but how can I add multiple ones?

The broader use-case is that I want to add tags to logs matching a specific filter. For example:

----------------------
filter user_tests { facility(user) and message(test) };

destination logsene_tests {
    syslog("logsene-receiver-syslog.sematext.com"
        transport("tcp")
        port(514)
        template("@cee: $(format-json --pair message=\"$MSG\" --pair tags=\"test\")\n")
    );
};

log { source(all_syslog); filter(user_tests); destination(logsene_tests); flags(final); };
----------------------

If there's a better way to add multiple tags to a log, please tell me - I'm good with making big changes if it leads to a cleaner/better config.

Best regards,
Radu
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/
P.S. Sorry for not adding a subject :( I guess it's too late now...
Hi,

Thanks for the compliments :)

Right now syslog-ng doesn't really support arrays. We had plans for them in the past, but nothing concrete yet.

syslog-ng has built-in support for tags (e.g. the tags() option for various sources and the db-parser()/patterndb configuration files), but those can also be a bit limited. Can you elaborate on your use case? What part of your setup would associate the tags with the message?

To add arrays to syslog-ng, one would need to add the appropriate logic to $(format-json). Our idea was that syslog-ng's flat name-value pair structure would simply be formatted as an array. Given the following set of name-value pairs:

tags[0] = 'foo'
tags[1] = 'bar'
tags[2] = 'baz'

these would automatically become an array when formatted via format-json, e.g.

tags = [ "foo", "bar", "baz" ]

The only part missing is basically recognizing that a specific name has brackets at the end, and sorting the elements properly (right now we iterate in alphabetical order, which wouldn't work with numerical indices).

Once this is in place, we would only need to add some rewrite operations to "append"/"pop" on an existing array.

Such a contribution would be absolutely appreciated.

Cheers,
Bazsi
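To make the scheme described above concrete, here is a minimal Python sketch (not syslog-ng code) of how flat name-value pairs such as tags[0] and tags[1] could be folded into a JSON array. The function name `fold_indexed_pairs` and the regular expression are illustrative assumptions, not part of syslog-ng. The point is only that indices have to be compared as numbers, since alphabetical iteration would place tags[10] before tags[2].

```python
import json
import re

# Matches names that end in a bracketed numeric index, e.g. "tags[2]".
INDEXED_NAME = re.compile(r"^(?P<base>.+)\[(?P<index>\d+)\]$")


def fold_indexed_pairs(pairs):
    """Fold flat 'name[N]' pairs into lists; other pairs are copied as-is."""
    result = {}
    arrays = {}
    for name, value in pairs.items():
        match = INDEXED_NAME.match(name)
        if match:
            index = int(match.group("index"))
            arrays.setdefault(match.group("base"), []).append((index, value))
        else:
            result[name] = value
    for base, items in arrays.items():
        # Sort by the integer index, not by the textual name.
        ordered = sorted(items, key=lambda item: item[0])
        result[base] = [value for _, value in ordered]
    return result


# Hypothetical input: name-value pairs as they might exist on one message.
pairs = {
    "message": "test message",
    "tags[10]": "last",
    "tags[2]": "baz",
    "tags[0]": "foo",
    "tags[1]": "bar",
}
print(json.dumps(fold_indexed_pairs(pairs)))
```

Sorting on the parsed integer is the design choice that matters here; a plain string sort of the names would interleave tags[1], tags[10] and tags[2] in the wrong order.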
______________________________________________________________________________ Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng FAQ: http://www.balabit.com/wiki/syslog-ng-faq
Hello Bazsi, and thanks for your reply!

My use-case is with clients sending logs to Logsene <http://sematext.com/logsene/index.html>, which accepts Elasticsearch-style JSON over HTTP or N flavors of syslog.

Some clients may want to add tags to their logs on their way to Logsene. For example, if a message is an error and contains some text, you would give it a certain tag. The tag itself would be a part of the CEE-formatted JSON over syslog.

This works well and we've documented <https://sematext.atlassian.net/wiki/display/PUBLOGSENE/syslog-ng#syslog-ng-Tagyourlogs> it, but right now we can't figure out how to add multiple tags. I thought that maybe I was missing something that's already possible. Thanks a lot for clarifying!

Best regards,
Radu
Hi Radu,

As Bazsi explained, there is currently no array implementation in syslog-ng, but you can naturally add as many tags to a message as you want.

Now, when including the TAGS macro in a `format-json` statement, you will end up with a comma-separated field containing all tags.

As it happens, if sent to Elasticsearch, this field will be indexed by default as a field of type 'string' with the 'standard' analyzer. This basically means you will be able to search your documents naturally by tag.

So yes, out of the box, you don't need to do anything, just make sure the TAGS macro is being sent to ES.

If you want to handle tags that contain spaces, or be case-sensitive, you could define a custom ES analyzer that only tokenizes at the commas, etc.

Cheers
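As a rough illustration of the custom analyzer idea, the sketch below builds an Elasticsearch index settings body as a Python dictionary and prints it as JSON. The names `comma_tokenizer` and `comma_separated` are made up for this example, and the mapping that attaches the analyzer to the tags field is left out on purpose because its syntax depends on the Elasticsearch version in use.

```python
import json

# Hypothetical settings body; "comma_tokenizer" and "comma_separated"
# are illustrative names, not anything defined by Elasticsearch itself.
index_body = {
    "settings": {
        "analysis": {
            "tokenizer": {
                # The pattern tokenizer splits the text wherever the
                # regular expression matches, here on a literal comma.
                "comma_tokenizer": {"type": "pattern", "pattern": ","}
            },
            "analyzer": {
                "comma_separated": {
                    "type": "custom",
                    "tokenizer": "comma_tokenizer",
                    # No token filters, so terms are not lowercased
                    # and multi-word tags stay in one piece.
                    "filter": [],
                }
            },
        }
    }
}

print(json.dumps(index_body, indent=2))
```

The printed JSON would be sent as the body of an index creation request; the field holding the TAGS value then has to reference the `comma_separated` analyzer in its mapping.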
Hi Fabien,

Thanks for your input. I didn't know that you end up with a comma-separated list of tags.

The thing is, in Logsene we currently keep tags not analyzed, for two reasons:

- let users do exact matches, especially for multi-word tags like "user error"
- be able to run a terms aggregation on them and show the available tags

An array there would meet our requirements. But I will think about what you suggested and maybe find a good compromise.

Thanks again!

Best regards,
Radu
Hi,

On Mon, Jul 21, 2014 at 02:50:58PM +0300, Radu Gheorghe wrote:
> - let users do exact matches, especially for multi-word tags like "user error"
> - be able to run a terms aggregation on them and show the available tags

I'm not familiar with aggregations, but you could achieve the first requirement by using a custom analyzer which splits on the comma only, with no token filter.
Hi Fabien,

Aggregations are a means of counting terms from documents, and you can combine them to get powerful statistics. In my case, tags are not analyzed, so each tag is a term. The terms aggregation <http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html> on my tags field would then give me the top N most frequent tags.

If I analyze the field, things get more complicated. For example, if the "kernel error" tag were analyzed into "kernel" and "error", I would get "kernel" and "error" separately, which would be confusing.

Thinking about what you suggested, I could have a comma-separated list of tags, and use the pattern tokenizer <http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-pattern-tokenizer.html> to separate terms when encountering a comma. This should give me what I need for both searches and aggregations. The only edge case would be a tag that contains a comma, but I can live with that, or even let users escape it.

I'll let the idea bake a bit, thanks again for your suggestions!

Best regards,
Radu
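The difference between the two tokenization strategies discussed above can be simulated in a few lines of Python. This is only a rough stand-in for what Elasticsearch does, under stated assumptions: `comma_terms` imitates a comma-only pattern tokenizer, `word_terms` crudely imitates word-level analysis with lowercasing, and `term_counts` imitates the per-document counting of a terms aggregation. All three function names and the sample documents are hypothetical.

```python
import re
from collections import Counter


def comma_terms(tags_field):
    """Split a comma-separated tags string into whole-tag terms."""
    # Empty strings (e.g. from a trailing comma) are dropped in this sketch.
    return [term for term in tags_field.split(",") if term]


def word_terms(tags_field):
    """Crude word-level analysis: lowercase, then split on non-word characters."""
    return [term for term in re.split(r"\W+", tags_field.lower()) if term]


def term_counts(documents, tokenize):
    """Count in how many documents each term occurs."""
    counts = Counter()
    for tags_field in documents:
        # set() so that a term is counted once per document.
        counts.update(set(tokenize(tags_field)))
    return counts


# Hypothetical tags fields, one per log message.
documents = ["kernel error,test", "kernel error", "user error,test"]

print(dict(term_counts(documents, comma_terms)))
print(dict(term_counts(documents, word_terms)))
```

With the comma-only split, "kernel error" and "user error" survive as single terms, whereas the word-level split merges both into a shared "error" term, which is the confusing result described above.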
participants (3)
- Balazs Scheidler
- Fabien Wernli
- Radu Gheorghe