[syslog-ng] (no subject)

Radu Gheorghe radu.gheorghe at sematext.com
Mon Jul 21 14:57:16 CEST 2014


Hi Fabien,

Aggregations are a way to count terms across documents, and you can combine
them to get powerful statistics. In my case, tags are not analyzed, so each
tag is a single term. The terms aggregation
<http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html>
on my tags field would then give me the top N most frequent tags.
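
As a rough sketch of that request (the field name "tags", the aggregation
name "top_tags" and the sample documents are assumptions for illustration,
and top_terms() is only a local stand-in for what the aggregation counts,
not an Elasticsearch call):

```python
import json
from collections import Counter

def terms_aggregation_body(field, size=10):
    """Build a search body that asks only for a terms aggregation."""
    return {
        "size": 0,  # skip the hits, return only the aggregation buckets
        "aggs": {"top_tags": {"terms": {"field": field, "size": size}}},
    }

def top_terms(docs, field, size=10):
    """Local stand-in: count each distinct tag once per document."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc.get(field, [])))
    return counts.most_common(size)

# Hypothetical documents, each carrying a list of not-analyzed tags.
docs = [
    {"tags": ["kernel error", "disk"]},
    {"tags": ["kernel error"]},
    {"tags": ["user error", "disk"]},
]

print(json.dumps(terms_aggregation_body("tags", size=2), indent=2))
print(top_terms(docs, "tags", size=2))
```

Because the tags are not analyzed, "kernel error" is counted as one term
rather than as two separate words.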

If I analyze the field, things get more complicated. For example, if
the "kernel error" tag were analyzed into "kernel" and "error", the
aggregation would return "kernel" and "error" as separate terms, which
would be confusing.
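
To illustrate the difference, here is a small local simulation. The
analyze_words() helper is a crude, hypothetical stand-in for a
word-splitting analyzer (lowercase, split on non-word characters), not
Elasticsearch's actual implementation, and the tag values are made up:

```python
import re
from collections import Counter

def analyze_words(text):
    """Crude stand-in for word analysis: lowercase, split on non-word chars."""
    return [t for t in re.split(r"\W+", text.lower()) if t]

# Hypothetical tag values, one per document.
tags = ["kernel error", "user error", "kernel panic"]

# Analyzed field: every word becomes its own term.
analyzed = Counter(term for tag in tags for term in analyze_words(tag))

# Not-analyzed field: every tag stays a single term.
not_analyzed = Counter(tags)

print(analyzed.most_common())
print(not_analyzed.most_common())
```

In the analyzed case "error" is counted for both "kernel error" and
"user error", and the original tags can no longer be recovered from the
counts.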

Thinking about what you suggested, I could have a comma-separated list of
tags, and use the pattern tokenizer
<http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-pattern-tokenizer.html>
to split terms at each comma. This should give me what I need for both
searches and aggregations. The only edge case would be a tag that
contains a comma, but I can live with that, or even let users
escape it.
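
A minimal sketch of what I have in mind, assuming a tokenizer named
"comma" and an analyzer named "comma_tags" (both names are made up). The
settings follow the documented pattern tokenizer and custom analyzer
options, while split_tags() only approximates the tokenizer locally:

```python
import re

# Index settings: a pattern tokenizer that splits on commas, wrapped in a
# custom analyzer with no token filters (names are hypothetical).
index_settings = {
    "settings": {
        "analysis": {
            "tokenizer": {"comma": {"type": "pattern", "pattern": ","}},
            "analyzer": {"comma_tags": {"type": "custom", "tokenizer": "comma"}},
        }
    }
}

def split_tags(text, pattern=","):
    """Local approximation of the tokenizer: split on pattern, drop empties."""
    return [t for t in re.split(pattern, text) if t]

# Multi-word tags survive as single terms.
print(split_tags("kernel error,disk,user error"))

# Edge case: a tag that itself contains a comma gets split in two, and the
# space after the comma stays in the second term.
print(split_tags("error, fatal,disk"))
```

Since there is no token filter, terms keep their case and inner spaces,
so exact matches on multi-word tags should still work.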

I'll let the idea bake a bit, thanks again for your suggestions!

Best regards,
Radu
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Mon, Jul 21, 2014 at 3:19 PM, Fabien Wernli <wernli at in2p3.fr> wrote:

> Hi,
>
> On Mon, Jul 21, 2014 at 02:50:58PM +0300, Radu Gheorghe wrote:
> > - let users do exact matches, especially for multi-word tags like "user
> > error"
> > - be able to run a terms aggregation on them and show the available tags
>
> I'm not familiar with aggregations, but you could achieve the first
> requirement by using a custom analyzer which splits on the comma only, with
> no token filter
>
>
> ______________________________________________________________________________
> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
> Documentation:
> http://www.balabit.com/support/documentation/?product=syslog-ng
> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>
>