<div dir="ltr">Hi Fabien,<div><br></div><div>Aggregations are a way to count terms across documents, and you can combine them to get powerful statistics. In my case, tags are not analyzed, so each tag is a single term. The <a href="http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html">terms aggregation</a> on my tags field then gives me the top N most frequent tags.</div>
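<div><br></div><div>For illustration, a request body for such a terms aggregation could look roughly like the sketch below. The field name "tags" and the aggregation name "top_tags" are assumptions made for the example, not names from my actual setup.</div>
<pre>
```python
import json

# Hypothetical field name: the real mapping may call it something else.
TAGS_FIELD = "tags"


def top_tags_request(field, n):
    """Build a search body asking for the n most frequent terms in `field`."""
    return {
        "size": 0,  # skip the hits, return only the aggregation
        "aggs": {
            "top_tags": {  # arbitrary aggregation name
                "terms": {"field": field, "size": n}
            }
        },
    }


body = top_tags_request(TAGS_FIELD, 10)
print(json.dumps(body, indent=2))
```
</pre>
<div>The printed JSON would be sent as the body of a search request; each returned bucket then carries a tag and the number of documents that have it.</div>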
<div><br></div><div>If I analyze the field, things get more complicated. For example, if the "kernel error" tag were analyzed into "kernel" and "error", the aggregation would return "kernel" and "error" as separate terms, which would be confusing.</div>
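<div><br></div><div>The difference can be simulated locally without Elasticsearch. The sketch below counts, per term, how many documents contain it, once with whole tags and once with a rough stand-in for an analyzer (lowercase, split on whitespace). The sample documents and the function names are made up for the example.</div>
<pre>
```python
from collections import Counter

# Made-up documents, each carrying a list of tags.
docs = [
    ["kernel error", "disk"],
    ["kernel error"],
    ["user error"],
]


def count_terms(documents, analyze):
    """Count in how many documents each term appears."""
    counts = Counter()
    for tags in documents:
        terms = set()  # a term counts once per document
        for tag in tags:
            terms.update(analyze(tag))
        counts.update(terms)
    return counts


def not_analyzed(tag):
    # The whole tag is kept as one term.
    return [tag]


def word_split(tag):
    # Rough stand-in for an analyzer: lowercase, then split on whitespace.
    return tag.lower().split()


whole = count_terms(docs, not_analyzed)
split = count_terms(docs, word_split)
print(whole.most_common())
print(split.most_common())
```
</pre>
<div>With whole tags, "kernel error" shows up as one term in 2 documents. With the word-splitting stand-in, "error" is counted in 3 documents and "kernel" in 2, and the original multi-word tags are no longer visible.</div>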
<div><br></div><div>Thinking about what you suggested, I could store a comma-separated list of tags and use the <a href="http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-pattern-tokenizer.html">pattern tokenizer</a> to split it into terms at each comma. This should give me what I need for both searches and aggregations. The only edge case is a tag that itself contains a comma, but I can live with that, or even let users escape it.</div>
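<div><br></div><div>A minimal sketch of that idea, assuming index settings with a pattern tokenizer that splits on "," and a custom analyzer with no token filters. The names "comma" and "tag_list" are hypothetical, and the small comma_tokens helper only approximates the tokenizer locally (it assumes empty pieces are dropped).</div>
<pre>
```python
import json
import re

# Index settings sketch; "comma" and "tag_list" are hypothetical names.
settings = {
    "analysis": {
        "tokenizer": {
            # Split the field value wherever the pattern matches.
            "comma": {"type": "pattern", "pattern": ","}
        },
        "analyzer": {
            # No token filters, so tags keep their case and inner spaces.
            "tag_list": {"type": "custom", "tokenizer": "comma"}
        },
    }
}


def comma_tokens(value):
    """Local approximation: split on commas and drop empty pieces."""
    return [piece for piece in re.split(",", value) if piece]


print(json.dumps(settings, indent=2))
print(comma_tokens("kernel error,disk,user error"))
```
</pre>
<div>Splitting on a bare comma keeps any whitespace that follows it, so "a, b" would yield " b" with a leading space. A pattern that also swallows whitespace around the comma, or a trim step, would be one way to handle that.</div>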
<div><br></div><div>I'll let the idea bake a bit, thanks again for your suggestions!</div><div><br></div><div>Best regards,</div><div>Radu</div><div class="gmail_extra"><div><div dir="ltr"><div>--</div><div>Performance Monitoring * Log Analytics * Search Analytics</div>
<div><span style="font-family:arial,sans-serif;font-size:13px">Solr &amp; Elasticsearch Support * </span><a href="http://sematext.com/" style="font-size:13px;font-family:arial,sans-serif" target="_blank">http://sematext.com/</a></div>
</div></div>
<br><br><div class="gmail_quote">On Mon, Jul 21, 2014 at 3:19 PM, Fabien Wernli <span dir="ltr">&lt;<a href="mailto:wernli@in2p3.fr" target="_blank">wernli@in2p3.fr</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi,<br>
<div class=""><br>
On Mon, Jul 21, 2014 at 02:50:58PM +0300, Radu Gheorghe wrote:<br>
> - let users do exact matches, especially for multi-word tags like "user<br>
> error"<br>
> - be able to run a terms aggregation on them and show the available tags<br>
<br>
</div>I'm not familiar with aggregations, but you could achieve the first<br>
requirement by using a custom analyzer which splits on the comma only with<br>
no token filter<br>
<div class="HOEnZb"><div class="h5"><br>
______________________________________________________________________________<br>
Member info: <a href="https://lists.balabit.hu/mailman/listinfo/syslog-ng" target="_blank">https://lists.balabit.hu/mailman/listinfo/syslog-ng</a><br>
Documentation: <a href="http://www.balabit.com/support/documentation/?product=syslog-ng" target="_blank">http://www.balabit.com/support/documentation/?product=syslog-ng</a><br>
FAQ: <a href="http://www.balabit.com/wiki/syslog-ng-faq" target="_blank">http://www.balabit.com/wiki/syslog-ng-faq</a><br>
<br>
</div></div></blockquote></div><br></div></div>