<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Thanks! I will look into setting that up (hopefully today, but it
may not be until early next week).<br>
<br>
Yesterday I was able to get ~4k/sec with format-json and a redis
destination, using logstash between redis and elasticsearch. In that
case, logstash was pretty clearly the bottleneck, since I was
consistently pushing ~4000-4500/sec through syslog-ng, but only
~3800-4000/sec were making it to elasticsearch. I saw this most clearly
when I shut down syslog-ng and it took the rest of the system
several minutes to process what was cached in redis. <br>
<br>
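For reference, the syslog-ng side of that test looks roughly like the sketch
below - the source name, redis host, list key and --scope selection are
placeholders rather than my exact values, and it assumes a syslog-ng build
where the redis() destination is available:<br>
<pre>
# rough sketch of the format-json -&gt; redis leg of that test
destination d_redis {
    redis(
        host("127.0.0.1")
        port(6379)
        # logstash's redis input reads the JSON documents from this list
        command("RPUSH" "logstash" "$(format-json --scope selected-macros --scope nv-pairs)")
    );
};

log { source(s_network); destination(d_redis); flags(flow-control); };
</pre>
<br>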
I am using Ubuntu, but within a corporate network (these are lab
systems, but getting modules, etc. is still not always trivial).<br>
<br>
Let me see if I can set up the profiling.<br>
<br>
(And as far as experience goes - I am *very* new to the ELK pieces,
learning as I go. It is still quite possible I can do some major
tuning in that area. That is one of the reasons I am trying to have
syslog-ng do as much as possible, so I can remove the "L" and only use
the "E" and "K" :-)<br>
<br>
Thanks again all!<br>
<br>
Jim<br>
<br>
<div class="moz-cite-prefix">On 10/30/2014 11:55 PM, Balazs
Scheidler wrote:<br>
</div>
<blockquote
cite="mid:CAKcfE+Zc=OZN+LQ8CFs2+25uB2bKqSLCpu6AbM9uG7UzorBsdw@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>Hi,<br>
<br>
</div>
If the 3rd option is the slowest, then
this seems to be related to the
syslog-ng perl module or
Elasticsearch.pm.<br>
<br>
</div>
I've just checked: the syslog-ng perl
module does a value-pairs evaluation and
sends the results to the perl function as
a hash. This is not the speediest approach
(it'd be better to export the underlying C
structure as an object to Perl), but it should
still cope with much more than 2k/sec.<br>
<br>
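Just to illustrate the calling convention (this is
only a sketch - the function name is whatever your
config registers, and whether it receives a hash or
a hashref depends on the module version, I haven't
re-checked):<br>
<pre>
# sketch of the user-side function the perl destination invokes per message
sub queue {
    my (%msg) = @_;   # value-pairs result: HOST, PROGRAM, MESSAGE, parsed fields, ...
    # ... hand the message over to Elasticsearch.pm / Search::Elasticsearch here ...
    return 1;         # success (exact return-value handling depends on the driver)
}
</pre>
<br>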
</div>
I'm wondering what I could do to help. I'm
following this thread, but since I lack ES
experience I don't have the same environment
that you do.<br>
<br>
</div>
If you could use some kind of profiling (perf,
for instance) and had the associated
debug symbols for at least syslog-ng (and
preferably also for perl), we should be able to
pinpoint the issue easily. Setting up perf and
symbols is easy if your distro supports it,
but it is a big hassle if it doesn't.<br>
<br>
</div>
My experience with perf is on Ubuntu, but I've
heard it's better on Fedora. Which distro are
you using?<br>
<br>
</div>
This is an outline of what you'd have to do in order
to perform the profiling:<br>
</div>
- don't strip syslog-ng (nor
syslog-ng-incubator) after compilation, and use -g in
CFLAGS; syslog-ng doesn't do this in its build
script, but .rpm/.deb packaging usually does<br>
</div>
<div>- you can verify this by running file
&lt;path-to-binary&gt;<br>
</div>
<div><br>
</div>
- install symbols for syslog-ng dependencies (these
are the dbgsym packages in Ubuntu, <a
moz-do-not-send="true"
href="https://wiki.ubuntu.com/DebuggingProgramCrash#Debug_Symbol_Packages">https://wiki.ubuntu.com/DebuggingProgramCrash#Debug_Symbol_Packages</a>)<br>
</div>
- run perf record -g &lt;syslog-ng command line&gt;<br>
</div>
- reproduce the load<br>
</div>
- run perf report<br>
<br>
</div>
You'll see which parts use the most CPU in the result. Or you
can send it here for analysis.<br>
<br>
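On Ubuntu the whole thing boils down to something like the
following (the package names below are from memory and just
examples; the dbgsym wiki page above has the authoritative
list):<br>
<pre>
# check that the binary still has its symbols ("not stripped")
file /usr/sbin/syslog-ng

# perf itself, plus debug symbols for syslog-ng and its dependencies
sudo apt-get install linux-tools-$(uname -r)
sudo apt-get install syslog-ng-dbg libglib2.0-0-dbg perl-debug

# record a profile while you reproduce the load, then inspect it
sudo perf record -g -- /usr/sbin/syslog-ng -F
sudo perf report
</pre>
<br>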
</div>
HTH<br>
Bazsi<br>
<br>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Wed, Oct 29, 2014 at 2:50 PM, Jim
Hendrick <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:jrhendri@roadrunner.com" target="_blank">jrhendri@roadrunner.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">Thank you
sir!<br>
<br>
At least this is not unique to my testing (not sure that's
actually<br>
*good* news :-)<br>
<br>
I will try to reproduce some comparable baselines using a
couple of setups<br>
I have tried:<br>
<br>
1) proxy-syslog --> syslog-ng --> redis -->
logstash+grok --> logstash<br>
--> elasticsearch<br>
This was essentially following a basic set of
instructions just to<br>
make sure I could reproduce them.<br>
<br>
2) proxy-syslog --> syslog-ng+patterndb+format-json
--> redis --><br>
logstash --> elasticsearch<br>
This moved the pattern matching and conversion to json
out to the<br>
edge, leaving redis & logstash since they worked well at
feeding<br>
elasticsearch.<br>
<br>
3) proxy-syslog --> syslog-ng+patterndb+Elasticsearch.pm
--> elasticsearch<br>
This seemed the simplest & most promising.<br>
<br>
I have not tried all three with the same load, so I cannot
definitively say one is better, but my subjective feel is that #3 was
actually the slowest. I suspect something in the way the data is being
sent to elasticsearch, but I do not know whether it is an issue with
the perl module itself or somehow with how the data is being indexed
on the elasticsearch side (indexing, etc.)<br>
<br>
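One thing I can do is micro-benchmark the CPAN client on its own,
outside syslog-ng (along the lines of Fabien's item 1 below) -
roughly something like this, with an arbitrary index name,
document shape and batch size:<br>
<pre>
#!/usr/bin/perl
# standalone throughput test for Search::Elasticsearch (all values arbitrary)
use strict;
use warnings;
use Time::HiRes qw(time);
use Search::Elasticsearch;

# listing several nodes here would also exercise the client's round-robin load-balancing
my $es   = Search::Elasticsearch->new( nodes => ['localhost:9200'] );
my $bulk = $es->bulk_helper( index => 'bench', type => 'logs', max_count => 500 );

my $n  = 50_000;
my $t0 = time;
for my $i ( 1 .. $n ) {
    $bulk->index( { source => { HOST => 'bench', PROGRAM => 'test', MESSAGE => "message $i" } } );
}
$bulk->flush;
printf "%d docs in %.1fs (~%.0f docs/sec)\n", $n, time - $t0, $n / ( time - $t0 );
</pre>
<br>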
My overall thought is (still) that parsing at each syslog-ng
server with<br>
no middleman should be fastest, since as you scale to more
syslog-ng<br>
servers you are distributing the pattern matching load.<br>
<br>
I am still not sure whether a broker (redis, rabbitmq, etc.) will
help as long as elasticsearch can accept the data fast enough.<br>
<br>
Thanks for the feedback - I will certainly post whatever I
come up with<br>
in the next day or so.<br>
<span class="HOEnZb"><font color="#888888"><br>
Jim<br>
</font></span>
<div class="HOEnZb">
<div class="h5"><br>
<br>
<br>
On 10/29/2014 09:29 AM, Fabien Wernli wrote:<br>
> Hi Jim,<br>
><br>
> On Tue, Oct 28, 2014 at 04:36:19PM -0400, Jim
Hendrick wrote:<br>
>> Now the issue is performance. I am sending
roughly ~5000 EPS to the<br>
>> syslog-ng instance running patterndb, but only
able to "sustain" less than<br>
>> 1000 to elasticsearch (oddly, ES seems to start
receiving at ~5000 EPS, and<br>
>> within an hour or less, drops to ~1000)<br>
> I've got a similar workload, and I'm seeing drops too.<br>
> When EPS is below 2k/s, usually syslog-ng copes.
When it goes above, I can<br>
> see drops. Enabling flow-control seems to help from
the syslog-ng<br>
> perspective (no drops in `syslog-ng-ctl stats`) but
when I look at protocol<br>
> counters in the Linux kernel, the drops can be seen
as "InErrors" (I'm using<br>
> UDP). I'm a little lost when trying to interpret
the effect of syslog-ng<br>
> tuneables.<br>
><br>
>> I have tried a number of things, including
running a second ES node and<br>
>> letting syslog-ng "round robin" with no luck at
all.<br>
> We're doing that by specifying the `nodes` key in
Elasticsearch.pm:<br>
> according to its documentation [1] this should
ensure Search::Elasticsearch<br>
> makes use of load-balancing. This seems to work as
intended when checking<br>
> the bandwidth between syslog-ng and all ES nodes.<br>
><br>
> When looking at the statistics of my nodes, they
seem to be hitting no<br>
> bottleneck whatsoever:<br>
><br>
> * load is between 0 and 2 (8 cores total)<br>
> * writes average around 50/s with peaks around 150
(6+P RAID 10k SAS)<br>
> * reads are ridiculous<br>
> * heap usage is around 75% (of 24g)<br>
> * interface rx ~500k/s<br>
> * elasticsearch index rate ~500/s<br>
><br>
>> ES tuning has included locking 16G of memory
per ES instance, and setting<br>
>> indices.memory.index_buffer_size: 50%<br>
> We're using 'index_buffer_size: 30%' and
'ES_HEAP_SIZE=24g' on our 6 ES<br>
> nodes. max_size is 256 in
syslog-ng/Elasticsearch.pm<br>
><br>
> What we're currently doing (or planning) in order to
investigate:<br>
><br>
> 1. micro-benchmark the CPAN module to see if we can
go above 2k/s<br>
> 2. improve the statistics gathered by
collectd-elasticsearch [2]<br>
> 3. write a dummy ES server which only does some<br>
> accounting but throws data away, in order to do
some benchmarking.<br>
> 4. compare python, lua and perl implementations<br>
> 5. tune various syslog-ng parameters<br>
> 6. use some MQ implementation between ES and
syslog-ng<br>
> 7. use TCP instead of UDP for incoming syslog<br>
><br>
> I realize this won't help you much, but may be of
interest so we can channel<br>
> our common research. I'll be meeting with some
syslog-ng experts very soon,<br>
> and I am convinced I'll come back with many options
to improve the<br>
> situation.<br>
><br>
> Cheers<br>
><br>
> [1] <a moz-do-not-send="true"
href="http://search.cpan.org/%7Edrtech/Search-Elasticsearch-1.14/lib/Search/Elasticsearch.pm#nodes"
target="_blank">http://search.cpan.org/~drtech/Search-Elasticsearch-1.14/lib/Search/Elasticsearch.pm#nodes</a><br>
> [2] <a moz-do-not-send="true"
href="https://github.com/phobos182/collectd-elasticsearch"
target="_blank">https://github.com/phobos182/collectd-elasticsearch</a><br>
><br>
><br>
<br>
<br>
</div>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<br>
-- <br>
<div class="gmail_signature">Bazsi</div>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">______________________________________________________________________________
Member info: <a class="moz-txt-link-freetext" href="https://lists.balabit.hu/mailman/listinfo/syslog-ng">https://lists.balabit.hu/mailman/listinfo/syslog-ng</a>
Documentation: <a class="moz-txt-link-freetext" href="http://www.balabit.com/support/documentation/?product=syslog-ng">http://www.balabit.com/support/documentation/?product=syslog-ng</a>
FAQ: <a class="moz-txt-link-freetext" href="http://www.balabit.com/wiki/syslog-ng-faq">http://www.balabit.com/wiki/syslog-ng-faq</a>
</pre>
</blockquote>
<br>
</body>
</html>