[syslog-ng] Elasticsearch destination
Jim Hendrick
jrhendri at roadrunner.com
Fri Oct 31 12:05:47 CET 2014
Thanks! I will look into setting that up (hopefully today, but it may be
the first of next week).
Yesterday I was able to get ~4k/sec with format-json and a redis
destination, using logstash between redis and elasticsearch. In that
case, logstash was pretty clearly the bottleneck, since I was pushing
consistently ~4000-4500 through syslog-ng, but only ~3800-4000 were
making it to elasticsearch. I saw this most clearly when I shut down
syslog-ng and it took the rest of the system several minutes to process
what was cached in redis.
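For reference, the redis leg looked something like this (an illustrative
sketch, not my exact config; the host, list key, and value-pairs scopes are
placeholders, and it assumes the incubator redis() destination driver):

    destination d_redis {
        redis(
            host("127.0.0.1")
            port(6379)
            command("RPUSH" "syslog" "$(format-json --scope selected-macros --scope nv-pairs)")
        );
    };

logstash then reads the same list with its redis input and forwards to
elasticsearch.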
I am using Ubuntu, but within a corporate network (these are lab systems,
but getting modules, etc. installed is not always trivial).
Let me see if I can set up the profiling.
(and as far as experience - I am *very* new to the ELK pieces - learning
as I go. It is still quite possible I can do some major tuning in that
area. That is one of the reasons I am trying to have syslog-ng do as
much as possible so I can remove "L" and only use "E" and "K" :-)
Thanks again all!
Jim
On 10/30/2014 11:55 PM, Balazs Scheidler wrote:
> Hi,
>
> If the 3rd option is the slowest, then this seems to be related to the
> syslog-ng perl module or Elasticsearch.pm.
>
> I've just checked: the syslog-ng perl module does a value-pairs
> evaluation and sends the results to the perl function as a hash. This
> is not the speediest approach (it'd be better to export the underlying
> C structure to Perl as an object), but it should still cope with much
> more than 2k/sec.
>
> I'm wondering what I could do to help. I'm following this thread, but
> as I lack the ES experience I don't have the same environment that you do.
>
> If you could use some kind of profiling (like perf for instance) and
> had the associated debug symbols in at least syslog-ng (and preferably
> also in perl), we should easily pinpoint the issue. Setting up perf
> and symbols is easy if your distro supports it, but is a big hassle if
> it doesn't.
>
> My experience with perf is on Ubuntu, but I heard it's better in
> Fedora. Which distro are you using?
>
> This is the outline what you'd have to do in order to perform profiling:
> - don't strip syslog-ng (or syslog-ng-incubator) after compilation,
> and use -g in CFLAGS; syslog-ng's own build script doesn't strip
> binaries, but .rpm/.deb packaging usually does
> - you can verify this by running: file <path-to-binary>
>
> - install symbols for syslog-ng's dependencies (these are the dbgsym
> packages in Ubuntu,
> https://wiki.ubuntu.com/DebuggingProgramCrash#Debug_Symbol_Packages)
> - run: perf record -g -- <syslog-ng command line>
> - reproduce the load
> - run: perf report
>
> You'll see which parts use the most CPU in the result. Or you can
> send it here for analysis.
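To make the outline concrete, here is a sketch of the commands involved
(paths are placeholders; `perf` ships in the distro's linux-tools packages,
and the symbol check works on any ELF binary):

```shell
# Check whether a binary still carries debug symbols, i.e. is "not stripped".
# BIN is a placeholder -- point it at your syslog-ng binary
# (e.g. /usr/sbin/syslog-ng); it defaults to sh here just for illustration.
BIN=${BIN:-$(command -v sh)}
if file "$BIN" | grep -q 'not stripped'; then
    echo "symbols present: ready for perf"
else
    echo "stripped: rebuild with -g or install the dbgsym packages"
fi

# With symbols in place, profile while reproducing the load:
#   sudo perf record -g -- /usr/sbin/syslog-ng -F -f /etc/syslog-ng/syslog-ng.conf
#   ... generate traffic, then stop syslog-ng ...
#   sudo perf report
```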
>
> HTH
> Bazsi
>
>
> On Wed, Oct 29, 2014 at 2:50 PM, Jim Hendrick
> <jrhendri at roadrunner.com> wrote:
>
> Thank you sir!
>
> At least this is not unique to my testing (not sure that's actually
> *good* news :-)
>
> I will try and reproduce some comparable baselines using a couple of
> setups I have tried:
>
> 1) proxy-syslog --> syslog-ng --> redis --> logstash+grok --> logstash
> --> elasticsearch
> This was essentially following a basic set of instructions just to
> make sure I could reproduce them.
>
> 2) proxy-syslog --> syslog-ng+patterndb+format-json --> redis -->
> logstash --> elasticsearch
> This moved the pattern matching and conversion to json out to the
> edge, leaving redis & logstash since they worked well at feeding
> elasticsearch.
>
> 3) proxy-syslog --> syslog-ng+patterndb+Elasticsearch.pm -->
> elasticsearch
> This seemed the simplest & most promising.
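>
> For concreteness, the patterndb piece in setups 2 and 3 is just a
> db-parser in the log path, roughly like this (an illustrative sketch;
> file paths and statement names are placeholders):
>
>     parser p_patterndb {
>         db-parser(file("/etc/syslog-ng/patterndb.xml"));
>     };
>
>     log {
>         source(s_net);
>         parser(p_patterndb);
>         destination(d_redis);  # d_es / Elasticsearch.pm in setup 3
>     };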
>
> I have not tried all three with the same load, so I cannot
> definitively say one is better, but my subjective feel is that #3 was
> actually the slowest. I suspect something about the way the data is
> being sent to elasticsearch, but I do not know whether the issue is
> in the perl module itself or in how elasticsearch ingests it
> (indexing, etc.)
>
> My overall thought is (still) that parsing at each syslog-ng server
> with no middleman should be fastest, since as you scale to more
> syslog-ng servers you are distributing the pattern-matching load.
>
> I am still not sure if a broker (redis, rabbitmq, etc.) will help as
> long as elasticsearch can accept the data fast enough.
>
> Thanks for the feedback - I will certainly post whatever I come up
> with
> in the next day or so.
>
> Jim
>
>
>
> On 10/29/2014 09:29 AM, Fabien Wernli wrote:
> > Hi Jim,
> >
> > On Tue, Oct 28, 2014 at 04:36:19PM -0400, Jim Hendrick wrote:
> >> Now the issue is performance. I am sending roughly ~5000 EPS to
> >> the syslog-ng instance running patterndb, but only able to
> >> "sustain" less than 1000 to elasticsearch (oddly, ES seems to
> >> start receiving at ~5000 EPS, and within an hour or less, drops
> >> to ~1000)
> > I've got a similar workload, and I'm seeing drops too.
> > When EPS is below 2k/s, syslog-ng usually copes. When it goes above,
> > I can see drops. Enabling flow-control seems to help from the
> > syslog-ng perspective (no drops in `syslog-ng-ctl stats`), but when
> > I look at protocol counters in the Linux kernel, the drops show up
> > as "InErrors" (I'm using UDP). I'm a little lost when trying to
> > interpret the effect of syslog-ng tunables.
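For what it's worth, those kernel-side counters are easy to watch directly;
this sketch prints the Udp section of /proc/net/snmp as name=value pairs
(Linux-only; the first Udp: line carries the field names, the second the
values):

```shell
# Print the kernel's UDP counters (InDatagrams, InErrors, RcvbufErrors, ...)
# as name=value pairs. InErrors/RcvbufErrors rising under load is the
# signature of receive-buffer drops.
awk '/^Udp:/ {
    if (!have_hdr) { n = split($0, hdr); have_hdr = 1 }
    else { split($0, val); for (i = 2; i <= n; i++) printf "%s=%s\n", hdr[i], val[i] }
}' /proc/net/snmp
```

If InErrors climbs while syslog-ng itself reports no drops, the usual first
knob is a bigger receive buffer: so-rcvbuf() on the udp() source, with
net.core.rmem_max raised to match.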
> >
> >> I have tried a number of things, including running a second ES
> >> node and letting syslog-ng "round robin" with no luck at all.
> > We're doing that by specifying the `nodes` key in Elasticsearch.pm:
> > according to its documentation [1] this should ensure
> > Search::Elasticsearch makes use of load-balancing. This seems to
> > work as intended, when checking the bandwidth between syslog-ng and
> > all ES nodes.
> >
> > When looking at the statistics of my nodes, they seem to be hitting
> > no bottleneck whatsoever:
> >
> > * load is between 0 and 2 (8 cores total)
> > * writes average around 50/s with peaks around 150 (6+P RAID, 10k SAS)
> > * reads are ridiculously low
> > * heap usage is around 75% (of 24g)
> > * interface rx ~500k/s
> > * elasticsearch index rate ~500/s
> >
> >> ES tuning has included locking 16G of memory per ES instance, and
> >> setting indices.memory.index_buffer_size: 50%
> > We're using 'index_buffer_size: 30%' and 'ES_HEAP_SIZE=24g' on our
> > 6 ES nodes. max_size is 256 in syslog-ng/Elasticsearch.pm
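> >
> > For reference, on the ES 1.x series those settings live in
> > elasticsearch.yml and the service defaults, roughly like this (a
> > sketch of the shape, not a recommendation; sizes must match your
> > heap and memlock limits):
> >
> >     # /etc/elasticsearch/elasticsearch.yml
> >     indices.memory.index_buffer_size: 30%
> >     bootstrap.mlockall: true    # lock the heap in RAM
> >
> >     # /etc/default/elasticsearch
> >     # ES_HEAP_SIZE=24g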
> >
> > What we're currently doing (or planning) to try to investigate:
> >
> > 1. micro-benchmark the CPAN module to see if we can go above 2k/s
> > 2. improve the statistics gathered by collectd-elasticsearch [2]
> > 3. write a dummy ES server which only does some accounting but
> >    throws data away, in order to do some benchmarking
> > 4. compare python, lua and perl implementations
> > 5. tune various syslog-ng parameters
> > 6. use some MQ implementation between ES and syslog-ng
> > 7. use TCP instead of UDP for incoming syslog
> >
> > I realize this won't help you much, but it may be of interest so we
> > can channel our common research. I'll be meeting with some syslog-ng
> > experts very soon, and I am convinced I'll come back with many
> > options to improve the situation.
> >
> > Cheers
> >
> > [1] http://search.cpan.org/~drtech/Search-Elasticsearch-1.14/lib/Search/Elasticsearch.pm#nodes
> > [2] https://github.com/phobos182/collectd-elasticsearch
> >
> >
> ______________________________________________________________________________
> > Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
> > Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
> > FAQ: http://www.balabit.com/wiki/syslog-ng-faq
> >
> >
>
>
>
>
>
> --
> Bazsi
>
>