Hi Jim,

On Tue, Oct 28, 2014 at 04:36:19PM -0400, Jim Hendrick wrote:
Now the issue is performance. I am sending roughly ~5000 EPS to the syslog-ng instance running patterndb, but only able to "sustain" less than 1000 to elasticsearch (oddly, ES seems to start receiving at ~5000 EPS, and within an hour or less, drops to ~1000)
I've got a similar workload and am seeing drops too. Below roughly 2k EPS, syslog-ng usually copes; above that, I see drops. Enabling flow-control seems to help from the syslog-ng perspective (no drops in `syslog-ng-ctl stats`), but when I look at the protocol counters in the Linux kernel, the drops show up as "InErrors" (I'm using UDP). I'm a little lost when trying to interpret the effect of the syslog-ng tunables.
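To make the kernel-side drops easier to follow over time, the UDP counters can be sampled and diffed. The following is only a minimal Python sketch, assuming the usual Linux layout of /proc/net/snmp (a "Udp:" header line followed by a "Udp:" values line). The function names are made up for illustration, and the RcvbufErrors column may not exist on every kernel.

```python
def parse_udp_counters(snmp_text):
    """Return the Udp counters found in /proc/net/snmp content as a dict."""
    # The file holds pairs of lines per protocol: names first, values second.
    udp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Udp:")]
    if len(udp_lines) < 2:
        raise ValueError("no Udp section found")
    header, values = udp_lines[0], udp_lines[1]
    return {name: int(value) for name, value in zip(header[1:], values[1:])}


def udp_counter_delta(before, after):
    """Difference between two samples, for counters present in both."""
    return {key: after[key] - before[key] for key in after if key in before}


def read_udp_counters(path="/proc/net/snmp"):
    """Sample the live counters (Linux only)."""
    with open(path) as handle:
        return parse_udp_counters(handle.read())
```

Taking one sample with `read_udp_counters()`, waiting a minute, taking another and passing both to `udp_counter_delta()` gives the number of datagrams received and the number counted as InErrors during that interval, which can then be compared with what `syslog-ng-ctl stats` reports.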
I have tried a number of things, including running a second ES node and letting syslog-ng "round robin" with no luck at all.
We're doing that by specifying the `nodes` key in Elasticsearch.pm: according to its documentation [1], this should ensure Search::Elasticsearch makes use of load-balancing. This seems to work as intended, judging by the bandwidth between syslog-ng and all the ES nodes.

Looking at the statistics of my nodes, they seem to be hitting no bottleneck whatsoever:

* load is between 0 and 2 (8 cores total)
* writes average around 50/s with peaks around 150 (6+P RAID 10k SAS)
* reads are ridiculously low
* heap usage is around 75% (of 24g)
* interface rx ~500k/s
* elasticsearch index rate ~500/s
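For readers unfamiliar with what listing several nodes buys you: the general idea is that the client rotates requests over the configured nodes and skips any it considers dead. The Python sketch below only illustrates that idea; it is not how Search::Elasticsearch is implemented, and the class name and host names are hypothetical.

```python
from itertools import cycle


class RoundRobinPool:
    """Hypothetical helper: hand out node addresses in rotation."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes = list(nodes)
        self._dead = set()
        self._ring = cycle(self._nodes)

    def mark_dead(self, node):
        """Exclude a node from rotation, e.g. after a failed request."""
        self._dead.add(node)

    def mark_live(self, node):
        """Put a node back into rotation."""
        self._dead.discard(node)

    def next_node(self):
        """Return the next live node, trying each node at most once."""
        for _ in range(len(self._nodes)):
            node = next(self._ring)
            if node not in self._dead:
                return node
        raise RuntimeError("no live nodes")
```

With three nodes, successive calls to `next_node()` return them in order and wrap around, so each node should see roughly a third of the bulk requests as long as all of them stay live.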
ES tuning has included locking 16G of memory per ES instance, and setting indices.memory.index_buffer_size: 50%
We're using 'index_buffer_size: 30%' and 'ES_HEAP_SIZE=24g' on our 6 ES nodes. max_size is 256 in syslog-ng/Elasticsearch.pm.

What we're currently doing (or planning) to try to investigate:

1. micro-benchmark the CPAN module to see if we can go above 2k/s
2. improve the statistics gathered by collectd-elasticsearch [2]
3. write a dummy ES server which only does some accounting but throws data away, in order to do some benchmarking
4. compare python, lua and perl implementations
5. tune various syslog-ng parameters
6. use some MQ implementation between ES and syslog-ng
7. use TCP instead of UDP for incoming syslog

I realize this won't help you much, but it may be of interest so we can channel our common research. I'll be meeting with some syslog-ng experts very soon, and I am convinced I'll come back with many options to improve the situation.

Cheers

[1] http://search.cpan.org/~drtech/Search-Elasticsearch-1.14/lib/Search/Elastics...
[2] https://github.com/phobos182/collectd-elasticsearch
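
As an illustration of item 3 in the list above, a throwaway endpoint that counts bulk documents and discards them could look roughly like this Python sketch. It rests on several assumptions: the client sends newline-delimited bulk bodies made only of index/create actions (one action line plus one source line per document), and it is satisfied with a minimal JSON reply. A real client may expect per-item results in "items", or may issue GET/HEAD requests to check nodes, neither of which this sketch handles. All names here are hypothetical.

```python
import json
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


class Stats:
    """Thread-safe accounting of what the dummy server has received."""

    def __init__(self):
        self.lock = threading.Lock()
        self.requests = 0
        self.docs = 0
        self.bytes = 0
        self.started = time.time()

    def record(self, docs, size):
        with self.lock:
            self.requests += 1
            self.docs += docs
            self.bytes += size


def count_bulk_docs(body):
    """Count documents in a bulk body, assuming two lines per document."""
    lines = [line for line in body.split(b"\n") if line.strip()]
    return len(lines) // 2


def make_handler(stats):
    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)  # read the data, then drop it
            stats.record(count_bulk_docs(body), length)
            payload = json.dumps({"took": 0, "errors": False, "items": []}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        do_PUT = do_POST

        def log_message(self, *args):
            pass  # keep per-request logging out of the benchmark

    return Handler


def serve(port=9200):
    """Run until interrupted, then print the average document rate."""
    stats = Stats()
    server = HTTPServer(("127.0.0.1", port), make_handler(stats))
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        elapsed = time.time() - stats.started
        print("%d docs in %d requests, %.0f docs/s"
              % (stats.docs, stats.requests, stats.docs / elapsed))
```

Calling `serve()` and pointing the syslog-ng destination at 127.0.0.1:9200 would then show how fast the sending side can go when the receiving end does almost no work, which helps separate client-side limits from ES-side ones.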