Elasticsearch destination
Hi,

We are already using the open source version of syslog-ng, and I am about to set up some Elasticsearch instances. I would much prefer to feed data directly from syslog-ng rather than go through logstash (I already have a heap of patterndb parsers, and performance should be way better!).

I have spent an hour or so with Google and have found various references to an Elasticsearch destination being available, but I can find no mention of it in the release notes for 3.6.1. I have also downloaded the tarball and unpacked it, but could not find any evidence of the module, nor is there any mention of it in the manual.

As of now, what is the recommended way of getting parsed data from open source syslog-ng into ES?

Thanks, Russell
Hi Russell,

First of all - I'm glad to see more of us working on this. Now:

- There are a couple of options in the syslog-ng-incubator that provide Elasticsearch destinations using Perl, Python and Lua scripts. I have done some basic testing, and it looks like the Lua one has more features, but I am having library issues with it, so I may try the Perl module and add some of those features (e.g. template() is missing in the current Elasticsearch.pm, so using that with format-json seems out of the question at the moment).

- However, with syslog-ng OSE built with redis and JSON support, it is easily possible to do this: syslog-ng (using patterndb & format-json) => redis => logstash (with no pattern matching) => elasticsearch. You still have logstash (and all its Java wonderfulness) in the middle, but it is a pretty minimal configuration just for the convenience of linking redis and elasticsearch, and it seems to run pretty well. So far, on a single 32 GB RAM, 8-CPU box running all the pieces, I top out around 5000 events per second (EPS) before elasticsearch has performance issues. I am pretty confident that if I split this out into shards and ran multiple machines it would be my best "production" bet right now. (I set a 4 GB limit for elasticsearch and have it lock the memory.)

- Clearly there is also the option of using a program destination and letting something external feed it to elasticsearch.

Please let me know how you proceed, and let's see if we can figure out a decent architecture for this "stack".

Thanks! Jim

On 10/22/2014 07:17 PM, Russell Fulton wrote:
______________________________________________________________________________ Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng FAQ: http://www.balabit.com/wiki/syslog-ng-faq
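The redis-to-elasticsearch leg that logstash covers in Jim's pipeline above can be sketched in a few lines. This is a hypothetical stand-in, not anything from the thread: the `build_bulk_body` helper, index name, and endpoint URL are illustrative, shown mainly to clarify what that "minimal logstash configuration" is doing in the middle.

```python
import json

def build_bulk_body(events, index="logstash"):
    """Frame JSON event strings as an Elasticsearch _bulk request body:
    one action line followed by one source line per event."""
    lines = []
    for ev in events:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(ev.strip())
    return "\n".join(lines) + "\n"

# The shuttle loop itself would pop from the list syslog-ng LPUSHes to
# and POST the framed body to ES (redis-py and requests assumed):
#
#   r = redis.StrictRedis("localhost")
#   while True:
#       _, ev = r.brpop("logstash")
#       requests.post("http://localhost:9200/_bulk",
#                     data=build_bulk_body([ev.decode()]))
```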
We are putting up this stack as well and will report on our success. My concern is that you imply Elasticsearch will have performance issues on a single node around 5000 EPS. Is that really what you are saying? On our previous kick at this can, we ran into a performance limit around 4000 EPS, but I thought it was logstash. Now that we are removing that, I was hoping for a lot more performance. We were hoping to get to a total solution of around 100,000 EPS using 4 or 5 elasticsearch nodes.

I'll let you know how we make out. Evan.

On 10/22/2014 06:28 PM, Jim Hendrick wrote:
Hi,

On Wed, Oct 22, 2014 at 09:28:23PM -0400, Jim Hendrick wrote:
First of all - I'm glad to see more of us working on this.
I second that. We should have a common repository to share our efforts; as I know the incubator team is very busy, we could also help them make the right decisions.
scripts. I have done some basic testing and it looks like the Lua one has more features, but I am having library issues with it so I may try to use the Perl module and try to add some of these features (e.g. template() is missing in the current Elasticsearch.pm so using that to format-json seems out of the question at the moment)
If you're referring to my implementation [1], the reason template() is missing is that you don't actually need it: the perl module passes a perl structure with all the key-values from `scope()` to the queue callback.

As for performance, I start to get drops at around 5k/s, and I have a 6-node ES cluster with pretty decent hardware. I suspect the bottleneck to be my syslog-ng server, which is a virtual machine.

My opinions/findings so far:

1) the lua destination is very nice, but lua IMHO lacks a decent Elasticsearch lib, and you have to format name-value pairs as json
2) the perl dest is nice as it gets the name-value pairs natively as perl structures, and CPAN has an awesome ES module [2]; we're using it in production
3) python seems great too, and python has, from what I hear, a nice ES module; it also gets the name-value pairs as a python dictionary; it would be great if someone could test it
4) the last "official" option is using the SCL block from the incubator, which is basically a shell program destination, so I didn't even consider it, for obvious performance reasons
5) other upcoming option: a java destination is in the works (which would obviously benefit from ES' native libs)

Admittedly ES already takes json as input, so whether it's the destination handling the serialization or syslog-ng's json parser is probably not much of an issue, as long as it doesn't need to be munged in your destination code.

Cheers

[1] https://github.com/faxm0dem/syslog_ng-elasticsearch
[2] http://search.cpan.org/~drtech/Search-Elasticsearch
Hi Fabien,

Correct - I am trying your Perl module. What I would like to do is:

1) have the syslog-ng servers run patterndb to parse different log types (that makes the pattern matching scale over multiple servers)
2) send directly to an ES cluster

I was thinking maybe a "broker" like redis or RabbitMQ might add buffering for performance, but I was hoping it would not be necessary. What I have working is this:

destination d_redis {
  redis (
    host("localhost")
    command("LPUSH", "logstash",
      "$(format-json
          proxy_time=${PROXY.TIME}
          proxy_s_ip=${PROXY.S_IP}
          proxy_c_ip=${PROXY.C_IP}
          proxy_cs_mthd=${PROXY.CS_METHOD}
          proxy_s_action=${PROXY.S_ACTION}
          proxy_cs_host=${PROXY.CS_HOST}
          proxy_cs_uri_port=${PROXY.CS_URI_PORT}
          proxy_cs_username=${PROXY.CS_USERNAME}
          proxy_user_agent=${PROXY.USER_AGENT}
          proxy_cs_categories=${PROXY.CS_CATEGORIES})\n")
  );
};

with logstash simply pulling from redis and feeding ES.

Are you saying I would not need the format-json bit? If so, how would I select/name the desired fields that were parsed with patterndb?

As far as overall performance, I really think it is a combination of disk I/O and memory starvation. I see a spike in "majflt/s" around the time the performance goes down, hitting around 100-200. I also see a *lot* of reads *and* writes, which could be the paging...

Anyway, I think I could scale (out) the ES across multiple nodes once I get the syslog-ng indexing-to-json part working well. Could you help me grok how to specify the fields to your Perl mod? (Otherwise I might have to read the source :-( )

Thanks!! Jim

---- Fabien Wernli <wernli@in2p3.fr> wrote:
On Thu, Oct 23, 2014 at 10:50:55AM -0400, jrhendri@roadrunner.com wrote:
Are you saying I would not need to use the format-json bit? If so - how would I select/name the desired fields that were parsed with patterndb?
By simply passing `scope` to the destination block [1]. I also use a special `exclude` [2] parameter that lets me further drop unwanted name-values.
As far as overall performance - I really think it is a combination of disk I/O and memory starvation.
I'm using collectd, riemann and riemann-dash to monitor syslog-ng and ES performance live.

[1] https://github.com/faxm0dem/syslog_ng-elasticsearch/blob/master/perl/syslog-...
[2] https://github.com/faxm0dem/syslog_ng-elasticsearch/blob/master/perl/plugin....
Perfect! I should have seen it before. When I was sending logs and not seeing anything in Kibana, I thought something was wrong (I even captured the packets, and they showed the whole message actually being sent). Today I finally noticed I was using the "logstash" Kibana dashboard; when I switched to the generic one, all the parsed data was there.

I still need to get it built on a more production-ready system, but I'm sure that will go OK once I spend some time on it.

Thanks! Jim

On 10/23/2014 11:03 AM, Fabien Wernli wrote:
OK - latest update and a request for recommendations (help, actually :-)

I have syslog-ng receiving logs from the network and parsing them with patterndb, using fairly complex parsing consisting of 5 patterns that parse Bluecoat proxy logs into their respective fields. Here is one example:

<pattern>@STRING:PROXY.TIME::@ @NUMBER:PROXY.TIME_TAKEN@ @IPv4:PROXY.C_IP@ @NUMBER:PROXY.SC_STATUS@ @STRING:PROXY.S_ACTION:_@ @NUMBER:PROXY.SC_BYTES@ @NUMBER:PROXY.CS_BYTES@ @STRING:PROXY.CS_METHOD@ @STRING:PROXY.CS_URI_SCHEME:-@ @STRING:PROXY.CS_HOST:_-.@ @NUMBER:PROXY.CS_URI_PORT:-@ @ESTRING:PROXY.CS_URI_PATH: @@ESTRING:PROXY.CS_URI_EQUERY: @@STRING:PROXY.CS_USERNAME:-$@ @STRING:PROXY.CS_AUTH__GROUP:-_@ @STRING:PROXY.S_SUPPLIER_NAME:_-.@ @ESTRING:PROXY.CONTENT_TYPE: @@ESTRING:PROXY.REFERRER: @@QSTRING:PROXY.USER_AGENT:"@ @ESTRING:PROXY.FILTER_RESULT: @@QSTRING:PROXY.CS_CATEGORIES:"@ @STRING:PROXY.X_VIRUS_ID:-@ @IPv4:PROXY.S_IP@</pattern>

This is using the Perl Search::Elasticsearch module running on syslog-ng 3.5.6 with the incubator adding mod-perl support. It is being sent to elasticsearch, and I can build basic Kibana dashboards to start analyzing the logs. So far so good.

Now the issue is performance. I am sending roughly ~5000 EPS to the syslog-ng instance running patterndb, but I am only able to "sustain" less than 1000 to elasticsearch (oddly, ES seems to start receiving at ~5000 EPS and, within an hour or less, drops to ~1000).

I have tried a number of things, including running a second ES node and letting syslog-ng "round robin", with no luck at all. ES tuning has included locking 16G of memory per ES instance and setting indices.memory.index_buffer_size: 50%. syslog-ng tuning was limited to setting "threaded(yes)", but since I only have a single source and destination, I didn't expect much from this. I *did* notice that when I increased max_count from 256 to 1024, syslog-ng memory usage dropped dramatically (it had been around 8GB and has now been holding around 100M!), but the overall performance has not improved (much).

I feel like I must not be looking in the right area, since syslog-ng stats show a huge drop rate (60-80%!!), and also the "network" source shows absolutely nothing (zero):

SourceName;SourceId;SourceInstance;State;Type;Number
source;s_network;;a;processed;0
center;;received;a;processed;5
destination;d_elasticsearch;;a;processed;633968
src.internal;s_local#2;;a;processed;5
src.internal;s_local#2;;a;stamp;1414527594
center;;queued;a;processed;633988
dst.none;d_elasticsearch#0;perl,/usr/local/share/include/scl/es-perl/Elasticsearch.pm,SyslogNG::Elasticsearch::init,SyslogNG::Elasticsearch::queue_daily,SyslogNG::Elasticsearch::deinit;a;dropped;511475
dst.none;d_elasticsearch#0;perl,/usr/local/share/include/scl/es-perl/Elasticsearch.pm,SyslogNG::Elasticsearch::init,SyslogNG::Elasticsearch::queue_daily,SyslogNG::Elasticsearch::deinit;a;stored;10000
src.none;;;a;processed;0
src.none;;;a;stamp;0
global;payload_reallocs;;a;processed;870904
global;sdata_updates;;a;processed;0
destination;d_local;;a;processed;20
global;msg_clones;;a;processed;0
source;s_local;;a;processed;5

Any help on where to look next would be greatly appreciated!! Thanks all. Jim
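One plausible way to read a dump like the one above is to total the `dropped` counters against what the destinations processed; the field layout follows the header line (SourceName;SourceId;SourceInstance;State;Type;Number). A quick sketch, not part of Jim's setup:

```python
def drop_rate(stats_text):
    """Rough drop ratio from `syslog-ng-ctl stats` CSV output:
    total 'dropped' counters over the destination 'processed' counters."""
    dropped = processed = 0
    for line in stats_text.strip().splitlines():
        parts = line.strip().split(";")
        if len(parts) != 6 or not parts[5].isdigit():
            continue  # skip the header and malformed lines
        name, _sid, _inst, _state, typ, num = parts
        if typ == "dropped":
            dropped += int(num)
        elif typ == "processed" and name == "destination":
            processed += int(num)
    return dropped / processed if processed else 0.0
```

Against the numbers above (511475 dropped vs. 633968 processed by d_elasticsearch), this gives roughly 0.81, consistent with the 60-80% drop rate Jim reports.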
Hi Jim,

On Tue, Oct 28, 2014 at 04:36:19PM -0400, Jim Hendrick wrote:
Now the issue is performance. I am sending roughly ~5000 EPS to the syslog-ng instance running patterndb, but only able to "sustain" less than 1000 to elasticsearch (oddly, ES seems to start receiving at ~5000 EPS, and within an hour or less, drops to ~1000)
I've got a similar workload and am seeing drops too. When EPS is below 2k/s, syslog-ng usually copes; when it goes above, I can see drops. Enabling flow-control seems to help from the syslog-ng perspective (no drops in `syslog-ng-ctl stats`), but when I look at protocol counters in the Linux kernel, the drops show up as "InErrors" (I'm using UDP). I'm a little lost when trying to interpret the effect of the syslog-ng tunables.
I have tried a number of things, including running a second ES node and letting syslog-ng "round robin" with no luck at all.
We're doing that by specifying the `nodes` key in Elasticsearch.pm: according to its documentation [1], this should ensure Search::Elasticsearch makes use of load-balancing. This seems to work as intended when checking the bandwidth between syslog-ng and all ES nodes. When looking at the statistics of my nodes, they seem to be hitting no bottleneck whatsoever:

* load is between 0 and 2 (8 cores total)
* writes average around 50/s with peaks around 150 (6+P RAID, 10k SAS)
* reads are ridiculously low
* heap usage is around 75% (of 24g)
* interface rx ~500k/s
* elasticsearch index rate ~500/s
ES tuning has included locking 16G of memory per ES instance, and setting indices.memory.index_buffer_size: 50%
We're using 'index_buffer_size: 30%' and 'ES_HEAP_SIZE=24g' on our 6 ES nodes; max_size is 256 in syslog-ng/Elasticsearch.pm.

What we're currently doing (or planning) to try to investigate:

1. micro-benchmark the CPAN module to see if we can go above 2k/s
2. improve the statistics gathered by collectd-elasticsearch [2]
3. write a dummy ES server which only does some accounting but throws data away, in order to do some benchmarking
4. compare the python, lua and perl implementations
5. tune various syslog-ng parameters
6. use some MQ implementation between ES and syslog-ng
7. use TCP instead of UDP for incoming syslog

I realize this won't help you much, but it may be of interest so we can channel our common research. I'll be meeting with some syslog-ng experts very soon, and I am convinced I'll come back with many options to improve the situation.

Cheers

[1] http://search.cpan.org/~drtech/Search-Elasticsearch-1.14/lib/Search/Elastics...
[2] https://github.com/phobos182/collectd-elasticsearch
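The dummy ES server (idea 3) can be prototyped in very little code. The sketch below is an assumption-laden stand-in using only the stdlib, with a simplified _bulk contract (real Elasticsearch returns per-item results; this just counts events and discards them):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def count_bulk_events(body: bytes) -> int:
    """A _bulk body alternates action and source lines, so the event
    count is half the number of non-empty lines."""
    lines = [ln for ln in body.split(b"\n") if ln.strip()]
    return len(lines) // 2

class DummyES(BaseHTTPRequestHandler):
    seen = 0  # running total of events accepted (and thrown away)

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        DummyES.seen += count_bulk_events(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps({"errors": False, "items": []}).encode())

    def log_message(self, *args):
        pass  # stay quiet under load

# to run it: HTTPServer(("127.0.0.1", 9200), DummyES).serve_forever()
```

Pointing the destination at this instead of a real cluster would separate feeder-side throughput from ES indexing pressure.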
Thank you sir! At least this is not unique to my testing (not sure that's actually *good* news :-)

I will try to reproduce some comparable baselines using a couple of setups I have tried:

1) proxy-syslog --> syslog-ng --> redis --> logstash+grok --> logstash --> elasticsearch
   This was essentially following a basic set of instructions, just to make sure I could reproduce them.

2) proxy-syslog --> syslog-ng+patterndb+format-json --> redis --> logstash --> elasticsearch
   This moved the pattern matching and conversion to json out to the edge, leaving redis & logstash since they worked well at feeding elasticsearch.

3) proxy-syslog --> syslog-ng+patterndb+Elasticsearch.pm --> elasticsearch
   This seemed the simplest & most promising.

I have not tried all three with the same load, so I cannot definitively say one is better, but my subjective feel is that #3 was actually the slowest. I suspect something in the way the data is being sent to elasticsearch, but I do not know whether it is an issue with the perl module itself or somehow in the way the data is being indexed, etc.

My overall thought is (still) that parsing at each syslog-ng server with no middleman should be fastest, since as you scale to more syslog-ng servers you are distributing the pattern-matching load. I am still not sure whether a broker (redis, rabbitmq, etc.) will help, as long as elasticsearch can accept the data fast enough.

Thanks for the feedback - I will certainly post whatever I come up with in the next day or so.

Jim

On 10/29/2014 09:29 AM, Fabien Wernli wrote:
Hi,

If the 3rd option is the slowest, then this seems to be related to the syslog-ng perl module or Elasticsearch.pm. I've just checked: the syslog-ng perl module does a value-pairs evaluation and sends the results to the perl function as a hash. This is not the speediest thing (it'd be better to export the underlying C structure as an object to Perl), but it should still cope with much more than 2k/sec.

I'm wondering what I could do to help. I'm following this thread, but as I lack the ES experience, I don't have the same environment that you do. If you could use some kind of profiling (like perf, for instance) and had the associated debug symbols in at least syslog-ng (and preferably also in perl), we should easily pinpoint the issue. Setting up perf and symbols is easy if your distro supports it, but is a big hassle if it doesn't. My experience with perf is on Ubuntu, but I heard it's better in Fedora. Which distro are you using?

This is the outline of what you'd have to do in order to perform profiling:

- don't strip syslog-ng (nor syslog-ng-incubator) after compilation, and use -g in CFLAGS; syslog-ng doesn't do this in its build script, but .rpm/.deb packaging usually does - you can verify this by running `file <path-to-binary>`
- install symbols for the syslog-ng dependencies (these are the dbgsym packages in Ubuntu: https://wiki.ubuntu.com/DebuggingProgramCrash#Debug_Symbol_Packages)
- run `perf record -g "syslog-ng command line"`
- reproduce the load
- run `perf report`

You'll see which parts use the most CPU in the result. Or you can send it here for analysis.

HTH
Bazsi

On Wed, Oct 29, 2014 at 2:50 PM, Jim Hendrick <jrhendri@roadrunner.com> wrote:
-- Bazsi
Hi,

On Fri, Oct 31, 2014 at 04:55:09AM +0100, Balazs Scheidler wrote:
I've just checked, the syslog-ng perl module does a value-pairs evaluation and sends the results to the perl function as a hash. This is not the speediest thing (it'd be better to export the underlying C structure as an object to Perl, but should still cope with much more than 2k/sec).
We just checked with algernon, and my perl destination can handle a lot more workload (15k/s) with a dummy queue function, so this really points to Search::Elasticsearch backpressure. Cheers
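The dummy-queue measurement generalizes into a tiny harness like the sketch below. This only illustrates the measurement pattern; the 15k/s figure above came from the actual Perl destination, and the message fields here are made up.

```python
import time

def events_per_second(callback, n=50000):
    """Drive a destination-style callback with a fixed parsed message
    and report the sustained rate."""
    msg = {"MESSAGE": "GET /index.html", "HOST": "proxy1", "PROGRAM": "bluecoat"}
    t0 = time.perf_counter()
    for _ in range(n):
        callback(msg)
    return n / (time.perf_counter() - t0)

# Comparing a no-op against the real indexing callback isolates the
# destination's own overhead from Elasticsearch backpressure.
baseline = events_per_second(lambda m: None)
```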
Thanks! I will look into setting that up (hopefully today, but it may be the first of next week).

Yesterday I was able to get ~4k/sec with format-json and a redis destination, using logstash between redis and elasticsearch. In that case, logstash was pretty clearly the bottleneck, since I was consistently pushing ~4000-4500 through syslog-ng, but only ~3800-4000 were making it to elasticsearch. I saw this most clearly when I shut down syslog-ng and it took the rest of the system several minutes to process what was cached in redis.

I am using Ubuntu, but within a corporate net (lab systems, but getting modules, etc. is still not always trivial). Let me see if I can set up the profiling. (As far as experience goes - I am *very* new to the ELK pieces, learning as I go. It is still quite possible I can do some major tuning in that area. That is one of the reasons I am trying to have syslog-ng do as much as possible, so I can remove the "L" and only use the "E" and "K" :-)

Thanks again all! Jim

On 10/30/2014 11:55 PM, Balazs Scheidler wrote:
Hi,
If the 3rd option is the slowest, then this seems to be related to the syslog-ng perl module or Elasticsearch.pm.
I've just checked, the syslog-ng perl module does a value-pairs evaluation and sends the results to the perl function as a hash. This is not the speediest thing (it'd be better to export the underlying C structure as an object to Perl, but should still cope with much more than 2k/sec).
I'm wondering what I could do to help. I'm following this thread, but as I lack the ES experience I don't have the same environment that you do.
If you could use some kind of profiling (like perf for instance) and had the associated debug symbols in at least syslog-ng (and preferably also in perl), we should easily pinpoint the issue. Setting up perf and symbols is easy if your distro supports it, but is a big hassle if it doesn't.
My experience with perf is on Ubuntu, but I heard it's better in Fedora. Which distro are you using?
This is the outline of what you'd have to do in order to perform profiling:

- don't strip syslog-ng (nor syslog-ng-incubator) after compilation, and use -g in CFLAGS; syslog-ng doesn't do this in its build script, but .rpm/.deb packaging usually does - you can verify this by running `file <path-to-binary>`
- install symbols for syslog-ng dependencies (these are the dbgsym packages in Ubuntu, https://wiki.ubuntu.com/DebuggingProgramCrash#Debug_Symbol_Packages)
- run `perf record -g "syslog-ng command line"`
- reproduce the load
- run `perf report`
You'll see which parts use the most CPU in the result. Or you can send it here for analysis.
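[Editor's note: a sketch of the steps above as shell commands, for reference. Ubuntu package names, binary path, and config path are assumptions - adjust to your build and distro; perf itself needs root on most systems.]

```shell
# 1. check that the binary kept its symbols: file(1) should say "not stripped"
file /usr/sbin/syslog-ng

# 2. install perf and debug symbols for the dependencies (Ubuntu naming assumed)
sudo apt-get install linux-tools-$(uname -r)

# 3. record a profile with call graphs while reproducing the load
sudo perf record -g -- /usr/sbin/syslog-ng -F -f /etc/syslog-ng/syslog-ng.conf

# 4. after stopping syslog-ng, inspect where the CPU time went
sudo perf report
```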
HTH Bazsi
On Wed, Oct 29, 2014 at 2:50 PM, Jim Hendrick <jrhendri@roadrunner.com <mailto:jrhendri@roadrunner.com>> wrote:
Thank you sir!
At least this is not unique to my testing (not sure that's actually *good* news :-)
I will try and reproduce some comparable baselines using a couple setups I have tried:
1) proxy-syslog --> syslog-ng --> redis --> logstash+grok --> logstash --> elasticsearch This was essentially following a basic set of instructions just to make sure I could reproduce them.
2) proxy-syslog --> syslog-ng+patterndb+format-json --> redis --> logstash --> elasticsearch This moved the pattern matching and conversion to json out to the edge, leaving redis & logstash since they worked well at feeding elasticsearch.
3) proxy-syslog --> syslog-ng+patterndb+Elasticsearch.pm --> elasticsearch This seemed the simplest & most promising.
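[Editor's note: for reference, setup #2 can be sketched as a syslog-ng configuration fragment roughly like the following. File paths, source/key names, and the exact redis() command() arguments are assumptions - check them against your syslog-ng OSE version, which must be built with redis and json support.]

```
# setup #2: patterndb parsing + format-json at the edge, pushed into a redis list
parser p_patterndb {
  db-parser(file("/etc/syslog-ng/patterndb.xml"));
};

destination d_redis {
  redis(
    host("127.0.0.1") port(6379)
    # RPUSH one JSON document per message onto the "syslog" list,
    # which logstash then pops and forwards to elasticsearch
    command("RPUSH", "syslog",
            "$(format-json --scope rfc5424 --scope nv-pairs)")
  );
};

log { source(s_net); parser(p_patterndb); destination(d_redis); };
```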
I have not tried all three with the same load, so I cannot definitively say one is better, but my subjective feel is that #3 was actually the slowest. I suspect something in the way the data is being sent to elasticsearch, but I do not know whether the issue is in the perl module itself or in how the data is being indexed on the elasticsearch side.
My overall thought is (still) that parsing at each syslog-ng server with no middleman should be fastest, since as you scale to more syslog-ng servers you are distributing the pattern matching load.
I am still not sure if a broker (redis, rabbitmq, etc.) will help as long as elasticsearch can accept the data fast enough.
Thanks for the feedback - I will certainly post whatever I come up with in the next day or so.
Jim
On 10/29/2014 09:29 AM, Fabien Wernli wrote:
> Hi Jim,
>
> On Tue, Oct 28, 2014 at 04:36:19PM -0400, Jim Hendrick wrote:
>> Now the issue is performance. I am sending roughly ~5000 EPS to the
>> syslog-ng instance running patterndb, but only able to "sustain" less than
>> 1000 to elasticsearch (oddly, ES seems to start receiving at ~5000 EPS, and
>> within an hour or less, drops to ~1000)
>
> I've got a similar workload, and I'm seeing drops too.
> When EPS is below 2k/s, syslog-ng usually copes. When it goes above, I can
> see drops. Enabling flow-control seems to help from the syslog-ng
> perspective (no drops in `syslog-ng-ctl stats`), but when I look at protocol
> counters in the Linux kernel, the drops show up as "InErrors" (I'm using
> UDP). I'm a little lost when trying to interpret the effect of syslog-ng
> tunables.
>
>> I have tried a number of things, including running a second ES node and
>> letting syslog-ng "round robin" with no luck at all.
>
> We're doing that by specifying the `nodes` key in Elasticsearch.pm:
> according to its documentation [1] this should ensure Search::Elasticsearch
> makes use of load-balancing. This seems to work as intended, when checking
> the bandwidth between syslog-ng and all ES nodes.
>
> When looking at the statistics of my nodes, they seem to be hitting no
> bottleneck whatsoever:
>
> * load is between 0 and 2 (8 cores total)
> * writes average around 50/s with peaks around 150 (6+P RAID 10k SAS)
> * reads are ridiculous
> * heap usage is around 75% (of 24g)
> * interface rx ~500k/s
> * elasticsearch index rate ~500/s
>
>> ES tuning has included locking 16G of memory per ES instance, and setting
>> indices.memory.index_buffer_size: 50%
>
> We're using 'index_buffer_size: 30%' and 'ES_HEAP_SIZE=24g' on our 6 ES
> nodes. max_size is 256 in syslog-ng/Elasticsearch.pm
>
> What we're currently doing (or planning) to try to investigate:
>
> 1. micro-benchmark the CPAN module to see if we can go above 2k/s
> 2. improve the statistics gathered by collectd-elasticsearch [2]
> 3. write a dummy ES server which only does some accounting but throws
>    data away, in order to do some benchmarking
> 4. compare python, lua and perl implementations
> 5. tune various syslog-ng parameters
> 6. use some MQ implementation between ES and syslog-ng
> 7. use TCP instead of UDP for incoming syslog
>
> I realize this won't help you much, but may be of interest so we can channel
> our common research. I'll be meeting with some syslog-ng experts very soon,
> and I am convinced I'll come back with many options to improve the
> situation.
>
> Cheers
>
> [1] http://search.cpan.org/%7Edrtech/Search-Elasticsearch-1.14/lib/Search/Elasticsearch.pm#nodes
> [2] https://github.com/phobos182/collectd-elasticsearch
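[Editor's note: the tuning knobs quoted above can be collected into an elasticsearch.yml fragment for reference. These are the ES 1.x-era option names and the values mentioned in this thread, not general recommendations; heap size itself is set via the ES_HEAP_SIZE environment variable (24g in Fabien's setup, 16G locked in Jim's).]

```
# elasticsearch.yml fragment (ES 1.x-era settings, values from this thread)
bootstrap.mlockall: true                  # lock the heap in RAM
indices.memory.index_buffer_size: 30%     # Fabien uses 30%; Jim tried 50%
```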
-- Bazsi
Hi, On Fri, Oct 31, 2014 at 07:05:47AM -0400, Jim Hendrick wrote:
area. That is one of the reasons I am trying to have syslog-ng do as much as possible so I can remove "L" and only use "E" and "K" :-)
yeah, let's call it the SNEK stack, or SNERK when using Riemann too :o)
There are a number of occurrences in log lines that are of the form {address}:{port}

This is fine for IPv4 addresses: 127.0.0.1:123 can be matched with @IPv4@:@NUMBER@

If the address is a complete IPv6 address with port, 2607:f8f0:c10:fff:200:5efe:ce57:5330:123, the same pattern can be used to match it: @IPvAny@:@NUMBER@

If the IP address is a shortform IPv6 address such as ::ffff:127.0.0.1, then adding the port number, ::ffff:127.0.0.1:123, fails to be matched by @IPvAny@:@NUMBER@

Has anyone else bumped into this issue?

For anyone taking a stab at this, this page looks interesting: http://rosettacode.org/wiki/Parse_an_IP_Address
That is exactly why the IPv6 address is enclosed in square brackets. Is that not the case? On Oct 31, 2014 4:24 PM, "Evan Rempel" <erempel@uvic.ca> wrote:
There are a number of occurrences in log lines that are of the form {address}:{port}
This is fine for IPv4 addresses 127.0.0.1:123 which can be matched with @IPv4@:@NUMBER@
If the address is a complete IPv6 address with port 2607:f8f0:c10:fff:200:5efe:ce57:5330:123 the same pattern can be used to match it @IPvAny@:@NUMBER@
If the IP address is a shortform IPv6 address such as ::ffff:127.0.0.1 and adding the port number ::ffff:127.0.0.1:123 this fails to be matched by @IPvAny@:@NUMBER@
Has anyone else bumped into this issue?
For anyone taking a stab at this, this page looks interesting
http://rosettacode.org/wiki/Parse_an_IP_Address
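[Editor's note: the ambiguity Evan describes comes from ':' serving as both the IPv6 group separator and the host/port separator, which is why a left-to-right pattern like @IPvAny@:@NUMBER@ can swallow the port. Outside patterndb, the usual workaround is to split on the rightmost colon, and RFC 3986 sidesteps the problem by bracketing IPv6 literals. An illustrative Python sketch (function name is hypothetical, not part of syslog-ng):]

```python
def split_host_port(s):
    """Split 'host:port' where host may be an IPv6 literal.

    Splitting on the RIGHTMOST colon is the only unambiguous choice for
    the bare form, since IPv6 addresses themselves contain colons."""
    if s.startswith('['):                  # bracketed form, e.g. [::1]:123
        host, _, port = s[1:].partition(']:')
        return host, int(port)
    host, _, port = s.rpartition(':')      # bare form: rightmost colon wins
    return host, int(port)

print(split_host_port('127.0.0.1:123'))           # ('127.0.0.1', 123)
print(split_host_port('::ffff:127.0.0.1:123'))    # ('::ffff:127.0.0.1', 123)
print(split_host_port('[::ffff:127.0.0.1]:123'))  # ('::ffff:127.0.0.1', 123)
```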
I now see this in all of the specifications, but alas my legacy application is not smart enough to deal with an IPv6 address with a port number any differently than an IPv4 address with a port number. Oh how I love the consistency of vendor logs :-) Thanks for your time. Evan. On 10/31/2014 01:03 PM, Balazs Scheidler wrote:
That is exactly why the IPv6 address is enclosed in square brackets. Is that not the case?
On Oct 31, 2014 4:24 PM, "Evan Rempel" <erempel@uvic.ca <mailto:erempel@uvic.ca>> wrote:
There are a number of occurrences in log lines that are of the form {address}:{port}
This is fine for IPv4 addresses 127.0.0.1:123 <http://127.0.0.1:123> which can be matched with @IPv4@:@NUMBER@
If the address is a complete IPv6 address with port 2607:f8f0:c10:fff:200:5efe:ce57:5330:123 the same pattern can be used to match it @IPvAny@:@NUMBER@
If the IP address is a shortform IPv6 address such as ::ffff:127.0.0.1 and adding the port number ::ffff:127.0.0.1:123 <http://127.0.0.1:123> this fails to be matched by @IPvAny@:@NUMBER@
Has anyone else bumped into this issue?
For anyone taking a stab at this, this page looks interesting: http://rosettacode.org/wiki/Parse_an_IP_Address
Thanks for all the feedback, folks - that has clarified things significantly. I am moving from ELSA to ES for some targeted applications (first our authentication database, and then IDS and friends).
From this discussion I will probably try Fabien’s perl module. Luckily in my case performance is not a major issue — 1000 EPS is fine.
Russell On 23/10/2014, at 12:17 pm, Russell Fulton <r.fulton@auckland.ac.nz> wrote:
Hi
We are already using the open source version of syslog-ng and I am about to set up some Elasticsearch instances, and would much prefer to feed data directly from syslog-ng rather than go through logstash (I already have a heap of patterndb parsers, and performance should be way better!)
I have spent an hour or so with Google and have found various references to an Elasticsearch destination being available, but I can find no mention of it in the release notes for 3.6.1. I have also downloaded the tarball and unpacked it, but could not find any evidence of the module, nor is there any mention of it in the manual.
As of now what is the recommended way of getting parsed data from OS syslog-ng into ES?
Thanks, Russell
participants (6)
- Balazs Scheidler
- Evan Rempel
- Fabien Wernli
- Jim Hendrick
- jrhendri@roadrunner.com
- Russell Fulton