Elasticsearch destination and time-zone info
We are in the process of integrating our logging infrastructure into elasticsearch with Kibana, but have a slight challenge regarding time zones.

Some of our equipment will only log in UTC. This is not an issue because Kibana presents times in local time. Most of our hosts log in our local time zone. This all works out fine because Kibana displays the logs in local time.

The problem is that the elasticsearch indexes roll over based on some template (XXXX-YYYY-MM-DD or such), and an index named from this template will contain an incorrect set of messages. For example, I live in time zone -7:00. This means that any messages after 17:00 (17:00 + 7:00 = 00:00 the next day) that were logged with UTC will go into the index for the next day.

So, is there any way to set the time-zone option for the elasticsearch destination? Alternatively, are there any date templates where I could do something like $(timezone UTC $ISODATE) $(timezone UTC $HOUR) in my template?

Any help would be appreciated.

-- Evan Rempel erempel@uvic.ca Senior Systems Administrator 250.721.7691 Data Centre Services, University Systems, University of Victoria
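The rollover arithmetic in the question (17:00 + 7:00 = 00:00 the next day) can be checked with a small sketch. It is only an illustration of how one instant maps to two different daily index names depending on the time zone used to expand the date; the `flare` prefix, the `index_name` helper and the sample timestamp are assumptions made for the example, and the fixed -07:00 offset ignores daylight saving.

```python
from datetime import datetime, timedelta, timezone

# Fixed -07:00 offset taken from the question. A real deployment would use a
# named zone so that daylight saving changes are handled.
LOCAL = timezone(timedelta(hours=-7))


def index_name(ts: datetime, tz: timezone, prefix: str = "flare") -> str:
    """Build a daily index name from the timestamp as seen in time zone tz.

    The prefix and the naming pattern are hypothetical, chosen for illustration.
    """
    return f"{prefix}-{ts.astimezone(tz):%Y.%m.%d}"


# One event, logged at 17:30 local time on 28 September 2015.
event = datetime(2015, 9, 28, 17, 30, tzinfo=LOCAL)

local_index = index_name(event, LOCAL)          # date expanded in local time
utc_index = index_name(event, timezone.utc)     # date expanded in UTC

print(local_index)  # flare-2015.09.28
print(utc_index)    # flare-2015.09.29
```

The same event lands in a different daily index depending on which zone expands the date, which is why the zone used for the index name has to be pinned explicitly.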
Hi Evan, Just use the `time-zone` option in the `java` block. Cheers
That certainly sounds obvious; however, I can't get it to work. The documented options in "7.2.4. Elasticsearch destination options" do NOT include a time-zone option.

My java destination is defined as:

    destination d_elasticsearch_1 {
        java(
            class-path("/usr/local/lib64/syslog-ng/java-modules/*.jar:/usr/share/elasticsearch/lib/*.jar")
            class-name("org.syslog_ng.elasticsearch.ElasticSearchDestination")
            option("index", "flare-${YEAR}.${MONTH}.${DAY}.${HOUR}")
            option("type", "test")
            option("client-mode", "node")
            option("resource", "/etc/elasticsearch/elasticsearch.yml")
            option("log-fifo-size","75000")
            option("time-zone","UTC")
            option("cluster", "uvic-cluster-01")
            option("message-template", "$MESSAGE")
            option("flush-limit", "50")
        );
    };

But my index uses the hour from the local time zone, not UTC. Is the order of the options important? Does the elasticsearch destination fail to apply the time zone to the index? This is beginning to look like a bug.

Evan.
______________________________________________________________________________ Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng FAQ: http://www.balabit.com/wiki/syslog-ng-faq
-- Evan Rempel erempel@uvic.ca Senior Systems Administrator 250.721.7691 Data Centre Services, University Systems, University of Victoria
Hi,

In fact `time_zone()` is a meta-option which spans multiple block types (implicitly). We use the following:

    destination d_es {
        java(
            time_zone("UTC")
            ...
        );
    };

I'll submit a GitHub PR to improve the documentation ASAP.

Cheers
Thanks, that works like a charm.

We are now feeding a steady 5,000 messages per second into elasticsearch with spikes into the 30,000 messages per second. All the right indexes and all of the soft macros parsed by the syslog-ng patterndb.

Exciting times for us (only a sysadmin, right? :-)

Evan.
Hi Evan, On Tue, Sep 29, 2015 at 09:13:40AM -0700, Evan Rempel wrote:
We are now feeding a steady 5,000 messages per second into elasticsearch with spikes into the 30,000 messages per second. All the right indexes and all of the soft macros parsed by the syslog-ng patterndb.
Good to hear! Do you use transport or node client mode? Also, it would be great if you could share some details about your Elasticsearch cluster architecture (number of nodes, shards, replicas, etc.) Thanks!
We are running this in "node" mode on a three-node cluster running in VMware. It does not handle the load yet :-( There is a bottleneck in syslog-ng when producing a JSON stream of more than about 10,000 messages per second. Right now we are surviving only because of the in-memory buffering of syslog-ng.

I don't actually run the elasticsearch cluster, but am getting more involved all of the time. We are in the process of setting up an elasticsearch cluster with the following:

- 2 nodes used in node mode to ingest the data from syslog-ng. This could scale out when I get my roundRobin transport code in place.
- 3 nodes with storage, so this is the real elasticsearch cluster.
- 1 node running kibana.

With this setup we will be able to determine where the bottlenecks are and then address them as needed.

I am working on a piece of code that will round robin the data that syslog-ng sends it (program destination) so I can set up something like

    filter f_persecond { match("XX") value("$SEC") };
    log { filter(f_persecond) destination(d_round_robin) };
    ...
    log { filter(f_persecond) destination(d_round_robin) };

for each value of $SEC. This will give syslog-ng 60 threads in which to build JSON objects, at about 10,000 messages per second per core on the syslog server. So this would scale to 200,000+ messages per second on a 24-core box, and evenly load the ingestion nodes of the elasticsearch cluster.

I'll let the list know when I get more details.

-- Evan Rempel erempel@uvic.ca Senior Systems Administrator 250.721.7691 Data Centre Services, University Systems, University of Victoria
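The round-robin idea described above can be sketched in a few lines. This is not the code Evan mentions (the thread does not show it); `round_robin` and the list-backed sinks are hypothetical stand-ins, assuming only that a program destination receives one message per line and hands each line to the next ingestion target in turn.

```python
import itertools
from typing import Callable, Iterable, List

# A sink is anything that accepts one message; here plain lists stand in for
# connections to ingestion nodes (hypothetical, for illustration only).
Sink = Callable[[str], None]


def round_robin(lines: Iterable[str], sinks: List[Sink]) -> None:
    """Hand each incoming line to the next sink in turn, wrapping around."""
    targets = itertools.cycle(sinks)
    for line in lines:
        next(targets)(line)


# Five sample messages spread over two stand-in ingestion nodes.
messages = [f"msg {i}" for i in range(5)]
node_a: List[str] = []
node_b: List[str] = []
round_robin(messages, [node_a.append, node_b.append])

print(node_a)  # ['msg 0', 'msg 2', 'msg 4']
print(node_b)  # ['msg 1', 'msg 3']
```

Run as a program destination, the same function could be fed `sys.stdin` instead of the sample list, with each sink forwarding to a real node rather than appending to a list.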
I think I have come across a bug in the elasticsearch destination where log lines with UTF-8 characters result in a shortened message length attribute, which results in a slightly truncated JSON object being sent to elasticsearch.

Here is the source syslog line at our syslog server. This is where the JSON object is created.

    2015-10-02T10:22:47-07:00 local@sandtiger.comp.uvic.ca/sandtiger.comp.uvic.ca mail.warning mimedefang.pl[10880]: t92HMkGW028396: Allowing attachment named OutlookEmoji-<U+1F633>.png, ext=.png, type=image/png, RELAY=mail-bn1on0131.outbound.protection.outlook.com [157.56.110.131], FROM=<Holly.Richardson@Dal.Ca>, TO=<cobyt@uvic.ca>

Here is the JSON object as logged to a file destination on the same host that is running the elasticsearch destination. This is just logging $MESSAGE since the payload is already JSON.

    {"flare":{"profile":"DCS"},"cfgmgrrole":"INFRA","cfgmgrosFull":"Redhat 5_64","cfgmgros":"unix","cfgmgrmodel":"ESX 5","cfgmgrlocation":"ESX-PROD","cfgmgrenvironment":"Prod","cfgmgrassetType":"Virtual Server","SOURCEHOST":"sandtiger.comp.uvic.ca","SHORTHOST":"sandtiger","PROGRAM":"mimedefang.pl","PRIORITY":"warning","PID":"10880","PATTERNID":"377","MESSAGE":"t92HMkGW028396: Allowing attachment named OutlookEmoji-<U+1F633>.png, ext=.png, type=image/png, RELAY=mail-bn1on0131.outbound.protection.outlook.com [157.56.110.131], FROM=<Holly.Richardson@Dal.Ca>, TO=<cobyt@uvic.ca>","ISODATE":"2015-10-02T10:22:47-07:00","HOST":"sandtiger.comp.uvic.ca","FACILITY":"mail"}

This is the same content that is sent to the elasticsearch destination -- option("message-template", "$MESSAGE\n")

And here is the failed message from the elasticsearch server:

    [2015-10-02 10:22:48,630][DEBUG][action.bulk ] [sponge] [flare-2015.10.02.17][2] failed to execute bulk item (index) index {[flare-2015.10.02.17][test][AVApk-CyhIyyHCO_k_bc], source[{"flare":{"profile":"DCS"},"cfgmgrrole":"INFRA","cfgmgrosFull":"Redhat 5_64","cfgmgros":"unix","cfgmgrmodel":"ESX 5","cfgmgrlocation":"ESX-PROD","cfgmgrenvironment":"Prod","cfgmgrassetType":"Virtual Server","SOURCEHOST":"sandtiger.comp.uvic.ca","SHORTHOST":"sandtiger","PROGRAM":"mimedefang.pl","PRIORITY":"warning","PID":"10880","PATTERNID":"377","MESSAGE":"t92HMkGW028396: Allowing attachment named OutlookEmoji-ð<U+009F><U+0098>³.png, ext=.png, type=image/png, RELAY=mail-bn1on0131.outbound.protection.outlook.com [157.56.110.131], FROM=<Holly.Richardson@Dal.Ca>, TO=<cobyt@uvic.ca>","ISODATE":"2015-10-02T10:22:47-07:00","HOST":"sandtiger.comp.uvic.ca","FACILITY":"mail]}

Note that the source has Unicode data as <U+1F633>. The elasticsearch destination is sent <U+1F633>, but the elasticsearch server logs ð<U+009F><U+0098>³.

The elasticsearch server also seems to end the message with the text

    "FACILITY":"mail

when it should end with

    "FACILITY":"mail"}

so it is missing two characters.

Does anyone want to guess at what is happening?

Should I post to the elasticsearch group with the reasoning that the source (syslog-ng) and the destination (elasticsearch) need to be configured with the same Unicode settings?

Thanks,

-- Evan Rempel erempel@uvic.ca Senior Systems Administrator 250.721.7691 Data Centre Services, University Systems, University of Victoria
Hi,

Do I understand correctly that you added <U+1F633> in place of the UTF-8 sequences in the email, and that the file contains the UTF-8 encoding of the same value?

My theory right now is that elastic uses a 16-bit representation of Unicode code points, and 1F633 doesn't fit there. But I couldn't come up with a plausible explanation of how it would become ð<U+009F><U+0098>³.

Syslog-ng uses UTF-8 internally, so it should work with long UTF-8 sequences without problems. Do you perhaps have an encoding() option at the elastic destination?

It could also be a problem in the elastic java plugin; I don't know how we supply the data. @juhaszviktor do you see any chance of this happening in the Java code?
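On the question above of how U+1F633 could turn into ð<U+009F><U+0098>³: those four characters are exactly what appears when the four UTF-8 bytes of U+1F633 are each read as a separate Latin-1 character. The sketch below only reproduces that byte-level effect; it does not show where in the syslog-ng to elasticsearch path the misreading happens, and the link it draws to the two missing trailing characters is an observation, not a confirmed diagnosis.

```python
emoji = "\U0001F633"  # the character displayed as <U+1F633> in the report

utf8_bytes = emoji.encode("utf-8")

# Decode the same bytes as Latin-1, where every byte becomes one character.
mangled = utf8_bytes.decode("latin-1")
mangled_points = [f"U+{ord(c):04X}" for c in mangled]

# Lengths under the two readings, counted in UTF-16 code units
# (the unit in which Java measures string length).
correct_units = len(emoji.encode("utf-16-le")) // 2
mangled_units = len(mangled.encode("utf-16-le")) // 2

print(utf8_bytes.hex())               # f09f98b3
print(mangled_points)                 # ['U+00F0', 'U+009F', 'U+0098', 'U+00B3']
print(mangled_units - correct_units)  # 2
```

U+00F0 is ð and U+00B3 is ³, matching the logged text. The mangled form is also two code units longer than the correct one, the same count as the two characters reported missing at the end of the JSON object, which would fit a length computed for one reading being applied to the other.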
Hi,

Hmmm, it looks like something went wrong while creating the Java string from the C string (calling JNI NewStringUTF). This looks like a bug. I will do the root cause analysis.

BR,
Viktor
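Some background on why NewStringUTF is a plausible suspect: JNI defines its input as modified UTF-8, which differs from standard UTF-8 for characters above U+FFFF. Such a character is written as a UTF-16 surrogate pair with each half encoded in three bytes, rather than as one four-byte sequence. The sketch below is a hypothetical helper (`to_modified_utf8` is not part of syslog-ng or any library) that shows the difference for U+1F633; the thread does not confirm that this mismatch is the root cause.

```python
def to_modified_utf8(text: str) -> bytes:
    """Encode text in the modified UTF-8 form described in the JNI specification.

    Illustrative helper only; it assumes the input holds no lone surrogates.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp == 0:
            # NUL is written as a two-byte sequence instead of a zero byte.
            out += b"\xc0\x80"
        elif cp > 0xFFFF:
            # Split into a UTF-16 surrogate pair, then encode each half
            # separately as a three-byte sequence.
            cp -= 0x10000
            for half in (0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)):
                out += chr(half).encode("utf-8", "surrogatepass")
        else:
            # Everything else is identical to standard UTF-8.
            out += ch.encode("utf-8")
    return bytes(out)


emoji = "\U0001F633"
standard = emoji.encode("utf-8")
modified = to_modified_utf8(emoji)

print(standard.hex())  # f09f98b3
print(modified.hex())  # eda0bdedb8b3
```

Text made only of characters up to U+FFFF (other than NUL) encodes identically in both forms, which would explain why ordinary log lines are unaffected and only lines with characters such as emoji trigger the problem.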
participants (4)
- Evan Rempel
- Fabien Wernli
- Juhász, Viktor
- Scheidler, Balázs