<p dir="ltr">Hi,</p>
<p dir="ltr">Do I understand correctly that you added &lt;U+1F633&gt; in place of the UTF-8 sequences in the email, and that the file contains the UTF-8 encoding of the same value?</p>
<p dir="ltr">My theory right now is that elastic uses a 16-bit representation of Unicode code points, and 1F633 doesn't fit there. But I couldn't come up with a plausible explanation for how it would become ð&lt;U+009F&gt;&lt;U+0098&gt;³.</p>
<p dir="ltr">syslog-ng uses UTF-8 internally, so it should work with long UTF-8 sequences without problems. Do you perhaps have an encoding() option at the elastic destination?</p>
<p dir="ltr">It could also be a problem in the elastic java plugin; I don't know how we supply the data. @juhaszviktor do you see any chance of this happening in the java code?</p>
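<p dir="ltr">For what it's worth, one reading that does reproduce that exact sequence is the four UTF-8 bytes of U+1F633 being decoded one byte at a time as ISO-8859-1 (Latin-1). The sketch below only checks that arithmetic; it says nothing about where in the pipeline such a decode would happen, and the helper name describe is made up for illustration.</p>

```python
# U+1F633 is the code point reported in the original log line.
emoji = "\U0001F633"

# Encode to UTF-8: a code point above U+FFFF takes four bytes.
utf8_bytes = emoji.encode("utf-8")

# Assumption being tested: something decodes those bytes as Latin-1,
# which maps every byte to the code point with the same numeric value.
misdecoded = utf8_bytes.decode("latin-1")


def describe(text):
    """Return each character of text in U+XXXX notation (hypothetical helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(" ".join(f"{b:02x}" for b in utf8_bytes))  # f0 9f 98 b3
print(describe(misdecoded))  # ['U+00F0', 'U+009F', 'U+0098', 'U+00B3']
```

<p dir="ltr">U+00F0 is ð and U+00B3 is ³, with the two C1 control characters U+009F and U+0098 in between, which matches what the elasticsearch server logged.</p>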
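<p dir="ltr">As a guess at how the two missing characters could arise: Java's String.length() counts UTF-16 code units, and U+1F633 is two UTF-16 units but four UTF-8 bytes. If a length counted that way were applied to the UTF-8 byte stream, the payload would come out exactly two bytes short. The sketch below is only that arithmetic on a shortened, made-up payload; I have not checked whether the plugin actually computes the length like this.</p>

```python
# Shortened stand-in for the real JSON payload (hypothetical, for illustration).
payload = '{"MESSAGE":"OutlookEmoji-\U0001F633.png","FACILITY":"mail"}'

utf8_bytes = payload.encode("utf-8")

byte_len = len(utf8_bytes)                          # length in UTF-8 bytes
utf16_units = len(payload.encode("utf-16-le")) // 2  # what Java's String.length() would report
code_points = len(payload)                           # length in Unicode code points

# Assumed bug: a length in UTF-16 units is used to cut the UTF-8 byte stream.
truncated = utf8_bytes[:utf16_units]

print(byte_len - utf16_units)  # 2
print(byte_len - code_points)  # 3
print(truncated[-7:])          # b':"mail'
```

<p dir="ltr">With a UTF-16 unit count the tail is cut right after "mail", dropping the closing quote and brace, which is what the server log shows. A code-point count would drop three bytes instead, so the trailing newline from the message template may matter when comparing.</p>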
<div class="gmail_quote">On Oct 2, 2015 20:19, &quot;Evan Rempel&quot; &lt;<a href="mailto:erempel@uvic.ca">erempel@uvic.ca</a>&gt; wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I think I have come across a bug in the elasticsearch destination where log lines with UTF-8 characters result in a shortened message length attribute, which results in a slightly truncated JSON object being sent to elasticsearch.<br>
<br>
<br>
Here is the source syslog line at our syslog server. This is where the JSON object is created.<br>
<br>
2015-10-02T10:22:47-07:00 <a href="http://local@sandtiger.comp.uvic.ca/sandtiger.comp.uvic.ca" rel="noreferrer" target="_blank">local@sandtiger.comp.uvic.ca/sandtiger.comp.uvic.ca</a> mail.warning <a href="http://mimedefang.pl" rel="noreferrer" target="_blank">mimedefang.pl</a>[10880]: t92HMkGW028396: Allowing attachment named OutlookEmoji-&lt;U+1F633&gt;.png, ext=.png, type=image/png, RELAY=<a href="http://mail-bn1on0131.outbound.protection.outlook.com" rel="noreferrer" target="_blank">mail-bn1on0131.outbound.protection.outlook.com</a> [157.56.110.131], FROM=&lt;Holly.Richardson@Dal.Ca&gt;, TO=&lt;<a href="mailto:cobyt@uvic.ca">cobyt@uvic.ca</a>&gt;<br>
<br>
Here is the JSON object as logged to a file destination on the same host that is running the elasticsearch destination. This is just logging $MESSAGE since the payload is already JSON.<br>
<br>
{&quot;flare&quot;:{&quot;profile&quot;:&quot;DCS&quot;},&quot;cfgmgrrole&quot;:&quot;INFRA&quot;,&quot;cfgmgrosFull&quot;:&quot;Redhat 5_64&quot;,&quot;cfgmgros&quot;:&quot;unix&quot;,&quot;cfgmgrmodel&quot;:&quot;ESX 5&quot;,&quot;cfgmgrlocation&quot;:&quot;ESX-PROD&quot;,&quot;cfgmgrenvironment&quot;:&quot;Prod&quot;,&quot;cfgmgrassetType&quot;:&quot;Virtual Server&quot;,&quot;SOURCEHOST&quot;:&quot;<a href="http://sandtiger.comp.uvic.ca" rel="noreferrer" target="_blank">sandtiger.comp.uvic.ca</a>&quot;,&quot;SHORTHOST&quot;:&quot;sandtiger&quot;,&quot;PROGRAM&quot;:&quot;<a href="http://mimedefang.pl" rel="noreferrer" target="_blank">mimedefang.pl</a>&quot;,&quot;PRIORITY&quot;:&quot;warning&quot;,&quot;PID&quot;:&quot;10880&quot;,&quot;PATTERNID&quot;:&quot;377&quot;,&quot;MESSAGE&quot;:&quot;t92HMkGW028396: Allowing attachment named OutlookEmoji-&lt;U+1F633&gt;.png, ext=.png, type=image/png,<br>
RELAY=<a href="http://mail-bn1on0131.outbound.protection.outlook.com" rel="noreferrer" target="_blank">mail-bn1on0131.outbound.protection.outlook.com</a> [157.56.110.131], FROM=&lt;Holly.Richardson@Dal.Ca&gt;, TO=&lt;<a href="mailto:cobyt@uvic.ca">cobyt@uvic.ca</a>&gt;&quot;,&quot;ISODATE&quot;:&quot;2015-10-02T10:22:47-07:00&quot;,&quot;HOST&quot;:&quot;<a href="http://sandtiger.comp.uvic.ca" rel="noreferrer" target="_blank">sandtiger.comp.uvic.ca</a>&quot;,&quot;FACILITY&quot;:&quot;mail&quot;}<br>
<br>
This is the same content that is sent to the elasticsearch destination --  option(&quot;message-template&quot;, &quot;$MESSAGE\n&quot;)<br>
<br>
and here is the failed message from the elasticsearch server<br>
<br>
[2015-10-02 10:22:48,630][DEBUG][action.bulk              ] [sponge] [flare-2015.10.02.17][2] failed to execute bulk item (index) index {[flare-2015.10.02.17][test][AVApk-CyhIyyHCO_k_bc], source[{&quot;flare&quot;:{&quot;profile&quot;:&quot;DCS&quot;},&quot;cfgmgrrole&quot;:&quot;INFRA&quot;,&quot;cfgmgrosFull&quot;:&quot;Redhat 5_64&quot;,&quot;cfgmgros&quot;:&quot;unix&quot;,&quot;cfgmgrmodel&quot;:&quot;ESX 5&quot;,&quot;cfgmgrlocation&quot;:&quot;ESX-PROD&quot;,&quot;cfgmgrenvironment&quot;:&quot;Prod&quot;,&quot;cfgmg<br>
rassetType&quot;:&quot;Virtual Server&quot;,&quot;SOURCEHOST&quot;:&quot;<a href="http://sandtiger.comp.uvic.ca" rel="noreferrer" target="_blank">sandtiger.comp.uvic.ca</a>&quot;,&quot;SHORTHOST&quot;:&quot;sandtiger&quot;,&quot;PROGRAM&quot;:&quot;<a href="http://mimedefang.pl" rel="noreferrer" target="_blank">mimedefang.pl</a>&quot;,&quot;PRIORITY&quot;:&quot;warning&quot;,&quot;PID&quot;:&quot;10880&quot;,&quot;PATTERNID&quot;:&quot;377&quot;,&quot;MESSAGE&quot;:&quot;t92HMkGW028396: Allowing attachment named OutlookEmoji-ð&lt;U+009F&gt;&lt;U+0098&gt;³.png, ext=.png, type=image/png, RELAY=<a href="http://mail-bn1on0131.outbound.protection.outlook.com" rel="noreferrer" target="_blank">mail-bn1on0131.outbound.protection.outlook.com</a> [157.56.110.131], FROM=&lt;Holly.Rich<br>
ardson@Dal.Ca&gt;, TO=&lt;<a href="mailto:cobyt@uvic.ca">cobyt@uvic.ca</a>&gt;&quot;,&quot;ISODATE&quot;:&quot;2015-10-02T10:22:47-07:00&quot;,&quot;HOST&quot;:&quot;<a href="http://sandtiger.comp.uvic.ca" rel="noreferrer" target="_blank">sandtiger.comp.uvic.ca</a>&quot;,&quot;FACILITY&quot;:&quot;mail]}<br>
<br>
<br>
<br>
Note that the source has Unicode data as &lt;U+1F633&gt;<br>
The elasticsearch destination is sent &lt;U+1F633&gt;<br>
but the elasticsearch server logs ð&lt;U+009F&gt;&lt;U+0098&gt;³<br>
<br>
The elasticsearch server also seems to end the message with the text<br>
<br>
  &quot;FACILITY&quot;:&quot;mail<br>
<br>
when it should end with<br>
<br>
&quot;FACILITY&quot;:&quot;mail&quot;}<br>
<br>
so it is missing two characters.<br>
<br>
Does anyone want to guess at what is happening?<br>
<br>
Should I post to the elasticsearch group with the reasoning that the source (syslog-ng) and the destination (elasticsearch) need to be configured with the same unicode settings?<br>
<br>
Thanks,<br>
<br>
--<br>
Evan Rempel                                      <a href="mailto:erempel@uvic.ca">erempel@uvic.ca</a><br>
Senior Systems Administrator                        250.721.7691<br>
Data Centre Services, University Systems, University of Victoria<br>
<br>
<br>
</blockquote></div>