Hi Attila,

Thanks for responding!

The message contained 954 bytes on the wire with 70 non-printable chars, generated from a Windows 2k12 security event log (event ID 4648) entry. I make it 1164 bytes once encoded...
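
For reference, a quick sanity check of that figure. This is only a sketch: it assumes each of the 70 non-printable bytes is replaced by a 4-byte \xHH escape and everything else is left alone. The function name `escaped_size` is made up for illustration.

```python
# Figures quoted above; the 4-byte escape width is an assumption (\xHH form).
WIRE_BYTES = 954
NON_PRINTABLE = 70
ESCAPED_WIDTH = 4  # one raw byte becomes "\xHH"

def escaped_size(total_bytes, escaped_count, width=ESCAPED_WIDTH):
    """Size after replacing `escaped_count` single bytes with `width`-byte escapes."""
    return total_bytes + escaped_count * (width - 1)

encoded = escaped_size(WIRE_BYTES, NON_PRINTABLE)
print(encoded)                    # 1164
print(encoded <= 6 * WIRE_BYTES)  # True: a 6x buffer would be 5724 bytes
```

If that assumption holds, the encoded message is nowhere near the 6x allocation, so size alone does not obviously explain the assert.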

The non-UTF-8 chars being sent were all \0x7f\0x7f pairs, except for one \0x92 char towards the end, with one further \0x7f\0x7f thereafter.
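
A quick way to tell these two kinds of byte apart, sketched with Python's strict UTF-8 decoder as a stand-in (an assumption; syslog-ng's own validation may behave differently). 0x7f is ASCII DEL, so it decodes cleanly even though it is non-printable, whereas a lone 0x92 is a continuation byte with no lead byte and is rejected. The helper name `is_valid_utf8` is hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes under Python's strict UTF-8 codec."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

print(is_valid_utf8(b"\x7f\x7f"))  # True: ASCII DEL, valid but non-printable
print(is_valid_utf8(b"\x92"))      # False: a lone continuation byte
```

So the single \0x92 may be the only byte in the message that is actually invalid UTF-8, which could make it the more interesting one to test in isolation.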

Going to give a few things a go, very helpful!!

Kr,

James



On 6 January 2017 12:41:03 GMT+00:00, "Szalai, Attila" <Attila.Szalai@morganstanley.com> wrote:

Hi James,

 

Checking the source, it means the following:

 

The code allocates a buffer 6 times bigger than the original string length, to be able to hold the escaped form of the UTF-8 characters.

 

The assert means that the string, after escaping, did not fit into this buffer for some reason. Or, to be more precise, the GString implementation decided that it should reallocate the string, which usually only happens if the string gets too big to fit into its original place. I currently have no recent glib source at hand to check whether I'm right.

 

The original string could help a lot to find the root cause.

 

P.S.: the escaping works by replacing the original byte with \xHH, so theoretically it can only grow from 1 byte to 4, which should fit into a buffer 6 times bigger than the original size.
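
The size bound can be illustrated with a minimal sketch. This is not the syslog-ng code: `escape_bytes` is a hypothetical stand-in, and the choice of which bytes get escaped (everything outside printable ASCII, 0x20 to 0x7e) is an assumption made only for the example.

```python
def escape_bytes(data: bytes) -> bytes:
    """Hypothetical stand-in for the escaping step: replace every byte
    outside printable ASCII (0x20..0x7e) with a 4-byte \\xHH escape."""
    out = bytearray()
    for b in data:
        if 0x20 <= b <= 0x7e:
            out.append(b)           # printable byte is copied unchanged
        else:
            out += b"\\x%02x" % b   # 1 byte in, 4 bytes out
    return bytes(out)

sample = b"ok\x7f\x7f\x92"
escaped = escape_bytes(sample)
print(escaped)                          # b'ok\\x7f\\x7f\\x92'
print(len(escaped) <= 4 * len(sample))  # True

# Worst case: every byte is escaped, so the output is exactly 4x the input.
worst = b"\x7f" * 10
print(len(escape_bytes(worst)))         # 40
```

Under this model the output never exceeds 4 times the input length, so a 6x buffer should always be enough. If the assert still fires, the growth is presumably coming from somewhere other than plain \xHH escaping.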

 

From: syslog-ng [mailto:syslog-ng-bounces@lists.balabit.hu] On Behalf Of James Elstone
Sent: Thursday, January 05, 2017 10:35 PM
To: syslog-ng@lists.balabit.hu
Subject: [syslog-ng] Hitting g_assert when using sanitize-utf8 enabled!

 

Hi Balabit et al,

When using the sanitize-utf8 flag I am hitting a g_assert in modules/syslogformat/syslog-format.c; what could be causing this?

Any advice welcome!!

Kr,

James

