[syslog-ng] Hitting g_assert when using sanitize-utf8 enabled!

Scheidler, Balázs balazs.scheidler at balabit.com
Tue Jan 10 10:14:09 UTC 2017


Does this fix it?

diff --git a/lib/utf8utils.c b/lib/utf8utils.c
index 2b84bdc..c76ffc1 100644
--- a/lib/utf8utils.c
+++ b/lib/utf8utils.c
@@ -114,7 +114,7 @@ _append_unsafe_utf8_as_escaped(GString *escaped_output, const gchar *raw,
       _append_escaped_utf8_character(escaped_output, &raw, -1, unsafe_chars,
                                      control_format, invalid_format);
   else
-    while (raw_len)
+    while (raw_len > 0)
       raw_len -= _append_escaped_utf8_character(escaped_output, &raw, raw_len, unsafe_chars,
                  control_format, invalid_format);
 }
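
For anyone following along, here is a minimal sketch of why the loop condition could matter. It is not the syslog-ng code: consume_one() is a hypothetical stand-in for _append_escaped_utf8_character(), and the assumption (not confirmed in this thread) is that the helper can report more bytes consumed than were left. In that case the remaining length skips past zero and goes negative, so a plain truthiness test keeps looping, while a "> 0" test stops.

```c
#include <assert.h>

/* Sample input: the declared length is 3; the rest is padding so that the
 * demonstration never reads outside the array. */
static const unsigned char sample[8] = { 'a', 'b', 0xC3, 'x', 'y', 'z', 'w', 'v' };

/* Hypothetical stand-in for _append_escaped_utf8_character(): returns the
 * number of input bytes consumed. ASSUMPTION: a lead byte always consumes
 * two bytes, even when only one byte remains. */
static int consume_one(const unsigned char **p)
{
  int n = (**p >= 0xC0) ? 2 : 1;

  *p += n;
  return n;
}

/* Runs the loop with either the old condition (len != 0) or the patched one
 * (len > 0) and returns how many iterations ran. max_iterations caps the
 * runaway case so the demonstration always terminates. */
static int count_iterations(const unsigned char *buf, int len, int patched,
                            int max_iterations)
{
  const unsigned char *p = buf;
  int iterations = 0;

  assert(max_iterations > 0);
  while ((patched ? len > 0 : len != 0) && iterations < max_iterations)
    {
      len -= consume_one(&p);
      iterations++;
    }
  return iterations;
}
```

With a declared length of 3 and the lead byte in the last position, the patched condition stops after three iterations, while the old condition only stops because of the artificial cap.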


-- 
Bazsi

On Tue, Jan 10, 2017 at 11:12 AM, Scheidler, Balázs <
balazs.scheidler at balabit.com> wrote:

> Hmm, thanks for the analysis so far. Is the 0x92 value followed by a zero
> byte? It seems that for some reason the utf8 escaping functions skip that.
>
> On Jan 9, 2017 9:52 PM, "James Elstone" <james at elstone.net> wrote:
>
>> Hi Attila,
>>
>> The syslog message being sent is with utf8_sanitise enabled on the udp
>> transport:
>>
>> <38>Jan 7 20:10:11 hostname-01 microsoft-windows-security-auditing[success]
>> 4648 A logon was attempted by that account@s credentials.
>>
>> Where @ stands for the byte 0x92, which is a valid typographic apostrophe
>> in the Windows-1252 character set. In Unicode, code points 127 to 159
>> decimal are control characters, and a lone 0x92 byte is not a valid UTF-8
>> sequence. I have truncated the actual log message for brevity here. There
>> has to be syslog load before and after this message is received to see the
>> issue.
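
To make the Windows-1252 versus UTF-8 point concrete, here is a small standalone sketch. The helper names are made up for illustration and are not part of syslog-ng or GLib. It shows that 0x92 has the bit pattern of a UTF-8 continuation byte, so it cannot stand alone, and that the character it represents in Windows-1252 (U+2019, right single quotation mark) takes three bytes in UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* True for bytes of the form 10xxxxxx, which may only follow a lead byte. */
static int is_utf8_continuation(unsigned char b)
{
  return (b & 0xC0) == 0x80;
}

/* Encodes a code point below U+10000 as UTF-8 and returns the number of
 * bytes written (1 to 3). Surrogates are not checked in this sketch. */
static size_t encode_utf8_bmp(unsigned int cp, unsigned char out[3])
{
  assert(cp < 0x10000);
  if (cp < 0x80)
    {
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xC0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3F));
      return 2;
    }
  out[0] = (unsigned char) (0xE0 | (cp >> 12));
  out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
  out[2] = (unsigned char) (0x80 | (cp & 0x3F));
  return 3;
}
```

A sender that converted from Windows-1252 properly would put the bytes E2 80 99 on the wire instead of the single byte 0x92.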
>>
>> Specifically, when reading UTF-8 (GString is native UTF-8), byte 0x92
>> looks for a corresponding 0x9c and ignores null terminations in between...
>> (See Wikipedia's C0 C1 UTF-8 page for a little historical information.)
>>
>> Looking at the contents of the <src> variable (in 3.7.3 code), it
>> contains multiple syslog messages in syslog-format.c, and strlen of <src>
>> does not equal <left> prior to the procedure call into utf8utils.c. The
>> message received on the wire is about 850 bytes long, <src> is about 8000
>> bytes when going into utf8utils.c and about 15 bytes in the reassigned ptr
>> variable of the g-string, hence the assert being triggered.
>>
>> Going to move to 3.8.1 as there has been a bit of work in this area since
>> 3.7.3 and will retest tomorrow.
>>
>> Is there any way to control the character set the inbound message is
>> parsed against? We only want a UTF-8 compliant stream being output by
>> syslog-ng.
>>
>> Alternatively is there a way to filter this char out on an upstream
>> syslog-ng instance please (it is passing through an identical instance
>> without utf8_sanitise enabled on it without problem)?
>>
>> Kind regards,
>>
>> James
>>
>>
>>
>> On 7 January 2017 19:44:15 GMT+00:00, "Szalai, Attila" <
>> Attila.Szalai at morganstanley.com> wrote:
>>>
>>> I’ve checked the glib source too (in version 2.50, but I do not think
>>> it changed much between the two versions) and have no idea how this
>>> could happen.
>>>
>>>
>>>
>>> So, an example line is definitely needed to find the root cause.
>>>
>>>
>>>
>>> On the other hand, there is a trick in that code to save a malloc: a
>>> “static”[*] buffer is used. Therefore, if that buffer is reallocated (and
>>> the “static” buffer is therefore freed), the memory gets corrupted.
>>>
>>>
>>>
>>> [*] Practically the buffer is allocated from the stack, but it’s working
>>> just like a static buffer from the malloc point of view. It should not be
>>> freed.
>>>
>>>
>>>
>>> *From:* syslog-ng [mailto:syslog-ng-bounces at lists.balabit.hu] *On
>>> Behalf Of *James Elstone
>>> *Sent:* Friday, January 06, 2017 2:55 PM
>>> *To:* Syslog-ng users' and developers' mailing list
>>> *Subject:* Re: [syslog-ng] Hitting g_assert when using sanitize-utf8
>>> enabled!
>>>
>>>
>>>
>>> Sorry; update - It happens on the first packet that contains 0x92 when
>>> sanitize-utf8 is enabled; consistently.
>>>
>>> Running glib 2.46.2 with Syslog-ng 3.7.3 on FreeBSD 10.3.
>>>
>>> Any ideas please?
>>>
>>> Kr,
>>>
>>> James
>>>
>>> On 6 January 2017 13:38:58 GMT+00:00, James Elstone <james at elstone.net>
>>> wrote:
>>>
>>> Hi Bazsi,
>>>
>>> The version of glib is 2.46.2 on FreeBSD 10.3.
>>>
>>> The issue does not occur on the first packet coming through, but when
>>> under light load (~100/sec)...
>>>
>>> Tried reducing the number of unprintable chars, and now only 0x92 exists
>>> in the inbound message it falls over on. It is always a message with 0x92
>>> that causes it to fail.
>>>
>>> Is there a way to have a filter applied before the message is
>>> utf8_sanitised, using a regular expression or the like?
>>>
>>> What if the assert was removed, what effect would it have?
>>>
>>> Many thanks to all!
>>>
>>> Kr,
>>>
>>> James
>>>
>>> On 6 January 2017 12:49:28 GMT+00:00, "Scheidler, Balázs" <
>>> balazs.scheidler at balabit.com> wrote:
>>>
>>> Hi,
>>>
>>> Attila is right, it would help a lot to see the original log message and
>>> your glib version. That code path uses a performance hack that relies on a
>>> GLib implementation detail. Either the glib behaviour has changed or
>>> another assumption fails, but just looking at the code I don't know what
>>> might.
>>>
>>>
>>> --
>>> Bazsi
>>>
>>>
>>>
>>> On Fri, Jan 6, 2017 at 1:41 PM, Szalai, Attila <
>>> Attila.Szalai at morganstanley.com> wrote:
>>>
>>> Hi James,
>>>
>>>
>>>
>>> Checking the source, it means the following:
>>>
>>>
>>>
>>> The code allocates a buffer 6 times bigger than the original string
>>> length to be able to hold the escaped form of the UTF-8 characters.
>>>
>>>
>>>
>>> The assert means that the string, after escaping, did not fit into this
>>> buffer for some reason. Or, to be more precise, the GString implementation
>>> decided that it should reallocate the string, which usually only happens if
>>> the string gets too big to fit into its original place. Currently I have no
>>> recent glib source to check if I’m right.
>>>
>>>
>>>
>>> The original string could help a lot to find the root cause.
>>>
>>>
>>>
>>> Ps.: the escaping works by replacing the original byte with \xHH, so
>>> theoretically it can only grow from 1 byte to 4, which should fit into a
>>> buffer 6 times bigger than the original size.
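
The size argument above can be sketched in plain C. This is a simplified illustration, not the syslog-ng implementation: escape_bytes() is a hypothetical helper that treats every byte outside printable ASCII as unsafe and does no real UTF-8 validation. It only shows that a \xHH escape turns one input byte into four output bytes, so the result stays within a buffer of 6 times the input length plus a terminator.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies printable ASCII as-is and writes every other
 * byte as \xHH. Returns the number of characters written, excluding the
 * terminating NUL. Stops early if the output buffer would overflow. */
static size_t escape_bytes(const unsigned char *in, size_t in_len,
                           char *out, size_t out_size)
{
  size_t used = 0;

  assert(out_size > 0);
  for (size_t i = 0; i < in_len; i++)
    {
      if (in[i] >= 0x20 && in[i] < 0x7F)
        {
          if (used + 1 >= out_size)
            break;
          out[used++] = (char) in[i];
        }
      else
        {
          if (used + 4 >= out_size)
            break;
          /* 4 visible characters plus the NUL that snprintf appends */
          snprintf(out + used, 5, "\\x%02x", (unsigned int) in[i]);
          used += 4;
        }
    }
  out[used] = '\0';
  return used;
}
```

For the three input bytes 'a', 0x92, 's' the escaped form is six characters long, well inside the 18 bytes a 6x buffer would provide.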
>>>
>>>
>>>
>>> *From:* syslog-ng [mailto:syslog-ng-bounces at lists.balabit.hu] *On
>>> Behalf Of *James Elstone
>>> *Sent:* Thursday, January 05, 2017 10:35 PM
>>> *To:* syslog-ng at lists.balabit.hu
>>> *Subject:* [syslog-ng] Hitting g_assert when using sanitize-utf8
>>> enabled!
>>>
>>>
>>>
>>> Hi Balabit et al,
>>>
>>> When using the sanitize-utf8 flag I am hitting a g_assert in
>>> modules/syslogformat/syslog-format.c; what could be causing this?
>>>
>>> Any advice welcome!!
>>>
>>> Kr,
>>>
>>> James
>>>
>>