[syslog-ng] Hitting g_assert when using sanitize-utf8 enabled!

Scheidler, Balázs balazs.scheidler at balabit.com
Thu Jan 12 17:51:33 UTC 2017


I have tried a number of combinations to cause aborts, but without success
so far. I can imagine that my change may fix your issue, but I couldn't
reproduce a case where we would cross the closing NUL byte in the input.
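
To make the intent of the one-line change concrete, here is a minimal stand-alone sketch, not the syslog-ng code itself. It assumes the remaining length is a signed value (the patch passes -1 as a length elsewhere) and that a single step can report more bytes consumed than were left. The names consume_step and run_loop, and the max_iter cap, are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for _append_escaped_utf8_character(): it reports
 * how many input bytes it consumed. To model the suspected overrun it
 * always claims 2 bytes, even when only 1 byte is left. */
long consume_step(long remaining)
{
  (void) remaining;
  return 2;
}

/* Runs the loop with either the old guard (raw_len != 0) or the patched
 * guard (raw_len > 0). max_iter caps the run so the runaway case can be
 * observed instead of looping indefinitely. */
int run_loop(long raw_len, bool patched_guard, int max_iter)
{
  int iterations = 0;

  while ((patched_guard ? raw_len > 0 : raw_len != 0) && iterations < max_iter)
    {
      raw_len -= consume_step(raw_len);
      iterations++;
    }
  return iterations;
}

/* Example: 3 bytes left, 2 consumed per step. */
void demo_loop_guards(void)
{
  assert(run_loop(3, true, 1000) == 2);     /* 3 -> 1 -> -1, then stops */
  assert(run_loop(3, false, 1000) == 1000); /* steps over zero, hits the cap */
}
```

With an odd remaining length the old guard never sees zero, so the loop keeps reading past the end of the input; the patched guard stops as soon as the count goes negative.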

I did my testing both via end-to-end tests (e.g. sending the above message
to syslog-ng with sanitize-utf8 enabled) and via a unit test program.
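
For anyone who wants to experiment outside syslog-ng, a stand-alone check of the escaping bound could look roughly like the sketch below. It is not the unit test mentioned above and does not use GLib. As a simplifying assumption it treats every byte outside printable ASCII as unsafe and writes it as \xHH, which matches the 1-byte-to-4-bytes growth described further down the thread. The function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for the sanitizer: printable ASCII is copied as-is,
 * every other byte is written as \xHH. Real UTF-8 validation is omitted.
 * Returns the number of bytes written (excluding the NUL terminator), or
 * (size_t) -1 if the output buffer is too small. */
size_t escape_bytes(const unsigned char *in, size_t in_len,
                    char *out, size_t out_size)
{
  size_t used = 0;

  if (out_size == 0)
    return (size_t) -1;

  for (size_t i = 0; i < in_len; i++)
    {
      if (in[i] >= 0x20 && in[i] < 0x7f)
        {
          if (used + 1 >= out_size)
            return (size_t) -1;
          out[used++] = (char) in[i];
        }
      else
        {
          if (used + 4 >= out_size)
            return (size_t) -1;
          snprintf(out + used, 5, "\\x%02x", (unsigned int) in[i]);
          used += 4;
        }
    }
  out[used] = '\0';
  return used;
}

/* Example: a 0x92 byte between ASCII characters, with a buffer sized for
 * the worst case of 4 output bytes per input byte plus the terminator. */
void demo_escape(void)
{
  const unsigned char msg[] = { 'a', 't', 0x92, 's' };
  char out[4 * sizeof msg + 1];
  size_t n = escape_bytes(msg, sizeof msg, out, sizeof out);

  assert(n == 7);
  assert(strcmp(out, "at\\x92s") == 0);
}
```

In this simplified model the output can never exceed four times the input length, so a buffer six times the input size is only outgrown if more bytes are consumed than the caller passed in.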

Neither caused the failure. So a longer excerpt from your input would be
very much welcome, maybe in private.

Bazsi


On Tue, Jan 10, 2017 at 2:25 PM, James Elstone <james at elstone.net> wrote:

> Will give this a twirl shortly.
>
> James
>
>
> On 10 January 2017 10:14:09 GMT+00:00, "Scheidler, Balázs" <
> balazs.scheidler at balabit.com> wrote:
>>
>> Does this fix it?
>>
>> diff --git a/lib/utf8utils.c b/lib/utf8utils.c
>> index 2b84bdc..c76ffc1 100644
>> --- a/lib/utf8utils.c
>> +++ b/lib/utf8utils.c
>> @@ -114,7 +114,7 @@ _append_unsafe_utf8_as_escaped(GString *escaped_output, const gchar *raw,
>>        _append_escaped_utf8_character(escaped_output, &raw, -1, unsafe_chars,
>>                                       control_format, invalid_format);
>>    else
>> -    while (raw_len)
>> +    while (raw_len > 0)
>>        raw_len -= _append_escaped_utf8_character(escaped_output, &raw, raw_len, unsafe_chars,
>>                   control_format, invalid_format);
>>  }
>>
>>
>> --
>> Bazsi
>>
>> On Tue, Jan 10, 2017 at 11:12 AM, Scheidler, Balázs <
>> balazs.scheidler at balabit.com> wrote:
>>
>>> Hmm, thanks for the analysis so far. Is the 0x92 value followed by a
>>> zero byte? It seems that for some reason the utf8 escaping functions skip
>>> that.
>>>
>>> On Jan 9, 2017 9:52 PM, "James Elstone" <james at elstone.net> wrote:
>>>
>>>> Hi Attila,
>>>>
>>>> The following syslog message is sent with sanitize-utf8 enabled on the UDP
>>>> transport:
>>>>
>>>> <38>Jan 7 20:10:11 hostname-01 microsoft-windows-security-auditing[success]
>>>> 4648 A logon was attempted by that account@s credentials.
>>>>
>>>> Where @ is a byte with hex value 0x92, which is a valid graphical
>>>> apostrophe in the Windows-1252 character set, but in Unicode the code
>>>> points from 127 to 159 decimal are control characters. I have
>>>> truncated the actual log message for brevity here. There has to be syslog
>>>> load before and after this message is received to see the issue.
>>>>
>>>> Specifically, when reading UTF-8 (g-string is native UTF-8), byte
>>>> 0x92 seems to look for a corresponding 0x9c and to ignore NUL terminators
>>>> in between... (See Wikipedia's page on C0 and C1 control codes for a
>>>> little historical information.)
>>>>
>>>> Looking at the contents of the <src> variable (in 3.7.3 code), it
>>>> contains multiple syslog messages in syslog-format.c, and strlen of <src>
>>>> does not equal <left> prior to the procedure call into utf8utils.c. The
>>>> message received on the wire is about 850 bytes long, <src> is about 8000
>>>> bytes when going into utf8utils.c and about 15 bytes in the reassigned ptr
>>>> variable of the g-string, hence the assert being triggered.
>>>>
>>>> Going to move to 3.8.1 as there has been a bit of work in this area
>>>> since 3.7.3 and will retest tomorrow.
>>>>
>>>> Is there any way to control the character set the inbound message is
>>>> parsed against? We only want a UTF-8 compliant stream to be output by
>>>> syslog-ng.
>>>>
>>>> Alternatively, is there a way to filter this char out on an upstream
>>>> syslog-ng instance, please? (It passes through an identical instance
>>>> that does not have sanitize-utf8 enabled without any problem.)
>>>>
>>>> Kind regards,
>>>>
>>>> James
>>>>
>>>>
>>>>
>>>> On 7 January 2017 19:44:15 GMT+00:00, "Szalai, Attila" <
>>>> Attila.Szalai at morganstanley.com> wrote:
>>>>>
>>>>> I’ve checked the glib source too (in version 2.50, but I do not think
>>>>> it changed too much between the two versions) and have no idea how this
>>>>> could happen.
>>>>>
>>>>>
>>>>>
>>>>> So, an example line is definitely needed to find the root cause.
>>>>>
>>>>>
>>>>>
>>>>> On the other hand, there is a trick in that code to save a malloc: a
>>>>> “static”[*] buffer is used. Therefore, if that buffer is reallocated
>>>>> (and the “static” buffer is therefore freed), the memory gets
>>>>> corrupted.
>>>>>
>>>>>
>>>>>
>>>>> [*] Practically the buffer is allocated from the stack, but it’s
>>>>> working just like a static buffer from the malloc point of view. It should
>>>>> not be freed.
>>>>>
>>>>>
>>>>>
>>>>> *From:* syslog-ng [mailto:syslog-ng-bounces at lists.balabit.hu] *On
>>>>> Behalf Of *James Elstone
>>>>> *Sent:* Friday, January 06, 2017 2:55 PM
>>>>> *To:* Syslog-ng users' and developers' mailing list
>>>>> *Subject:* Re: [syslog-ng] Hitting g_assert when using sanitize-utf8
>>>>> enabled!
>>>>>
>>>>>
>>>>>
>>>>> Sorry; update - It happens on the first packet that contains \x92
>>>>> when sanitize-utf8 is enabled; consistently.
>>>>>
>>>>> Running glib 2.46.2 with Syslog-ng 3.7.3 on FreeBSD 10.3.
>>>>>
>>>>> Any ideas please?
>>>>>
>>>>> Kr,
>>>>>
>>>>> James
>>>>>
>>>>> On 6 January 2017 13:38:58 GMT+00:00, James Elstone <james at elstone.net>
>>>>> wrote:
>>>>>
>>>>> Hi Bazsi,
>>>>>
>>>>> The version of glib is 2.46.2 on FreeBSD 10.3.
>>>>>
>>>>> The issue does not occur on the first packet coming through, but when
>>>>> under light load (~100/sec)...
>>>>>
>>>>> I tried reducing the number of unprintable chars, and now only \x92
>>>>> exists in the inbound message it falls over on. It is always a message
>>>>> with \x92 that causes it to fail.
>>>>>
>>>>> Is there a way to have a filter applied before the message is
>>>>> sanitized by sanitize-utf8, using a regular expression or the like?
>>>>>
>>>>> What if the assert was removed, what effect would it have?
>>>>>
>>>>> Many thanks to all!
>>>>>
>>>>> Kr,
>>>>>
>>>>> James
>>>>>
>>>>> On 6 January 2017 12:49:28 GMT+00:00, "Scheidler, Balázs" <
>>>>> balazs.scheidler at balabit.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Attila is right, it would help a lot to see the original log message
>>>>> and your glib version. That code path uses a performance hack that relies
>>>>> on a GLib implementation detail. Either the glib behaviour has changed or
>>>>> another assumption fails, but just looking at the code I don't know what
>>>>> might.
>>>>>
>>>>>
>>>>> --
>>>>> Bazsi
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jan 6, 2017 at 1:41 PM, Szalai, Attila <
>>>>> Attila.Szalai at morganstanley.com> wrote:
>>>>>
>>>>> Hi James,
>>>>>
>>>>>
>>>>>
>>>>> Checking the source, it means the following:
>>>>>
>>>>>
>>>>>
>>>>> The code allocates a buffer 6 times bigger than the original string
>>>>> length to be able to hold the escaped form of the UTF-8 characters.
>>>>>
>>>>>
>>>>>
>>>>> The assert means that the string, after escaping, did not fit into
>>>>> this buffer for some reason. Or, to be more precise, the GString
>>>>> implementation decided that it should reallocate the string, which usually
>>>>> only happens if the string gets too big to fit into its original place.
>>>>> Currently I have no recent glib source to check if I’m right.
>>>>>
>>>>>
>>>>>
>>>>> The original string could help a lot to find the root cause.
>>>>>
>>>>>
>>>>>
>>>>> Ps.: the escaping works by replacing the original byte with \xHH, so
>>>>> theoretically it can only grow from 1 byte to 4, which should fit into a
>>>>> buffer 6 times bigger than the original size.
>>>>>
>>>>>
>>>>>
>>>>> *From:* syslog-ng [mailto:syslog-ng-bounces at lists.balabit.hu] *On
>>>>> Behalf Of *James Elstone
>>>>> *Sent:* Thursday, January 05, 2017 10:35 PM
>>>>> *To:* syslog-ng at lists.balabit.hu
>>>>> *Subject:* [syslog-ng] Hitting g_assert when using sanitize-utf8
>>>>> enabled!
>>>>>
>>>>>
>>>>>
>>>>> Hi Balabit et al,
>>>>>
>>>>> When using the sanitize-utf8 flag I am hitting a g_assert in
>>>>> modules/syslogformat/syslog-format.c; what could be causing this?
>>>>>
>>>>> Any advice welcome!!
>>>>>
>>>>> Kr,
>>>>>
>>>>> James
>>>>>
>>>>
>>>>
>>