[syslog-ng] Reload does not free up RAM

Evan Rempel erempel at uvic.ca
Tue Oct 7 18:06:14 CEST 2008


Evan Rempel wrote:
> Balazs Scheidler wrote:
>> On Mon, 2008-10-06 at 11:38 -0700, Evan Rempel wrote:
>>> Balazs Scheidler wrote:
>>>> On Sun, 2008-10-05 at 13:49 -0700, Evan Rempel wrote:
>>>>> Balazs Scheidler wrote:
>>>>>>>>>>> We have cases where a syslog-ng buffers quite a bit of data (2GB or so) before
>>>>>>>>>>> it is able to flush the data to the destination. Currently, the only way to "free up" that RAM
>>>>>>>>>>> is to restart syslog-ng, which means that some log messages are lost during the restart
>>>>>>>>>>> window.
>>>>>> Both PE and OSE frees up disk/memory when possible. If they don't, then
>>>>>> that's probably a leak.
>>> No, it isn't a memory leak as the process can be months old and only grows when
>>> it buffers messages.
>>>
>>>>>> So if your syslog-ng process is 2GB in size, and you didn't specify an
>>>>>> overly large log_fifo_size(), it's probably because of a leak.
>>>  >>
>>>>> I have a large log_fifo_size() but the process starts small, grows when it needs
>>>>> to buffer, and does NOT get small again when the destination is available.
>>>> Hmm... that might be related to the malloc implementation. Which
>>>> platform are you using?
>>>>
>>>>>>> 2. The buffer space should be thrown away only after the message
>>>>>>>     has been delivered to the destination.
>>>>>> I see, this is what should happen now.
>>>  >>
>>>>> This does not appear to be happening.
>>>>> I am using OSE 2.0.8 Have there been memory fixes/changes since then?
>>>> Not that I know of.
>>>>
>>>>> Is there a way to get buffer count statistics out? How many messages are
>>>>> in each fifo buffer?
>>>> In the experimental 3.0 branch, yes there is. In current 2.0, I'm afraid
>>>> there's none.
>>>>
>>>> Is there a limit on the growth of syslog-ng? I mean if it gets large
>>>> once and then empties its buffers, it stays large, but does it start
>>>> growing again, or it stays steadily at the size it was before?
>>> I have not specifically tested this scenario, but we do track processes
>>> and memory usage, and it appears that once the space is allocated, it does
>>> not grow unless a larger set of messages gets buffered.
>>>
>>>
>>>> Again, the platform matters a lot, how the local malloc implementation
>>>> takes care about fragmentation, whether it shrinks the process once
>>>> memory is freed.
>>> This is running on Redhat AS 3 (2.4 kernel) and Redhat AS 4 (2.6 kernel)
>>>
>>> On the Redhat 4 system the glibc packages installed are
>>>
>>> glibc-common-2.3.4-2.36
>>> glibc-devel-2.3.4-2.36
>>> glibc-headers-2.3.4-2.36
>>> glib2-2.4.7-1
>>> glibc-kernheaders-2.4-9.1.100.EL
>>>
>> hmm.. this version should be all right, at least in my experience glibc
>> tries to release memory back to the system once the application freed it
>> using free().
>>
>> Maybe it is some kind of memory fragmentation issue. How easy is this to
>> reproduce? Does it require months of running, or you have a simple
>> recipe to create it?
> 
> We are encountering this fairly easily in production, so I should be able to
> verify a recipe. It will take me a couple of days to "get to it", but will report back.


I thought that this deserved some attention...

make a destination

destination erempel     { pipe("/var/log/syslog.pipes/erempel" log_fifo_size(1000000) perm(0666) ); };

and set up some logging to it. I directed only *.debug stuff to mine.

filter f_debug  { priority(debug); };

log { source(local); filter(f_debug); destination(erempel); flags(final); };


Start a "tail -f /var/log/syslog.pipes/erempel"

and then generate some logs
ls -1 /usr/bin | logger -p user.debug

and I see the stuff pop out from the "tail -f"

Cancel the tail process and product a whole bunch more logs.

ls -1 /usr/bin | logger -p user.debug
repeatedly while watching the size of the syslog process.

Tue Oct 7 08:56:09 PDT 2008 root 23265 1.7 0.1 4004 1396 ? Ss 08:53 0:02 /usr/local/sbin/syslog-ng
Tue Oct 7 08:56:14 PDT 2008 root 23265 1.7 0.1 4004 1412 ? Ss 08:53 0:02 /usr/local/sbin/syslog-ng
Tue Oct 7 08:56:19 PDT 2008 root 23265 1.7 0.1 4004 1412 ? Ss 08:53 0:02 /usr/local/sbin/syslog-ng
Tue Oct 7 08:56:24 PDT 2008 root 23265 2.5 1.2 15524 12840 ? Ss 08:53 0:04 /usr/local/sbin/syslog-ng
Tue Oct 7 08:56:29 PDT 2008 root 23265 3.6 3.1 35384 32668 ? Ss 08:53 0:06 /usr/local/sbin/syslog-ng
Tue Oct 7 08:56:34 PDT 2008 root 23265 4.7 5.1 55904 53264 ? Ss 08:53 0:08 /usr/local/sbin/syslog-ng
Tue Oct 7 08:56:40 PDT 2008 root 23265 5.7 7.0 75724 73100 ? Ss 08:53 0:10 /usr/local/sbin/syslog-ng
Tue Oct 7 08:56:45 PDT 2008 root 23265 5.6 7.1 76504 73864 ? Ss 08:53 0:10 /usr/local/sbin/syslog-ng
Tue Oct 7 08:56:50 PDT 2008 root 23265 6.8 7.1 76504 73864 ? Ss 08:53 0:13 /usr/local/sbin/syslog-ng
Tue Oct 7 08:56:55 PDT 2008 root 23265 6.6 7.1 76504 73864 ? Ss 08:53 0:13 /usr/local/sbin/syslog-ng

The intent is to start the "rail -f" again to read all of the records and see if the memory gets smaller,
but when I start the tail -f again, I get NOTHING. Syslog-ng seems to have closed the pipe destination
and is NOT reopening it. Fortunately, the author produced a HUP signal that will cause all outputs
to be closed and opened again. So, I do a HUP, and only the LAST 10 records are displayed. Where did all
of the other records go?

If I do
ls -1 /usr/bin | logger -p user.debug
I see the full list from the "tail -f"

Back to memory. The memory never gets smaller.

Now I am beginning to thing that the records got orphaned inside of syslog-ng such that they never
made it to the destination, and as such are never candidates for memory reclaiming.

I think I have stumbled on a bug for record loss, and I would like to state again that ALL destinations
should follow the time_reopen() setting.

-- 
Evan Rempel


More information about the syslog-ng mailing list