Hi,

The same free policy cannot work with _handle_file_deleted, because we cannot free the readers immediately. There is a race between the file deletion and syslog-ng reading the data: someone may delete a file before syslog-ng has read all of its contents, and freeing the reader at that point would cause message loss. In that case we only signal the deletion event to the reader (log_pipe_notify(&reader->super, NC_FILE_DELETED, NULL);), and the reader object is freed later, when the reader reads EOF.
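
To make the deferred-free idea concrete, here is a toy model in Python. It is only a sketch, not syslog-ng code: the Reader class, its methods and the drain helper are hypothetical names invented for illustration. Deletion only marks the reader, and the reader is released once it reaches EOF.

```python
class Reader:
    """Toy stand-in for a file reader (hypothetical, not the syslog-ng type)."""

    def __init__(self, pending_lines):
        self.pending = list(pending_lines)  # data not yet consumed
        self.deleted = False                # set when the file is unlinked
        self.freed = False                  # set when the reader is released

    def notify_deleted(self):
        # Analogous to signalling NC_FILE_DELETED: only record the event,
        # do not release anything yet.
        self.deleted = True

    def read(self):
        """Return the next line, or None at EOF."""
        if self.pending:
            return self.pending.pop(0)
        # EOF reached: release the reader only if its file was deleted.
        if self.deleted:
            self.freed = True
        return None


def drain(reader):
    """Read until EOF and return everything that was still pending."""
    out = []
    while (line := reader.read()) is not None:
        out.append(line)
    return out
```

With this model, a reader that still holds two unread messages when its file is deleted delivers both of them before it is marked as freed.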

Can you send the output of pmap <pid of syslog-ng> taken while the leak is present? That may give us a clue about where to look for the leak.

Also, can you send your configuration, or at least the part about the wildcard-file sources? The reproduction might depend on the max_files or monitor_method parameter, or possibly on other settings.

Thanks,
  Antal


On Thu, Sep 6, 2018 at 12:00 PM Jose Angel Santiago <jasantiago@stratio.com> wrote:
Reviewing your commit I've found out that the method _handle_file_deleted seems not to free memory when a monitored file has been deleted, according to the changes you made in _handler_handler_directory_deleted. Could it be the cause of the memory leak?

Thanks.

2018-09-06 11:50 GMT+02:00 Jose Angel Santiago <jasantiago@stratio.com>:
Hi,

I was testing your fix in several environments before replying. Unfortunately, syslog-ng still shows a huge RAM consumption in our use case.

We have two wildcard-file sources defined in our syslog-ng conf file. Both of them have the same base-dir, but each source has a different filename-pattern (I think we can't use a single source with several filename-patterns because of the lack of regular expression support). We delete obsolete files and folders under base-dir once every 2 days, and we constantly create a lot of new folders and files while using our platform, but the growth of these log files (the ones matched by the filename-patterns) has been pretty low in this test, about 10-20 msgs per second in total.

This is some data of the fix test:

Syslog-ng was consuming 21.7G of RAM with 10600 files matched by the filename-pattern and 20000 folders under base-dir (not a single folder had been deleted yet, so this is the expected behaviour of syslog-ng, and we find it extremely unaffordable).

One hour later, right before the cleaning process was executed, syslog-ng was consuming 35G of RAM. It didn't decrease after the cleaning, not even a little, although only 380 files matched by the filename-pattern and 2200 folders were left under base-dir.

There are no messages in the syslog-agent queue, and all the messages are received by the relay (which ingests messages into Elasticsearch).

For the record, we did notice a small RAM usage decrease in a previous test in a lab environment when we deleted every folder & file under base-dir (from 870M to 550M).

It seems the syslog-ng wildcard-file source is not optimized for handling thousands of folders and files under base-dir. Have you ever tested the wildcard-file source with a use case similar to ours?

Can I send you any more information that may help to find out how to optimize it? I wish I could help with the code, but I don't know C :-(

Thanks in advance.




2018-09-04 18:03 GMT+02:00 Nagy, Gábor <gabor.nagy@oneidentity.com>:
Hello,

Thanks for the thorough investigation!
We've checked it and found a memory leak in directory monitoring.

I've pushed the fix to my fork and created a merge request about it:
Can you verify that this fixes your problem, please?

> log-fifo-size value is 345000, so I assume it can't be a buffer situation since 345000 messages can't occupy 40-50 GB of memory.
Well, our default log-msg-size is 64kB, so if every message is as large as log-msg-size, the buffer can take around 21GB while syslog-ng is buffering.
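
The estimate can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes 64kB means 65536 bytes and ignores any per-message overhead, and the function name is made up for illustration.

```python
LOG_FIFO_SIZE = 345_000    # messages, the log-fifo-size quoted above
LOG_MSG_SIZE = 64 * 1024   # bytes, assuming 64kB means 65536 bytes


def worst_case_buffer_bytes(fifo_size, msg_size):
    """Upper bound: every queued message is as large as log-msg-size."""
    return fifo_size * msg_size


total = worst_case_buffer_bytes(LOG_FIFO_SIZE, LOG_MSG_SIZE)
print(f"{total} bytes = {total / 2**30:.1f} GiB")
```

This is a worst case for a single full queue; real messages are usually much smaller than log-msg-size.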

Best Regards,
Gabor


On Tue, Sep 4, 2018 at 10:50 AM Jose Angel Santiago <jasantiago@stratio.com> wrote:
Hi,

I'm using syslog-ng 3.16.1 with wildcard-file as source (let's call it "syslog-agent"), which sends log messages to another syslog-ng acting as a relay.

I've noticed that the RAM consumption of the syslog-agent instances keeps increasing until they leave no free memory in the cluster (each server has 64G of RAM). In my use case, new folders & files are created constantly under the base-dir folder, but every 2 days obsolete folders & files are deleted. I assumed that syslog-ng would free some RAM every time those folders & files are deleted, but it doesn't, not even if I run a syslog-ng-ctl reload operation.

log-fifo-size value is 345000, so I assume it can't be a buffer situation since 345000 messages can't occupy 40-50 GB of memory.

I've performed the following test to reproduce the situation in small scale:

- Launch a syslog-agent with a wildcard-file source reading from "/tmp/test/" base-dir. syslog-agent RAM usage is about 125M.
- Run a simple script to create a complex folder hierarchy under /tmp/test and some files with 5000 log messages each to read from.
- Wait until syslog-agent RAM usage reaches 1GB.
- Stop the script and wait until syslog-agent has sent all logs to the relay.
- Delete everything under /tmp/test and execute a syslog-ng-ctl reload operation.
- 24 hours later, syslog-agent RAM usage is still 1GB.
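
A script for the second step could look roughly like the sketch below. It is not the script used in the test above: the function name, the directory and file naming scheme (dirN_M, appN.log) and the default sizes are all assumptions made for illustration.

```python
import os


def populate(base_dir, depth=3, fanout=3, files_per_dir=2, lines_per_file=5000):
    """Create a nested folder tree with log files under base_dir.

    Returns the number of log files written.
    """
    created = 0
    level = [base_dir]
    for d in range(depth):
        next_level = []
        for parent in level:
            for i in range(fanout):
                path = os.path.join(parent, f"dir{d}_{i}")
                os.makedirs(path, exist_ok=True)
                next_level.append(path)
                for j in range(files_per_dir):
                    # Hypothetical file name; it must match your filename-pattern.
                    with open(os.path.join(path, f"app{j}.log"), "w") as fh:
                        fh.writelines(
                            f"test message {n}\n" for n in range(lines_per_file)
                        )
                    created += 1
        level = next_level
    return created
```

Calling populate("/tmp/test") repeatedly with larger depth or fanout values grows the tree until the RAM usage of the agent becomes visible.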

I've used the heaptrack tool to try to find a memory leak in syslog-ng. You can see in the attached image that most of the RAM usage is attributed to the iv_list_empty function in the iv_list.h file.

How do I get syslog-ng to free RAM? Or is it a memory leak?

Thanks in advance.


--

| Jose Angel Santiago

Vía de las dos Castillas, 33, Ática 4, 3ª Planta

28224 Pozuelo de Alarcón, Madrid, Spain

+34 918 286 473 | www.stratio.com


______________________________________________________________________________
Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
FAQ: http://www.balabit.com/wiki/syslog-ng-faq

