Hi,

The same free policy cannot work with _handle_file_deleted, because we cannot free the readers immediately. There is a race between the file deletion and syslog-ng reading the data: someone can delete a file before syslog-ng has read all of its contents, which would cause message loss. In that case we just signal the deletion event to the reader (log_pipe_notify(&reader->super, NC_FILE_DELETED, NULL);), and the reader object is freed later, when the reader reaches EOF.

Can you send the output of pmap <pid of syslog-ng> with the leak? That may give us a clue about where to look. Also, can you send your configuration, or at least the part about the wildcard file sources? The reproduction might depend on the max_files or monitor_method parameter, or maybe on others.

Thanks,
Antal

On Thu, Sep 6, 2018 at 12:00 PM Jose Angel Santiago <jasantiago@stratio.com> wrote:
Reviewing your commit, I've found that the *_handle_file_deleted* method does not seem to free memory when a monitored file is deleted, unlike the changes you made in *_handler_handler_directory_deleted*. Could that be the cause of the memory leak?
Thanks.
2018-09-06 11:50 GMT+02:00 Jose Angel Santiago <jasantiago@stratio.com>:
Hi,
I was testing your fix in several environments before replying. Unfortunately, syslog-ng still shows huge RAM consumption in our use case.
We have two wildcard-file sources defined in our syslog-ng conf file. Both have the same base-dir, but each has a different filename-pattern (I think we can't use a single source with several filename-patterns because of the lack of regular expression support). We delete obsolete files and folders under base-dir once every 2 days, and we constantly create a lot of new folders and files while using our platform, but the growth of these log files (the ones affected by the filename-pattern) has been pretty low in this test, about 10-20 msgs per second in total.
Here is some data from testing the fix:
Syslog-ng was consuming 21.7G of RAM with 10600 files affected by the filename-pattern and 20000 folders under base-dir (not a single folder had been deleted yet, so this is the expected behaviour of syslog-ng, and we find it extremely unaffordable).
One hour later, right before the cleaning process was executed, syslog-ng was consuming 35G of RAM, and it didn't decrease after the cleaning, not even a little, with just 380 files affected by the filename-pattern and 2200 folders under base-dir.
There are no messages in the syslog-agent queue, and all the messages are received by the relay (which ingests messages into Elasticsearch).
For the record, we did notice a small RAM usage decrease in a previous test in a lab environment when we deleted every folder and file under base-dir (from 870M to 550M).
It seems the syslog-ng wildcard source is not optimized for handling thousands of folders and filename-patterns under base-dir. Have you ever tested the wildcard source with a use case similar to ours?
Can I send you any more information that may help to find out how to optimize it? I wish I could help with the code, but I don't know C :-(
Thanks in advance.
2018-09-04 18:03 GMT+02:00 Nagy, Gábor <gabor.nagy@oneidentity.com>:
Hello,
Thanks for the thorough investigation! We've checked it and found a memory leak in directory monitoring.
I've pushed the fix to my fork and created a pull request for it:
https://github.com/gaborznagy/syslog-ng/commits/fix-wildcard-memleak
https://github.com/balabit/syslog-ng/pull/2261
Can you verify that this fixes your problem, please?
> log-fifo-size value is 345000, so I assume it can't be a buffer situation since 345000 messages can't occupy 40-50 GB of memory.

Well, our default log-msg-size is 64 kB, so if every message is as large as log-msg-size, the buffered messages can take around ~21 GB while syslog-ng is buffering.
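As a rough check of the ~21 GB figure, here is a minimal Python sketch of the arithmetic. It assumes the worst case where every queued message occupies the full log-msg-size and ignores any per-message overhead inside syslog-ng, so it is an upper-bound estimate rather than a measurement; the function name is made up for illustration.

```python
# Worst-case memory held by a full output queue, assuming every message
# occupies the full log-msg-size (an upper bound, not a measurement).
LOG_FIFO_SIZE = 345_000      # log-fifo-size from the report
LOG_MSG_SIZE = 64 * 1024     # 64 kB log-msg-size quoted in the reply


def worst_case_queue_bytes(fifo_size: int, msg_size: int) -> int:
    """Upper bound in bytes: queued message count times max message size."""
    return fifo_size * msg_size


if __name__ == "__main__":
    total = worst_case_queue_bytes(LOG_FIFO_SIZE, LOG_MSG_SIZE)
    # Report in GiB (1024**3 bytes), which is where "~21" comes from.
    print(f"{total / 1024**3:.1f} GiB")
```

Under these assumptions a full queue could explain roughly 21 GB, but not the 40-50 GB reported.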
Best Regards, Gabor
On Tue, Sep 4, 2018 at 10:50 AM Jose Angel Santiago <jasantiago@stratio.com> wrote:
Hi,
I'm using syslog-ng 3.16.1 with wildcard-file as a source (let's call it "syslog-agent"), which sends log messages to another syslog-ng acting as a relay.
I've noticed that the RAM consumption of the syslog-agent instances keeps increasing until there is no free memory left in the cluster (each server has 64G of RAM). In my use case, new folders and files are created constantly under the base-dir folder, but every 2 days obsolete folders and files are deleted. I assumed that syslog-ng would free some RAM every time those folders and files are deleted, but it doesn't, not even if I run a syslog-ng-ctl reload operation.
log-fifo-size value is 345000, so I assume it can't be a buffer situation since 345000 messages can't occupy 40-50 GB of memory.
I've performed the following test to reproduce the situation on a small scale:
- Launch a syslog-agent with a wildcard-file source reading from the "/tmp/test/" base-dir. syslog-agent RAM usage is about 125M.
- Run a simple script to create a complex folder hierarchy under /tmp/test and some files with 5000 log messages to read from.
- Wait until syslog-agent RAM usage reaches 1GB.
- Stop the script and wait until syslog-agent has sent all logs to the relay.
- Delete everything under /tmp/test and execute a syslog-ng-ctl reload operation.
- 24 hours later, syslog-agent RAM usage is still 1GB.
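The script used in the test above was not posted, so the following is only a minimal Python sketch of what such a generator might look like. The function name, the folder and file names, and the depth and fanout values are all assumptions chosen for illustration; only the /tmp/test base-dir and the 5000 messages per file come from the description.

```python
from pathlib import Path


def populate(base_dir, depth=3, fanout=4, lines_per_file=5000):
    """Create a nested folder tree under base_dir, one log file per leaf folder.

    Returns the list of created file paths. All names are hypothetical.
    """
    created = []
    base = Path(base_dir)

    def walk(current, level):
        if level == depth:
            # Leaf folder: write one file with the requested number of lines.
            log_file = current / "app.log"
            with log_file.open("w") as fh:
                for i in range(lines_per_file):
                    fh.write(f"test message {i} from {current.name}\n")
            created.append(log_file)
            return
        # Inner folder: create `fanout` subfolders and recurse into each.
        for n in range(fanout):
            child = current / f"dir{level}_{n}"
            child.mkdir(parents=True, exist_ok=True)
            walk(child, level + 1)

    base.mkdir(parents=True, exist_ok=True)
    walk(base, 0)
    return created


# Example (writes fanout**depth files): populate("/tmp/test")
```

Calling it repeatedly with different base folders, or raising depth and fanout, would grow the hierarchy in the way the test describes.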
I've used the heaptrack tool to try to find a memory leak in syslog-ng. You can see in the attached image that the iv_list_empty function in iv_list.h is where most of the RAM usage is.
How do I get syslog-ng to free RAM? Or is it a memory leak?
Thanks in advance.
--
| Jose Angel Santiago
Vía de las dos Castillas, 33, Ática 4, 3ª Planta
28224 Pozuelo de Alarcón, Madrid, Spain
+34 918 286 473 | www.stratio.com
______________________________________________________________________________
Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
FAQ: http://www.balabit.com/wiki/syslog-ng-faq