Re: [syslog-ng] Disk buffer file truncation issue

14 Jun 2020

Hi,

Here is some more detail on the issue:

Buffer files on the syslog-ng machine:

root@mymachine:/etc/syslog-ng/disk-buffer# ls -l
total 61580512
-rw------- 1 root root        4096 Jun 14 16:57 syslog-ng-00000.rqf
-rw------- 1 root root        4096 Jun  8 13:00 syslog-ng-00001.rqf
-rw------- 1 root root 63058421272 Jun 14 16:57 syslog-ng-00002.rqf
-rw------- 1 root root        4096 Jun 14 16:56 syslog-ng-00003.rqf

When I run syslog-ng in debug mode, I see the following lines related to
the disk buffer:

[2020-06-14T16:43:32.085616] Reliable disk-buffer state loaded;
filename='/etc/syslog-ng/disk-buffer/syslog-ng-00000.rqf',
queue_length='0', size='0'
[2020-06-14T16:43:32.085690] Reliable disk-buffer internal state;
filename='/etc/syslog-ng/disk-buffer/syslog-ng-00000.rqf',
backlog_head='4096', read_head='4096', write_head='4096', backlog_len='0'
[2020-06-14T16:43:32.086555] WARNING: window sizing for tcp sources were
changed in syslog-ng 3.3, the configuration value was divided by the value
of max-connections(). The result was too small, clamping to value of
min_iw_size_per_reader. Ensure you have a proper log_fifo_size setting to
avoid message loss.; orig_log_iw_size='2', new_log_iw_size='100',
min_iw_size_per_reader='100', min_log_fifo_size='50000'
[2020-06-14T16:43:32.086677] Accepting connections; addr='AF_INET(
0.0.0.0:514)'
[2020-06-14T16:43:32.086802] Reliable disk-buffer state loaded;
filename='/etc/syslog-ng/disk-buffer/syslog-ng-00002.rqf',
queue_length='14255828', size='20391794080'
*[2020-06-14T16:43:32.086871] Reliable disk-buffer internal state;
filename='/etc/syslog-ng/disk-buffer/syslog-ng-00002.rqf',
backlog_head='42534306360', read_head='42534307736',
write_head='62926101816', backlog_len='1'*
[2020-06-14T16:43:32.087359] Seeking the journal to the last cursor
position;
cursor='s=3f7a52af1f26424db365ccc75b2cf79d;i=192f;b=6921d51da51b41d5a26d8674232c77d9;m=75ef33f5cc;t=5a80dfded040d;x=b88b960795be54d2'
[2020-06-14T16:43:32.087572] Reliable disk-buffer state loaded;
filename='/etc/syslog-ng/disk-buffer/syslog-ng-00003.rqf',
queue_length='0', size='0'
[2020-06-14T16:43:32.087637] Reliable disk-buffer internal state;
filename='/etc/syslog-ng/disk-buffer/syslog-ng-00003.rqf',
backlog_head='4096', read_head='4096', write_head='4096', backlog_len='0'

I can see that logs are getting sent out to the destination server.
However, the buffer is still piling up.
I am not sure how to get the buffer cleared.
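
In case it helps with the analysis, the numbers in the debug output above
can be cross-checked with a short script. This is only a rough sketch: it
assumes that backlog_head, read_head and write_head are byte offsets into
the .rqf file and that the queue has not wrapped around (write_head is
ahead of read_head). The function names are made up for illustration and
are not part of syslog-ng.

```python
import re

# Debug lines for syslog-ng-00002.rqf, copied from the output above
# (each wrapped log entry joined back into one string).
STATE_LOADED = (
    "Reliable disk-buffer state loaded; "
    "filename='/etc/syslog-ng/disk-buffer/syslog-ng-00002.rqf', "
    "queue_length='14255828', size='20391794080'"
)
INTERNAL_STATE = (
    "Reliable disk-buffer internal state; "
    "filename='/etc/syslog-ng/disk-buffer/syslog-ng-00002.rqf', "
    "backlog_head='42534306360', read_head='42534307736', "
    "write_head='62926101816', backlog_len='1'"
)

# Size of syslog-ng-00002.rqf as shown by `ls -l` at 16:57.
FILE_SIZE = 63058421272


def parse_fields(line):
    """Return the key='value' pairs of a debug line as a dict (hypothetical helper)."""
    return dict(re.findall(r"(\w+)='([^']*)'", line))


def summarize(loaded_line, internal_line, file_size):
    """Derive byte counts from the head offsets.

    Assumption: the heads are byte offsets and the queue is not wrapped.
    """
    loaded = parse_fields(loaded_line)
    internal = parse_fields(internal_line)
    backlog_head = int(internal["backlog_head"])
    read_head = int(internal["read_head"])
    write_head = int(internal["write_head"])
    return {
        "reported_size": int(loaded["size"]),
        "unread_bytes": write_head - read_head,
        "backlog_bytes": read_head - backlog_head,
        "file_bytes_past_write_head": file_size - write_head,
    }


summary = summarize(STATE_LOADED, INTERNAL_STATE, FILE_SIZE)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With the values above, write_head minus read_head comes out equal to the
size='20391794080' reported in the "state loaded" line, so syslog-ng itself
counted about 20 GB as queued when it loaded the file at 16:43. The ls
output was taken about 14 minutes later, which may explain why the file is
somewhat larger than write_head.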

Please advise.

Thanks
Raghu

On Sat, Jun 13, 2020 at 9:09 PM Raghunath Adhyapak <funduraghu@gmail.com>
wrote:
...
Hi, I don't have an exact way to reproduce it. However, this is what we
observe:
- The buffer file keeps increasing in size.
- We keep getting syslog messages and there doesn't seem to be any issue
with the service.
- When we check the buffer size it grows into gigabytes.
- At this point, in one case we saw the buffer file size was around 4GB.
We restarted the service and after the restart, boom, the buffer files got
cleared immediately (size of buffer file was 4k)
- Next time when this happened, we checked the contents of the buffer file
and investigated whether the events were received or not. We found that we
had received all the events.
Therefore we believe that there was a file truncation issue and restart
helped to clear it.
Let me know if you need any further details, and also how to prevent
this from happening.
Thanks
Raghu
On Tue, Jun 9, 2020, 12:58 Nagy Gábor <gabor.hl@gmail.com> wrote:
...
Hi,
Can you describe the issue in detail, please? Could you share the
reproduction steps?
You say there is an issue with the disk-buffer being truncated after it's
emptied, and that it is resolved when syslog-ng restarts?
How do you check if the queue is empty?
Regards,
Gabor
On Tue, Jun 9, 2020, 0:08 Raghunath Adhyapak <funduraghu@gmail.com> wrote:
...
Correcting subject
We did some more troubleshooting on this and found that all logs in the
buffer had indeed been sent out, and that syslog-ng was having trouble
truncating the file.
The issue was fixed by a restart.
However, we are observing that this issue is happening too often.
Would anyone help me understand why this could be happening?
Thanks
Raghu
On Mon, May 11, 2020, 19:34 Fabien Wernli <wernli@in2p3.fr> wrote:
...
Hi,
On Mon, May 11, 2020 at 12:23:27PM +0000, László Várady (lvarady) wrote:
...
...
2. Why couldn't syslog-ng resume operations after partition was
freed up and destination was available?
That might be a bug. Once a destination becomes available (set
time-reopen() to a lower value to check more frequently), syslog-ng should
send messages out from the disk buffer.
Could you reproduce this issue and share the reproduction steps?
I remember having a lot of corrupt disk buffers when disk was full.
In my case, they caused a segfault on startup, maybe that got "fixed" by a
deletion instead?
______________________________________________________________________________
Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
Documentation:
http://www.balabit.com/support/documentation/?product=syslog-ng
FAQ: http://www.balabit.com/wiki/syslog-ng-faq