[syslog-ng] race condition causes copytruncate log rotation problems
Jim Segrave
jes at j-e-s.net
Fri Apr 29 16:16:59 CEST 2016
On a set of very busy Usenet servers, we are seeing problems when using
syslog-ng together with copytruncate log rotation. I am not sure why,
but syslog-ng opens log files without O_APPEND set and appends log
messages with two separate syscalls: an lseek to SEEK_END followed by a
write. On a busy server (perhaps 300 log messages per second, each about
120 bytes long) we sometimes find that logrotate, which is configured to
use copytruncate, leaves behind a sparse file that begins with blocks of
zeroes up to the pre-rotation log file size, followed by the messages
written after the copytruncate completes.
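
A minimal sketch, assuming Python is available on the server, of how one
might check whether a log file shows this symptom by counting the NUL
bytes at its start. The function name count_leading_nul_bytes is made up
for this illustration and is not part of syslog-ng or logrotate.

```python
def count_leading_nul_bytes(path, chunk_size=65536):
    """Return how many NUL bytes the file at `path` starts with.

    Hypothetical helper for illustration: a healthy text log should
    return 0, while a log hit by the problem described above should
    return roughly the pre-rotation file size.
    """
    count = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # End of file reached: everything read so far was NUL
                # (or the file is empty).
                return count
            stripped = chunk.lstrip(b"\x00")
            count += len(chunk) - len(stripped)
            if stripped:
                # First non-NUL byte found, so stop scanning.
                return count
```

Reading holes in a sparse file returns zero bytes, so this scan does not
distinguish a hole from zeroes that were really written; comparing
st_blocks with st_size from stat would be one way to tell them apart.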
Looking at the Linux /proc/PID/fdinfo information, I see that the
logfiles are not opened in append mode, although they are opened for
writing. I haven't yet looked through the source to see whether
syslog-ng ever tries to overwrite data it has already logged by seeking
backwards in the file; I must confess I can't think of any reason to do
so. The only other reason I can think of for not setting O_APPEND would
be to avoid problems for people who collect logs centrally over NFS,
with different servers all writing to one NFS file.
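
The fdinfo check can be scripted. The following is a sketch only: it
assumes the Linux fdinfo layout with a "flags:" line printed in octal,
and it assumes the script runs on the same architecture as the process
being inspected, so that Python's os.O_* constants match the kernel's
values. The function names are invented for this example.

```python
import os

ACCESS_MODE_MASK = 0o3  # low two bits select read-only/write-only/read-write

def parse_fdinfo_flags(fdinfo_text):
    """Return the open flags as an int from the text of /proc/PID/fdinfo/FD."""
    for line in fdinfo_text.splitlines():
        if line.startswith("flags:"):
            # The kernel prints the flags in octal, e.g. "flags:\t0100001".
            return int(line.split(":", 1)[1].strip(), 8)
    raise ValueError("no 'flags:' line found")

def describe_flags(flags):
    """Return (access_mode_name, append_mode_set) for a flags value."""
    names = {os.O_RDONLY: "O_RDONLY", os.O_WRONLY: "O_WRONLY", os.O_RDWR: "O_RDWR"}
    access = names.get(flags & ACCESS_MODE_MASK, "unknown")
    return access, bool(flags & os.O_APPEND)

def inspect_fd(pid, fd):
    """Read /proc/<pid>/fdinfo/<fd> (Linux only) and describe its flags."""
    with open(f"/proc/{pid}/fdinfo/{fd}") as f:
        return describe_flags(parse_fdinfo_flags(f.read()))
```

Calling inspect_fd with the syslog-ng PID and the descriptor number of a
log file (both found via /proc/PID/fd) would return a pair such as the
access mode name and whether O_APPEND is set.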
As I see it, the sequence leading to this problem is:
syslog-ng                           logrotate copytruncate function
--------------------------------    -----------------------------------------
...
                                    begin copytruncate, reach EOF on the copy
lseek(fd, 0, SEEK_END)
  <-- the offset is now the end of the file as it
      was before logrotate ftruncate()s it
                                    ftruncate(fd, 0)
write(fd, log_msg, log_msg_size)
  <-- the write takes place at the old EOF, so the
      kernel considers the file to have empty blocks
      from 0 to that point
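
The sequence above can be replayed deterministically in a single
process, with two descriptors standing in for syslog-ng and logrotate.
This is a sketch under that assumption, not a capture of what either
program actually does internally; the function name and the sizes are
arbitrary.

```python
import os
import tempfile

def reproduce_sparse_prefix(old_size=4096, msg=b"new log line\n"):
    """Replay lseek / ftruncate / write on a file opened without O_APPEND.

    Returns (final_size, file_contents).
    """
    tmp_fd, path = tempfile.mkstemp()
    os.close(tmp_fd)
    writer = os.open(path, os.O_WRONLY)   # stands in for syslog-ng, no O_APPEND
    rotator = os.open(path, os.O_RDWR)    # stands in for logrotate
    try:
        os.write(writer, b"x" * old_size)   # logs written before rotation
        os.lseek(writer, 0, os.SEEK_END)    # writer seeks to the current EOF
        os.ftruncate(rotator, 0)            # rotation truncates the file
        os.write(writer, msg)               # lands at the old EOF offset
        size = os.fstat(writer).st_size
        with open(path, "rb") as f:
            data = f.read()
        return size, data
    finally:
        os.close(writer)
        os.close(rotator)
        os.remove(path)
```

The returned size is old_size plus the message length, and the first
old_size bytes read back as zeroes, because the writer's file offset is
not reset when another descriptor truncates the file.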
If there is a reason that some users need the logfile opened without
O_APPEND, then making that choice a configuration option would be
helpful. For those using O_APPEND, the lseek() calls are harmless and
won't add any resource usage beyond what is already present, and
truncating the file will no longer cause a zero-filled pad to be
prefixed to it.
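
To illustrate the difference O_APPEND makes, here is the same kind of
single-process replay with the writer's descriptor opened in append
mode. It is only a sketch on a local filesystem; the function name is
invented, and it says nothing about how syslog-ng would expose such an
option.

```python
import os
import tempfile

def replay_with_o_append(old_size=4096, msg=b"new log line\n"):
    """Replay lseek / ftruncate / write with the writer using O_APPEND.

    Returns (final_size, file_contents).
    """
    tmp_fd, path = tempfile.mkstemp()
    os.close(tmp_fd)
    writer = os.open(path, os.O_WRONLY | os.O_APPEND)  # append mode
    rotator = os.open(path, os.O_RDWR)                 # stands in for logrotate
    try:
        os.write(writer, b"x" * old_size)   # logs written before rotation
        os.lseek(writer, 0, os.SEEK_END)    # harmless with O_APPEND
        os.ftruncate(rotator, 0)            # rotation truncates the file
        os.write(writer, msg)               # kernel moves to the new EOF first
        size = os.fstat(writer).st_size
        with open(path, "rb") as f:
            data = f.read()
        return size, data
    finally:
        os.close(writer)
        os.close(rotator)
        os.remove(path)
```

Here the file ends up containing only the new message, with no zero
padding, since each write in append mode is positioned at the end of the
file as it is at the time of the write.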