[syslog-ng] race condition causes copytruncate log rotation problems

Jim Segrave jes at j-e-s.net
Fri Apr 29 16:16:59 CEST 2016


On a set of very busy Usenet servers, we are seeing problems when using
syslog-ng together with logrotate's copytruncate option. I am not sure
why, but syslog-ng opens log files without O_APPEND set and appends log
messages with two separate syscalls: an lseek to SEEK_END, followed by a
write. On a busy server (perhaps 300 logs/sec, each about 120 bytes
long), we sometimes find that logrotate, which is configured to do a
copytruncate, actually produces a sparse file that begins with blocks of
zeroes up to the pre-rotate log file size, followed by the logs written
after the copytruncate completes.

Looking at the Linux /proc/PID/fdinfo information, I see that the
logfiles are not opened in append mode, although they are opened for
writing. I haven't begun looking through the source to see if syslog-ng
ever tries to overwrite data it has already logged by seeking backwards
in the file. I must confess I can't think of any reason to do so. The
only other reason for not having O_APPEND set would be to avoid problems
for people using NFS to collect logs centrally by having different
servers all writing to one NFS file.
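
For anyone who wants to repeat the fdinfo check, here is a minimal
sketch in C, assuming a Linux /proc where the fdinfo file contains a
"flags:" line holding the open flags in octal. The function name
fdinfo_has_append is my own invention for illustration and is not part
of syslog-ng or logrotate.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>   /* getpid(), for callers inspecting themselves */

/* Hypothetical helper: returns 1 if descriptor `fd` of process `pid`
 * has O_APPEND set according to /proc/PID/fdinfo/FD, 0 if it does not,
 * and -1 if the file cannot be read or has no "flags:" line. */
int fdinfo_has_append(pid_t pid, int fd)
{
    char path[64];
    char line[256];
    unsigned long flags;
    int result = -1;

    assert(fd >= 0);
    snprintf(path, sizeof path, "/proc/%ld/fdinfo/%d", (long)pid, fd);

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof line, fp) != NULL) {
        /* The flags value is printed in octal, hence %lo. */
        if (sscanf(line, "flags: %lo", &flags) == 1) {
            result = (flags & O_APPEND) ? 1 : 0;
            break;
        }
    }
    fclose(fp);
    return result;
}
```

Called as fdinfo_has_append(getpid(), fd) it inspects the calling
process; passing the syslog-ng PID and one of its descriptor numbers
needs permission to read that process's /proc entries.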

As I see it, the sequence leading to this problem is:

syslog-ng                         logrotate copytruncate function
--------------------------------  --------------------------------
...
                                  begin copytruncate, reach EOF on
                                  the copy
lseek(fd, 0, SEEK_END)
  <-- the offset is now the end of the file as it was
      before logrotate ftruncate()s it

                                  ftruncate(fd, 0)
write(fd, log_msg, log_msg_size)
  <-- the write takes place at the old EOF, so the kernel
      considers the file to have empty blocks from 0 to
      that point
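
The interleaving above can be replayed deterministically in one process
with two descriptors on the same file. This is only a sketch of the
suspected sequence, not syslog-ng or logrotate code; the function name
replay_truncate_race, the scratch path, and the message text are all
made up for the demonstration.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: replays lseek / ftruncate / write on a scratch
 * file opened WITHOUT O_APPEND. Returns the final file size, or -1 on
 * error. The scratch file is removed before returning. */
off_t replay_truncate_race(const char *path, size_t old_len)
{
    static const char msg[] = "new log line\n";   /* 13 bytes */
    const ssize_t msg_len = (ssize_t)(sizeof msg - 1);
    struct stat st;
    off_t size = -1;

    assert(path != NULL);

    /* The "syslog-ng" descriptor: writable, no O_APPEND. */
    int wfd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (wfd < 0)
        return -1;
    /* The "logrotate" descriptor on the same file. */
    int rfd = open(path, O_RDWR);
    if (rfd < 0) {
        close(wfd);
        return -1;
    }

    /* Pre-rotation log content. */
    for (size_t i = 0; i < old_len; i++)
        if (write(wfd, "x", 1) != 1)
            goto out;

    if (lseek(wfd, 0, SEEK_END) < 0)        /* writer seeks to old EOF */
        goto out;
    if (ftruncate(rfd, 0) != 0)             /* rotation truncates      */
        goto out;
    if (write(wfd, msg, (size_t)msg_len) != msg_len)  /* stale offset  */
        goto out;

    if (fstat(wfd, &st) == 0)
        size = st.st_size;
out:
    close(rfd);
    close(wfd);
    unlink(path);
    return size;
}
```

With old_len set to 4096, the file ends up 4096 + 13 bytes long: the
message sits at the old EOF and everything before it reads back as
zeroes.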


If there is a reason that some users need the logfile opened without
O_APPEND, then making that choice a configuration option would be
helpful. For those using O_APPEND, the lseek() calls are harmless and
add no resource usage beyond what is already present, and a file
truncation will no longer cause a zero-filled pad to be prefixed to the
file.
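
To illustrate the O_APPEND behaviour, here is a minimal sketch that runs
the same lseek / ftruncate / write sequence on a descriptor opened with
O_APPEND. It assumes a local Linux filesystem; the function name
replay_with_append, the scratch path, and the message text are
hypothetical and not taken from syslog-ng.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: same sequence as the race, but the writer's
 * descriptor has O_APPEND set. Returns the final file size, or -1 on
 * error. The scratch file is removed before returning. */
off_t replay_with_append(const char *path, size_t old_len)
{
    static const char msg[] = "new log line\n";   /* 13 bytes */
    const ssize_t msg_len = (ssize_t)(sizeof msg - 1);
    struct stat st;
    off_t size = -1;

    assert(path != NULL);

    /* The writer's descriptor, this time with O_APPEND. */
    int wfd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, 0600);
    if (wfd < 0)
        return -1;
    /* The "logrotate" descriptor on the same file. */
    int rfd = open(path, O_RDWR);
    if (rfd < 0) {
        close(wfd);
        return -1;
    }

    /* Pre-rotation log content. */
    for (size_t i = 0; i < old_len; i++)
        if (write(wfd, "x", 1) != 1)
            goto out;

    if (lseek(wfd, 0, SEEK_END) < 0)   /* harmless with O_APPEND       */
        goto out;
    if (ftruncate(rfd, 0) != 0)        /* rotation truncates           */
        goto out;
    /* With O_APPEND the kernel moves the offset to the current EOF
     * (now 0) as part of the write itself. */
    if (write(wfd, msg, (size_t)msg_len) != msg_len)
        goto out;

    if (fstat(wfd, &st) == 0)
        size = st.st_size;
out:
    close(rfd);
    close(wfd);
    unlink(path);
    return size;
}
```

Whatever old_len was, the file ends up exactly 13 bytes long, with no
zero-filled region in front of the message.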




