On Tue, 15 Jan 2008 13:50:20 +0200, dstuxo said:
I am using syslog-ng in a system which is supposed to be fail safe to avoid data loss. However, I encountered an unexpected delay and messages were lost.
This turns out to be a *lot* harder to achieve than you would expect...
I know about the fsync(yes) option and I tried it, but under a high load of messages it doesn't work properly (syslog-ng.config becomes corrupted and messages are also lost)
On any given hardware, there is an *absolute* upper limit to how many fsync() operations it can support per second - and it's often quite low, on the order of 50-75 per second. This is mainly because an fsync() will almost always cause a disk seek, which can take up to 10 milliseconds (restricting you to at most 100 per second). There are only a few ways around this:

0) Many/most disk drives have a rather small (8M to 64M) cache in front of them. If you care about performance, you enable it; if you care about "fail safe", you turn it off, because if you do a "wait 2 seconds then I power down", anything that had made it to the disk's cache but wasn't written out *will go away*. If you need to run with the cache on, you need (a) a UPS, so the machine never loses power before you can (b) properly shut down the disks, including a "flush cache" and waiting for it to complete, before pulling the power...

1) Tell syslog-ng to only fsync every <insert time unit here>, and be prepared to lose the last few seconds' worth if there's a crash.

2) Be prepared to spend US$400,000 and up for a high-end EMC or similar disk array that can handle insanely high I/O loads. Part of this cost is the battery backup for the disks, so that even if you "wait 2 seconds and power down", the disks don't actually spin down.

3) Learn to accept that unless you spend lots of money, there are lots of ways that you can lose messages.
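
To make the trade-off in option 1) concrete, here is a minimal Python sketch of a log writer that calls fsync() at most once per interval instead of once per message. This is not how syslog-ng implements anything; the class name BatchedFsyncLog, its parameters, and the fsync_calls counter are made up for illustration. Whatever was written since the last fsync() is exactly what you stand to lose in a crash, and none of this helps if the drive's own write cache is lying to you (see option 0).

```python
import os
import time


class BatchedFsyncLog:
    """Append lines to a file, calling fsync() at most once per interval.

    Hypothetical helper for illustration only, not a syslog-ng API.
    """

    def __init__(self, path, interval=1.0, clock=time.monotonic):
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        self._interval = interval      # seconds between fsync() calls
        self._clock = clock            # injectable so the logic can be tested
        self._last_sync = clock()
        self._dirty = False            # True if data was written since last fsync
        self.fsync_calls = 0           # counter, only here to show the effect

    def write(self, line):
        data = line.encode("utf-8") + b"\n"
        while data:                    # os.write() may write fewer bytes than asked
            written = os.write(self._fd, data)
            data = data[written:]
        self._dirty = True
        # Only pay for an fsync() once the interval has elapsed.
        if self._clock() - self._last_sync >= self._interval:
            self.sync()

    def sync(self):
        if self._dirty:
            os.fsync(self._fd)         # ask the OS to push the data to the device
            self.fsync_calls += 1
            self._dirty = False
        self._last_sync = self._clock()

    def close(self):
        self.sync()                    # flush whatever is still pending
        os.close(self._fd)
```

With interval=1.0 you pay for at most one fsync() per second no matter how many messages arrive, at the price of up to one second of messages being unsynced at any moment. A real daemon would also want a timer, so that a quiet period after a burst doesn't leave the tail unsynced until the next write.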