https://bugzilla.balabit.com/show_bug.cgi?id=113

Balazs Scheidler <bazsi@balabit.hu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|                            |INVALID
             Status|NEW                         |RESOLVED

--- Comment #2 from Balazs Scheidler <bazsi@balabit.hu>  2011-02-23 10:38:17 ---
Well, this is more of an OS limitation. The POSIX API (and the Linux
kernel) has two properties that matter here:

1) it won't tell you in advance whether a write() would block in this
   fashion
2) once we call write(), it doesn't return until the I/O completes

These facts, combined with syslog-ng <= 3.2 being single threaded, cause
all processing in syslog-ng to stop if even a single write() blocks.

3.3 will be somewhat better, since it'll use separate threads to write to
output files. Even in that case, the I/O threads may be fully consumed
(there's only a limited number of them) by different file writes. E.g. if
you have 1000 destination files, each going to the stalled NFS server, you
would need 1000 I/O threads to cope with this situation. The default
number of I/O threads is the number of CPU cores in your system multiplied
by two. 3.3 is only available as an alpha release though.

One way to solve this problem is to mount your NFS in "soft" mode, quoting
the manual page:

    soft / hard    Determines the recovery behavior of the NFS client
                   after an NFS request times out. If neither option is
                   specified (or if the hard option is specified), NFS
                   requests are retried indefinitely. If the soft option
                   is specified, then the NFS client fails an NFS request
                   after retrans retransmissions have been sent, causing
                   the NFS client to return an error to the calling
                   application.

                   NB: A so-called "soft" timeout can cause silent data
                   corruption in certain cases. As such, use the soft
                   option only when client responsiveness is more
                   important than data integrity. Using NFS over TCP or
                   increasing the value of the retrans option may
                   mitigate some of the risks of using the soft option.

But it's better to read through all the NFS mount options and set their
values accordingly.

In summary, this is less a syslog-ng problem than an NFS side effect. Even
in "soft" mode, a certain amount of time is spent trying to recover, so
syslog-ng stops in the same way, just not indefinitely.

The only way to work around this limitation is to continuously monitor the
NFS mounts from a script, and if a mount doesn't react within a given
timeframe (let's say 1 second), tell syslog-ng not to write to those files
(by changing the configuration file and SIGHUPing the process).

Hope this helps.
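
For illustration, the monitoring workaround described above could be
sketched roughly as follows in Python. This is only a sketch under stated
assumptions, not something shipped with syslog-ng: the function names, the
pidfile path and the 1 second timeout are all hypothetical, and the step
that rewrites the configuration file is left out because it depends on how
your destinations are laid out. The probe runs in a child process, since a
stat on a stalled hard mount can block in the kernel and would otherwise
hang the monitor itself.

```python
import os
import signal
import subprocess
import sys
import time


def mount_responds(path, timeout=1.0):
    """Return True if a statvfs() of `path` completes within `timeout` seconds."""
    # Run the probe in a separate process so that a stalled mount blocks
    # the child, not this monitor.
    child = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.statvfs(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        rc = child.poll()
        if rc is not None:
            return rc == 0  # non-zero exit: path missing or I/O error
        time.sleep(0.05)
    # Best effort only. Deliberately no wait() here: a child stuck in an
    # uninterruptible NFS call may not exit until the server comes back.
    child.kill()
    return False


def stalled_mounts(paths, timeout=1.0):
    """Return the subset of `paths` that failed to respond in time."""
    return [p for p in paths if not mount_responds(p, timeout)]


def reload_syslog_ng(pidfile="/var/run/syslog-ng.pid"):
    """Send SIGHUP to syslog-ng. The pidfile location is an assumption."""
    with open(pidfile) as f:
        pid = int(f.read().strip())
    os.kill(pid, signal.SIGHUP)
```

A wrapper would call stalled_mounts() in a loop, and whenever the result
changes, install a configuration without (or again with) the affected file
destinations and then call reload_syslog_ng(). Note that the probe does
not call wait() after kill() on purpose: waiting for a process stuck on a
hard mount can itself block.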