[syslog-ng] On FreeBSd: syslog-ng: I/O error occurred while writing; fd='xx', error='No buffer space available (yy)'

Traiano Welcome Traiano.Welcome at mtnbusiness.co.za
Mon Apr 2 16:48:32 CEST 2012


Hi Balasz


On 31/03/2012 21:16, "Balazs Scheidler" <bazsi at balabit.hu> wrote:

>On Thu, 2012-03-22 at 07:49 +0000, Traiano Welcome wrote:
>> Anybody ?
>> 
>> 
>> On 20/03/2012 12:53, "Traiano Welcome"
>><Traiano.Welcome at mtnbusiness.co.za>
>> wrote:
>> 
>> >Hi List
>> >
>> >I've been seeing the following in the messages log of my freebsd syslog
>> >server for quite some time now:
>> >
>> >---
>> >Mar 20 12:19:12 syslog2 syslog-ng[35313]: I/O error occurred while
>> >writing; fd='12', error='No buffer space available (55)'
>> >Mar 20 12:19:12 syslog2 syslog-ng[35313]: Connection broken;
>> >time_reopen='60'
>> >Mar 20 12:19:12 syslog2 syslog-ng[35313]: I/O error occurred while
>> >writing; fd='13', error='No buffer space available (55)'
>> >Mar 20 12:19:12 syslog2 syslog-ng[35313]: Connection broken;
>> >time_reopen='60'
>> >---
>> >
>> >These happen at a frequency of about 7 per minute on average. See
>>attached
>> >trend graphs for an idea of the volume of traffic we're doing, as well
>>as
>> >the memory and cpu utilisation trends on this server during this
>>period.
>> >As can be seen from the graphs, load does not seem to be the issue.
>> >Occasionally during the week, the system freezes and requires a
>>reboot, I
>> >think it's related to the above message, though I'm not sure.
>> >
>> >My question is: What does this error mean, and how can I resolve it?
>> >
>> >I have tried to frame this as an operating system kernel resource
>>issue,
>> >and experimented with increasing the freebsd kernel sysctls for UDP
>> >performance:
>> >
>> >---
>> >[root at syslog2 <mailto:root at syslog2.ops.mtnbusiness.co.za> /var/log]#
>> >sysctl kern.ipc.nmbclusters=102400
>> >kern.ipc.nmbclusters: 25600 -> 102400
>> >
>> >[root at syslog2 <mailto:root at syslog2.ops.mtnbusiness.co.za> /var/log]#
>> >sysctl kern.ipc.maxsockbuf=201326592
>> >kern.ipc.maxsockbuf: 100663296 -> 201326592
>> >
>> >[root at syslog2 <mailto:root at syslog2.ops.mtnbusiness.co.za> /var/log]#
>> >sysctl net.inet.udp.recvspace=33554432
>> >net.inet.udp.recvspace: 16777216 -> 33554432
>> >---
>> >
>> >This has reduced the frequency of the errors a little, but in general
>>the
>> >problem still remains.
>> >
>> >Syslog version:
>> >
>> >--
>> > [root at syslog2]# syslog-ng -V
>> > syslog-ng 2.0.10
>
>I think I know the issue. poll() indicates that it is possible to write
>messages to the udp destination, and then send() fails with ENOBUFS.
>
>Last time I've checked the only possible way to handle ENOBUFS in this
>case is either to ignore it (after all udp is lossy), or retry sending
>after sleeping some. A similar fix has gone into the "loggen" tool,
>which generates loads of messages and triggered ENOBUFS similarly.
>
>In loggen, a sleep was added, in syslog-ng, that's less of an option as
>that might block other things from being done.
>
>The easiest solution is to return EAGAIN in this case, which causes
>syslog-ng to retry the send() operation in the next poll iteration,
>which causes a busy loop until more buffers become available.
>
>The patch below in 3.3  makes syslog-ng ignore ENOBUFS. I don't maintain
>2.0 anymore, but hey, it was trivial, so here you go:
>
>diff --git a/src/fdwrite.c b/src/fdwrite.c
>index 3957b7d..f76bea7 100644
>--- a/src/fdwrite.c
>+++ b/src/fdwrite.c
>@@ -52,6 +52,9 @@ fd_do_write(FDWrite *self, const void *buf, size_t
>buflen)
>         fsync(self->fd);
>     }
>   while (rc == -1 && errno == EINTR);
>+  
>+  if (rc < 0 && errno == ENOBUFS)
>+    return buflen;
>   return rc;
> }
> 
>And the commit in 3.3:
>
>commit 4d6656bbbc19877e7dc2175a7fb3b9aeeaced270
>Author: Balazs Scheidler <bazsi at balabit.hu>
>Date:   Sat Mar 31 21:14:28 2012 +0200
>
>    logtransport: handle ENOBUFS on output
>    
>    FreeBSD returns ENOBUFS on send() failure instead of indicating
>    this conditions via poll().  The return of ENOBUFS actually is a send
>    error and is calulated in IP statistics, so the best is to handle it
>as
>    a success.  The only alternative would be to return EAGAIN, which
>could
>    cause syslog-ng to spin as long as buffer space is unavailable.  Since
>    I'm not sure how much time that would take and I think spinning the
>CPU
>    is not a good idea in general, I just drop the packet in this case.
>    UDP is lossy anyway
>    
>    Reported-By: Traiano Welcome <Traiano.Welcome at mtnbusiness.co.za>
>    Signed-off-by: Balazs Scheidler <bazsi at balabit.hu>
>



Thanks very much! I'm currently testing and will feedback after watching
for some time!

Traiano




>
>
>-- 
>Bazsi
>
>
>__________________________________________________________________________
>____
>Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
>Documentation: 
>http://www.balabit.com/support/documentation/?product=syslog-ng
>FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>



More information about the syslog-ng mailing list