[syslog-ng]1.5.23 dies

Balazs Scheidler bazsi@balabit.hu
Tue, 5 Nov 2002 18:50:37 +0100


On Tue, Nov 05, 2002 at 07:25:54PM +0200, Jaakko Niemi wrote:
> Balazs Scheidler <bazsi@balabit.hu> writes:
> 
> > On Tue, Nov 05, 2002 at 07:02:20PM +0200, Jaakko Niemi wrote:
> >> Balazs Scheidler <bazsi@balabit.hu> writes:
> >> 
> >> > On Tue, Nov 05, 2002 at 06:44:36PM +0200, Jaakko Niemi wrote:
> >> >> 
> >> >>  Hi,
> >> >> 
> >> >> gc_mark: Marking object of class 'affile_dest_reaper' (1)
> >> >> io.c: sockaddr2info(): Unsupported address family 63828.
> >> >
> >> > this is caused by a kernel bug. the latest libol has workaround (0.3.5), or
> >> > install 2.4.20 (it was fixed in one of the preXX releases)
> >> 
> >>  Well, the machine is running 2.4.20-rc1, and I compiled with libol 0.3.5.
> >
> > hm.. please check what you get back upon return from recvfrom()
> >
> > the bug was that for SOCK_DGRAM unix domain sockets, recvfrom() returned a
> > sockaddr struct 2 bytes long, and the sa_family member was not filled.
> 
>  Hmm, I missed that from the changelog..
> 
> > try strace-ing the process, and see what the kernel returns.
> 
> poll([{fd=9, events=0}, {fd=4, events=POLLIN}, {fd=3, events=POLLIN, revents=POLLIN}], 3, 100) = 1
> recvfrom(3, "<38>Nov  5 17:53:01 PAM_unix[952"..., 2048, 0, {sin_family=0xf80c /* AF_??? */, {sa_family=63500, sa_data="\377\277\6\351\4\10\10\270\5\10\360@\5\10"}, [256]) = 82
> 
>  Only modification to the kernel is that HZ is defined as 1024, but I don't see
> how this can affect. 

I sent this patch to lkml, and they told me it was already fixed in
2.4.20-presomething. It seems it wasn't: they told me recvfrom() should be
returning 0 as the sockaddr length, but your strace shows that it is 256.

--- af_unix.c~	Mon Feb 25 20:38:16 2002
+++ af_unix.c	Fri Oct  4 09:46:26 2002
@@ -1392,6 +1392,9 @@
 		       sk->protinfo.af_unix.addr->name,
 		       sk->protinfo.af_unix.addr->len);
 	}
+	else {
+		((struct sockaddr *) msg->msg_name)->sa_family = AF_UNIX;
+	}
 }
 
 static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, int size,
--------

... checking 2.4.20-rc1, it does seem to be fixed, or at least the kernel
now returns a 0-byte address, as this patch shows:

@@ -1385,7 +1391,7 @@
 
 static void unix_copy_addr(struct msghdr *msg, struct sock *sk)
 {
-       msg->msg_namelen = sizeof(short);
+       msg->msg_namelen = 0;
        if (sk->protinfo.af_unix.addr) {
                msg->msg_namelen=sk->protinfo.af_unix.addr->len;
                memcpy(msg->msg_name,
--------

hmm... this fix is insufficient: sys_recvfrom() in socket.c copies the
resulting sockaddr back to userspace _iff_ the new length is non-zero, and
when it is zero the addrlen is not copied back either. This is the part:

        if(err >= 0 && addr != NULL && msg.msg_namelen)
        {
                err2=move_addr_to_user(address, msg.msg_namelen, addr, addr_len);
                if(err2<0)
                        err=err2;
        }

Since msg.msg_namelen is now set to 0, move_addr_to_user() is never called,
so userspace gets neither an address nor an updated length. Try this patch
(in addition to rc1):

--- socket.c.old	Tue Nov  5 18:48:22 2002
+++ socket.c	Tue Nov  5 18:49:34 2002
@@ -1262,7 +1262,7 @@
 		flags |= MSG_DONTWAIT;
 	err=sock_recvmsg(sock, &msg, size, flags);
 
-	if(err >= 0 && addr != NULL && msg.msg_namelen)
+	if(err >= 0 && addr != NULL)
 	{
 		err2=move_addr_to_user(address, msg.msg_namelen, addr, addr_len);
 		if(err2<0)
--------------
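
To make the effect of the one-line change easier to see, here is a small
userspace model of the copy-back step. It is only a sketch: the model_* names
and the 256-byte buffer are made up for illustration and are not kernel code.
model_move_addr_to_user() mirrors what move_addr_to_user() does (copy at most
the caller's length, then store the kernel's length), and the "patched" flag
selects between the old and the new condition.

```c
#include <assert.h>
#include <string.h>

#define MODEL_ADDR_MAX 256              /* hypothetical buffer size */

/* What the caller of recvfrom() passed in: address buffer and in/out length. */
struct model_user_args {
    unsigned char addr[MODEL_ADDR_MAX];
    int addr_len;
};

/* Stand-in for move_addr_to_user(): copy min(klen, *ulen) bytes,
 * then report the untruncated kernel length back through *ulen. */
void model_move_addr_to_user(const unsigned char *kaddr, int klen,
                             unsigned char *uaddr, int *ulen)
{
    int len = *ulen;

    assert(klen >= 0 && len >= 0 && len <= MODEL_ADDR_MAX);
    if (len > klen)
        len = klen;
    if (len > 0)
        memcpy(uaddr, kaddr, (size_t)len);
    *ulen = klen;
}

/* Stand-in for the tail of sys_recvfrom().
 * patched == 0: old condition, copy back only if namelen is non-zero.
 * patched != 0: new condition, always copy back (even a zero length). */
void model_copyback(int patched, const unsigned char *kaddr, int namelen,
                    struct model_user_args *u)
{
    if (patched || namelen != 0)
        model_move_addr_to_user(kaddr, namelen, u->addr, &u->addr_len);
}
```

With addr_len preset to 256 and namelen 0, model_copyback(0, ...) leaves
addr_len at 256 and the buffer untouched, which matches the strace above;
model_copyback(1, ...) sets addr_len to 0, so the caller can tell that no
address was returned.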

I'm sending this patch to davem.
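
Until a fixed kernel is running, a program can protect itself against all
three behaviours seen in this thread (2-byte address with sa_family unset,
length 0, or length and buffer left untouched). The sketch below is NOT the
libol 0.3.5 workaround, just one possible way to do it; both function names
are made up. The idea is to zero the address buffer before the call, so the
family reads as AF_UNSPEC unless the kernel really filled it in.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: can the address returned by recvfrom() on an
 * AF_UNIX datagram socket be trusted as a real (bound) sender address? */
int unix_peer_is_named(const struct sockaddr_storage *ss,
                       socklen_t returned_len, socklen_t buf_len)
{
    if (returned_len <= sizeof(sa_family_t))
        return 0;               /* length 0 or bare 2-byte address */
    if (returned_len > buf_len)
        return 0;               /* bogus or truncated length */
    if (ss->ss_family != AF_UNIX)
        return 0;               /* buffer was never filled in */
    return 1;
}

/* Hypothetical wrapper: recvfrom() that always leaves *peer in a sane state.
 * *named is set to 1 only when the sender had a bound address. */
ssize_t recvfrom_unix_checked(int fd, void *buf, size_t len,
                              struct sockaddr_un *peer, int *named)
{
    struct sockaddr_storage ss;
    socklen_t slen = sizeof(ss);
    ssize_t n;

    assert(peer != NULL && named != NULL);
    memset(&ss, 0, sizeof(ss)); /* ss_family == AF_UNSPEC (0) */

    n = recvfrom(fd, buf, len, 0, (struct sockaddr *) &ss, &slen);
    if (n < 0)
        return n;

    *named = unix_peer_is_named(&ss, slen, sizeof(ss));
    memset(peer, 0, sizeof(*peer));
    peer->sun_family = AF_UNIX;
    if (*named) {
        memcpy(peer, &ss, slen < sizeof(*peer) ? slen : sizeof(*peer));
        /* force termination; may cut the last byte of a maximum-length path */
        peer->sun_path[sizeof(peer->sun_path) - 1] = '\0';
    }
    return n;
}
```

A caller would use recvfrom_unix_checked() in place of recvfrom() and look at
sun_path only when named is non-zero.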

-- 
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1