Bad Interaction between NAT and TPROXY
Hello!

We're seeing a problem when we combine NAT and TPROXY 1.2.1 on a firewall system with application-layer proxies in the 2.4.26 kernel. Say the firewall system has two interfaces, public (202.20.5.1) and private (10.0.0.1). There is an SMTP server (10.0.0.2) on the private network, and we have DNAT rules configured to map

202.20.5.1 ---> 10.0.0.2

So then, in practice, the SMTP proxy on the firewall receives a redirected SMTP connection from a client (90.1.1.1) on the net. It then does its application stuff, and sets up a connection to the server by:

(1) Binding a socket to a local address
(2) Setting up a TPROXY assignment from the local address to 90.1.1.1
(3) Connecting out to the server using the destination address recovered by getsockopt(SO_ORIGINAL_DST) on the original incoming socket

This all works fine when there is no NAT (and the private addresses are eliminated). In that case, what the server sees is an incoming connection from 90.1.1.1. But with NAT in the picture, what the server sees is an incoming connection from 10.0.0.1.

Working through the kernel code, it appears that when CONFIG_IP_NF_NAT_LOCAL is enabled, there is a call to do_extra_mangle() in ip_nat_core.c that, in the LOCAL_OUT hook, forces the source address of outgoing packets to be the preferred source from the routing tables. By the time the packet gets to POSTROUTING, it no longer matches the TPROXY rule that we set up, so the source address goes out as-is.

Looking for a general solution, all I can think of at the moment is to patch ip_nat_core.c so that the call to do_extra_mangle() is skipped if the connection has been assigned by TPROXY, on the theory that the code setting up the TPROXY assignment knows what it wants the final source or destination to be, and that this should take priority over the extra mangling done by NAT.

Perhaps testing for IPS_TPROXY before the call to do_extra_mangle()?
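For reference, step (3) in the list above relies on the SO_ORIGINAL_DST socket option, which returns a struct sockaddr_in holding the pre-redirect destination. The following is only a minimal sketch in Python of how that buffer could be decoded. The helper names parse_sockaddr_in and original_dst are hypothetical, and the constant 80 is the value from <linux/netfilter_ipv4.h>, written out because Python's socket module does not export it.

```python
import socket
import struct

# SO_ORIGINAL_DST is 80 in <linux/netfilter_ipv4.h>; Python's socket
# module does not export it, so it is spelled out here.
SO_ORIGINAL_DST = 80
SOCKADDR_IN_LEN = 16  # sizeof(struct sockaddr_in)


def parse_sockaddr_in(raw):
    """Decode a struct sockaddr_in into (dotted-quad address, port)."""
    # Bytes 0-1: sin_family (host byte order, not needed here).
    # Bytes 2-3: sin_port, bytes 4-7: sin_addr, both network byte order.
    # Bytes 8-15: padding.
    port, packed_addr = struct.unpack_from("!H4s", raw, 2)
    return socket.inet_ntoa(packed_addr), port


def original_dst(conn):
    """Ask Netfilter where a redirected TCP connection was originally headed.

    Hypothetical helper: `conn` is the accepted (incoming) socket.
    Linux only, and only meaningful for connections that were redirected.
    """
    raw = conn.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, SOCKADDR_IN_LEN)
    return parse_sockaddr_in(raw)
```

In the scenario described here, original_dst() on the client's connection would be expected to yield the public address 202.20.5.1 and the SMTP port, which is the address the proxy then connects to.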
If that would work (the code is still a jungle to me), it seems like this would solve the problem and still be relatively harmless, but I'm not sure. Actually I'm not sure why the NAT code rewrites the source address of a DNAT packet anyway (or the destination address of an SNAT packet), so it makes me nervous.

What we do now as a workaround in simple networks is manually force the proxy to bind the local socket to 10.0.0.1 before the TPROXY assignment. It's ugly, but it works, since it anticipates how the kernel will mangle the packets. However it's obviously not a good general solution, and would fail in complex networks where routing to the destination is dynamic.

What's the best way to deal with this?

Thanks!

Tim
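To make the failure and the workaround concrete, here is a toy model of the packet path described in this message. It is not kernel code. The function names are hypothetical, the proxy's default bind address (202.20.5.1) is an assumption since the message only says "a local address", and the LOCAL_OUT step is simplified to "if DNAT changed the destination, replace the source with the preferred source".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


def local_out_mangle(pkt, preferred_src, dnat):
    """Toy LOCAL_OUT: apply DNAT and, if the destination changed,
    force the source to the routing table's preferred source."""
    new_dst = dnat.get(pkt.dst, pkt.dst)
    if new_dst != pkt.dst:
        return Packet(src=preferred_src, dst=new_dst)
    return pkt


def postrouting_tproxy(pkt, assignments):
    """Toy POSTROUTING: rewrite the source only if a TPROXY assignment
    is keyed on the packet's current source address."""
    foreign = assignments.get(pkt.src)
    return Packet(src=foreign, dst=pkt.dst) if foreign else pkt


def source_seen_by_server(bind_addr, with_nat):
    """Source address the SMTP server sees for the proxy's connection."""
    dnat = {"202.20.5.1": "10.0.0.2"} if with_nat else {}
    assignments = {bind_addr: "90.1.1.1"}          # local -> client address
    pkt = Packet(src=bind_addr, dst="202.20.5.1")  # connect to original dst
    pkt = local_out_mangle(pkt, preferred_src="10.0.0.1", dnat=dnat)
    return postrouting_tproxy(pkt, assignments).src


print(source_seen_by_server("202.20.5.1", with_nat=False))  # 90.1.1.1
print(source_seen_by_server("202.20.5.1", with_nat=True))   # 10.0.0.1
print(source_seen_by_server("10.0.0.1", with_nat=True))     # 90.1.1.1
```

The third call corresponds to the workaround: binding to 10.0.0.1 makes the forced source rewrite a no-op, so the assignment still matches. It also shows why the workaround is fragile, because the proxy has to guess the preferred source in advance.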
Hi,
Thanks for the detailed problem report... This seems to be quite a tough problem, and a very curious interaction between different parts of Netfilter. At the moment I do not have any better ideas than checking IPS_TPROXY, but I hope that the problem can be investigated in more detail at the Netfilter workshop.

Please try to use your proposed workaround until then; we'll try to come up with a solution acceptable to everyone.

--
Regards,
Krisztian KOVACS
Hi Krisztian,

--- KOVACS Krisztian <hidden@balabit.hu> wrote:
> Thanks for the detailed problem report... This seems to be quite a tough problem, and a very curious interaction between different parts of Netfilter. At the moment I do not have any better ideas than checking IPS_TPROXY, but I hope that the problem can be investigated in more detail at the Netfilter workshop.

Just FYI, we did try checking IPS_TPROXY at the point where do_extra_mangle() is called, but it seems that, at that point, the flag has not yet been set in the conntrack record, so it didn't work out.

Thanks for your reply and anything you can come up with at the workshop!

Tim
Hi Krisztian,

I was just wondering if any sort of resolution has appeared for this interaction between TPROXY and the various components of Netfilter? It wasn't clear from the Netfilter summary how much discussion might have gone on.

Thanks!

Tim
Hi,

On Wed, 2004-09-29 at 07:01, Tim Burress wrote:
> I was just wondering if any sort of resolution has appeared for this interaction between TPROXY and the various components of Netfilter? It wasn't clear from the Netfilter summary how much discussion might have gone on.

Oh, sorry, I completely forgot to reply to this mail after the workshop.

So, it looks like the problem is that Netfilter does an implicit SNAT on LOCAL_OUT if you use DNAT rules and a specific DNAT rule would cause the packet to go out on a different interface than the one it was originally destined for. I'm not sure that this is necessary at all, and it looks like we've been able to convince Rusty that it should probably be removed. Along with other NAT-related Netfilter patches, that removal is waiting for Rusty to submit it.

For now, you could remove this routing lookup and check from the NAT code, and see what happens.

--
Regards,
Krisztian KOVACS
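The condition described in this reply can be illustrated with a small routing model. This is only a sketch of the behaviour as explained in the thread, not the actual Netfilter logic. The routing table, the interface names (lo, eth0, eth1) and the implicit_snat switch are all assumptions made for illustration.

```python
import ipaddress

# Hypothetical routing table for the firewall in this thread:
# (prefix, outgoing interface, preferred source address).
ROUTES = [
    ("202.20.5.1/32", "lo", "202.20.5.1"),  # firewall's own public address
    ("10.0.0.0/24", "eth1", "10.0.0.1"),    # private network
    ("0.0.0.0/0", "eth0", "202.20.5.1"),    # default route, public side
]


def lookup(dst):
    """Longest-prefix match; returns (interface, preferred source)."""
    addr = ipaddress.ip_address(dst)
    matches = [
        (ipaddress.ip_network(prefix), dev, src)
        for prefix, dev, src in ROUTES
        if addr in ipaddress.ip_network(prefix)
    ]
    _, dev, src = max(matches, key=lambda m: m[0].prefixlen)
    return dev, src


def local_out_source(src, orig_dst, dnat_dst, implicit_snat=True):
    """Source address leaving LOCAL_OUT under the behaviour described above.

    If DNAT moves the packet to a different outgoing interface, the source
    is replaced by that route's preferred source (the implicit SNAT).
    """
    old_dev, _ = lookup(orig_dst)
    new_dev, preferred = lookup(dnat_dst)
    if implicit_snat and new_dev != old_dev:
        return preferred
    return src


# Proxy bound to 202.20.5.1, connecting to 202.20.5.1, DNAT to 10.0.0.2:
print(local_out_source("202.20.5.1", "202.20.5.1", "10.0.0.2"))         # 10.0.0.1
print(local_out_source("202.20.5.1", "202.20.5.1", "10.0.0.2", False))  # 202.20.5.1
```

Setting implicit_snat to False stands in for removing the routing lookup and check, as suggested above: the source then reaches POSTROUTING unchanged, where a TPROXY assignment keyed on it could still match. The model says nothing about what else in a real setup might depend on the implicit SNAT.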
participants (2)
- KOVACS Krisztian
- Tim Burress