[tproxy] tproxy can't work with ip_conntrack

KOVACS Krisztian hidden at sch.bme.hu
Wed Oct 22 08:56:00 CEST 2008


Hi,

On sze, okt 22, 2008 at 09:50:29 +0800, Dong Wei wrote:
> When I use a transparent proxy application using tproxy, I find there
> is something wrong when I load ip_conntrack module
> Here is a sample:
> network topology(router mode):
> Client[192.168.2.2]----[192.168.2.1]TPROXY Server[192.168.1.1]----[192.168.1.2]HTTP Server
> 
> 1. Client try to connect to HTTP Server, and TPROXY Server redirect
> the traffic to itself. Now the conntrack is:
> ORIG: 192.168.2.2:12345->192.168.1.2:80 REPLY:192.168.1.2:80->192.168.2.2:12345
> After 3 way handshake the connection between Client and TPROXY server
> should be in ESTABLISHED state
> 
> 2. TPROXY Server connect to HTTP Server using fake src ip and src
> port. Now the SYN packet should be
> 192.168.2.2:12345->192.168.1.2:80. But kernel find there is an old
> ip_conntrack entry for this SYN packet whose state is ESTABLISHED
> So, kernel will drop the SYN packet, and delete the old entry
> 
> 3. SYN packet will be retransmitted after 1 sec on TPROXY Server. And
> then kernel will allocate an new entry for this SYN packet.
> 
> 4. After 3 way handshake between TPROXY Server and HTTP Server, this
> ip_conntrack entry should be in ESTABLISHED state

This is a well-known conflict between transparent proxying and connection
tracking. The problem is, just as you point out below, that conntrack
identifies a connection solely by its endpoint addresses and ports, so
you cannot have two connections with exactly the same endpoints.

Unfortunately, with transparent proxying that is exactly what you want:
a fake connection whose endpoints completely match the original one.
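
To make the clash concrete, here is a toy model in Python. It is only a
sketch of the idea, not the kernel's data structure: the names Tuple4 and
ToyConntrack are made up for illustration, and the table is simply keyed
on the 4-tuple, so a second connection with identical endpoints cannot
get its own entry.

```python
from typing import Dict, NamedTuple


class Tuple4(NamedTuple):
    """Hypothetical stand-in for the 4 fields taken from a packet."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class ToyConntrack:
    """Toy table keyed only on the 4-tuple (illustration, not kernel code)."""

    def __init__(self) -> None:
        self.entries: Dict[Tuple4, str] = {}

    def add(self, key: Tuple4, label: str) -> bool:
        """Return False if an entry with the same endpoints already exists."""
        if key in self.entries:
            return False
        self.entries[key] = label
        return True


table = ToyConntrack()

# Client -> proxy leg, as seen by the router.
client_side = Tuple4("192.168.2.2", 12345, "192.168.1.2", 80)
# Proxy -> server leg that reuses the client's address AND port.
server_side_same_port = Tuple4("192.168.2.2", 12345, "192.168.1.2", 80)
# Proxy -> server leg that keeps the client's address but uses another port
# (40000 is an arbitrary example value).
server_side_new_port = Tuple4("192.168.2.2", 40000, "192.168.1.2", 80)

first = table.add(client_side, "client -> proxy")
clash = table.add(server_side_same_port, "proxy -> server")
ok = table.add(server_side_new_port, "proxy -> server")

print(first, clash, ok)
```

The second add() fails because its key is identical to the first one,
while the third succeeds as soon as the source port differs.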

> I think that we can add some new fields to identify the two HTTP
> connections in ip_conntrack structure(one is Client->TPROXY Server,
> and the other is TPROXY Server->HTTP Server).
> I hope we can get the right conntrack when we just know the tuple
> containing src_ip,src_port,dst_ip,dst_port. Because there is lots of
> kernel code like this:
> 
> 1. get the tuple 4 fields(src_ip,src_port,dst_ip,dst_port) from skb
> 2. call: nf_conntrack_find_get(tuple) to find the corresponding ip_conntrack.
> So in this case we can only know the 4 fields, we can't get any more
> message from skb.
> 
> Assuming there are 2 ip_conntrack:
> [1]: 192.168.2.2:12345->192.168.1.2:80(client->TPROXY Server)
> [2]: 192.168.2.2:12345->192.168.1.2:80(TPROXY Server->HTTP Server)
> When we process TCP packets between client and TPROXY Server, we find
> the conntrack for 192.168.2.2:12345->192.168.1.2:80, the result should
> be [1], and if the TCP packets belong to TPROXY Server and HTTP Server
> connection, the result should be [2]
> 
> Does anyone have good idea about the requirement mentioned above?

I think most users work around the problem: you usually don't really
need the source port to be preserved exactly and by choosing a different
source port the problem goes away.

(If you don't care about the source port then bind the socket to port 0
and the kernel will choose an unused port.)
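
A minimal sketch of that workaround in Python, under a few assumptions:
the helper name open_upstream_socket is hypothetical, the numeric
fallback 19 is the Linux value of IP_TRANSPARENT from <linux/in.h>, and
binding to a non-local address with IP_TRANSPARENT needs CAP_NET_ADMIN
plus the usual tproxy routing setup. The point is only that the bind
uses the client's address with port 0.

```python
import socket

# Linux value of IP_TRANSPARENT from <linux/in.h>; used as a fallback in case
# this Python build does not export the constant.
IP_TRANSPARENT = getattr(socket, "IP_TRANSPARENT", 19)


def open_upstream_socket(source_ip: str, transparent: bool = True) -> socket.socket:
    """Create a TCP socket bound to source_ip with a kernel-chosen port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if transparent:
            # Permits binding to an address that is not local to this host.
            # Requires CAP_NET_ADMIN on Linux.
            sock.setsockopt(socket.IPPROTO_IP, IP_TRANSPARENT, 1)
        # Port 0 asks the kernel to pick an unused source port, so the
        # proxy -> server tuple differs from the client -> proxy tuple.
        sock.bind((source_ip, 0))
    except OSError:
        sock.close()
        raise
    return sock


if __name__ == "__main__":
    # Unprivileged demo on loopback; a real proxy would pass the client's
    # address (e.g. "192.168.2.2") and leave transparent=True.
    demo = open_upstream_socket("127.0.0.1", transparent=False)
    print("kernel chose source port", demo.getsockname()[1])
    demo.close()
```

After the bind, the proxy would connect() to the HTTP server as usual;
the server still sees the client's IP address, only the port differs.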

-- 
KOVACS Krisztian

