Hi,

On Wed, Oct 22, 2008 at 09:50:29 +0800, Dong Wei wrote:
When I run a transparent proxy application using tproxy, something goes wrong once the ip_conntrack module is loaded. Here is a sample.

Network topology (router mode):

Client[192.168.2.2]--------[192.168.2.1]TPROXY Server[192.168.1.1]--------[192.168.1.2]HTTP Server

1. The client tries to connect to the HTTP server, and the TPROXY server redirects the traffic to itself. Now the conntrack entry is:
   ORIG:  192.168.2.2:12345->192.168.1.2:80
   REPLY: 192.168.1.2:80->192.168.2.2:12345
   After the 3-way handshake, the connection between the client and the TPROXY server should be in the ESTABLISHED state.
2. The TPROXY server connects to the HTTP server using the fake source IP and source port, so the SYN packet is 192.168.2.2:12345->192.168.1.2:80. But the kernel finds an old ip_conntrack entry for this SYN packet whose state is ESTABLISHED, so it drops the SYN packet and deletes the old entry.
3. The SYN packet is retransmitted after 1 second on the TPROXY server, and the kernel then allocates a new entry for it.
4. After the 3-way handshake between the TPROXY server and the HTTP server, this ip_conntrack entry should be in the ESTABLISHED state.

This is a well-known conflict between transparent proxying and connection tracking. The problem is, just as you point out below, that conntrack identifies a connection based solely on its endpoint addresses, so you cannot have two connections with exactly the same endpoints. Unfortunately, with transparent proxying that is exactly what you want: a fake connection that completely matches the original one.
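
To make the collision concrete, here is a toy model in Python. It is only a sketch: `Tuple4` and `ToyConntrack` are made-up names, and a real conntrack tuple carries more than these four fields (the protocol, for one). The point is just that a table keyed on the endpoints alone cannot hold both legs of the proxied connection.

```python
from typing import Dict, NamedTuple


class Tuple4(NamedTuple):
    """Simplified connection key: the four endpoint fields only."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class ToyConntrack:
    """Hypothetical table keyed on Tuple4; not the kernel's data structure."""

    def __init__(self) -> None:
        self.entries: Dict[Tuple4, str] = {}

    def track(self, tup: Tuple4, label: str) -> bool:
        """Return True if a new entry was created, False on a key collision."""
        if tup in self.entries:
            return False
        self.entries[tup] = label
        return True


# The client-side leg and the spoofed server-side leg have identical endpoints.
client_leg = Tuple4("192.168.2.2", 12345, "192.168.1.2", 80)
spoofed_leg = Tuple4("192.168.2.2", 12345, "192.168.1.2", 80)
# Same addresses, but a different (arbitrarily chosen) source port.
other_port_leg = Tuple4("192.168.2.2", 40000, "192.168.1.2", 80)

table = ToyConntrack()
first = table.track(client_leg, "client -> TPROXY server")
second = table.track(spoofed_leg, "TPROXY server -> HTTP server")
third = table.track(other_port_leg, "TPROXY server -> HTTP server")
```

The second call collides with the first because the keys are equal, while the third gets its own entry because only the source port differs.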

I think we could add some new fields to the ip_conntrack structure to distinguish the two HTTP connections (one is Client->TPROXY Server, the other is TPROXY Server->HTTP Server). I hope we can get the right conntrack entry when we only know the tuple containing src_ip, src_port, dst_ip, dst_port, because there is a lot of kernel code like this:

1. Get the 4 tuple fields (src_ip, src_port, dst_ip, dst_port) from the skb.
2. Call nf_conntrack_find_get(tuple) to find the corresponding ip_conntrack.

So in this case we only know the 4 fields; we can't get any more information from the skb.

Assume there are 2 ip_conntrack entries:
[1]: 192.168.2.2:12345->192.168.1.2:80 (Client->TPROXY Server)
[2]: 192.168.2.2:12345->192.168.1.2:80 (TPROXY Server->HTTP Server)
When we process TCP packets between the client and the TPROXY server and look up the conntrack for 192.168.2.2:12345->192.168.1.2:80, the result should be [1]; if the TCP packets belong to the connection between the TPROXY server and the HTTP server, the result should be [2].

Does anyone have a good idea about how to meet the requirement described above?

I think most users work around the problem: you usually don't really need the source port to be preserved exactly, and by choosing a different source port the problem goes away. (If you don't care about the source port, then bind the socket to port 0 and the kernel will choose an unused port.)

-- 
KOVACS Krisztian
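
A minimal Python sketch of the port 0 workaround follows. The helper name `socket_with_kernel_port` is made up for illustration, and the example binds to loopback so that it runs without privileges; a real transparent proxy would instead bind to the client's address, which on Linux requires setting the IP_TRANSPARENT socket option first and the appropriate capability.

```python
import socket


def socket_with_kernel_port(local_ip: str) -> socket.socket:
    """Bind a TCP socket to local_ip and let the kernel pick the source port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the kernel to choose an unused port for this address.
    sock.bind((local_ip, 0))
    return sock


# Loopback is an assumption made only so the sketch runs unprivileged.
sock = socket_with_kernel_port("127.0.0.1")
chosen_ip, chosen_port = sock.getsockname()
sock.close()
```

After the bind, `getsockname()` reports the port the kernel selected. The outgoing connection keeps the chosen source address but gets its own source port, so its tuple no longer matches the client's original connection.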