On Mon, 2008-07-07 at 15:57 +0200, thomas.wenz@gmx-topmail.de wrote:
Hi,
Thanks for taking you so much time to help me!
You are welcome. I like doing my best for people who see the potential behind Zorp.
Your idea with self.rerequest_attempts setting to 1 is fantastic! I tried it and it moves the stacking so much up like I want it! Now I can do exactly what I described with POST!
The problem left is that stacking is still not done on GET-requests. Somehow, your suggested change didn't do anything because it's an OR-clause and if any of these checks fails it enters the section and ends. If I put a comment around this whole if-section, it's a little bit better: the program gets executed but a) it's then not executed at the same time like with POST and self.rerequest_attempts=1 so it seems like the alternative path with blobs can only be used with POST. I tried changing the has_data value in the http_fetch_request function but it didn't help (probably there's more needed). b) it's then executed but still no data (= no headers) is passed so it's quite useless. There must be another check somewhere later in the program which prevents sending the headers when there's no data part.
There's a patch below, which makes stacking work with "GET" and HTTK_STK_MIME flavour of stacking. At least my limited testing shows it works. It was more complicated than I thought, primarily because our data transfer mechanism assumed that the client or the server are the ones who drive the the transmission, and in your special case it is the proxy that does so, with the intent of writing the HTTP request to the downstream proxy. All the rest is just moving code here and there to make it possible to see that it is indeed a MIME stacking.
I just noticed another minor problem: A connection takes quite long if I set forge_port to Z_PORT_EXACT or Z_PORT_GROUP. Between the "Initiating connection" and "Established connection" it takes more than 3 seconds. If I set "self.transparent_mode = TRUE" this gets really slow because then the connection seems to be not reused, if it's set to false it goes really fast (only the first request and if there was no request for some time take the 3 seconds). Why is this so? Does it wait for the other connection to somehow time out before it can use the same port? With Z_PORT_ANY it works flawlessly...
GROUP should work without problems, Z_PORT_EXACT will almost always fail, except if there is only one connection at a time. The problem is that the client-side and the server side tuples _must_ be different. For instance in connection tracking there's no way to differentiate between the two connections if the complete tuples are equal. Earlier, with the NAT based tproxies (tproxy2 and 3) this collision was a real problem (conntrack removed entries based on a timer and had no way of knowing that a given tuple was already released). Now with tproxy4, the local stack should have a chance to differentiate between the incoming and the outgoing connections, even if their tuples match completely: the local endpoint of the socket is different. client side connection: client IP:port -> server IP:port (local) server side connection: client IP:port (local) -> server IP:port But still, our 2.6.17-4.x kernels include conntrack for other purposes, so the conntrack collision still applies. However Z_PORT_GROUP should work, all it does is that it tries to allocate a port in the same port range in userspace, which means that it performs a loop on the permitted set of ports and tries to bind to each of them. I don't really see how that could take a long time. Some investigation seems to be necessary here. Probably a tcpdump could help (is the SYN sent out right at the "initiating" log message, or later, etc)
And a more cosmetical issue I found during the investigation of the connection problem: When I attach an strace to a process with a stacked program I get a huge filte (18MB) with thousands of "close(229284) = -1 EBADF (Bad file descriptor)" At one point this process then calls the stacked proxy and seems to normally continue. Is this normal? I've attached the log of this process below.
This is bug. Closing unopened file descriptors in a multi-threaded program causes unpredictable behaviour. Can you send me the complete strace? Here's the patch: M modules/http/httpfltr.c M libproxy/transfer2.c M libproxy/zorp/proxy/transfer2.h * modified files --- orig/libproxy/transfer2.c +++ mod/libproxy/transfer2.c @@ -278,7 +278,7 @@ z_transfer2_update_cond(ZTransfer2 *self { if (!z_transfer2_get_status(self, ZT2S_EOF_SOURCE)) { - if (z_transfer2_buffer_empty(&self->buffers[0])) + if (z_transfer2_buffer_empty(&self->buffers[0]) && !z_transfer2_get_status(self, ZT2S_PROXY_OUT)) z_stream_set_cond(z_transfer2_get_stream(self, ZT2E_SOURCE), G_IO_IN, TRUE); else z_stream_set_cond(z_transfer2_get_stream(self, ZT2E_DOWN_SOURCE), G_IO_OUT, TRUE); @@ -296,7 +296,7 @@ z_transfer2_update_cond(ZTransfer2 *self /* no stacking */ if (!z_transfer2_get_status(self, ZT2S_EOF_SOURCE)) { - if (z_transfer2_buffer_empty(&self->buffers[0]) || z_transfer2_get_status(self, ZT2S_EOF_DEST) != 0) + if ((z_transfer2_buffer_empty(&self->buffers[0]) || z_transfer2_get_status(self, ZT2S_EOF_DEST) != 0) && !z_transfer2_get_status(self, ZT2S_PROXY_OUT)) z_stream_set_cond(z_transfer2_get_stream(self, ZT2E_SOURCE), G_IO_IN, TRUE); else z_stream_set_cond(z_transfer2_get_stream(self, ZT2E_DEST), G_IO_OUT, TRUE); @@ -876,6 +876,7 @@ z_transfer2_run_method(ZTransfer2 *self) { z_enter(); z_transfer2_switch_to_transfer_context(self); + z_transfer2_update_cond(self); z_transfer2_update_status(self, ZT2S_STARTED, TRUE); z_transfer2_update_status(self, ZT2S_SUSPENDED, FALSE); --- orig/libproxy/zorp/proxy/transfer2.h +++ mod/libproxy/zorp/proxy/transfer2.h @@ -36,6 +36,9 @@ typedef enum #define ZT2S_SOFT_EOF_SOURCE 0x0400 #define ZT2S_SOFT_EOF_DEST 0x0800 +/* if this flag is set, the proxy has something to say, so regardless of the buffer contents, poll the output side */ +#define ZT2S_PROXY_OUT 0x1000 + #define ZT2S_EOF_BITS (ZT2S_EOF_SOURCE | ZT2S_EOF_DEST | ZT2S_SOFT_EOF_SOURCE | ZT2S_SOFT_EOF_DEST) #define ZT2E_STACKED 0x02 @@ -158,6 +161,15 @@ z_transfer2_set_content_format(ZTransfer self->content_format = content_format; } +static inline void +z_transfer2_set_proxy_out(ZTransfer2 *self, gboolean enable) +{ + if (enable) + self->status |= ZT2S_PROXY_OUT; + else + self->status &= ~ZT2S_PROXY_OUT; +} + static inline const gchar * z_transfer2_get_stack_info(ZTransfer2 *self) { --- orig/modules/http/httpfltr.c +++ mod/modules/http/httpfltr.c @@ -154,7 +154,10 @@ http_transfer_src_read(ZTransfer2 *s, ZS *bytes_read = move; if (self->stacked_preamble_ofs == self->stacked_preamble->len) - self->src_read_state = HTTP_SR_READ_INITIAL; + { + z_transfer2_set_proxy_out(s, FALSE); + self->src_read_state = HTTP_SR_READ_INITIAL; + } return G_IO_STATUS_NORMAL; } else @@ -727,22 +730,12 @@ http_transfer_stack_proxy(ZTransfer2 *s, /* query python for a stacked proxy */ - if (self->transfer_type != HTTP_TRANSFER_NORMAL && self->transfer_type != HTTP_TRANSFER_TO_BLOB) + if (self->suppress_data || (self->transfer_type != HTTP_TRANSFER_NORMAL && self->transfer_type != HTTP_TRANSFER_TO_BLOB)) { *stacked = NULL; return TRUE; } - /* we don't stack anything, if - 1) we suppress data - 2) data is not indicated by either the presence of some header fields nor do we expect data - */ - if (self->suppress_data || - !(self->expect_data || http_lookup_header(headers, "Transfer-Encoding", &hdr) || http_lookup_header(headers, "Content-Length", &hdr))) - { - *stacked = NULL; - return TRUE; - } z_policy_lock(self->super.owner->thread); @@ -773,17 +766,21 @@ http_transfer_stack_proxy(ZTransfer2 *s, } success = TRUE; - switch (stack_type) + if (stack_type == HTTP_STK_MIME) + self->push_mime_headers = TRUE; + + /* we don't stack anything, if + 1) we suppress data + 2) data is not indicated by either the presence of some header fields nor do we expect data + */ + if (!(self->expect_data || self->push_mime_headers || http_lookup_header(headers, "Transfer-Encoding", &hdr) || http_lookup_header(headers, "Content-Length", &hdr))) { - case HTTP_STK_NONE: - break; - case HTTP_STK_MIME: - self->push_mime_headers = TRUE; - /* fallthrough */ - case HTTP_STK_DATA: - success = z_proxy_stack_object(s->owner, stack_object, stacked); - break; + *stacked = NULL; + goto unref_unlock; } + + if (stack_type != HTTP_STK_NONE) + success = z_proxy_stack_object(s->owner, stack_object, stacked); unref_unlock: z_policy_var_unref(proxy_stack_tuple); z_policy_unlock(self->super.owner->thread); @@ -851,6 +848,7 @@ http_transfer_setup(ZTransfer2 *s) { self->format_preamble_func((HttpProxy *) self->super.owner, TRUE, self->stacked_preamble); g_string_append(self->stacked_preamble, "\r\n"); + z_transfer2_set_proxy_out(s, TRUE); } self->src_chunked = FALSE; @@ -1028,7 +1026,7 @@ http_transfer_run(ZTransfer2 *s) HttpTransfer *self = Z_CAST(s, HttpTransfer); GError *local_error = NULL; - if (self->content_length != HTTP_LENGTH_NONE && self->content_length != 0) + if ((self->content_length != HTTP_LENGTH_NONE || self->push_mime_headers) && self->content_length != 0) { return Z_SUPER(self, ZTransfer2)->run(&self->super); } -- Bazsi