[zorp] Zorp can't handle load

Balazs Scheidler zorp@lists.balabit.hu
Tue, 25 May 2004 13:26:11 +0200

On Tue, 2004-05-25 at 10:59, Sheldon Hearn wrote:
> On Fri, 2004-05-21 at 10:58, Balazs Scheidler wrote:
> > The solution is to split your single Zorp instances to smaller instances
> > working on the same set of connections. This can be achieved by running
> > for example 16 instances of HTTP listening on different ports. (for
> > example 50080 - 50095) then use 16 packet filter rules to distribute the
> > load between processes based on source port for example. 
> This works very well, thank you.
> When I push a single instance to its maximum threads limit, I soon get
> the following:
> (zorp_default_http_00@default@balrog/nosession): Too many running
> threads, waiting for one to become free; num_threads='1000',
> max_threads='1000'
> zorp_default_http_00[12741]: (Log thread):
> zorp_default_http_00[12741]: (Log thread): GLib-ERROR **: Cannot create
> pipe main loop wake-up: Too many open files
> Is this one of the serious problems you warned me about with Glib, for
> which you have a patched version of GLib as part of your Debian
> packages?  Do you have the patches available?

There's a --threads command line argument for Zorp; it will not create
more than that number of threads. It also looks like you are hitting the
maximum number of open file descriptors ("Too many open files"), so you
might need to increase your resource limits.
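A quick sketch of checking and raising the descriptor limit before starting Zorp (the value 4096 is only an example; the hard limit itself must be raised as root, e.g. via /etc/security/limits.conf):

```shell
# Show the current per-process fd limit; with 1000 proxy threads, each
# needing sockets plus internal pipes, the common default of 1024 is
# easily exhausted.
ulimit -n

# Raise the soft limit for this shell and its children, up to the hard
# limit. Run this in the script that launches Zorp.
ulimit -n 4096 2>/dev/null || echo "hard limit too low, raise it as root"
```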

The thread limit of about 1024 per process is inherent to libc 2.2.5;
later libcs (at least 2.3.2) with NPTL support, running on kernel 2.6
(or 2.4 with the backported O(1) scheduler/futex patches), can go beyond
this limit.
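You can check which thread library your glibc is using; on a glibc built with NPTL this reports it directly:

```shell
# Report the thread implementation glibc was compiled against.
# "NPTL x.y" means the old LinuxThreads per-process thread ceiling
# no longer applies; "linuxthreads" means it does.
getconf GNU_LIBPTHREAD_VERSION
```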

I think the better solution is to use as many instances for the same
traffic as you need (separating the load using packet filter rules).
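The source-port fan-out described above can be sketched as a rule generator. This splits the ephemeral source-port range into 16 contiguous bands, one per instance; the band boundaries and the use of REDIRECT are assumptions for illustration, so adapt them to your actual ruleset. The rules are printed rather than applied, so they can be reviewed first:

```shell
# Generate 16 packet-filter rules distributing HTTP clients across 16
# Zorp instances on ports 50080-50095, keyed on the client source port.
base=50080
lo=1024
step=4032            # (65536 - 1024) / 16 source ports per band
i=0
while [ $i -lt 16 ]; do
    hi=$((lo + step - 1))
    echo "iptables -t nat -A PREROUTING -p tcp --dport 80" \
         "--sport ${lo}:${hi} -j REDIRECT --to-ports $((base + i))"
    lo=$((hi + 1))
    i=$((i + 1))
done
```

Pipe the output through `sh` once you are satisfied the bands cover the whole ephemeral range.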

> > Do you have more system or userspace CPU time? (vmstat will tell you that)
> It's about equally split, with system outweighing userspace 4:3 as total
> utilization approaches 50% (about the highest I get with the limited
> load I've been able to produce in testing).

Hmm, that system CPU share is quite high; it should not be more than
20-30%. How many interrupts/context switches do you have?
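On Linux, the counters behind vmstat's "in" and "cs" columns can be read directly (vmstat derives its per-second figures by sampling these twice and differencing):

```shell
# Print the cumulative interrupt and context-switch counters since boot;
# vmstat's "in" and "cs" columns are per-interval deltas of these.
awk '/^intr / { print "interrupts:", $2 }
     /^ctxt / { print "context switches:", $2 }' /proc/stat
```

Running `vmstat 1` alongside the load test gives the per-second rates to compare against.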

PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1