[zorp] Zorp can't handle load

Balazs Scheidler zorp@lists.balabit.hu
Tue, 25 May 2004 14:09:32 +0200

On Tue, 2004-05-25 at 13:45, Sheldon Hearn wrote:
> On Tue, 2004-05-25 at 13:26, Balazs Scheidler wrote:
> > There's a --threads command line argument for Zorp. It will not create
> > more than that number of threads. As it seems you might also have more
> > than the maximum number of file descriptors. You might need to increase
> > your resource limits.
> I thought the resource limits were adjusted by zorpctl.  I edited the
> script to print them, and they looked fine.
> > The thread limit of 1024 per process is inherent to libc 2.2.5, later
> > libcs (at least 2.3.2) with NPTL support and kernel 2.6 (or 2.4 with
> > backported O1/futex patch) can run with more than this limit.
> Right.  I have --threads set to 1000.  And when I reach the maximum, I
> see a log message that says "Too many running threads, ...".  So that's
> fine.
> The thing I think is a problem is that it falls over even though it's
> limited in this way.  Why do I get these errors (which cause the entire
> instance to die) when I've limited the number of threads per instance to
> 1000?

Because the fd limit you set is too low for the number of threads you need.
zorpctl tries to guess the necessary fd limit for a given number of
threads, but you can override its default. For example:

intra_http -v3 -p /etc/zorp/policy.py -- --fd-limit 4096
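
To sanity-check a value like the one above, you can compare the thread
count against the process's RLIMIT_NOFILE. The sketch below is only an
illustration: the two-descriptors-per-thread figure and the headroom are
assumptions of mine, not the formula zorpctl actually uses, and the
function names are hypothetical.

```python
import resource

# Assumptions for illustration only (NOT zorpctl's real formula):
# each proxy thread holds one client-side and one server-side socket,
# and the process needs some spare descriptors for logs, pipes, listeners.
FDS_PER_THREAD = 2
HEADROOM = 64


def estimate_fd_need(threads, fds_per_thread=FDS_PER_THREAD, headroom=HEADROOM):
    """Rough number of descriptors needed for the given thread count."""
    return threads * fds_per_thread + headroom


def fd_limit_is_sufficient(threads):
    """Compare the estimate with the current soft RLIMIT_NOFILE."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return True
    return soft >= estimate_fd_need(threads)


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft fd limit:", soft, "hard fd limit:", hard)
    print("estimated need for 1000 threads:", estimate_fd_need(1000))
    print("sufficient:", fd_limit_is_sufficient(1000))
```

Under these assumptions 1000 threads need about 2064 descriptors, so a
soft limit of 1024 would be exhausted well before the thread limit is
reached.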

> (Log thread): GLib-ERROR **: Cannot create pipe main loop wake-up: Too
> many open files
> That's why I asked about your glib patches.

I see, but the GLib error is triggered by the fact that pipe() failed
because it could not allocate enough file descriptors. That's why I
think your fd limit is too low.
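
The failure mode is easy to reproduce outside Zorp and GLib. The sketch
below (hypothetical function name, arbitrary limit of 64) temporarily
lowers the soft fd limit of the current process and calls pipe() until
it fails; the errno it reports is EMFILE, "Too many open files", the
same condition the GLib message complains about.

```python
import errno
import os
import resource


def pipes_until_emfile(soft_limit=64):
    """Lower the soft fd limit, create pipes until os.pipe() fails,
    then restore the limit. Returns (pipes_created, errno_or_None)."""
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Only ever lower the soft limit; never try to raise it here.
    if old_soft == resource.RLIM_INFINITY:
        new_soft = soft_limit
    else:
        new_soft = min(soft_limit, old_soft)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))

    fds = []
    err = None
    try:
        while True:
            r, w = os.pipe()  # needs two free descriptors
            fds.extend((r, w))
    except OSError as e:
        err = e.errno
    finally:
        # Release everything and put the original limit back.
        for fd in fds:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))
    return len(fds) // 2, err


if __name__ == "__main__":
    count, err = pipes_until_emfile()
    print("pipes created before failure:", count)
    print("error:", errno.errorcode.get(err), "-", os.strerror(err))
```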

We have three glib patches on our binaries currently:

glib-abort-recursed.diff    -- moves a recursion check in g_log() to
                               make it possible to get a backtrace (which uses g_log())
                               when an assertion is triggered (SIGABRT).
                               This is not so important, but makes
                               debugging SIGABRT problems easier.

glib-memchunk-race.diff     -- this is a MUST-HAVE; it fixes a serious glib race condition.
                               It is said to be fixed in glib 2.4 (IIRC). Without
                               this patch Zorp will crash under load.

glib-eintr.diff             -- glib clobbers errno in g_strerror(), which results in
                               a) ugly error messages, b) incorrect behaviour.
                               This should be applied, though it is not absolutely necessary.

We did not touch the error message above, though it would be better to 
fail gracefully.

> So this GLib-ERROR seems quite serious to me.

See above.

PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1