TPROXY 2.0.5 with NAT reservations deadlocks on 2.4.33.3
Hi, I'm running TPROXY 2.0.5 compiled with the NAT reservations patch on 2.4.33.3.
From time to time the machine freezes. I was able to capture the following stack trace:
Stack traceback for pid 0
0x403b2000 0 0 1 0 R 0x403b2370 *swapper
EBP        EIP        Function (args)
0x403b3bd8 0x40107e23 __write_lock_failed+0xf (0x60ad9300, 0x140, 0x58e55214, 0x58b74a68, 0x58b74a68)
                      kernel .text 0x40100000 0x40107e14 0x40107e34
           0x608c0e03 [iptable_nat].text.lock.ip_nat_core+0x6d
                      iptable_nat .text 0x608be060 0x608c0d96 0x608c0eb0
           0x608bf5d1 [iptable_nat]ip_nat_reserved_unregister_all+0x31 (0x58e55214)
                      iptable_nat .text 0x608be060 0x608bf5a0 0x608bf670
0x403b3be4 0x608bf681 [iptable_nat]ip_nat_reserved_cleanup_expect+0x11 (0x58e55214, 0x58e55214)
                      iptable_nat .text 0x608be060 0x608bf670 0x608bf690
0x403b3bf4 0x608a7bb9 [ip_conntrack]destroy_expect+0x29 (0x58e55214, 0x58e55214)
                      ip_conntrack .text 0x608a7060 0x608a7b90 0x608a7bc0
0x403b3c04 0x608a7dc8 [ip_conntrack]__unexpect_related+0x68 (0x58e55214, 0x58e55214)
                      ip_conntrack .text 0x608a7060 0x608a7d60 0x608a7dd0
0x403b3c14 0x608a7e00 [ip_conntrack]unexpect_related+0x30 (0x58e55214, 0x58b74a10, 0x403b3d60, 0x0)
                      ip_conntrack .text 0x608a7060 0x608a7dd0 0x608a7e10
0x403b3c2c 0x608a7e7e [ip_conntrack]remove_expectations+0x6e (0x58b74a10, 0x1, 0x58b74a10)
                      ip_conntrack .text 0x608a7060 0x608a7e10 0x608a7e80
0x403b3c40 0x608a7eea [ip_conntrack]clean_from_lists+0x6a (0x58b74a10, 0x58b74a34)
                      ip_conntrack .text 0x608a7060 0x608a7e80 0x608a7f00
0x403b3c50 0x608a81d5 [ip_conntrack]death_by_timeout+0x55 (0x58b74a10, 0x5a7805b4, 0x1d6002c, 0x1500, 0x1d6002d)
                      ip_conntrack .text 0x608a7060 0x608a8180 0x608a8230
0x403b3c74 0x608bf73a [iptable_nat]ip_nat_used_tuple+0xaa (0x403b3d60, 0x5a7805b4, 0x1, 0x1, 0x608c4a00)
                      iptable_nat .text 0x608be060 0x608bf690 0x608bf740
0x403b3cbc 0x608bfd5c [iptable_nat]get_unique_tuple+0xcc (0x403b3d60, 0x403b3d30, 0x403b3d9c, 0x5a7805b4, 0x0)
                      iptable_nat .text 0x608be060 0x608bfc90 0x608bfeb0
0x403b3d7c 0x608bff11 [iptable_nat]ip_nat_setup_info+0x61 (0x5a7805b4, 0x403b3d9c, 0x0, 0x0, 0x608c48a0)
                      iptable_nat .text 0x608be060 0x608bfeb0 0x608c0230
0x403b3dc4 0x608beacb [iptable_nat]ip_nat_rule_find+0xab (0x403b3ec8, 0x0, 0x60ad9300, 0x0, 0x5a7805b4)
                      iptable_nat .text 0x608be060 0x608bea20 0x608beae0
0x403b3df8 0x608be1a5 [iptable_nat]ip_nat_fn+0x145 (0x0, 0x403b3ec8, 0x60ad9300, 0x0, 0x4025a270)
                      iptable_nat .text 0x608be060 0x608be060 0x608be2f0
0x403b3e20 0x608be331 [iptable_nat]ip_nat_in+0x41 (0x0, 0x403b3ec8, 0x60ad9300, 0x0, 0x4025a270)
                      iptable_nat .text 0x608be060 0x608be2f0 0x608be370
0x403b3e48 0x4024c175 nf_iterate+0x65 (0x404e35c0, 0x403b3ec8, 0x0, 0x60ad9300, 0x0)
                      kernel .text 0x40100000 0x4024c110 0x4024c190
0x403b3e84 0x4024c552 nf_hook_slow+0xc2 (0x2, 0x0, 0x403b3ec8, 0x60ad9300, 0x0)
                      kernel .text 0x40100000 0x4024c490 0x4024c690
0x403b3ec0 0x4025a02c ip_rcv+0x45c (0x5ee7e4c4, 0x60ad9300, 0x403ab390, 0x60ad9300, 0x4040ee2c)
                      kernel .text 0x40100000 0x40259bd0 0x4025a0e0
0x403b3ee0 0x402429a8 netif_receive_skb+0x148 (0x5ee7e4c4, 0x1a5220, 0x4040ee20, 0x40, 0x4040ef10)
                      kernel .text 0x40100000 0x40242860 0x40242a80
0x403b3f04 0x40242b0a process_backlog+0x8a (0x4040ee4c, 0x403b3f20, 0x1a5220, 0x4040ee20, 0x0)
                      kernel .text 0x40100000 0x40242a80 0x40242bb0
0x403b3f30 0x40242c7a net_rx_action+0xca (0x4040e8f0, 0x46, 0x0, 0x403b3f44, 0x4040a980)
                      kernel .text 0x40100000 0x40242bb0 0x40242d30
0x403b3f54 0x40120339 do_softirq+0x109 (0xc, 0x403b3f84, 0x5eee0424, 0x180, 0x0)
                      kernel .text 0x40100000 0x40120230 0x40120370
0x403b3f7c 0x4010b396 do_IRQ+0x106 (0x40107210, 0x5b06e000, 0x403b2000, 0x403b2000, 0x403b2000)
                      kernel .text 0x40100000 0x4010b290 0x4010b3b0
0x403b3fb8 0x4010de68 call_do_IRQ+0x5 (0x1, 0xa0600, 0x40105000)
                      kernel .text 0x40100000 0x4010de63 0x4010de70
           0x401072d2 cpu_idle+0x52 (0x40105090, 0x0, 0x10e00)
                      kernel .text 0x40100000 0x40107280 0x401072f0
0x403b3fe0 0x40105044 stext+0x44 (0x402bd6c0, 0x40413440, 0x0, 0x40413340)
                      kernel .text 0x40100000 0x40105000 0x40105060
0x403b3ff8 0x403b49ab start_kernel+0x15b
                      kernel .text.init 0x403b4000 0x403b4850 0x403b4a10

Is this a known issue with TPROXY 2.0.5?
On Tue, 2006-12-26 at 14:44 +0200, Lior Dotan wrote:
> Hi,
> I'm running TPROXY 2.0.5 compiled with the NAT reservations patch on 2.4.33.3.
> From time to time the machine freezes. I was able to capture the following stack trace:
Hmm... You might be bitten by these:

* patches/netfilter_tproxy.patch: update to version 3.0.1, fixes two serious problems possibly leading to kernel crashes
* patches/netfilter_nat-delete.patch: fix SMP unsafeness

However, the changelog entries above are from our ZorpOS kernel changelog; I'm not sure that a proper tproxy release has been made yet. You can get the ZorpOS patch-tree from here:

http://www.balabit.com/downloads/kernel-patches/2.4.32/

--
Bazsi
On Dec 27 2006 22:25, Balazs Scheidler wrote:
> On Tue, 2006-12-26 at 14:44 +0200, Lior Dotan wrote:
>> Hi,
>> I'm running TPROXY 2.0.5 compiled with the NAT reservations patch on 2.4.33.3.
>> From time to time the machine freezes. I was able to capture the following stack trace:
>
> Hmm... You might be bitten by these:
> * patches/netfilter_tproxy.patch: update to version 3.0.1, fixes two serious problems possibly leading to kernel crashes
> * patches/netfilter_nat-delete.patch: fix SMP unsafeness
I think the second one was caught by the recent tproxy release 20061211/20061212.

	-`J'
--
Hi,

On Tuesday 26 December 2006 13:44, Lior Dotan wrote:
> I'm running TPROXY 2.0.5 compiled with the NAT reservations patch on 2.4.33.3.
> From time to time the machine freezes. I was able to capture the following stack trace:
> [...]
> Is this a known issue with TPROXY 2.0.5?
No, it isn't. I've looked into the problem and it seems to be a locking problem involving expectation removal code paths being called with the NAT lock held from ip_nat_used_tuple()...

Unfortunately I do not have a correct fix yet, but two possible workarounds are:

- not applying the nat_delete patch
- applying the attached patch on top of all those found in tproxy 2.0.5

--
Regards,
Krisztian Kovacs
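
Editor's note: the locking pattern described in this message can be illustrated with a small user-space model. This is only a sketch in Python, not the kernel code and not the fix. The function names are borrowed from the stack trace for readability, the bodies are invented stand-ins, and `threading.Lock` is assumed here as a rough analogue of a non-recursive kernel lock. A non-blocking acquire is used so the sketch reports the problem instead of hanging the way the kernel did.

```python
import threading

# Stand-in for the kernel's NAT lock (an assumption for illustration).
# threading.Lock is non-reentrant: the holder cannot take it a second time.
nat_lock = threading.Lock()


def reserved_unregister_all():
    """Model of the cleanup step at the top of the trace: it wants nat_lock."""
    # The kernel would spin here forever; a non-blocking attempt lets the
    # sketch report the condition instead of hanging.
    if not nat_lock.acquire(blocking=False):
        return "would deadlock"
    try:
        return "cleaned up"
    finally:
        nat_lock.release()


def death_by_timeout():
    """Model of conntrack teardown, which ends up in the NAT cleanup."""
    return reserved_unregister_all()


def used_tuple():
    """Model of the lookup that runs with nat_lock already held."""
    with nat_lock:
        return death_by_timeout()


print(used_tuple())        # teardown reached while the lock is held
print(death_by_timeout())  # same teardown, lock not held
```

The first call reaches the cleanup while the lock is already held by the same caller, which is the situation the trace shows ending in __write_lock_failed. The second call takes the same path without the lock held and completes.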
Hi,

On Saturday 30 December 2006 22:47, KOVACS Krisztian wrote:
>> Is this a known issue with TPROXY 2.0.5?
>
> No, it isn't. I've looked into the problem and it seems to be a locking problem involving expectation removal code paths being called with the NAT lock held from ip_nat_used_tuple()...
> Unfortunately I do not have a correct fix yet, but two possible workarounds are:
Ok, I've uploaded a new snapshot of tproxy which supposedly fixes this problem. Could you test whether it solves your deadlock problem?

http://people.balabit.hu/hidden/tproxy2-2.4.33_20070102.tar.bz2
http://people.balabit.hu/hidden/tproxy2-2.4.33_20070102.tar.bz2.asc

--
Regards,
Krisztian Kovacs
participants (4)
- Balazs Scheidler
- Jan Engelhardt
- KOVACS Krisztian
- Lior Dotan