linux/net/xfrm
Varad Gautam d7b0408934 xfrm: policy: Read seqcount outside of rcu-read side in xfrm_policy_lookup_bytype
xfrm_policy_lookup_bytype loops on seqcount mutex xfrm_policy_hash_generation
within an RCU read side critical section. Although ill advised, this is fine if
the loop is bounded.

xfrm_policy_hash_generation wraps mutex hash_resize_mutex, which is used to
serialize writers (xfrm_hash_resize, xfrm_hash_rebuild). This is fine too.

On PREEMPT_RT=y, the read_seqcount_begin call within xfrm_policy_lookup_bytype
emits a mutex lock/unlock for hash_resize_mutex. Mutex locking is fine, since
RCU read side critical sections are allowed to sleep with PREEMPT_RT.

xfrm_hash_resize can, however, block on synchronize_rcu while holding
hash_resize_mutex.

This leads to the following situation on PREEMPT_RT, where the writer is
blocked on RCU grace period expiry, while the reader is blocked on a lock held
by the writer:

Thead 1 (xfrm_hash_resize)	Thread 2 (xfrm_policy_lookup_bytype)

				rcu_read_lock();
mutex_lock(&hash_resize_mutex);
				read_seqcount_begin(&xfrm_policy_hash_generation);
				mutex_lock(&hash_resize_mutex); // block
xfrm_bydst_resize();
synchronize_rcu(); // block
		<RCU stalls in xfrm_policy_lookup_bytype>

Move the read_seqcount_begin call outside of the RCU read side critical section,
and do an rcu_read_unlock/retry if we got stale data within the critical section.

On non-PREEMPT_RT, this shortens the time spent within RCU read side critical
section in case the seqcount needs a retry, and avoids unbounded looping.

Fixes: 77cc278f7b ("xfrm: policy: Use sequence counters with associated lock")
Signed-off-by: Varad Gautam <varad.gautam@suse.com>
Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org # v4.9
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: "Ahmed S. Darwish" <a.darwish@linutronix.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Ahmed S. Darwish <a.darwish@linutronix.de>
2021-06-01 07:48:46 +02:00
..
espintcp.c espintcp: restore IP CB before handing the packet to xfrm 2020-08-17 15:58:04 +02:00
Kconfig xfrm/compat: Add 32=>64-bit messages translator 2020-09-24 08:53:03 +02:00
Makefile xfrm: Provide API to register translator module 2020-09-24 08:53:03 +02:00
xfrm_algo.c crypto: skcipher - remove the "blkcipher" algorithm type 2019-11-01 13:38:32 +08:00
xfrm_compat.c xfrm/compat: Cleanup WARN()s that can be user-triggered 2021-03-30 07:29:09 +02:00
xfrm_device.c xfrm: Provide private skb extensions for segmented and hw offloaded ESP packets 2021-03-29 09:14:12 +02:00
xfrm_hash.c
xfrm_hash.h
xfrm_inout.h xfrm: move xfrm4_extract_header to common helper 2020-05-06 09:40:08 +02:00
xfrm_input.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec 2021-01-21 11:05:10 -08:00
xfrm_interface.c xfrm: interface: fix ipv4 pmtu check to honor ip header df 2021-02-23 18:23:58 +01:00
xfrm_ipcomp.c net: Use skb_frag_off accessors 2019-07-30 14:21:32 -07:00
xfrm_output.c xfrm: BEET mode doesn't support fragments for inner packets 2021-03-24 09:58:19 +01:00
xfrm_policy.c xfrm: policy: Read seqcount outside of rcu-read side in xfrm_policy_lookup_bytype 2021-06-01 07:48:46 +02:00
xfrm_proc.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
xfrm_replay.c xfrm: introduce oseq-may-wrap flag 2020-06-24 07:51:01 +02:00
xfrm_state.c xfrm: xfrm_state_mtu should return at least 1280 for ipv6 2021-04-19 12:43:50 +02:00
xfrm_sysctl.c
xfrm_user.c xfrm: Return the correct errno code 2021-02-04 09:29:27 +01:00