linux/net
Xin Long 88956177db tipc: call tipc_lxc_xmit without holding node_read_lock
When sending packets between nodes in netns, it calls tipc_lxc_xmit() for
peer node to receive the packets where tipc_sk_mcast_rcv()/tipc_sk_rcv()
might be called, and it's pretty much like in tipc_rcv().

Currently the local 'node rw lock' is held during calling tipc_lxc_xmit()
to protect the peer_net not being freed by another thread. However, when
receiving these packets, tipc_node_add_conn() might be called where the
peer 'node rw lock' is acquired. Then a dead lock warning is triggered by
lockdep detector, although it is not a real dead lock:

    WARNING: possible recursive locking detected
    --------------------------------------------
    conn_server/1086 is trying to acquire lock:
    ffff8880065cb020 (&n->lock#2){++--}-{2:2}, \
                     at: tipc_node_add_conn.cold.76+0xaa/0x211 [tipc]

    but task is already holding lock:
    ffff8880065cd020 (&n->lock#2){++--}-{2:2}, \
                     at: tipc_node_xmit+0x285/0xb30 [tipc]

    other info that might help us debug this:
     Possible unsafe locking scenario:

           CPU0
           ----
      lock(&n->lock#2);
      lock(&n->lock#2);

     *** DEADLOCK ***

     May be due to missing lock nesting notation

    4 locks held by conn_server/1086:
     #0: ffff8880036d1e40 (sk_lock-AF_TIPC){+.+.}-{0:0}, \
                          at: tipc_accept+0x9c0/0x10b0 [tipc]
     #1: ffff8880036d5f80 (sk_lock-AF_TIPC/1){+.+.}-{0:0}, \
                          at: tipc_accept+0x363/0x10b0 [tipc]
     #2: ffff8880065cd020 (&n->lock#2){++--}-{2:2}, \
                          at: tipc_node_xmit+0x285/0xb30 [tipc]
     #3: ffff888012e13370 (slock-AF_TIPC){+...}-{2:2}, \
                          at: tipc_sk_rcv+0x2da/0x1b40 [tipc]

    Call Trace:
     <TASK>
     dump_stack_lvl+0x44/0x5b
     __lock_acquire.cold.77+0x1f2/0x3d7
     lock_acquire+0x1d2/0x610
     _raw_write_lock_bh+0x38/0x80
     tipc_node_add_conn.cold.76+0xaa/0x211 [tipc]
     tipc_sk_finish_conn+0x21e/0x640 [tipc]
     tipc_sk_filter_rcv+0x147b/0x3030 [tipc]
     tipc_sk_rcv+0xbb4/0x1b40 [tipc]
     tipc_lxc_xmit+0x225/0x26b [tipc]
     tipc_node_xmit.cold.82+0x4a/0x102 [tipc]
     __tipc_sendstream+0x879/0xff0 [tipc]
     tipc_accept+0x966/0x10b0 [tipc]
     do_accept+0x37d/0x590

This patch avoids this warning by not holding the 'node rw lock' before
calling tipc_lxc_xmit(). As to protect the 'peer_net', rcu_read_lock()
should be enough, as in cleanup_net() when freeing the netns, it calls
synchronize_rcu() before the free is continued.

Also since tipc_lxc_xmit() is like the RX path in tipc_rcv(), it makes
sense to call it under rcu_read_lock(). Note that the right lock order
must be:

   rcu_read_lock();
   tipc_node_read_lock(n);
   tipc_node_read_unlock(n);
   tipc_lxc_xmit();
   rcu_read_unlock();

instead of:

   tipc_node_read_lock(n);
   rcu_read_lock();
   tipc_node_read_unlock(n);
   tipc_lxc_xmit();
   rcu_read_unlock();

and we have to call tipc_node_read_lock/unlock() twice in
tipc_node_xmit().

Fixes: f73b12812a ("tipc: improve throughput between nodes in netns")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/5bdd1f8fee9db695cfff4528a48c9b9d0523fb00.1670110641.git.lucien.xin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-07 11:32:04 +01:00
..
6lowpan
9p Including fixes from bpf, can and wifi. 2022-11-29 09:52:10 -08:00
802 treewide: use prandom_u32_max() when possible, part 1 2022-10-11 17:42:55 -06:00
8021q net: gro: skb_gro_header helper function 2022-08-25 10:33:21 +02:00
appletalk
atm net/atm: fix proc_mpc_write incorrect return value 2022-10-15 11:08:36 +01:00
ax25 ax25: move from strlcpy with unused retval to strscpy 2022-08-22 17:55:50 -07:00
batman-adv Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-09-22 13:02:10 -07:00
bluetooth Bluetooth: Fix crash when replugging CSR fake controllers 2022-12-02 13:22:56 -08:00
bpf bpf, test_run: Fix alignment problem in bpf_prog_test_run_skb() 2022-11-04 16:22:34 +01:00
bpfilter
bridge bridge: switchdev: Fix memory leaks when changing VLAN protocol 2022-11-15 13:38:11 +01:00
caif net: caif: fix double disconnect client in chnl_net_open() 2022-11-14 10:51:13 +00:00
can can: j1939: j1939_send_one(): fix missing CAN header initialization 2022-11-07 14:00:27 +01:00
ceph Random number generator fixes for Linux 6.1-rc1. 2022-10-16 15:27:07 -07:00
core Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec 2022-11-23 19:18:59 -08:00
dcb
dccp dccp/tcp: Fixup bhash2 bucket when connect() fails. 2022-11-22 20:15:37 -08:00
dns_resolver
dsa net: dsa: sja1105: Check return value 2022-12-02 20:46:52 -08:00
ethernet net: gro: skb_gro_header helper function 2022-08-25 10:33:21 +02:00
ethtool ethtool: eeprom: fix null-deref on genl_info in dump 2022-10-24 19:08:07 -07:00
hsr net: hsr: Fix potential use-after-free 2022-11-28 18:09:00 -08:00
ieee802154 net: ieee802154: fix error return code in dgram_bind() 2022-10-07 09:29:17 +02:00
ife
ipv4 ipv4: Fix incorrect route flushing when table ID 0 is used 2022-12-06 20:34:43 -08:00
ipv6 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec 2022-11-23 19:18:59 -08:00
iucv
kcm kcm: close race conditions on sk_receive_queue 2022-11-15 12:42:26 +01:00
key xfrm: Fix oops in __xfrm_state_delete() 2022-11-22 07:14:55 +01:00
l2tp l2tp: Don't sleep and disable BH under writer-side sk_callback_lock 2022-11-23 12:45:19 +00:00
l3mdev
lapb
llc
mac80211 wifi: mac8021: fix possible oob access in ieee80211_get_rate_duration 2022-11-25 12:45:53 +01:00
mac802154 mac802154: Fix LQI recording 2022-10-24 11:07:39 +02:00
mctp mctp: Fix an error handling path in mctp_init() 2022-11-09 19:26:08 -08:00
mpls net: Use u64_stats_fetch_begin_irq() for stats fetch. 2022-08-29 13:02:27 +01:00
mptcp mptcp: fix sleep in atomic at close time 2022-11-28 18:03:07 -08:00
ncsi genetlink: start to validate reserved header bytes 2022-08-29 12:47:15 +01:00
netfilter netfilter: ctnetlink: fix compilation warning after data race fixes in ct mark 2022-11-30 13:08:49 +01:00
netlabel genetlink: start to validate reserved header bytes 2022-08-29 12:47:15 +01:00
netlink genetlink: limit the use of validation workarounds to old ops 2022-10-27 08:20:21 -07:00
netrom
nfc NFC: nci: Bounds check struct nfc_target arrays 2022-12-05 17:46:25 -08:00
nsh
openvswitch netfilter: conntrack: Fix data-races around ct mark 2022-11-18 15:21:00 +01:00
packet packet: do not set TP_STATUS_CSUM_VALID on CHECKSUM_COMPLETE 2022-11-29 08:30:18 -08:00
phonet
psample genetlink: start to validate reserved header bytes 2022-08-29 12:47:15 +01:00
qrtr net: qrtr: start MHI channel after endpoit creation 2022-08-15 11:21:42 +01:00
rds treewide: use get_random_{u8,u16}() when possible, part 2 2022-10-11 17:42:58 -06:00
rfkill
rose rose: Fix NULL pointer dereference in rose_send_frame() 2022-11-02 11:57:30 +00:00
rxrpc rxrpc: Fix race between conn bundle lookup and bundle removal [ZDI-CAN-15975] 2022-11-18 12:05:44 +00:00
sched net: sched: allow act_ct to be built without NF_NAT 2022-11-22 12:16:55 +01:00
sctp sctp: fix memory leak in sctp_stream_outq_migrate() 2022-11-29 08:30:50 -08:00
smc net/smc: Fix possible leaked pernet namespace in smc_init() 2022-11-02 20:42:09 -07:00
strparser
sunrpc SUNRPC: Fix crasher in gss_unwrap_resp_integ() 2022-10-27 15:52:10 -04:00
switchdev
tipc tipc: call tipc_lxc_xmit without holding node_read_lock 2022-12-07 11:32:04 +01:00
tls net/tls: Fix memory leak in tls_enc_skb() and tls_sw_fallback_init() 2022-11-11 20:08:17 -08:00
unix af_unix: Get user_ns from in_skb in unix_diag_get_exact(). 2022-12-01 10:32:20 +01:00
vmw_vsock vsock: fix possible infinite sleep in vsock_connectible_wait_data() 2022-11-03 10:49:29 +01:00
wireless wifi: cfg80211: don't allow multi-BSSID in S1G 2022-11-25 12:43:14 +01:00
x25 net/x25: Fix skb leak in x25_lapb_receive_frame() 2022-11-15 20:22:19 -08:00
xdp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-10-03 17:44:18 -07:00
xfrm Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec 2022-11-23 19:18:59 -08:00
compat.c net: clear msg_get_inq in __get_compat_msghdr() 2022-09-20 08:23:20 -07:00
devres.c
Kconfig Remove DECnet support from kernel 2022-08-22 14:26:30 +01:00
Kconfig.debug net: make NET_(DEV|NS)_REFCNT_TRACKER depend on NET 2022-09-20 14:23:56 -07:00
Makefile Remove DECnet support from kernel 2022-08-22 14:26:30 +01:00
socket.c d_path pile 2022-10-06 16:55:41 -07:00
sysctl_net.c