linux/net
D. Wythe 5c15b3123f net/smc: Prevent smc_release() from long blocking
In the nginx/wrk benchmark, the client hangs with high probability
in the following case (the client takes several minutes to exit):

server: smc_run nginx

client: smc_run wrk -c 10000 -t 1 http://server

Client hangs with the following backtrace:

0 [ffffa7ce80f3bbf8] __schedule at ffffffff9f9e0d5f
1 [ffffa7ce80f3bc88] schedule at ffffffff9f9e10e6
2 [ffffa7ce80f3bca0] schedule_timeout at ffffffff9f9e3f3c
3 [ffffa7ce80f3bd20] wait_for_common at ffffffff9f9e19de
4 [ffffa7ce80f3bd80] __flush_work at ffffffff9f0fe013
5 [ffffa7ce80f3bdf0] smc_release at ffffffffc0697d24 [smc]
6 [ffffa7ce80f3be20] __sock_release at ffffffff9f802e2d
7 [ffffa7ce80f3be40] sock_close at ffffffff9f802eb1
8 [ffffa7ce80f3be48] __fput at ffffffff9f334f93
9 [ffffa7ce80f3be78] task_work_run at ffffffff9f101ff5
10 [ffffa7ce80f3bea0] do_exit at ffffffff9f0e5012
11 [ffffa7ce80f3bf10] do_group_exit at ffffffff9f0e592a
12 [ffffa7ce80f3bf38] __x64_sys_exit_group at ffffffff9f0e5994
13 [ffffa7ce80f3bf40] do_syscall_64 at ffffffff9f9d4373
14 [ffffa7ce80f3bf50] entry_SYSCALL_64_after_hwframe at ffffffff9fa0007c

This issue is due to flush_work(), which smc_release() uses to wait
for smc_connect_work() to finish. When many smc_connect_work() items
are pending, or every executing work item is stuck, smc_release() has
to block until a worker becomes free, which is equivalent to waiting
for another smc_connect_work() to finish.

To fix this, there are two changes:

1. For an idle smc_connect_work(), cancel it from the workqueue; for
   an executing smc_connect_work(), wait for it to finish. For that
   purpose, replace flush_work() with cancel_work_sync().

2. Since smc_connect() holds a reference for passive closing, release
   that reference if smc_connect_work() has been cancelled.
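
The two changes can be illustrated with a small userspace model. This is only a sketch, not the kernel patch: every `model_*` name, the `work_state` enum, and the plain integer refcount are hypothetical stand-ins for the real workqueue and socket APIs. It also assumes, as a simplification, that a work item which does run takes over responsibility for the reference that the connect path took. The point it shows is that a cancel-style call reports whether it removed a still-pending item, and the release path drops the extra reference only in that case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's types. */
enum work_state { WORK_IDLE, WORK_PENDING, WORK_DONE };

struct model_sock {
	int refcnt;                   /* simplified reference counter */
	enum work_state connect_work; /* state of the queued connect work */
};

static void model_sock_hold(struct model_sock *sk)
{
	sk->refcnt++;
}

static void model_sock_put(struct model_sock *sk)
{
	assert(sk->refcnt > 0);
	sk->refcnt--;
}

/* Models the nonblocking connect path: take a reference, queue the work. */
static void model_connect(struct model_sock *sk)
{
	model_sock_hold(sk);
	sk->connect_work = WORK_PENDING;
}

/*
 * Models a worker executing the queued item. Assumption of this sketch:
 * a work item that runs consumes the reference taken in model_connect().
 */
static void model_run_connect_work(struct model_sock *sk)
{
	if (sk->connect_work != WORK_PENDING)
		return;
	sk->connect_work = WORK_DONE;
	model_sock_put(sk);
}

/*
 * Models the cancel semantics relied on by the fix: returns true only
 * if the item was still pending and has now been removed from the queue.
 */
static bool model_cancel_work_sync(struct model_sock *sk)
{
	if (sk->connect_work == WORK_PENDING) {
		sk->connect_work = WORK_IDLE;
		return true;
	}
	return false;
}

/* Models the release path after the fix. */
static void model_release(struct model_sock *sk)
{
	/* The work never ran, so nobody else will drop the reference. */
	if (model_cancel_work_sync(sk))
		model_sock_put(sk);
}
```

In this model, releasing a socket whose connect work is still pending returns the refcount to its value before `model_connect()`, and releasing one whose work already ran leaves the refcount untouched, so the reference is dropped exactly once on either path.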

Fixes: 24ac3a08e6 ("net/smc: rebuild nonblocking connect")
Reported-by: Tony Lu <tonylu@linux.alibaba.com>
Tested-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Link: https://lore.kernel.org/r/1639571361-101128-1-git-send-email-alibuda@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-16 08:11:05 -08:00
6lowpan
9p 9p: fix a bunch of checkpatch warnings 2021-11-04 21:04:25 +09:00
802 llc/snap: constify dev_addr passing 2021-10-13 09:40:46 -07:00
8021q net: vlan: fix underflow for the real_dev refcnt 2021-11-26 11:20:46 -08:00
appletalk
atm net: atm: use address setting helpers 2021-10-24 13:59:45 +01:00
ax25 ax25: constify dev_addr passing 2021-10-13 09:40:45 -07:00
batman-adv Merge branch 'exit-cleanups-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2021-11-10 16:15:54 -08:00
bluetooth bluetooth: use dev_addr_set() 2021-10-25 11:01:29 -07:00
bpf bpf: Add dummy BPF STRUCT_OPS for test purpose 2021-11-01 14:10:00 -07:00
bpfilter
bridge Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-11-01 20:05:14 -07:00
caif net: caif: get ready for const netdev->dev_addr 2021-10-24 13:59:45 +01:00
can can: j1939: j1939_tp_cmd_recv(): check the dst address of TP.CM_BAM 2021-11-06 17:29:32 +01:00
ceph libceph, ceph: move ceph_osdc_copy_from() into cephfs code 2021-11-08 03:29:52 +01:00
core net: Fix double 0x prefix print in SKB dump 2021-12-16 11:08:15 +00:00
dcb
dccp tcp: switch orphan_count to bare per-cpu counters 2021-10-15 11:28:34 +01:00
decnet net: Remove redundant if statements 2021-08-05 13:27:50 +01:00
dns_resolver
dsa net: dsa: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge 2021-11-03 14:22:00 +00:00
ethernet eth: platform: add a helper for loading netdev->dev_addr 2021-10-08 14:54:33 +01:00
ethtool ethtool: do not perform operations on net devices being unregistered 2021-12-06 16:53:32 -08:00
hsr net: hsr: Add support for redbox supervision frames 2021-10-26 14:52:17 +01:00
ieee802154 mac802154: use dev_addr_set() - manual 2021-10-20 14:27:40 +01:00
ife
ipv4 inet_diag: fix kernel-infoleak for UDP sockets 2021-12-10 21:14:49 -08:00
ipv6 seg6: fix the iif in the IPv6 socket control block 2021-12-09 07:55:42 -08:00
iucv net/iucv: Replace deprecated CPU-hotplug functions. 2021-08-09 10:13:32 +01:00
kcm
key
l2tp net/l2tp: Fix reference count leak in l2tp_udp_recv_core 2021-09-09 11:00:20 +01:00
l3mdev
lapb
llc llc/snap: constify dev_addr passing 2021-10-13 09:40:46 -07:00
mac80211 mac80211: do drv_reconfig_complete() before restarting all 2021-12-14 11:22:20 +01:00
mac802154 mac802154: use dev_addr_set() - manual 2021-10-20 14:27:40 +01:00
mctp mctp: Don't let RTM_DELROUTE delete local routes 2021-12-02 12:15:25 +00:00
mpls net: mpls: Remove rcu protection from nh_dev 2021-11-29 12:39:42 +00:00
mptcp mptcp: fix deadlock in __mptcp_push_pending() 2021-12-14 18:49:40 -08:00
ncsi net/ncsi : Add payload to be 32-bit aligned to fix dropped packets 2021-11-24 11:53:17 +00:00
netfilter netfilter: conntrack: annotate data-races around ct->timeout 2021-12-08 01:29:15 +01:00
netlabel net: fix NULL pointer reference in cipso_v4_doi_free 2021-08-30 12:23:18 +01:00
netlink net: netlink: af_netlink: Prevent empty skb by adding a check on len. 2021-11-30 17:45:01 -08:00
netrom ax25: constify dev_addr passing 2021-10-13 09:40:45 -07:00
nfc nfc: fix potential NULL pointer deref in nfc_genl_dump_ses_done 2021-12-09 07:50:32 -08:00
nsh
openvswitch include/linux/mm.h: move nr_free_buffer_pages from swap.h to mm.h 2021-11-06 13:30:43 -07:00
packet net/packet: rx_owner_map depends on pg_vec 2021-12-15 17:49:36 -08:00
phonet phonet: refcount leak in pep_sock_accep 2021-12-10 19:53:52 -08:00
psample
qrtr net: qrtr: combine nameservice into main module 2021-09-28 17:36:43 -07:00
rds rds: memory leak in __rds_conn_create() 2021-12-14 12:51:52 +00:00
rfkill
rose rose: constify dev_addr passing 2021-10-13 09:40:45 -07:00
rxrpc rxrpc: Fix rxrpc_local leak in rxrpc_lookup_peer() 2021-11-29 15:40:02 +00:00
sched flow_offload: return EOPNOTSUPP for the unsupported mpls action type 2021-12-14 12:33:19 +00:00
sctp net,lsm,selinux: revert the security_sctp_assoc_established() hook 2021-11-14 12:21:53 +00:00
smc net/smc: Prevent smc_release() from long blocking 2021-12-16 08:11:05 -08:00
strparser bpf: sockmap, strparser, and tls are reusing qdisc_skb_cb and colliding 2021-11-09 01:05:28 +01:00
sunrpc NFS client bugfixes for Linux 5.16 2021-11-27 10:33:55 -08:00
switchdev net: switchdev: merge switchdev_handle_fdb_{add,del}_to_device 2021-10-27 14:54:02 +01:00
tipc tipc: check for null after calling kmemdup 2021-11-17 20:04:52 -08:00
tls net/tls: Fix authentication failure in CCM mode 2021-11-29 12:48:28 +00:00
unix af_unix: fix regression in read after shutdown 2021-11-20 15:10:30 +00:00
vmw_vsock vsock/virtio: suppress used length validation 2021-11-22 14:49:03 +00:00
wireless cfg80211: Acquire wiphy mutex on regulatory work 2021-12-14 11:20:11 +01:00
x25
xdp xsk: Fix crash on double free in buffer pool 2021-11-12 15:55:27 +01:00
xfrm Core: 2021-11-02 06:20:58 -07:00
compat.c
devres.c
Kconfig net/core: disable NET_RX_BUSY_POLL on PREEMPT_RT 2021-10-01 15:45:10 -07:00
Makefile
socket.c Core: 2021-08-31 16:43:06 -07:00
sysctl_net.c sections: move and rename core_kernel_data() to is_kernel_core_data() 2021-11-09 10:02:50 -08:00