Wenjia Zhang
9708efad9b
net/smc: fix deadlock triggered by cancel_delayed_work_sync()
...
[ Upstream commit 13085e1b5cab8ad802904d72e6a6dae85ae0cd20 ]
The following lockdep warning was detected:
Workqueue: events smc_lgr_free_work [smc]
WARNING: possible circular locking dependency detected
6.1.0-20221027.rc2.git8.56bc5b569087.300.fc36.s390x+debug #1 Not tainted
------------------------------------------------------
kworker/3:0/176251 is trying to acquire lock:
00000000f1467148 ((wq_completion)smc_tx_wq-00000000#2){+.+.}-{0:0},
at: __flush_workqueue+0x7a/0x4f0
but task is already holding lock:
0000037fffe97dc8 ((work_completion)(&(&lgr->free_work)->work)){+.+.}-{0:0},
at: process_one_work+0x232/0x730
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 ((work_completion)(&(&lgr->free_work)->work)){+.+.}-{0:0}:
__lock_acquire+0x58e/0xbd8
lock_acquire.part.0+0xe2/0x248
lock_acquire+0xac/0x1c8
__flush_work+0x76/0xf0
__cancel_work_timer+0x170/0x220
__smc_lgr_terminate.part.0+0x34/0x1c0 [smc]
smc_connect_rdma+0x15e/0x418 [smc]
__smc_connect+0x234/0x480 [smc]
smc_connect+0x1d6/0x230 [smc]
__sys_connect+0x90/0xc0
__do_sys_socketcall+0x186/0x370
__do_syscall+0x1da/0x208
system_call+0x82/0xb0
-> #3 (smc_client_lgr_pending){+.+.}-{3:3}:
__lock_acquire+0x58e/0xbd8
lock_acquire.part.0+0xe2/0x248
lock_acquire+0xac/0x1c8
__mutex_lock+0x96/0x8e8
mutex_lock_nested+0x32/0x40
smc_connect_rdma+0xa4/0x418 [smc]
__smc_connect+0x234/0x480 [smc]
smc_connect+0x1d6/0x230 [smc]
__sys_connect+0x90/0xc0
__do_sys_socketcall+0x186/0x370
__do_syscall+0x1da/0x208
system_call+0x82/0xb0
-> #2 (sk_lock-AF_SMC){+.+.}-{0:0}:
__lock_acquire+0x58e/0xbd8
lock_acquire.part.0+0xe2/0x248
lock_acquire+0xac/0x1c8
lock_sock_nested+0x46/0xa8
smc_tx_work+0x34/0x50 [smc]
process_one_work+0x30c/0x730
worker_thread+0x62/0x420
kthread+0x138/0x150
__ret_from_fork+0x3c/0x58
ret_from_fork+0xa/0x40
-> #1 ((work_completion)(&(&smc->conn.tx_work)->work)){+.+.}-{0:0}:
__lock_acquire+0x58e/0xbd8
lock_acquire.part.0+0xe2/0x248
lock_acquire+0xac/0x1c8
process_one_work+0x2bc/0x730
worker_thread+0x62/0x420
kthread+0x138/0x150
__ret_from_fork+0x3c/0x58
ret_from_fork+0xa/0x40
-> #0 ((wq_completion)smc_tx_wq-00000000#2){+.+.}-{0:0}:
check_prev_add+0xd8/0xe88
validate_chain+0x70c/0xb20
__lock_acquire+0x58e/0xbd8
lock_acquire.part.0+0xe2/0x248
lock_acquire+0xac/0x1c8
__flush_workqueue+0xaa/0x4f0
drain_workqueue+0xaa/0x158
destroy_workqueue+0x44/0x2d8
smc_lgr_free+0x9e/0xf8 [smc]
process_one_work+0x30c/0x730
worker_thread+0x62/0x420
kthread+0x138/0x150
__ret_from_fork+0x3c/0x58
ret_from_fork+0xa/0x40
other info that might help us debug this:
Chain exists of:
(wq_completion)smc_tx_wq-00000000#2
--> smc_client_lgr_pending
--> (work_completion)(&(&lgr->free_work)->work)
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock((work_completion)(&(&lgr->free_work)->work));
                               lock(smc_client_lgr_pending);
                               lock((work_completion)(&(&lgr->free_work)->work));
  lock((wq_completion)smc_tx_wq-00000000#2);
*** DEADLOCK ***
2 locks held by kworker/3:0/176251:
#0: 0000000080183548 ((wq_completion)events){+.+.}-{0:0},
    at: process_one_work+0x232/0x730
#1: 0000037fffe97dc8 ((work_completion)(&(&lgr->free_work)->work)){+.+.}-{0:0},
    at: process_one_work+0x232/0x730
stack backtrace:
CPU: 3 PID: 176251 Comm: kworker/3:0 Not tainted
Hardware name: IBM 8561 T01 701 (z/VM 7.2.0)
Call Trace:
[<000000002983c3e4>] dump_stack_lvl+0xac/0x100
[<0000000028b477ae>] check_noncircular+0x13e/0x160
[<0000000028b48808>] check_prev_add+0xd8/0xe88
[<0000000028b49cc4>] validate_chain+0x70c/0xb20
[<0000000028b4bd26>] __lock_acquire+0x58e/0xbd8
[<0000000028b4cf6a>] lock_acquire.part.0+0xe2/0x248
[<0000000028b4d17c>] lock_acquire+0xac/0x1c8
[<0000000028addaaa>] __flush_workqueue+0xaa/0x4f0
[<0000000028addf9a>] drain_workqueue+0xaa/0x158
[<0000000028ae303c>] destroy_workqueue+0x44/0x2d8
[<000003ff8029af26>] smc_lgr_free+0x9e/0xf8 [smc]
[<0000000028adf3d4>] process_one_work+0x30c/0x730
[<0000000028adf85a>] worker_thread+0x62/0x420
[<0000000028aeac50>] kthread+0x138/0x150
[<0000000028a63914>] __ret_from_fork+0x3c/0x58
[<00000000298503da>] ret_from_fork+0xa/0x40
INFO: lockdep is turned off.
===================================================================
This deadlock occurs because cancel_delayed_work_sync() waits for
the work (&lgr->free_work) to finish, while &lgr->free_work in turn
waits for the work on lgr->tx_wq, which needs sk_lock-AF_SMC, a lock
that is already held under the mutex_lock.
The fix is to use cancel_delayed_work() instead, which only cancels
a pending work item and does not wait for a running one to finish.
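
For illustration, a minimal sketch of the pattern described above.
This is not the literal upstream hunk; terminate_link_group() is a
placeholder for the SMC link group termination path, and only the
switch from cancel_delayed_work_sync() to cancel_delayed_work() is
what the patch actually changes:

	/* Sketch only, assuming a termination path that may run while
	 * sk_lock-AF_SMC / smc_client_lgr_pending are held by the caller.
	 */
	static void terminate_link_group(struct smc_link_group *lgr)
	{
		/* Before: waits for a running free_work to finish. free_work
		 * drains/destroys lgr->tx_wq, whose work takes sk_lock-AF_SMC,
		 * closing the circular dependency reported by lockdep.
		 */
		/* cancel_delayed_work_sync(&lgr->free_work); */

		/* After: only cancels a not-yet-running free_work and returns
		 * without waiting, so the wait cycle cannot form.
		 */
		cancel_delayed_work(&lgr->free_work);
	}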
Fixes: a52bcc919b14 ("net/smc: improve termination processing")
Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Jan Karcher <jaka@linux.ibm.com>
Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-22 13:29:58 +01:00