Jens Axboe 009ad9f0c6 io_uring: drop ctx->uring_lock before acquiring sqd->lock
The SQPOLL thread dictates the lock order, and we hold the ctx->uring_lock
for all the registration opcodes. We also hold a ref to the ctx, and we
already drop the lock for other reasons (e.g. to quiesce), so it's fine to
drop the ctx lock temporarily to grab the sqd->lock. This fixes the
following lockdep splat:

======================================================
WARNING: possible circular locking dependency detected
5.14.0-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor.5/25433 is trying to acquire lock:
ffff888023426870 (&sqd->lock){+.+.}-{3:3}, at: io_register_iowq_max_workers fs/io_uring.c:10551 [inline]
ffff888023426870 (&sqd->lock){+.+.}-{3:3}, at: __io_uring_register fs/io_uring.c:10757 [inline]
ffff888023426870 (&sqd->lock){+.+.}-{3:3}, at: __do_sys_io_uring_register+0x10aa/0x2e70 fs/io_uring.c:10792

but task is already holding lock:
ffff8880885b40a8 (&ctx->uring_lock){+.+.}-{3:3}, at: __do_sys_io_uring_register+0x2e1/0x2e70 fs/io_uring.c:10791

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&ctx->uring_lock){+.+.}-{3:3}:
       __mutex_lock_common kernel/locking/mutex.c:596 [inline]
       __mutex_lock+0x131/0x12f0 kernel/locking/mutex.c:729
       __io_sq_thread fs/io_uring.c:7291 [inline]
       io_sq_thread+0x65a/0x1370 fs/io_uring.c:7368
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

-> #0 (&sqd->lock){+.+.}-{3:3}:
       check_prev_add kernel/locking/lockdep.c:3051 [inline]
       check_prevs_add kernel/locking/lockdep.c:3174 [inline]
       validate_chain kernel/locking/lockdep.c:3789 [inline]
       __lock_acquire+0x2a07/0x54a0 kernel/locking/lockdep.c:5015
       lock_acquire kernel/locking/lockdep.c:5625 [inline]
       lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5590
       __mutex_lock_common kernel/locking/mutex.c:596 [inline]
       __mutex_lock+0x131/0x12f0 kernel/locking/mutex.c:729
       io_register_iowq_max_workers fs/io_uring.c:10551 [inline]
       __io_uring_register fs/io_uring.c:10757 [inline]
       __do_sys_io_uring_register+0x10aa/0x2e70 fs/io_uring.c:10792
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ctx->uring_lock);
                               lock(&sqd->lock);
                               lock(&ctx->uring_lock);
  lock(&sqd->lock);

 *** DEADLOCK ***
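
For illustration only, here is a minimal userspace sketch of the reordering
described above. It uses pthread mutexes and placeholder names rather than
the actual io_uring code; the point is just the lock-order pattern (drop the
already-held lock, then take both in sqd->lock -> ctx->uring_lock order):

    /*
     * Userspace analogue only: "uring_lock" and "sqd_lock" are stand-ins
     * for the kernel mutexes, not the real objects.  The SQPOLL thread
     * establishes the order sqd_lock -> uring_lock, so a path that enters
     * with uring_lock held must drop it before taking sqd_lock.
     */
    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t uring_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t sqd_lock   = PTHREAD_MUTEX_INITIALIZER;

    /* Registration-style path: called with uring_lock already held. */
    static void register_iowq_max_workers(void)
    {
            /* Drop uring_lock so both locks can be taken in order... */
            pthread_mutex_unlock(&uring_lock);

            /* ...then acquire sqd_lock first, uring_lock second. */
            pthread_mutex_lock(&sqd_lock);
            pthread_mutex_lock(&uring_lock);

            printf("holding both locks in sqd_lock -> uring_lock order\n");

            pthread_mutex_unlock(&sqd_lock);
            /* Caller still expects uring_lock to be held on return. */
    }

    int main(void)
    {
            /* Mirrors the syscall path taking ctx->uring_lock up front. */
            pthread_mutex_lock(&uring_lock);
            register_iowq_max_workers();
            pthread_mutex_unlock(&uring_lock);
            return 0;
    }

(Build with: cc -pthread sketch.c)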

Fixes: 2e480058ddc2 ("io-wq: provide a way to limit max number of workers")
Reported-by: syzbot+97fa56483f69d677969f@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>