linux/fs/fuse
Miklos Szeredi 2fdbb8dd01 fuse: fix deadlock between atomic O_TRUNC and page invalidation
fuse_finish_open() will be called with FUSE_NOWRITE set in case of atomic
O_TRUNC open(), so commit 76224355db ("fuse: truncate pagecache on
atomic_o_trunc") replaced invalidate_inode_pages2() by truncate_pagecache()
in such a case to avoid the A-A deadlock. However, we found another A-B-B-A
deadlock related to the case above, which will cause the xfstests
generic/464 testcase hung in our virtio-fs test environment.

For example, consider two processes concurrently open one same file, one
with O_TRUNC and another without O_TRUNC. The deadlock case is described
below, if open(O_TRUNC) is already set_nowrite(acquired A), and is trying
to lock a page (acquiring B), open() could have held the page lock
(acquired B), and waiting on the page writeback (acquiring A). This would
lead to deadlocks.

open(O_TRUNC)
----------------------------------------------------------------
fuse_open_common
  inode_lock            [C acquire]
  fuse_set_nowrite      [A acquire]

  fuse_finish_open
    truncate_pagecache
      lock_page         [B acquire]
      truncate_inode_page
      unlock_page       [B release]

  fuse_release_nowrite  [A release]
  inode_unlock          [C release]
----------------------------------------------------------------

open()
----------------------------------------------------------------
fuse_open_common
  fuse_finish_open
    invalidate_inode_pages2
      lock_page         [B acquire]
        fuse_launder_page
          fuse_wait_on_page_writeback [A acquire & release]
      unlock_page       [B release]
----------------------------------------------------------------

Besides this case, all calls of invalidate_inode_pages2() and
invalidate_inode_pages2_range() in fuse code also can deadlock with
open(O_TRUNC).

Fix by moving the truncate_pagecache() call outside the nowrite protected
region.  The nowrite protection is only for delayed writeback
(writeback_cache) case, where inode lock does not protect against
truncation racing with writes on the server.  Write syscalls racing with
page cache truncation still get the inode lock protection.

This patch also changes the order of filemap_invalidate_lock()
vs. fuse_set_nowrite() in fuse_open_common().  This new order matches the
order found in fuse_file_fallocate() and fuse_do_setattr().

Reported-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
Tested-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
Fixes: e4648309b8 ("fuse: truncate pending writes on O_TRUNC")
Cc: <stable@vger.kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-07-21 16:02:45 +02:00
..
acl.c vfs: add rcu argument to ->get_acl() callback 2021-08-18 22:08:24 +02:00
control.c fuse: remove reliance on bdi congestion 2022-03-22 15:57:00 -07:00
cuse.c cuse: simplify refcount 2021-04-14 10:40:58 +02:00
dax.c dax: introduce DAX_RECOVERY_WRITE dax access mode 2022-05-16 13:35:56 -07:00
dev.c fuse: remove reliance on bdi congestion 2022-03-22 15:57:00 -07:00
dir.c fuse: fix deadlock between atomic O_TRUNC and page invalidation 2022-07-21 16:02:45 +02:00
file.c fuse: fix deadlock between atomic O_TRUNC and page invalidation 2022-07-21 16:02:45 +02:00
fuse_i.h fs: Remove ->readpages address space operation 2022-04-01 13:45:33 -04:00
inode.c fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
ioctl.c x86: Remove toolchain check for X32 ABI capability 2022-03-15 10:32:48 +01:00
Kconfig dax: remove CONFIG_DAX_DRIVER 2021-12-04 08:58:51 -08:00
Makefile fuse: move ioctl to separate source file 2021-04-12 15:04:30 +02:00
readdir.c fuse: only update necessary attributes 2021-10-28 09:45:33 +02:00
virtio_fs.c dax: introduce DAX_RECOVERY_WRITE dax access mode 2022-05-16 13:35:56 -07:00
xattr.c fuse: move fuse_invalidate_attr() into fuse_update_ctime() 2021-10-22 17:03:01 +02:00