iv/linux

Josef Bacik 38e3eebff6 btrfs: honor path->skip_locking in backref code		2019-02-25 14:13:39 +01:00

Qgroups will do the old roots lookup at delayed ref time, which could be while walking down the extent root while running a delayed ref. This should be fine, except we specifically lock eb's in the backref walking code irrespective of path->skip_locking, which deadlocks the system. Fix up the backref code to honor path->skip_locking, nobody will be modifying the commit_root when we're searching so it's completely safe to do.

This happens since `fb235dc06f` ("btrfs: qgroup: Move half of the qgroup accounting time out of commit trans"), kernel may lockup with quota enabled.

There is one backref trace triggered by snapshot dropping along with write operation in the source subvolume. The example can be reliably reproduced:

    btrfs-cleaner   D    0  4062      2 0x80000000
    Call Trace:
     schedule+0x32/0x90
     btrfs_tree_read_lock+0x93/0x130 [btrfs]
     find_parent_nodes+0x29b/0x1170 [btrfs]
     btrfs_find_all_roots_safe+0xa8/0x120 [btrfs]
     btrfs_find_all_roots+0x57/0x70 [btrfs]
     btrfs_qgroup_trace_extent_post+0x37/0x70 [btrfs]
     btrfs_qgroup_trace_leaf_items+0x10b/0x140 [btrfs]
     btrfs_qgroup_trace_subtree+0xc8/0xe0 [btrfs]
     do_walk_down+0x541/0x5e3 [btrfs]
     walk_down_tree+0xab/0xe7 [btrfs]
     btrfs_drop_snapshot+0x356/0x71a [btrfs]
     btrfs_clean_one_deleted_snapshot+0xb8/0xf0 [btrfs]
     cleaner_kthread+0x12b/0x160 [btrfs]
     kthread+0x112/0x130
     ret_from_fork+0x27/0x50

When dropping snapshots with qgroup enabled, we will trigger backref walk.

However such backref walk at that timing is pretty dangerous, as if one of the parent nodes get WRITE locked by other thread, we could cause a dead lock.

For example:

               FS 260     FS 261 (Dropped)
                node A        node B
               /      \      /      \
           node C      node D      node E
          /   \         /  \        /     \
      leaf F|leaf G|leaf H|leaf I|leaf J|leaf K

The lock sequence would be:

          Thread A (cleaner)             |       Thread B (other writer)
    -----------------------------------------------------------------------
    write_lock(B)                        |
    write_lock(D)                        |
    ^^^ called by walk_down_tree()       |
                                         |       write_lock(A)
                                         |       write_lock(D) << Stall
    read_lock(H) << for backref walk     |
    read_lock(D) << lock owner is        |
                    the same thread A    |
                    so read lock is OK   |
    read_lock(A) << Stall                |

So thread A hold write lock D, and needs read lock A to unlock. While thread B holds write lock A, while needs lock D to unlock.

This will cause a deadlock.

This is not only limited to snapshot dropping case. As the backref walk, even only happens on commit trees, is breaking the normal top-down locking order, makes it deadlock prone.

Fixes: `fb235dc06f` ("btrfs: qgroup: Move half of the qgroup accounting time out of commit trans")
CC: stable@vger.kernel.org # 4.14+
Reported-and-tested-by: David Sterba <dsterba@suse.com>
Reported-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
[ rebase to latest branch and fix lock assert bug in btrfs/007 ]
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ copy logs and deadlock analysis from Qu's patch ]
Signed-off-by: David Sterba <dsterba@suse.com>
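
The commit message above describes the fix only in words: the backref walk should take the extent buffer read lock only when `path->skip_locking` is not set. The patch itself is not shown on this page, so the following is a minimal userspace sketch of that pattern rather than the kernel code. Every name in it (`toy_eb`, `toy_path`, `toy_read_lock`, `toy_inspect_eb`, `toy_walk`) is a hypothetical stand-in, and the "lock" is just a counter so the effect of the flag can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an extent buffer; NOT the kernel struct. */
struct toy_eb {
	int read_locks;	/* read locks currently held on this buffer */
	int lock_calls;	/* total lock acquisitions, for demonstration */
};

/* Hypothetical stand-in for a search path; only the flag is modeled. */
struct toy_path {
	bool skip_locking;	/* same meaning as path->skip_locking */
};

static void toy_read_lock(struct toy_eb *eb)
{
	eb->read_locks++;
	eb->lock_calls++;
}

static void toy_read_unlock(struct toy_eb *eb)
{
	assert(eb->read_locks > 0);
	eb->read_locks--;
}

/*
 * Inspect one buffer during a backref-style walk. The lock is taken
 * only when the caller has not asked to skip locking, and it is
 * released under exactly the same condition.
 */
static void toy_inspect_eb(const struct toy_path *path, struct toy_eb *eb)
{
	if (!path->skip_locking)
		toy_read_lock(eb);

	/* ... read-only inspection of the buffer would happen here ... */

	if (!path->skip_locking)
		toy_read_unlock(eb);
}

/* Walk n buffers and return the total number of lock acquisitions so far. */
static int toy_walk(const struct toy_path *path, struct toy_eb *ebs, size_t n)
{
	int total = 0;

	for (size_t i = 0; i < n; i++) {
		toy_inspect_eb(path, &ebs[i]);
		total += ebs[i].lock_calls;
	}
	return total;
}
```

With `skip_locking` set, the walk never touches a lock, so it cannot stall on a node that another thread holds write-locked, which is the situation in the lock sequence above. The sketch assumes, as the commit message states, that nothing modifies the tree being searched while locking is skipped.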
tests	btrfs: remove always true if branch in find_delalloc_range	2018-12-17 14:51:44 +01:00
acl.c	Btrfs: setup a nofs context for memory allocation at __btrfs_set_acl	2019-02-25 14:13:17 +01:00
async-thread.c	btrfs: simplify workqueue name when allocating	2019-02-25 14:13:24 +01:00
async-thread.h	btrfs: replace GPL boilerplate by SPDX -- headers	2018-04-12 16:29:46 +02:00
backref.c	btrfs: honor path->skip_locking in backref code	2019-02-25 14:13:39 +01:00
backref.h	btrfs: replace GPL boilerplate by SPDX -- headers	2018-04-12 16:29:46 +02:00
btrfs_inode.h	Btrfs: fix fsync of files with multiple hard links in new directories	2018-12-17 14:51:43 +01:00
check-integrity.c	btrfs: Fix typos in comments and strings	2018-12-17 14:51:50 +01:00
check-integrity.h	btrfs: replace GPL boilerplate by SPDX -- headers	2018-04-12 16:29:46 +02:00
compression.c	btrfs: change set_level() to bound the level passed in	2019-02-25 14:13:32 +01:00
compression.h	btrfs: change set_level() to bound the level passed in	2019-02-25 14:13:32 +01:00
ctree.c	btrfs: merge btrfs_set_lock_blocking_rw with it's caller	2019-02-25 14:13:28 +01:00
ctree.h	btrfs: scrub: remove unused nocow worker pointer	2019-02-25 14:13:38 +01:00
dedupe.h	btrfs: replace GPL boilerplate by SPDX -- headers	2018-04-12 16:29:46 +02:00
delayed-inode.c	Btrfs: kill btrfs_clear_path_blocking	2018-10-15 17:23:38 +02:00
delayed-inode.h	Btrfs: delayed-inode: use rb_first_cached for ins_root and del_root	2018-10-15 17:23:33 +02:00
delayed-ref.c	btrfs: qgroup: Move reserved data accounting from btrfs_delayed_ref_head to btrfs_qgroup_extent_record	2019-02-25 14:13:39 +01:00
delayed-ref.h	btrfs: qgroup: Move reserved data accounting from btrfs_delayed_ref_head to btrfs_qgroup_extent_record	2019-02-25 14:13:39 +01:00
dev-replace.c	btrfs: merge btrfs_find_device and find_device	2019-02-25 14:13:24 +01:00
dev-replace.h	btrfs: dev-replace: open code trivial locking helpers	2018-12-17 14:51:45 +01:00
dir-item.c	btrfs: Remove root parameter from btrfs_insert_dir_item	2018-10-15 17:23:25 +02:00
disk-io.c	btrfs: scrub: convert scrub_workers_refcnt to refcount_t	2019-02-25 14:13:38 +01:00
disk-io.h	btrfs: drop extra enum initialization where using defaults	2018-12-17 14:51:43 +01:00
export.c	btrfs: Remove 'objectid' member from struct btrfs_root	2018-10-15 17:23:25 +02:00
export.h	btrfs: replace GPL boilerplate by SPDX -- headers	2018-04-12 16:29:46 +02:00
extent_io.c	btrfs: extent_io: Kill the forward declaration of flush_write_bio	2019-02-25 14:13:37 +01:00
extent_io.h	btrfs: Remove EXTENT_FIRST_DELALLOC bit	2019-02-25 14:13:36 +01:00
extent_map.c	btrfs: Remove impossible condition from mergable_maps	2019-02-25 14:13:21 +01:00
extent_map.h	btrfs: Remove impossible condition from mergable_maps	2019-02-25 14:13:21 +01:00
extent-tree.c	btrfs: qgroup: Move reserved data accounting from btrfs_delayed_ref_head to btrfs_qgroup_extent_record	2019-02-25 14:13:39 +01:00
file-item.c	btrfs: replace btrfs_io_bio::end_io with a simple helper	2018-12-17 14:51:40 +01:00
file.c	btrfs: Remove unused arguments from btrfs_get_extent_fiemap	2019-02-25 14:13:17 +01:00
free-space-cache.c	Btrfs: fix deadlock on tree root leaf when finding free extent	2018-11-06 16:42:32 +01:00
free-space-cache.h	btrfs: replace GPL boilerplate by SPDX -- headers	2018-04-12 16:29:46 +02:00
free-space-tree.c	btrfs: use EXPORT_FOR_TESTS for conditionally exported functions	2018-12-17 14:51:37 +01:00
free-space-tree.h	btrfs: Remove fs_info argument from add_to_free_space_tree	2018-05-28 18:07:36 +02:00
inode-item.c	btrfs: replace GPL boilerplate by SPDX -- sources	2018-04-12 16:29:51 +02:00
inode-map.c	btrfs: prune unused includes	2018-08-06 13:12:43 +02:00
inode-map.h	btrfs: replace GPL boilerplate by SPDX -- headers	2018-04-12 16:29:46 +02:00
inode.c	btrfs: reserve extra space during evict	2019-02-25 14:13:35 +01:00
ioctl.c	btrfs: merge btrfs_find_device and find_device	2019-02-25 14:13:24 +01:00
Kconfig	btrfs: add SPDX header to Kconfig	2018-04-12 16:29:55 +02:00
locking.c	btrfs: simplify waiting loop in btrfs_tree_lock	2019-02-25 14:13:28 +01:00
locking.h	btrfs: merge btrfs_set_lock_blocking_rw with it's caller	2019-02-25 14:13:28 +01:00
lzo.c	btrfs: change set_level() to bound the level passed in	2019-02-25 14:13:32 +01:00
Makefile	btrfs: Remove custom crc32c init code	2018-03-26 15:09:39 +02:00
math.h	btrfs: replace GPL boilerplate by SPDX -- headers	2018-04-12 16:29:46 +02:00
ordered-data.c	Btrfs: remove no longer used stuff for tracking pending ordered extents	2018-12-17 14:51:25 +01:00
ordered-data.h	btrfs: switch BTRFS_ORDERED_* to enums	2018-12-17 14:51:43 +01:00
orphan.c	btrfs: replace GPL boilerplate by SPDX -- sources	2018-04-12 16:29:51 +02:00
print-tree.c	btrfs: annotate unlikely branches after V0 extent type removal	2018-08-06 13:12:41 +02:00
print-tree.h	btrfs: print-tree: debugging output enhancement	2018-04-20 19:18:16 +02:00
props.c	btrfs: property: Set incompat flag if lzo/zstd compression is set	2018-05-17 14:18:25 +02:00
props.h	btrfs: replace GPL boilerplate by SPDX -- headers	2018-04-12 16:29:46 +02:00
qgroup.c	btrfs: qgroup: Make qgroup async transaction commit more aggressive	2019-02-25 14:13:39 +01:00
qgroup.h	btrfs: qgroup: Move reserved data accounting from btrfs_delayed_ref_head to btrfs_qgroup_extent_record	2019-02-25 14:13:39 +01:00
raid56.c	btrfs: Fix typos in comments and strings	2018-12-17 14:51:50 +01:00
raid56.h	btrfs: replace GPL boilerplate by SPDX -- headers	2018-04-12 16:29:46 +02:00
rcu-string.h	btrfs: replace GPL boilerplate by SPDX -- headers	2018-04-12 16:29:46 +02:00
reada.c	btrfs: dev-replace: open code trivial locking helpers	2018-12-17 14:51:45 +01:00
ref-verify.c	btrfs: replace btrfs_set_lock_blocking_rw with appropriate helpers	2019-02-25 14:13:27 +01:00
ref-verify.h	btrfs: replace GPL boilerplate by SPDX -- headers	2018-04-12 16:29:46 +02:00
relocation.c	btrfs: open code now trivial btrfs_set_lock_blocking	2019-02-25 14:13:27 +01:00
root-tree.c	btrfs: Remove fs_info from btrfs_add_root_ref	2018-08-06 13:13:00 +02:00
scrub.c	btrfs: scrub: add assertions for worker pointers	2019-02-25 14:13:38 +01:00
send.c	Remove 'type' argument from access_ok() function	2019-01-03 18:57:57 -08:00
send.h	btrfs: replace GPL boilerplate by SPDX -- headers	2018-04-12 16:29:46 +02:00
struct-funcs.c	btrfs: prune unused includes	2018-08-06 13:12:43 +02:00
super.c	btrfs: add zstd compression level support	2019-02-25 14:13:33 +01:00
sysfs.c	btrfs: Add sysfs support for metadata_uuid feature	2018-12-17 14:51:37 +01:00
sysfs.h	btrfs: drop extra enum initialization where using defaults	2018-12-17 14:51:43 +01:00
transaction.c	btrfs: open code now trivial btrfs_set_lock_blocking	2019-02-25 14:13:27 +01:00
transaction.h	btrfs: drop extra enum initialization where using defaults	2018-12-17 14:51:43 +01:00
tree-checker.c	btrfs: Fix typos in comments and strings	2018-12-17 14:51:50 +01:00
tree-checker.h	btrfs: replace GPL boilerplate by SPDX -- headers	2018-04-12 16:29:46 +02:00
tree-defrag.c	btrfs: open code now trivial btrfs_set_lock_blocking	2019-02-25 14:13:27 +01:00
tree-log.c	btrfs: open code now trivial btrfs_set_lock_blocking	2019-02-25 14:13:27 +01:00
tree-log.h	Btrfs: remove no longer used io_err from btrfs_log_ctx	2018-12-17 14:51:31 +01:00
ulist.c	btrfs: replace GPL boilerplate by SPDX -- sources	2018-04-12 16:29:51 +02:00
ulist.h	btrfs: replace GPL boilerplate by SPDX -- headers	2018-04-12 16:29:46 +02:00
uuid-tree.c	btrfs: Remove fs_info argument from btrfs_uuid_tree_rem	2018-05-30 16:46:53 +02:00
volumes.c	btrfs: fix comment its device list mutex not volume lock	2019-02-25 14:13:37 +01:00
volumes.h	btrfs: introduce new ioctl to unregister a btrfs device	2019-02-25 14:13:30 +01:00
xattr.c	Btrfs: use nofs context when initializing security xattrs to avoid deadlock	2018-12-17 14:51:49 +01:00
xattr.h	btrfs: replace GPL boilerplate by SPDX -- headers	2018-04-12 16:29:46 +02:00
zlib.c	btrfs: change set_level() to bound the level passed in	2019-02-25 14:13:32 +01:00
zstd.c	btrfs: add zstd compression level support	2019-02-25 14:13:33 +01:00