linux/fs/btrfs
Josef Bacik 0568e82dbe btrfs: run delayed items before dropping the snapshot
With my delayed refs patches in place we started seeing a large amount
of aborts in __btrfs_free_extent:

 BTRFS error (device sdb1): unable to find ref byte nr 91947008 parent 0 root 35964  owner 1 offset 0
 Call Trace:
  ? btrfs_merge_delayed_refs+0xaf/0x340
  __btrfs_run_delayed_refs+0x6ea/0xfc0
  ? btrfs_set_path_blocking+0x31/0x60
  btrfs_run_delayed_refs+0xeb/0x180
  btrfs_commit_transaction+0x179/0x7f0
  ? btrfs_check_space_for_delayed_refs+0x30/0x50
  ? should_end_transaction.isra.19+0xe/0x40
  btrfs_drop_snapshot+0x41c/0x7c0
  btrfs_clean_one_deleted_snapshot+0xb5/0xd0
  cleaner_kthread+0xf6/0x120
  kthread+0xf8/0x130
  ? btree_invalidatepage+0x90/0x90
  ? kthread_bind+0x10/0x10
  ret_from_fork+0x35/0x40

This was because btrfs_drop_snapshot depends on the root not being
modified while it's dropping the snapshot.  It will unlock the root node
(and really every node) as it walks down the tree, only to re-lock it
when it needs to do something.  This is a problem because if we modify
the tree we could cow a block in our path, which frees our reference to
that block.  Then once we get back to that shared block we'll free our
reference to it again, and get ENOENT when trying to lookup our extent
reference to that block in __btrfs_free_extent.

This is ultimately happening because we have delayed items left to be
processed for our deleted snapshot _after_ all of the inodes are closed
for the snapshot.  We only run the delayed inode item if we're deleting
the inode, and even then we do not run the delayed insertions or delayed
removals.  These can be run at any point after our final inode does its
last iput, which is what triggers the snapshot deletion.  We can end up
with the snapshot deletion happening and then have the delayed items run
on that file system, resulting in the above problem.

This problem has existed forever, however my patches made it much easier
to hit as I wake up the cleaner much more often to deal with delayed
iputs, which made us more likely to start the snapshot dropping work
before the transaction commits, which is when the delayed items would
generally be run.  Before, generally speaking, we would run the delayed
items, commit the transaction, and wakeup the cleaner thread to start
deleting snapshots, which means we were less likely to hit this problem.
You could still hit it if you had multiple snapshots to be deleted and
ended up with lots of delayed items, but it was definitely harder.

Fix for now by simply running all the delayed items before starting to
drop the snapshot.  We could make this smarter in the future by making
the delayed items per-root, and then simply drop any delayed items for
roots that we are going to delete.  But for now just a quick and easy
solution is the safest.

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-12-17 14:51:49 +01:00
..
tests btrfs: remove always true if branch in find_delalloc_range 2018-12-17 14:51:44 +01:00
acl.c btrfs: remove unnecessary curly braces in btrfs_get_acl 2018-08-06 13:12:41 +02:00
async-thread.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
async-thread.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
backref.c btrfs: Remove needless tree locking in iterate_inode_extrefs 2018-12-17 14:51:30 +01:00
backref.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
btrfs_inode.h Btrfs: fix fsync of files with multiple hard links in new directories 2018-12-17 14:51:43 +01:00
check-integrity.c btrfs: use PAGE_ALIGNED instead of open-coding it 2018-12-17 14:51:45 +01:00
check-integrity.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
compression.c btrfs: use PAGE_ALIGNED instead of open-coding it 2018-12-17 14:51:45 +01:00
compression.h btrfs: compression: Add linux/sizes.h for compression.h 2018-05-29 18:13:00 +02:00
ctree.c btrfs: catch cow on deleting snapshots 2018-12-17 14:51:48 +01:00
ctree.h btrfs: catch cow on deleting snapshots 2018-12-17 14:51:48 +01:00
dedupe.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
delayed-inode.c Btrfs: kill btrfs_clear_path_blocking 2018-10-15 17:23:38 +02:00
delayed-inode.h Btrfs: delayed-inode: use rb_first_cached for ins_root and del_root 2018-10-15 17:23:33 +02:00
delayed-ref.c btrfs: introduce delayed_refs_rsv 2018-12-17 14:51:46 +01:00
delayed-ref.h btrfs: add btrfs_delete_ref_head helper 2018-12-17 14:51:46 +01:00
dev-replace.c btrfs: dev-replace: open code trivial locking helpers 2018-12-17 14:51:45 +01:00
dev-replace.h btrfs: dev-replace: open code trivial locking helpers 2018-12-17 14:51:45 +01:00
dir-item.c btrfs: Remove root parameter from btrfs_insert_dir_item 2018-10-15 17:23:25 +02:00
disk-io.c btrfs: introduce delayed_refs_rsv 2018-12-17 14:51:46 +01:00
disk-io.h btrfs: drop extra enum initialization where using defaults 2018-12-17 14:51:43 +01:00
export.c btrfs: Remove 'objectid' member from struct btrfs_root 2018-10-15 17:23:25 +02:00
export.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
extent_io.c btrfs: use offset_in_page instead of open-coding it 2018-12-17 14:51:45 +01:00
extent_io.h btrfs: remove always true if branch in find_delalloc_range 2018-12-17 14:51:44 +01:00
extent_map.c Btrfs: extent_map: use rb_first_cached 2018-10-15 17:23:33 +02:00
extent_map.h btrfs: switch EXTENT_FLAG_* to enums 2018-12-17 14:51:43 +01:00
extent-tree.c btrfs: run delayed items before dropping the snapshot 2018-12-17 14:51:49 +01:00
file-item.c btrfs: replace btrfs_io_bio::end_io with a simple helper 2018-12-17 14:51:40 +01:00
file.c btrfs: use offset_in_page instead of open-coding it 2018-12-17 14:51:45 +01:00
free-space-cache.c Btrfs: fix deadlock on tree root leaf when finding free extent 2018-11-06 16:42:32 +01:00
free-space-cache.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
free-space-tree.c btrfs: use EXPORT_FOR_TESTS for conditionally exported functions 2018-12-17 14:51:37 +01:00
free-space-tree.h btrfs: Remove fs_info argument from add_to_free_space_tree 2018-05-28 18:07:36 +02:00
inode-item.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
inode-map.c btrfs: prune unused includes 2018-08-06 13:12:43 +02:00
inode-map.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
inode.c btrfs: fix truncate throttling 2018-12-17 14:51:47 +01:00
ioctl.c btrfs: Remove fsid/metadata_fsid fields from btrfs_info 2018-12-17 14:51:37 +01:00
Kconfig btrfs: add SPDX header to Kconfig 2018-04-12 16:29:55 +02:00
locking.c btrfs: replace waitqueue_actvie with cond_wake_up 2018-05-28 18:23:09 +02:00
locking.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
lzo.c btrfs: lzo: Harden inline lzo compressed extent decompression 2018-05-30 16:46:43 +02:00
Makefile btrfs: Remove custom crc32c init code 2018-03-26 15:09:39 +02:00
math.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
ordered-data.c Btrfs: remove no longer used stuff for tracking pending ordered extents 2018-12-17 14:51:25 +01:00
ordered-data.h btrfs: switch BTRFS_ORDERED_* to enums 2018-12-17 14:51:43 +01:00
orphan.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
print-tree.c btrfs: annotate unlikely branches after V0 extent type removal 2018-08-06 13:12:41 +02:00
print-tree.h btrfs: print-tree: debugging output enhancement 2018-04-20 19:18:16 +02:00
props.c btrfs: property: Set incompat flag if lzo/zstd compression is set 2018-05-17 14:18:25 +02:00
props.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
qgroup.c Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation 2018-12-17 14:51:39 +01:00
qgroup.h btrfs: drop extra enum initialization where using defaults 2018-12-17 14:51:43 +01:00
raid56.c btrfs: raid56: catch errors from full_stripe_write 2018-08-06 13:12:45 +02:00
raid56.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
rcu-string.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
reada.c btrfs: dev-replace: open code trivial locking helpers 2018-12-17 14:51:45 +01:00
ref-verify.c btrfs: Remove 'objectid' member from struct btrfs_root 2018-10-15 17:23:25 +02:00
ref-verify.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
relocation.c btrfs: add helper to describe block group flags 2018-12-17 14:51:39 +01:00
root-tree.c btrfs: Remove fs_info from btrfs_add_root_ref 2018-08-06 13:13:00 +02:00
scrub.c Btrfs: scrub, move setup of nofs contexts higher in the stack 2018-12-17 14:51:48 +01:00
send.c btrfs: use offset_in_page instead of open-coding it 2018-12-17 14:51:45 +01:00
send.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
struct-funcs.c btrfs: prune unused includes 2018-08-06 13:12:43 +02:00
super.c btrfs: Remove fsid/metadata_fsid fields from btrfs_info 2018-12-17 14:51:37 +01:00
sysfs.c btrfs: Add sysfs support for metadata_uuid feature 2018-12-17 14:51:37 +01:00
sysfs.h btrfs: drop extra enum initialization where using defaults 2018-12-17 14:51:43 +01:00
transaction.c btrfs: don't run delayed refs in the end transaction logic 2018-12-17 14:51:47 +01:00
transaction.h btrfs: drop extra enum initialization where using defaults 2018-12-17 14:51:43 +01:00
tree-checker.c btrfs: tree-checker: Don't check max block group size as current max chunk size limit is unreliable 2018-12-04 15:05:30 +01:00
tree-checker.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
tree-defrag.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
tree-log.c Btrfs: fix fsync of files with multiple hard links in new directories 2018-12-17 14:51:43 +01:00
tree-log.h Btrfs: remove no longer used io_err from btrfs_log_ctx 2018-12-17 14:51:31 +01:00
ulist.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
ulist.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
uuid-tree.c btrfs: Remove fs_info argument from btrfs_uuid_tree_rem 2018-05-30 16:46:53 +02:00
volumes.c btrfs: use offset_in_page instead of open-coding it 2018-12-17 14:51:45 +01:00
volumes.h btrfs: remove btrfs_bio_end_io_t 2018-12-17 14:51:40 +01:00
xattr.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
xattr.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
zlib.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
zstd.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00