linux/fs/btrfs
Filipe Manana f73853c716 btrfs: send: avoid double extent tree search when finding clone source
At find_extent_clone() we search twice for the extent item corresponding
to the data extent that the current file extent items points to:

1) Once with a call to extent_from_logical();

2) Once again during backref walking, through iterate_extent_inodes()
   which eventually leads to find_parent_nodes() where we will search
   again the extent tree for the same extent item.

The extent tree can be huge, so doing this one extra search for every
extent we want to send adds up and it's expensive.

The first call is there since the send code was introduced and it
accomplishes two things:

1) Check that the extent is flagged as a data extent in the extent tree.
   But it can not be anything else, otherwise we wouldn't have a file
   extent item in the send root pointing to it.
   This was probably added to catch bugs in the early days where send was
   yet too young and the interaction with everything else was far from
   perfect;

2) Check how many direct references there are on the extent, and if
   there's too many (more than SEND_MAX_EXTENT_REFS), avoid doing the
   backred walking as it may take too long and slowdown send.

So improve on this by having a callback in the backref walking code that
is called when it finds the extent item in the extent tree, and have those
checks done in the callback. When the callback returns anything different
from 0, it stops the backref walking code. This way we do a single search
on the extent tree for the extent item of our data extent.

Also, before this change we were only checking the number of references on
the data extent against SEND_MAX_EXTENT_REFS, but after starting backref
walking we will end up resolving backrefs for extent buffers in the path
from a leaf having a file extent item pointing to our data extent, up to
roots of trees from which the extent buffer is accessible from, due to
shared subtrees resulting from snapshoting. We were therefore allowing for
the possibility for send taking too long due to some node in the path from
the leaf to a root node being shared too many times. After this change we
check for reference counts being greater than SEND_MAX_EXTENT_REFS for
both data extents and metadata extents.

This change is part of a patchset comprised of the following patches:

  01/17 btrfs: fix inode list leak during backref walking at resolve_indirect_refs()
  02/17 btrfs: fix inode list leak during backref walking at find_parent_nodes()
  03/17 btrfs: fix ulist leaks in error paths of qgroup self tests
  04/17 btrfs: remove pointless and double ulist frees in error paths of qgroup tests
  05/17 btrfs: send: avoid unnecessary path allocations when finding extent clone
  06/17 btrfs: send: update comment at find_extent_clone()
  07/17 btrfs: send: drop unnecessary backref context field initializations
  08/17 btrfs: send: avoid unnecessary backref lookups when finding clone source
  09/17 btrfs: send: optimize clone detection to increase extent sharing
  10/17 btrfs: use a single argument for extent offset in backref walking functions
  11/17 btrfs: use a structure to pass arguments to backref walking functions
  12/17 btrfs: reuse roots ulist on each leaf iteration for iterate_extent_inodes()
  13/17 btrfs: constify ulist parameter of ulist_next()
  14/17 btrfs: send: cache leaf to roots mapping during backref walking
  15/17 btrfs: send: skip unnecessary backref iterations
  16/17 btrfs: send: avoid double extent tree search when finding clone source
  17/17 btrfs: send: skip resolution of our own backref when finding clone source

Performance test results are in the changelog of patch 17/17.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05 18:00:50 +01:00
..
tests btrfs: use a structure to pass arguments to backref walking functions 2022-12-05 18:00:50 +01:00
accessors.c btrfs: move btrfs_map_token to accessors 2022-12-05 18:00:42 +01:00
accessors.h btrfs: extend btrfs_dir_item type to store encryption status 2022-12-05 18:00:43 +01:00
acl.c btrfs: move acl prototypes into acl.h 2022-12-05 18:00:46 +01:00
acl.h btrfs: move acl prototypes into acl.h 2022-12-05 18:00:46 +01:00
async-thread.c btrfs: simplify WQ_HIGHPRI handling in struct btrfs_workqueue 2022-05-16 17:03:15 +02:00
async-thread.h btrfs: remove unused typedefs get_extent_t and btrfs_work_func_t 2022-07-25 17:45:36 +02:00
backref.c btrfs: send: avoid double extent tree search when finding clone source 2022-12-05 18:00:50 +01:00
backref.h btrfs: send: avoid double extent tree search when finding clone source 2022-12-05 18:00:50 +01:00
block-group.c btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
block-group.h btrfs: skip update of block group item if used bytes are the same 2022-12-05 18:00:40 +01:00
block-rsv.c btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
block-rsv.h btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
btrfs_inode.h btrfs: move inode prototypes to btrfs_inode.h 2022-12-05 18:00:45 +01:00
check-integrity.c btrfs: move accessor helpers into accessors.h 2022-12-05 18:00:42 +01:00
check-integrity.h btrfs: check-integrity: split submit_bio from btrfsic checking 2022-05-16 17:03:12 +02:00
compression.c btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
compression.h btrfs: add blk_types.h include to compression.h 2022-12-05 18:00:45 +01:00
ctree.c btrfs: move relocation prototypes into relocation.h 2022-12-05 18:00:47 +01:00
ctree.h btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
defrag.c btrfs: remove new_inline argument from btrfs_extent_item_to_extent_map() 2022-12-05 18:00:48 +01:00
defrag.h btrfs: move defrag related prototypes to their own header 2022-12-05 18:00:46 +01:00
delalloc-space.c btrfs: update function comments 2022-12-05 18:00:45 +01:00
delalloc-space.h btrfs: move delalloc space related prototypes to delalloc-space.h 2022-12-05 18:00:44 +01:00
delayed-inode.c btrfs: move file-item prototypes into their own header 2022-12-05 18:00:46 +01:00
delayed-inode.h btrfs: extend btrfs_dir_item type to store encryption status 2022-12-05 18:00:43 +01:00
delayed-ref.c btrfs: update function comments 2022-12-05 18:00:45 +01:00
delayed-ref.h btrfs: remove btrfs_delayed_extent_op::is_data 2022-05-16 17:17:31 +02:00
dev-replace.c btrfs: move scrub prototypes into scrub.h 2022-12-05 18:00:47 +01:00
dev-replace.h btrfs: move dev-replace prototypes into dev-replace.h 2022-12-05 18:00:47 +01:00
dir-item.c btrfs: move dir-item prototypes into dir-item.h 2022-12-05 18:00:46 +01:00
dir-item.h btrfs: move dir-item prototypes into dir-item.h 2022-12-05 18:00:46 +01:00
discard.c btrfs: update function comments 2022-12-05 18:00:45 +01:00
discard.h
disk-io.c btrfs: remove the unused endio_raid56_workers and btrfs_raid_bio::end_io_work 2022-12-05 18:00:49 +01:00
disk-io.h btrfs: remove unused function prototypes 2022-12-05 18:00:44 +01:00
export.c btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
export.h btrfs: simplify generation check in btrfs_get_dentry 2022-12-05 18:00:41 +01:00
extent_io.c btrfs: merge struct extent_page_data to btrfs_bio_ctrl 2022-12-05 18:00:48 +01:00
extent_io.h btrfs: convert extent_io page op defines to enum bits 2022-12-05 18:00:40 +01:00
extent_map.c btrfs: selftests: remove impossible inline extent at non-zero file offset 2022-12-05 18:00:47 +01:00
extent_map.h btrfs: get the next extent map during fiemap/lseek more efficiently 2022-12-05 18:00:38 +01:00
extent-io-tree.c btrfs: move the printk helpers out of ctree.h 2022-12-05 18:00:41 +01:00
extent-io-tree.h btrfs: remove unused unlock_extent_atomic 2022-12-05 18:00:41 +01:00
extent-tree.c btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
extent-tree.h btrfs: move the snapshot drop related prototypes to extent-tree.h 2022-12-05 18:00:46 +01:00
file-item.c btrfs: remove new_inline argument from btrfs_extent_item_to_extent_map() 2022-12-05 18:00:48 +01:00
file-item.h btrfs: remove new_inline argument from btrfs_extent_item_to_extent_map() 2022-12-05 18:00:48 +01:00
file.c btrfs: update stale comment for nowait direct IO writes 2022-12-05 18:00:48 +01:00
file.h btrfs: move file prototypes to file.h 2022-12-05 18:00:46 +01:00
free-space-cache.c btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
free-space-cache.h btrfs: convert discard stat defs to enum 2022-12-05 18:00:45 +01:00
free-space-tree.c btrfs: move root tree prototypes to their own header 2022-12-05 18:00:44 +01:00
free-space-tree.h
fs.c btrfs: move accessor helpers into accessors.h 2022-12-05 18:00:42 +01:00
fs.h btrfs: remove the unused endio_raid56_workers and btrfs_raid_bio::end_io_work 2022-12-05 18:00:49 +01:00
inode-item.c btrfs: move file-item prototypes into their own header 2022-12-05 18:00:46 +01:00
inode-item.h btrfs: use struct fscrypt_str instead of struct qstr 2022-12-05 18:00:43 +01:00
inode.c btrfs: extract the inline extent read code into its own function 2022-12-05 18:00:48 +01:00
ioctl.c btrfs: move super prototypes into super.h 2022-12-05 18:00:47 +01:00
ioctl.h btrfs: move ioctl prototypes into ioctl.h 2022-12-05 18:00:46 +01:00
Kconfig btrfs: use generic Kconfig option for 256kB page size limit 2022-01-20 08:52:55 +02:00
locking.c btrfs: move accessor helpers into accessors.h 2022-12-05 18:00:42 +01:00
locking.h btrfs: move the lockdep helpers into locking.h 2022-12-05 18:00:44 +01:00
lzo.c btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
Makefile btrfs: rename tree-defrag.c to defrag.c 2022-12-05 18:00:45 +01:00
messages.c btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
messages.h btrfs: move the 32bit warn defines into messages.h 2022-12-05 18:00:46 +01:00
misc.h btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
ordered-data.c btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
ordered-data.h btrfs: use cached_state for btrfs_check_nocow_lock 2022-12-05 18:00:36 +01:00
orphan.c btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
orphan.h btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
print-tree.c btrfs: extend btrfs_dir_item type to store encryption status 2022-12-05 18:00:43 +01:00
print-tree.h
props.c btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
props.h btrfs: make module init/exit match their sequence 2022-12-05 18:00:40 +01:00
qgroup.c btrfs: use a structure to pass arguments to backref walking functions 2022-12-05 18:00:50 +01:00
qgroup.h btrfs: sink gfp_t parameter to btrfs_qgroup_trace_extent 2022-12-05 18:00:43 +01:00
raid56.c btrfs: raid56: switch scrub path to use a single function 2022-12-05 18:00:49 +01:00
raid56.h btrfs: remove the unused endio_raid56_workers and btrfs_raid_bio::end_io_work 2022-12-05 18:00:49 +01:00
rcu-string.h
ref-verify.c btrfs: move accessor helpers into accessors.h 2022-12-05 18:00:42 +01:00
ref-verify.h
reflink.c btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
reflink.h
relocation.c btrfs: use a structure to pass arguments to backref walking functions 2022-12-05 18:00:50 +01:00
relocation.h btrfs: move relocation prototypes into relocation.h 2022-12-05 18:00:47 +01:00
root-tree.c btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
root-tree.h btrfs: move root tree prototypes to their own header 2022-12-05 18:00:44 +01:00
scrub.c btrfs: use a structure to pass arguments to backref walking functions 2022-12-05 18:00:50 +01:00
scrub.h btrfs: move scrub prototypes into scrub.h 2022-12-05 18:00:47 +01:00
send.c btrfs: send: avoid double extent tree search when finding clone source 2022-12-05 18:00:50 +01:00
send.h btrfs: send add define for v2 buffer size 2022-12-05 18:00:41 +01:00
space-info.c btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
space-info.h btrfs: move btrfs_account_ro_block_groups_free_space into space-info.c 2022-12-05 18:00:44 +01:00
subpage.c btrfs: move the printk helpers out of ctree.h 2022-12-05 18:00:41 +01:00
subpage.h btrfs: make nodesize >= PAGE_SIZE case to reuse the non-subpage routine 2022-05-16 17:03:11 +02:00
super.c btrfs: move super prototypes into super.h 2022-12-05 18:00:47 +01:00
super.h btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
sysfs.c btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
sysfs.h
transaction.c btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
transaction.h btrfs: remove fs_info::pending_changes and related code 2022-12-05 18:00:42 +01:00
tree-checker.c btrfs: extend btrfs_dir_item type to store encryption status 2022-12-05 18:00:43 +01:00
tree-checker.h btrfs: tree-checker: check extent buffer owner against owner rootid 2022-05-16 17:03:09 +02:00
tree-log.c btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
tree-log.h btrfs: use struct fscrypt_str instead of struct qstr 2022-12-05 18:00:43 +01:00
tree-mod-log.c btrfs: move accessor helpers into accessors.h 2022-12-05 18:00:42 +01:00
tree-mod-log.h btrfs: fix SPDX comment in tree-mod-log.h 2022-12-05 18:00:48 +01:00
ulist.c btrfs: constify ulist parameter of ulist_next() 2022-12-05 18:00:50 +01:00
ulist.h btrfs: constify ulist parameter of ulist_next() 2022-12-05 18:00:50 +01:00
uuid-tree.c btrfs: move uuid tree prototypes to uuid-tree.h 2022-12-05 18:00:46 +01:00
uuid-tree.h btrfs: move uuid tree prototypes to uuid-tree.h 2022-12-05 18:00:46 +01:00
verity.c btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
verity.h btrfs: move verity prototypes into verity.h 2022-12-05 18:00:47 +01:00
volumes.c btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
volumes.h btrfs: move btrfs_chunk_item_size out of ctree.h 2022-12-05 18:00:45 +01:00
xattr.c btrfs: move dir-item prototypes into dir-item.h 2022-12-05 18:00:46 +01:00
xattr.h
zlib.c btrfs: zlib: replace kmap() with kmap_local_page() in zlib_decompress_bio() 2022-07-25 17:45:41 +02:00
zoned.c btrfs: update function comments 2022-12-05 18:00:45 +01:00
zoned.h btrfs: move the printk helpers out of ctree.h 2022-12-05 18:00:41 +01:00
zstd.c btrfs: update function comments 2022-12-05 18:00:45 +01:00