linux/fs/btrfs
Filipe Manana adf0241868 btrfs: send: skip resolution of our own backref when finding clone source
When doing backref walking to determine a source range to clone from, it
is worthless to collect and resolve our own data backref, as we can't
obviously use it as a clone source and it represents the range we want to
clone into. Collecting the backref implies doing the extra work to resolve
it, doing the search for a file extent item in a subvolume tree, etc.
Skipping the data backref is valid as long as we only have the send root
as the single clone root, otherwise the leaf with the file extent item may
be accessible from another clone root due to shared subtrees created by
snapshots, and therefore we have to collect the backref and resolve it.

So add a callback to the backref walking code to guide it to skip data
backrefs.

This change is part of a patchset comprised of the following patches:

  01/17 btrfs: fix inode list leak during backref walking at resolve_indirect_refs()
  02/17 btrfs: fix inode list leak during backref walking at find_parent_nodes()
  03/17 btrfs: fix ulist leaks in error paths of qgroup self tests
  04/17 btrfs: remove pointless and double ulist frees in error paths of qgroup tests
  05/17 btrfs: send: avoid unnecessary path allocations when finding extent clone
  06/17 btrfs: send: update comment at find_extent_clone()
  07/17 btrfs: send: drop unnecessary backref context field initializations
  08/17 btrfs: send: avoid unnecessary backref lookups when finding clone source
  09/17 btrfs: send: optimize clone detection to increase extent sharing
  10/17 btrfs: use a single argument for extent offset in backref walking functions
  11/17 btrfs: use a structure to pass arguments to backref walking functions
  12/17 btrfs: reuse roots ulist on each leaf iteration for iterate_extent_inodes()
  13/17 btrfs: constify ulist parameter of ulist_next()
  14/17 btrfs: send: cache leaf to roots mapping during backref walking
  15/17 btrfs: send: skip unnecessary backref iterations
  16/17 btrfs: send: avoid double extent tree search when finding clone source
  17/17 btrfs: send: skip resolution of our own backref when finding clone source

The following test was run on non-debug kernel (Debian's default kernel
config) before and after applying the patchset:

   $ cat test-send-many-shared-extents.sh
   #!/bin/bash

   DEV=/dev/sdh
   MNT=/mnt/sdh

   umount $DEV &> /dev/null
   mkfs.btrfs -f $DEV
   mount $DEV $MNT

   num_files=50000
   num_clones_per_file=50

   for ((i = 1; i <= $num_files; i++)); do
       xfs_io -f -c "pwrite 0 64K" $MNT/file_$i > /dev/null
       echo -ne "\r$i files created..."
   done
   echo

   btrfs subvolume snapshot -r $MNT $MNT/snap1

   cloned=0
   for ((i = 1; i <= $num_clones_per_file; i++)); do
       for ((j = 1; j <= $num_files; j++)); do
           cp --reflink=always $MNT/file_$j $MNT/file_${j}_clone_${i}
           cloned=$((cloned + 1))
           echo -ne "\r$cloned / $((num_files * num_clones_per_file)) clone operations"
       done
   done
   echo

   btrfs subvolume snapshot -r $MNT $MNT/snap2

   # Unmount and mount again to clear all cached metadata (and data).
   umount $DEV
   mount $DEV $MNT

   start=$(date +%s%N)
   btrfs send $MNT/snap2 > /dev/null
   end=$(date +%s%N)

   dur=$(( (end - start) / 1000000000 ))
   echo -e "\nFull send took $dur seconds"

   # Unmount and mount again to clear all cached metadata (and data).
   umount $DEV
   mount $DEV $MNT

   start=$(date +%s%N)
   btrfs send -p $MNT/snap1 $MNT/snap2 > /dev/null
   end=$(date +%s%N)

   dur=$(( (end - start) / 1000000000 ))
   echo -e "\nIncremental send took $dur seconds"

   umount $MNT

Before applying the patchset:

   (...)
   Full send took 1108 seconds
   (...)
   Incremental send took 1135 seconds

After applying the whole patchset:

   (...)
   Full send took 268 seconds            (-75.8%)
   (...)
   Incremental send took 316 seconds     (-72.2%)

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05 18:00:50 +01:00
..
tests btrfs: use a structure to pass arguments to backref walking functions 2022-12-05 18:00:50 +01:00
accessors.c btrfs: move btrfs_map_token to accessors 2022-12-05 18:00:42 +01:00
accessors.h btrfs: extend btrfs_dir_item type to store encryption status 2022-12-05 18:00:43 +01:00
acl.c btrfs: move acl prototypes into acl.h 2022-12-05 18:00:46 +01:00
acl.h btrfs: move acl prototypes into acl.h 2022-12-05 18:00:46 +01:00
async-thread.c btrfs: simplify WQ_HIGHPRI handling in struct btrfs_workqueue 2022-05-16 17:03:15 +02:00
async-thread.h btrfs: remove unused typedefs get_extent_t and btrfs_work_func_t 2022-07-25 17:45:36 +02:00
backref.c btrfs: send: skip resolution of our own backref when finding clone source 2022-12-05 18:00:50 +01:00
backref.h btrfs: send: skip resolution of our own backref when finding clone source 2022-12-05 18:00:50 +01:00
block-group.c btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
block-group.h btrfs: skip update of block group item if used bytes are the same 2022-12-05 18:00:40 +01:00
block-rsv.c btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
block-rsv.h btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
btrfs_inode.h btrfs: move inode prototypes to btrfs_inode.h 2022-12-05 18:00:45 +01:00
check-integrity.c btrfs: move accessor helpers into accessors.h 2022-12-05 18:00:42 +01:00
check-integrity.h btrfs: check-integrity: split submit_bio from btrfsic checking 2022-05-16 17:03:12 +02:00
compression.c btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
compression.h btrfs: add blk_types.h include to compression.h 2022-12-05 18:00:45 +01:00
ctree.c btrfs: move relocation prototypes into relocation.h 2022-12-05 18:00:47 +01:00
ctree.h btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
defrag.c btrfs: remove new_inline argument from btrfs_extent_item_to_extent_map() 2022-12-05 18:00:48 +01:00
defrag.h btrfs: move defrag related prototypes to their own header 2022-12-05 18:00:46 +01:00
delalloc-space.c btrfs: update function comments 2022-12-05 18:00:45 +01:00
delalloc-space.h btrfs: move delalloc space related prototypes to delalloc-space.h 2022-12-05 18:00:44 +01:00
delayed-inode.c btrfs: move file-item prototypes into their own header 2022-12-05 18:00:46 +01:00
delayed-inode.h btrfs: extend btrfs_dir_item type to store encryption status 2022-12-05 18:00:43 +01:00
delayed-ref.c btrfs: update function comments 2022-12-05 18:00:45 +01:00
delayed-ref.h btrfs: remove btrfs_delayed_extent_op::is_data 2022-05-16 17:17:31 +02:00
dev-replace.c btrfs: move scrub prototypes into scrub.h 2022-12-05 18:00:47 +01:00
dev-replace.h btrfs: move dev-replace prototypes into dev-replace.h 2022-12-05 18:00:47 +01:00
dir-item.c btrfs: move dir-item prototypes into dir-item.h 2022-12-05 18:00:46 +01:00
dir-item.h btrfs: move dir-item prototypes into dir-item.h 2022-12-05 18:00:46 +01:00
discard.c btrfs: update function comments 2022-12-05 18:00:45 +01:00
discard.h btrfs: cleanup btrfs_discard_update_discardable usage 2020-12-08 15:54:02 +01:00
disk-io.c btrfs: remove the unused endio_raid56_workers and btrfs_raid_bio::end_io_work 2022-12-05 18:00:49 +01:00
disk-io.h btrfs: remove unused function prototypes 2022-12-05 18:00:44 +01:00
export.c btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
export.h btrfs: simplify generation check in btrfs_get_dentry 2022-12-05 18:00:41 +01:00
extent_io.c btrfs: merge struct extent_page_data to btrfs_bio_ctrl 2022-12-05 18:00:48 +01:00
extent_io.h btrfs: convert extent_io page op defines to enum bits 2022-12-05 18:00:40 +01:00
extent_map.c btrfs: selftests: remove impossible inline extent at non-zero file offset 2022-12-05 18:00:47 +01:00
extent_map.h btrfs: get the next extent map during fiemap/lseek more efficiently 2022-12-05 18:00:38 +01:00
extent-io-tree.c btrfs: move the printk helpers out of ctree.h 2022-12-05 18:00:41 +01:00
extent-io-tree.h btrfs: remove unused unlock_extent_atomic 2022-12-05 18:00:41 +01:00
extent-tree.c btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
extent-tree.h btrfs: move the snapshot drop related prototypes to extent-tree.h 2022-12-05 18:00:46 +01:00
file-item.c btrfs: remove new_inline argument from btrfs_extent_item_to_extent_map() 2022-12-05 18:00:48 +01:00
file-item.h btrfs: remove new_inline argument from btrfs_extent_item_to_extent_map() 2022-12-05 18:00:48 +01:00
file.c btrfs: update stale comment for nowait direct IO writes 2022-12-05 18:00:48 +01:00
file.h btrfs: move file prototypes to file.h 2022-12-05 18:00:46 +01:00
free-space-cache.c btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
free-space-cache.h btrfs: convert discard stat defs to enum 2022-12-05 18:00:45 +01:00
free-space-tree.c btrfs: move root tree prototypes to their own header 2022-12-05 18:00:44 +01:00
free-space-tree.h
fs.c btrfs: move accessor helpers into accessors.h 2022-12-05 18:00:42 +01:00
fs.h btrfs: remove the unused endio_raid56_workers and btrfs_raid_bio::end_io_work 2022-12-05 18:00:49 +01:00
inode-item.c btrfs: move file-item prototypes into their own header 2022-12-05 18:00:46 +01:00
inode-item.h btrfs: use struct fscrypt_str instead of struct qstr 2022-12-05 18:00:43 +01:00
inode.c btrfs: extract the inline extent read code into its own function 2022-12-05 18:00:48 +01:00
ioctl.c btrfs: move super prototypes into super.h 2022-12-05 18:00:47 +01:00
ioctl.h btrfs: move ioctl prototypes into ioctl.h 2022-12-05 18:00:46 +01:00
Kconfig btrfs: use generic Kconfig option for 256kB page size limit 2022-01-20 08:52:55 +02:00
locking.c btrfs: move accessor helpers into accessors.h 2022-12-05 18:00:42 +01:00
locking.h btrfs: move the lockdep helpers into locking.h 2022-12-05 18:00:44 +01:00
lzo.c btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
Makefile btrfs: rename tree-defrag.c to defrag.c 2022-12-05 18:00:45 +01:00
messages.c btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
messages.h btrfs: move the 32bit warn defines into messages.h 2022-12-05 18:00:46 +01:00
misc.h btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
ordered-data.c btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
ordered-data.h btrfs: use cached_state for btrfs_check_nocow_lock 2022-12-05 18:00:36 +01:00
orphan.c btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
orphan.h btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
print-tree.c btrfs: extend btrfs_dir_item type to store encryption status 2022-12-05 18:00:43 +01:00
print-tree.h btrfs: print the actual offset in btrfs_root_name 2021-01-07 17:25:05 +01:00
props.c btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
props.h btrfs: make module init/exit match their sequence 2022-12-05 18:00:40 +01:00
qgroup.c btrfs: use a structure to pass arguments to backref walking functions 2022-12-05 18:00:50 +01:00
qgroup.h btrfs: sink gfp_t parameter to btrfs_qgroup_trace_extent 2022-12-05 18:00:43 +01:00
raid56.c btrfs: raid56: switch scrub path to use a single function 2022-12-05 18:00:49 +01:00
raid56.h btrfs: remove the unused endio_raid56_workers and btrfs_raid_bio::end_io_work 2022-12-05 18:00:49 +01:00
rcu-string.h
ref-verify.c btrfs: move accessor helpers into accessors.h 2022-12-05 18:00:42 +01:00
ref-verify.h
reflink.c btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
reflink.h
relocation.c btrfs: use a structure to pass arguments to backref walking functions 2022-12-05 18:00:50 +01:00
relocation.h btrfs: move relocation prototypes into relocation.h 2022-12-05 18:00:47 +01:00
root-tree.c btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
root-tree.h btrfs: move root tree prototypes to their own header 2022-12-05 18:00:44 +01:00
scrub.c btrfs: use a structure to pass arguments to backref walking functions 2022-12-05 18:00:50 +01:00
scrub.h btrfs: move scrub prototypes into scrub.h 2022-12-05 18:00:47 +01:00
send.c btrfs: send: skip resolution of our own backref when finding clone source 2022-12-05 18:00:50 +01:00
send.h btrfs: send add define for v2 buffer size 2022-12-05 18:00:41 +01:00
space-info.c btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
space-info.h btrfs: move btrfs_account_ro_block_groups_free_space into space-info.c 2022-12-05 18:00:44 +01:00
subpage.c btrfs: move the printk helpers out of ctree.h 2022-12-05 18:00:41 +01:00
subpage.h btrfs: make nodesize >= PAGE_SIZE case to reuse the non-subpage routine 2022-05-16 17:03:11 +02:00
super.c btrfs: move super prototypes into super.h 2022-12-05 18:00:47 +01:00
super.h btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
sysfs.c btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
sysfs.h btrfs: split and refactor btrfs_sysfs_remove_devices_dir 2020-10-07 12:12:21 +02:00
transaction.c btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
transaction.h btrfs: remove fs_info::pending_changes and related code 2022-12-05 18:00:42 +01:00
tree-checker.c btrfs: extend btrfs_dir_item type to store encryption status 2022-12-05 18:00:43 +01:00
tree-checker.h btrfs: tree-checker: check extent buffer owner against owner rootid 2022-05-16 17:03:09 +02:00
tree-log.c btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
tree-log.h btrfs: use struct fscrypt_str instead of struct qstr 2022-12-05 18:00:43 +01:00
tree-mod-log.c btrfs: move accessor helpers into accessors.h 2022-12-05 18:00:42 +01:00
tree-mod-log.h btrfs: fix SPDX comment in tree-mod-log.h 2022-12-05 18:00:48 +01:00
ulist.c btrfs: constify ulist parameter of ulist_next() 2022-12-05 18:00:50 +01:00
ulist.h btrfs: constify ulist parameter of ulist_next() 2022-12-05 18:00:50 +01:00
uuid-tree.c btrfs: move uuid tree prototypes to uuid-tree.h 2022-12-05 18:00:46 +01:00
uuid-tree.h btrfs: move uuid tree prototypes to uuid-tree.h 2022-12-05 18:00:46 +01:00
verity.c btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
verity.h btrfs: move verity prototypes into verity.h 2022-12-05 18:00:47 +01:00
volumes.c btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
volumes.h btrfs: move btrfs_chunk_item_size out of ctree.h 2022-12-05 18:00:45 +01:00
xattr.c btrfs: move dir-item prototypes into dir-item.h 2022-12-05 18:00:46 +01:00
xattr.h
zlib.c btrfs: zlib: replace kmap() with kmap_local_page() in zlib_decompress_bio() 2022-07-25 17:45:41 +02:00
zoned.c btrfs: update function comments 2022-12-05 18:00:45 +01:00
zoned.h btrfs: move the printk helpers out of ctree.h 2022-12-05 18:00:41 +01:00
zstd.c btrfs: update function comments 2022-12-05 18:00:45 +01:00