linux/fs
Filipe Manana c91ea4bfa6 btrfs: make extent state merges more efficient during insertions
When inserting a new extent state record into an io tree that happens to
be mergeable, we currently do the following:

1) Insert the extent state record in the io tree's rbtree. This requires
   going down the tree to find where to insert it, and during the
   insertion we often need to balance the rbtree;

2) We then check if the previous node is mergeable, so we call rb_prev()
   to find it, which requires some looping to find the previous node;

3) If the previous node is mergeable, we adjust our node to include the
   range of the previous node and then delete the previous node from the
   rbtree, which again may need to balance the rbtree;

4) Then we check if the next node is mergeable with the node we inserted,
   so we call rb_next(), which requires some looping too. If the next node
   is indeed mergeable, we expand the range of our node to include the
   next node's range and then delete the next node from the rbtree, which
   again may need to balance the tree.

So these are quite of lot of iterations and looping over the rbtree, and
some of the operations may need to rebalance the rb tree. This can be made
a bit more efficient by:

1) When iterating the rbtree, once we find a node that is mergeable with
   the node we want to insert, we can just adjust that node's range with
   the range of the node to insert - this avoids continuing iterating
   over the tree and deleting a node from the rbtree;

2) If we expand the range of a mergeable node, then we find the next or
   the previous node, depending on other we merged a range to the right or
   to the left of the node we are currently at during the iteration. This
   merging is as before, we find the next or previous node with rb_next()
   or rb_prev() and if that other node is mergeable with the current one,
   we adjust the range of the current node and remove the other node from
   the rbtree;

3) Whenever we need to insert the new extent state record it's because
   we don't have any extent state record in the rbtree which can be
   merged, so we can remove the call to merge_state() after the insertion,
   saving rb_next() and rb_prev() calls, which require some looping.

So update the insertion function insert_state() to have this behaviour.

Running dbench for 120 seconds and capturing the execution times of
set_extent_bit() at pin_down_extent(), resulted in the following data
(time values are in nanoseconds):

Before this change:

  Count: 2278299
  Range:  0.000 - 4003728.000; Mean: 713.436; Median: 612.000; Stddev: 3606.952
  Percentiles:  90th: 1187.000; 95th: 1350.000; 99th: 1724.000
       0.000 -       7.534:       5 |
       7.534 -      35.418:      36 |
      35.418 -     154.403:     273 |
     154.403 -     662.138: 1244016 #####################################################
     662.138 -    2828.745: 1031335 ############################################
    2828.745 -   12074.102:    1395 |
   12074.102 -   51525.930:     806 |
   51525.930 -  219874.955:     162 |
  219874.955 -  938254.688:      22 |
  938254.688 - 4003728.000:       3 |

After this change:

  Count: 2275862
  Range:  0.000 - 1605175.000; Mean: 678.903; Median: 590.000; Stddev: 2149.785
  Percentiles:  90th: 1105.000; 95th: 1245.000; 99th: 1590.000
       0.000 -      10.219:      10 |
      10.219 -      40.957:      36 |
      40.957 -     155.907:     262 |
     155.907 -     585.789: 1127214 ####################################################
     585.789 -    2193.431: 1145134 #####################################################
    2193.431 -    8205.578:    1648 |
    8205.578 -   30689.378:    1039 |
   30689.378 -  114772.699:     362 |
  114772.699 -  429221.537:      52 |
  429221.537 - 1605175.000:      10 |

Maximum duration (range), average duration, percentiles and standard
deviation are all better.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-12 16:44:14 +02:00
..
9p - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list") 2023-08-29 14:25:26 -07:00
adfs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
affs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
afs - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list") 2023-08-29 14:25:26 -07:00
autofs v6.6-vfs.autofs 2023-08-28 11:39:14 -07:00
befs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
bfs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
btrfs btrfs: make extent state merges more efficient during insertions 2023-10-12 16:44:14 +02:00
cachefiles - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list") 2023-08-29 14:25:26 -07:00
ceph ceph: remove unnecessary check for NULL in parse_longname() 2023-09-18 12:04:50 +02:00
coda v6.6-vfs.ctime 2023-08-28 09:31:32 -07:00
configfs configfs: convert to ctime accessor functions 2023-07-13 10:28:05 +02:00
cramfs v6.6-vfs.super 2023-08-28 11:04:18 -07:00
crypto
debugfs Char/Misc driver changes for 6.6-rc1 2023-09-01 09:53:54 -07:00
devpts v6.6-vfs.misc 2023-08-28 10:17:14 -07:00
dlm dlm: fix plock lookup when using multiple lockspaces 2023-08-25 10:31:39 -05:00
ecryptfs v6.6-vfs.misc 2023-08-28 10:17:14 -07:00
efivarfs efivarfs: fix statfs() on efivarfs 2023-09-11 09:10:02 +00:00
efs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
erofs erofs: allow empty device tags in flatdev mode 2023-09-19 08:39:02 +08:00
exfat for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
exportfs exportfs: remove kernel-doc warnings in exportfs 2023-08-29 17:45:22 -04:00
ext2 \n 2023-08-30 12:10:50 -07:00
ext4 Revert "ext4: switch to multigrain timestamps" 2023-09-20 18:05:31 +02:00
f2fs f2fs update for 6.6-rc1 2023-09-02 15:37:59 -07:00
fat for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
freevxfs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
fscache
fuse fuse update for 6.6 2023-09-05 12:45:55 -07:00
gfs2 gfs2: Fix quota=quiet oversight 2023-09-18 16:26:24 +02:00
hfs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
hfsplus for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
hostfs hostfs: convert to ctime accessor functions 2023-07-24 10:30:00 +02:00
hpfs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
hugetlbfs - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list") 2023-08-29 14:25:26 -07:00
iomap iomap: Spelling s/preceeding/preceding/g 2023-09-28 09:26:58 -07:00
isofs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
jbd2 Regression and bug fixes for ext4. 2023-09-17 10:33:53 -07:00
jffs2 jffs2: convert to ctime accessor functions 2023-07-24 10:30:01 +02:00
jfs A few small fixes 2023-08-31 15:25:01 -07:00
kernfs Driver core changes for 6.6-rc1 2023-09-01 09:43:18 -07:00
lockd SUNRPC: Add enum svc_auth_status 2023-08-29 17:45:22 -04:00
minix for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
netfs netfs: Only call folio_start_fscache() one time for each folio 2023-09-18 12:03:46 -07:00
nfs nfs: decrement nrequests counter before releasing the req 2023-09-28 12:22:25 -04:00
nfs_common
nfsd nfsd-6.6 fixes: 2023-09-30 09:44:48 -07:00
nilfs2 nilfs2: fix potential use after free in nilfs_gccache_submit_read_data() 2023-09-29 17:20:46 -07:00
nls nls: Hide new NLS_UCS2_UTILS 2023-08-31 12:07:34 -05:00
notify dnotify: Pass argument of fcntl_dirnotify as int 2023-07-10 14:36:12 +02:00
ntfs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
ntfs3 ntfs3: put resources during ntfs_fill_super() 2023-09-25 14:12:42 +02:00
ocfs2 Many ext4 and jbd2 cleanups and bug fixes for v6.6-rc1. 2023-08-31 15:18:15 -07:00
omfs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
openpromfs openpromfs: convert to ctime accessor functions 2023-07-24 10:30:03 +02:00
orangefs fs: drop the timespec64 argument from update_time 2023-08-11 09:04:57 +02:00
overlayfs ovl: fix NULL pointer defer when encoding non-decodable lower fid 2023-10-03 09:24:11 +03:00
proc proc: nommu: fix empty /proc/<pid>/maps 2023-09-19 13:21:34 -07:00
pstore pstore fix for v6.6-rc1 2023-09-02 10:45:17 -07:00
qnx4 for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
qnx6 for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
quota quota: Fix slow quotaoff 2023-10-06 11:01:23 +02:00
ramfs ramfs: convert to ctime accessor functions 2023-07-24 10:30:04 +02:00
reiserfs reiserfs: Replace 1-element array with C99 style flex-array 2023-09-11 14:07:46 +02:00
romfs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
smb Six SMB3 server fixes 2023-10-08 10:10:52 -07:00
squashfs squashfs: convert to ctime accessor functions 2023-07-24 10:30:05 +02:00
sysfs
sysv for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
tracefs eventfs: Test for dentries array allocated in eventfs_release() 2023-09-30 16:26:04 -04:00
ubifs fs: drop the timespec64 argument from update_time 2023-08-11 09:04:57 +02:00
udf \n 2023-08-30 12:10:50 -07:00
ufs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
unicode
vboxsf v6.6-vfs.ctime 2023-08-28 09:31:32 -07:00
verity fsverity: skip PKCS#7 parser when keyring is empty 2023-08-20 10:33:43 -07:00
xfs xfs: abort fstrim if kernel is suspending 2023-10-04 09:25:04 +11:00
zonefs New code for 6.6: 2023-08-28 11:59:52 -07:00
aio.c aio: Annotate struct kioctx_table with __counted_by 2023-09-20 14:22:01 +02:00
anon_inodes.c
attr.c v6.6-vfs.misc 2023-08-28 10:17:14 -07:00
bad_inode.c fs: drop the timespec64 argument from update_time 2023-08-11 09:04:57 +02:00
binfmt_elf_fdpic.c fs: binfmt_elf_efpic: fix personality for ELF-FDPIC 2023-09-29 17:20:45 -07:00
binfmt_elf_test.c
binfmt_elf.c Merge branch 'expand-stack' 2023-06-28 20:35:21 -07:00
binfmt_flat.c
binfmt_misc.c fs: convert to ctime accessor functions 2023-07-13 10:28:04 +02:00
binfmt_script.c
buffer.c iomap: add a workaround for racy i_size updates on block devices 2023-09-25 08:55:00 -07:00
char_dev.c
compat_binfmt_elf.c
coredump.c v6.5/vfs.misc 2023-06-26 09:50:21 -07:00
d_path.c
dax.c mm: remove enum page_entry_size 2023-08-24 16:20:30 -07:00
dcache.c fs/dcache: Replace printk and WARN_ON by WARN 2023-08-19 13:41:11 +02:00
direct-io.c - Yosry Ahmed brought back some cgroup v1 stats in OOM logs. 2023-06-28 10:28:11 -07:00
drop_caches.c fs: drop_caches: draining pages before dropping caches 2023-08-18 10:12:11 -07:00
eventfd.c eventfd: prevent underflow for eventfd semaphores 2023-07-11 11:41:34 +02:00
eventpoll.c epoll: simplify ep_alloc() 2023-07-26 14:56:07 +02:00
exec.c - An extensive rework of kexec and crash Kconfig from Eric DeVolder 2023-08-29 14:53:51 -07:00
fcntl.c fcntl: Cast commands with int args explicitly 2023-07-10 14:36:11 +02:00
fhandle.c
file_table.c fs: use __fput_sync in close(2) 2023-08-08 19:36:51 +02:00
file.c v6.6-vfs.misc 2023-08-28 10:17:14 -07:00
filesystems.c
fs_context.c v6.6-vfs.misc 2023-08-28 10:17:14 -07:00
fs_parser.c
fs_pin.c
fs_struct.c kill do_each_thread() 2023-08-21 13:46:25 -07:00
fs_types.c
fs-writeback.c fs-writeback: do not requeue a clean inode having skipped pages 2023-09-20 14:22:01 +02:00
fsopen.c fs: add FSCONFIG_CMD_CREATE_EXCL 2023-08-14 18:48:02 +02:00
init.c
inode.c Revert "fs: add infrastructure for multigrain timestamps" 2023-09-20 18:05:31 +02:00
internal.h for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
ioctl.c v6.6-vfs.super 2023-08-28 11:04:18 -07:00
Kconfig for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
Kconfig.binfmt riscv: support the elf-fdpic binfmt loader 2023-08-23 14:17:43 -07:00
kernel_read_file.c fs: Fix kernel-doc warnings 2023-08-19 12:12:12 +02:00
libfs.c direct_write_fallback(): on error revert the ->ki_pos update from buffered write 2023-09-20 14:22:01 +02:00
locks.c NFSD 6.6 Release Notes 2023-08-31 15:32:18 -07:00
Makefile fs: add CONFIG_BUFFER_HEAD 2023-08-02 09:13:09 -06:00
mbcache.c
mnt_idmapping.c
mount.h
mpage.c
namei.c fs: Fix kernel-doc warnings 2023-08-19 12:12:12 +02:00
namespace.c v6.5/vfs.mount 2023-06-26 10:27:04 -07:00
nsfs.c fs: convert to ctime accessor functions 2023-07-13 10:28:04 +02:00
open.c v6.6-vfs.fchmodat2 2023-08-28 11:25:27 -07:00
pipe.c fs/pipe: remove duplicate "offset" initializer 2023-09-20 14:22:01 +02:00
pnode.c
pnode.h
posix_acl.c fs: convert to ctime accessor functions 2023-07-13 10:28:04 +02:00
proc_namespace.c
read_write.c fs: Fix one kernel-doc comment 2023-08-15 08:32:45 +02:00
readdir.c vfs: get rid of old '->iterate' directory operation 2023-08-06 15:08:35 +02:00
remap_range.c
select.c
seq_file.c
signalfd.c
splice.c - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list") 2023-08-29 14:25:26 -07:00
stack.c fs: convert to ctime accessor functions 2023-07-13 10:28:04 +02:00
stat.c Revert "fs: add infrastructure for multigrain timestamps" 2023-09-20 18:05:31 +02:00
statfs.c
super.c fs: export sget_dev() 2023-08-31 12:47:15 +02:00
sync.c
sysctls.c
timerfd.c
userfaultfd.c mm: userfaultfd: remove stale comment about core dump locking 2023-08-24 16:20:27 -07:00
utimes.c
xattr.c tmpfs,xattr: GFP_KERNEL_ACCOUNT for simple xattrs 2023-08-22 10:57:46 +02:00