linux/fs
Johannes Thumshirn 1ec17ef591 btrfs: zoned: fix use-after-free in do_zone_finish()
Shinichiro reported the following use-after-free triggered by the device
replace operation in fstests btrfs/070.

 BTRFS info (device nullb1): scrub: finished on devid 1 with status: 0
 ==================================================================
 BUG: KASAN: slab-use-after-free in do_zone_finish+0x91a/0xb90 [btrfs]
 Read of size 8 at addr ffff8881543c8060 by task btrfs-cleaner/3494007

 CPU: 0 PID: 3494007 Comm: btrfs-cleaner Tainted: G        W          6.8.0-rc5-kts #1
 Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.3 02/21/2020
 Call Trace:
  <TASK>
  dump_stack_lvl+0x5b/0x90
  print_report+0xcf/0x670
  ? __virt_addr_valid+0x200/0x3e0
  kasan_report+0xd8/0x110
  ? do_zone_finish+0x91a/0xb90 [btrfs]
  ? do_zone_finish+0x91a/0xb90 [btrfs]
  do_zone_finish+0x91a/0xb90 [btrfs]
  btrfs_delete_unused_bgs+0x5e1/0x1750 [btrfs]
  ? __pfx_btrfs_delete_unused_bgs+0x10/0x10 [btrfs]
  ? btrfs_put_root+0x2d/0x220 [btrfs]
  ? btrfs_clean_one_deleted_snapshot+0x299/0x430 [btrfs]
  cleaner_kthread+0x21e/0x380 [btrfs]
  ? __pfx_cleaner_kthread+0x10/0x10 [btrfs]
  kthread+0x2e3/0x3c0
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x31/0x70
  ? __pfx_kthread+0x10/0x10
  ret_from_fork_asm+0x1b/0x30
  </TASK>

 Allocated by task 3493983:
  kasan_save_stack+0x33/0x60
  kasan_save_track+0x14/0x30
  __kasan_kmalloc+0xaa/0xb0
  btrfs_alloc_device+0xb3/0x4e0 [btrfs]
  device_list_add.constprop.0+0x993/0x1630 [btrfs]
  btrfs_scan_one_device+0x219/0x3d0 [btrfs]
  btrfs_control_ioctl+0x26e/0x310 [btrfs]
  __x64_sys_ioctl+0x134/0x1b0
  do_syscall_64+0x99/0x190
  entry_SYSCALL_64_after_hwframe+0x6e/0x76

 Freed by task 3494056:
  kasan_save_stack+0x33/0x60
  kasan_save_track+0x14/0x30
  kasan_save_free_info+0x3f/0x60
  poison_slab_object+0x102/0x170
  __kasan_slab_free+0x32/0x70
  kfree+0x11b/0x320
  btrfs_rm_dev_replace_free_srcdev+0xca/0x280 [btrfs]
  btrfs_dev_replace_finishing+0xd7e/0x14f0 [btrfs]
  btrfs_dev_replace_by_ioctl+0x1286/0x25a0 [btrfs]
  btrfs_ioctl+0xb27/0x57d0 [btrfs]
  __x64_sys_ioctl+0x134/0x1b0
  do_syscall_64+0x99/0x190
  entry_SYSCALL_64_after_hwframe+0x6e/0x76

 The buggy address belongs to the object at ffff8881543c8000
  which belongs to the cache kmalloc-1k of size 1024
 The buggy address is located 96 bytes inside of
  freed 1024-byte region [ffff8881543c8000, ffff8881543c8400)

 The buggy address belongs to the physical page:
 page:00000000fe2c1285 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1543c8
 head:00000000fe2c1285 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
 flags: 0x17ffffc0000840(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
 page_type: 0xffffffff()
 raw: 0017ffffc0000840 ffff888100042dc0 ffffea0019e8f200 dead000000000002
 raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
 page dumped because: kasan: bad access detected

 Memory state around the buggy address:
  ffff8881543c7f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ffff8881543c7f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 >ffff8881543c8000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                        ^
  ffff8881543c8080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8881543c8100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

This UAF happens because we're accessing stale zone information of an
already removed btrfs_device in do_zone_finish().

The sequence of events is as follows:

btrfs_dev_replace_start
  btrfs_scrub_dev
   btrfs_dev_replace_finishing
    btrfs_dev_replace_update_device_in_mapping_tree <-- devices replaced
    btrfs_rm_dev_replace_free_srcdev
     btrfs_free_device                              <-- device freed

cleaner_kthread
 btrfs_delete_unused_bgs
  btrfs_zone_finish
   do_zone_finish              <-- refers to the freed device

The reason for this is that we're using a cached pointer to the chunk_map
from the block group, but on device replace this cached pointer can
contain stale device entries.

The staleness comes from the fact that btrfs_block_group::physical_map is
not a pointer to a btrfs_chunk_map but a memory copy of it.
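
To make the staleness concrete, here is a minimal userspace sketch of the
copy-versus-live-map problem described above. It is not kernel code: every
demo_* name is a hypothetical stand-in, the map has a single stripe, and the
"replace" step only models the pointer switch in the live mapping tree.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the btrfs structures. */
struct demo_device {
	int devid;
};

struct demo_chunk_map {
	struct demo_device *stripe_dev;	/* one stripe only, for brevity */
};

struct demo_block_group {
	struct demo_chunk_map *physical_map;	/* private copy, not the live map */
};

/* Models caching the map in the block group by duplicating its contents. */
static int demo_cache_map(struct demo_block_group *bg,
			  const struct demo_chunk_map *live)
{
	bg->physical_map = malloc(sizeof(*live));
	if (!bg->physical_map)
		return -1;
	memcpy(bg->physical_map, live, sizeof(*live));
	return 0;
}

/* Models device replace: only the live map is switched to the target. */
static void demo_replace_device(struct demo_chunk_map *live,
				struct demo_device *tgt)
{
	live->stripe_dev = tgt;
}

/* Returns 1 if the cached copy names a device the live map no longer uses. */
static int demo_copy_is_stale(const struct demo_block_group *bg,
			      const struct demo_chunk_map *live)
{
	return bg->physical_map->stripe_dev != live->stripe_dev;
}
```

After demo_replace_device() runs, the copy held by the block group still
points at the source device. In the real scenario that device is freed
next, so any later dereference through the copy is a use-after-free.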

Also take the fs_info::dev_replace::rwsem to prevent
btrfs_dev_replace_update_device_in_mapping_tree() from changing the device
underneath us again.

Note: btrfs_dev_replace_update_device_in_mapping_tree() is holding
fs_info::mapping_tree_lock, but as this is a spinning read/write lock we
cannot hold it across the call to blkdev_zone_mgmt(), which requires a
memory allocation that may sleep.
However, btrfs_dev_replace_update_device_in_mapping_tree() is always called
with the fs_info::dev_replace::rwsem held in write mode.
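
The exclusion this relies on can be illustrated with a toy, single-threaded
model of a reader/writer semaphore. This is only a sketch under loud
assumptions: the demo_* types and functions are hypothetical, the model is
not thread-safe, and it uses trylock-style calls instead of blocking the
way a real rw_semaphore would.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a reader/writer semaphore; not thread-safe. */
struct demo_rwsem {
	int readers;	/* number of holders in read mode */
	bool writer;	/* true while held in write mode */
};

struct demo_fs {
	struct demo_rwsem dev_replace_rwsem;	/* stand-in for dev_replace::rwsem */
	int stripe_devid;			/* device currently named by the map */
};

static bool demo_down_read_trylock(struct demo_rwsem *sem)
{
	if (sem->writer)
		return false;
	sem->readers++;
	return true;
}

static void demo_up_read(struct demo_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

static bool demo_down_write_trylock(struct demo_rwsem *sem)
{
	if (sem->writer || sem->readers > 0)
		return false;
	sem->writer = true;
	return true;
}

static void demo_up_write(struct demo_rwsem *sem)
{
	assert(sem->writer);
	sem->writer = false;
}

/* Replace side: the device may only be switched while holding write mode. */
static bool demo_try_replace_device(struct demo_fs *fs, int new_devid)
{
	if (!demo_down_write_trylock(&fs->dev_replace_rwsem))
		return false;
	fs->stripe_devid = new_devid;
	demo_up_write(&fs->dev_replace_rwsem);
	return true;
}
```

While a caller holds the semaphore in read mode, demo_try_replace_device()
fails and the device stays unchanged. Once the reader drops it, the replace
goes through. A sleeping lock also lets the reader perform allocations that
may sleep while holding it, which a spinning lock would not allow.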

Many thanks to Shinichiro for analyzing the bug.

Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
CC: stable@vger.kernel.org # 6.8
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-26 16:41:01 +01:00
9p 9p: Use length of data written to the server in preference to error 2024-01-04 13:15:31 +00:00
adfs adfs: remove writepage implementation 2023-12-29 11:58:33 -08:00
affs affs: free affs_sb_info with kfree_rcu() 2024-02-25 02:10:31 -05:00
afs afs: Fix endless loop in directory parsing 2024-02-27 11:20:43 +01:00
autofs dcache stuff for this cycle 2024-01-11 20:11:35 -08:00
bcachefs bcachefs: fix bch2_save_backtrace() 2024-02-25 15:45:36 -05:00
befs befs: d_obtain_alias(ERR_PTR(...)) will do the right thing 2023-12-21 12:51:02 -05:00
bfs misc cleanups (the part that hadn't been picked by individual fs trees) 2024-01-11 20:23:50 -08:00
btrfs btrfs: zoned: fix use-after-free in do_zone_finish() 2024-03-26 16:41:01 +01:00
cachefiles cachefiles: fix memory leak in cachefiles_add_cache() 2024-02-20 09:46:07 +01:00
ceph ceph: switch to corrected encoding of max_xattr_size in mdsmap 2024-02-26 19:20:30 +01:00
coda dcache stuff for this cycle 2024-01-11 20:11:35 -08:00
configfs
cramfs vfs-6.7.ctime 2023-10-30 09:47:13 -10:00
crypto fscrypt: document that CephFS supports fscrypt now 2023-12-26 22:55:42 -06:00
debugfs Merge branches 'acpi-pm', 'acpi-video', 'acpi-apei' and 'acpi-extlog' 2024-01-04 13:19:40 +01:00
devpts fs: Remove the now superfluous sentinel elements from ctl_table array 2023-12-28 04:57:57 -08:00
dlm dlm: update format header reflect current format 2023-12-20 15:36:48 -06:00
ecryptfs fix directory locking scheme on rename 2024-01-11 20:00:22 -08:00
efivarfs efivarfs: Drop 'duplicates' bool parameter on efivar_init() 2024-02-25 09:43:39 +01:00
efs vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
erofs Change since last update: 2024-02-25 09:53:13 -08:00
exfat Description for this pull request: 2024-03-01 12:22:30 -08:00
exportfs fs: fix build error with CONFIG_EXPORTFS=m or not defined 2023-10-28 16:16:19 +02:00
ext2 fix directory locking scheme on rename 2024-01-11 20:00:22 -08:00
ext4 We still have some races in filesystem methods when exposed to RCU 2024-02-25 09:29:05 -08:00
f2fs f2fs: fix double free of f2fs_sb_info 2024-01-12 18:55:09 -08:00
fat vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
freevxfs freevxfs: lookup: fix function params kernel-doc 2023-12-20 15:02:58 -08:00
fuse fuse: fix UAF in rcu pathwalks 2024-02-25 02:10:32 -05:00
gfs2 Revert "gfs2: Use GL_NOBLOCK flag for non-blocking lookups" 2024-02-02 17:21:44 +01:00
hfs hfs: really remove hfs_writepage 2023-12-29 11:58:34 -08:00
hfsplus hfsplus: switch to rcu-delayed unloading of nls and freeing ->s_fs_info 2024-02-25 02:10:31 -05:00
hostfs hostfs: use d_splice_alias() calling conventions to simplify failure exits 2023-12-21 12:51:00 -05:00
hpfs
hugetlbfs fs,hugetlb: fix NULL pointer dereference in hugetlbs_fill_super 2024-02-07 21:20:36 -08:00
iomap mm: add folio_fill_tail() and use it in iomap 2023-12-10 16:51:36 -08:00
isofs
jbd2 jbd2: abort journal when detecting metadata writeback error of fs dev 2024-01-04 23:42:21 -05:00
jffs2 jffs2: mark __jffs2_dbg_superblock_counts() static 2023-12-10 17:21:43 -08:00
jfs Revert "jfs: fix shift-out-of-bounds in dbJoin" 2024-01-29 08:45:10 -06:00
kernfs Revert "kernfs: convert kernfs_idr_lock to an irq safe raw spinlock" 2024-01-11 11:51:27 +01:00
lockd sysctl-6.8-rc1 2024-01-10 17:44:36 -08:00
minix minixfs kmap_local_page() switchover and related fixes - very similar to sysv series. 2024-01-11 19:54:18 -08:00
netfs netfs: Fix missing zero-length check in unbuffered write 2024-01-29 14:53:21 +01:00
nfs nfs: fix UAF on pathwalk running into umount 2024-02-25 02:10:32 -05:00
nfs_common
nfsd nfsd-6.8 fixes: 2024-02-07 17:48:15 +00:00
nilfs2 nilfs2: fix potential bug in end_buffer_async_write 2024-02-07 21:20:37 -08:00
nls
notify dcache stuff for this cycle 2024-01-11 20:11:35 -08:00
ntfs sysctl-6.8-rc1 2024-01-10 17:44:36 -08:00
ntfs3 fs/ntfs3: fix build without CONFIG_NTFS3_LZX_XPRESS 2024-02-26 09:32:23 -08:00
ocfs2 misc cleanups (the part that hadn't been picked by individual fs trees) 2024-01-11 20:23:50 -08:00
omfs
openpromfs
orangefs orangefs: saner arguments passing in readdir guts 2023-12-21 12:53:36 -05:00
overlayfs vfs-6.8-rc5.fixes 2024-02-12 07:15:45 -08:00
proc We still have some races in filesystem methods when exposed to RCU 2024-02-25 09:29:05 -08:00
pstore pstore: inode: Use cleanup.h for struct pstore_private 2023-12-08 14:15:44 -08:00
qnx4 qnx4: Use get_directory_fname() in qnx4_match() 2023-12-13 11:19:18 -08:00
qnx6
quota sysctl-6.8-rc1 2024-01-10 17:44:36 -08:00
ramfs mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
reiserfs misc cleanups (the part that hadn't been picked by individual fs trees) 2024-01-11 20:23:50 -08:00
romfs vfs-6.7.ctime 2023-10-30 09:47:13 -10:00
smb We still have some races in filesystem methods when exposed to RCU 2024-02-25 09:29:05 -08:00
squashfs Squashfs: fix variable overflow triggered by sysbot 2023-12-10 17:21:26 -08:00
sysfs fs/sysfs/dir.c : Fix typo in comment 2023-12-07 11:35:23 +09:00
sysv sysv: remove writepage implementation 2023-12-29 11:58:35 -08:00
tracefs eventfs: Keep all directory links at 1 2024-02-01 11:53:53 -05:00
ubifs ubifs: fix kernel-doc warnings 2024-01-06 23:49:50 +01:00
udf misc cleanups (the part that hadn't been picked by individual fs trees) 2024-01-11 20:23:50 -08:00
ufs Many singleton patches against the MM code. The patch series which 2024-01-09 11:18:47 -08:00
unicode
vboxsf fs: vboxsf: fix a kernel-doc warning 2023-12-08 15:32:31 -07:00
verity Networking changes for 6.8. 2024-01-11 10:07:29 -08:00
xfs xfs: drop experimental warning for FSDAX 2024-02-27 09:53:30 +05:30
zonefs zonefs: Improve error handling 2024-02-16 10:20:35 +09:00
aio.c fs/aio: Make io_cancel() generate completions again 2024-02-27 11:20:44 +01:00
anon_inodes.c Merge branch 'kvm-guestmemfd' into HEAD 2023-11-14 08:31:31 -05:00
attr.c fs: fix doc comment typo fs tree wide 2023-12-21 13:17:54 +01:00
backing-file.c fs: factor out backing_file_mmap() helper 2023-12-23 16:35:09 +02:00
bad_inode.c
binfmt_elf_fdpic.c execve updates for v6.7-rc1 2023-10-30 19:28:19 -10:00
binfmt_elf_test.c
binfmt_elf.c
binfmt_flat.c
binfmt_misc.c execve updates for v6.7-rc1 2023-10-30 19:28:19 -10:00
binfmt_script.c
buffer.c Many singleton patches against the MM code. The patch series which 2024-01-09 11:18:47 -08:00
char_dev.c As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
compat_binfmt_elf.c
coredump.c fs: Remove the now superfluous sentinel elements from ctl_table array 2023-12-28 04:57:57 -08:00
d_path.c
dax.c fs : Fix warning using plain integer as NULL 2023-11-18 15:00:01 +01:00
dcache.c Revert "get rid of DCACHE_GENOCIDE" 2024-02-09 23:31:16 -05:00
direct-io.c fs : Fix warning using plain integer as NULL 2023-11-18 15:00:01 +01:00
drop_caches.c
eventfd.c eventfd: Remove usage of the deprecated ida_simple_xx() API 2023-12-12 14:24:55 +01:00
eventpoll.c fs: Remove the now superfluous sentinel elements from ctl_table array 2023-12-28 04:57:57 -08:00
exec.c execve fixes for v6.8-rc2 2024-01-24 13:32:29 -08:00
fcntl.c
fhandle.c exportfs: add helpers to check if filesystem can encode/decode file handles 2023-10-24 17:57:45 +02:00
file_table.c dcache stuff for this cycle 2024-01-11 20:11:35 -08:00
file.c file: remove __receive_fd() 2023-12-12 14:24:14 +01:00
filesystems.c
fs_context.c
fs_parser.c
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c netfs: Move pinning-for-writeback from fscache to netfs 2023-12-24 15:08:49 +00:00
fsopen.c
init.c
inode.c fix directory locking scheme on rename 2024-01-11 20:00:22 -08:00
internal.h dcache stuff for this cycle 2024-01-11 20:11:35 -08:00
ioctl.c lsm: new security_file_ioctl_compat() hook 2023-12-24 15:48:03 -05:00
Kconfig vfs-6.8.netfs 2024-01-19 09:10:23 -08:00
Kconfig.binfmt
kernel_read_file.c
libfs.c dcache stuff for this cycle 2024-01-11 20:11:35 -08:00
locks.c fs: Remove the now superfluous sentinel elements from ctl_table array 2023-12-28 04:57:57 -08:00
Makefile vfs-6.8.netfs 2024-01-19 09:10:23 -08:00
mbcache.c
mnt_idmapping.c mnt_idmapping: decouple from namespaces 2023-11-28 14:08:47 +01:00
mount.h mounts: keep list of mounts in an rbtree 2023-11-18 14:56:16 +01:00
mpage.c fs: convert block_write_full_page to block_write_full_folio 2023-12-29 11:58:35 -08:00
namei.c rcu pathwalk: prevent bogus hard errors from may_lookup() 2024-02-25 02:10:31 -05:00
namespace.c fs: relax mount_setattr() permission checks 2024-02-07 21:16:29 +01:00
nsfs.c nsfs: use d_make_root() 2023-11-25 02:49:43 -05:00
open.c vfs-6.8.rw 2024-01-08 11:11:51 -08:00
pipe.c sysctl-6.8-rc1 2024-01-10 17:44:36 -08:00
pnode.c mounts: keep list of mounts in an rbtree 2023-11-18 14:56:16 +01:00
pnode.h
posix_acl.c fs: fix doc comment typo fs tree wide 2023-12-21 13:17:54 +01:00
proc_namespace.c namespace: extract show_path() helper 2023-11-18 14:56:16 +01:00
read_write.c fsnotify: optionally pass access range in file permission hooks 2023-12-12 16:20:02 +01:00
readdir.c fsnotify: optionally pass access range in file permission hooks 2023-12-12 16:20:02 +01:00
remap_range.c remap_range: merge do_clone_file_range() into vfs_clone_file_range() 2024-02-06 17:07:21 +01:00
select.c
seq_file.c
signalfd.c
splice.c fs: use splice_copy_file_range() inline helper 2023-12-12 16:20:02 +01:00
stack.c
stat.c vfs-6.8.mount 2024-01-08 10:57:34 -08:00
statfs.c
super.c fs/super.c: don't drop ->s_user_ns until we free struct super_block itself 2024-02-25 02:10:31 -05:00
sync.c
sysctls.c fs: Remove the now superfluous sentinel elements from ctl_table array 2023-12-28 04:57:57 -08:00
timerfd.c
userfaultfd.c Generic: 2024-01-17 13:03:37 -08:00
utimes.c
xattr.c