linux

iv/linux

Filipe Manana 1cb3db1cf3 btrfs: fix deadlock with concurrent chunk allocations involving system chunks

When a task attempting to allocate a new chunk verifies that there is not currently enough free space in the system space_info and there is another task that allocated a new system chunk but it did not finish yet the creation of the respective block group, it waits for that other task to finish creating the block group. This is to avoid exhaustion of the system chunk array in the superblock, which is limited, when we have a thundering herd of tasks allocating new chunks. This problem was described and fixed by commit `eafa4fd0ad` ("btrfs: fix exhaustion of the system chunk array due to concurrent allocations").

However there are two very similar scenarios where this can lead to a deadlock:

1) Task B allocated a new system chunk and task A is waiting on task B to finish creation of the respective system block group. However before task B ends its transaction handle and finishes the creation of the system block group, it attempts to allocate another chunk (like a data chunk for an fallocate operation for a very large range). Task B will be unable to progress and allocate the new chunk, because task A set space_info->chunk_alloc to 1 and therefore it loops at btrfs_chunk_alloc() waiting for task A to finish its chunk allocation and set space_info->chunk_alloc to 0, but task A is waiting on task B to finish creation of the new system block group, therefore resulting in a deadlock;

2) Task B allocated a new system chunk and task A is waiting on task B to finish creation of the respective system block group. By the time that task B enters the final phase of block group allocation, which happens at btrfs_create_pending_block_groups(), when it modifies the extent tree, the device tree or the chunk tree to insert the items for some new block group, it needs to allocate a new chunk, so it ends up at btrfs_chunk_alloc() and keeps looping there because task A has set space_info->chunk_alloc to 1, but task A is waiting for task B to finish creation of the new system block group and release the reserved system space, therefore resulting in a deadlock.

In short, the problem is if a task B needs to allocate a new chunk after it previously allocated a new system chunk and if another task A is currently waiting for task B to complete the allocation of the new system chunk.

Unfortunately this deadlock scenario introduced by the previous fix for the system chunk array exhaustion problem does not have a simple and short fix, and requires a big change to rework the chunk allocation code so that chunk btree updates are all made in the first phase of chunk allocation.

And since this deadlock regression is being frequently hit on zoned filesystems and the system chunk array exhaustion problem is triggered in more extreme cases (originally observed on PowerPC with a node size of 64K when running the fallocate tests from stress-ng), revert the changes from that commit. The next patch in the series, with a subject of "btrfs: rework chunk allocation to avoid exhaustion of the system chunk array" does the necessary changes to fix the system chunk array exhaustion problem.

Reported-by: Naohiro Aota <naohiro.aota@wdc.com>
Link: https://lore.kernel.org/linux-btrfs/20210621015922.ewgbffxuawia7liz@naota-xeon/
Fixes: `eafa4fd0ad` ("btrfs: fix exhaustion of the system chunk array due to concurrent allocations")
CC: stable@vger.kernel.org # 5.12+
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Tested-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

2021-07-07 17:42:40 +02:00
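
The circular wait described in the commit message can be pictured as a tiny wait-for graph. The sketch below is only a toy model in Python, not kernel code: the `find_cycle` helper and the dictionaries are hypothetical names introduced for illustration, and each task is assumed to wait on at most one other task.

```python
# Toy model of the wait-for relationship from the commit message.
# Nothing here is btrfs code; all names are illustrative assumptions.

def find_cycle(waits_for):
    """Return the tasks forming a wait cycle, or None if there is none.

    `waits_for` maps a task name to the single task it is waiting on.
    """
    for start in waits_for:
        path = [start]
        seen = {start}
        node = waits_for.get(start)
        while node is not None:
            if node == start:
                return path  # walked back to the start: circular wait
            if node in seen:
                break  # joined a cycle that does not include `start`
            path.append(node)
            seen.add(node)
            node = waits_for.get(node)
    return None

# Task A set space_info->chunk_alloc to 1 and waits for task B to finish
# creating its system block group. Task B needs another chunk, so it loops
# in btrfs_chunk_alloc() waiting for task A to clear chunk_alloc.
deadlocked = {"A": "B", "B": "A"}

# Assumed picture once the waiting logic is reverted: task A no longer
# waits on task B, so B only waits for A's allocation to complete.
reverted = {"B": "A"}

cycle = find_cycle(deadlocked)
no_cycle = find_cycle(reverted)
print(cycle, no_cycle)
```

In this model the first graph contains the A to B to A cycle that corresponds to both scenarios in the message, while the second has no cycle because one of the two waits has been removed.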
9p	9p for 5.13-rc1	2021-05-07 11:18:52 -07:00
adfs	fs: make helpers idmap mount aware	2021-01-24 14:27:20 +01:00
affs	idmapped-mounts-v5.12	2021-02-23 13:39:45 -08:00
afs	afs: Re-enable freezing once a page fault is interrupted	2021-06-18 13:49:07 -07:00
autofs	autofs: should_expire() argument is guaranteed to be positive	2021-03-24 14:14:27 -04:00
befs	fs/befs: Delete obsolete TODO file	2021-03-30 16:54:49 -07:00
bfs	fs: make helpers idmap mount aware	2021-01-24 14:27:20 +01:00
btrfs	btrfs: fix deadlock with concurrent chunk allocations involving system chunks	2021-07-07 17:42:40 +02:00
cachefiles	fscache, cachefiles: Add alternate API to use kiocb for read/write to cache	2021-04-23 10:14:32 +01:00
ceph	Notable items here are a series to take advantage of David Howells'	2021-05-06 10:27:02 -07:00
cifs	cifs: change format of CIFS_FULL_KEY_DUMP ioctl	2021-05-27 15:26:32 -05:00
coda	coda: fix reference counting in coda_file_mmap error path	2021-04-23 14:42:39 -07:00
configfs	treewide: remove editor modelines and cruft	2021-05-07 00:26:34 -07:00
cramfs	cramfs: use %pD instead of messing with file_dentry()->d_name	2021-01-05 23:02:47 -05:00
crypto	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6	2021-04-26 08:51:23 -07:00
debugfs	debugfs: Fix debugfs_read_file_str()	2021-06-04 15:01:08 +02:00
devpts
dlm	fs: dlm: fix missing unlock on error in accept_from_sock()	2021-03-29 13:28:18 -05:00
ecryptfs	fs: ecryptfs: remove BUG_ON from crypt_scatterlist	2021-05-13 18:32:26 +02:00
efivarfs	efivars: convert to fileattr	2021-04-12 15:04:29 +02:00
efs	[PATCH] reduce boilerplate in fsid handling	2020-09-18 16:45:50 -04:00
erofs	erofs: fix 1 lcluster-sized pcluster for big pcluster	2021-05-13 15:58:46 +08:00
exfat	exfat: speed up iterate/lookup by fixing start point of traversing cluster chain	2021-04-27 20:45:07 +09:00
exportfs	exportfs: Add a function to return the raw output from fh_to_dentry()	2020-12-09 09:39:38 -05:00
ext2	Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2021-05-02 09:14:01 -07:00
ext4	Miscellaneous ext4 bug fixes for v5.13	2021-06-06 14:24:13 -07:00
f2fs	f2fs: return EINVAL for hole cases in swap file	2021-05-12 07:38:00 -07:00
fat	fs: fat: fix spelling typo of values	2021-05-07 00:26:34 -07:00
freevxfs
fscache	fscache, cachefiles: Add alternate API to use kiocb for read/write to cache	2021-04-23 10:14:32 +01:00
fuse	Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2021-05-02 09:14:01 -07:00
gfs2	Revert "gfs2: Fix mmap locking for write faults"	2021-06-01 23:16:42 +02:00
hfs	fs: make helpers idmap mount aware	2021-01-24 14:27:20 +01:00
hfsplus	hfsplus: prevent corruption in shrinking truncate	2021-05-14 19:41:32 -07:00
hostfs	Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2021-05-02 09:14:01 -07:00
hpfs	hpfs: replace one-element array with flexible-array member	2021-05-06 19:24:13 -07:00
hugetlbfs	mm/hugetlb: expand restore_reserve_on_error functionality	2021-06-16 09:24:42 -07:00
iomap	mm/filemap: fix readahead return types	2021-05-14 19:41:32 -07:00
isofs	isofs: fix fall-through warnings for Clang	2021-05-06 19:24:13 -07:00
jbd2	ext4: fix debug format string warning	2021-04-09 23:32:16 -04:00
jffs2	This pull request contains changes for JFFS2, UBI and UBIFS	2021-05-04 18:08:40 -07:00
jfs	jfs: convert to fileattr	2021-04-12 15:04:29 +02:00
kernfs	idmapped-mounts-v5.12	2021-02-23 13:39:45 -08:00
lockd	SUNRPC: Make trace_svc_process() display the RPC procedure symbolically	2021-01-25 09:36:23 -05:00
minix	fs: make helpers idmap mount aware	2021-01-24 14:27:20 +01:00
netfs	netfs: Make CONFIG_NETFS_SUPPORT auto-selected rather than manual	2021-05-25 13:48:04 +01:00
nfs	NFSv4: Fix second deadlock in nfs4_evict_inode()	2021-06-03 10:14:42 -04:00
nfs_common	NFSD: Add an xdr_stream-based encoder for NFSv2/3 ACLs	2021-03-22 10:19:00 -04:00
nfsd	NFS client updates for Linux 5.13	2021-05-07 11:23:41 -07:00
nilfs2	Merge branch 'akpm' (patches from Andrew)	2021-05-07 00:34:51 -07:00
nls
notify	fanotify: fix copy_event_to_user() fid error clean up	2021-06-14 12:16:37 +02:00
ntfs	ntfs: check for valid standard information attribute	2021-02-24 13:38:26 -08:00
ocfs2	ocfs2: fix data corruption by fallocate	2021-06-05 08:58:12 -07:00
omfs	fs: make helpers idmap mount aware	2021-01-24 14:27:20 +01:00
openpromfs	openpromfs: don't do unlock_new_inode() until the new inode is set up	2021-03-12 22:15:22 -05:00
orangefs	orangefs: leave files in the page cache for a few micro seconds at least	2021-04-29 08:06:05 -04:00
overlayfs	overlayfs update for 5.13	2021-04-30 15:17:08 -07:00
proc	proc: only require mm_struct for writing	2021-06-15 10:47:51 -07:00
pstore	printk changes for 5.13	2021-04-27 18:09:44 -07:00
qnx4	[PATCH] reduce boilerplate in fsid handling	2020-09-18 16:45:50 -04:00
qnx6	[PATCH] reduce boilerplate in fsid handling	2020-09-18 16:45:50 -04:00
quota	quota: Use 'hlist_for_each_entry' to simplify code	2021-05-10 16:27:49 +02:00
ramfs	ramfs: support O_TMPFILE	2021-02-24 13:38:26 -08:00
reiserfs	treewide: remove editor modelines and cruft	2021-05-07 00:26:34 -07:00
romfs	Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2020-10-24 12:26:05 -07:00
squashfs	squashfs: fix divide error in calculate_skip()	2021-05-14 19:41:32 -07:00
sysfs	sysfs: Support zapping of binary attr mmaps	2021-01-12 14:26:31 +01:00
sysv	fs: make helpers idmap mount aware	2021-01-24 14:27:20 +01:00
tracefs	tracing: Fix various typos in comments	2021-03-23 14:08:18 -04:00
ubifs	This pull request contains changes for JFFS2, UBI and UBIFS	2021-05-04 18:08:40 -07:00
udf	useful constants: struct qstr for ".."	2021-04-15 22:36:45 -04:00
ufs	useful constants: struct qstr for ".."	2021-04-15 22:36:45 -04:00
unicode	.gitignore: prefix local generated files with a slash	2021-05-02 00:43:35 +09:00
vboxsf	vboxsf: don't allow to change the inode type	2021-03-12 22:15:00 -05:00
verity	fsverity: relax build time dependency on CRYPTO_SHA256	2021-04-22 17:31:32 +10:00
xfs	xfs: bunmapi has unnecessary AG lock ordering issues	2021-05-27 08:11:24 -07:00
zonefs	\n	2021-04-29 11:06:13 -07:00
aio.c	Revert "mremap: don't allow MREMAP_DONTUNMAP on special_mappings and aio"	2021-04-30 11:20:39 -07:00
anon_inodes.c	fs: anon_inodes: rephrase to appropriate kernel-doc	2021-01-15 12:17:25 -05:00
attr.c	ima: handle idmapped mounts	2021-01-24 14:27:20 +01:00
bad_inode.c	fs: make helpers idmap mount aware	2021-01-24 14:27:20 +01:00
binfmt_aout.c
binfmt_elf_fdpic.c	coredump: don't bother with do_truncate()	2021-03-08 10:21:11 -05:00
binfmt_elf.c	coredump: don't bother with do_truncate()	2021-03-08 10:21:11 -05:00
binfmt_em86.c
binfmt_flat.c	binfmt_flat: allow not offsetting data start	2021-04-19 09:56:37 +10:00
binfmt_misc.c	binfmt_misc: fix possible deadlock in bm_register_write	2021-03-13 11:27:30 -08:00
binfmt_script.c
block_dev.c	block-5.13-2021-05-22	2021-05-22 07:40:34 -10:00
buffer.c	Merge branch 'akpm' (patches from Andrew)	2021-05-05 13:50:15 -07:00
char_dev.c
compat_binfmt_elf.c	get rid of COMPAT_ELF_EXEC_PAGESIZE	2021-01-06 08:42:51 -05:00
coredump.c	coredump: Limit what can interrupt coredumps	2021-06-10 14:02:29 -07:00
d_path.c	constify dentry argument of dentry_path()/dentry_path_raw()	2021-03-21 11:43:58 -04:00
dax.c	dax fixes for 5.13-rc2	2021-05-15 08:28:08 -07:00
dcache.c	useful constants: struct qstr for ".."	2021-04-15 22:36:45 -04:00
direct-io.c	fs: direct-io: fix missing sdio->boundary	2021-04-09 14:54:23 -07:00
drop_caches.c
eventfd.c	eventfd: Export eventfd_ctx_do_read()	2020-11-15 09:49:10 -05:00
eventpoll.c	fs/epoll: restore waking from ep_done_scan()	2021-05-06 19:24:13 -07:00
exec.c	fs: delete repeated words in comments	2021-02-24 13:38:26 -08:00
fcntl.c	idmapped-mounts-v5.12	2021-02-23 13:39:45 -08:00
fhandle.c	fs: delete repeated words in comments	2021-02-24 13:38:26 -08:00
file_table.c	epoll: take epitem list out of struct file	2020-10-25 20:02:08 -04:00
file.c	Merge branch 'work.file' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2021-05-03 11:05:28 -07:00
filesystems.c
fs_context.c
fs_parser.c	vfs: fs_parser: clean up kernel-doc warnings	2021-04-30 11:20:35 -07:00
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c	fs: improve comments for writeback_single_inode()	2021-01-13 17:26:50 +01:00
fsopen.c
init.c	init: handle idmapped mounts	2021-01-24 14:27:19 +01:00
inode.c	mm: remove nrexceptional from inode: remove BUG_ON	2021-05-05 11:27:20 -07:00
internal.h	idmapped-mounts-v5.12	2021-02-23 13:39:45 -08:00
io_uring.c	io_uring: add feature flag for rsrc tags	2021-06-10 16:33:51 -06:00
io-wq.c	io-wq: Fix UAF when wakeup wqe in hash waitqueue	2021-05-26 09:03:56 -06:00
io-wq.h	io_uring/io-wq: close io-wq full-stop gap	2021-05-25 19:39:58 -06:00
ioctl.c	vfs: add fileattr ops	2021-04-12 15:04:23 +02:00
Kconfig	NFS client updates for Linux 5.13	2021-05-07 11:23:41 -07:00
Kconfig.binfmt	binfmt_flat: allow not offsetting data start	2021-04-19 09:56:37 +10:00
kernel_read_file.c	fs/kernel_file_read: Add "offset" arg for partial reads	2020-10-05 13:37:04 +02:00
libfs.c	libfs: fix kernel-doc for mnt_userns	2021-03-23 11:20:25 +01:00
locks.c	Additional fixes and clean-ups for NFSD since tags/nfsd-5.13,	2021-05-05 13:44:19 -07:00
Makefile	netfs: Provide readahead and readpage netfs helpers	2021-04-23 10:14:32 +01:00
mbcache.c
mount.h	mount: make {lock,unlock}_mount_hash() static	2021-01-24 14:29:34 +01:00
mpage.c	block: rename BIO_MAX_PAGES to BIO_MAX_VECS	2021-03-11 07:47:48 -07:00
namei.c	fs.idmapped.helpers.v5.13	2021-04-27 12:49:42 -07:00
namespace.c	fs/mount_setattr: tighten permission checks	2021-05-12 14:13:16 +02:00
no-block.c
nsfs.c
open.c	idmapped-mounts-v5.12	2021-02-23 13:39:45 -08:00
pipe.c	fs: delete repeated words in comments	2021-02-24 13:38:26 -08:00
pnode.c
pnode.h	mount: fix mounting of detached mounts onto targets that reside on shared mounts	2021-03-08 15:18:43 +01:00
posix_acl.c	fs: make helpers idmap mount aware	2021-01-24 14:27:20 +01:00
proc_namespace.c	fs: introduce MOUNT_ATTR_IDMAP	2021-01-24 14:43:45 +01:00
read_write.c	teach sendfile(2) to handle send-to-pipe directly	2021-01-25 23:29:36 -05:00
readdir.c	readdir: make sure to verify directory entry for legacy interfaces too	2021-04-17 11:39:49 -07:00
remap_range.c	ioctl: handle idmapped mounts	2021-01-24 14:27:19 +01:00
select.c	kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data()	2021-03-16 22:13:10 +01:00
seq_file.c	seq_file: Add a seq_bprintf function	2021-04-27 15:50:15 -07:00
signalfd.c	signalfd: Remove SIL_PERF_EVENT fields from signalfd_siginfo	2021-05-18 16:20:54 -05:00
splice.c	for-5.12/block-2021-02-17	2021-02-21 11:02:48 -08:00
stack.c
stat.c	fs: fix reporting supported extra file attributes for statx()	2021-04-17 23:03:50 -04:00
statfs.c	s390,alpha: switch to 64-bit ino_t	2021-02-13 17:17:53 +01:00
super.c	fs,security: Add sb_delete hook	2021-04-22 12:22:11 -07:00
sync.c
timerfd.c
userfaultfd.c	userfaultfd: add UFFDIO_CONTINUE ioctl	2021-05-05 11:27:22 -07:00
utimes.c	utimes: handle idmapped mounts	2021-01-24 14:27:18 +01:00
xattr.c	xattr: fix kernel-doc for mnt_userns and vfs xattr helpers	2021-03-23 11:20:26 +01:00