iv/linux

Brian Foster d9252d526b xfs: validate writeback mapping using data fork seq counter

The writeback code caches the current extent mapping across multiple xfs_do_writepage() calls to avoid repeated lookups for sequential pages backed by the same extent. This is known to be slightly racy with extent fork changes in certain difficult to reproduce scenarios. The cached extent is trimmed to within EOF to help avoid the most common vector for this problem via speculative preallocation management, but this is a band-aid that does not address the fundamental problem.

Now that we have an xfs_ifork sequence counter mechanism used to facilitate COW writeback, we can use the same mechanism to validate consistency between the data fork and cached writeback mappings. On its face, this is somewhat of a big hammer approach because any change to the data fork invalidates any mapping currently cached by a writeback in progress regardless of whether the data fork change overlaps with the range under writeback. In practice, however, the impact of this approach is minimal in most cases.

First, data fork changes (delayed allocations) caused by sustained sequential buffered writes are amortized across speculative preallocations. This means that a cached mapping won't be invalidated by each buffered write of a common file copy workload, but rather only on less frequent allocation events. Second, the extent tree is always entirely in-core so an additional lookup of a usable extent mostly costs a shared ilock cycle and in-memory tree lookup. This means that a cached mapping reval is relatively cheap compared to the I/O itself. Third, spurious invalidations don't impact ioend construction. This means that even if the same extent is revalidated multiple times across multiple writepage instances, we still construct and submit the same size ioend (and bio) if the blocks are physically contiguous.

Update struct xfs_writepage_ctx with a new field to hold the sequence number of the data fork associated with the currently cached mapping. Check the wpc seqno against the data fork when the mapping is validated and reestablish the mapping whenever the fork has changed since the mapping was cached. This ensures that writeback always uses a valid extent mapping and thus prevents lost writebacks and stale delalloc block problems.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

2019-02-11 16:07:01 -08:00
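The validation scheme described in the commit message can be illustrated with a small userspace sketch. This is not the kernel code: `struct fork`, `struct wb_ctx`, `fork_change()`, `map_valid()`, `map_block()` and `demo_lookups()` are hypothetical names invented for illustration, the fork is reduced to a single extent, and locking is omitted entirely. It only aims to show the idea of stamping a cached mapping with the fork's sequence number and treating any mismatch as an invalidation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an in-core extent fork (single extent only). */
struct fork {
    uint32_t seq;        /* bumped on every change to the fork */
    uint64_t ext_start;  /* first block of the extent */
    uint64_t ext_len;    /* extent length in blocks */
};

/* Hypothetical stand-in for the per-writeback context. */
struct wb_ctx {
    bool     have_map;   /* is a mapping cached at all? */
    uint64_t map_start;  /* cached mapping: first block */
    uint64_t map_len;    /* cached mapping: length in blocks */
    uint32_t data_seq;   /* fork seq sampled when the mapping was cached */
    unsigned lookups;    /* how many times the mapping was (re)established */
};

/* Any modification of the fork bumps its sequence counter. */
static void fork_change(struct fork *f, uint64_t start, uint64_t len)
{
    assert(len > 0);
    f->ext_start = start;
    f->ext_len = len;
    f->seq++;
}

/* A cached mapping is usable only if it covers the offset AND the fork
 * has not changed since the mapping was cached. */
static bool map_valid(const struct wb_ctx *wpc, const struct fork *f,
                      uint64_t off)
{
    if (!wpc->have_map)
        return false;
    if (off < wpc->map_start || off >= wpc->map_start + wpc->map_len)
        return false;
    return wpc->data_seq == f->seq;
}

/* Reuse the cached mapping when valid, otherwise look it up again and
 * record the sequence number it was sampled under. The sketch does not
 * check that the fork's extent actually covers the offset. */
static void map_block(struct wb_ctx *wpc, const struct fork *f, uint64_t off)
{
    if (map_valid(wpc, f, off))
        return;
    wpc->map_start = f->ext_start;
    wpc->map_len = f->ext_len;
    wpc->data_seq = f->seq;
    wpc->have_map = true;
    wpc->lookups++;
}

/* Walk four sequential blocks with one fork change in the middle and
 * return the number of lookups that were needed. */
static unsigned demo_lookups(void)
{
    struct fork f = { .seq = 0, .ext_start = 0, .ext_len = 8 };
    struct wb_ctx wpc = { 0 };

    map_block(&wpc, &f, 0);   /* nothing cached yet: lookup */
    map_block(&wpc, &f, 1);   /* covered, same seq: cache hit */
    fork_change(&f, 0, 16);   /* the fork changes under writeback */
    map_block(&wpc, &f, 2);   /* seq mismatch: lookup again */
    map_block(&wpc, &f, 3);   /* covered, same seq: cache hit */
    return wpc.lookups;
}
```

In this sketch the fork change invalidates the cached mapping even though block 2 was already covered by it, which mirrors the "big hammer" trade-off the commit message describes: correctness comes from the sequence check alone, at the cost of an occasional extra in-memory lookup.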
9p	Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2018-11-01 19:58:52 -07:00
adfs	adfs: use timespec64 for time conversion	2018-08-22 10:52:51 -07:00
affs	affs: fix potential memory leak when parsing option 'prefix'	2018-05-28 12:36:41 +02:00
afs	afs: Fix race in async call refcounting	2019-01-17 15:17:28 +00:00
autofs	autofs: fix error return in autofs_fill_super()	2019-02-01 15:46:24 -08:00
befs	fix a series of Documentation/ broken file name references	2018-06-15 18:10:01 -03:00
bfs	bfs: extra sanity checking and static inode bitmap	2019-01-04 13:13:47 -08:00
btrfs	for-5.0-rc4-tag	2019-02-03 08:48:33 -08:00
cachefiles	fscache, cachefiles: remove redundant variable 'cache'	2018-11-30 16:00:58 +00:00
ceph	ceph: quota: cleanup license mess	2019-01-21 14:53:23 +01:00
cifs	cifs: update internal module version number	2019-01-31 07:05:06 -06:00
coda	vfs: change inode times to use struct timespec64	2018-06-05 16:57:31 -07:00
configfs	configfs: fix registered group removal	2018-07-17 06:14:07 -07:00
cramfs	Make the Cramfs code more robust against filesystem corruptions,	2018-10-30 12:46:25 -07:00
crypto	fscrypt: add Adiantum support	2019-01-06 08:36:21 -05:00
debugfs	debugfs: debugfs_lookup() should return NULL if not found	2019-01-30 12:39:49 +01:00
devpts	devpts: Convert to new IDA API	2018-08-21 23:54:17 -04:00
dlm	dlm: fix invalid cluster name warning	2018-12-03 15:30:24 -06:00
ecryptfs	ecryptfs_rename(): verify that lower dentries are still OK after lock_rename()	2018-10-09 23:33:17 -04:00
efivarfs	efivars: Call guid_parse() against guid_t type of variable	2018-07-22 14:13:44 +02:00
efs
exofs	exofs_mount(): fix leaks on failure exits	2018-12-17 18:36:33 -05:00
exportfs	exportfs: do not read dentry after free	2018-11-23 09:08:17 -05:00
ext2	\n	2018-12-27 17:00:35 -08:00
ext4	Fix a number of ext4 bugs.	2019-01-06 12:19:23 -08:00
f2fs	f2fs-for-4.21-rc1	2018-12-31 09:41:37 -08:00
fat	Merge branch 'akpm' (patches from Andrew)	2019-01-05 09:16:18 -08:00
freevxfs
fscache	fscache: fix race between enablement and dropping of object	2018-11-30 15:57:31 +00:00
fuse	fuse: decrement NR_WRITEBACK_TEMP on the right page	2019-01-16 10:27:59 +01:00
gfs2	gfs2: Revert "Fix loop in gfs2_rbm_find"	2019-01-31 11:45:11 -08:00
hfs	hfs: do not free node before using	2018-11-30 14:56:14 -08:00
hfsplus	hfsplus: return file attributes on statx	2019-01-04 13:13:47 -08:00
hostfs	vfs: discard ATTR_ATTR_FLAG	2018-08-17 16:20:28 -07:00
hpfs	hpfs: remove unnecessary checks on the value of r when assigning error code	2018-08-25 12:42:33 -07:00
hugetlbfs	hugetlbfs: revert "Use i_mmap_rwsem to fix page fault/truncate race"	2019-01-08 17:15:11 -08:00
isofs	Update email address	2018-09-29 22:47:48 -04:00
jbd2	jbd2: clean up indentation issue, replace spaces with tab	2018-12-04 00:20:10 -05:00
jffs2	jffs2: Fix use of uninitialized delayed_work, lockdep breakage	2018-12-02 09:20:34 +01:00
jfs	jfs: remove redundant dquot_initialize() in jfs_evict_inode()	2018-09-20 09:28:49 -05:00
kernfs	kernfs: Improve kernfs_notify() poll notification latency	2018-11-27 11:59:33 +01:00
lockd	NFS client updates for Linux 4.21	2019-01-02 16:35:23 -08:00
minix
nfs	NFS: Fix up return value on fatal errors in nfs_page_async_flush()	2019-01-29 16:33:24 -05:00
nfs_common
nfsd	nfsd: Fix error return values for nfsd4_clone_file_range()	2019-02-06 15:32:05 -05:00
nilfs2	nilfs2: Use xa_erase_irq	2018-11-05 14:57:05 -05:00
nls
notify	inotify: Fix fd refcount leak in inotify_add_watch().	2019-01-02 18:28:37 +01:00
ntfs	mm: convert totalram_pages and totalhigh_pages variables to atomic	2018-12-28 12:11:47 -08:00
ocfs2	Merge branch 'akpm' (patches from Andrew)	2019-01-05 09:16:18 -08:00
omfs
openpromfs	fs/openpromfs: Use of_node_name_eq for node name comparisons	2018-11-18 13:35:19 -08:00
orangefs	fs: don't open code lru_to_page()	2019-01-04 13:13:48 -08:00
overlayfs	Revert "ovl: relax permission checking on underlying layers"	2018-12-04 11:31:30 +01:00
proc	proc: fix /proc/net/* after setns(2)	2019-02-01 15:46:22 -08:00
pstore	pstore/ram: Avoid allocation and leak of platform data	2019-01-20 14:44:52 -08:00
qnx4
qnx6
quota	quota: Lock s_umount in exclusive mode for Q_XQUOTA{ON,OFF} quotactls.	2018-12-18 18:29:15 +01:00
ramfs
reiserfs	reiserfs: remove workaround code for GCC 3.x	2018-10-31 08:54:14 -07:00
romfs
squashfs	Squashfs: Compute expected length from inode size rather than block length	2018-08-02 09:34:02 -07:00
sysfs	sysfs: convert BUG_ON to WARN_ON	2019-01-07 08:53:32 +01:00
sysv	sysv: return 'err' instead of 0 in __sysv_write_inode	2018-11-10 08:02:40 -05:00
tracefs	tracefs: Annotate tracefs_ops with __ro_after_init	2018-07-31 11:32:44 -04:00
ubifs	mm: migrate: drop unused argument of migrate_page_move_mapping()	2018-12-28 12:11:51 -08:00
udf	\n	2018-12-27 17:00:35 -08:00
ufs	fs/ufs: use ktime_get_real_seconds for sb and cg timestamps	2018-08-17 16:20:27 -07:00
xfs	xfs: validate writeback mapping using data fork seq counter	2019-02-11 16:07:01 -08:00
aio.c	aio: initialize kiocb private in case any filesystems expect it.	2019-02-06 08:04:22 -07:00
anon_inodes.c	anon_inode_getfile(): switch to alloc_file_pseudo()	2018-07-12 10:04:27 -04:00
attr.c	fs: Fix attr.c kernel-doc	2018-07-03 16:44:45 -04:00
bad_inode.c	get rid of 'opened' argument of ->atomic_open() - part 3	2018-07-12 10:04:20 -04:00
binfmt_aout.c	Remove 'type' argument from access_ok() function	2019-01-03 18:57:57 -08:00
binfmt_elf_fdpic.c	treewide: kmalloc() -> kmalloc_array()	2018-06-12 16:19:22 -07:00
binfmt_elf.c	signal: Distinguish between kernel_siginfo and siginfo	2018-10-03 16:47:43 +02:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c	turn filp_clone_open() into inline wrapper for dentry_open()	2018-07-10 23:29:03 -04:00
binfmt_script.c	exec: load_script: don't blindly truncate shebang string	2019-01-04 13:13:47 -08:00
block_dev.c	blockdev: Fix livelocks on loop device	2019-01-15 07:30:56 -07:00
buffer.c	fs: ratelimit __find_get_block_slow() failure message.	2019-02-06 12:58:56 -07:00
char_dev.c
compat_binfmt_elf.c	y2038: globally rename compat_time to old_time32	2018-08-27 14:48:48 +02:00
compat_ioctl.c	media updates for v4.20-rc1	2018-10-29 14:29:58 -07:00
compat.c	ncpfs: remove compat functionality	2018-06-05 19:23:26 +02:00
coredump.c	signal: Distinguish between kernel_siginfo and siginfo	2018-10-03 16:47:43 +02:00
d_path.c
dax.c	dax fix 4.21	2018-12-31 09:46:39 -08:00
dcache.c	fs/dcache: Track & report number of negative dentries	2019-01-30 11:02:11 -08:00
dcookies.c
direct-io.c	direct-io: allow direct writes to empty inodes	2019-01-22 08:26:44 -07:00
drop_caches.c	fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()	2019-02-01 15:46:24 -08:00
eventfd.c	Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL	2018-06-28 10:40:47 -07:00
eventpoll.c	Merge branch 'akpm' (patches from Andrew)	2019-01-05 09:16:18 -08:00
exec.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2019-01-05 13:18:59 -08:00
fcntl.c	signal: Distinguish between kernel_siginfo and siginfo	2018-10-03 16:47:43 +02:00
fhandle.c
file_table.c	mm: convert totalram_pages and totalhigh_pages variables to atomic	2018-12-28 12:11:47 -08:00
file.c	Char/Misc driver patches for 4.21-rc1	2018-12-28 20:54:57 -08:00
filesystems.c
fs_pin.c
fs_struct.c
fs-writeback.c	writeback: synchronize sync(2) against cgroup writeback membership switches	2019-01-22 14:39:38 -07:00
inode.c	y2038: more syscalls and cleanups	2018-12-28 12:45:04 -08:00
internal.h	overlayfs update for 4.19	2018-08-21 18:19:09 -07:00
ioctl.c	Remove 'type' argument from access_ok() function	2019-01-03 18:57:57 -08:00
iomap.c	iomap: fix a use after free in iomap_dio_rw	2019-01-27 08:47:42 -08:00
Kconfig	autofs: remove left-over autofs4 stubs	2018-06-11 08:22:34 -07:00
Kconfig.binfmt	kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt	2018-08-02 08:06:55 +09:00
libfs.c
locks.c	locks: fix error in locks_move_blocks()	2019-01-02 20:14:50 -05:00
Makefile	autofs: remove left-over autofs4 stubs	2018-06-11 08:22:34 -07:00
mbcache.c	treewide: kmalloc() -> kmalloc_array()	2018-06-12 16:19:22 -07:00
mount.h
mpage.c	mpage: mpage_readpages() should submit IO as read-ahead	2018-08-17 16:20:29 -07:00
namei.c	Revert "vfs: Allow userns root to call mknod on owned filesystems."	2018-12-22 14:18:34 -08:00
namespace.c	Merge branch 'mount.part1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2019-01-05 13:25:58 -08:00
no-block.c
nsfs.c
open.c	overlayfs update for 4.19	2018-08-21 18:19:09 -07:00
pipe.c	Merge branch 'work.open3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2018-08-13 19:58:36 -07:00
pnode.c	vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled	2018-12-20 16:32:56 +00:00
pnode.h
posix_acl.c
proc_namespace.c
read_write.c	Remove 'type' argument from access_ok() function	2019-01-03 18:57:57 -08:00
readdir.c	Remove 'type' argument from access_ok() function	2019-01-03 18:57:57 -08:00
select.c	Remove 'type' argument from access_ok() function	2019-01-03 18:57:57 -08:00
seq_file.c	fs/seq_file.c: simplify seq_file iteration code and interface	2018-08-17 16:20:28 -07:00
signalfd.c	signal: Distinguish between kernel_siginfo and siginfo	2018-10-03 16:47:43 +02:00
splice.c	splice: don't read more than available pipe space	2018-12-04 08:50:49 -08:00
stack.c
stat.c	y2038: Remove newstat family from default syscall set	2018-08-29 15:42:20 +02:00
statfs.c	kernel: add kcompat_sys_{f,}statfs64()	2018-07-12 14:49:48 +01:00
super.c	mount_fs: suppress MAC on MS_SUBMOUNT as well as MS_KERNMOUNT	2018-12-21 11:51:23 -05:00
sync.c
timerfd.c	y2038: globally rename compat_time to old_time32	2018-08-27 14:48:48 +02:00
userfaultfd.c	userfaultfd: clear flag if remap event not enabled	2018-12-28 12:11:51 -08:00
utimes.c	y2038: utimes: Rework #ifdef guards for compat syscalls	2018-08-29 15:42:23 +02:00
xattr.c	sysfs: Do not return POSIX ACL xattrs via listxattr	2018-09-18 07:30:48 -04:00