linux

iv/linux


Jeff Mahoney 144439376b	btrfs: fix lockup in find_free_extent with read-only block groups	2017-07-24 16:04:02 +02:00

If we have a block group that is all of the following:
1) uncached in memory
2) is read-only
3) has a disk cache state that indicates we need to recreate the cache

AND the file system has enough free space fragmentation such that the request for an extent of a given size can't be honored;

AND have a single CPU core;

AND it's the block group with the highest starting offset such that there are no opportunities (like reading from disk) for the loop to yield the CPU;

We can end up with a lockup.

The root cause is simple. Once we're in the position that we've read in all of the other block groups directly and none of those block groups can honor the request, there are no more opportunities to sleep. We end up trying to start a caching thread which never gets run if we only have one core. This should present as a hung task waiting on the caching thread to make some progress, but it doesn't. Instead, it degrades into a busy loop because of the placement of the read-only check.

During the first pass through the loop, block_group->cached will be set to BTRFS_CACHE_STARTED and have_caching_bg will be set. Then we hit the read-only check and short circuit the loop. We're not yet in LOOP_CACHING_WAIT, so we skip that loop back before going through the loop again for other raid groups.

Then we move to LOOP_CACHING_WAIT state.

During this pass through the loop, ->cached will still be BTRFS_CACHE_STARTED, which means it's not cached, so we'll enter cache_block_group, do a lot of nothing, and return, and also set have_caching_bg again. Then we hit the read-only check and short circuit the loop. The same thing happens as before except now we DO trigger the LOOP_CACHING_WAIT && have_caching_bg check and loop back up to the top. We do this forever.

There are two fixes in this patch since they address the same underlying bug.

The first is to add a cond_resched to the end of the loop to ensure that the caching thread always has an opportunity to run. This will fix the soft lockup issue, but find_free_extent will still loop doing nothing until the thread has completed.

The second is to move the read-only check to the top of the loop. We're never going to return an allocation within a read-only block group so we may as well skip it early.

The check for ->cached == BTRFS_CACHE_ERROR would cause the same problem except that BTRFS_CACHE_ERROR is considered a "done" state and we won't re-set have_caching_bg again.

Many thanks to Stephan Kulow <coolo@suse.de> for his excellent help in the testing process.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
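The loop behavior described in the commit message can be modeled in user space. The following is a minimal sketch, not the kernel code: every name in it (`find_free_extent_sim`, `run_caching_thread`, `caching_work_left`, the two-state `loop_state` enum, the `max_passes` cutoff) is a hypothetical simplification. It assumes a single core, so the caching thread only advances when the search loop explicitly yields, and it assumes no block group can ever satisfy the request.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel's states. */
enum cache_state { CACHE_NO, CACHE_STARTED, CACHE_FINISHED };
enum loop_state { LOOP_CACHING_NOWAIT, LOOP_CACHING_WAIT };

struct block_group {
	enum cache_state cached;
	bool ro;
	int caching_work_left; /* time slices the caching thread still needs */
};

/* Models the caching thread getting the only CPU for one time slice. */
static void run_caching_thread(struct block_group *bg)
{
	if (bg->cached == CACHE_STARTED && --bg->caching_work_left <= 0)
		bg->cached = CACHE_FINISHED;
}

/*
 * Returns the number of passes over the block groups before the search
 * gives up, or -1 once max_passes is exceeded (the modeled lockup).
 * No group ever satisfies the request, as in the fragmented case.
 */
static int find_free_extent_sim(struct block_group *bgs, int n,
				bool ro_check_first, bool resched,
				int max_passes)
{
	enum loop_state loop = LOOP_CACHING_NOWAIT;
	int passes = 0;

	for (;;) {
		bool have_caching_bg = false;

		if (++passes > max_passes)
			return -1;

		for (int i = 0; i < n; i++) {
			struct block_group *bg = &bgs[i];

			/* Second fix: skip read-only groups before touching the cache. */
			if (ro_check_first && bg->ro)
				continue;

			if (bg->cached != CACHE_FINISHED) {
				have_caching_bg = true;
				if (bg->cached == CACHE_NO)
					bg->cached = CACHE_STARTED; /* kick off caching */
			}

			/*
			 * The old read-only check sat here and jumped to the end
			 * of the loop body. Allocation always fails in this model,
			 * so both paths fall through to the same place.
			 */

			/* First fix: yield so the caching thread can make progress. */
			if (resched)
				run_caching_thread(bg);
		}

		if (loop == LOOP_CACHING_WAIT && have_caching_bg)
			continue; /* still waiting on caching: search again */
		if (loop < LOOP_CACHING_WAIT) {
			loop++;
			continue;
		}
		return passes;
	}
}
```

For a single read-only, uncached group whose caching thread needs three time slices, calling `find_free_extent_sim(&bg, 1, false, false, 1000)` returns -1 in this model (the busy loop). With only the yield enabled it returns 4, which matches the note that the search still loops doing nothing until the thread completes. With the read-only check moved to the top it returns 2, because the group never sets `have_caching_bg`.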
tests	Btrfs: replace tree->mapping with tree->private_data	2017-06-19 18:25:58 +02:00
acl.c	btrfs: Don't clear SGID when inheriting ACLs	2017-06-29 20:24:59 +02:00
async-thread.c	btrfs: fix crash when tracepoint arguments are freed by wq callbacks	2017-01-09 11:24:50 +01:00
async-thread.h	btrfs: limit async_work allocation and worker func duration	2016-12-13 11:01:30 -08:00
backref.c	btrfs: use GFP_KERNEL in init_ipath	2017-06-19 18:26:02 +02:00
backref.h
btrfs_inode.h	Btrfs: fix reported number of inode blocks	2017-04-26 16:27:26 +01:00
check-integrity.c	btrfs: sink gfp parameter to btrfs_io_bio_alloc	2017-06-19 18:26:04 +02:00
check-integrity.h	btrfs: take an fs_info directly when the root is not used otherwise	2016-12-06 16:06:59 +01:00
compression.c	btrfs: cloned bios must not be iterated by bio_for_each_segment_all	2017-07-14 20:39:31 +02:00
compression.h	btrfs: reduce arguments for decompress_bio ops	2017-06-19 18:26:00 +02:00
ctree.c	btrfs: adjust includes after vmalloc removal	2017-06-19 18:26:02 +02:00
ctree.h	btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges	2017-06-29 20:17:02 +02:00
dedupe.h	btrfs: expand cow_file_range() to support in-band dedup and subpage-blocksize	2016-07-26 13:52:25 +02:00
delayed-inode.c	btrfs: convert btrfs_delayed_item.refs from atomic_t to refcount_t	2017-04-18 14:07:23 +02:00
delayed-inode.h	btrfs: convert btrfs_delayed_item.refs from atomic_t to refcount_t	2017-04-18 14:07:23 +02:00
delayed-ref.c	Btrfs: return old and new total ref mods when adding delayed refs	2017-06-29 20:17:01 +02:00
delayed-ref.h	Btrfs: return old and new total ref mods when adding delayed refs	2017-06-29 20:17:01 +02:00
dev-replace.c	btrfs: fix integer overflow in calc_reclaim_items_nr	2017-06-29 20:17:02 +02:00
dev-replace.h	btrfs: constify device path passed to relevant helpers	2017-02-28 14:26:07 +01:00
dir-item.c	btrfs: fix validation of XATTR_ITEM dir items	2017-06-29 20:06:11 +02:00
disk-io.c	btrfs: cloned bios must not be iterated by bio_for_each_segment_all	2017-07-14 20:39:31 +02:00
disk-io.h	btrfs: btrfs_wait_tree_block_writeback can be void return	2017-06-19 18:26:01 +02:00
export.c	btrfs: Check name_len before reading btrfs_get_name	2017-06-21 19:16:04 +02:00
export.h
extent_io.c	Btrfs: fix unexpected return value of bio_readpage_error	2017-07-14 20:42:37 +02:00
extent_io.h	Btrfs: fix unexpected return value of bio_readpage_error	2017-07-14 20:42:37 +02:00
extent_map.c	btrfs: convert extent_map.refs from atomic_t to refcount_t	2017-04-18 14:07:23 +02:00
extent_map.h	btrfs: convert extent_map.refs from atomic_t to refcount_t	2017-04-18 14:07:23 +02:00
extent-tree.c	btrfs: fix lockup in find_free_extent with read-only block groups	2017-07-24 16:04:02 +02:00
file-item.c	Btrfs: change how we iterate bios in endio	2017-06-19 18:25:59 +02:00
file.c	btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges	2017-06-29 20:17:02 +02:00
free-space-cache.c	btrfs: use clear_page where appropriate	2017-04-18 14:07:26 +02:00
free-space-cache.h	btrfs: free-space-cache, clean up unnecessary root arguments	2017-02-17 12:03:56 +01:00
free-space-tree.c	Btrfs: use memalloc_nofs and kvzalloc() for free space tree bitmaps	2017-06-19 18:26:01 +02:00
free-space-tree.h
hash.c	crypto: Work around deallocated stack frame reference gcc bug on sparc.	2017-06-08 17:36:03 +08:00
hash.h	btrfs: advertise which crc32c implementation is being used at module load	2016-06-06 14:08:28 +02:00
inode-item.c	btrfs: take an fs_info directly when the root is not used otherwise	2016-12-06 16:06:59 +01:00
inode-map.c	btrfs: qgroup: Introduce extent changeset for qgroup reserve functions	2017-06-29 20:17:02 +02:00
inode-map.h
inode.c	btrfs: btrfs_create_repair_bio never fails, skip error handling	2017-07-14 20:42:08 +02:00
ioctl.c	btrfs: fix integer overflow in calc_reclaim_items_nr	2017-06-29 20:17:02 +02:00
Kconfig
locking.c
locking.h
lzo.c	btrfs: switch to kvmalloc and GFP_KERNEL in lzo/zlib alloc_workspace	2017-06-19 18:26:02 +02:00
Makefile
math.h
ordered-data.c	btrfs: fix integer overflow in calc_reclaim_items_nr	2017-06-29 20:17:02 +02:00
ordered-data.h	btrfs: fix integer overflow in calc_reclaim_items_nr	2017-06-29 20:17:02 +02:00
orphan.c
print-tree.c	Btrfs: let btrfs_print_leaf print more about block group	2017-06-19 18:26:00 +02:00
print-tree.h	btrfs: take an fs_info directly when the root is not used otherwise	2016-12-06 16:06:59 +01:00
props.c	btrfs: Verify dir_item in iterate_object_props	2017-06-21 19:16:04 +02:00
props.h
qgroup.c	btrfs: fix integer overflow in calc_reclaim_items_nr	2017-06-29 20:17:02 +02:00
qgroup.h	btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges	2017-06-29 20:17:02 +02:00
raid56.c	Btrfs: fix write corruption due to bio cloning on raid5/6	2017-07-13 19:26:01 +01:00
raid56.h	btrfs: take an fs_info directly when the root is not used otherwise	2016-12-06 16:06:59 +01:00
rcu-string.h
reada.c	btrfs: remove unused member err from reada_extent	2017-06-19 18:25:59 +02:00
relocation.c	btrfs: fix integer overflow in calc_reclaim_items_nr	2017-06-29 20:17:02 +02:00
root-tree.c	btrfs: Check name_len before in btrfs_del_root_ref	2017-06-21 19:16:04 +02:00
scrub.c	btrfs: fix integer overflow in calc_reclaim_items_nr	2017-06-29 20:17:02 +02:00
send.c	Btrfs: incremental send, fix invalid memory access	2017-07-06 23:02:30 +01:00
send.h
struct-funcs.c	btrfs: fix string and comment grammatical issues and typos	2016-05-25 22:35:14 +02:00
super.c	btrfs: fix integer overflow in calc_reclaim_items_nr	2017-06-29 20:17:02 +02:00
sysfs.c	btrfs: Add quota_override knob into sysfs	2017-06-19 18:25:58 +02:00
sysfs.h
transaction.c	btrfs: fix integer overflow in calc_reclaim_items_nr	2017-06-29 20:17:02 +02:00
transaction.h	btrfs: remove unused qgroup members from btrfs_trans_handle	2017-04-18 14:07:25 +02:00
tree-defrag.c
tree-log.c	Btrfs: fix dir item validation when replaying xattr deletes	2017-07-19 20:38:16 +02:00
tree-log.h	btrfs: Make btrfs_del_inode_ref take btrfs_inode	2017-02-14 15:50:54 +01:00
ulist.c	btrfs: ulist: rename ulist_fini to ulist_release	2017-02-17 12:03:50 +01:00
ulist.h	btrfs: ulist: rename ulist_fini to ulist_release	2017-02-17 12:03:50 +01:00
uuid-tree.c	btrfs: return the actual error value from from btrfs_uuid_tree_iterate	2016-12-19 18:08:15 +01:00
volumes.c	btrfs: preallocate device flush bio	2017-06-21 19:03:38 +02:00
volumes.h	btrfs: preallocate device flush bio	2017-06-21 19:03:38 +02:00
xattr.c	btrfs: Check name_len with boundary in verify dir_item	2017-06-21 19:16:04 +02:00
xattr.h
zlib.c	btrfs: switch to kvmalloc and GFP_KERNEL in lzo/zlib alloc_workspace	2017-06-19 18:26:02 +02:00