iv/linux

Author	SHA1	Message	Date
Kent Overstreet	ee7570546e	bcachefs: Fix a deadlock Waiting on a btree node write with btree locks held can deadlock, if the write errors: the write error path has to do a btree update to drop the pointer to the replica that errored. The interior update path has to wait on in flight btree writes before freeing nodes on disk. Previously, this was done in bch2_btree_interior_update_will_free_node(), and could deadlock; now, we just stash a pointer to the node and do it in btree_update_nodes_written(), just prior to the transactional part of the update. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:05 -04:00
Kent Overstreet	9f2772c454	bcachefs: Split out btree_error_wq We can't use btree_update_wq because btree updates may be waiting on btree writes to complete. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:04 -04:00
Kent Overstreet	bff796ae65	bcachefs: Fix pathological behaviour with inode sharding by cpu ID If the transaction restarts on a different CPU, it could end up needing to read in a different btree node, which makes another transaction restart more likely... Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	d797ca3d8e	bcachefs: Fix journal write error path Journal write errors were racing with the submission path - potentially causing writes to other replicas to not get submitted. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	9eba7c8d15	bcachefs: Reflink refcount fix __bch2_trans_mark_reflink_p wasn't always correctly returning the number of sectors processed - the new logic is a bit more straightforward overall too. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	b282a74fae	bcachefs: Add an option to control sharding new inode numbers We're seeing a bug where inode creates end up spinning in bch2_inode_create - disabling sharding will simplify what we're testing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	9f311f2166	bcachefs: Don't use bch_write_op->cl for delivering completions We already had op->end_io as an alternative mechanism to op->cl.parent for delivering write completions; this switches all code paths to using op->end_io. Two reasons: - op->end_io is more efficient, due to fewer atomic ops; this completes the conversion that was originally only done for the direct IO path. - We'll be restructuring the write path to use a different mechanism for punting to process context; refactoring to not use op->cl will make that easier. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:04 -04:00
Kent Overstreet	af17118319	bcachefs: Kill bch_write_op.index_update_fn This deletes bch_write_op.index_update_fn: indirect function calls have gotten considerably more expensive post spectre/meltdown, and we only have two different index_update_fns - this patch adds a flag to specify which one to use (normal vs. data move path). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:04 -04:00
Kent Overstreet	7e94eeffe0	bcachefs: Inline fastpath of bch2_disk_reservation_add() The fastpath now doesn't even disable preemption - instead we use a (non locked) cmpxchg. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:04 -04:00
Kent Overstreet	ddc7dd62f0	bcachefs: Don't use uuid in tracepoints %pU for printing out pointers to uuids doesn't work in perf trace Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	19d2819d2d	bcachefs: Add a tracepoint for copygc waiting Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	c4d4b2f01a	bcachefs: Add a cond_resched call to the copygc main loop We seem to have a bug where the copygc thread ends up spinning and making the system unusable - this will at least prevent it from locking up the machine, and it's a good thing to have anyways. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	443d2760e5	bcachefs: Fix a null ptr deref bch2_btree_iter_peek() won't always return a key - whoops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	9dd89a05fd	bcachefs: Fix an issue with inconsistent btree writes after unclean shutdown After unclean shutdown, btree writes may have completed on one device and not others - and this inconsistency could lead us to writing new bsets with a gap in our btree node in one of our replicas. Fortunately, this is only an issue with bsets that are newer than the most recent journal flush, and we already have a mechanism for detecting and blacklisting those. We just need to make sure to start new btree writes after the most recent _non_ blacklisted bset. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	4495cbed56	bcachefs: Improve FS_IOC_GOINGDOWN ioctl We weren't interpreting the flags argument at all. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	731bdd2eff	bcachefs: Add a workqueue for btree io completions Also, clean up workqueue usage - we shouldn't be using system workqueues, pretty much everything we do needs to be on our own WQ_MEM_RECLAIM workqueues. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Brett Holman	2eba51a69a	bcachefs: rewrote prefetch asm in gas syntax for clang compatibility Signed-off-by: Brett Holman <bpholman5@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:04 -04:00
Kent Overstreet	1ce0cf5fe9	bcachefs: Add a debug mode that always reads from every btree replica There's a new module parameter, verify_all_btree_replicas, that enables reading from every btree replica when reading in btree nodes and comparing them against each other. We've been seeing some strange btree corruption - this will hopefully aid in tracking it down and catching it more often. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	596d3bdc1e	bcachefs: Don't repair btree nodes until after interior journal replay is done We need the btree to be in a consistent state before we can rewrite btree nodes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	304b7e08c7	bcachefs: Fix an uninitialized var this fixes a valgrind complaint Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	a6336910b1	bcachefs: Fix for buffered writes getting -ENOSPC Buffered writes may have to increase their disk reservation at btree update time, due to compression and erasure coding being unpredictable: O_DIRECT writes should be checking for -ENOSPC, but buffered writes have already been accepted and should not. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	16ac8c9523	bcachefs: Fix inode backpointers in RENAME_OVERWRITE When we delete the dirent an inode points to, we need to zero out the backpointer fields - this was missed in the RENAME_OVERWRITE case. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	e7084c9c81	bcachefs: Make bch2_remap_range respect O_SYNC Caught by xfstest generic/628 Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	d6462f494d	bcachefs: Split extents if necessary in bch2_trans_update() Currently, we handle multiple overlapping extents in the same transaction commit by doing fixups in bch2_trans_update() - this patch extends that to split updates when necessary. The next patch that changes the reflink code to not fragment extents when making them indirect will require this. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:03 -04:00
Kent Overstreet	ef1b20924b	bcachefs: Ratelimiting for writeback IOs Writeback throttling is a kernel config option and not always enabled. When it's not enabled we need a fallback, to avoid unbounded memory pinning and work item backlogs. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Dan Robertson	ed34341189	bcachefs: statfs reports incorrect avail blocks The current implementation of bch_statfs does not scale the number of available blocks provided in f_bavail by the reserve factor. This causes an allocation of a file of this size to fail. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Kent Overstreet	c21d537779	bcachefs: Fix for bch2_bkey_pack_pos() not initializing len/version fields This bug led to push_whiteout() generating whiteouts that failed bch2_bkey_invalid() due to nonzero length fields - oops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:03 -04:00
Kent Overstreet	82355e2882	bcachefs: Fix a memcpy call Not supposed to pass a null ptr to memcpy (even if the size is 0). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:03 -04:00
Kent Overstreet	bbfcb4519d	bcachefs: Fix bch2_extent_can_insert() call It was being skipped when hole punching, leading to problems when splitting compressed extents. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:03 -04:00
Kent Overstreet	2e8f9d23cb	bcachefs: Make sure to pass a disk reservation to bch2_extent_update() It's needed when we split an existing compressed extent - we get a null ptr deref without it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:03 -04:00
Brett Holman	2cd0563461	bcachefs: made changes to support clang, fixed a couple bugs fs/bcachefs/bset.c edited prefetch macro to add clang support fs/bcachefs/btree_iter.c bugfix: initialize iter->real_pos in bch2_btree_iter_init for later use fs/bcachefs/io.c bugfix: eliminated undefined behavior (negative bitshift) fs/bcachefs/buckets.c bugfix: invert sign to handle 64bit abs() Signed-off-by: Brett Holman <bpholman5@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Kent Overstreet	6ebe32b94c	bcachefs: Fix locking in __bch2_set_nr_journal_buckets() We weren't holding mark_lock correctly - it's needed for the new_fs path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Dan Robertson	d125615a4e	bcachefs: properly initialize used values - Ensure the second key value in bch_hash_info is initialized to zero if the info type is of type BCH_STR_HASH_SIPHASH. - Initialize the possibly returned value in bch2_inode_create. Assuming bch2_btree_iter_peek returns bkey_s_c_null, the uninitialized value of ret could be returned to the user as an error pointer. - Fix compiler warning in initialization of bkey_s_c_stripe fs/bcachefs/buckets.c:1646:35: warning: suggest braces around initialization of subobject [-Wmissing-braces] struct bkey_s_c_stripe new_s = { NULL }; ^~~~ Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Kent Overstreet	e1036ce581	bcachefs: Repair code for multiple types of data in same bucket bch2_check_fix_ptrs() is awkward, we need to find a way to improve it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Dan Robertson	faf1a5f417	bcachefs: Fix out of bounds read in fs usage ioctl Fix a possible read out of bounds if bch2_ioctl_fs_usage is called when replica_entries_bytes is set to a value that is smaller than the size of bch_replicas_usage. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Dan Robertson	2b25de552f	bcachefs: Fix null deref in bch2_ioctl_read_super Do not attempt to cleanup the returned value of bch2_device_lookup if the returned value was an error pointer. We currently check to see if the returned value is null and run the cleanup otherwise. As a result, we attempt to run the cleanup on an error pointer. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Dan Robertson	ec4ab9d2fc	bcachefs: Fix possible null deref on mount Ensure that the block device pointer in a superblock handle is not null before dereferencing it in bch2_dev_to_fs. The block device pointer may be null when mounting a new bcachefs filesystem given another mounted bcachefs filesystem exists that has at least one device that is offline. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Dan Robertson	baf056b87d	bcachefs: Fix error in parsing of mount options When parsing the mount options duplicate the given options. This is required as the options are parsed twice and strsep is used in parsing. The options will be modified into a possibly invalid options set for the second round of parsing if the options are not duplicated before parsing. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Stijn Tintel	ffcf9ec78c	bcachefs: avoid out-of-bounds in split_devs Calling mount with an empty source string causes an out-of-bounds error in split_devs. Check the length of the source string to avoid this. Signed-off-by: Stijn Tintel <stijn@linux-ipv6.be> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Kent Overstreet	909004d2f9	bcachefs: Make sure to use BTREE_ITER_PREFETCH in fsck Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Kent Overstreet	360746bf6f	bcachefs: Fix bch2_btree_iter_peek_with_updates() By not re-fetching the next update we were going into an infinite loop. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Kent Overstreet	933532b8b2	bcachefs: Fix reflink trigger The trigger for reflink pointers wasn't always incrementing/decrementing the refcounts correctly - this patch fixes that logic. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Kent Overstreet	3a402c8dab	bcachefs: Fix some refcounting bugs We really need debug mode assertions that ca->ref and ca->io_ref are used correctly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Dan Robertson	5bc38f44fa	bcachefs: Fix oob write in __bch2_btree_node_write Fix a possible out of bounds write in __bch2_btree_node_write when the data buffer padding is cleared up to the block size. The out of bounds write is possible if the data buffer's size is not a multiple of the block size. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:03 -04:00
Kent Overstreet	1784d43a88	bcachefs: Fix usage of last_seq + encryption jset->last_seq is in the region that's encrypted - on journal write completion, we were using it and getting garbage. This patch shadows it to fix. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:02 -04:00
Kent Overstreet	ac1019d32b	bcachefs: Clean up bch2_btree_and_journal_walk() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:02 -04:00
Kent Overstreet	e68031fb46	bcachefs: Mark newly allocated btree nodes as accessed This was a major oversight - this means under memory pressure we can end up reading in a btree node, then having it evicted before we get to use it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:02 -04:00
Kent Overstreet	595c1e9bab	bcachefs: Fix time handling There were some overflows in the time conversion functions - fix this by converting tv_sec and tv_nsec separately. Also, set sb->time_min and sb->time_max. Fixes xfstest generic/258. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:02 -04:00
Kent Overstreet	4f6dad46cb	bcachefs: Add a tracepoint for when we block on journal reclaim Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:02 -04:00
Kent Overstreet	2ce867df31	bcachefs: Make sure to initialize j->last_flushed If the journal reclaim thread makes it to the timeout without ever initializing j->last_flushed, we could end up sleeping for a very long time. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:02 -04:00

1216349 Commits