linux

iv/linux

Author	SHA1	Message	Date
Kent Overstreet	5db95e50e1	bcachefs: Re-implement extent merging in transaction commit path We haven't had extent merging in quite some time. It used to be done by the btree code when sorting btree nodes, but that was eliminated as part of the work to separate extent handling from core btree code. This patch re-implements extent merging in the transaction commit path. We don't currently have the ability to merge reflink pointers, we need to do some work on the triggers code to be able to do that without ending up with incorrect refcounts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	81d22e5d83	bcachefs: Refactor extent_handle_overwrites() Prep work for extent merging Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:06 -04:00
Kent Overstreet	59ba21d99f	bcachefs: Clean up key merging This patch simplifies the key merging code by getting rid of partial merges - it's simpler and saner if we just don't merge extents when they'd overflow k->size. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	cd8319fdd9	bcachefs: Kill trans->updates2 Now that extent handling has been lifted to bch2_trans_update(), we don't need to keep two different lists of updates. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	c1949baa51	bcachefs: Simplify reflink trigger Now that we only mark entire extents, we can ditch the "reflink_p_frag_references" code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	8e6bbc4181	bcachefs: Move extent_handle_overwrites() to bch2_trans_update() This lifts handling of overlapping extents out of __bch2_trans_commit() and moves it to where we first do the update - which means that BTREE_ITER_WITH_UPDATES can now work correctly in extents mode. Also, this patch reworks how extent triggers work: previously, on partial extent overwrite we would pass this information to the trigger, telling it what part of the extent was being overwritten. But, this approach has had too many subtle corner cases - now, we only mark whole extents, meaning on partial extent overwrite we unmark the old extent and mark the new extent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	b1d87f527d	bcachefs: bch2_btree_iter_peek_slot() now saves initial position when searching Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:06 -04:00
Kent Overstreet	1d214eb18d	bcachefs: Kill __bch2_btree_iter_peek_slot_extents() This codepath won't just be for extents in the future, it'll also be for BTREE_ITER_FILTER_SNAPSHOTS mode. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:05 -04:00
Kent Overstreet	e750296bf5	bcachefs: bch2_btree_iter_peek_slot() now supports BTREE_ITER_WITH_UPDATES Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:05 -04:00
Kent Overstreet	5288e66a7b	bcachefs: BTREE_ITER_WITH_UPDATES This drops bch2_btree_iter_peek_with_updates() and replaces it with a new flag, BTREE_ITER_WITH_UPDATES, and also reworks bch2_btree_iter_peek_slot() to respect it too. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:05 -04:00
Kent Overstreet	509d3e0a8d	bcachefs: Child btree iterators This adds the ability for btree iterators to own child iterators - to be used by an upcoming rework of bch2_btree_iter_peek_slot(), so we can scan forwards while maintaining our current position. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	c205321b12	bcachefs: Drop all btree locks when submitting btree node reads As a rule we don't want to be holding btree locks while submitting IO - this will improve overall filesystem latency. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	4351d3ecb4	bcachefs: More topology repair code This improves the handling of overlapping btree nodes; now, we handle the case where one btree node completely overwrites another. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	74cc1abdbf	bcachefs: Fix a buffer overrun In make_extent_indirect(), we were allocating too small of a buffer for the new indirect extent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	224ec3e677	bcachefs: Don't mark superblocks past end of usable space bcachefs-tools recently started putting a backup superblock at the end of the device. This causes a problem if the bucket size doesn't divide the device size - but we can fix it by just skipping marking that part. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	7138f22097	bcachefs: Fix a spurious debug mode assertion When we switched to using bch2_btree_bset_insert_key() for extents it turned out it started leaving invalid keys around - of type deleted but nonzero size - but this is fine (if ugly) because they're never written out. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Brett Holman	ca47fa2362	bcachefs: Fix unitialized use of a value Signed-off-by: Brett Holman <bpholman5@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:05 -04:00
Dan Robertson	59e2480ff7	bcachefs: do not compile acl mod on minimal config Do not compile the acl.o target if BCACHEFS_POSIX_ACL is not enabled. Signed-off-by: Dan Robertson <dan@dlrobertson.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:05 -04:00
Kent Overstreet	66a0a49750	bcachefs: btree_iter->should_be_locked Add a field to struct btree_iter for tracking whether it should be locked - this fixes spurious transaction restarts in bch2_trans_relock(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	531a0095c9	bcachefs: Improve btree iterator tracepoints This patch adds some new tracepoints to the btree iterator code, and adds new fields to the existing tracepoints - primarily for the iterator position. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	f7beb4ca04	bcachefs: Preallocate transaction mem This helps avoid transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	bc3f8b25f3	bcachefs: Check for errors from bch2_trans_update() Upcoming refactoring is going to change bch2_trans_update() to start returning transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	01254036a3	bcachefs; Check for allocator thread shutdown We were missing a kthread_should_stop() check in the loop in bch2_invalidate_buckets(), very occasionally leading to us getting stuck while shutting down. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	d7fc453bdb	bcachefs: Journal space calculation fix When devices have different bucket sizes, we may accumulate a journal write that doesn't fit on some of our devices - previously, we'd underflow when calculating space on that device and then everything would get weird. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	649d9a4dfc	bcachefs: Don't fragment extents when making them indirect This fixes a "disk usage increased without a reservation" bug, when reflinking compressed extents. Also, there's no good reason for reflink to be fragmenting extents anyways. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	890b74f03d	bcachefs: Fsck for reflink refcounts Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	c0ebe3e48c	bcachefs: Assorted endianness fixes Found by sparse Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	ee7570546e	bcachefs: Fix a deadlock Waiting on a btree node write with btree locks held can deadlock, if the write errors: the write error path has to do do a btree update to drop the pointer to the replica that errored. The interior update path has to wait on in flight btree writes before freeing nodes on disk. Previously, this was done in bch2_btree_interior_update_will_free_node(), and could deadlock; now, we just stash a pointer to the node and do it in btree_update_nodes_written(), just prior to the transactional part of the update. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:05 -04:00
Kent Overstreet	9f2772c454	bcachefs: Split out btree_error_wq We can't use btree_update_wq becuase btree updates may be waiting on btree writes to complete. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:04 -04:00
Kent Overstreet	bff796ae65	bcachefs: Fix pathalogical behaviour with inode sharding by cpu ID If the transactior restarts on a different CPU, it could end up needing to read in a different btree node, which makes another transaction restart more likely... Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	d797ca3d8e	bcachefs: Fix journal write error path Journal write errors were racing with the submission path - potentially causing writes to other replicas to not get submitted. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	9eba7c8d15	bcachefs: Reflink refcount fix __bch2_trans_mark_reflink_p wasn't always correctly returning the number of sectors processed - the new logic is a bit more straightforward overall too. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	b282a74fae	bcachefs: Add an option to control sharding new inode numbers We're seeing a bug where inode creates end up spinning in bch2_inode_create - disabling sharding will simplify what we're testing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	9f311f2166	bcachefs: Don't use bch_write_op->cl for delivering completions We already had op->end_io as an alternative mechanism to op->cl.parent for delivering write completions; this switches all code paths to using op->end_io. Two reasons: - op->end_io is more efficient, due to fewer atomic ops, this completes the conversion that was originally only done for the direct IO path. - We'll be restructing the write path to use a different mechanism for punting to process context, refactoring to not use op->cl will make that easier. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:04 -04:00
Kent Overstreet	af17118319	bcachefs: Kill bch_write_op.index_update_fn This deletes bch_write_op.index_update_fn: indirect function calls have gotten considerably more expensive post spectre/meltdown, and we only have two different index_update_fns - this patch adds a flag to specify which one to use (normal vs. data move path). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:04 -04:00
Kent Overstreet	7e94eeffe0	bcachefs: Inline fastpath of bch2_disk_reservation_add() The fastpath now doesn't even disable preemption - instead we use a (non locked) cmpxchg. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:04 -04:00
Kent Overstreet	ddc7dd62f0	bcachefs: Don't use uuid in tracepoints %pU for printing out pointers to uuids doesn't work in perf trace Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	19d2819d2d	bcachefs: Add a tracepoint for copygc waiting Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	c4d4b2f01a	bcachefs: Add a cond_resched call to the copygc main loop We seem to have a bug where the copygc thread ends up spinning and making the system unusable - this will at least prevent it from locking up the machine, and it's a good thing to have anyways. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	443d2760e5	bcachefs: Fix a null ptr deref bch2_btree_iter_peek() won't always return a key - whoops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	9dd89a05fd	bcachefs: Fix an issue with inconsistent btree writes after unclean shutdown After unclean shutdown, btree writes may have completed on one device and not others - and this inconsistency could lead us to writing new bsets with a gap in our btree node in one of our replicas. Fortunately, this is only an issue with bsets that are newer than the most recent journal flush, and we already have a mechanism for detecting and blacklisting those. We just need to make sure to start new btree writes after the most recent _non_ blacklisted bset. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	4495cbed56	bcachefs: Improve FS_IOC_GOINGDOWN ioctl We weren't interpreting the flags argument at all. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	731bdd2eff	bcachefs: Add a workqueue for btree io completions Also, clean up workqueue usage - we shouldn't be using system workqueues, pretty much everything we do needs to be on our own WQ_MEM_RECLAIM workqueues. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Brett Holman	2eba51a69a	bcachefs: rewrote prefetch asm in gas syntax for clang compatibility Signed-off-by: Brett Holman <bpholman5@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:04 -04:00
Kent Overstreet	1ce0cf5fe9	bcachefs: Add a debug mode that always reads from every btree replica There's a new module parameter, verify_all_btree_replicas, that enables reading from every btree replica when reading in btree nodes and comparing them against each other. We've been seeing some strange btree corruption - this will hopefully aid in tracking it down and catching it more often. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	596d3bdc1e	bcachefs: Don't repair btree nodes until after interior journal replay is done We need the btree to be in a consistent state before we can rewrite btree nodes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	304b7e08c7	bcachefs: Fix an uninitialized var this fixes a valgrind complaint Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	a6336910b1	bcachefs: Fix for buffered writes getting -ENOSPC Buffered writes may have to increase their disk reservation at btree update time, due to compression and erasure coding being unpredictable: O_DIRECT writes should be checking for -ENOSPC, but buffered writes have already been accepted and should not. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	16ac8c9523	bcachefs: Fix inode backpointers in RENAME_OVERWRITE When we delete the dirent an inode points to, we need to zero out the backpointer fields - this was missed in the RENAME_OVERWRITE case. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	e7084c9c81	bcachefs: Make bch2_remap_range respect O_SYNC Caught by xfstest generic/628 Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00

... 3 4 5 6 7 ...

1443 Commits