iv/linux

Author	SHA1	Message	Date
Kent Overstreet	14b393ee76	bcachefs: Subvolumes, snapshots This patch adds subvolume.c - support for the subvolumes and snapshots btrees and related data types and on disk data structures. The next patches will start hooking up this new code to existing code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:12 -04:00
Kent Overstreet	aa76bd3321	bcachefs: Add a missing bch2_trans_relock() call This was causing an assertion to pop in fsck, in one of the repair paths. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:12 -04:00
Kent Overstreet	caaa66aa54	bcachefs: Better approach to write vs. read lock deadlocks Instead of unconditionally upgrading read locks to intent locks in do_bch2_trans_commit(), this patch changes the path that takes write locks to first trylock, and then if trylock fails check if we have a conflicting read lock, and restart the transaction if necessary. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:12 -04:00
Kent Overstreet	b301105b48	bcachefs: normalize_read_intent_locks This is a new approach to avoiding the self deadlock we'd get if we tried to take a write lock on a node while holding a read lock - we simply upgrade the readers to intent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:12 -04:00
Kent Overstreet	db92f2ea5e	bcachefs: Optimize btree lookups in write path This patch significantly reduces the number of btree lookups required in the extent update path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:12 -04:00
Kent Overstreet	cf3c68cda6	bcachefs: No need to clone iterators for update Since btree_path is now internally refcounted, we don't need to clone an iterator before calling bch2_trans_update() if we'll be mutating it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:11 -04:00
Kent Overstreet	f48361b00c	bcachefs: Drop some fast path tracepoints These haven't turned out to be useful Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:11 -04:00
Kent Overstreet	67e0dd8f0d	bcachefs: btree_path This splits btree_iter into two components: btree_iter is now the externally visible component, and it points to a btree_path which is now reference counted. This means we no longer have to clone iterators up front if they might be mutated - btree_path can be shared by multiple iterators, and cloned if an iterator would mutate a shared btree_path. This will help us use iterators more efficiently, as well as slimming down the main long lived state in btree_trans, and significantly cleans up the logic for iterator lifetimes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:11 -04:00
Kent Overstreet	fbf14104da	bcachefs: Improve an error message When we detect an invalid key being inserted, we should print what code was doing the update. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:11 -04:00
Kent Overstreet	f21566f17a	bcachefs: Kill BTREE_ITER_NODES We really only need to distinguish between btree iterators and btree key cache iterators - this is more prep work for btree_path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:11 -04:00
Kent Overstreet	6fba6b83b4	bcachefs: Prefer using btree_insert_entry to btree_iter This moves some data dependencies forward, to improve pipelining. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:11 -04:00
Kent Overstreet	a0a568794d	bcachefs: More renaming Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:11 -04:00
Kent Overstreet	f7a966a3e2	bcachefs: Clean up/rename bch2_trans_node_* fns These utility functions are for managing btree node state within a btree_trans - rename them for consistency, and drop some unneeded arguments. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:11 -04:00
Kent Overstreet	78cf784eaa	bcachefs: Further reduce iter->trans usage This is prep work for splitting btree_path out from btree_iter - btree_path will not have a pointer to btree_trans. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:11 -04:00
Kent Overstreet	05046a962f	bcachefs: Better algorithm for btree node merging in write path The existing algorithm was O(n^2) in the number of updates in the commit. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:11 -04:00
Kent Overstreet	5f8077cca8	bcachefs: Kill BTREE_ITER_SET_POS_AFTER_COMMIT BTREE_ITER_SET_POS_AFTER_COMMIT is used internally to automagically advance extent btree iterators on successful commit. But with the upcoming btree_path patch it's getting more awkward to support, and it adds overhead to core data structures that's only used in a few places, and can be easily done by the caller instead. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:11 -04:00
Kent Overstreet	638c6ff951	bcachefs: Refactor bch2_trans_update_extent() This consolidates the code for doing extent updates, and makes the btree iterator usage a bit cleaner and more efficient. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:11 -04:00
Kent Overstreet	9f6bd30703	bcachefs: Reduce iter->trans usage Disfavoured, and should go away. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:10 -04:00
Kent Overstreet	84841b0d13	bcachefs: bch2_dump_trans_iters_updates() This factors out bch2_dump_trans_iters_updates() from the iter alloc overflow path, and makes some small improvements to what it prints. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:10 -04:00
Kent Overstreet	dc02bed6d9	bcachefs: Free iterator if we have duplicate This helps - but does not fully fix - the outstanding "transaction iterator overflow" bugs. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:10 -04:00
Kent Overstreet	e363726602	bcachefs: Ensure that new inodes hit underlying btree Inode creation is done with non-cached btree iterators, but then in the same transaction the inode may be updated again with a cached iterator - it makes cache coherency easier if new inodes always land in the underlying btree. This patch adds a check to bch2_trans_update() - if the same key is updated multiple times in the same transaction with both cached and non-cached iterators, use the non-cached iterator. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:10 -04:00
Kent Overstreet	9cba7bf7c7	bcachefs: Don't drop read locks at transaction commit time Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:10 -04:00
Kent Overstreet	1a488e7306	bcachefs: Kill BTREE_INSERT_NOUNLOCK With the recent transaction restart changes, it's no longer needed - all transaction commits have BTREE_INSERT_NOUNLOCK semantics. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:10 -04:00
Kent Overstreet	b253a90d06	bcachefs: Btree splits no longer automatically cause a transaction restart With the new and improved handling of transaction restarts, this should finally be safe. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:10 -04:00
Kent Overstreet	955af63441	bcachefs: __bch2_trans_commit() no longer calls bch2_trans_reset() It's now the caller's responsibility to call bch2_trans_begin. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:09 -04:00
Kent Overstreet	e5af273fce	bcachefs: trans->restarted Start tracking when btree transactions have been restarted - and assert that we're always calling bch2_trans_begin() immediately after transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:09 -04:00
Kent Overstreet	6918bb55f6	bcachefs: Don't traverse iterators in __bch2_trans_commit() They should already be traversed, and we're asserting that since the introduction of iter->should_be_locked Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:09 -04:00
Kent Overstreet	ed5580b43b	bcachefs: Fix a btree iterator leak Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:09 -04:00
Kent Overstreet	9f1833cadd	bcachefs: Update btree ptrs after every write This closes a significant hole (and last known hole) in our ability to verify metadata. Previously, since btree nodes are log structured, we couldn't detect lost btree writes that weren't the first write to a given node. Additionally, this seems to have led to some significant metadata corruption on multi device filesystems with metadata replication: since a write may have made it to one device and not another, if we read that btree node back from the replica that did have that write and started appending after that point, the other replica would have a gap in the bset entries and reading from that replica wouldn't find the rest of the bsets. But, since updates to interior btree nodes are now journalled, we can close this hole by updating pointers to btree nodes after every write with the currently written number of sectors, without negatively affecting performance. This means we will always detect lost or corrupt metadata - it also means that our btree is now a curious hybrid of COW and non COW btrees, with all the benefits of both (excluding complexity). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:08 -04:00
Kent Overstreet	e3a67bdb6e	bcachefs: Regularize argument passing of btree_trans btree_trans should always be passed when we have one - iter->trans is disfavoured. This mainly updates old code in btree_update_interior.c, some of which predates btree_trans. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:08 -04:00
Kent Overstreet	b00fde8fb1	bcachefs: BTREE_UPDATE_INTERNAL_SNAPSHOT_NODE Add a new flag to control assertions about updating to internal snapshot nodes, that normally should not be written to - to be used in an upcoming patch. Also do some renaming - trigger_flags is now update_flags. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:08 -04:00
Kent Overstreet	297d89343d	bcachefs: Extensive triggers cleanups - We no longer mark subsets of extents, they're marked like regular keys now - which means we can drop the offset & sectors arguments to trigger functions - Drop other arguments that are no longer needed anymore in various places - fs_usage - Drop the logic for handling extents in bch2_mark_update() that isn't needed anymore, to match bch2_trans_mark_update() - Better logic for handling the BTREE_ITER_CACHED_NOFILL case, where we don't have an old key to mark Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:07 -04:00
Kent Overstreet	8c3f6da9fc	bcachefs: Improve iter->should_be_locked Adding iter->should_be_locked introduced a regression where it ended up not being set on the iterator passed to bch2_btree_update_start(), which is definitely not what we want. This patch requires it to be set when calling bch2_trans_update(), and adds various fixups to make that happen. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	b89726ab86	bcachefs: Kill __btree_delete_at() With trans->updates2 gone, we can now drop this helper and use bch2_btree_delete_at() instead. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	a49e9a0589	bcachefs: Fix null ptr deref when splitting compressed extents Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	5db95e50e1	bcachefs: Re-implement extent merging in transaction commit path We haven't had extent merging in quite some time. It used to be done by the btree code when sorting btree nodes, but that was eliminated as part of the work to separate extent handling from core btree code. This patch re-implements extent merging in the transaction commit path. We don't currently have the ability to merge reflink pointers, we need to do some work on the triggers code to be able to do that without ending up with incorrect refcounts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	81d22e5d83	bcachefs: Refactor extent_handle_overwrites() Prep work for extent merging Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:06 -04:00
Kent Overstreet	cd8319fdd9	bcachefs: Kill trans->updates2 Now that extent handling has been lifted to bch2_trans_update(), we don't need to keep two different lists of updates. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	8e6bbc4181	bcachefs: Move extent_handle_overwrites() to bch2_trans_update() This lifts handling of overlapping extents out of __bch2_trans_commit() and moves it to where we first do the update - which means that BTREE_ITER_WITH_UPDATES can now work correctly in extents mode. Also, this patch reworks how extent triggers work: previously, on partial extent overwrite we would pass this information to the trigger, telling it what part of the extent was being overwritten. But, this approach has had too many subtle corner cases - now, we only mark whole extents, meaning on partial extent overwrite we unmark the old extent and mark the new extent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	5288e66a7b	bcachefs: BTREE_ITER_WITH_UPDATES This drops bch2_btree_iter_peek_with_updates() and replaces it with a new flag, BTREE_ITER_WITH_UPDATES, and also reworks bch2_btree_iter_peek_slot() to respect it too. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:05 -04:00
Kent Overstreet	531a0095c9	bcachefs: Improve btree iterator tracepoints This patch adds some new tracepoints to the btree iterator code, and adds new fields to the existing tracepoints - primarily for the iterator position. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	bc3f8b25f3	bcachefs: Check for errors from bch2_trans_update() Upcoming refactoring is going to change bch2_trans_update() to start returning transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	a6336910b1	bcachefs: Fix for buffered writes getting -ENOSPC Buffered writes may have to increase their disk reservation at btree update time, due to compression and erasure coding being unpredictable: O_DIRECT writes should be checking for -ENOSPC, but buffered writes have already been accepted and should not. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00
Kent Overstreet	d6462f494d	bcachefs: Split extents if necessary in bch2_trans_update() Currently, we handle multiple overlapping extents in the same transaction commit by doing fixups in bch2_trans_update() - this patch extends that to split updates when necessary. The next patch that changes the reflink code to not fragment extents when making them indirect will require this. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:03 -04:00
Kent Overstreet	bbfcb4519d	bcachefs: Fix bch2_extent_can_insert() call It was being skipped when hole punching, leading to problems when splitting compressed extents. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:03 -04:00
Kent Overstreet	4f6dad46cb	bcachefs: Add a tracepoint for when we block on journal reclaim Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:02 -04:00
Kent Overstreet	bc2e5d5c66	bcachefs: Fix an out of bounds read bch2_varint_decode() can read up to 7 bytes past the end of the buffer, which means we need to allocate slightly larger key cache buffers. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:02 -04:00
Kent Overstreet	8ce600d447	bcachefs: Fix for btree_gc repairing interior btree ptrs Using the normal transaction commit path to insert and journal updates to interior nodes hadn't been done before this repair code was written, not surprising that there was a bug. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:01 -04:00
Kent Overstreet	fa272f33bb	bcachefs: Always check for invalid bkeys in trans commit path We check for this prior to metadata being written, but we're seeing some strange bugs lately, and this will help catch those closer to where they occur. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:01 -04:00
Kent Overstreet	0ef107859b	bcachefs: Fix journal_reclaim_wait_done() Can't run arbitrary code inside a wait_event() conditional, due to task state being weird... Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:00 -04:00
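
The out-of-bounds read described in bc2e5d5c66 above follows a common pattern: a decoder that always loads a fixed 8 bytes and keeps only the bytes it needs can touch up to 7 bytes past the end of a field, so the buffer has to be allocated with that much slack. The following is a minimal sketch of that idea only; it is not the bcachefs varint format or its key cache code, and `DECODE_OVERREAD`, `load_le_masked()` and `alloc_padded()` are hypothetical names introduced for illustration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical: worst-case bytes the decoder may read past a field. */
#define DECODE_OVERREAD 7

/*
 * Hypothetical decoder: unconditionally copies 8 bytes starting at p,
 * then keeps only the low `nbytes` (1..8) as a little-endian value.
 * When nbytes < 8, the copy reads up to 7 bytes beyond the field.
 */
static uint64_t load_le_masked(const uint8_t *p, unsigned nbytes)
{
	uint8_t raw[8];
	uint64_t v = 0;

	memcpy(raw, p, sizeof(raw));	/* fixed-width load: the over-read */
	for (unsigned i = 0; i < nbytes; i++)
		v |= (uint64_t)raw[i] << (8 * i);
	return v;
}

/*
 * Allocate a zeroed buffer with enough trailing slack that a
 * fixed-width load at the last payload byte stays in bounds.
 */
static uint8_t *alloc_padded(size_t payload_len)
{
	return calloc(1, payload_len + DECODE_OVERREAD);
}
```

With a 3-byte payload, `alloc_padded(3)` returns 10 bytes, so even `load_le_masked(buf + 2, 1)` reads only bytes 2 through 9 and stays inside the allocation.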

207 Commits