171 Commits

Kent Overstreet
549d173c1b bcachefs: EINTR -> BCH_ERR_transaction_restart
Now that we have error codes, with subtypes, we can switch to our own
error code for transaction restarts - and even better, a distinct error
code for each transaction restart reason: clearer code and better
debugging.
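
As a rough illustration of what callers gain, a hedged sketch (helper and
error names assumed for illustration, not taken from this patch):

    /* Check for the whole class of restart errors rather than comparing
     * against a single errno; bch2_err_matches() and the
     * BCH_ERR_transaction_restart_* subtypes are assumed here. */
    ret = bch2_btree_iter_traverse(&iter);
    if (bch2_err_matches(ret, BCH_ERR_transaction_restart)) {
            bch2_trans_begin(trans);
            goto retry;
    }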

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
0990efaeea bcachefs: btree_trans_too_many_iters() is now a transaction restart
All transaction restarts need a tracepoint - this is essential for
debugging.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:36 -04:00
Kent Overstreet
e941ae7d3a bcachefs: Add a counter for btree_trans restarts
This will help us improve nested transactions - we need to add
assertions that whenever an inner transaction handles a restart, it
still returns -EINTR to the outer transaction.

This also adds nested_lockrestart_do() and nested_commit_do() which use
the new counters to correctly return -EINTR when the transaction was
restarted.
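
A hedged sketch of how such a counter-based helper might look (field and
function names assumed for illustration only):

    /* Snapshot the restart counter, run the inner transaction, and if a
     * restart happened underneath us, propagate -EINTR to the caller so
     * the outer transaction restarts too. */
    u32 restart_count = trans->restart_count;

    ret = do_inner_updates(trans);          /* hypothetical inner work */

    if (!ret && trans->restart_count != restart_count)
            ret = -EINTR;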

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Daniel Hill
8bfe14e86a bcachefs: lock time stats prep work.
We need the caller name and a place to store our results; btree_trans provides this.

Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:35 -04:00
Kent Overstreet
a1783320d4 bcachefs: for_each_btree_key2()
This introduces two new macros for iterating through the btree, with
transaction restart handling:
 - for_each_btree_key2()
 - for_each_btree_key_commit()

Every iteration is now in an implicit transaction, and - as with
lockrestart_do() and commit_do() - returning -EINTR will cause the
transaction to be restarted, at the same key.

This patch converts a bunch of code that was open coding this to these
new macros, saving a substantial amount of code.
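
A hedged usage sketch (the exact macro arguments are assumed from the
description above):

    /* Each iteration runs in an implicit transaction; if the body returns
     * -EINTR, the iteration restarts at the same key. */
    ret = for_each_btree_key2(trans, iter, BTREE_ID_inodes,
                              POS_MIN, BTREE_ITER_PREFETCH, k,
            check_inode(trans, &iter, k));  /* hypothetical per-key work */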

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:35 -04:00
Kent Overstreet
50b13beef0 bcachefs: Improve an error message
When inserting a key type that's not valid for a given btree, we should
print out which btree we were inserting into.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:34 -04:00
Kent Overstreet
30525f6863 bcachefs: Fix journal_keys_search() overhead
Previously, on every btree_iter_peek() operation we were searching the
journal keys, doing a full binary search - which was slow.

This patch fixes that by saving our position in the journal keys, so
that we only do a full binary search when moving our position backwards
or a large jump forwards.
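
The idea, sketched over a plain sorted array rather than the real journal
keys structures:

    /* Cache the index of the last lookup: small forward movements become
     * a short linear walk, and only a backwards move (or a large forward
     * jump) pays for a binary search again. */
    static size_t lower_bound(const int *v, size_t lo, size_t hi, int want)
    {
            while (lo < hi) {
                    size_t mid = lo + (hi - lo) / 2;

                    if (v[mid] < want)
                            lo = mid + 1;
                    else
                            hi = mid;
            }
            return lo;
    }

    static size_t cached_search(const int *v, size_t nr, size_t *cached, int want)
    {
            size_t i = *cached;

            if (i > nr || (i && v[i - 1] >= want))
                    i = lower_bound(v, 0, nr, want);  /* moved backwards */
            else if (i + 16 < nr && v[i + 16] < want)
                    i = lower_bound(v, i, nr, want);  /* large jump forwards */
            else
                    while (i < nr && v[i] < want)
                            i++;                      /* short walk forwards */

            return *cached = i;
    }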

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:33 -04:00
Kent Overstreet
b0babf2a34 bcachefs: bch2_btree_iter_peek_all_levels()
This adds bch2_btree_iter_peek_all_levels(), which returns keys from
every level of the btree - interior nodes included - in monotonically
increasing order, soon to be used by the backpointers check & repair
code.

 - BTREE_ITER_ALL_LEVELS can now be passed to for_each_btree_key() to
   iterate thusly, much like BTREE_ITER_SLOTS

 - The existing algorithm in bch2_btree_iter_advance() doesn't work with
   peek_all_levels(): we have to defer the actual advancing until the
   next time we call peek, where we have the btree path traversed and
   uptodate. So, we add an advanced bit to btree_iter; when
   BTREE_ITER_ALL_LEVELS is set bch2_btree_iter_advance() just marks
   the iterator as advanced (see the sketch below).
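
A hedged sketch of that deferral (field names assumed from the
description):

    /* When iterating across all levels we cannot advance immediately:
     * just note that an advance is pending and let the next peek, which
     * has a traversed, up-to-date path, act on it. */
    if (iter->flags & BTREE_ITER_ALL_LEVELS) {
            iter->advanced = true;
            return true;
    }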

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:32 -04:00
Kent Overstreet
d864842581 bcachefs: btree_path_make_mut() clears should_be_locked
This fixes a bug where __bch2_btree_node_update_key() wasn't clearing
should_be_locked, leading to bch2_btree_path_traverse() always failing -
all callers of btree_path_make_mut() want should_be_locked cleared, so
do it there.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:28 -04:00
Kent Overstreet
8570d775ca bcachefs: bch2_trans_updates_to_text()
This turns bch2_dump_trans_updates() into a to_text() method - this way
it can be used by debug tracing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:27 -04:00
Kent Overstreet
2158fe463b bcachefs: bch2_trans_inconsistent()
Add a new error macro that also dumps transaction updates in addition to
doing an emergency shutdown - when a transaction update discovers or
causes a filesystem inconsistency, it's helpful to see what updates it
was doing.
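
A hedged sketch of the shape of such a macro (the real definition may
differ):

    /* Dump the transaction's pending updates, then flag the filesystem
     * inconsistent and shut down, as bch2_fs_inconsistent() does. */
    #define bch2_trans_inconsistent(trans, ...)                     \
    ({                                                              \
            bch2_dump_trans_updates(trans);                         \
            bch2_fs_inconsistent((trans)->c, __VA_ARGS__);          \
    })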

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:27 -04:00
Kent Overstreet
85d8cf161f bcachefs: bch2_btree_iter_peek_upto()
In BTREE_ITER_FILTER_SNAPSHOTS mode, we skip over keys in unrelated
snapshots. When we hit the end of an inode, if the next inode(s) are in
a different subvolume, we could potentially have to skip past many keys
before finding a key we can return to the caller, so they can terminate
the iteration.

This adds a peek_upto() variant to solve this problem, to be used when
we know the range we're searching within.
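
A hedged usage sketch (prototype assumed from the description above):

    /* Inside the lookup loop: stop at the end of the range instead of
     * scanning into keys that belong to other snapshots/subvolumes. */
    struct bkey_s_c k = bch2_btree_iter_peek_upto(&iter, POS(inum, U64_MAX));

    ret = bkey_err(k);
    if (!ret && !k.k)
            break;          /* nothing left in the range we care about */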

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:27 -04:00
Kent Overstreet
f7b6ca23b6 bcachefs: BTREE_ITER_WITH_KEY_CACHE
This is the start of cache coherency with the btree key cache - this
adds a btree iterator flag that causes lookups to also check the key
cache when we're iterating over the btree (not iterating over the key
cache).

Note that we could still race with another thread creating an item in
the key cache and updating it, since we aren't holding the key cache
locked if it wasn't found. The next patch for the update path will
address this by causing the transaction to restart if the key cache is
found to be dirty.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:23 -04:00
Kent Overstreet
ce91abd60b bcachefs: bch2_btree_path_set_pos()
bch2_btree_path_set_pos() is now available outside of btree_iter.c

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:23 -04:00
Kent Overstreet
1f2d919250 bcachefs: iter->update_path
With BTREE_ITER_FILTER_SNAPSHOTS, we have to distinguish between the
path where the key was found, and the path for inserting into the
current snapshot. This adds a new field to struct btree_iter for saving
a path for the current snapshot, and plumbs it through
bch2_trans_update().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:22 -04:00
Kent Overstreet
a1e82d35f8 bcachefs: Refactor bch2_btree_iter()
This splits bch2_btree_iter() up into two functions: an inner function
that handles BTREE_ITER_WITH_JOURNAL, BTREE_ITER_WITH_UPDATES, and
iterating across leaf nodes, and an outer one that implements
BTREE_ITER_FILTER_SNAPSHOTS.

This is prep work for remembering a btree_path at our update position in
BTREE_ITER_FILTER_SNAPSHOTS mode.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:22 -04:00
Kent Overstreet
669f87a5da bcachefs: Switch to __func__ for recording where btree_trans was initialized
Symbol decoding, via %ps, isn't supported in userspace - this will also
be faster when we're using trans->fn in the fast path, as with the new
BCH_JSET_ENTRY_log journal messages.
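
A hedged sketch of the usual way this is done (wrapper names assumed):

    /* Capture the caller's function name as a plain string at the
     * initialization site, instead of decoding a symbol with %ps later. */
    #define bch2_trans_init(_trans, _c, _nr_iters, _mem)            \
            __bch2_trans_init(_trans, _c, _nr_iters, _mem, __func__)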

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:21 -04:00
Kent Overstreet
f3e1f44433 bcachefs: BTREE_ITER_NOPRESERVE
This adds a flag to not mark the initial btree_path as preserve, for
paths that we expect to be cheap to reconstitute if necessary - this
solves a btree_path overflow caused by need_whiteout_for_snapshot().

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:19 -04:00
Kent Overstreet
084d42bbd6 bcachefs: Apply workaround for too many btree iters to read path
Reading from cached data, which calls bch2_bucket_io_time_reset(), is
leading to transaction iterator overflows - this standardizes the
workaround.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:17 -04:00
Kent Overstreet
32b26e8c7f bcachefs: bch2_assert_pos_locked()
This adds a new assertion to be used by bch2_inode_update_after_write(),
which updates the VFS inode based on the update to the btree inode we
just did - we require that the btree inode still be locked when we do
that update.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:16 -04:00
Kent Overstreet
9a74f63c97 bcachefs: path->should_be_locked fixes
- We should only be clearing should_be_locked in btree_path_set_pos() -
   it's the responsibility of the btree_path code, not the btree_iter
   code.

 - bch2_path_put() needs to pay attention to path->should_be_locked, to
   ensure we don't drop locks we're supposed to be keeping.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:16 -04:00
Kent Overstreet
f527afea5a bcachefs: Fix upgrade_readers()
The bch2_btree_path_upgrade() call was failing and tripping an assert -
path->level + 1 is in this case not necessarily exactly what we want;
fix it by upgrading exactly the locks we want.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:15 -04:00
Kent Overstreet
d121172561 bcachefs: More general fix for transaction paths overflow
for_each_btree_key() now calls bch2_trans_begin() as needed; that means
we can also call it when we're in danger of overflowing transaction
paths.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:15 -04:00
Kent Overstreet
b0d1b70af8 bcachefs: Must check for errors from bch2_trans_cond_resched()
But we don't need to call it from outside the btree iterator code
anymore, since it's called by bch2_trans_begin() and
bch2_btree_path_traverse().

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:14 -04:00
Kent Overstreet
e5fa91d7ac bcachefs: Fix restart handling in for_each_btree_key()
Code that uses for_each_btree_key often wants transaction restarts to be
handled locally and not returned. Originally, we wouldn't return
transaction restarts if there was a single iterator in the transaction -
the reasoning being if there weren't other iterators being invalidated,
and the current iterator was being advanced/retraversed, there weren't
any locks or iterators we were required to preserve.

But with the btree_path conversion that approach doesn't work anymore -
even when we're using for_each_btree_key() with a single iterator there
will still be two paths in the transaction, since we now always preserve
the path at the pos the iterator was initialized at - the reason being
that on restart we often restart from the same place.

And it turns out there's now a lot of for_each_btree_key() uses that _do
not_ want transaction restarts handled locally, and should be returning
them.

This patch splits out for_each_btree_key_norestart() and
for_each_btree_key_continue_norestart(), and converts existing users as
appropriate. for_each_btree_key(), for_each_btree_key_continue(), and
for_each_btree_node() now handle transaction restarts themselves by
calling bch2_trans_begin() when necessary - and the old hack to not
return transaction restarts when there's a single path in the
transaction has been deleted.
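
A hedged sketch of the caller-visible difference (macro arguments assumed):

    /* With the _norestart variant the restart is returned to the caller
     * rather than being handled by the iteration macro itself. */
    for_each_btree_key_norestart(trans, iter, BTREE_ID_extents,
                                 start, 0, k, ret) {
            /* per-key work */
    }
    if (ret == -EINTR)
            goto retry;     /* caller decides how to restart */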

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:14 -04:00
Kent Overstreet
9a796fdb06 bcachefs: bch2_trans_exit() no longer returns errors
Now that peek_node()/next_node() are converted to return errors
directly, we don't need bch2_trans_exit() to return errors - it's
cleaner this way and wasn't used much anymore.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:14 -04:00
Kent Overstreet
d355c6f4f7 bcachefs: for_each_btree_node() now returns errors directly
This changes for_each_btree_node() to work like for_each_btree_key(),
and to that end bch2_btree_iter_peek_node() and next_node() also return
error ptrs.
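
A hedged sketch of what callers now look like:

    /* peek_node()/next_node() return error pointers, so errors (including
     * transaction restarts) are checked directly on the returned node. */
    struct btree *b = bch2_btree_iter_peek_node(&iter);
    int ret = PTR_ERR_OR_ZERO(b);

    if (ret)
            goto err;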

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:14 -04:00
Kent Overstreet
c075ff700f bcachefs: BTREE_ITER_FILTER_SNAPSHOTS
For snapshots, we need to implement btree lookups that return the first
key that's an ancestor of the snapshot ID the lookup is being done in -
and filter out keys in unrelated snapshots. This patch adds the btree
iterator flag BTREE_ITER_FILTER_SNAPSHOTS which does that filtering.
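
A hedged sketch of the filtering idea (helper name assumed from the
bcachefs snapshot code):

    /* Keep only keys whose snapshot is an ancestor of (or equal to) the
     * snapshot the lookup is being done in; skip everything else. */
    if (!bch2_snapshot_is_ancestor(c, iter->snapshot, k.k->p.snapshot))
            continue;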

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:12 -04:00
Kent Overstreet
db92f2ea5e bcachefs: Optimize btree lookups in write path
This patch significantly reduces the number of btree lookups required in
the extent update path.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:12 -04:00
Kent Overstreet
1d3ecd7ea7 bcachefs: Tighten up btree locking invariants
New rule is: if a btree path holds any locks it should be holding
precisely the locks wanted (according to path->level and
path->locks_want).

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:11 -04:00
Kent Overstreet
068bcaa589 bcachefs: Add more assertions for locking btree iterators out of order
btree_path_traverse_all() traverses btree iterators in sorted order, and
thus shouldn't see transaction restarts due to potential deadlocks - but
sometimes we do. This patch adds some more assertions and tracks some
more state to help track this down.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:11 -04:00
Kent Overstreet
67e0dd8f0d bcachefs: btree_path
This splits btree_iter into two components: btree_iter is now the
externally visible component, and it points to a btree_path which is now
reference counted.

This means we no longer have to clone iterators up front if they might
be mutated - btree_path can be shared by multiple iterators, and cloned
if an iterator would mutate a shared btree_path. This will help us use
iterators more efficiently, as well as slimming down the main long lived
state in btree_trans, and significantly cleans up the logic for iterator
lifetimes.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:11 -04:00
Kent Overstreet
deb0e573b4 bcachefs: Kill BTREE_ITER_NEED_PEEK
This was used for an optimization that hasn't existed in quite a while -
iter->uptodate will probably be going away as well.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:11 -04:00
Kent Overstreet
a0a568794d bcachefs: More renaming
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:11 -04:00
Kent Overstreet
f7a966a3e2 bcachefs: Clean up/rename bch2_trans_node_* fns
These utility functions are for managing btree node state within a
btree_trans - rename them for consistency, and drop some unneeded
arguments.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:11 -04:00
Kent Overstreet
78cf784eaa bcachefs: Further reduce iter->trans usage
This is prep work for splitting btree_path out from btree_iter -
btree_path will not have a pointer to btree_trans.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:11 -04:00
Kent Overstreet
9f6bd30703 bcachefs: Reduce iter->trans usage
Disfavoured, and should go away.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:10 -04:00
Kent Overstreet
84841b0d13 bcachefs: bch2_dump_trans_iters_updates()
This factors out bch2_dump_trans_iters_updates() from the iter alloc
overflow path, and makes some small improvements to what it prints.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:10 -04:00
Kent Overstreet
0423fb7185 bcachefs: Keep a sorted list of btree iterators
This will be used to make other operations on btree iterators within a
transaction more efficient, and enable some other improvements to how we
manage btree iterators.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:10 -04:00
Kent Overstreet
955af63441 bcachefs: __bch2_trans_commit() no longer calls bch2_trans_reset()
It's now the caller's responsibility to call bch2_trans_begin.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:09 -04:00
Kent Overstreet
e5af273fce bcachefs: trans->restarted
Start tracking when btree transactions have been restarted - and assert
that we're always calling bch2_trans_begin() immediately after
transaction restart.
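
A hedged sketch of the invariant this enables (helper name hypothetical):

    /* Once a restart error has been returned, the only legal next step is
     * bch2_trans_begin(), which clears the flag. */
    static inline void verify_not_restarted(struct btree_trans *trans)
    {
            EBUG_ON(trans->restarted);
    }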

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:09 -04:00
Kent Overstreet
a88171c9e6 bcachefs: Clean up interior update paths
Btree node merging now happens prior to transaction commit, not after,
so we don't need to pay attention to BTREE_INSERT_NOUNLOCK.

Also, foreground_maybe_merge shouldn't be calling
bch2_btree_iter_traverse_all() - this is becoming private to the btree
iterator code and should only be called by bch2_trans_begin().

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:09 -04:00
Kent Overstreet
6e075b54a3 bcachefs: bch2_btree_iter_relock_intent()
This adds a new helper for btree_cache.c that does what we want while
the iterator is still being traversed - and also eliminates some
unnecessary transaction restarts.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:09 -04:00
Kent Overstreet
9f1833cadd bcachefs: Update btree ptrs after every write
This closes a significant hole (and last known hole) in our ability to
verify metadata. Previously, since btree nodes are log structured, we
couldn't detect lost btree writes that weren't the first write to a
given node. Additionally, this seems to have led to some significant
metadata corruption on multi device filesystems with metadata
replication: since a write may have made it to one device and not
another, if we read that btree node back from the replica that did have
that write and started appending after that point, the other replica
would have a gap in the bset entries and reading from that replica
wouldn't find the rest of the bsets.

But, since updates to interior btree nodes are now journalled, we can
close this hole by updating pointers to btree nodes after every write
with the currently written number of sectors, without negatively
affecting performance. This means we will always detect lost or corrupt
metadata - it also means that our btree is now a curious hybrid of COW
and non COW btrees, with all the benefits of both (excluding
complexity).

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:08 -04:00
Kent Overstreet
5aab663534 bcachefs: Tighten up btree_iter locking assertions
We weren't correctly verifying that we had interior node intent locks -
this patch also fixes bugs uncovered by the new assertions.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:08 -04:00
Dan Robertson
d38494c462 bcachefs: docs: add docs for bch2_trans_reset
Add basic kernel docs for bch2_trans_reset and bch2_trans_begin.

Signed-off-by: Dan Robertson <dan@dlrobertson.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:08 -04:00
Kent Overstreet
8c3f6da9fc bcachefs: Improve iter->should_be_locked
Adding iter->should_be_locked introduced a regression where it ended up
not being set on the iterator passed to bch2_btree_update_start(), which
is definitely not what we want.

This patch requires it to be set when calling bch2_trans_update(), and
adds various fixups to make that happen.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:06 -04:00
Kent Overstreet
953ee28a3e bcachefs: Kill bch2_btree_iter_peek_cached()
It's now been rolled into bch2_btree_iter_peek_slot().

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:06 -04:00
Kent Overstreet
5288e66a7b bcachefs: BTREE_ITER_WITH_UPDATES
This drops bch2_btree_iter_peek_with_updates() and replaces it with a
new flag, BTREE_ITER_WITH_UPDATES, and also reworks
bch2_btree_iter_peek_slot() to respect it too.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:05 -04:00
Kent Overstreet
509d3e0a8d bcachefs: Child btree iterators
This adds the ability for btree iterators to own child iterators - to be
used by an upcoming rework of bch2_btree_iter_peek_slot(), so we can
scan forwards while maintaining our current position.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:05 -04:00