Commit Graph

1217050 Commits

Author SHA1 Message Date
Kent Overstreet
fd211bc71c bcachefs: Don't set should_be_locked on paths that aren't locked
It doesn't make any sense to set should_be_locked on btree_paths that
aren't locked, and is often a bug - this patch adds assertions and fixes
some of those bugs.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
31301dd469 bcachefs: Fix missing error handling in bch2_subvolume_delete()
This fixes an assertion when the transaction has been unexpectedly
restarted.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
15f11c1aa8 bcachefs: Improve an error message
Update an error message to use bch2_err_str().

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
49e401fa55 bcachefs: Tracepoint improvements
- use strlcpy(), not strncpy()
 - add tracepoints for btree_path alloc and free
 - give the tracepoint for key cache upgrade fail a proper name
 - add a tracepoint for btree_node_upgrade_fail

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
17047fbced bcachefs: Fix incorrectly freeing btree_path in alloc path
Clearing path->preserve means the path will be dropping in
bch2_trans_begin() - but on transaction restart, we're likely to need
that path again.

This fixes a livelock in the allocation path.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
86b7445193 bcachefs: Fix bch2_btree_trans_to_text()
bch2_btree_trans_to_text() is used to print btree_transactions owned by
other threads; thus, it needs to be particularly careful. This fixes a
null ptr deref caused by racing with the owning thread changing
path->l[].b.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
ae33e7a274 bcachefs: Add distinct error code for key_cache_upgrade
This aids in debugging.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
4a7f7e9e4d bcachefs: Fix not punting to worqueue when promoting
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
0763c552e7 bcachefs: fsck: Fix nested transaction handling
This uses the new trans->restart count to make sure we always correctly
return -BCH_ERR_transaction_restart_nested when we restart a nested
transaction - eliminating some other hacks and preparing for new
assertions.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
91f1b9fdd2 bcachefs: Add an O_DIRECT option (for userspace)
Sometimes we see IO errors due to O_DIRECT alignment issues - having an
option to use buffered IO will be helpful.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
01eed77178 bcachefs: Tighten up btree_path assertions
Currently seeing a very rare and difficult to explain btree_path
inconsistency - this patch adds assertions to the only place that seems
to be missing them.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
db346e7120 bcachefs: bch2_bucket_alloc_trans_early -> for_each_btree_key_norestart
Nested btree transactions require special care, and an upcoming patch is
going to add assertions to that effect. We don't want to be using them
unnecessarily, so this patch switches bch2_bucket_trans_early() to not
handle transaction restarts.

This patch also adds a cursor so that on transaction restart we can
continue scanning from where the previous search for an empty bucket
left off.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:37 -04:00
Kent Overstreet
7903e3d2d7 bcachefs: Fix check_i_sectors()
bch2_count_inode_sectors() uses for_each_btree_key() internally, which
handles lock restarts - the lockrestart_do() in check_i_sectors() is
redundant, and buggy here since the count that
bch2_count_inode_sectors() returns was interpreted as an error by
lockrestart_do().

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
d7228ecc48 bcachefs: Convert debugfs code to for_each_btree_key2()
This fixes a bug where we were leaking a transaction restart error to
userspace.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
84ece59ad5 bcachefs: Unit test updates
- Convert to for_each_btree_key2(), for_each_btree_key_commit(),
   for_each_btree_key_reverse()
 - No more bare bch2_btree_iter_peek(); we're now fault-injection lock
   restarts, so we always need a lockrestart_do() or equivalent.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
4f84b7e30b bcachefs: for_each_btree_key_reverse()
This adds a new macro, like for_each_btree_key2(), but for iterating in
reverse order.

Also, change for_each_btree_key2() to properly check the return value of
bch2_btree_iter_advance().

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
1ed0a5d280 bcachefs: Convert fsck errors to errcode.h
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
a0cb8d784f bcachefs: Inject transaction restarts in debug mode
In CONFIG_BCACHEFS_DEBUG mode, we'll now randomly issue transaction
restarts - with a decaying probability based on the number of restarts
we've already had, to ensure that transactions eventually make forward
progress. This should help shake out some bugs.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:37 -04:00
Kent Overstreet
549d173c1b bcachefs: EINTR -> BCH_ERR_transaction_restart
Now that we have error codes, with subtypes, we can switch to our own
error code for transaction restarts - and even better, a distinct error
code for each transaction restart reason: clearer code and better
debugging.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:37 -04:00
Kent Overstreet
0990efaeea bcachefs: btree_trans_too_many_iters() is now a transaction restart
All transaction restarts need a tracepoint - this is essential for
debugging

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:36 -04:00
Kent Overstreet
90cecb921c bcachefs: Prevent a btree iter overflow in alloc path
In bch2_bucket_alloc_trans(), we're iterating over buckets - but not
directly with an iterator, since we're iterating over the freespace
btree.

This means that we need to clear iter->path->preserve, otherwise we'll
end up retaining a btree_path for every alloc key we touched - which is
not what we want here.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Kent Overstreet
d4bf5eecd7 bcachefs: Use bch2_err_str() in error messages
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Kent Overstreet
615f867c14 bcachefs: Improved errcodes
Instead of overloading standard error codes (EINTR/EAGAIN), and defining
short lists of error codes in multiple places that potentially end up
overlapping & conflicting, we're now going to have one master list of
error codes.

Error codes are defined with an x-macro: thus we also have
bch2_err_str() now.

Also, error codes have a class field. Now, instead of checking for
errors with ==, code should use bch2_err_matches(), which returns true
if the error is equal to or a sub-error of the error class.

This means we can define unique errors for every source location where
an error is generated, which will help improve our error messages.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:36 -04:00
Kent Overstreet
3ab25c1b4e bcachefs: We can handle missing btree roots for all alloc btrees
We can rebuild alloc info if these btree roots are missing - no need to
bail out and say the filesystem is unrecoverable

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:36 -04:00
Kent Overstreet
b962552eab bcachefs: Fix should_invalidate_buckets()
Like bch2_copygc_wait_amount, should_invalidate_buckets() needs to try
to ensure that there are always more buckets free than the largest
reserve.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Kent Overstreet
175379db20 bcachefs: ec_stripe_bkey_insert() -> for_each_btree_key_norestart()
With the upcoming patches to add assertions for incorrect nested
transaction restart handling, this code is now bogus. Switch it to
for_each_btree_key_norestart() so that transaction restarts are only
handled in one place.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Kent Overstreet
0a5156334c bcachefs: Convert erasure coding to for_each_btree_key_commit()
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Kent Overstreet
e941ae7d3a bcachefs: Add a counter for btree_trans restarts
This will help us improve nested transactions - we need to add
assertions that whenever an inner transaction handles a restart, it
still returns -EINTR to the outer transaction.

This also adds nested_lockrestart_do() and nested_commit_do() which use
the new counters to correctly return -EINTR when the transaction was
restarted.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Kent Overstreet
445d184af2 bcachefs: Convert alloc code to for_each_btree_key_commit()
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Kent Overstreet
6738dd19db bcachefs: Convert subvol code to for_each_btree_key_commit()
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Kent Overstreet
8933315689 bcachefs: Convert bch2_dev_usrdata_drop() to for_each_btree_key2()
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:36 -04:00
Kent Overstreet
d04801a0f4 bcachefs: Convert bch2_do_invalidates_work() to for_each_btree_key2()
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Kent Overstreet
dadecd02c4 bcachefs: bch2_trans_run()
This adds a new helper, bch2_trans_run(), that runs a function with a
btree_transaction context but without handling transaction restarts.
We're adding checks for nested transaction restart handling: when an
inner transaction handles a transaction restart it will still have to
return it to the outer transaction, or else assertions will be popped in
the outer transaction.

But some places don't need restart handling at the outer scope, so this
helper does what they need.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Kent Overstreet
326568f18c bcachefs: Convert bch2_gc_done() for_each_btree_key2()
This converts bch2_gc_stripes_done() and bch2_gc_reflink_done() to the
new for_each_btree_key_commit() macro.

The new for_each_btree_key2() and for_each_btree_key_commit() macros
handles transaction retries, allowing us to avoid nested transactions -
which we want to avoid since they're tricky to do completely correctly
and upcoming assertions are going to be checking for that.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Kent Overstreet
eace11a730 bcachefs: Convert more fsck code to for_each_btree_key2()
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Kent Overstreet
1329c7ce56 bcachefs: Convert more quota code to for_each_btree_key2()
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Kent Overstreet
1615505cdf bcachefs: Convert bch2_check_lrus() to for_each_btree_key_commit()
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Kent Overstreet
ca91f40ff7 bcachefs: Convert bch2_dev_freespace_init() to for_each_btree_key_commit()
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Kent Overstreet
4910a9506c bcachefs: Convert bch2_do_discards_work() to for_each_btree_key2()
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:36 -04:00
Kent Overstreet
8ef9831399 bcachefs: Improve bucket_alloc_fail tracepoint
We should be printing the number of free buckets, not just the number of
available buckets.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:36 -04:00
Kent Overstreet
f501ad2b81 bcachefs: bch2_mark_alloc(): Do wakeups after updating usage
We have an obvious wake up race if we do the wakeup _before_ updating
the counters the thing doing the waiting is reading.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:36 -04:00
Daniel Hill
c807ca95a6 bcachefs: added lock held time stats
We now record the length of time btree locks are held and expose this in debugfs.

Enabled via CONFIG_BCACHEFS_LOCK_TIME_STATS.

Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:35 -04:00
Daniel Hill
25055c690f bcachefs: bch2_time_stats_to_text now indents properly
Printbufs indentation feature doesn't yet work with '\n' and '\t'. So we've
replaced all instances of '\n' with prt_newline.

Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:35 -04:00
Daniel Hill
8bfe14e86a bcachefs: lock time stats prep work.
We need the caller name and a place to store our results, btree_trans provides this.

Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:35 -04:00
Kent Overstreet
43de721a33 bcachefs: Unlock in bch2_trans_begin() if we've held locks more than 10us
We try to ensure we never hold btree locks for too long - bcachefs tries
to be soft realtime. This adds a check when restarting a transaction,
where a transaction restart is cheap - if we've been holding locks for
too long, drop and retake them.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:35 -04:00
Kent Overstreet
a1783320d4 bcachefs: for_each_btree_key2()
This introduces two new macros for iterating through the btree, with
transaction restart handling
 - for_each_btree_key2()
 - for_each_btree_key_commit()

Every iteration is now in an implicit transaction, and - as with
lockrestart_do() and commit_do() - returning -EINTR will cause the
transaction to be restarted, at the same key.

This patch converts a bunch of code that was open coding this to these
new macros, saving a substantial amount of code.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:35 -04:00
Kent Overstreet
0d06b4eca6 bcachefs: Fix repair for extent past end of inode
When we find an extent past an inode's i_size, we need to do the
deletion in the inode's snapshot (which will emit a whiteout if
necessary); and we also need to note that we now have an a key at that
position and snapshot, so that we don't go into an infinite loop.

Also, switch to walking inodes in reverse older, oldest snapshot to
newest, so that we emit the fewest whiteouts possible.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:35 -04:00
Kent Overstreet
c7a09cb1b1 bcachefs: When fsck finds redundant snapshot keys, trigger snapshots cleanup
Fsck now checks for keys in different snapshot IDs that are now
redundant due to other snapshots being deleted - it needs to for its own
algorithms to not get confused.

When it detects this it should re-run the post snapshot deletion cleanup
- this patch does that.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:35 -04:00
Kent Overstreet
35f1a5034d bcachefs: Improve fsck for subvols/snapshots
- Bunch of refactoring, and move some code out of
   bch2_snapshots_start() and into bch2_snapshots_check(), for constency
   with the rest of fsck

 - Interior snapshot nodes no longer point to a subvolume; this is so we
   don't end up with dangling subvol references when deleting or require
   scanning the full snapshots btree.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:35 -04:00
Kent Overstreet
49124d8a7f bcachefs: Improve snapshots_seen
This makes the snapshots_seen data structure fsck private and improves
it; we now also track the equivalence class for each snapshot id we've
seen, which means we can detect when snapshot deletion hasn't finished
or run correctly (which will otherwise confuse fsck).

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:35 -04:00