Previously, when a reflink pointer pointed to a missing indirect extent, we
replaced the entire reflink pointer with an error key. This patch replaces
only the missing range with an error key.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This improves __bch2_trans_commit - early in the recovery process, when
we're running btree_gc and before we want to go RW, it now uses
bch2_journal_key_insert() to add the update to the list of updates for
journal replay to do, instead of btree_gc having to use separate
interfaces depending on whether we're running at bringup or, later,
runtime.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This patch was originally to work around the journal getting stuck in
nochanges mode - but that was just a hack, we needed to fix the actual
bug. It should be fixed now, so revert it.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The journal can get stuck if we need to get a journal reservation for
something we have a pre-reservation for, but aren't able to reclaim
space, or if the pin fifo is full - it's impractical to resize the pin
fifo at runtime.
Previously, we reserved 8 entries in the pin fifo for pre-reservations,
but that seems small - we're seeing the journal occasionally get stuck.
Let's reserve a quarter of it.
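As a standalone illustration of the sizing change (the fifo size below is
made up, not taken from the source):

    #include <stdio.h>

    int main(void)
    {
        unsigned fifo_size    = 1024;          /* hypothetical pin fifo size */
        unsigned old_reserved = 8;             /* previous fixed reservation */
        unsigned new_reserved = fifo_size / 4; /* a quarter of the fifo */

        printf("entries reserved for pre-reservations: old %u, new %u\n",
               old_reserved, new_reserved);
        return 0;
    }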
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
BTREE_NODE_SEQ() is supposed to give us a time ordering of btree nodes
on disk, so that we can tell which btree node is newer if we ever have
to scan the entire device to find btree nodes.
The btree node merge path wasn't setting it correctly on the new node -
oops.
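A minimal sketch of the intended invariant, using hypothetical types and
field names rather than the bcachefs code: the node produced by a merge
should carry a sequence number greater than either input, so a raw device
scan can tell it is the newer node.

    struct example_node {
        unsigned long seq;      /* stand-in for BTREE_NODE_SEQ() */
    };

    static void set_merged_node_seq(struct example_node *n,
                                    const struct example_node *l,
                                    const struct example_node *r)
    {
        unsigned long max_seq = l->seq > r->seq ? l->seq : r->seq;

        n->seq = max_seq + 1;   /* strictly newer than both merged nodes */
    }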
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Long ago it was possible to get a journal reservation and not use it,
but that's no longer allowed, which means journal_write_compact() has
very little work to do, and isn't really worth the code anymore.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This helps with lock contention in the journalling code: instead of
updating our journal pin on every write, only get a journal pin if we
don't have one.
This means we can avoid hammering on journal locks nearly so much, at
the cost of carrying around a journal pin for an older entry than the
one we actually need. To handle that, if needed we update our journal
pin to the correct one when flushed by journal reclaim.
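A hedged pseudocode sketch of the idea, with hypothetical types and names
rather than the real bcachefs journal pin interfaces: take a pin only when
none is held, and let reclaim's flush callback advance it later.

    struct example_pin {
        unsigned long seq;          /* 0 means no pin held */
    };

    static void maybe_take_pin(struct example_pin *pin, unsigned long want_seq,
                               void (*pin_add)(unsigned long seq))
    {
        if (pin->seq)               /* already pinned, possibly an older entry */
            return;

        pin_add(want_seq);          /* the only path that touches journal locks */
        pin->seq = want_seq;
    }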
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Now, when outputting to printbufs, we can set tabstops and left or right
justify text to them - this is to be used by the userspace 'bcachefs fs
usage' command.
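A standalone illustration of the idea using plain printf field widths, not
the printbuf API itself: left justifying labels and right justifying values
to a fixed column is what gives 'bcachefs fs usage' aligned output.

    #include <stdio.h>

    int main(void)
    {
        /* hypothetical report lines; the column widths stand in for tabstops */
        printf("%-16s%10s\n", "capacity:", "1.0 TiB");
        printf("%-16s%10s\n", "used:",     "512 GiB");
        return 0;
    }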
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
In move_read_endio, we were checking whether the next pending write had its
read completed - but this could turn into a use after free (and we were
accessing the list without a lock), so it's better to just do the wakeup
unconditionally.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This patch improves the superblock .to_text() methods and adds methods
for all types that were missing them. It also improves printbufs by
allowing them to specify what units we want to be printing in, and adds
new wrapper methods for unifying our kernel and userspace environments.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
bch_scnmemcpy was for printing length-limited strings that might not
have a terminating null - turns out sprintf & pr_buf can do this with
%.*s.
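For example, a buffer with no terminating NUL can be printed safely by
passing an explicit length:

    #include <stdio.h>

    int main(void)
    {
        char buf[4] = { 'a', 'b', 'c', 'd' };   /* not null-terminated */

        /* %.*s prints at most the given number of bytes, so no
         * terminating NUL is needed */
        printf("%.*s\n", (int) sizeof(buf), buf);
        return 0;
    }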
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
When the nochanges option is selected, we're supposed to never issue
writes. Unfortunately, it seems discards were missed when implementing
this, leading to some painful filesystem corruption.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Add an option that tells recovery to only read the journal, to be used
by the list_journal command.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This is prep work for the next patch, which is going to change
__bch2_trans_commit() to use bch2_journal_key_insert() when very early
in the recovery process, so that we have a unified interface for doing
btree updates.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
When viewing what's in the journal, it's more useful to have the logical
location - journal bucket and offset within that bucket - than just the
offset on that device.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Apparently it actually is possible for crypto_skcipher_encrypt() to
return an error - not sure why that would be - but we need to replace
our assertion with actual error handling.
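A minimal sketch of the change, assuming a prepared skcipher_request (the
wrapper name is hypothetical, not the bcachefs function): check and
propagate the return value of crypto_skcipher_encrypt() instead of asserting
that it is zero.

    #include <crypto/skcipher.h>

    static int example_do_encrypt(struct skcipher_request *req)
    {
        int ret = crypto_skcipher_encrypt(req);

        if (ret)                /* previously: assumed to never happen */
            return ret;
        return 0;
    }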
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The error code when we fail to allocate a node in the btree node cache
doesn't make it to bch2_btree_path_traverse_all(). Instead, we need to
stash a flag in btree_trans so we know we have to take the cannibalize
lock.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
bch2_dev_lookup() is used from the extended attribute set methods, for
setting the target options, where we're already holding an inode lock -
it turns out pathname lookups also take inode locks, so that was
susceptible to deadlocks.
Fortunately we already stash the device name in ca->name. This does
change user-visible behaviour though: instead of specifying e.g.
/dev/sda1, users must now specify sda1.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Before we had dedicated gc code for bucket->oldest_gen this was
btree_gc's responsibility, but now that we have that we can rip it out,
simplifying the already overcomplicated btree_gc.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This improves the formatting of journal_entry_btree_keys_to_text() by
putting each key on its own line.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The loop that traverses paths in traverse_all() needs to be a little bit
tricky, because traversing a path can cause other paths to be added (or
perhaps removed) at about the same position.
The old logic was buggy, replace it with simpler logic.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Some of our tracepoints were calling snprintf("%pS") - which does symbol
table lookups - in TP_fast_assign(), which turns out to be a really bad
idea.
This was done because perf trace wasn't correctly printing tracepoints
that use %pS anymore - but it turns out trace-cmd does handle it
correctly.
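A hedged fragment of the general pattern, as it might appear in a
hypothetical trace header (the event name and arguments are made up): store
only the raw instruction pointer in TP_fast_assign() and leave the %pS
symbol lookup to TP_printk(), which runs at print time.

    TRACE_EVENT(example_transaction_restart,
        TP_PROTO(unsigned long caller_ip),
        TP_ARGS(caller_ip),

        TP_STRUCT__entry(
            __field(unsigned long, caller_ip)
        ),

        TP_fast_assign(
            __entry->caller_ip = caller_ip;    /* no snprintf("%pS") here */
        ),

        TP_printk("%pS", (void *) __entry->caller_ip)
    );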
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Since we retry reads when we discover we read from a pointer that went
stale, if a dirty pointer is erroneously stale it would cause us to loop
retrying that read forever - unless we check before issuing the read,
while the btree is still locked, when we know that a dirty pointer
should never be stale.
This patch adds that check, along with printing some helpful debug info.
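A hedged sketch of the shape of the check, with hypothetical types and
fields rather than the real extent pointer code: while the btree is still
locked, a stale dirty pointer is a hard error rather than something to
retry.

    struct example_ptr {
        bool cached;
        bool stale;
    };

    static int check_ptrs_before_read(const struct example_ptr *ptrs, unsigned nr)
    {
        for (unsigned i = 0; i < nr; i++)
            if (!ptrs[i].cached && ptrs[i].stale) {
                /* a dirty pointer must never be stale while we hold the
                 * btree locks - print debug info and bail out instead of
                 * retrying the read forever */
                return -EIO;
            }

        return 0;
    }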
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
__bch2_btree_node_lock() was implementing the wrong lock ordering for
cached vs. non cached paths - this fixes it to match the btree path sort
order as defined by __btree_path_cmp(), and also simplifies the code
some.
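A hedged sketch of what a consistent sort order looks like (the field order
here is illustrative; the authoritative order is whatever __btree_path_cmp()
implements): lock acquisition simply has to follow the same total order for
cached and non-cached paths.

    struct example_path {
        unsigned      btree_id;
        unsigned      cached;   /* key cache paths sort separately */
        unsigned long pos;
    };

    static int example_path_cmp(const struct example_path *l,
                                const struct example_path *r)
    {
        if (l->btree_id != r->btree_id)
            return l->btree_id < r->btree_id ? -1 : 1;
        if (l->cached != r->cached)
            return l->cached < r->cached ? -1 : 1;
        return l->pos < r->pos ? -1 : l->pos > r->pos;
    }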
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This consolidates some of the btree node lock path, so that when we're
blocked taking a write lock on a node it shows up in
bch2_btree_trans_to_text(), along with intent and read locks.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We were emitting two trace events on transaction restart in this code
path - delete the redundant one.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We need to ensure we don't have any btree locks held when calling
do_pending_writes() - besides issuing IOs, upcoming allocator changes
will have allocations doing btree lookups directly.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The old .debugcheck methods are no more and this just calls the .invalid
method, which doesn't add much since we already check that when doing
btree updates and when reading metadata in.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The check_dirents pass handles transaction restarts at the toplevel -
check_subdir_count() was incorrectly handling transaction restarts
without returning -EINTR, meaning that the iterator pointing to the
dirent being checked was left invalid.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The reflink repair code was incorrectly inserting a nonzero deleted key
via journal replay - this is due to bch2_journal_key_insert() being
somewhat hacky, and so this fix is also hacky for now.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Like the previous patches, this converts bch2_gc_gens() to use the alloc
btree directly, and private arrays of generation numbers for its own
recalculation of oldest_gen.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This converts the copygc code to use the alloc btree directly to find
buckets that need to be evacuated instead of the in-memory bucket array,
which is finally going away soon.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This changes the btree_gc code to only use the second bucket array, the
one dedicated to GC. On completion, it compares what's in its in-memory
bucket array to the allocation information in the btree and writes it
directly, instead of updating the main in-memory bucket array and
writing that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
- Updates to non key cache iterators will now be transparently
redirected to the key cache for cached btrees.
- Except when creating new keys: then the update goes to the underlying
btree.
For iterating over a cached btree to work, we need to ensure that if
a key exists in the key cache, it also exists in the btree - otherwise
the iterator code will skip past it and not check the key cache.
Otherwise, for consistency, all updates should go to the same place -
the key cache.
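A hedged pseudocode sketch of the dispatch rule described above, with
hypothetical names: updates to a cached btree go through the key cache,
except that a newly created key must first land in the underlying btree so
iteration can find it.

    static int dispatch_update(bool btree_is_cached, bool key_exists_in_btree,
                               int (*update_key_cache)(void),
                               int (*update_btree)(void))
    {
        if (btree_is_cached && key_exists_in_btree)
            return update_key_cache();  /* normal case: go through the key cache */

        return update_btree();          /* new keys go to the underlying btree */
    }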
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This is the start of cache coherency with the btree key cache - this
adds a btree iterator flag that causes lookups to also check the key
cache when we're iterating over the btree (not iterating over the key
cache).
Note that we could still race with another thread creating an item in
the key cache and updating it, since we aren't holding the key cache
locked if it wasn't found. The next patch for the update path will
address this by causing the transaction to restart if the key cache is
found to be dirty.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Previously, when doing updates and running triggers before journal
replay completes, triggers would see the incorrect key for the old key
being overwritten - this patch updates the trigger code to check the
journal keys when necessary, needed for the upcoming allocator rewrite.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We currently need to call bch2_btree_path_peek_slot() multiple times in
the transaction commit path - and some of those need to be updated to
also check the keys from journal replay, too. Let's consolidate this and
stash the key being overwritten in btree_insert_entry.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Add a new helper that returns true if the given btree ID uses the btree
key cache. This enables some new cleanups, since the helper can check
the options for whether caching is enabled on a given btree.
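A hedged sketch of such a predicate with made-up identifiers (not the actual
helper): concentrating the per-btree decision in one place is what enables
the cleanups.

    enum example_btree_id {
        EXAMPLE_BTREE_alloc,
        EXAMPLE_BTREE_inodes,
        EXAMPLE_BTREE_extents,
    };

    static bool example_btree_uses_key_cache(enum example_btree_id id)
    {
        switch (id) {
        case EXAMPLE_BTREE_alloc:
        case EXAMPLE_BTREE_inodes:
            return true;        /* cached btrees in this sketch */
        default:
            return false;
        }
    }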
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
btree_key_cache_flush_pos() uses BTREE_ITER_CACHED_NOFILL - but it
wasn't checking for !ck->valid. It does check for the entry being dirty,
so it shouldn't matter, but this patch refactors it a bit and adds an
assertion.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We were double-freeing old_buckets and not freeing old_buckets_gens:
also, the code was supposed to free buckets, not old_buckets;
old_buckets is only needed because we have to use rcu_assign_pointer()
instead of swap(), and won't be set if we hit the error path.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
These nodes aren't reachable by other threads, so there's no need to
keep them locked - and this fixes a bug with the assertion in
bch2_trans_unlock() firing on transaction restart.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Change the error messages in bch2_inconsistent_error() and
bch2_fatal_error() so we can distinguish them.
Also, prefer bch2_fs_fatal_error() (which also logs an error message) to
bch2_fatal_error(), and change a call to bch2_inconsistent_error() to
bch2_fatal_error() when we can't continue.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Previously, bucket fragmentation was considered to be bucket size -
total amount of live data, both dirty and cached.
This meant that if a bucket was full but only a small amount of the data in
it was dirty (the rest cached), we'd get stuck: copygc wouldn't move the
dirty data out of the bucket and the allocator wouldn't be able to
invalidate and drop the cached data.
This changes fragmentation to exclude cached data, so that copygc will
evacuate these buckets and copygc/the allocator will always be able to
make forward progress.
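A worked example with made-up numbers showing why the new definition
unsticks copygc:

    #include <stdio.h>

    int main(void)
    {
        unsigned bucket_size = 512;     /* sectors, hypothetical */
        unsigned dirty       = 32;      /* live, dirty sectors */
        unsigned cached      = 480;     /* clean, cached sectors */

        /* old: cached data counted as live, so a full bucket looked like
         * it had nothing reclaimable */
        unsigned old_frag = bucket_size - (dirty + cached);     /* 0 */

        /* new: only dirty data counts, so copygc sees this bucket as
         * mostly reclaimable and will evacuate it */
        unsigned new_frag = bucket_size - dirty;                /* 480 */

        printf("fragmentation: old %u, new %u\n", old_frag, new_frag);
        return 0;
    }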
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>