iv/linux

Author	SHA1	Message	Date
Kent Overstreet	adfe9357c3	bcachefs: Tweak btree key cache shrinker so it actually frees Freeing key cache items is a multi stage process; we need to wait for an SRCU grace period to elapse, and we handle this ourselves - partially to avoid callback overhead, but primarily so that when allocating we can first allocate from the freed items waiting for an SRCU grace period. Previously, the shrinker was counting the items on the 'waiting for SRCU grace period' lists as items being scanned, but this meant that too many items waiting for an SRCU grace period could prevent it from doing any work at all. After this, we're seeing that items skipped due to the accessed bit are the main cause of the shrinker not making any progress, and we actually want the key cache shrinker to run quite aggressively because reclaimed items will still generally be found (more compactly) in the btree node cache - so we also tweak the shrinker to not count those against nr_to_scan. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-20 17:06:20 -04:00
Kent Overstreet	6e4d9bd110	bcachefs: bkey_cached.btree_trans_barrier_seq needs to be a ulong this stores the SRCU sequence number, which we use to check if an SRCU barrier has elapsed; this is a partial fix for the key cache shrinker not actually freeing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-20 15:15:51 -04:00
Kent Overstreet	ec438ac59d	bcachefs: Fix missing call to bch2_fs_allocator_background_exit() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-20 00:31:59 -04:00
Kent Overstreet	fcdbc1d7a4	bcachefs: Check for journal entries overruning end of sb clean section Fix a missing bounds check in superblock validation. Note that we don't yet have repair code for this case - repair code for individual items is generally low priority, since the whole superblock is checksummed, validated prior to write, and we have backups. Reported-by: lei lu <llfamsec@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-20 00:16:53 -04:00
Kent Overstreet	0389c09b2f	bcachefs: Fix bio alloc in check_extent_checksum() if the buffer is virtually mapped it won't be a single bvec Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-17 17:29:58 -04:00
Kent Overstreet	719aec84b1	bcachefs: fix leak in bch2_gc_write_reflink_key Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-17 17:29:58 -04:00
Kent Overstreet	605109ff5e	bcachefs: KEY_TYPE_error is allowed for reflink KEY_TYPE_error is left behind when we have to delete all pointers in an extent in fsck; it allows errors to be correctly returned by reads later. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-17 17:29:58 -04:00
Kent Overstreet	fa845c7349	bcachefs: Fix bch2_dev_btree_bitmap_marked_sectors() shift Fixes: 27c15ed297cb bcachefs: bch_member.btree_allocated_bitmap Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-17 17:29:53 -04:00
Kent Overstreet	79055f50a6	bcachefs: make sure to release last journal pin in replay This fixes a deadlock when journal replay has many keys to insert that were from fsck, not the journal. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-16 19:14:01 -04:00
Kent Overstreet	fabb4d4985	bcachefs: node scan: ignore multiple nodes with same seq if interior Interior nodes are not really needed, when we have to scan - but if this pops up for leaf nodes we'll need a real heuristic. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-16 19:14:00 -04:00
Nathan Chancellor	9fd5a48a1e	bcachefs: Fix format specifier in validate_bset_keys() When building for 32-bit platforms, for which size_t is 'unsigned int', there is a warning from a format string in validate_bset_keys(): fs/bcachefs/btree_io.c: In function 'validate_bset_keys': fs/bcachefs/btree_io.c:891:34: error: format '%lu' expects argument of type 'long unsigned int', but argument 12 has type 'unsigned int' [-Werror=format=] 891 \| "bad k->u64s %u (min %u max %lu)", k->u64s, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ fs/bcachefs/btree_io.c:603:32: note: in definition of macro 'btree_err' 603 \| msg, ##__VA_ARGS__); \ \| ^~~ fs/bcachefs/btree_io.c:887:21: note: in expansion of macro 'btree_err_on' 887 \| if (btree_err_on(!bkeyp_u64s_valid(&b->format, k), \| ^~~~~~~~~~~~ fs/bcachefs/btree_io.c:891:64: note: format string is defined here 891 \| "bad k->u64s %u (min %u max %lu)", k->u64s, \| ~~^ \| \| \| long unsigned int \| %u cc1: all warnings being treated as errors BKEY_U64s is size_t so the entire expression is promoted to size_t. Use the '%zu' specifier so that there is no warning regardless of the width of size_t. Fixes: 031ad9e7dbd1 ("bcachefs: Check for packed bkeys that are too big") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202404130747.wH6Dd23p-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202404131536.HdAMBOVc-lkp@intel.com/ Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-16 19:11:49 -04:00
Kent Overstreet	02bed83d59	bcachefs: Fix null ptr deref in twf from BCH_IOCTL_FSCK_OFFLINE We need to initialize the stdio redirects before they're used. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-16 19:11:49 -04:00
Kent Overstreet	ad29cf999a	bcachefs: set_btree_iter_dontneed also clears should_be_locked This is part of a larger series cleaning up the semantics of should_be_locked and adding assertions around it; if we don't need an iterator/path anymore, it clearly doesn't need to be locked. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-15 13:31:15 -04:00
Chao Yu	3078e059a5	bcachefs: fix error path of __bch2_read_super() In __bch2_read_super(), if kstrdup() fails, it needs to release memory in sb->holder, fix to call bch2_free_super() in the error path. Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-15 13:31:15 -04:00
Kent Overstreet	f0a73d4fde	bcachefs: Check for backpointer bucket_offset >= bucket size Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-14 20:02:11 -04:00
Kent Overstreet	27c15ed297	bcachefs: bch_member.btree_allocated_bitmap This adds a small (64 bit) per-device bitmap that tracks ranges that have btree nodes, for accelerating btree node scan if it is ever needed. - New helpers, bch2_dev_btree_bitmap_marked() and bch2_dev_bitmap_mark(), for checking and updating the bitmap - Interior btree update path updates the bitmaps when required - The check_allocations pass has a new fsck_err check, btree_bitmap_not_marked - New on disk format version, mi_btree_mitmap, which indicates the new bitmap is present - Upgrade table lists the required recovery pass and expected fsck error - Btree node scan uses the bitmap to skip ranges if we're on the new version Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-14 20:02:11 -04:00
Kent Overstreet	bdae2a7e60	bcachefs: sysfs internal/trigger_journal_flush Add a sysfs knob for immediately flushing the entire journal. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-14 20:02:11 -04:00
Kent Overstreet	e879389f57	bcachefs: Fix bch2_btree_node_fill() for !path We shouldn't be doing the unlock/relock dance when we're not using a path - this fixes an assertion pop when called from btree node scan. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-14 20:02:11 -04:00
Kent Overstreet	8cf2036e7b	bcachefs: add safety checks in bch2_btree_node_fill() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-14 18:01:12 -04:00
Kent Overstreet	d789e9a7d5	bcachefs: Interior known are required to have known key types For forwards compatibility, we allow bkeys of unknown type in leaf nodes; we can simply ignore metadata we don't understand. Pointers to btree nodes must always be of known types, however. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-14 18:01:12 -04:00
Kent Overstreet	bceb86be9e	bcachefs: add missing bounds check in __bch2_bkey_val_invalid() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-14 18:01:12 -04:00
Kent Overstreet	86dbf8c566	bcachefs: Fix btree node merging on write buffer btrees The btree write buffer flush fastpath that avoids the main transaction commit path had the unfortunate side effect of not doing btree node merging. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-13 22:49:25 -04:00
Kent Overstreet	3f10048973	bcachefs: Disable merges from interior update path There's been a bug in the btree write buffer where it wasn't triggering btree node merges - and leaving behind a bunch of nearly empty btree nodes. Then during journal replay, when updates to the backpointers btree aren't using the btree write buffer (because we require synchronization with journal replay), we end up doing those merges all at once. Then if it's the interior update path running them, we deadlock because those run with the highest watermark. There's no real need for the interior update path to be doing btree node merges; other code paths can handle that at lower watermarks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-13 22:49:25 -04:00
Kent Overstreet	9054ef2ea9	bcachefs: Run merges at BCH_WATERMARK_btree This fixes a deadlock where the interior update path during journal replay ends up doing a ton of merges on the backpointers btree, and deadlocking. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-13 22:49:25 -04:00
Kent Overstreet	9e203c43dc	bcachefs: Fix missing write refs in fs fio paths bch2_journal_flush_seq requires us to have a write ref Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-13 22:48:17 -04:00
Kent Overstreet	82cf18f23e	bcachefs: Fix deadlock in journal replay btree_key_can_insert_cached() should be checking the watermark - BCH_TRANS_COMMIT_journal_replay really means nonblocking mode when watermark < reclaim, it was being used incorrectly. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-13 22:48:17 -04:00
Kent Overstreet	4518e80adf	bcachefs: Go rw if running any explicit recovery passes This fixes a bug where we fail to start when upgrading/downgrading because we forgot we needed to go rw. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-13 22:48:17 -04:00
Kent Overstreet	9abb6dd7ce	bcachefs: Standardize helpers for printing enum strs with bounds checks Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-13 22:48:17 -04:00
Kent Overstreet	ba8ed36e72	bcachefs: don't queue btree nodes for rewrites during scan many nodes found during scan will be old nodes, overwritten by newer nodes Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-13 22:48:17 -04:00
Kent Overstreet	7b4c4ccf84	bcachefs: fix race in bch2_btree_node_evict() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-13 22:48:16 -04:00
Kent Overstreet	2aeed876d7	bcachefs: fix unsafety in bch2_stripe_to_text() .to_text() functions need to work on key values that didn't pass .valid Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-13 22:48:16 -04:00
Kent Overstreet	dc32c118ec	bcachefs: fix unsafety in bch2_extent_ptr_to_text() Need to check if we have a valid bucket before checking if ptr is stale Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-13 22:48:16 -04:00
Kent Overstreet	87cb0239c8	bcachefs: btree node scan: handle encrypted nodes Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-13 22:48:16 -04:00
Kent Overstreet	031ad9e7db	bcachefs: Check for packed bkeys that are too big add missing validation; fixes assertion pop in bkey unpack Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-13 22:48:16 -04:00
Kent Overstreet	58caa786f1	bcachefs: Fix UAFs of btree_insert_entry array The btree paths array is now dynamically resizable - and as well the btree_insert_entries array, as it needs to be the same size. The merge path (and interior update path) allocates new btree paths, thus can trigger a resize; thus we need to not retain direct pointers after invoking merge; similarly when running btree node triggers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-13 22:48:16 -04:00
Kent Overstreet	2b3e79fea6	bcachefs: Don't use bch2_btree_node_lock_write_nofail() in btree split path It turns out - btree splits happen with the rest of the transaction still locked, to avoid unnecessary restarts, which means using nofail doesn't work here - we can deadlock. Fortunately, we now have the ability to return errors here. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-11 23:45:12 -04:00
Kent Overstreet	1189bdda6c	bcachefs: Fix __bch2_btree_and_journal_iter_init_node_iter() We weren't respecting trans->journal_replay_not_finished - we shouldn't be searching the journal keys unless we have a ref on them. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-10 22:28:36 -04:00
Kent Overstreet	517236cb3e	bcachefs: Kill read lock dropping in bch2_btree_node_lock_write_nofail() dropping read locks in bch2_btree_node_lock_write_nofail() dates from before we had the cycle detector; we can now tell the cycle detector directly when taking a lock may not fail because we can't handle transaction restarts. This is needed for adding should_be_locked asserts. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-10 22:28:36 -04:00
Kent Overstreet	beccf29114	bcachefs: Fix a race in btree_update_nodes_written() One btree update might have terminated in a node update, and then while it is in flight another btree update might free that original node. This race has to be handled in btree_update_nodes_written() - we were missing a READ_ONCE(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-10 22:28:36 -04:00
Kent Overstreet	9b31152fd7	bcachefs: btree_node_scan: Respect member.data_allowed If a device wasn't used for btree nodes, no need to scan for them. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-09 18:54:46 -04:00
Kent Overstreet	5ab4beb759	bcachefs: Don't scan for btree nodes when we can reconstruct Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-09 00:53:14 -04:00
Kent Overstreet	359571c327	bcachefs: Fix check_topology() when using node scan shoot down journal keys _before_ populating journal keys with pointers to scanned nodes Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-09 00:04:57 -04:00
Kent Overstreet	9c432404b9	bcachefs: fix eytzinger0_find_gt() - fix return types: promoting from unsigned to ssize_t does not do what we want here, and was pointless since the rest of the eytzinger code is u32 - nr, not size Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-08 22:56:37 -04:00
Kent Overstreet	b897b148ee	bcachefs: fix bch2_get_acl() transaction restart handling bch2_acl_from_disk() uses allocate_dropping_locks, and can thus return a transaction restart - this wasn't handled. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-07 17:15:53 -04:00
Hongbo Li	09e913f582	bcachefs: fix the count of nr_freed_pcpu after changing bc->freed_nonpcpu list When allocating bkey_cached from bc->freed_pcpu list, it missed decreasing the count of nr_freed_pcpu which would cause the mismatch between the value of nr_freed_pcpu and the list items. This problem also exists in moving new bkey_cached to bc->freed_pcpu list. If these happened, the bug info may appear in bch2_fs_btree_key_cache_exit by the following code: BUG_ON(list_count_nodes(&bc->freed_pcpu) != bc->nr_freed_pcpu); BUG_ON(list_count_nodes(&bc->freed_nonpcpu) != bc->nr_freed_nonpcpu); Fixes: c65c13f0eac6 ("bcachefs: Run btree key cache shrinker less aggressively") Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-07 13:40:35 -04:00
Kent Overstreet	30e615a2ce	bcachefs: Fix gap buffer bug in bch2_journal_key_insert_take() Multiple bug fixes for journal iters: - When the journal keys gap buffer is resized, we have to adjust the iterators for moving the gap to the end - We don't want to rewind iterators to point to the key we just inserted if it's not for the correct btree/level Also, add some new assertions. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-07 02:22:28 -04:00
Thorsten Blum	2d793e9315	bcachefs: Rename struct field swap to prevent macro naming collision The struct field swap can collide with the swap() macro defined in linux/minmax.h. Rename the struct field to prevent such collisions. Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-06 17:39:12 -04:00
Kent Overstreet	6088234ce8	bcachefs: JOURNAL_SPACE_LOW "bcachefs; Fix deadlock in bch2_btree_update_start()" was a significant performance regression (nearly 50%) on multithreaded random writes with fio. The reason is that the journal watermark checks multiple things, including the state of the btree write buffer, and on multithreaded update heavy workloads we're bottlenecked on write buffer flushing - we don't want kicking off btree updates to depend on the state of the write buffer. This isn't strictly correct; the interior btree update path does do write buffer updates, but it's a tiny fraction of total accounting updates and we're more concerned with space in the journal itself. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-06 13:50:26 -04:00
Kent Overstreet	05801b6526	bcachefs: Disable errors=panic for BCH_IOCTL_FSCK_OFFLINE BCH_IOCTL_FSCK_OFFLINE allows the userspace fsck tool to use the kernel implementation of fsck - primarily when the kernel version is a better version match. It should look and act exactly like the normal userspace fsck that the user expected to be invoking, so errors should never result in a kernel panic. We may want to consider further restricting errors=panic - it's only intended for debugging in controlled test environments, it should have no purpose in normal usage. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-06 13:50:25 -04:00
Kent Overstreet	374b3d38fe	bcachefs: Fix BCH_IOCTL_FSCK_OFFLINE for encrypted filesystems To open an encrypted filesystem, we use request_key() to get the encryption key from the user's keyring - but request_key() needs to happen in the context of the process that invoked the ioctl. This is easily fixed by using bch2_fs_open() in nostart mode. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-06 13:50:22 -04:00

3471 Commits