linux

iv/linux

Author	SHA1	Message	Date
Kent Overstreet	e296b1f9ca	bcachefs: Fix inode_backpointer_exists() If the dirent an inode points to doesn't exist, we shouldn't be returning an error - just 0/false. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	0b09032653	bcachefs: Improve bch2_lru_delete() error messages When we detect a filesystem inconsistency, we should include the relevant keys in the error message. This patch adds a parameter to pass the key with the lru entry to bch2_lru_delete(), so that it can be printed. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	5650bb46be	bcachefs: Introduce bch2_journal_keys_peek_(upto|slot)() When many journal replay keys have been overwritten, bch2_journal_keys_peek() was taking excessively long to scan before it found a key to return. Fix this by introducing bch2_journal_keys_peek_upto() which takes a parameter for the end of the range we want, so that we can terminate the search much sooner, and replace all uses of bch2_journal_keys_peek() with peek_upto() or peek_slot(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	9b93596c33	bcachefs: Improve error message when alloc key doesn't match lru entry Error messages should always print out the full key when available - this gives us a starting point when looking through the journal to debug what went wrong. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	7003589dab	bcachefs: Ensure buckets have io_time[READ] set It's an error if a bucket is in state BCH_DATA_cached but not on the LRU btree - i.e. io_time[READ] == 0 - so, make sure it's set before adding it. Also, make some of the LRU code a bit clearer and more direct. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	84befe8ef9	bcachefs: Use bch2_trans_inconsistent_on() in more places This gets us better error messages. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	3518e6faef	bcachefs: Improve bch2_open_buckets_to_text() This patch updates bch2_open_buckets_to_text() to include the device and bucket the open_bucket owns. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	ec7ccbde6b	bcachefs: Fix CPU usage in journal read path In journal_entry_add(), we were repeatedly scanning the journal entries radix tree to scan for old entries that can be freed, with O(n^2) behaviour. This patch tweaks things to remember the previous last_seq, so we don't have to scan for entries to free from the start. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	6e811bbbc2	bcachefs: Fix a null ptr deref We start doing allocations before the GC thread is created, which means we need to check for that to avoid a null ptr deref. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	cf0dd697eb	bcachefs: Don't trigger extra assertions in journal replay We now pass a rw argument to .key_invalid methods so they can trigger assertions for updates but not on existing keys. We shouldn't trigger these extra assertions in journal replay - this patch changes the transaction commit path accordingly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	a9c0a4cbf1	bcachefs: Minor device removal fixes - We weren't clearing the LRU btree - bch2_alloc_read() runs before bch2_check_alloc_key() deletes alloc keys for devices/buckets that don't exist, so it needs to check for that - bch2_check_lrus() needs to check that buckets exist - improve some error messages Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	502f973dba	bcachefs: Fix a few warnings on 32 bit These showed up when building for mips. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	aae29082c6	bcachefs: bch2_btree_delete_extent_at() New helper, for deleting extents. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	7c4ca54ae6	bcachefs: Don't skip triggers in fcollapse() With backpointers this doesn't work anymore - backpointers always need to be updated to point to the new extent position. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	84c72755b9	bcachefs: Initialize ec work structs early We need to ensure that work structs in bch_fs always get initialized - otherwise an error in filesystem initialization can pop a warning in the workqueue code when we try to cancel a work struct that wasn't initialized. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	ce6201c456	bcachefs: Use a genradix for reading journal entries Previously, the journal read path used a linked list for storing the journal entries we read from disk. But there's been a bug that's been causing journal_flush_delay to incorrectly be set to 0, leading to far more journal entries than is normal being written out, which then means filesystems are no longer able to start due to the O(n^2) behaviour of inserting into/searching that linked list. Fix this by switching to a radix tree. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	95752a02cb	bcachefs: Refactor journal_keys_sort() to return an error code When there weren't any keys in the journal there's no need to allocate the buffer - but doing that causes a spurious -ENOMEM. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	822835ffea	bcachefs: Fold bucket_state in to BCH_DATA_TYPES() Previously, we were missing accounting for buckets in need_gc_gens and need_discard states. This matters because buckets in those states need other btree operations done before they can be used, so they can't be counted when checking current number of free buckets against the allocation watermark. Also, we weren't directly counting free buckets at all. Now, data type 0 == BCH_DATA_free, and free buckets are counted; this means we can get rid of the separate (poorly defined) count of unavailable buckets. This is a new on disk format version, with upgrade and fsck required for the accounting changes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	8058ea64c3	bcachefs: Add a sysfs attr for triggering discards We're currently debugging an issue with discards not getting run; this patch adds a manual trigger so we can then watch the tracepoint while it runs. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	48620e5177	bcachefs: Topology repair fixes - We were failing to start topology repair, because we hadn't set the superblock flag indicating it needed to run - set_node_min() forgot to update the btree node's key - bch2_gc_alloc_reset() didn't reset data type, leading to inserting an invalid key that was empty but had nonzero data type Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	5e05d7ed3d	bcachefs: Use bch2_trans_inconsistent() more This gets us better error messages. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	62491956f4	bcachefs: Move alloc assertion to .key_invalid() .key_invalid is a better place for this assertion. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	1d8a268940	bcachefs: Improve btree_bad_header() In the future printbufs will be mempool-ified, so we shouldn't be using more than one at a time if we don't have to. This also fixes an extra trailing newline. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	11c7d3e817	bcachefs: Check for read_time == 0 in bch2_alloc_v4_invalid() We've been seeing this error in fsck and we weren't able to track down where it came from - but now that .key_invalid methods take a rw argument, we can safely check for this. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	292dea86df	bcachefs: fsck: Work around transaction restarts In check_extents() and check_dirents(), we're working towards only handling transaction restarts in one place, at the top level - but we're not there yet. check_i_sectors() and check_subdir_count() handle transaction restarts locally, which means the iterator for the dirent/extent is left unlocked (should_be_locked == 0), leading to asserts popping when we go to do updates. This patch hacks around this for now, until we can delete the offending code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	275c8426fb	bcachefs: Add rw to .key_invalid() This adds a new parameter to .key_invalid() methods for whether the key is being read or written; the idea being that methods can do more aggressive checks when a key is newly created and being written, when we wouldn't want to delete the key because of those checks. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	e1effd42a1	bcachefs: More improvements for alloc info checks - Move checks for whether the device & bucket are valid from the .key_invalid method to bch2_check_alloc_key(). This is because .key_invalid() is called on keys that may no longer exist (post journal replay), which is a problem when removing/resizing devices. - We weren't checking the need_discard btree to ensure that every set bucket has a corresponding alloc key. This refactors the code for checking the freespace btree, so that it now checks both. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	afb6f7f61b	bcachefs: Silence spurious copygc err when shutting down Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	f0ac7df23d	bcachefs: Convert .key_invalid methods to printbufs Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	d1d7737fd9	bcachefs: Gap buffer for journal keys Btree updates before we go RW work by inserting into the array of keys that journal replay will insert - but inserting into a flat array is O(n), meaning if btree_gc needs to update many alloc keys, we're O(n^2). Fortunately, the updates btree_gc does happen in sequential order, which means a gap buffer works nicely here - this patch implements a gap buffer for journal keys. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	7c7e071d90	bcachefs: Don't normalize to pages in btree cache shrinker This behavior dates from the early, early days of bcache, and upon further delving appears to not make any sense. The shrinker only works in terms of 'objects' of unknown size; normalizing to pages only had the effect of changing the batch size, which we could do directly - if we wanted; we probably don't. Normalizing to pages meant our batch size was very small, which seems to have been keeping us from doing as much shrinking as we should be under heavy memory pressure; this patch appears to alleviate some OOMs we've been seeing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	4254f5bf6e	bcachefs: Add a tracepoint for superblock writes Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:30 -04:00
Kent Overstreet	c6b6d41612	bcachefs: gc mark fn fixes, cleanups mark_stripe_bucket() was busted; it was using @new uninitialized. Also, clean up all the gc mark functions, and convert them to the same style. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	80c80164a5	bcachefs: Don't write partially-initialized superblocks This neatly avoids bugs where we fail partway through initializing a new filesystem, if we just don't write out partly-initialized state. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	64afbbc909	bcachefs: Improve read_from_stale_dirty_pointer() message With printbufs, it's now easy to build up multi-line log messages and emit them with one call, which is good because it prevents multiple multi-line log messages from getting interspersed in the log buffer; this patch also improves the formatting and converts it to latest style. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	75f02de43f	bcachefs: Use crc_is_compressed() Trivial cleanup. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	c32fc674d4	bcachefs: Fix pr_buf() calls In a few places we were passing a variable to pr_buf() for the format string - oops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:29 -04:00
Kent Overstreet	66d9082385	bcachefs: Kill struct bucket_mark This switches struct bucket to using a lock, instead of cmpxchg. And now that the protected members no longer need to fit into a u64, we can expand the sector counts to 32 bits. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	5735608c14	bcachefs: Kill main in-memory bucket array All code using the in-memory bucket array, excluding GC, has now been converted to use the alloc btree directly - so we can finally delete it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	5f43f99c6e	bcachefs: bch2_dev_usage_update() no longer depends on bucket_mark This is one of the last steps in getting rid of the main in-memory bucket array. This changes bch2_dev_usage_update() to take bkey_alloc_unpacked instead of bucket_mark, and for the places where we are in fact working with bucket_mark and don't have bkey_alloc_unpacked, we add a wrapper that takes bucket_mark and converts to bkey_alloc_unpacked. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	5add07d56a	bcachefs: Fsck for need_discard & freespace btrees Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	caece7fe3f	bcachefs: New bucket invalidate path In the old allocator code, preparing an existing empty bucket was part of the same code path that invalidated buckets containing cached data. In the new allocator code this is no longer the case: the main allocator path finds empty buckets (via the new freespace btree), and can't allocate buckets that contain cached data. We now need a separate code path to invalidate buckets containing cached data when we're low on empty buckets, which this patch implements. When the number of free buckets decreases that triggers the new invalidate path to run, which uses the LRU btree to pick cached data buckets to invalidate until we're above our watermark. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	59cc38b8d4	bcachefs: New discard implementation In the old allocator code, buckets would be discarded just prior to being used - this made sense in bcache where we were discarding buckets just after invalidating the cached data they contain, but in a filesystem where we typically have more free space we want to be discarding buckets when they become empty. This patch implements the new behaviour - it checks the need_discard btree for buckets awaiting discards, and then clears the appropriate bit in the alloc btree, which moves the buckets to the freespace btree. Additionally, discards are now enabled by default. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	f25d8215f4	bcachefs: Kill allocator threads & freelists Now that we have new persistent data structures for the allocator, this patch converts the allocator to use them. Now, foreground bucket allocation uses the freespace btree to find buckets to allocate, instead of popping buckets off the freelist. The background allocator threads are no longer needed and are deleted, as well as the allocator freelists. Now we only need background tasks for invalidating buckets containing cached data (when we are low on empty buckets), and for issuing discards. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	c6b2826cd1	bcachefs: Freespace, need_discard btrees This adds two new btrees for the upcoming allocator rewrite: an extents btree of free buckets, and a btree for buckets awaiting discards. We also add a new trigger for alloc keys to keep the new btrees up to date, and a compatibility path to initialize them on existing filesystems. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	3d48a7f85f	bcachefs: KEY_TYPE_alloc_v4 This introduces a new alloc key which doesn't use varints. Soon we'll be adding backpointers and storing them in alloc keys, which means our pack/unpack workflow for alloc keys won't really work - we'll need to be mutating alloc keys in place. Instead of bch2_alloc_unpack(), we now have bch2_alloc_to_v4() that converts older types of alloc keys to v4 if needed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	d326ab2f5d	bcachefs: LRU btree This implements new persistent LRUs, to be used for buckets containing cached data, as well as stripes ordered by time when a block became empty. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:29 -04:00
Kent Overstreet	179e3434fa	bcachefs: KEY_TYPE_set A new empty key type, to be used when using a btree as a set. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:29 -04:00
Kent Overstreet	25be2e5d4a	bcachefs: bch_sb_field_journal_v2 Add a new superblock field which represents journal buckets as ranges: also move code for the superblock journal fields to journal_sb.c. This also reworks the code for resizing the journal to write the new superblock before using the new journal buckets, and thus be a bit safer. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:29 -04:00
Kent Overstreet	b17d3cec14	bcachefs: Run btree updates after write out of write_point In the write path, after the write to the block device(s) completes we have to punt to process context to do the btree update. Instead of using the work item embedded in op->cl, this patch switches to a per write-point work item. This helps with two different issues: - lock contention: btree updates to the same writepoint will (usually) be updating the same alloc keys - context switch overhead: when we're bottlenecked on btree updates, having a thread (running out of a work item) checking the write point for completed ops is cheaper than queueing up a new work item and waking up a kworker. In an arbitrary benchmark, 4k random writes with fio running inside a VM, this patch resulted in a 10% improvement in total iops. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
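
Note on d1d7737fd9 ("Gap buffer for journal keys"): the commit message explains that inserting into a flat sorted array is O(n) per insert, and that a gap buffer helps because btree_gc inserts in sequential order. The sketch below illustrates that general technique only. It is not the bcachefs code: struct gap_buf, gap_move(), gap_insert() and gap_get() are hypothetical names, and plain ints stand in for journal keys.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration; ints stand in for journal keys. */
struct gap_buf {
	int	*d;	/* backing array with one hole (the gap) in it */
	size_t	nr;	/* number of live elements */
	size_t	size;	/* number of allocated slots */
	size_t	gap;	/* logical index where the hole starts */
};

static size_t gap_len(const struct gap_buf *b)
{
	return b->size - b->nr;
}

/* Slide the hole so that it starts at logical position pos. */
static void gap_move(struct gap_buf *b, size_t pos)
{
	size_t len = gap_len(b);

	assert(pos <= b->nr);

	if (pos < b->gap)
		memmove(b->d + pos + len, b->d + pos,
			(b->gap - pos) * sizeof(*b->d));
	else if (pos > b->gap)
		memmove(b->d + b->gap, b->d + b->gap + len,
			(pos - b->gap) * sizeof(*b->d));
	b->gap = pos;
}

/* Insert v at logical position pos; returns 0 on success, -1 on failure. */
static int gap_insert(struct gap_buf *b, size_t pos, int v)
{
	if (pos > b->nr)
		return -1;

	if (b->nr == b->size) {
		size_t new_size = b->size ? b->size * 2 : 8;
		int *n;

		/* The hole is empty here; park it at the end before growing. */
		gap_move(b, b->nr);
		n = realloc(b->d, new_size * sizeof(*n));
		if (!n)
			return -1;
		b->d = n;
		b->size = new_size;
	}

	/* Free when pos == gap, which is the case for sequential inserts. */
	gap_move(b, pos);
	b->d[b->gap++] = v;
	b->nr++;
	return 0;
}

/* Read the element at logical index idx, skipping over the hole. */
static int gap_get(const struct gap_buf *b, size_t idx)
{
	assert(idx < b->nr);
	return b->d[idx < b->gap ? idx : idx + gap_len(b)];
}
```

With this layout, an insert at the position right after the previous one moves no elements, so a run of sequential inserts costs amortized O(1) each; only a jump to a distant position pays for a memmove proportional to the distance jumped.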

1217017 Commits