linux

iv/linux

Author	SHA1	Message	Date
Kent Overstreet	9a74f63c97	bcachefs: path->should_be_locked fixes - We should only be clearing should_be_locked in btree_path_set_pos() - it's the responsiblity of the btree_path code, not the btree_iter code. - bch2_path_put() needs to pay attention to path->should_be_locked, to ensure we don't drop locks we're supposed to be keeping. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:16 -04:00
Kent Overstreet	e2b605601a	bcachefs: Clean up error reporting in the startup path It used to be that error reporting in the startup path was done by returning strings describing the error, but that turned out to be a rather silly idea - if there's something we can describe about the error, just print it right away. This converts a good chunk of code to returning error codes, as is more typical style. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:16 -04:00
Chris Webb	7be9ab637f	bcachefs: Return -ENOKEY/EINVAL when mount decryption fails bch2_fs_encryption_init() correctly passes back -ENOKEY from request_key() when no unlock key is found, or -EINVAL if superblock decryption fails because of an invalid key. However, these get absorbed into a generic NULL return from bch2_fs_alloc() and later returned to user space as -ENOMEM, leading to a misleading error from mount(1): mount(2) system call failed: Out of memory. Return explicit error pointers out of bch2_fs_alloc() and handle them in both callers, so the user instead sees mount(2) system call failed: Required key not available. when attempting to mount a filesystem which is still locked. Signed-off-by: Chris Webb <chris@arachsys.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:16 -04:00
Kent Overstreet	076c783cd3	bcachefs: Fix upgrade path for reflink_p fix Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:16 -04:00
Kent Overstreet	68a2054d88	bcachefs: Switch fsync to use bi_journal_seq Now that we're recording in each inode the journal sequence number of the most recent update, fsync becomes a lot simpler and we can delete all the plumbing for ei_journal_seq. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:16 -04:00
Kent Overstreet	e15a57ac05	bcachefs: Kill bucket quantiles sysfs code We're getting rid of code that uses the in memory bucket array - and we now have better mechanisms for viewing most of what the bucket quantiles code gave us (especially internal fragmentation). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:16 -04:00
Kent Overstreet	0e030f5e20	bcachefs: Kill journal buf bloom filter This was used for recording which inodes have been modified by in flight journal writes, but was broken and has been superceded. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:16 -04:00
Kent Overstreet	3e52c22255	bcachefs: Add journal_seq to inode & alloc keys Add fields to inode & alloc keys that record the journal sequence number when they were most recently modified. For alloc keys, this is needed to know what journal sequence number we have to flush before the bucket can be reused. Currently this is tracked in memory, but we'll be getting rid of the in memory bucket array. For inodes, this is needed for fsync when the inode has been evicted from the vfs cache. Currently we use a bloom filter per outstanding journal buf - but that mechanism has been broken since we added the ability to not issue a flush/fua for every journal write. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:16 -04:00
Kent Overstreet	1db84979c7	bcachefs: Update inode on every write This is going to be a performance regression until we get the btree key cache re-enabled - but it's needed for fixing fsync. Upcoming patches will record the journal_seq an inode was updated at in the inode itself. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:16 -04:00
Kent Overstreet	34d74830b2	bcachefs: BTREE_UPDATE_NOJOURNAL We're going to have btree updates that don't need to be journalled; add a flag for that. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:16 -04:00
Kent Overstreet	c27314b448	bcachefs: Fix __remove_dirent() __lookup_inode() doesn't work for what __remove_dirent() wants - it just wants the first inode at a given inode number, they all have the same hash info. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:16 -04:00
Kent Overstreet	47f80bbf38	bcachefs: Fix check_inodes() We were starting at the wrong btree position, and thus not actually checking any inodes - oops. Also, make check_key_has_snapshot() a mustfix fsck error, since later fsck code assumes that all keys have valid snapshot IDs. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:16 -04:00
Kent Overstreet	d647db314a	bcachefs: Improve error message in bch2_write_super() It's helpful to know what the superblock write is for. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	85eb2bae7f	bcachefs: Fix trans_lock_write() On failure to get a write lock (because we had a conflicting read lock), we need to make sure to upgrade the read lock to an intent lock - or we could end up spinning. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	f527afea5a	bcachefs: Fix upgrade_readers() The bch2_btree_path_upgrade() call was failing and tripping an assert - path->level + 1 is in this case not necessarily exactly what we want, fix it by upgrading exactly the locks we want. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	d740729272	bcachefs: Fix faulty assertion Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	2debb1b875	bcachefs: BTREE_TRIGGER_INSERT now only means insert This allows triggers to distinguish between a key entering the btree - i.e. being called from the trans commit path - vs. being called on a key that already exists, i.e. by GC. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	904823de49	bcachefs: Convert bch2_mark_key() to take a btree_trans * This helps to unify the interface between bch2_mark_key() and bch2_trans_mark_key() - and it also gives access to the journal reservation and journal seq in the mark_key path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	961b2d6282	bcachefs: Assorted ec fixes - The backpointer that ec_stripe_update_ptrs() uses now needs to include the snapshot ID, which means we have to change where we add the backpointer to after getting the snapshot ID for the new extents - ec_stripe_update_ptrs() needs to be calling bch2_trans_begin() - improve error message in bch2_mark_stripe() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	37f72492f4	bcachefs: Fix bch2_mark_update() When the old or new key doesn't exist, we should still pass in a deleted key with the correct pos. This fixes a bug in the ec code, when bch2_mark_stripe() was looking up the wrong in-memory stripe. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	fae1157d18	bcachefs: Ensure journal doesn't get stuck in nochanges mode This tweaks the journal code to always act as if there's space available in nochanges mode, when we're not going to be doing any writes. This helps in recovering filesystems that won't mount because they need journal replay and the journal has gotten stuck. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	285b181ad4	bcachefs: Improve transaction restart handling in fsck code The fsck code has been handling transaction restarts locally, to avoid calling fsck_err() multiple times (and asking the user/logging the error multiple times) on transaction restart. However, with our improving assertions about iterator validity, this isn't working anymore - the code wasn't entirely correct, in ways that are fine for now but are going to matter once we start wanting online fsck. This code converts much of the fsck code to handle transaction restarts in a more rigorously correct way - moving restart handling up to the top level of check_dirent, check_xattr and others - at the cost of logging errors multiple times on transaction restart. Fixing the issues with logging errors multiple times is probably going to require memoizing calls to fsck_err() - we'll leave that for future improvements. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	6caf057850	bcachefs: Fix bch2_btree_iter_advance() Was popping an assertion on !BTREE_ITER_ALL_SNAPSHOTS iters when getting to the end. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	41f9b7d39f	bcachefs: Move bch2_evict_subvolume_inodes() to fs.c This fixes building in userspace - code that's coupled to the kernel VFS interface should live in fs.c Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	8325cd1ed4	bcachefs: Don't do upgrades in nochanges mode nochanges mode is often used for getting data off of otherwise nonrecoverable filesystems, which is often because of errors hit during fsck. Don't force version upgrade & fsck in nochanges mode, so that it's more likely to mount. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	f124345e2b	bcachefs: Drop bch2_journal_meta() call when going RW Back when we relied on the journal sequence number blacklist machinery for consistency between btree and the journal, we needed to ensure a new journal entry was written before any btree writes were done. But, this had the side effect of consuming some space in the journal prior to doing journal replay - which could lead to a very wedged filesystem, since we don't yet have a way to grow the journal prior to going RW. Fortunately, the journal sequence number blacklist machinery isn't needed anymore, as btree node pointers now record the numer of sectors currently written to that node - that code should all be ripped out. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	2027875bd8	bcachefs: Add BCH_SUBVOLUME_UNLINKED Snapshot deletion needs to become a multi step process, where we unlink, then tear down the page cache, then delete the subvolume - the deleting flag is equivalent to an inode with i_nlink = 0. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	f3b1e19379	bcachefs: Improve error messages in trans_mark_reflink_p() We should always print out the key we were marking. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	6b3d8b8992	bcachefs: Don't run triggers in fix_reflink_p_key() It seems some users have reflink pointers which span many indirect extents, from a short window in time when merging of reflink pointers was allowed. Now, we're seeing transaction path overflows in fix_reflink_p(), the code path to clear out the reflink_p fields now used for front/back pad - but, we don't actually need to be running triggers in that path, which is an easy partial fix. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	d121172561	bcachefs: More general fix for transaction paths overflow for_each_btree_key() now calls bch2_trans_begin() as needed; that means, we can also call it when we're in danger of overflowing transaction paths. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	396a887d8f	bcachefs: Fix fsck path for refink pointers The way __bch2_mark_reflink_p returns errors was clashing with returning the number of sectors processed - we weren't returning FSCK_ERR_EXIT correctly. Fix this by only using the return code for errors, which actually ends up simplifying the overall logic. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	23af498cc4	bcachefs: Ensure we flush btree updates in evacuate path This fixes a possible race where we fail to remove a device because of btree nodes still on it, that are being deleted by in flight btree updates. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	f3cf0999ac	bcachefs: bch2_btree_node_rewrite() now returns transaction restarts We have been getting away from handling transaction restarts locally - convert bch2_btree_node_rewrite() to the newer style. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	979735df98	bcachefs: Fix bch2_btree_iter_next_node() We were modifying state, then return -EINTR, causing us to skip nodes - ouch. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	b0d1b70af8	bcachefs: Must check for errors from bch2_trans_cond_resched() But we don't need to call it from outside the btree iterator code anymore, since it's called by bch2_trans_begin() and bch2_btree_path_traverse(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	114eea75c7	bcachefs: Fix dev accounting after device add This is a hacky but effective fix to device usage stats for superblock and journal being wrong on a newly added device (following the comment that already told us how it needed to be done!) Reported-by: Chris Webb <chris@arachsys.com> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	ab44d7bdee	bcachefs: Fix a transaction path overflow readdir() in a directory with many subvolumes could overflow transaction paths - this is a simple hack around the issue. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	de924abbe7	bcachefs: Fix error handling in bch2_trans_extent_merging The back merging case wasn't returning errors correctly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	4db650277d	bcachefs: Subvol dirents are now only visible in parent subvol This changes the on disk format for dirents that point to subvols so that they also record the subvolid of the parent subvol, so that we can filter them out in other subvolumes. This also updates the dirent code to do that filtering, and in particular tweaks the rename code - we need to ensure that there's only ever one dirent (counting multiplicities in different snapshots) that point to a subvolume. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	e5fa91d7ac	bcachefs: Fix restart handling in for_each_btree_key() Code that uses for_each_btree_key often wants transaction restarts to be handled locally and not returned. Originally, we wouldn't return transaction restarts if there was a single iterator in the transaction - the reasoning being if there weren't other iterators being invalidated, and the current iterator was being advanced/retraversed, there weren't any locks or iterators we were required to preserve. But with the btree_path conversion that approach doesn't work anymore - even when we're using for_each_btree_key() with a single iterator there will still be two paths in the transaction, since we now always preserve the path at the pos the iterator was initialized at - the reason being that on restart we often restart from the same place. And it turns out there's now a lot of for_each_btree_key() uses that _do not_ want transaction restarts handled locally, and should be returning them. This patch splits out for_each_btree_key_norestart() and for_each_btree_key_continue_norestart(), and converts existing users as appropriate. for_each_btree_key(), for_each_btree_key_continue(), and for_each_btree_node() now handle transaction restarts themselves by calling bch2_trans_begin() when necessary - and the old hack to not return transaction restarts when there's a single path in the transaction has been deleted. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	d17bc1739c	bcachefs: cached data shouldn't prevent fs from mounting It's not an error if we don't have cached data - skip BCH_DATA_cached in bch2_have_enough_devs(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	521b806769	bcachefs: Delete dentry when deleting snapshots This fixes a bug where subsequently doing creates with the same name fails. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	6e0c886d3c	bcachefs: Fix check_path() for snapshots check_path() wasn't checking the snapshot ID when checking for directory structure loops - so, snapshots would cause us to detect a loop that wasn't actually a loop. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	6d76aefea1	bcachefs: Fix for leaking of reflinked extents When a reflink pointer points to only part of an indirect extent, and then that indirect extent is fragmented (e.g. by copygc), if the reflink pointer only points to one of the fragments we leak a reference. Fix this by storing front/back pad values in reflink pointers - when inserting reflink pointesr, we initialize them to cover the full range of the indirect extents we reference. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	bfe88863cf	bcachefs: New on disk format to fix reflink_p pointers We had a bug where reflink_p pointers weren't being initialized to 0, and when we started using the second word, things broke badly. This patch revs the on disk format version and adds cleanup code to zero out the second word of reflink_p pointers before we start using it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	b71717dac6	bcachefs: Handle transaction restarts in bch2_blacklist_entries_gc() It shouldn't be necessary when we're only using a single iterator and not doing updates, but that's harder to debug at the moment. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	9a796fdb06	bcachefs: bch2_trans_exit() no longer returns errors Now that peek_node()/next_node() are converted to return errors directly, we don't need bch2_trans_exit() to return errors - it's cleaner this way and wasn't used much anymore. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	d355c6f4f7	bcachefs: for_each_btree_node() now returns errors directly This changes for_each_btree_node() to work like for_each_btree_key(), and to that end bch2_btree_iter_peek_node() and next_node() also return error ptrs. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	dfc276df91	bcachefs: Improve reflink repair code When a reflink pointer points to an indirect extent that doesn't exist, we need to replace it with a KEY_TYPE_error key. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	488f97764a	bcachefs: Fix check_path() across subvolumes Checking of directory structure across subvolumes was broken - we need to look up the snapshot ID of the parent subvolume when crossing subvol boundaries. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00

1 2 3 4 5 ...

1462 Commits