Commit Graph

3291 Commits

Author SHA1 Message Date
Kent Overstreet
b26d79147f bcachefs: BTREE_ID_subvolume_children
Add a btree to record a parent -> child subvolume relationships,
according to the filesystem heirarchy.

The subvolume_children btree is a bitset btree: if a bit is set at pos
p, that means p.offset is a child of subvolume p.inode.

This will be used for efficiently listing subvolumes, as well as
recursive deletion.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
b8628a2529 bcachefs: bch_subvolume::fs_path_parent
Record the filesystem path heirarchy for subvolumes in bch_subvolume

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
e07c28ab92 bcachefs: bch2_btree_bit_mod()
Provide a non-write buffer version of bch2_btree_bit_mod_buffered(), for
the subvolume children btree.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
506b187603 bcachefs: bch2_btree_bit_mod -> bch2_btree_bit_mod_buffered
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
56e230473d bcachefs: Correctly reattach subvolumes
Subvolumes need special handling to reattach - we always reattach them
in the root subvolume's lost+found, and they need a slightly different
kind of dirent.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
3a136177f3 bcachefs: check_path() now prints full inode when reattaching
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
688a769409 bcachefs: Pass inode bkey to check_path()
prep work for improving logging/error messages

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
f5d58d0c72 bcachefs: Fix path where dirent -> subvol missing and we don't fix
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
64304aaf4e bcachefs: bch_subvolume::parent -> creation_parent
bit of renaming, prep for adding a fs path parent

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
45b4ed525e bcachefs: Repair subvol dirents that point to non subvols
when repair switches d_type to or from DT_SUBVOL, we need to update the
target accordingly

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
c60b7f803c bcachefs: check dirent->d_parent_subvol
Check that d_parent_subvol makes sense - the dirent's snapshot must be
visible in d_parent_subvol (i.e. an ancestor of d_parent_subvol's
snapshot) in order to be visible.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
f4e68c859f bcachefs: check inode->bi_parent_subvol against dirent
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
ea27001e14 bcachefs: delete duplicated checks in check_dirent_to_subvol()
these were already checked in check_subvol()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
e539ebb867 bcachefs: simplify check_dirent_inode_dirent()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
0b498a5a39 bcachefs: check bi_parent_subvol in check_inode()
check for inodes with a nonzero bi_parent_subvol field that aren't
actually subvolume roots

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:23 -04:00
Kent Overstreet
971a1503a2 bcachefs: better log message in lookup_inode_for_snapshot()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:23 -04:00
Kent Overstreet
0b17618fdc bcachefs: check_inode_dirent_inode()
check that if an inode has a backpointer, the dirent it points to points
back to it.

We do this in check_dirent_inode_dirent(), but only for inodes that have
dirents that point to them - we also have to do the check starting from
the inode to catch inodes that don't have dirents that point to them.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:23 -04:00
Kent Overstreet
f2b02d099c bcachefs: Check subvol <-> inode pointers in check_inode()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:23 -04:00
Kent Overstreet
4c20278eb1 bcachefs: Check subvol <-> inode pointers in check_subvol()
Subvolumes and subvolume root inodes point to each other: this verifies
the subvolume -> inode -> subvolme path.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:23 -04:00
Kent Overstreet
52946d828a bcachefs: Kill more -EIO error codes
This converts -EIOs related to btree node errors to private error codes,
which will help with some ongoing debugging by giving us better error
messages.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:23 -04:00
Kent Overstreet
da23795e4c bcachefs: thread_with_file: add f_ops.flush
Add a flush op, to return the exit code via close().

Also update bcachefs usage to use this to return fsck exit codes.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:20 -04:00
Kent Overstreet
6b33312925 bcachefs: thread_with_file: Fix missing va_end()
Fixes: https://lore.kernel.org/linux-bcachefs/202402131603.E953E2CF@keescook/T/#u
Reported-by: coverity scan
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:18 -04:00
Darrick J. Wong
658a1e42ce bcachefs: thread_with_file: allow ioctls against these files
Make it so that a thread_with_stdio user can handle ioctls against the
file descriptor.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:15 -04:00
Darrick J. Wong
ab6752e24e bcachefs: thread_with_file: create ops structure for thread_with_stdio
Create an ops structure so we can add more file-based functionality in
the next few patches.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:13 -04:00
Darrick J. Wong
1cbae651e5 bcachefs: thread_with_file: fix various printf problems
Experimentally fix some problems with stdio_redirect_vprintf by creating
a MOO variant with which we can experiment.  We can't do a GFP_KERNEL
allocation while holding the spinlock, and I don't like how the printf
function can silently truncate the output if memory allocation fails.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:10 -04:00
Darrick J. Wong
fcb1620edd bcachefs: thread_with_file: allow creation of readonly files
Create a new run_thread_with_stdout function that opens a file in
O_RDONLY mode so that the kernel can write things to userspace but
userspace cannot write to the kernel.  This will be used to convey xfs
health event information to userspace.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:08 -04:00
Kent Overstreet
a5a650d647 bcachefs: thread_with_stdio: suppress hung task warning
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:04 -04:00
Kent Overstreet
8f9320d3a3 bcachefs: thread_with_stdio: Mark completed in ->release()
This fixes stdio_redirect_read() getting stuck, not noticing that the
pipe has been closed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:00 -04:00
Kent Overstreet
032b3fd057 bcachefs: Thread with file documentation
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 18:39:13 -04:00
Kent Overstreet
f704f108af bcachefs: thread_with_stdio: fix bch2_stdio_redirect_readline()
This fixes a bug where we'd return data without waiting for a newline,
if data was present but a newline was not.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 18:39:13 -04:00
Kent Overstreet
a6777ca4ff bcachefs: thread_with_stdio: kill thread_with_stdio_done()
Move the cleanup code to a wrapper function, where we can call it after
the thread_with_stdio fn exits.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 18:39:13 -04:00
Kent Overstreet
60e1baa872 bcachefs: thread_with_stdio: convert to darray
- eliminate the dependency on printbufs, so that we can lift
   thread_with_file for use in xfs
 - add a nonblocking parameter to stdio_redirect_printf(), and either
   block if the buffer is full or drop it on the floor - don't buffer
   infinitely

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 18:39:13 -04:00
Kent Overstreet
e017047fdb bcachefs: thread_with_stdio: eliminate double buffering
The output buffer lock has to be a spinlock so that we can write to it
from interrupt context, so we can't use a direct copy_to_user; this
switches thread_with_file_read() to use fault_in_writeable() and
copy_to_user_nofault(), similar to how thread_with_file_write() works.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 18:39:13 -04:00
Kent Overstreet
cb6fc943b6 bcachefs: kill kvpmalloc()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 18:39:12 -04:00
Kent Overstreet
737cd174d1 bcachefs: bch2_lookup() gives better error message on inode not found
When a dirent points to a missing inode, we really should print out the
dirent.

This requires quite a bit of refactoring, but there's some other
benefits: we now do the entire looup (dirent and inode) in a single
btree transaction, and copy to the VFS inode with btree locks still
held, like the create path.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:09 -04:00
Kent Overstreet
a91bc5e505 bcachefs: bch2_inode_insert()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:09 -04:00
Kent Overstreet
3d4998c202 bcachefs: factor out check_inode_backpointer()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:09 -04:00
Kent Overstreet
11def1888f bcachefs: Factor out check_subvol_dirent()
Going to be adding more code here for checking subvol structure.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:09 -04:00
Kent Overstreet
ce3e9283de bcachefs: Kill some -EINVALs
Repurposing standard error codes in bcachefs code is banned in new code,
and we need to get rid of the remaining ones - private error codes give
us much better error messages.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:09 -04:00
Kent Overstreet
82fdc1dc98 bcachefs: bump max_active on btree_interior_update_worker
WQ_UNBOUND with max_active 1 means ordered workqueue, but we don't
actually need or want ordered semantics - and probably want a higher
concurrency limit anyways.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:09 -04:00
Kent Overstreet
69c8e6ce02 bcachefs: move fsck_write_inode() to inode.c
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:09 -04:00
Kent Overstreet
29223b5a55 bcachefs: Initialize super_block->s_uuid
Need to fix this oversight for the new FS_IOC_(GET|SET)UUID ioctls.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:09 -04:00
Kent Overstreet
f8f8fb443b bcachefs: Switch to uuid_to_fsid()
switch the statfs code from something horrible and open coded to the
more standard uuid_to_fsid()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:09 -04:00
Kent Overstreet
7f76b08aca bcachefs: Subvolumes may now be renamed
Files within a subvolume cannot be renamed into another subvolume, but
subvolumes themselves were intended to be.

This implements subvolume renaming - we need to ensure that there's only
a single dirent that points to a subvolume key (not multiple versions in
different snapshots), and we need to ensure that dirent.d_parent_subol
and inode.bi_parent_subvol are updated.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:08 -04:00
Kent Overstreet
5f43b0134e bcachefs: btree node prefetching in check_topology
btree_and_journal_iter is old code that we want to get rid of, but we're
not ready to yet.

lack of btree node prefetching is, it turns out, a real performance
issue for fsck on spinning rust, so - add it.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:08 -04:00
Kent Overstreet
fc634d8e46 bcachefs: btree_and_journal_iter.trans
we now always have a btree_trans when using a btree_and_journal_iter;
prep work for adding prefetching to btree_and_journal_iter

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:08 -04:00
Kent Overstreet
916abefd43 bcachefs: better journal pipelining
Recently a severe performance regression was discovered, which bisected
to

  a6548c8b5e bcachefs: Avoid flushing the journal in the discard path

It turns out the old behaviour, which issued excessive journal flushes,
worked around a performance issue where queueing delays would cause the
journal to not be able to write quickly enough and stall.

The journal flushes masked the issue because they periodically flushed
the device write cache, reducing write latency for non flushes.

This patch reworks the journalling code to allow more than one
(non-flush) write to be in flight at a time. With this patch, doing 4k
random writes and an iodepth of 128, we are now able to hit 560k iops to
a Samsung 970 EVO Plus - previously, we were stuck in the ~200k range.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:08 -04:00
Kent Overstreet
38789c2508 bcachefs: closure per journal buf
Prep work for having multiple journal writes in flight.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:08 -04:00
Kent Overstreet
5165400275 bcachefs: bio per journal buf
Prep work for having multiple journal writes in flight.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:08 -04:00
Kent Overstreet
52f7d75e7d bcachefs: jset_entry_datetime
This gives us a way to record the date and time every journal entry was
written - useful for debugging.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:08 -04:00