Commit Graph

1171138 Commits

Author SHA1 Message Date
Filipe Manana
3a49a54894 btrfs: initialize ret to -ENOSPC at __reserve_bytes()
At space-info.c:__reserve_bytes(), instead of initializing 'ret' to 0 when
it's declared and then shortly after set it to -ENOSPC under the space
info's spinlock, initialize it to -ENOSPC when declaring it.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:18 +02:00
Filipe Manana
9d0d47d5c3 btrfs: update flush method assertion when reserving space
When reserving space, at space-info.c:__reserve_bytes(), we assert that
either the current task is not holding a transacion handle, or, if it is,
that the flush method is not BTRFS_RESERVE_FLUSH_ALL. This is because that
flush method can trigger transaction commits, and therefore could lead to
a deadlock.

However there are other 2 flush methods that can trigger transaction
commits:

1) BTRFS_RESERVE_FLUSH_ALL_STEAL
2) BTRFS_RESERVE_FLUSH_EVICT

So update the assertion to check the flush method is also not one those
two methods if the current task is holding a transaction handle.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:18 +02:00
Filipe Manana
1a332502c8 btrfs: update documentation for BTRFS_RESERVE_FLUSH_EVICT flush method
The BTRFS_RESERVE_FLUSH_EVICT flush method can also commit transactions,
see the definition of the evict_flush_states const array at space-info.c,
but the documentation for it at space-info.h does not mention it.
So update the documentation.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:18 +02:00
Filipe Manana
b93fa4acbb btrfs: remove check for NULL block reserve at btrfs_block_rsv_check()
The block reserve passed to btrfs_block_rsv_check() is never NULL, so
remove the check. In case it can ever become NULL in the future, then
we'll get a pretty obvious and clear NULL pointer dereference crash and
stack trace.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:18 +02:00
Filipe Manana
5c1f2c6bca btrfs: pass a bool size update argument to btrfs_block_rsv_add_bytes()
At btrfs_delayed_refs_rsv_refill(), we are passing a value of 0 to the
'update_size' argument of btrfs_block_rsv_add_bytes(), which is defined
as a boolean. Functionally this is fine because a 0 is, implicitly,
converted to a boolean false value. However it's easier to read an
explicit 'false' value, so just pass 'false' instead of 0.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:18 +02:00
Filipe Manana
4e0527deb3 btrfs: pass a bool to btrfs_block_rsv_migrate() at evict_refill_and_join()
The last argument of btrfs_block_rsv_migrate() is a boolean, but we are
passing an integer, with a value of 1, to it at evict_refill_and_join().
While this is not a bug, due to type conversion, it's a lot more clear to
simply pass the boolean true value instead. So just do that.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:18 +02:00
Filipe Manana
318eee0328 btrfs: remove btrfs_lru_cache_is_full() inline function
It's not used anywhere at the moment, but it was used in earlier version
of a patch that removed its use in the second version. So just remove
btrfs_lru_cache_is_full().

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:18 +02:00
Christoph Hellwig
43fa4219bc btrfs: simplify adding pages in btrfs_add_compressed_bio_pages
btrfs_add_compressed_bio_pages is needlessly complicated.  Instead
of iterating over the logic disk offset just to add pages to the bio
use a simple offset starting at 0, which also removes most of the
claiming.  Additionally __bio_add_pages already takes care of the
assert that the bio is always properly sized, and btrfs_submit_bio
called right after asserts that the bio size is non-zero.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:18 +02:00
Christoph Hellwig
4513cb0c40 btrfs: move the bi_sector assignment out of btrfs_add_compressed_bio_pages
Adding pages to a bio has nothing to do with the sector.  Move the
assignment to the two callers in preparation for cleaning up
btrfs_add_compressed_bio_pages.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:18 +02:00
Naohiro Aota
d1cc579383 btrfs: sysfs: relax bg_reclaim_threshold for debugging purposes
Currently, /sys/fs/btrfs/<UUID>/bg_reclaim_threshold is limited to 0
(disable) or [50 .. 100]%, so we need to fill 50% of a device to start the
auto reclaim process. It is cumbersome to do so when we want to shake out
possible race issues of normal write vs reclaim.

Relax the threshold check under the BTRFS_DEBUG option.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:18 +02:00
Christoph Hellwig
2cef0c79bb btrfs: make btrfs_split_bio work on struct btrfs_bio
btrfs_split_bio expects a btrfs_bio as argument and always allocates one.
Type both the orig_bio argument and the return value as struct btrfs_bio
to improve type safety.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:18 +02:00
Christoph Hellwig
b41bbd293e btrfs: return a btrfs_bio from btrfs_bio_alloc
Return the containing struct btrfs_bio instead of the less type safe
struct bio from btrfs_bio_alloc.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:17 +02:00
Christoph Hellwig
9dfde1b47b btrfs: store a pointer to a btrfs_bio in struct btrfs_bio_ctrl
The bio in struct btrfs_bio_ctrl must be a btrfs_bio, so store a pointer
to the btrfs_bio for better type checking.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:17 +02:00
Christoph Hellwig
d733ea012d btrfs: simplify finding the inode in submit_one_bio
struct btrfs_bio now has an always valid inode pointer that can be used
to find the inode in submit_one_bio, so use that and initialize all
variables for which it is possible at declaration time.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:17 +02:00
Christoph Hellwig
b7d463a1d1 btrfs: store a pointer to the original btrfs_bio in struct compressed_bio
The original bio must be a btrfs_bio, so store a pointer to the
btrfs_bio for better type checking.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:17 +02:00
Christoph Hellwig
690834e47c btrfs: pass a btrfs_bio to btrfs_submit_compressed_read
btrfs_submit_compressed_read expects the bio passed to it to be embedded
into a btrfs_bio structure.  Pass the btrfs_bio directly to increase type
safety and make the code self-documenting.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:17 +02:00
Christoph Hellwig
ae42a154ca btrfs: pass a btrfs_bio to btrfs_submit_bio
btrfs_submit_bio expects the bio passed to it to be embedded into a
btrfs_bio structure.  Pass the btrfs_bio directly to increase type
safety and make the code self-documenting.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:17 +02:00
Christoph Hellwig
7edb9a3e72 btrfs: move zero filling of compressed read bios into common code
All algorithms have to fill the remainder of the orig_bio with zeroes,
so do it in common code.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:17 +02:00
Christoph Hellwig
34f888ce3a btrfs: cleanup main loop in btrfs_encoded_read_regular_fill_pages
btrfs_encoded_read_regular_fill_pages has a pretty odd control flow.
Unwind it so that there is a single loop over the pages array.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:17 +02:00
Christoph Hellwig
b665affe93 btrfs: remove unused members from struct btrfs_encoded_read_private
The inode and file_offset members in struct btrfs_encoded_read_private
are unused, so remove them.

Last used in commit 7959bd4411 ("btrfs: remove the start argument to
check_data_csum and export") and commit 7609afac67 ("btrfs: handle
checksum validation and repair at the storage layer").

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:17 +02:00
David Sterba
0b5485391d btrfs: locking: use atomic for DREW lock writers
The DREW lock uses percpu variable to track lock counters and for that
it needs to allocate the structure. In btrfs_read_tree_root() or
btrfs_init_fs_root() this may add another error case or requires the
NOFS scope protection.

One way is to preallocate the structure as was suggested in
https://lore.kernel.org/linux-btrfs/20221214021125.28289-1-robbieko@synology.com/

We may avoid the allocation altogether if we don't use the percpu
variables but an atomic for the writer counter. This should not make any
difference, the DREW lock is used for truncate and NOCOW writes along
with other IO operations.

The percpu counter for writers has been there since the original commit
8257b2dc3c "Btrfs: introduce btrfs_{start, end}_nocow_write() for
each subvolume". The reason could be to avoid hammering the same
cacheline from all the readers but then the writers do that anyway.

Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:17 +02:00
Anand Jain
ce4cf3793e btrfs: remove redundant clearing of NODISCARD
If no discard mount option is specified, including the NODISCARD option,
we make the async discard the default option then we don't have to call
the clear_opt again to clear the NODISCARD flag. Though this makes no
difference, that the call is redundant has been pointed out several
times so we better remove it.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:17 +02:00
Anand Jain
0f202b256a btrfs: avoid repetitive define BTRFS_FEATURE_INCOMPAT_SUPP
BTRFS_FEATURE_INCOMPAT_SUPP is defined twice, once under
CONFIG_BTRFS_DEBUG and once without it, resulting in repetitive code. The
reason for this is to add experimental features under CONFIG_BTRFS_DEBUG.

To avoid repetitive code, add a common list BTRFS_FEATURE_INCOMPAT_SUPP_STABLE,
and append experimental features only under CONFIG_BTRFS_DEBUG.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:16 +02:00
Qu Wenruo
6b4d375a81 btrfs: scrub: remove root and csum_root arguments from scrub_simple_mirror()
We don't need to pass the roots as arguments, reading them from the
rb-tree is cheap.  Thus there is really not much need to pre-fetch it
and pass it all the way from scrub_stripe().

And we already have more than enough arguments in scrub_simple_mirror()
and scrub_simple_stripe(), it's better to remove them and only grab
those roots in scrub_simple_mirror().

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:16 +02:00
Qu Wenruo
1d40329736 btrfs: scrub: remove unused path inside scrub_stripe()
The variable @path is no longer passed into any call sites after commit
18d30ab961 ("btrfs: scrub: use scrub_simple_mirror() to handle RAID56
data stripe scrub"), thus we can remove the variable completely.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:16 +02:00
Qu Wenruo
5f50fa918f btrfs: do not use replace target device as an extra mirror
[BUG]
Currently btrfs can use dev-replace device as an extra mirror for
read-repair.  But it can lead to NODATASUM corruption in the following
case:

 There is a RAID1 data chunk, and dev-replace is running from
 dev2 to dev0.

 |//| = Replaced data
          X       X+1MB     X+2MB
  Dev 2:  |       |         |           <- Source dev
  Dev 0:  |///////|         |           <- Target dev

Then a read on dev 2 X+2MB happens.
And something wrong happened inside devid 2, causing an -EIO.

In that case, read-repair would try the next mirror, and since we can
use target device as an extra mirror, we will use that mirror instead.

But unfortunately since the read is beyond the current replace cursor,
we should not trust it at all, what we get would be just uninitialized
garbage.

But if this read is for NODATASUM range, then we just trust them and
cause data corruption.

[CAUSE]
We used to have some checks to make sure we only return such extra
mirror when the range is before our left cursor.

The first commit introducing this behavior is ad6d620e2a ("Btrfs:
allow repair code to include target disk when searching mirrors").

But later a fix, 22ab04e814 ("Btrfs: fix race between device replace
and chunk allocation") changed the behavior, to always let
btrfs_map_block() include the extra mirror to address a race in
dev-replace which can cause missing writes to target device.

This means, we lose the tracking of cursor for the extra mirror, thus
can lead to above corruption.

[FIX]
The extra mirror is never a reliable one, at the beginning of
dev-replace, the reliability is zero, while only at the end of the
replace it's a fully reliable mirror.

We either do the complex tracking, or never trust it.

IMHO it's much easier to maintain if we don't trust it at all, and the
extra mirror can only benefit for a limited period of time (during
replace).

Thus this patch would completely remove the ability to use target device
as an extra mirror.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:16 +02:00
Qu Wenruo
4871c33baf btrfs: open_ctree() error handling cleanup
Currently open_ctree() still uses two variables for error handling, err
and ret. This can be confusing and missing some errors and does not
conform to current coding style.

This patch will fix the problems by:

- Use only ret for error handling

- Add proper ret assignment
  Originally we rely on the default value (-EINVAL) of err to handle
  errors, but that doesn't really reflects the error.
  This will change it use the correct error number for the following
  call sites:

  * subpage_info allocation
  * btrfs_free_extra_devids()
  * btrfs_check_rw_degradable()
  * cleaner_kthread allocation
  * transaction_kthread allocation

- Add an extra ASSERT()
  To make sure we error out instead of returning 0.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:16 +02:00
Christoph Hellwig
e2eb02480c btrfs: cleanup the main loop in btrfs_lookup_bio_sums
Introduce a bio_offset variable for the current offset into the bio
instead of recalculating it over and over.   Remove the now only used
once search_len and sector_offset variables, and reduce the scope for
count and cur_disk_bytenr.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:16 +02:00
Christoph Hellwig
65886d2b1f btrfs: remove search_file_offset_in_bio
There is no need to search for a file offset in a bio, it is now always
provided in bbio->file_offset (set at bio allocation time since
0d495430db ("btrfs: set bbio->file_offset in alloc_new_bio")).  Just
use that with the offset into the bio.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:16 +02:00
Johannes Thumshirn
198bd49e5f btrfs: sink calc_bio_boundaries into its only caller
Nowadays calc_bio_boundaries() is a relatively simple function that only
guarantees the one bio equals to one ordered extent rule for uncompressed
Zone Append bios.

Sink it into it's only caller alloc_new_bio().

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:16 +02:00
Christoph Hellwig
24e6c80822 btrfs: simplify main loop in submit_extent_page
bio_add_page always adds either the entire range passed to it or nothing.
Based on that btrfs_bio_add_page can only return a length smaller than
the passed in one when hitting the ordered extent limit, which can only
happen for writes.  Given that compressed writes never even use this code
path, this means that all the special cases for compressed extent offset
handling are dead code.

Reflow submit_extent_page to take advantage of this by inlining
btrfs_bio_add_page and handling the ordered extent limit by decrementing
it for each added range and thus significantly simplifying the loop.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:16 +02:00
Christoph Hellwig
78a2ef1b7b btrfs: check for contiguity in submit_extent_page
Different loop iterations in btrfs_bio_add_page not only have the same
contiguity parameters, but also any non-initial operation operates on a
fresh bio anyway.

Factor out the contiguity check into a new btrfs_bio_is_contig and only
call it once in submit_extent_page before descending into the
bio_add_page loop.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:16 +02:00
Christoph Hellwig
5380311fc8 btrfs: simplify the error handling in __extent_writepage_io
Remove the has_error and saved_ret variables, and just jump to a goto
label for error handling from the only place returning an error from the
main loop.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:16 +02:00
Christoph Hellwig
551733372f btrfs: remove the submit_extent_page return value
submit_extent_page always returns 0 since commit d5e4377d50 ("btrfs:
split zone append bios in btrfs_submit_bio").  Change it to a void return
type and remove all the unreachable error handling code in the callers.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:16 +02:00
Christoph Hellwig
f8ed4852f3 btrfs: remove the compress_type argument to submit_extent_page
Update the compress_type in the btrfs_bio_ctrl after forcing out the
previous bio in btrfs_do_readpage, so that alloc_new_bio can just use
the compress_type member in struct btrfs_bio_ctrl instead of passing the
same information redundantly as a function argument.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:15 +02:00
Christoph Hellwig
a140453bf9 btrfs: rename the this_bio_flag variable in btrfs_do_readpage
Rename this_bio_flag to compress_type to match the surrounding code
and better document the intent.  Also use the proper enum type instead
of unsigned long.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:15 +02:00
Christoph Hellwig
c9bc621fb4 btrfs: move the compress_type check out of btrfs_bio_add_page
The compress_type can only change on a per-extent basis.  So instead of
checking it for every page in btrfs_bio_add_page, do the check once in
btrfs_do_readpage, which is the only caller of btrfs_bio_add_page and
submit_extent_page that deals with compressed extents.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:15 +02:00
Christoph Hellwig
72b505dc57 btrfs: add a wbc pointer to struct btrfs_bio_ctrl
Instead of passing down the wbc pointer the deep call chain, just
add it to the btrfs_bio_ctrl structure.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:15 +02:00
Christoph Hellwig
794c26e214 btrfs: remove the sync_io flag in struct btrfs_bio_ctrl
The sync_io flag is equivalent to wbc->sync_mode == WB_SYNC_ALL, so
just check for that and remove the separate flag.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:15 +02:00
Christoph Hellwig
c000bc04ba btrfs: store the bio opf in struct btrfs_bio_ctrl
The bio op and flags never change over the life time of a bio_ctrl,
so move it in there instead of passing it down the deep call chain
all the way down to alloc_new_bio.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:15 +02:00
Christoph Hellwig
eb8d0c6d04 btrfs: remove the force_bio_submit to submit_extent_page
If force_bio_submit, submit_extent_page simply calls submit_one_bio as
the first thing.  This can just be moved to the only caller that sets
force_bio_submit to true.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:15 +02:00
Christoph Hellwig
67998cf438 btrfs: don't set force_bio_submit in read_extent_buffer_subpage
When read_extent_buffer_subpage calls submit_extent_page, it does
so on a freshly initialized btrfs_bio_ctrl structure that can't have
a valid bio to submit.  Clear the force_bio_submit parameter to false
as there is nothing to submit.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:15 +02:00
Anand Jain
fdf8d595f4 btrfs: open code btrfs_bin_search()
btrfs_bin_search() is a simple wrapper that searches for the whole slots
by calling btrfs_generic_bin_search() with the starting slot/first_slot
preset to 0.

This simple wrapper can be open coded as btrfs_bin_search().

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:15 +02:00
Qu Wenruo
7b31e0451d btrfs: dev-replace: properly follow its read mode
[BUG]
Although dev replace ioctl has a way to specify the mode on whether we
should read from the source device, it's not properly followed.

 # mkfs.btrfs -f -d raid1 -m raid1 $dev1 $dev2
 # mount $dev1 $mnt
 # xfs_io -f -c "pwrite 0 32M" $mnt/file
 # sync
 # btrfs replace start -r -f 1 $dev3 $mnt

And one extra trace is added to scrub_submit(), showing the detail about
the bio:

  btrfs-11569 [005] ...  37.0270: scrub_submit.part.0: devid=1 logical=22036480 phy=22036480 len=16384
  btrfs-11569 [005] ...  37.0273: scrub_submit.part.0: devid=1 logical=30457856 phy=30457856 len=32768
  btrfs-11569 [005] ...  37.0274: scrub_submit.part.0: devid=1 logical=30507008 phy=30507008 len=49152
  btrfs-11569 [005] ...  37.0274: scrub_submit.part.0: devid=1 logical=30605312 phy=30605312 len=32768
  btrfs-11569 [005] ...  37.0275: scrub_submit.part.0: devid=1 logical=30703616 phy=30703616 len=65536
  btrfs-11569 [005] ...  37.0281: scrub_submit.part.0: devid=1 logical=298844160 phy=298844160 len=131072
  ...
  btrfs-11569 [005] ...  37.0762: scrub_submit.part.0: devid=1 logical=322961408 phy=322961408 len=131072
  btrfs-11569 [005] ...  37.0762: scrub_submit.part.0: devid=1 logical=323092480 phy=323092480 len=131072

One can see that all the reads are submitted to devid 1, even if we have
specified "-r" option to avoid reading from the source device.

[CAUSE]
The dev-replace read mode is only set but not followed by scrub code at
all.  In fact, only common read path is properly following the read
mode, but scrub itself has its own read path, thus not following the
mode.

[FIX]
Here we enhance scrub_find_good_copy() to also follow the read mode.

The idea is pretty simple, in the first loop, we avoid the following
devices:

- Missing devices
  This is the existing condition

- The source device if the replace wants to avoid it.

And if above loop found no candidate (e.g. replace a single device),
then we discard the 2nd condition, and try again.

Since we're here, also enhance the function scrub_find_good_copy() by:

- Remove the forward declaration

- Makes it return int
  To indicates errors, e.g. no good mirror found.

- Add extra error messages

Now with the same trace, "btrfs replace start -r" works as expected:

  btrfs-1213 [000] ...  991.9059: scrub_submit.part.0: devid=2 logical=22036480 phy=1064960 len=16384
  btrfs-1213 [000] ...  991.9062: scrub_submit.part.0: devid=2 logical=30457856 phy=9486336 len=32768
  btrfs-1213 [000] ...  991.9063: scrub_submit.part.0: devid=2 logical=30507008 phy=9535488 len=49152
  btrfs-1213 [000] ...  991.9064: scrub_submit.part.0: devid=2 logical=30605312 phy=9633792 len=32768
  btrfs-1213 [000] ...  991.9065: scrub_submit.part.0: devid=2 logical=30703616 phy=9732096 len=65536
  btrfs-1213 [000] ...  991.9073: scrub_submit.part.0: devid=2 logical=298844160 phy=277872640 len=131072
  btrfs-1213 [000] ...  991.9075: scrub_submit.part.0: devid=2 logical=298975232 phy=278003712 len=131072
  btrfs-1213 [000] ...  991.9078: scrub_submit.part.0: devid=2 logical=299106304 phy=278134784 len=131072
  ...
  btrfs-1213 [000] ...  991.9474: scrub_submit.part.0: devid=2 logical=318504960 phy=297533440 len=131072
  btrfs-1213 [000] ...  991.9476: scrub_submit.part.0: devid=2 logical=318636032 phy=297664512 len=131072
  btrfs-1213 [000] ...  991.9479: scrub_submit.part.0: devid=2 logical=318767104 phy=297795584 len=131072

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:15 +02:00
Christoph Hellwig
f9327a70c1 btrfs: fold finish_compressed_bio_write into btrfs_finish_compressed_write_work
Fold finish_compressed_bio_write into its only caller as there is no
reason to keep them separate.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:15 +02:00
Christoph Hellwig
a959a1745d btrfs: don't clear page->mapping in btrfs_free_compressed_pages
No one ever set ->mapping on these pages, so don't bother clearing it.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:15 +02:00
Christoph Hellwig
32586c5bca btrfs: factor out a btrfs_free_compressed_pages helper
Share the code to free the compressed pages and the array to hold them
into a common helper.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:14 +02:00
Christoph Hellwig
10e924bc32 btrfs: factor out a btrfs_add_compressed_bio_pages helper
Factor out a common helper to add the compressed_bio pages to the
bio that is shared by the compressed read and write path.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:14 +02:00
Christoph Hellwig
d7294e4dee btrfs: use the bbio file offset in add_ra_bio_pages
struct btrfs_bio now has a file_offset field set up by all submitters.
Use that value combined with the bio size in add_ra_bio_pages to
calculate the last offset in the bio.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:14 +02:00
Christoph Hellwig
e7aff33e31 btrfs: use the bbio file offset in btrfs_submit_compressed_read
struct btrfs_bio now has a file_offset field set up by all submitters.
Use that in btrfs_submit_compressed_read instead of recalculating the
value.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-04-17 18:01:14 +02:00