Commit Graph

3 Commits

Author SHA1 Message Date
Qu Wenruo
d73a27b86f btrfs: handle case when repair happens with dev-replace
[BUG]
There is a bug report that a BUG_ON() in btrfs_repair_io_failure()
(originally repair_io_failure() in v6.0 kernel) got triggered when
replacing a unreliable disk:

  BTRFS warning (device sda1): csum failed root 257 ino 2397453 off 39624704 csum 0xb0d18c75 expected csum 0x4dae9c5e mirror 3
  kernel BUG at fs/btrfs/extent_io.c:2380!
  invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 9 PID: 3614331 Comm: kworker/u257:2 Tainted: G           OE      6.0.0-5-amd64 #1  Debian 6.0.10-2
  Hardware name: Micro-Star International Co., Ltd. MS-7C60/TRX40 PRO WIFI (MS-7C60), BIOS 2.70 07/01/2021
  Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
  RIP: 0010:repair_io_failure+0x24a/0x260 [btrfs]
  Call Trace:
   <TASK>
   clean_io_failure+0x14d/0x180 [btrfs]
   end_bio_extent_readpage+0x412/0x6e0 [btrfs]
   ? __switch_to+0x106/0x420
   process_one_work+0x1c7/0x380
   worker_thread+0x4d/0x380
   ? rescuer_thread+0x3a0/0x3a0
   kthread+0xe9/0x110
   ? kthread_complete_and_exit+0x20/0x20
   ret_from_fork+0x22/0x30

[CAUSE]

Before the BUG_ON(), we got some read errors from the replace target
first, note the mirror number (3, which is beyond RAID1 duplication,
thus it's read from the replace target device).

Then at the BUG_ON() location, we are trying to writeback the repaired
sectors back the failed device.

The check looks like this:

		ret = btrfs_map_block(fs_info, BTRFS_MAP_WRITE, logical,
				      &map_length, &bioc, mirror_num);
		if (ret)
			goto out_counter_dec;
		BUG_ON(mirror_num != bioc->mirror_num);

But inside btrfs_map_block(), we can modify bioc->mirror_num especially
for dev-replace:

	if (dev_replace_is_ongoing && mirror_num == map->num_stripes + 1 &&
	    !need_full_stripe(op) && dev_replace->tgtdev != NULL) {
		ret = get_extra_mirror_from_replace(fs_info, logical, *length,
						    dev_replace->srcdev->devid,
						    &mirror_num,
					    &physical_to_patch_in_first_stripe);
		patch_the_first_stripe_for_dev_replace = 1;
	}

Thus if we're repairing the replace target device, we're going to
trigger that BUG_ON().

But in reality, the read failure from the replace target device may be
that, our replace hasn't reached the range we're reading, thus we're
reading garbage, but with replace running, the range would be properly
filled later.

Thus in that case, we don't need to do anything but let the replace
routine to handle it.

[FIX]
Instead of a BUG_ON(), just skip the repair if we're repairing the
device replace target device.

Reported-by: 小太 <nospam@kota.moe>
Link: https://lore.kernel.org/linux-btrfs/CACsxjPYyJGQZ+yvjzxA1Nn2LuqkYqTCcUH43S=+wXhyf8S00Ag@mail.gmail.com/
CC: stable@vger.kernel.org # 6.0+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-01-03 15:53:18 +01:00
Christoph Hellwig
bacf60e515 btrfs: move repair_io_failure to bio.c
repair_io_failure ties directly into all the glory low-level details of
mapping a bio with a logic address to the actual physical location.
Move it right below btrfs_submit_bio to keep all the related logic
together.

Also move btrfs_repair_eb_io_failure to its caller in disk-io.c now that
repair_io_failure is available in a header.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05 18:00:57 +01:00
Christoph Hellwig
103c19723c btrfs: split the bio submission path into a separate file
The code used by btrfs_submit_bio only interacts with the rest of
volumes.c through __btrfs_map_block (which itself is a more generic
version of two exported helpers) and does not really have anything
to do with volumes.c.  Create a new bio.c file and a bio.h header
going along with it for the btrfs_bio-based storage layer, which
will grow even more going forward.

Also update the file with my copyright notice given that a large
part of the moved code was written or rewritten by me.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05 18:00:57 +01:00