Go to file
Qu Wenruo c28ea613fa btrfs: subpage: fix the false data csum mismatch error
[BUG]
When running fstresss, we can hit strange data csum mismatch where the
on-disk data is in fact correct (passes scrub).

With some extra debug info added, we have the following traces:

  0482us: btrfs_do_readpage: root=5 ino=284 offset=393216, submit force=0 pgoff=0 iosize=8192
  0494us: btrfs_do_readpage: root=5 ino=284 offset=401408, submit force=0 pgoff=8192 iosize=4096
  0498us: btrfs_submit_data_bio: root=5 ino=284 bio first bvec=393216 len=8192
  0591us: btrfs_do_readpage: root=5 ino=284 offset=405504, submit force=0 pgoff=12288 iosize=36864
  0594us: btrfs_submit_data_bio: root=5 ino=284 bio first bvec=401408 len=4096
  0863us: btrfs_submit_data_bio: root=5 ino=284 bio first bvec=405504 len=36864
  0933us: btrfs_verify_data_csum: root=5 ino=284 offset=393216 len=8192
  0967us: btrfs_do_readpage: root=5 ino=284 offset=442368, skip beyond isize pgoff=49152 iosize=16384
  1047us: btrfs_verify_data_csum: root=5 ino=284 offset=401408 len=4096
  1163us: btrfs_verify_data_csum: root=5 ino=284 offset=405504 len=36864
  1290us: check_data_csum: !!! root=5 ino=284 offset=438272 pg_off=45056 !!!
  7387us: end_bio_extent_readpage: root=5 ino=284 before pending_read_bios=0

[CAUSE]
Normally we expect all submitted bio reads to only touch the range we
specified, and under subpage context, it means we should only touch the
range specified in each bvec.

But in data read path, inside end_bio_extent_readpage(), we have page
zeroing which only takes regular page size into consideration.

This means for subpage if we have an inode whose content looks like below:

  0       16K     32K     48K     64K
  |///////|       |///////|       |

  |//| = data needs to be read from disk
  |  | = hole

And i_size is 64K initially.

Then the following race can happen:

		T1		|		T2
--------------------------------+--------------------------------
btrfs_do_readpage()		|
|- isize = 64K;			|
|  At this time, the isize is 	|
|  64K				|
|				|
|- submit_extent_page()		|
|  submit previous assembled bio|
|  assemble bio for [0, 16K)	|
|				|
|- submit_extent_page()		|
   submit read bio for [0, 16K) |
   assemble read bio for	|
   [32K, 48K)			|
 				|
				| btrfs_setsize()
				| |- i_size_write(, 16K);
				|    Now i_size is only 16K
end_io() for [0K, 16K)		|
|- end_bio_extent_readpage()	|
   |- btrfs_verify_data_csum()  |
   |  No csum error		|
   |- i_size = 16K;		|
   |- zero_user_segment(16K,	|
      PAGE_SIZE);		|
      !!! We zeroed range	|
      !!! [32K, 48K)		|
				| end_io for [32K, 48K)
				| |- end_bio_extent_readpage()
				|    |- btrfs_verify_data_csum()
				|       ! CSUM MISMATCH !
				|       ! As the range is zeroed now !

[FIX]
To fix the problem, make end_bio_extent_readpage() to only zero the
range of bvec.

The bug only affects subpage read-write support, as for full read-only
mount we can't change i_size thus won't hit the race condition.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-03-02 17:48:00 +01:00
arch - For syscall user dispatch, separate ptctl operation from syscall 2021-02-07 10:16:24 -08:00
block block: add bio_add_zone_append_page 2021-02-09 00:52:19 +01:00
certs .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
crypto X.509: Fix crash caused by NULL pointer 2021-01-20 11:33:51 -08:00
Documentation - For syscall user dispatch, separate ptctl operation from syscall 2021-02-07 10:16:24 -08:00
drivers libnvdimm for 5.11-rc7 2021-02-07 10:45:26 -08:00
fs btrfs: subpage: fix the false data csum mismatch error 2021-03-02 17:48:00 +01:00
include iomap: support REQ_OP_ZONE_APPEND 2021-02-09 00:52:19 +01:00
init init/gcov: allow CONFIG_CONSTRUCTORS on UML to fix module gcov 2021-02-05 11:03:47 -08:00
ipc Merge branch 'akpm' (patches from Andrew) 2020-12-15 12:53:37 -08:00
kernel tracing: Fix output of top level event "enable" file 2021-02-08 11:32:39 -08:00
lib - Revert an attempt to not spread IRQ threads on isolated CPUs which has 2021-02-07 10:03:43 -08:00
LICENSES LICENSES: Add the CC-BY-4.0 license 2020-12-08 10:33:27 -07:00
mm mm: hugetlb: fix missing put_page in gather_surplus_pages() 2021-02-05 11:03:47 -08:00
net Fixes: 2021-02-05 10:11:14 -08:00
samples ARM: SoC drivers for v5.11 2020-12-16 16:38:41 -08:00
scripts kallsyms: fix nonconverging kallsyms table with lld 2021-02-05 17:53:28 +09:00
security cap: fix conversions on getxattr 2021-01-28 10:22:48 +01:00
sound ALSA: hda/via: Apply the workaround generically for Clevo machines 2021-01-26 18:05:03 +01:00
tools libnvdimm for 5.11-rc7 2021-02-07 10:45:26 -08:00
usr Merge branch 'work.fdpic' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 13:29:39 -07:00
virt KVM/arm64 fixes for 5.11, take #2 2021-01-25 18:52:01 -05:00
.clang-format clang-format: Update with the latest for_each macro list 2021-01-29 15:00:23 +01:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: docs: ignore sphinx_*/ directories 2020-09-10 10:44:31 -06:00
.mailmap block-5.11-2021-02-05 2021-02-06 14:40:27 -08:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: dccp: move Gerrit Renker to CREDITS 2021-01-14 10:53:49 -08:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS MAINTAINERS/.mailmap: use my @kernel.org address 2021-02-05 11:03:47 -08:00
Makefile Linux 5.11-rc7 2021-02-07 13:57:38 -08:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.