linux/fs/ocfs2
Su Yue 952b023f06 ocfs2: fix races between hole punching and AIO+DIO
After commit "ocfs2: return real error code in ocfs2_dio_wr_get_block",
fstests generic/300 went from always failing to sometimes failing:

========================================================================
[  473.293420 ] run fstests generic/300

[  475.296983 ] JBD2: Ignoring recovery information on journal
[  475.302473 ] ocfs2: Mounting device (253,1) on (node local, slot 0) with ordered data mode.
[  494.290998 ] OCFS2: ERROR (device dm-1): ocfs2_change_extent_flag: Owner 5668 has an extent at cpos 78723 which can no longer be found
[  494.291609 ] On-disk corruption discovered. Please run fsck.ocfs2 once the filesystem is unmounted.
[  494.292018 ] OCFS2: File system is now read-only.
[  494.292224 ] (kworker/19:11,2628,19):ocfs2_mark_extent_written:5272 ERROR: status = -30
[  494.292602 ] (kworker/19:11,2628,19):ocfs2_dio_end_io_write:2374 ERROR: status = -3
fio: io_u error on file /mnt/scratch/racer: Read-only file system: write offset=460849152, buflen=131072
=========================================================================

In __blockdev_direct_IO, ocfs2_dio_wr_get_block is called to add unwritten
extents to a list.  The extents are also inserted into the extent tree in
ocfs2_write_begin_nolock.  Then another thread calls fallocate to punch a
hole at one of the unwritten extents.  The extent at cpos is removed by
ocfs2_remove_extent().  In the end-io worker thread,
ocfs2_search_extent_list then finds no extent at that cpos.

    T1                        T2                T3
                              inode lock
                                ...
                                insert extents
                                ...
                              inode unlock
ocfs2_fallocate
 __ocfs2_change_file_space
  inode lock
  lock ip_alloc_sem
  ocfs2_remove_inode_range inode
   ocfs2_remove_btree_range
    ocfs2_remove_extent
    ^---remove the extent at cpos 78723
  ...
  unlock ip_alloc_sem
  inode unlock
                                       ocfs2_dio_end_io
                                        ocfs2_dio_end_io_write
                                         lock ip_alloc_sem
                                         ocfs2_mark_extent_written
                                          ocfs2_change_extent_flag
                                           ocfs2_search_extent_list
                                           ^---failed to find extent
                                          ...
                                          unlock ip_alloc_sem
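
The ordering in the diagram can be replayed in a small userspace model. This is only an illustrative sketch, not ocfs2 code: a flat array stands in for the extent tree, all toy_* names are hypothetical, and returning -ENOENT for the missing extent is an assumption of the model (the kernel log above reports the read-only error instead).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model only: a flat array stands in for the ocfs2 extent tree. */
enum { TOY_MAX_EXTENTS = 8 };

struct toy_extent {
    unsigned int cpos;  /* starting cluster offset of the extent */
    int unwritten;      /* 1 until the end-io step marks it written */
    int present;        /* 0 once a hole punch has removed it */
};

struct toy_inode {
    struct toy_extent ext[TOY_MAX_EXTENTS];
    size_t count;
};

/* T2 in the diagram: the DIO write path inserts an unwritten extent. */
static int toy_insert_unwritten(struct toy_inode *ino, unsigned int cpos)
{
    assert(ino != NULL);
    if (ino->count >= TOY_MAX_EXTENTS)
        return -ENOSPC;
    ino->ext[ino->count++] = (struct toy_extent){ cpos, 1, 1 };
    return 0;
}

/* T1 in the diagram: the hole punch removes the extent at cpos. */
static void toy_punch_hole(struct toy_inode *ino, unsigned int cpos)
{
    for (size_t i = 0; i < ino->count; i++)
        if (ino->ext[i].present && ino->ext[i].cpos == cpos)
            ino->ext[i].present = 0;
}

/*
 * T3 in the diagram: the end-io step looks the extent up again and
 * marks it written. If the extent is gone, the lookup fails
 * (-ENOENT is this model's choice of error code).
 */
static int toy_mark_written(struct toy_inode *ino, unsigned int cpos)
{
    for (size_t i = 0; i < ino->count; i++) {
        if (ino->ext[i].present && ino->ext[i].cpos == cpos) {
            ino->ext[i].unwritten = 0;
            return 0;
        }
    }
    return -ENOENT;
}
```

Calling toy_insert_unwritten(), then toy_punch_hole(), then toy_mark_written() on the same cpos reproduces the failed lookup; swapping the last two calls lets the end-io step succeed.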

Most filesystems do not allow fallocate to race with AIO+DIO, so fix it
by waiting for all in-flight DIO before fallocate/punch_hole, as ext4
does.
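
The idea of draining in-flight DIO before punching can be sketched in userspace with a counter and a condition variable. This is a hedged model, not the patch itself: the dio_* names and struct are hypothetical, pthreads stand in for kernel primitives, and the model does not cover how new DIO submissions are held off while the punch runs. The text above does not say which kernel helper the fix calls; the generic VFS helper for this kind of wait is inode_dio_wait().

```c
#include <assert.h>
#include <pthread.h>

/* Toy stand-in for an inode: one extent plus a count of in-flight DIOs. */
struct dio_inode {
    pthread_mutex_t lock;
    pthread_cond_t  dio_idle;  /* signalled when dio_count drops to 0 */
    int dio_count;             /* number of DIOs not yet completed */
    int extent_present;        /* 1 while the unwritten extent exists */
    int end_io_status;         /* 0 on success, -1 if the extent was gone */
};

static void dio_inode_init(struct dio_inode *ino)
{
    pthread_mutex_init(&ino->lock, NULL);
    pthread_cond_init(&ino->dio_idle, NULL);
    ino->dio_count = 0;
    ino->extent_present = 0;
    ino->end_io_status = 0;
}

/* Submission side: insert the unwritten extent and account for the DIO. */
static void dio_submit(struct dio_inode *ino)
{
    pthread_mutex_lock(&ino->lock);
    ino->extent_present = 1;
    ino->dio_count++;
    pthread_mutex_unlock(&ino->lock);
}

/* Completion side: look the extent up, then drop the DIO count. */
static void *dio_end_io(void *arg)
{
    struct dio_inode *ino = arg;

    pthread_mutex_lock(&ino->lock);
    assert(ino->dio_count > 0);
    ino->end_io_status = ino->extent_present ? 0 : -1;
    if (--ino->dio_count == 0)
        pthread_cond_broadcast(&ino->dio_idle);
    pthread_mutex_unlock(&ino->lock);
    return NULL;
}

/* Hole punch with the wait added: drain in-flight DIO, then remove. */
static void dio_punch_hole(struct dio_inode *ino)
{
    pthread_mutex_lock(&ino->lock);
    while (ino->dio_count > 0)
        pthread_cond_wait(&ino->dio_idle, &ino->lock);
    ino->extent_present = 0;
    pthread_mutex_unlock(&ino->lock);
}
```

With the wait loop in dio_punch_hole(), a completion running in another thread always finds its extent, whichever thread is scheduled first; without the loop, the outcome depends on timing, which matches the "sometimes failed" behaviour described above.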

Link: https://lkml.kernel.org/r/20240408082041.20925-3-glass.su@suse.com
Fixes: b25801038d ("ocfs2: Support xfs style space reservation ioctls")
Signed-off-by: Su Yue <glass.su@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25 21:07:06 -07:00
cluster ocfs2: port block device access to file 2024-02-25 12:05:26 +01:00
dlm ocfs2: fix sparse warnings 2024-04-25 21:07:04 -07:00
dlmfs ocfs2: remove SLAB_MEM_SPREAD flag usage 2024-03-14 09:17:29 -07:00
acl.c ocfs2: convert to new timestamp accessors 2023-10-18 14:08:24 +02:00
acl.h fs: port ->set_acl() to pass mnt_idmap 2023-01-19 09:24:27 +01:00
alloc.c fs: convert block_write_full_page to block_write_full_folio 2023-12-29 11:58:35 -08:00
alloc.h
aops.c ocfs2: return real error code in ocfs2_dio_wr_get_block 2024-04-25 21:07:06 -07:00
aops.h
blockcheck.c
blockcheck.h
buffer_head_io.c ocfs2: fix a spelling typo in comment 2023-11-01 12:46:59 -07:00
buffer_head_io.h
dcache.c ocfs2_find_match(): there's no such thing as NULL or negative ->d_parent 2023-12-21 12:53:30 -05:00
dcache.h
dir.c __ocfs2_add_entry(), ocfs2_prepare_dir_for_insert(): namelen checks 2023-12-21 12:53:21 -05:00
dir.h
dlmglue.c ocfs2: spelling fix 2024-02-22 15:38:51 -08:00
dlmglue.h
export.c ocfs2: fix sparse warnings 2024-04-25 21:07:04 -07:00
export.h
extent_map.c
extent_map.h
file.c ocfs2: fix races between hole punching and AIO+DIO 2024-04-25 21:07:06 -07:00
file.h fs: port ->permission() to pass mnt_idmap 2023-01-19 09:24:28 +01:00
filecheck.c
filecheck.h
heartbeat.c ocfs2: fix a typo in a comment 2022-07-29 18:12:36 -07:00
heartbeat.h
inode.c ocfs2: fix sparse warnings 2024-04-25 21:07:04 -07:00
inode.h quota: Properly annotate i_dquot arrays with __rcu 2024-02-08 12:04:59 +01:00
ioctl.c ocfs2: update inode ctime in ocfs2_fileattr_set 2024-04-25 21:07:01 -07:00
ioctl.h fs: port ->fileattr_set() to pass mnt_idmap 2023-01-19 09:24:27 +01:00
journal.c ocfs2: annotate struct ocfs2_replay_map with __counted_by 2023-10-18 14:43:21 -07:00
journal.h ocfs2: use flexible array in 'struct ocfs2_recovery_map' 2023-08-18 10:18:57 -07:00
Kconfig fs: add CONFIG_BUFFER_HEAD 2023-08-02 09:13:09 -06:00
localalloc.c ocfs2: fix sparse warnings 2024-04-25 21:07:04 -07:00
localalloc.h
locks.c ocfs2: adapt to breakup of struct file_lock 2024-02-05 13:11:43 +01:00
locks.h
Makefile
mmap.c
mmap.h
move_extents.c ocfs2: improve write IO performance when fragmentation is high 2024-04-25 21:07:03 -07:00
move_extents.h
namei.c ocfs2: Avoid touching renamed directory if parent does not change 2023-11-25 02:53:20 -05:00
namei.h
ocfs1_fs_compat.h
ocfs2_fs.h ocfs2: improve write IO performance when fragmentation is high 2024-04-25 21:07:03 -07:00
ocfs2_ioctl.h
ocfs2_lockid.h
ocfs2_lockingver.h
ocfs2_trace.h ocfs2: remove writepage implementation 2023-12-29 11:58:35 -08:00
ocfs2.h ocfs2: always read both high and low parts of dinode link count 2022-12-11 19:30:19 -08:00
quota_global.c quota: Set nofs allocation context when acquiring dqio_sem 2024-01-23 19:21:11 +01:00
quota_local.c quota: Set nofs allocation context when acquiring dqio_sem 2024-01-23 19:21:11 +01:00
quota.h
refcounttree.c ocfs2: fix sparse warnings 2024-04-25 21:07:04 -07:00
refcounttree.h
reservations.c ocfs2: correctly use ocfs2_find_next_zero_bit() 2024-04-25 21:07:01 -07:00
reservations.h
resize.c ocfs2: improve write IO performance when fragmentation is high 2024-04-25 21:07:03 -07:00
resize.h
slot_map.c ocfs2: Annotate struct ocfs2_slot_info with __counted_by 2023-10-02 09:48:52 -07:00
slot_map.h
stack_o2cb.c ocfs2: use bitmap API in fill_node_map 2022-11-18 13:55:06 -08:00
stack_user.c ocfs2: adapt to breakup of struct file_lock 2024-02-05 13:11:43 +01:00
stackglue.c fs: Remove the now superfluous sentinel elements from ctl_table array 2023-12-28 04:57:57 -08:00
stackglue.h
suballoc.c ocfs2: speed up chain-list searching 2024-04-25 21:07:04 -07:00
suballoc.h ocfs2: improve write IO performance when fragmentation is high 2024-04-25 21:07:03 -07:00
super.c - Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min 2024-03-14 18:03:09 -07:00
super.h
symlink.c
symlink.h
sysfile.c
sysfile.h
uptodate.c
uptodate.h
xattr.c vfs-6.7.ctime 2023-10-30 09:47:13 -10:00
xattr.h ocfs2: move ocfs2_xattr_handlers and ocfs2_xattr_handler_map to .rodata 2023-10-09 16:24:20 +02:00