linux

iv/linux

Zhihao Cheng 556c19f563 ubifs: Queue up space reservation tasks if retrying many times

Recently we caught ENOSPC returned by make_reservation() while running
fsstress on UBIFS. The following information was printed when it occurred
(see details in Link):

 UBIFS error (ubi0:0 pid 3640152): make_reservation [ubifs]: cannot
 reserve 112 bytes in jhead 2, error -28
 CPU: 2 PID: 3640152 Comm: kworker/u16:2 Tainted: G    B   W
 Hardware name: Hisilicon PhosphorHi1230 EMU (DT)
 Workqueue: writeback wb_workfn (flush-ubifs_0_0)
 Call trace:
  dump_stack+0x114/0x198
  make_reservation+0x564/0x610 [ubifs]
  ubifs_jnl_write_data+0x328/0x48c [ubifs]
  do_writepage+0x2a8/0x3e4 [ubifs]
  ubifs_writepage+0x16c/0x374 [ubifs]
  generic_writepages+0xb4/0x114
  do_writepages+0xcc/0x11c
  writeback_sb_inodes+0x2d0/0x564
  wb_writeback+0x20c/0x2b4
  wb_workfn+0x404/0x510
  process_one_work+0x304/0x4ac
  worker_thread+0x31c/0x4e4
  kthread+0x23c/0x290
  Budgeting info: data budget sum 17576, total budget sum 17768
	budg_data_growth 4144, budg_dd_growth 13432, budg_idx_growth 192
	min_idx_lebs 13, old_idx_sz 988640, uncommitted_idx 0
	page_budget 4144, inode_budget 160, dent_budget 312
	nospace 0, nospace_rp 0
	dark_wm 8192, dead_wm 4096, max_idx_node_sz 192
	freeable_cnt 0, calc_idx_sz 988640, idx_gc_cnt 0
	dirty_pg_cnt 4, dirty_zn_cnt 0, clean_zn_cnt 4811
	gc_lnum 21, ihead_lnum 14
	jhead 0 (GC)	 LEB 16
	jhead 1 (base)	 LEB 34
	jhead 2 (data)	 LEB 23
	bud LEB 16
	bud LEB 23
	bud LEB 34
	old bud LEB 33
	old bud LEB 31
	old bud LEB 15
	commit state 4
 Budgeting predictions:
	available: 33832, outstanding 17576, free 15356
 (pid 3640152) start dumping LEB properties
 (pid 3640152) Lprops statistics: empty_lebs 3, idx_lebs  11
	taken_empty_lebs 1, total_free 1253376, total_dirty 2445736
	total_used 3438712, total_dark 65536, total_dead 17248
 LEB 15 free 0      dirty 248000   used 5952   (taken)
 LEB 16 free 110592 dirty 896      used 142464 (taken, jhead 0 (GC))
 LEB 21 free 253952 dirty 0        used 0      (taken, GC LEB)
 LEB 23 free 0      dirty 248104   used 5848   (taken, jhead 2 (data))
 LEB 29 free 253952 dirty 0        used 0      (empty)
 LEB 33 free 0      dirty 253952   used 0      (taken)
 LEB 34 free 217088 dirty 36544    used 320    (taken, jhead 1 (base))
 LEB 37 free 253952 dirty 0        used 0      (empty)
 OTHERS: index lebs, zero-available non-index lebs

According to the budget algorithm, there are 5 LEBs reserved for budget:
three journal heads (16, 23, 34), 1 GC LEB (21) and 1 deletion LEB (which
can be used in make_reservation()). There are 2 empty LEBs reserved for
index nodes, calculated as min_idx_lebs - idx_lebs = 2. In theory, LEBs 15
and 33 should be reclaimed to the free state after committing, but they
are still in the taken state. After looking at the implementation of
reserve_space(), there is a possible situation:

LEB 15: free 2000 dirty 248000 used 3952 (jhead 2)
LEB 23: free 2000 dirty 248104 used 3848 (bud, taken)
LEB 33: free 2000 dirty 251952 used 0    (bud, taken)

      wb_workfn          wb_workfn_2
do_writepage // write 3000 bytes
 ubifs_jnl_write_data
  make_reservation
   reserve_space
    ubifs_garbage_collect
     ubifs_find_dirty_leb // ret ENOSPC, dirty LEBs are taken
   nospc_retries++  // 1
   ubifs_run_commit
    do_commit

LEB 15: free 2000 dirty 248000 used 3952 (jhead 2)
LEB 23: free 2000 dirty 248104 used 3848 (dirty)
LEB 33: free 2000 dirty 251952 used 0    (dirty)

                   do_writepage // write 2000 bytes for 3 times
		    ubifs_jnl_write_data
		    // grabs 15\23\33

LEB 15: free 0    dirty 248000 used 5952 (bud, taken)
LEB 23: free 0    dirty 248104 used 5848 (jhead 2)
LEB 33: free 0    dirty 253952 used 0    (bud, taken)

   reserve_space
    ubifs_garbage_collect
     ubifs_find_dirty_leb // ret ENOSPC, dirty LEBs are taken
   if (nospc_retries++ < 2) // false
 ubifs_ro_mode !

A reproducer is available in Link.
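
The interleaving above can be replayed with a small single-threaded toy model. This is only a sketch, not kernel code: `struct model`, `commit()`, `competitor_grabs_all()` and `reserve_with_retry_limit()` are hypothetical names invented for illustration, and the whole LEB state is collapsed into one counter of dirty LEBs that GC could still pick. The only detail taken from the message is the retry check `nospc_retries++ < 2`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NOSPC_RETRY_LIMIT 2   /* the "twice" limit described in the message */

/* Hypothetical model: dirty LEBs that are not taken, i.e. visible to GC. */
struct model {
	int reclaimable_lebs;
};

/* Assumption: a commit turns old buds back into plain dirty LEBs. */
static void commit(struct model *m, int old_buds)
{
	m->reclaimable_lebs += old_buds;
}

/* Assumption: a competing writer takes every dirty LEB as head/bud. */
static void competitor_grabs_all(struct model *m)
{
	m->reclaimable_lebs = 0;
}

/* GC can only make progress if some dirty LEB is not taken. */
static bool gc_finds_dirty_leb(const struct model *m)
{
	return m->reclaimable_lebs > 0;
}

/* Returns 0 on success, -ENOSPC once the retry limit is exhausted. */
static int reserve_with_retry_limit(struct model *m, bool competitor_active)
{
	int nospc_retries = 0;

	assert(m != NULL);
	for (;;) {
		if (gc_finds_dirty_leb(m))
			return 0;
		if (!(nospc_retries++ < NOSPC_RETRY_LIMIT))
			return -ENOSPC;          /* real code goes read-only here */
		commit(m, 3);                    /* LEBs 15, 23, 33 in the example */
		if (competitor_active)
			competitor_grabs_all(m); /* second writeback worker */
	}
}
```

In this model a lone writer succeeds after one commit, while a writer that is raced on every commit runs out of retries even though each commit did free space.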

The dirty LEBs can be grabbed by other threads, which makes GC in the
current thread fail to find a dirty LEB, so make_reservation() may need to
invoke GC and commit many times, but the current implementation limits the
number of retries to 'nospc_retries' (twice).
Fix it by adding a wait queue: start queuing up space reservation tasks
once some task has retried GC + commit many times. Then only one task
makes a space reservation at any time, and it can always succeed provided
that budgeting is correct.
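
The queueing idea can be sketched with the same kind of single-threaded toy model. Again this is an illustration under stated assumptions, not the code of this patch: `struct space_model`, its fields and `reserve_with_queue()` are hypothetical, and "queued" tasks are modelled simply as a competitor that is not allowed to take LEBs while the queue is active, rather than as sleeping tasks on a real wait queue.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NOSPC_RETRY_LIMIT 2

/* Hypothetical model state, all names are assumptions for illustration. */
struct space_model {
	int  reclaimable_lebs;  /* dirty LEBs that GC may still pick */
	int  buds_per_commit;   /* dirty LEBs released by each commit */
	bool queue_active;      /* set once a task has retried too often */
};

/* Competing writer: while the queue is active it must wait its turn. */
static void competitor_write(struct space_model *m)
{
	if (!m->queue_active)
		m->reclaimable_lebs = 0;
}

/* Returns 0 on success, -ENOSPC only if an exclusive attempt also fails. */
static int reserve_with_queue(struct space_model *m)
{
	int nospc_retries = 0;

	assert(m != NULL);
	for (;;) {
		if (m->reclaimable_lebs > 0) {
			m->reclaimable_lebs--;    /* GC reclaimed one LEB for us */
			m->queue_active = false;  /* let queued tasks proceed */
			return 0;
		}
		if (nospc_retries++ >= NOSPC_RETRY_LIMIT) {
			if (m->queue_active) {
				/* Failed even without competition. */
				m->queue_active = false;
				return -ENOSPC;
			}
			m->queue_active = true;   /* start queuing other tasks */
		}
		m->reclaimable_lebs += m->buds_per_commit;  /* commit */
		competitor_write(m);
	}
}
```

With three buds released per commit, the task loses the race twice, activates the queue on the third failure and then succeeds; -ENOSPC is returned only when a commit frees nothing even for the exclusive attempt, which corresponds to the "correct budgeting" premise not holding.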

Link: https://bugzilla.kernel.org/show_bug.cgi?id=218164
Fixes: 1e51764a3c2a ("UBIFS: add new flash file system")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>

2024-02-25 22:09:27 +01:00

9p: Use length of data written to the server in preference to error

2024-01-04 13:15:31 +00:00

adfs

adfs: remove writepage implementation

2023-12-29 11:58:33 -08:00

affs

affs: d_obtain_alias(ERR_PTR(...)) will do the right thing

2023-12-21 12:51:02 -05:00

afs

afs: Fix missing/incorrect unlocking of RCU read lock

2024-01-22 22:30:38 +00:00

autofs

dcache stuff for this cycle

2024-01-11 20:11:35 -08:00

bcachefs

bcachefs: Fix missing va_end()

2024-02-13 21:59:27 -05:00

befs

befs: d_obtain_alias(ERR_PTR(...)) will do the right thing

2023-12-21 12:51:02 -05:00

bfs

misc cleanups (the part that hadn't been picked by individual fs trees)

2024-01-11 20:23:50 -08:00

btrfs

for-6.8-rc4-tag

2024-02-14 15:47:02 -08:00

cachefiles

cachefiles, erofs: Fix NULL deref in when cachefiles is not doing ondemand-mode

2024-01-22 22:25:15 +00:00

ceph

ceph: add ceph_cap_unlink_work to fire check_caps() immediately

2024-02-13 11:22:54 +01:00

coda

dcache stuff for this cycle

2024-01-11 20:11:35 -08:00

configfs

…

cramfs

vfs-6.7.ctime

2023-10-30 09:47:13 -10:00

crypto

fscrypt: document that CephFS supports fscrypt now

2023-12-26 22:55:42 -06:00

debugfs

Merge branches 'acpi-pm', 'acpi-video', 'acpi-apei' and 'acpi-extlog'

2024-01-04 13:19:40 +01:00

devpts

fs: Remove the now superfluous sentinel elements from ctl_table array

2023-12-28 04:57:57 -08:00

dlm

dlm: update format header reflect current format

2023-12-20 15:36:48 -06:00

ecryptfs

fix directory locking scheme on rename

2024-01-11 20:00:22 -08:00

efivarfs

efivarfs: automatically update super block flag

2023-12-11 11:19:18 +01:00

efs

vfs-6.7.fsid

2023-11-07 12:11:26 -08:00

erofs

erofs: relaxed temporary buffers allocation on readahead

2024-01-27 12:28:08 +08:00

exfat

exfat: fix zero the unwritten part for dio read

2024-01-18 23:01:51 +09:00

exportfs

fs: fix build error with CONFIG_EXPORTFS=m or not defined

2023-10-28 16:16:19 +02:00

ext2

fix directory locking scheme on rename

2024-01-11 20:00:22 -08:00

ext4

Miscellaneous bug fixes and cleanups in ext4's multi-block allocator

2024-02-04 07:33:01 +00:00

f2fs

f2fs: fix double free of f2fs_sb_info

2024-01-12 18:55:09 -08:00

fat

vfs-6.7.fsid

2023-11-07 12:11:26 -08:00

freevxfs

freevxfs: lookup: fix function params kernel-doc

2023-12-20 15:02:58 -08:00

fuse

vfs-6.8.rw

2024-01-08 11:11:51 -08:00

gfs2

Revert "gfs2: Use GL_NOBLOCK flag for non-blocking lookups"

2024-02-02 17:21:44 +01:00

hfs

hfs: really remove hfs_writepage

2023-12-29 11:58:34 -08:00

hfsplus

Many singleton patches against the MM code. The patch series which

2024-01-09 11:18:47 -08:00

hostfs

hostfs: use d_splice_alias() calling conventions to simplify failure exits

2023-12-21 12:51:00 -05:00

hpfs

…

hugetlbfs

fs,hugetlb: fix NULL pointer dereference in hugetlbs_fill_super

2024-02-07 21:20:36 -08:00

iomap

mm: add folio_fill_tail() and use it in iomap

2023-12-10 16:51:36 -08:00

isofs

…

jbd2

jbd2: abort journal when detecting metadata writeback error of fs dev

2024-01-04 23:42:21 -05:00

jffs2

jffs2: mark __jffs2_dbg_superblock_counts() static

2023-12-10 17:21:43 -08:00

jfs

Revert "jfs: fix shift-out-of-bounds in dbJoin"

2024-01-29 08:45:10 -06:00

kernfs

Revert "kernfs: convert kernfs_idr_lock to an irq safe raw spinlock"

2024-01-11 11:51:27 +01:00

lockd

sysctl-6.8-rc1

2024-01-10 17:44:36 -08:00

minix

minixfs kmap_local_page() switchover and related fixes - very similar to sysv series.

2024-01-11 19:54:18 -08:00

netfs

netfs: Fix a NULL vs IS_ERR() check in netfs_perform_write()

2024-01-22 21:58:35 +00:00

nfs

vfs-6.8.netfs

2024-01-19 09:10:23 -08:00

nfs_common

…

nfsd

nfsd-6.8 fixes:

2024-02-07 17:48:15 +00:00

nilfs2

nilfs2: fix potential bug in end_buffer_async_write

2024-02-07 21:20:37 -08:00

nls

…

notify

dcache stuff for this cycle

2024-01-11 20:11:35 -08:00

ntfs

sysctl-6.8-rc1

2024-01-10 17:44:36 -08:00

ntfs3

fs/ntfs3: Slightly simplify ntfs_inode_printk()

2024-01-29 12:05:09 +03:00

ocfs2

misc cleanups (the part that hadn't been picked by individual fs trees)

2024-01-11 20:23:50 -08:00

omfs

…

openpromfs

…

orangefs

orangefs: saner arguments passing in readdir guts

2023-12-21 12:53:36 -05:00

overlayfs

vfs-6.8-rc5.fixes

2024-02-12 07:15:45 -08:00

proc

fs/proc: do_task_stat: use sig->stats_lock to gather the threads/children stats

2024-02-07 21:20:33 -08:00

pstore

pstore: inode: Use cleanup.h for struct pstore_private

2023-12-08 14:15:44 -08:00

qnx4

qnx4: Use get_directory_fname() in qnx4_match()

2023-12-13 11:19:18 -08:00

qnx6

…

quota

sysctl-6.8-rc1

2024-01-10 17:44:36 -08:00

ramfs

mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER

2024-01-08 15:27:15 -08:00

reiserfs

misc cleanups (the part that hadn't been picked by individual fs trees)

2024-01-11 20:23:50 -08:00

romfs

vfs-6.7.ctime

2023-10-30 09:47:13 -10:00

smb

smb: Fix regression in writes when non-standard maximum write size negotiated

2024-02-15 22:19:23 -06:00

squashfs

Squashfs: fix variable overflow triggered by sysbot

2023-12-10 17:21:26 -08:00

sysfs

fs/sysfs/dir.c : Fix typo in comment

2023-12-07 11:35:23 +09:00

sysv

sysv: remove writepage implementation

2023-12-29 11:58:35 -08:00

tracefs

eventfs: Keep all directory links at 1

2024-02-01 11:53:53 -05:00

ubifs

ubifs: Queue up space reservation tasks if retrying many times

2024-02-25 22:09:27 +01:00

udf

misc cleanups (the part that hadn't been picked by individual fs trees)

2024-01-11 20:23:50 -08:00

ufs

Many singleton patches against the MM code. The patch series which

2024-01-09 11:18:47 -08:00

unicode

…

vboxsf

fs: vboxsf: fix a kernel-doc warning

2023-12-08 15:32:31 -07:00

verity

Networking changes for 6.8.

2024-01-11 10:07:29 -08:00

xfs

xfs: remove conditional building of rt geometry validator functions

2024-01-30 14:04:43 +05:30

zonefs

zonefs: Improve error handling

2024-02-16 10:20:35 +09:00

aio.c

sysctl-6.8-rc1

2024-01-10 17:44:36 -08:00

anon_inodes.c

Merge branch 'kvm-guestmemfd' into HEAD

2023-11-14 08:31:31 -05:00

attr.c

fs: fix doc comment typo fs tree wide

2023-12-21 13:17:54 +01:00

backing-file.c

fs: factor out backing_file_mmap() helper

2023-12-23 16:35:09 +02:00

bad_inode.c

…

binfmt_elf_fdpic.c

execve updates for v6.7-rc1

2023-10-30 19:28:19 -10:00

binfmt_elf_test.c

…

binfmt_elf.c

…

binfmt_flat.c

…

binfmt_misc.c

execve updates for v6.7-rc1

2023-10-30 19:28:19 -10:00

binfmt_script.c

…

buffer.c

Many singleton patches against the MM code. The patch series which

2024-01-09 11:18:47 -08:00

char_dev.c

As usual, lots of singleton and doubleton patches all over the tree and

2023-11-02 20:53:31 -10:00

compat_binfmt_elf.c

…

coredump.c

fs: Remove the now superfluous sentinel elements from ctl_table array

2023-12-28 04:57:57 -08:00

d_path.c

…

dax.c

fs : Fix warning using plain integer as NULL

2023-11-18 15:00:01 +01:00

dcache.c

dcache stuff for this cycle

2024-01-11 20:11:35 -08:00

direct-io.c

fs : Fix warning using plain integer as NULL

2023-11-18 15:00:01 +01:00

drop_caches.c

…

eventfd.c

eventfd: Remove usage of the deprecated ida_simple_xx() API

2023-12-12 14:24:55 +01:00

eventpoll.c

fs: Remove the now superfluous sentinel elements from ctl_table array

2023-12-28 04:57:57 -08:00

exec.c

execve fixes for v6.8-rc2

2024-01-24 13:32:29 -08:00

fcntl.c

…

fhandle.c

exportfs: add helpers to check if filesystem can encode/decode file handles

2023-10-24 17:57:45 +02:00

file_table.c

dcache stuff for this cycle

2024-01-11 20:11:35 -08:00

file.c

file: remove __receive_fd()

2023-12-12 14:24:14 +01:00

filesystems.c

…

fs_context.c

…

fs_parser.c

…

fs_pin.c

…

fs_struct.c

…

fs_types.c

…

fs-writeback.c

netfs: Move pinning-for-writeback from fscache to netfs

2023-12-24 15:08:49 +00:00

fsopen.c

…

init.c

…

inode.c

fix directory locking scheme on rename

2024-01-11 20:00:22 -08:00

internal.h

dcache stuff for this cycle

2024-01-11 20:11:35 -08:00

ioctl.c

lsm: new security_file_ioctl_compat() hook

2023-12-24 15:48:03 -05:00

Kconfig

vfs-6.8.netfs

2024-01-19 09:10:23 -08:00

Kconfig.binfmt

…

kernel_read_file.c

…

libfs.c

dcache stuff for this cycle

2024-01-11 20:11:35 -08:00

locks.c

fs: Remove the now superfluous sentinel elements from ctl_table array

2023-12-28 04:57:57 -08:00

Makefile

vfs-6.8.netfs

2024-01-19 09:10:23 -08:00

mbcache.c

…

mnt_idmapping.c

mnt_idmapping: decouple from namespaces

2023-11-28 14:08:47 +01:00

mount.h

mounts: keep list of mounts in an rbtree

2023-11-18 14:56:16 +01:00

mpage.c

fs: convert block_write_full_page to block_write_full_folio

2023-12-29 11:58:35 -08:00

namei.c

fix buggered locking in bch2_ioctl_subvolume_destroy()

2024-01-12 18:04:01 -08:00

namespace.c

fs: relax mount_setattr() permission checks

2024-02-07 21:16:29 +01:00

nsfs.c

nsfs: use d_make_root()

2023-11-25 02:49:43 -05:00

open.c

vfs-6.8.rw

2024-01-08 11:11:51 -08:00

pipe.c

sysctl-6.8-rc1

2024-01-10 17:44:36 -08:00

pnode.c

mounts: keep list of mounts in an rbtree

2023-11-18 14:56:16 +01:00

pnode.h

…

posix_acl.c

fs: fix doc comment typo fs tree wide

2023-12-21 13:17:54 +01:00

proc_namespace.c

namespace: extract show_path() helper

2023-11-18 14:56:16 +01:00

read_write.c

fsnotify: optionally pass access range in file permission hooks

2023-12-12 16:20:02 +01:00

readdir.c

fsnotify: optionally pass access range in file permission hooks

2023-12-12 16:20:02 +01:00

remap_range.c

remap_range: merge do_clone_file_range() into vfs_clone_file_range()

2024-02-06 17:07:21 +01:00

select.c

…

seq_file.c

…

signalfd.c

…

splice.c

fs: use splice_copy_file_range() inline helper

2023-12-12 16:20:02 +01:00

stack.c

…

stat.c

vfs-6.8.mount

2024-01-08 10:57:34 -08:00

statfs.c

…

super.c

fscrypt updates for 6.8

2024-01-10 10:24:49 -08:00

sync.c

…

sysctls.c

fs: Remove the now superfluous sentinel elements from ctl_table array

2023-12-28 04:57:57 -08:00

timerfd.c

…

userfaultfd.c

Generic:

2024-01-17 13:03:37 -08:00

utimes.c

…

xattr.c

…