Kemeng Shi fabd2e42bc mm: correct calculation of wb's bg_thresh in cgroup domain
wb_calc_thresh() is calculating wb's share of bg_thresh in the global
domain.  However in case of cgroup writeback this is not the right
thing to do.  Consider the following domain hierarchy:

                global domain (> 20G)
                /                 \
          cgroup1 (10G)     cgroup2 (10G)
                |                 |
bdi            wb1               wb2

and assume wb1 and wb2 have the same bandwidth and the background
threshold is set at 10%.  The bg_thresh of cgroup1 and cgroup2 is going
to be 1G.  Now because wb_calc_thresh(mdtc->wb, mdtc->bg_thresh)
calculates per-wb threshold in the global domain as (wb bandwidth) /
(domain bandwidth) it returns bg_thresh for wb1 as 0.5G although it has
nobody to compete against in cgroup1.

Fix the problem by calculating wb's share of bg_thresh in the cgroup
domain.

Test as following:
/* make it easier to observe the issue */
echo 300000 > /proc/sys/vm/dirty_expire_centisecs
echo 100 > /proc/sys/vm/dirty_writeback_centisecs

/* run fio in wb1 */
cd /sys/fs/cgroup
echo "+memory +io" > cgroup.subtree_control
mkdir group1
cd group1
echo 10G > memory.high
echo 10G > memory.max
echo $$ > cgroup.procs
mkfs.ext4 -F /dev/vdb
mount /dev/vdb /bdi1/
fio -name test -filename=/bdi1/file -size=600M -ioengine=libaio -bs=4K \
-iodepth=1 -rw=write -direct=0 --time_based -runtime=600 -invalidate=0

/* run fio in wb2 with a new shell */
cd /sys/fs/cgroup
mkdir group2
cd group2
echo 10G > memory.high
echo 10G > memory.max
echo $$ > cgroup.procs
mkfs.ext4 -F /dev/vdc
mount /dev/vdc /bdi2/
fio -name test -filename=/bdi2/file -size=600M -ioengine=libaio -bs=4K \
-iodepth=1 -rw=write -direct=0 --time_based -runtime=600 -invalidate=0

Before fix, the wrttien pages of wb1 and wb2 reported from
toos/writeback/wb_monitor.py keep growing. After fix, rare written pages
are accumulated.
There is no obvious change in fio result.

[jack@suse.cz: changelog rewording]
Link: https://lkml.kernel.org/r/20240425131724.36778-3-shikemeng@huaweicloud.com
Fixes: 74d369443325 ("writeback: Fix performance regression in wb_over_bg_thresh()")
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Howard Cochran <hcochran@kernelspring.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05 17:53:52 -07:00
2024-04-12 10:22:33 -07:00
2024-03-18 14:59:13 -07:00
2024-03-18 15:11:44 -07:00
2024-04-25 20:56:29 -07:00
2024-04-25 20:55:49 -07:00
2024-01-18 17:57:07 -08:00
2022-09-28 09:02:20 +02:00
2022-10-10 12:00:45 -07:00
2024-04-14 13:38:39 -07:00
2024-03-18 03:36:32 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.
Description
No description provided
Readme 5.7 GiB
Languages
C 97.6%
Assembly 1%
Shell 0.5%
Python 0.3%
Makefile 0.3%