linux/block
Xianting Tian 192f1c6bc2 blkcg: add plugging support for punt bio
The test and the explaination of the patch as bellow.

Before test we added more debug code in blkg_async_bio_workfn():
	int count = 0
	if (bios.head && bios.head->bi_next) {
		need_plug = true;
		blk_start_plug(&plug);
	}
	while ((bio = bio_list_pop(&bios))) {
		/*io_punt is a sysctl user interface to control the print*/
		if(io_punt) {
			printk("[%s:%d] bio start,size:%llu,%d count=%d plug?%d\n",
				current->comm, current->pid, bio->bi_iter.bi_sector,
				(bio->bi_iter.bi_size)>>9, count++, need_plug);
		}
		submit_bio(bio);
	}
	if (need_plug)
		blk_finish_plug(&plug);

Steps that need to be set to trigger *PUNT* io before testing:
	mount -t btrfs -o compress=lzo /dev/sda6 /btrfs
	mount -t cgroup2 nodev /cgroup2
	mkdir /cgroup2/cg3
	echo "+io" > /cgroup2/cgroup.subtree_control
	echo "8:0 wbps=1048576000" > /cgroup2/cg3/io.max #1000M/s
	echo $$ > /cgroup2/cg3/cgroup.procs

Then use dd command to test btrfs PUNT io in current shell:
	dd if=/dev/zero of=/btrfs/file bs=64K count=100000

Test hardware environment as below:
	[root@localhost btrfs]# lscpu
	Architecture:          x86_64
	CPU op-mode(s):        32-bit, 64-bit
	Byte Order:            Little Endian
	CPU(s):                32
	On-line CPU(s) list:   0-31
	Thread(s) per core:    2
	Core(s) per socket:    8
	Socket(s):             2
	NUMA node(s):          2
	Vendor ID:             GenuineIntel

With above debug code, test command and test environment, I did the
tests under 3 different system loads, which are triggered by stress:
1, Run 64 threads by command "stress -c 64 &"
	[53615.975974] [kworker/u66:18:1490] bio start,size:45583056,8 count=0 plug?1
	[53615.975980] [kworker/u66:18:1490] bio start,size:45583064,8 count=1 plug?1
	[53615.975984] [kworker/u66:18:1490] bio start,size:45583072,8 count=2 plug?1
	[53615.975987] [kworker/u66:18:1490] bio start,size:45583080,8 count=3 plug?1
	[53615.975990] [kworker/u66:18:1490] bio start,size:45583088,8 count=4 plug?1
	[53615.975993] [kworker/u66:18:1490] bio start,size:45583096,8 count=5 plug?1
	... ...
	[53615.977041] [kworker/u66:18:1490] bio start,size:45585480,8 count=303 plug?1
	[53615.977044] [kworker/u66:18:1490] bio start,size:45585488,8 count=304 plug?1
	[53615.977047] [kworker/u66:18:1490] bio start,size:45585496,8 count=305 plug?1
	[53615.977050] [kworker/u66:18:1490] bio start,size:45585504,8 count=306 plug?1
	[53615.977053] [kworker/u66:18:1490] bio start,size:45585512,8 count=307 plug?1
	[53615.977056] [kworker/u66:18:1490] bio start,size:45585520,8 count=308 plug?1
	[53615.977058] [kworker/u66:18:1490] bio start,size:45585528,8 count=309 plug?1

2, Run 32 threads by command "stress -c 32 &"
	[50586.290521] [kworker/u66:6:32351] bio start,size:45806496,8 count=0 plug?1
	[50586.290526] [kworker/u66:6:32351] bio start,size:45806504,8 count=1 plug?1
	[50586.290529] [kworker/u66:6:32351] bio start,size:45806512,8 count=2 plug?1
	[50586.290531] [kworker/u66:6:32351] bio start,size:45806520,8 count=3 plug?1
	[50586.290533] [kworker/u66:6:32351] bio start,size:45806528,8 count=4 plug?1
	[50586.290535] [kworker/u66:6:32351] bio start,size:45806536,8 count=5 plug?1
	... ...
	[50586.299640] [kworker/u66:5:32350] bio start,size:45808576,8 count=252 plug?1
	[50586.299643] [kworker/u66:5:32350] bio start,size:45808584,8 count=253 plug?1
	[50586.299646] [kworker/u66:5:32350] bio start,size:45808592,8 count=254 plug?1
	[50586.299649] [kworker/u66:5:32350] bio start,size:45808600,8 count=255 plug?1
	[50586.299652] [kworker/u66:5:32350] bio start,size:45808608,8 count=256 plug?1
	[50586.299663] [kworker/u66:5:32350] bio start,size:45808616,8 count=257 plug?1
	[50586.299665] [kworker/u66:5:32350] bio start,size:45808624,8 count=258 plug?1
	[50586.299668] [kworker/u66:5:32350] bio start,size:45808632,8 count=259 plug?1

3, Don't run thread by stress
	[50861.355246] [kworker/u66:19:32376] bio start,size:13544504,8 count=0 plug?0
	[50861.355288] [kworker/u66:19:32376] bio start,size:13544512,8 count=0 plug?0
	[50861.355322] [kworker/u66:19:32376] bio start,size:13544520,8 count=0 plug?0
	[50861.355353] [kworker/u66:19:32376] bio start,size:13544528,8 count=0 plug?0
	[50861.355392] [kworker/u66:19:32376] bio start,size:13544536,8 count=0 plug?0
	[50861.355431] [kworker/u66:19:32376] bio start,size:13544544,8 count=0 plug?0
	[50861.355468] [kworker/u66:19:32376] bio start,size:13544552,8 count=0 plug?0
	[50861.355499] [kworker/u66:19:32376] bio start,size:13544560,8 count=0 plug?0
	[50861.355532] [kworker/u66:19:32376] bio start,size:13544568,8 count=0 plug?0
	[50861.355575] [kworker/u66:19:32376] bio start,size:13544576,8 count=0 plug?0
	[50861.355618] [kworker/u66:19:32376] bio start,size:13544584,8 count=0 plug?0
	[50861.355659] [kworker/u66:19:32376] bio start,size:13544592,8 count=0 plug?0
	[50861.355740] [kworker/u66:0:32346] bio start,size:13544600,8 count=0 plug?1
	[50861.355748] [kworker/u66:0:32346] bio start,size:13544608,8 count=1 plug?1
	[50861.355962] [kworker/u66:2:32347] bio start,size:13544616,8 count=0 plug?0
	[50861.356272] [kworker/u66:7:31962] bio start,size:13544624,8 count=0 plug?0
	[50861.356446] [kworker/u66:7:31962] bio start,size:13544632,8 count=0 plug?0
	[50861.356567] [kworker/u66:7:31962] bio start,size:13544640,8 count=0 plug?0
	[50861.356707] [kworker/u66:19:32376] bio start,size:13544648,8 count=0 plug?0
	[50861.356748] [kworker/u66:15:32355] bio start,size:13544656,8 count=0 plug?0
	[50861.356825] [kworker/u66:17:31970] bio start,size:13544664,8 count=0 plug?0

Analysis of above 3 test results with different system load:
>From above test, we can see more and more continuous bios can be plugged
with system load increasing. When run "stress -c 64 &", 310 continuous
bios are plugged; When run "stress -c 32 &", 260 continuous bios are
plugged; When don't run stress, at most only 2 continuous bios are
plugged, in most cases, bio_list only contains one single bio.

How to explain above phenomenon:
We know, in submit_bio(), if the bio is a REQ_CGROUP_PUNT io, it will
queue a work to workqueue blkcg_punt_bio_wq. But when the workqueue is
scheduled, it depends on the system load.  When system load is low, the
workqueue will be quickly scheduled, and the bio in bio_list will be
quickly processed in blkg_async_bio_workfn(), so there is less chance
that the same io submit thread can add multiple continuous bios to
bio_list before workqueue is scheduled to run. The analysis aligned with
above test "3".
When system load is high, there is some delay before the workqueue can
be scheduled to run, the higher the system load the greater the delay.
So there is more chance that the same io submit thread can add multiple
continuous bios to bio_list. Then when the workqueue is scheduled to run,
there are more continuous bios in bio_list, which will be processed in
blkg_async_bio_workfn(). The analysis aligned with above test "1" and "2".

According to test, we can get io performance improved with the patch,
especially when system load is higher. Another optimazition is to use
the plug only when bio_list contains at least 2 bios.

Signed-off-by: Xianting Tian <tian.xianting@h3c.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-10 09:56:34 -06:00
..
partitions block: remove the disk argument to delete_partition 2020-09-01 19:38:25 -06:00
badblocks.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
bfq-cgroup.c bfq: fix blkio cgroup leakage v4 2020-08-18 07:48:08 -07:00
bfq-iosched.c blk-mq, elevator: Count requests per hctx to improve performance 2020-09-03 15:20:47 -06:00
bfq-iosched.h bfq: fix blkio cgroup leakage v4 2020-08-18 07:48:08 -07:00
bfq-wf2q.c bfq: fix blkio cgroup leakage v4 2020-08-18 07:48:08 -07:00
bio-integrity.c block: make function __bio_integrity_free() static 2020-07-02 12:38:18 -06:00
bio.c block: Fix page_is_mergeable() for compound pages 2020-08-17 19:35:53 -07:00
blk-cgroup-rwstat.c blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT 2019-11-07 12:28:13 -07:00
blk-cgroup-rwstat.h blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT 2019-11-07 12:28:13 -07:00
blk-cgroup.c blkcg: add plugging support for punt bio 2020-09-10 09:56:34 -06:00
blk-core.c blk-mq: Record nr_active_requests per queue for when using shared sbitmap 2020-09-03 15:20:47 -06:00
blk-crypto-fallback.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
blk-crypto-internal.h block: blk-crypto-fallback for Inline Encryption 2020-05-14 09:48:03 -06:00
blk-crypto.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
blk-exec.c block: add a blk_account_io_merge_bio helper 2020-05-27 05:21:23 -06:00
blk-flush.c block: fix double account of flush request's driver tag 2020-08-11 13:53:32 -06:00
blk-integrity.c block: Make blk-integrity preclude hardware inline encryption 2020-05-14 09:48:03 -06:00
blk-ioc.c block: remove retry loop in ioc_release_fn() 2020-07-16 10:22:15 -06:00
blk-iocost.c blk-iocost: add three debug stat - cost.wait, indebt and indelay 2020-09-01 19:38:33 -06:00
blk-iolatency.c blk-iolatency: only call ktime_get() if needed 2020-07-01 08:02:38 -06:00
blk-lib.c block: check queue's limits.discard_granularity in __blkdev_issue_discard() 2020-08-05 17:15:47 -06:00
blk-map.c block: remove the BIO_USER_MAPPED flag 2020-09-01 16:49:26 -06:00
blk-merge.c block: Remove a duplicative condition 2020-09-01 19:48:06 -06:00
blk-mq-cpumap.c blk-mq: balance mapping between present CPUs and queues 2019-08-04 21:43:12 -06:00
blk-mq-debugfs-zoned.c
blk-mq-debugfs.c blk-mq: Use pointers for blk_mq_tags bitmap tags 2020-09-03 15:20:47 -06:00
blk-mq-debugfs.h blk-mq: no need to check return value of debugfs_create functions 2019-06-13 03:00:30 -06:00
blk-mq-pci.c block: Fix blk_mq_*_map_queues() kernel-doc headers 2019-05-31 15:12:34 -06:00
blk-mq-rdma.c block: Fix blk_mq_*_map_queues() kernel-doc headers 2019-05-31 15:12:34 -06:00
blk-mq-sched.c block: Remove unused blk_mq_sched_free_hctx_data() 2020-09-07 20:11:15 -06:00
blk-mq-sched.h block: Remove unused blk_mq_sched_free_hctx_data() 2020-09-07 20:11:15 -06:00
blk-mq-sysfs.c blk-mq: make sure that line break can be printed 2019-11-04 07:14:10 -07:00
blk-mq-tag.c blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap 2020-09-03 15:20:47 -06:00
blk-mq-tag.h blk-mq: Relocate hctx_may_queue() 2020-09-03 15:20:47 -06:00
blk-mq-virtio.c blk-mq: Fix typo in comment 2020-03-17 20:55:21 +01:00
blk-mq.c blk-mq, elevator: Count requests per hctx to improve performance 2020-09-03 15:20:47 -06:00
blk-mq.h blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap 2020-09-03 15:20:47 -06:00
blk-pm.c scsi: block: pm: Simplify resume handling 2020-07-24 22:09:55 -04:00
blk-pm.h
blk-rq-qos.c Revert "blk-rq-qos: remove redundant finish_wait to rq_qos_wait." 2020-07-15 09:33:37 -06:00
blk-rq-qos.h blk-rq-qos: fix first node deletion of rq_qos_del() 2019-10-15 10:13:13 -06:00
blk-settings.c block: remove blk_queue_stack_limits 2020-07-20 15:38:52 -06:00
blk-stat.c blk-stat: make q->stats->lock irqsafe 2020-09-01 16:48:46 -06:00
blk-stat.h
blk-sysfs.c block: make QUEUE_SYSFS_BIT_FNS more useful 2020-09-08 09:01:10 -06:00
blk-throttle.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
blk-timeout.c block: blk-timeout: delete duplicated word 2020-07-31 16:29:47 -06:00
blk-wbt.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
blk-wbt.h blk-wbt: remove wbt_update_limits 2020-05-29 16:30:39 -06:00
blk-zoned.c block: don't do revalidate zones on invalid devices 2020-08-03 09:24:04 -06:00
blk.h block: remove the disk argument to delete_partition 2020-09-01 19:38:25 -06:00
bounce.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
bsg-lib.c bsg-lib: convert comma to semicolon 2020-08-16 20:07:12 -07:00
bsg.c compat_ioctl: bsg: add handler 2020-01-03 09:33:21 +01:00
cmdline-parser.c
elevator.c block: elevator: delete duplicated word and fix typos 2020-07-31 16:29:47 -06:00
genhd.c block: add a bdev_check_media_change helper 2020-09-10 09:32:30 -06:00
ioctl.c block: Do not discard buffers under a mounted filesystem 2020-09-07 20:10:55 -06:00
ioprio.c block: grant IOPRIO_CLASS_RT to CAP_SYS_NICE 2020-09-01 19:38:33 -06:00
Kconfig blk-wbt: Remove obsolete multiqueue I/O scheduling comment 2020-09-01 16:49:26 -06:00
Kconfig.iosched treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
keyslot-manager.c block/keyslot-manager: use kvfree_sensitive() 2020-06-29 13:24:05 -06:00
kyber-iosched.c blk-mq: Use pointers for blk_mq_tags bitmap tags 2020-09-03 15:20:47 -06:00
Makefile blk-mq: merge blk-softirq.c into blk-mq.c 2020-06-24 09:15:56 -06:00
mq-deadline.c blk-mq, elevator: Count requests per hctx to improve performance 2020-09-03 15:20:47 -06:00
opal_proto.h block: sed-opal: Change the check condition for regular session validity 2020-03-12 08:00:10 -06:00
scsi_ioctl.c scsi: core: Allow non-root users to perform ZBC commands 2020-03-16 18:26:31 -04:00
sed-opal.c block: sed-opal: Change the check condition for regular session validity 2020-03-12 08:00:10 -06:00
t10-pi.c block: Allow t10-pi to be modular 2020-01-06 20:59:04 -07:00