linux/block
Josef Bacik d706751215 block: introduce blk-iolatency io controller
Current IO controllers for the block layer are less than ideal for our
use case.  The io.max controller is great at hard limiting, but it is
not work conserving.  This patch introduces io.latency.  You provide a
latency target for your group and we monitor the io in short windows to
make sure we are not exceeding those latency targets.  This makes use of
the rq-qos infrastructure and works much like the wbt stuff.  There are
a few differences from wbt

 - It's bio based, so the latency covers the whole block layer in addition to
   the actual io.
 - We will throttle all IO types that comes in here if we need to.
 - We use the mean latency over the 100ms window.  This is because writes can
   be particularly fast, which could give us a false sense of the impact of
   other workloads on our protected workload.
 - By default there's no throttling, we set the queue_depth to INT_MAX so that
   we can have as many outstanding bio's as we're allowed to.  Only at
   throttle time do we pay attention to the actual queue depth.
 - We backcharge cgroups for root cg issued IO and induce artificial
   delays in order to deal with cases like metadata only or swap heavy
   workloads.

In testing this has worked out relatively well.  Protected workloads
will throttle noisy workloads down to 1 io at time if they are doing
normal IO on their own, or induce up to a 1 second delay per syscall if
they are doing a lot of root issued IO (metadata/swap IO).

Our testing has revolved mostly around our production web servers where
we have hhvm (the web server application) in a protected group and
everything else in another group.  We see slightly higher requests per
second (RPS) on the test tier vs the control tier, and much more stable
RPS across all machines in the test tier vs the control tier.

Another test we run is a slow memory allocator in the unprotected group.
Before this would eventually push us into swap and cause the whole box
to die and not recover at all.  With these patches we see slight RPS
drops (usually 10-15%) before the memory consumer is properly killed and
things recover within seconds.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-07-09 09:07:54 -06:00
..
partitions partitions/ldm: remove redundant pointer dgrp 2018-07-09 09:07:53 -06:00
badblocks.c badblocks: fix wrong return value in badblocks_set if badblocks are disabled 2017-11-03 11:29:50 -07:00
bfq-cgroup.c block: use ktime_get_ns() instead of sched_clock() for cfq and bfq 2018-05-09 08:33:06 -06:00
bfq-iosched.c block, bfq: give a better name to bfq_bfqq_may_idle 2018-07-09 09:07:52 -06:00
bfq-iosched.h block, bfq: add/remove entity weights correctly 2018-07-09 09:07:52 -06:00
bfq-wf2q.c block, bfq: fix service being wrongly set to zero in case of preemption 2018-07-09 09:07:52 -06:00
bio-integrity.c block: Convert bio_set to mempool_init() 2018-05-14 13:16:03 -06:00
bio.c rq-qos: introduce dio_bio callback 2018-07-09 09:07:54 -06:00
blk-cgroup.c block: introduce blk-iolatency io controller 2018-07-09 09:07:54 -06:00
blk-core.c block: remove external dependency on wbt_flags 2018-07-09 09:07:54 -06:00
blk-exec.c blk-mq-sched: remove unused 'can_block' arg from blk_mq_sched_insert_request 2018-01-17 09:49:21 -07:00
blk-flush.c block: fix use-after-free in block flush handling 2018-06-09 06:37:14 -06:00
blk-integrity.c block drivers/block: Use octal not symbolic permissions 2018-05-24 13:38:59 -06:00
blk-ioc.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
blk-iolatency.c block: introduce blk-iolatency io controller 2018-07-09 09:07:54 -06:00
blk-lib.c block: break discard submissions into the user defined size 2018-05-08 15:10:44 -06:00
blk-map.c Merge branch 'for-4.16/block' of git://git.kernel.dk/linux-block 2018-01-29 11:51:49 -08:00
blk-merge.c block: don't use blocking queue entered for recursive bio submits 2018-06-02 20:35:00 -06:00
blk-mq-cpumap.c blk-mq: don't keep offline CPUs mapped to hctx 0 2018-04-10 08:38:46 -06:00
blk-mq-debugfs-zoned.c block: Make struct request_queue smaller for CONFIG_BLK_DEV_ZONED=n 2018-07-09 09:07:52 -06:00
blk-mq-debugfs.c blk-mq: dequeue request one by one from sw queue if hctx is busy 2018-07-09 09:07:53 -06:00
blk-mq-debugfs.h block: Make struct request_queue smaller for CONFIG_BLK_DEV_ZONED=n 2018-07-09 09:07:52 -06:00
blk-mq-pci.c blk-mq: code clean-up by adding an API to clear set->mq_map 2018-07-09 09:07:53 -06:00
blk-mq-rdma.c block: Add rdma affinity based queue mapping helper 2017-08-08 14:58:03 -04:00
blk-mq-sched.c blk-mq: dequeue request one by one from sw queue if hctx is busy 2018-07-09 09:07:53 -06:00
blk-mq-sched.h block: move sysfs_lock into elevator_init 2018-06-01 07:38:19 -06:00
blk-mq-sysfs.c block drivers/block: Use octal not symbolic permissions 2018-05-24 13:38:59 -06:00
blk-mq-tag.c blk-mq: remove blk_mq_tagset_iter 2018-06-14 17:01:45 +02:00
blk-mq-tag.h Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-block 2017-11-14 15:32:19 -08:00
blk-mq-virtio.c blk-mq: provide a default queue mapping for virtio device 2017-02-27 20:54:05 +02:00
blk-mq.c block: remove external dependency on wbt_flags 2018-07-09 09:07:54 -06:00
blk-mq.h blk-mq: code clean-up by adding an API to clear set->mq_map 2018-07-09 09:07:53 -06:00
blk-rq-qos.c rq-qos: introduce dio_bio callback 2018-07-09 09:07:54 -06:00
blk-rq-qos.h rq-qos: introduce dio_bio callback 2018-07-09 09:07:54 -06:00
blk-settings.c blk-rq-qos: refactor out common elements of blk-wbt 2018-07-09 09:07:54 -06:00
blk-softirq.c block: fix timeout changes for legacy request drivers 2018-06-19 11:27:18 -06:00
blk-stat.c blk-stat: export helpers for modifying blk_rq_stat 2018-07-09 09:07:54 -06:00
blk-stat.h blk-stat: export helpers for modifying blk_rq_stat 2018-07-09 09:07:54 -06:00
blk-sysfs.c blk-rq-qos: refactor out common elements of blk-wbt 2018-07-09 09:07:54 -06:00
blk-tag.c for-linus-20180616 2018-06-17 05:37:55 +09:00
blk-throttle.c block: add bi_blkg to the bio for cgroups 2018-07-09 09:07:53 -06:00
blk-timeout.c blk-mq: Fix timeout handling in case the timeout handler returns BLK_EH_DONE 2018-06-23 10:25:45 -06:00
blk-wbt.c block: remove external dependency on wbt_flags 2018-07-09 09:07:54 -06:00
blk-wbt.h block: remove external dependency on wbt_flags 2018-07-09 09:07:54 -06:00
blk-zoned.c block: Remove a superfluous cast from blkdev_report_zones() 2018-07-09 09:07:52 -06:00
blk.h block: introduce blk-iolatency io controller 2018-07-09 09:07:54 -06:00
bounce.c block: fixup bioset_integrity_create() call 2018-05-30 18:51:21 -06:00
bsg-lib.c block: remove parent device reference from struct bsg_class_device 2018-05-29 13:00:25 -06:00
bsg.c bsg: fix race of bsg_open and bsg_unregister 2018-06-15 08:15:37 -06:00
cfq-iosched.c block drivers/block: Use octal not symbolic permissions 2018-05-24 13:38:59 -06:00
cmdline-parser.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
compat_ioctl.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
deadline-iosched.c block drivers/block: Use octal not symbolic permissions 2018-05-24 13:38:59 -06:00
elevator.c block: split the blk-mq case from elevator_init 2018-06-01 07:38:21 -06:00
genhd.c Merge branch 'hch.procfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-06-04 10:00:01 -07:00
ioctl.c block: pass inclusive 'lend' parameter to truncate_inode_pages_range 2018-02-23 15:20:19 -07:00
ioprio.c block: add ioprio_check_cap function 2018-05-31 10:50:54 -04:00
Kconfig block: introduce blk-iolatency io controller 2018-07-09 09:07:54 -06:00
Kconfig.iosched License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
kyber-iosched.c block: kyber: make kyber more friendly with merging 2018-05-30 10:47:40 -06:00
Makefile block: introduce blk-iolatency io controller 2018-07-09 09:07:54 -06:00
mq-deadline.c block drivers/block: Use octal not symbolic permissions 2018-05-24 13:38:59 -06:00
noop-iosched.c block: move existing elevator ops to union 2017-01-17 10:03:33 -07:00
opal_proto.h block: sed-opal: Set MBRDone on S3 resume path if TPER is MBREnabled 2017-09-11 09:45:52 -06:00
partition-generic.c block: don't print a message when the device went away 2018-05-29 08:59:21 -06:00
scsi_ioctl.c block: consistently use GFP_NOIO instead of __GFP_NORECLAIM 2018-05-14 08:55:18 -06:00
sed-opal.c block: sed-opal: Fix a couple off by one bugs 2018-06-20 12:04:06 -06:00
t10-pi.c t10-pi: Move opencoded contants to common header 2017-07-03 16:56:25 -06:00