linux

Jens Axboe 33391eecd6 block: treat poll queue enter similarly to timeouts

We ran into an issue where a production workload would randomly grind to
a halt and not continue until the pending IO had timed out. This turned
out to be a complicated interaction between queue freezing and polled
IO:

1) You have an application that does polled IO. At any point in time,
   there may be polled IO pending.

2) You have a monitoring application that issues a passthrough command,
   which is marked with side effects such that it needs to freeze the
   queue.

3) The passthrough command is started, which calls blk_freeze_queue_start()
   on the device. At this point the queue is marked frozen, and any
   attempt to enter the queue will fail (for non-blocking callers) or block.

4) Now the driver calls blk_mq_freeze_queue_wait(), which will return
   when the queue is quiesced and pending IO has completed.

5) The pending IO is polled IO, but any attempt to poll IO through the
   normal iocb_bio_iopoll() -> bio_poll() will fail when it gets to
   bio_queue_enter() as the queue is frozen. Rather than poll and
   complete IO, the polling threads will sit in a tight loop attempting
   to poll, but failing to enter the queue to do so.

The end result is that progress for both applications stalls until all
pending polled IO has timed out. This causes huge latency for the
application doing polled IO, and long delays for the passthrough
command as well.

Fix this by treating queue enter for polled IO just like we do for
timeouts. This allows quick quiesce of the queue as we still poll and
complete this IO, while still disallowing queueing up new IO.
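
The stall and the fix can be illustrated with a small userspace model.
This is only a sketch under stated assumptions, not the kernel code: the
toy_queue struct and every toy_*() helper below are hypothetical names
invented for illustration. The model assumes that each in-flight IO holds
one reference on the queue, that starting a freeze drops a base reference
and refuses normal entry, and that the freeze wait can finish only when no
references remain. A "timeout-style" entry is modeled as one that succeeds
whenever any reference is still held, frozen or not.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a request queue (hypothetical, not the kernel struct). */
struct toy_queue {
	bool frozen;	/* set once a freeze has started */
	int usage;	/* 1 base ref + 1 per in-flight IO or active poller */
	int pending;	/* polled IOs submitted but not yet completed */
};

static void toy_init(struct toy_queue *q)
{
	q->frozen = false;
	q->usage = 1;
	q->pending = 0;
}

/* Normal queue entry: refused as soon as the queue is frozen. */
static bool toy_enter(struct toy_queue *q)
{
	if (q->frozen)
		return false;
	q->usage++;
	return true;
}

/* Timeout-style entry: allowed while anything still holds a reference. */
static bool toy_tryget(struct toy_queue *q)
{
	if (q->usage == 0)
		return false;
	q->usage++;
	return true;
}

static void toy_put(struct toy_queue *q)
{
	assert(q->usage > 0);
	q->usage--;
}

/* Submit one polled IO; the IO keeps its reference until it completes. */
static bool toy_submit(struct toy_queue *q)
{
	if (!toy_enter(q))
		return false;
	q->pending++;
	return true;
}

/* Start a freeze: refuse new entries and drop the base reference. */
static void toy_freeze_start(struct toy_queue *q)
{
	q->frozen = true;
	toy_put(q);
}

/* The freeze wait can finish only when no references remain. */
static bool toy_freeze_done(const struct toy_queue *q)
{
	return q->frozen && q->usage == 0;
}

/* Poll once; returns the number of IOs completed (0 or 1). */
static int toy_poll(struct toy_queue *q, bool timeout_style)
{
	int done = 0;

	if (!(timeout_style ? toy_tryget(q) : toy_enter(q)))
		return 0;
	if (q->pending > 0) {
		q->pending--;
		toy_put(q);	/* reference held by the completed IO */
		done = 1;
	}
	toy_put(q);		/* reference taken by this poll */
	return done;
}
```

In this model, submitting two IOs and then calling toy_freeze_start()
leaves toy_poll(q, false) unable to complete anything, so
toy_freeze_done() stays false: that is the stall from step 5. Polling
with toy_poll(q, true) instead completes both IOs and lets the freeze
finish, while toy_submit() still refuses new IO.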

Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

2023-01-29 15:18:34 -07:00

partitions

block: don't add partitions if GD_SUPPRESS_PART_SCAN is set

2022-09-03 11:29:03 -06:00

badblocks.c

block/badblocks: Remove redundant assignments

2022-04-23 07:15:26 -06:00

bdev.c

block: bdev & blktrace: use consistent function doc. notation

2022-12-01 09:16:46 -07:00

bfq-cgroup.c

block, bfq: inject I/O to underutilized actuators

2023-01-29 15:18:33 -07:00

bfq-iosched.c

block, bfq: balance I/O injection among underutilized actuators

2023-01-29 15:18:33 -07:00

bfq-iosched.h

block, bfq: inject I/O to underutilized actuators

2023-01-29 15:18:33 -07:00

bfq-wf2q.c

block, bfq: inject I/O to underutilized actuators

2023-01-29 15:18:33 -07:00

bio-integrity.c

block: pass struct queue_limits to the bio splitting helpers

2022-08-02 21:08:53 -06:00

bio.c

block: add a BUILD_BUG_ON() for adding more bio flags than we have space

2023-01-29 15:18:33 -07:00

blk-cgroup-fc-appid.c

cgroup: Homogenize cgroup_get_from_id() return value

2022-08-26 10:57:41 -10:00

blk-cgroup-rwstat.c

blk-cgroup: Fix the recursive blkg rwstat

2021-03-05 11:32:15 -07:00

blk-cgroup-rwstat.h

block: Use the new blk_opf_t type

2022-07-14 12:14:30 -06:00

blk-cgroup.c

blk-cgroup: fix missing pd_online_fn() while activating policy

2023-01-16 19:04:07 -07:00

blk-cgroup.h

blk-cgroup: Optimize blkcg_rstat_flush()

2022-11-16 16:58:44 -07:00

blk-core.c

block: treat poll queue enter similarly to timeouts

2023-01-29 15:18:34 -07:00

blk-crypto-fallback.c

treewide: use get_random_bytes() when possible

2022-10-11 17:42:58 -06:00

blk-crypto-internal.h

blk-crypto: pass a gendisk to blk_crypto_sysfs_{,un}register

2022-11-30 11:09:00 -07:00

blk-crypto-profile.c

blk-crypto: Add a missing include directive

2022-11-23 10:38:54 -07:00

blk-crypto-sysfs.c

block: untangle request_queue refcounting from sysfs

2022-11-30 11:09:00 -07:00

blk-crypto.c

for-6.2/block-2022-12-08

2022-12-13 10:43:59 -08:00

blk-flush.c

block: change request end_io handler to pass back a return value

2022-09-30 07:49:09 -06:00

blk-ia-ranges.c

block: untangle request_queue refcounting from sysfs

2022-11-30 11:09:00 -07:00

blk-integrity.c

blk-crypto: remove blk_crypto_unregister()

2021-11-29 06:38:51 -07:00

blk-ioc.c

block: fix default IO priority handling again

2022-06-27 06:29:12 -06:00

blk-iocost.c

blk-iocost: change div64_u64 to DIV64_U64_ROUND_UP in ioc_refresh_params()

2023-01-29 15:18:34 -07:00

blk-iolatency.c

treewide: Convert del_timer*() to timer_shutdown*()

2022-12-25 13:38:09 -08:00

blk-ioprio.c

blk-ioprio: pass a gendisk to blk_ioprio_init and blk_ioprio_exit

2022-09-26 19:09:31 -06:00

blk-ioprio.h

blk-ioprio: pass a gendisk to blk_ioprio_init and blk_ioprio_exit

2022-09-26 19:09:31 -06:00

blk-lib.c

blk-lib: fix blkdev_issue_secure_erase

2022-09-15 00:25:17 -06:00

blk-map.c

block: extend bio-cache for non-polled requests

2023-01-29 15:18:34 -07:00

blk-merge.c

block: don't allow splitting of a REQ_NOWAIT bio

2023-01-04 13:24:44 -07:00

blk-mq-cpumap.c

block: Change the return type of blk_mq_map_queues() into void

2022-08-22 10:07:53 -06:00

blk-mq-debugfs-zoned.c

block: move zone related fields to struct gendisk

2022-07-06 06:46:26 -06:00

blk-mq-debugfs.c

for-6.1/block-2022-10-03

2022-10-07 09:19:14 -07:00

blk-mq-debugfs.h

block: remove per-disk debugfs files in blk_unregister_queue

2022-06-17 07:31:05 -06:00

blk-mq-pci.c

block: Change the return type of blk_mq_map_queues() into void

2022-08-22 10:07:53 -06:00

blk-mq-rdma.c

block: Change the return type of blk_mq_map_queues() into void

2022-08-22 10:07:53 -06:00

blk-mq-sched.c

block: split elevator_switch

2022-11-01 09:12:24 -06:00

blk-mq-sched.h

block: move blk_mq_sched_assign_ioc to blk-ioc.c

2021-11-29 06:41:29 -07:00

blk-mq-sysfs.c

blk-mq: fix possible memleak when register 'hctx' failed

2022-11-25 06:34:03 -07:00

blk-mq-tag.c

sbitmap: fix batched wait_cnt accounting

2022-09-12 00:10:34 -06:00

blk-mq-tag.h

blk-mq: blk_mq_tag_busy is no need to return a value

2022-06-27 06:29:12 -06:00

blk-mq-virtio.c

block: Change the return type of blk_mq_map_queues() into void

2022-08-22 10:07:53 -06:00

blk-mq.c

block: fix hctx checks for batch allocation

2023-01-17 09:56:52 -07:00

blk-mq.h

blk-mq: move the srcu_struct used for quiescing to the tagset

2022-11-02 08:35:34 -06:00

blk-pm.c

scsi: block: pm: Always set request queue runtime active in blk_post_runtime_resume()

2021-12-22 23:38:29 -05:00

blk-pm.h

block: Remove unused blk_pm_*() function definitions

2021-02-22 06:33:48 -07:00

blk-rq-qos.c

block/rq_qos: Use atomic_try_cmpxchg in atomic_inc_below

2022-07-12 14:38:52 -06:00

blk-rq-qos.h

block/blk-rq-qos: delete useless enmu RQ_QOS_IOPRIO

2022-09-21 19:50:53 -06:00

blk-settings.c

block: save user max_sectors limit

2023-01-29 15:18:33 -07:00

blk-stat.c

block: make queue stat accounting a reference

2021-12-14 17:23:05 -07:00

blk-stat.h

block: make queue stat accounting a reference

2021-12-14 17:23:05 -07:00

blk-sysfs.c

block: save user max_sectors limit

2023-01-29 15:18:33 -07:00

blk-throttle.c

blk-throttle: Use more suitable time_after check for update of slice_start

2022-12-05 13:45:31 -07:00

blk-throttle.h

blk-throttle: pass a gendisk to blk_throtl_cancel_bios

2022-09-26 19:17:28 -06:00

blk-timeout.c

block: blk-timeout: delete duplicated word

2020-07-31 16:29:47 -06:00

blk-wbt.c

blk-wbt: don't enable throttling if default elevator is bfq

2022-10-23 18:59:17 -06:00

blk-wbt.h

blk-wbt: don't show valid wbt_lat_usec in sysfs while wbt is disabled

2022-10-23 18:59:17 -06:00

blk-zoned.c

block: add a new helper bdev_{is_zone_start, offset_from_zone_start}

2023-01-29 15:18:34 -07:00

blk.h

for-6.2/block-2022-12-08

2022-12-13 10:43:59 -08:00

bounce.c

block: change the blk_queue_bounce calling convention

2022-08-02 17:22:54 -06:00

bsg-lib.c

blk-mq: move the call to blk_put_queue out of blk_mq_destroy_queue

2022-10-25 08:25:10 -06:00

bsg.c

Driver Core changes for 6.2-rc1

2022-12-16 03:54:54 -08:00

disk-events.c

block: remove genhd.h

2022-02-02 07:49:59 -07:00

elevator.c

block: untangle request_queue refcounting from sysfs

2022-11-30 11:09:00 -07:00

elevator.h

block: add proper helpers for elevator_type module refcount management

2022-10-23 18:59:17 -06:00

fops.c

block: don't allow multiple bios for IOCB_NOWAIT issue

2023-01-29 15:18:34 -07:00

genhd.c

block-2023-01-06

2023-01-06 13:12:42 -08:00

holder.c

block: don't allow a disk link holder to itself

2022-11-16 15:19:56 -07:00

ioctl.c

block: Do not reread partition table on exclusively open device

2022-12-01 07:44:03 -07:00

ioprio.c

block: Fix handling of tasks without ioprio in ioprio_get(2)

2022-06-27 06:29:12 -06:00

Kconfig

block: Remove "select SRCU"

2023-01-05 08:50:10 -07:00

Kconfig.iosched

block: only build the icq tracking code when needed

2021-12-16 10:59:02 -07:00

kyber-iosched.c

treewide: Convert del_timer*() to timer_shutdown*()

2022-12-25 13:38:09 -08:00

Makefile

blk-cgroup: move blkcg_{get,set}_fc_appid out of line

2022-05-02 14:06:20 -06:00

mq-deadline.c

block: mq-deadline: Rename deadline_is_seq_writes()

2022-11-28 19:27:45 -07:00

opal_proto.h

block: sed-opal: Add ioctl to return device status

2022-08-22 07:52:51 -06:00

sed-opal.c

for-6.2/block-2022-12-08

2022-12-13 10:43:59 -08:00

t10-pi.c

block: add pi for extended integrity

2022-03-07 12:48:35 -07:00