linux/block
Ming Lei fd689871bb block: alloc map and request for new hardware queue
Alloc new map and request for new hardware queue when increse
hardware queue count. Before this patch, it will show a
warning for each new hardware queue, but it's not enough, these
hctx have no maps and reqeust, when a bio was mapped to these
hardware queue, it will trigger kernel panic when get request
from these hctx.

Test environment:
 * A NVMe disk supports 128 io queues
 * 96 cpus in system

A corner case can always trigger this panic, there are 96
io queues allocated for HCTX_TYPE_DEFAULT type, the corresponding kernel
log: nvme nvme0: 96/0/0 default/read/poll queues. Now we set nvme write
queues to 96, then nvme will alloc others(32) queues for read, but
blk_mq_update_nr_hw_queues does not alloc map and request for these new
added io queues. So when process read nvme disk, it will trigger kernel
panic when get request from these hardware context.

Reproduce script:

nr=$(expr `cat /sys/block/nvme0n1/device/queue_count` - 1)
echo $nr > /sys/module/nvme/parameters/write_queues
echo 1 > /sys/block/nvme0n1/device/reset_controller
dd if=/dev/nvme0n1 of=/dev/null bs=4K count=1

[ 8040.805626] ------------[ cut here ]------------
[ 8040.805627] WARNING: CPU: 82 PID: 12921 at block/blk-mq.c:2578 blk_mq_map_swqueue+0x2b6/0x2c0
[ 8040.805627] Modules linked in: nvme nvme_core nf_conntrack_netlink xt_addrtype br_netfilter overlay xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nft_counter nf_nat_tftp nf_conntrack_tftp nft_masq nf_tables_set nft_fib_inet nft_f
ib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack tun bridge nf_defrag_ipv6 nf_defrag_ipv4 stp llc ip6_tables ip_tables nft_compat rfkill ip_set nf_tables nfne
tlink sunrpc intel_rapl_msr intel_rapl_common skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass ipmi_ssif crct10dif_pclmul crc32_pclmul iTCO_wdt iTCO_vendor_support ghash_clmulni_intel intel_
cstate intel_uncore raid0 joydev intel_rapl_perf ipmi_si pcspkr mei_me ioatdma sg ipmi_devintf mei i2c_i801 dca lpc_ich ipmi_msghandler acpi_power_meter acpi_pad xfs libcrc32c sd_mod ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm d
rm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
[ 8040.805637]  ahci drm i40e libahci crc32c_intel libata t10_pi wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: nvme_core]
[ 8040.805640] CPU: 82 PID: 12921 Comm: kworker/u194:2 Kdump: loaded Tainted: G        W         5.6.0-rc5.78317c+ #2
[ 8040.805640] Hardware name: Inspur SA5212M5/YZMB-00882-104, BIOS 4.0.9 08/27/2019
[ 8040.805641] Workqueue: nvme-reset-wq nvme_reset_work [nvme]
[ 8040.805642] RIP: 0010:blk_mq_map_swqueue+0x2b6/0x2c0
[ 8040.805643] Code: 00 00 00 00 00 41 83 c5 01 44 39 6d 50 77 b8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 8b bb 98 00 00 00 89 d6 e8 8c 81 03 00 eb 83 <0f> 0b e9 52 ff ff ff 0f 1f 00 0f 1f 44 00 00 41 57 48 89 f1 41 56
[ 8040.805643] RSP: 0018:ffffba590d2e7d48 EFLAGS: 00010246
[ 8040.805643] RAX: 0000000000000000 RBX: ffff9f013e1ba800 RCX: 000000000000003d
[ 8040.805644] RDX: ffff9f00ffff6000 RSI: 0000000000000003 RDI: ffff9ed200246d90
[ 8040.805644] RBP: ffff9f00f6a79860 R08: 0000000000000000 R09: 000000000000003d
[ 8040.805645] R10: 0000000000000001 R11: ffff9f0138c3d000 R12: ffff9f00fb3a9008
[ 8040.805645] R13: 000000000000007f R14: ffffffff96822660 R15: 000000000000005f
[ 8040.805645] FS:  0000000000000000(0000) GS:ffff9f013fa80000(0000) knlGS:0000000000000000
[ 8040.805646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8040.805646] CR2: 00007f7f397fa6f8 CR3: 0000003d8240a002 CR4: 00000000007606e0
[ 8040.805647] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8040.805647] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8040.805647] PKRU: 55555554
[ 8040.805647] Call Trace:
[ 8040.805649]  blk_mq_update_nr_hw_queues+0x31b/0x390
[ 8040.805650]  nvme_reset_work+0xb4b/0xeab [nvme]
[ 8040.805651]  process_one_work+0x1a7/0x370
[ 8040.805652]  worker_thread+0x1c9/0x380
[ 8040.805653]  ? max_active_store+0x80/0x80
[ 8040.805655]  kthread+0x112/0x130
[ 8040.805656]  ? __kthread_parkme+0x70/0x70
[ 8040.805657]  ret_from_fork+0x35/0x40
[ 8040.805658] ---[ end trace b5f13b1e73ccb5d3 ]---
[ 8229.365135] BUG: kernel NULL pointer dereference, address: 0000000000000004
[ 8229.365165] #PF: supervisor read access in kernel mode
[ 8229.365178] #PF: error_code(0x0000) - not-present page
[ 8229.365191] PGD 0 P4D 0
[ 8229.365201] Oops: 0000 [#1] SMP PTI
[ 8229.365212] CPU: 77 PID: 13024 Comm: dd Kdump: loaded Tainted: G        W         5.6.0-rc5.78317c+ #2
[ 8229.365232] Hardware name: Inspur SA5212M5/YZMB-00882-104, BIOS 4.0.9 08/27/2019
[ 8229.365253] RIP: 0010:blk_mq_get_tag+0x227/0x250
[ 8229.365265] Code: 44 24 04 44 01 e0 48 8b 74 24 38 65 48 33 34 25 28 00 00 00 75 33 48 83 c4 40 5b 5d 41 5c 41 5d 41 5e c3 48 8d 68 10 4c 89 ef <44> 8b 60 04 48 89 ee e8 dd f9 ff ff 83 f8 ff 75 c8 e9 67 fe ff ff
[ 8229.365304] RSP: 0018:ffffba590e977970 EFLAGS: 00010246
[ 8229.365317] RAX: 0000000000000000 RBX: ffff9f00f6a79860 RCX: ffffba590e977998
[ 8229.365333] RDX: 0000000000000000 RSI: ffff9f012039b140 RDI: ffffba590e977a38
[ 8229.365349] RBP: 0000000000000010 R08: ffffda58ff94e190 R09: ffffda58ff94e198
[ 8229.365365] R10: 0000000000000011 R11: ffff9f00f6a79860 R12: 0000000000000000
[ 8229.365381] R13: ffffba590e977a38 R14: ffff9f012039b140 R15: 0000000000000001
[ 8229.365397] FS:  00007f481c230580(0000) GS:ffff9f013f940000(0000) knlGS:0000000000000000
[ 8229.365415] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8229.365428] CR2: 0000000000000004 CR3: 0000005f35e26004 CR4: 00000000007606e0
[ 8229.365444] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8229.365460] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8229.365476] PKRU: 55555554
[ 8229.365484] Call Trace:
[ 8229.365498]  ? finish_wait+0x80/0x80
[ 8229.365512]  blk_mq_get_request+0xcb/0x3f0
[ 8229.365525]  blk_mq_make_request+0x143/0x5d0
[ 8229.365538]  generic_make_request+0xcf/0x310
[ 8229.365553]  ? scan_shadow_nodes+0x30/0x30
[ 8229.365564]  submit_bio+0x3c/0x150
[ 8229.365576]  mpage_readpages+0x163/0x1a0
[ 8229.365588]  ? blkdev_direct_IO+0x490/0x490
[ 8229.365601]  read_pages+0x6b/0x190
[ 8229.365612]  __do_page_cache_readahead+0x1c1/0x1e0
[ 8229.365626]  ondemand_readahead+0x182/0x2f0
[ 8229.365639]  generic_file_buffered_read+0x590/0xab0
[ 8229.365655]  new_sync_read+0x12a/0x1c0
[ 8229.365666]  vfs_read+0x8a/0x140
[ 8229.365676]  ksys_read+0x59/0xd0
[ 8229.365688]  do_syscall_64+0x55/0x1d0
[ 8229.365700]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Weiping Zhang <zhangweiping@didiglobal.com>
Tested-by: Weiping Zhang <zhangweiping@didiglobal.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:15:13 -06:00
..
partitions Merge branch 'block-5.7' into for-5.8/block 2020-05-09 16:13:58 -06:00
badblocks.c block: switch all files cleared marked as GPLv2 to SPDX tags 2019-04-30 16:11:57 -06:00
bfq-cgroup.c block, bfq: invoke flush_idle_tree after reparent_active_queues in pd_offline 2020-03-21 14:31:03 -06:00
bfq-iosched.c bdi: use bdi_dev_name() to get device name 2020-05-09 16:07:39 -06:00
bfq-iosched.h block, bfq: turn put_queue into release_process_ref in __bfq_bic_change_cgroup 2020-03-21 14:31:00 -06:00
bfq-wf2q.c block, bfq: get a ref to a group when adding it to a service tree 2020-02-03 06:58:15 -07:00
bio-integrity.c block: fix memleak of bio integrity data 2019-12-05 11:38:36 -07:00
bio.c block: move bio_map_* to blk-map.c 2020-03-27 12:04:34 -06:00
blk-cgroup-rwstat.c blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT 2019-11-07 12:28:13 -07:00
blk-cgroup-rwstat.h blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT 2019-11-07 12:28:13 -07:00
blk-cgroup.c Merge branch 'block-5.7' into for-5.8/block 2020-05-09 16:13:58 -06:00
blk-core.c block: add a bio_queue_enter helper 2020-04-29 09:33:26 -06:00
blk-exec.c block: account statistics for passthrough requests 2019-10-10 17:52:31 -06:00
blk-flush.c Revert "blkdev: check for valid request queue before issuing flush" 2020-03-27 10:23:44 -06:00
blk-integrity.c block: centralize PI remapping logic to the block layer 2019-09-17 20:03:49 -06:00
blk-ioc.c block: Fix use-after-free issue accessing struct io_cq 2020-03-12 07:07:38 -06:00
blk-iocost.c Merge branch 'block-5.7' into for-5.8/block 2020-05-09 16:13:58 -06:00
blk-iolatency.c blkcg: s/RQ_QOS_CGROUP/RQ_QOS_LATENCY/ 2019-08-28 21:17:08 -06:00
blk-lib.c block: fix 32 bit overflow in __blkdev_issue_discard() 2018-11-14 08:17:18 -07:00
blk-map.c block: remove RQF_COPY_USER 2020-04-22 10:47:06 -06:00
blk-merge.c block: replace BIO_QUEUE_ENTERED with BIO_CGROUP_ACCT 2020-04-29 09:33:26 -06:00
blk-mq-cpumap.c blk-mq: balance mapping between present CPUs and queues 2019-08-04 21:43:12 -06:00
blk-mq-debugfs-zoned.c block: Cleanup license notice 2019-01-17 21:21:40 -07:00
blk-mq-debugfs.c block: remove RQF_COPY_USER 2020-04-22 10:47:06 -06:00
blk-mq-debugfs.h blk-mq: no need to check return value of debugfs_create functions 2019-06-13 03:00:30 -06:00
blk-mq-pci.c block: Fix blk_mq_*_map_queues() kernel-doc headers 2019-05-31 15:12:34 -06:00
blk-mq-rdma.c block: Fix blk_mq_*_map_queues() kernel-doc headers 2019-05-31 15:12:34 -06:00
blk-mq-sched.c blk-mq: make function '__blk_mq_sched_dispatch_requests' static 2020-04-29 09:16:53 -06:00
blk-mq-sched.h block: blk-mq: Remove blk_mq_sched_started_request and started_request 2019-07-23 07:25:09 -06:00
blk-mq-sysfs.c blk-mq: make sure that line break can be printed 2019-11-04 07:14:10 -07:00
blk-mq-tag.c blk-mq: Remove some unused function arguments 2020-02-26 10:34:41 -07:00
blk-mq-tag.h blk-mq: Remove some unused function arguments 2020-02-26 10:34:41 -07:00
blk-mq-virtio.c blk-mq: Fix typo in comment 2020-03-17 20:55:21 +01:00
blk-mq.c block: alloc map and request for new hardware queue 2020-05-09 16:15:13 -06:00
blk-mq.h blk-mq: Remove some unused function arguments 2020-02-26 10:34:41 -07:00
blk-pm.c block: bypass blk_set_runtime_active for uninitialized q->dev 2019-09-12 07:11:56 -06:00
blk-pm.h block: remove the queue_lock indirection 2018-11-15 12:17:28 -07:00
blk-rq-qos.c blk-wbt: fix performance regression in wbt scale_up/scale_down 2019-10-06 09:26:41 -06:00
blk-rq-qos.h blk-rq-qos: fix first node deletion of rq_qos_del() 2019-10-15 10:13:13 -06:00
blk-settings.c block: move dma drain handling to scsi 2020-04-22 10:47:35 -06:00
blk-softirq.c block: Don't disable interrupts in trigger_softirq() 2019-11-18 07:29:22 -07:00
blk-stat.c blk-stat: Optimise blk_stat_add() 2019-10-07 21:19:10 -06:00
blk-stat.h block: deactivate blk_stat timer in wbt_disable_default() 2018-12-12 06:47:51 -07:00
blk-sysfs.c block: Remove "dying" checks from sysfs callbacks 2019-10-07 08:31:59 -06:00
blk-throttle.c blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT 2019-11-07 12:28:13 -07:00
blk-timeout.c block: add SPDX tags to block layer files missing licensing information 2019-04-30 16:12:03 -06:00
blk-wbt.c blk-wbt: Use tracepoint_string() for wbt_step tracepoint string literals 2020-04-17 08:21:44 -06:00
blk-wbt.h block/rq_qos: implement rq_qos_ops->queue_depth_changed() 2019-08-28 21:17:07 -06:00
blk-zoned.c for-5.7/drivers-2020-03-29 2020-03-30 11:43:51 -07:00
blk.h block: remove create_io_context 2020-04-25 09:44:40 -06:00
bounce.c block: remove the i argument to bio_for_each_segment_all 2019-04-30 09:26:13 -06:00
bsg-lib.c block: Fix the type of 'sts' in bsg_queue_rq() 2019-12-20 11:52:01 -07:00
bsg.c compat_ioctl: bsg: add handler 2020-01-03 09:33:21 +01:00
cmdline-parser.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
elevator.c Merge branch 'for-linus' into for-5.5/block 2019-11-07 12:27:19 -07:00
genhd.c block: fold bdev_unhash_inode into invalidate_partition 2020-04-20 11:33:00 -06:00
ioctl.c block: refactor blkpg_ioctl 2020-04-20 11:32:59 -06:00
ioprio.c docs: block: convert to ReST 2019-07-15 09:20:27 -03:00
Kconfig blk-iocost: account for IO size when testing latencies 2020-04-30 15:54:45 -06:00
Kconfig.iosched blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT 2019-11-07 12:28:13 -07:00
kyber-iosched.c blk-mq: remove blk_mq_put_ctx() 2019-07-02 21:03:27 -06:00
Makefile block: merge partition-generic.c and check.c 2020-03-24 07:57:08 -06:00
mq-deadline.c block: Introduce elevator features 2019-09-05 19:52:33 -06:00
opal_proto.h block: sed-opal: Change the check condition for regular session validity 2020-03-12 08:00:10 -06:00
scsi_ioctl.c scsi: core: Allow non-root users to perform ZBC commands 2020-03-16 18:26:31 -04:00
sed-opal.c block: sed-opal: Change the check condition for regular session validity 2020-03-12 08:00:10 -06:00
t10-pi.c block: Allow t10-pi to be modular 2020-01-06 20:59:04 -07:00