linux/drivers/nvme/host
Sagi Grimberg 3770a42bb8 nvme-tcp: fix regression that causes sporadic requests to time out
When we queue requests, we strive to batch as much as possible and also
signal the network stack that more data is about to be sent over a socket
with MSG_SENDPAGE_NOTLAST. Whether the flag is set depends on the pending
requests queued as well as on queue->more_requests, which is derived from
the block layer last-in-batch indication.

We set queue->more_requests=true when we flush the request directly from
.queue_rq submission context (in nvme_tcp_send_all), but this wrongly
assumes that no other requests may be queued during the execution of
nvme_tcp_send_all.

Due to this, a race condition may happen where:

 1. request X is queued as !last-in-batch
 2. request X submission context calls nvme_tcp_send_all directly
 3. nvme_tcp_send_all is preempted and rescheduled on a different CPU
 4. request Y is queued as last-in-batch
 5. nvme_tcp_send_all context sends requests X and Y, but signals
    MSG_SENDPAGE_NOTLAST for both because queue->more_requests=true.

==> neither request is pushed down to the wire because the network
stack is waiting for more data, and both requests time out.

To fix this, we eliminate queue->more_requests and rely only on
whether the queue's req_list and send_list are non-empty.
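
The race and the fix can be illustrated with a small userspace model. This is
a sketch under stated assumptions, not the driver code: struct model_queue,
queue_more_old, queue_more_new and last_send_marked_notlast are hypothetical
names, the lists are reduced to counters, and preemption is replayed as a
fixed sequence of steps. It only shows why a stale more_requests hint marks
the final send as "more data coming", while a decision based solely on the
two lists does not.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-queue state. */
struct model_queue {
	size_t req_list_len;   /* requests queued, not yet picked up by the sender */
	size_t send_list_len;  /* requests picked up, still waiting to be sent */
	bool more_requests;    /* last-in-batch hint that the fix removes */
};

/* Old rule (modelled): pending work OR the more_requests hint. */
static bool queue_more_old(const struct model_queue *q)
{
	return q->send_list_len > 0 || q->req_list_len > 0 || q->more_requests;
}

/* New rule (modelled): only the contents of the two lists matter. */
static bool queue_more_new(const struct model_queue *q)
{
	return q->send_list_len > 0 || q->req_list_len > 0;
}

/*
 * Replay the five steps from the commit message and report whether the
 * final request (Y) would be sent with the "more data coming" flag.
 */
static bool last_send_marked_notlast(bool use_old_rule)
{
	struct model_queue q = { 0 };
	bool (*more)(const struct model_queue *) =
		use_old_rule ? queue_more_old : queue_more_new;

	q.req_list_len = 1;      /* step 1: X queued as !last-in-batch */
	q.more_requests = true;  /* step 2: direct send sets the hint */
	                         /* step 3: sender is preempted here */
	q.req_list_len++;        /* step 4: Y queued as last-in-batch */

	/* step 5: the sender resumes and drains both requests */
	q.send_list_len = q.req_list_len;
	q.req_list_len = 0;
	q.send_list_len--;       /* X sent; Y is still pending at this point */
	q.send_list_len--;       /* Y taken off the list for sending */

	/* Nothing is pending now, so Y should not be flagged. */
	return more(&q);
}
```

In this model last_send_marked_notlast(true) reports that Y is still flagged,
which corresponds to the stalled send described above, and
last_send_marked_notlast(false) reports that it is not.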

Fixes: 122e5b9f3d ("nvme-tcp: optimize network stack with setting msg flags according to batch size")
Reported-by: Jonathan Nicklin <jnicklin@blockbridge.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Jonathan Nicklin <jnicklin@blockbridge.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-09-06 06:40:44 +02:00
apple.c nvme-apple: stop casting function pointer signatures 2022-08-02 17:22:51 -06:00
auth.c nvme-auth: Diffie-Hellman key exchange support 2022-08-02 17:14:49 -06:00
constants.c nvme-pci: print the command name of aborted commands 2022-08-02 17:22:48 -06:00
core.c nvme: enable generic interface (/dev/ngXnY) for unknown command sets 2022-08-02 17:22:53 -06:00
fabrics.c nvme-fabrics: Fix a typo in an error message 2022-08-10 16:21:31 +02:00
fabrics.h nvme: implement In-Band authentication 2022-08-02 17:14:49 -06:00
fault_inject.c block: remove the ->rq_disk field in struct request 2021-11-29 06:41:29 -07:00
fc.c nvme-fc: fix the fc_appid_store return value 2022-08-10 16:05:08 +02:00
fc.h nvme-fc: Update header and host for common definitions for LS handling 2020-05-09 16:18:33 -06:00
hwmon.c nvme-hwmon: Return error code when registration fails 2021-03-05 13:41:03 +01:00
ioctl.c nvme/host: Use the enum req_op and blk_opf_t types 2022-07-14 12:14:32 -06:00
Kconfig nvme-auth: Diffie-Hellman key exchange support 2022-08-02 17:14:49 -06:00
Makefile nvme: don't always build constants.o 2022-08-02 17:22:48 -06:00
multipath.c block: change the blk_queue_split calling convention 2022-08-02 17:22:53 -06:00
nvme.h nvme-multipath: refactor nvme_mpath_add_disk 2022-08-02 17:22:41 -06:00
pci.c nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM610 2022-08-31 07:57:28 +03:00
rdma.c nvme-rdma: split nvme_rdma_alloc_tagset 2022-08-02 17:22:48 -06:00
tcp.c nvme-tcp: fix regression that causes sporadic requests to time out 2022-09-06 06:40:44 +02:00
trace.c nvme: implement In-Band authentication 2022-08-02 17:14:49 -06:00
trace.h nvme: use command_id instead of req->tag in trace_nvme_complete_rq() 2022-08-02 17:22:46 -06:00
zns.c block: pass a gendisk to blk_queue_max_open_zones and blk_queue_max_active_zones 2022-07-06 06:46:26 -06:00