iv/linux

Author	SHA1	Message	Date
Jiawei Fu (iBug)	5e11bacff0	drivers/nvme: Add quirks for device 126f:2262 [ Upstream commit e89086c43f0500bc7c4ce225495b73b8ce234c1f ] This commit adds NVME_QUIRK_NO_DEEPEST_PS and NVME_QUIRK_BOGUS_NID for device [126f:2262], which appears to be a generic VID:PID pair used for many SSDs based on the Silicon Motion SM2262/SM2262EN controller. Two of my SSDs with this VID:PID pair exhibit the same behavior: * They frequently have trouble exiting the deepest power state (5), resulting in the entire disk unresponsive. Verified by setting nvme_core.default_ps_max_latency_us=10000 and observing them behaving normally. * They produce all-zero nguid and eui64 with `nvme id-ns` command. The offending products are: * HP SSD EX950 1TB * HIKVISION C2000Pro 2TB Signed-off-by: Jiawei Fu <i@ibugone.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-13 13:07:39 +02:00
Chunguang Xu	ff2f90f88d	nvme: fix reconnection fail due to reserved tag allocation [ Upstream commit de105068fead55ed5c07ade75e9c8e7f86a00d1d ] We found an issue in a production environment while using NVMe over RDMA: admin_q reconnect failed forever while the remote target and network were OK. After digging into it, we found it may be caused by an ABBA deadlock due to tag allocation. In my case, the tag was held by a keep-alive request waiting inside admin_q. As we quiesced admin_q while resetting the ctrl, the request was marked as idle and will not be processed before the reset succeeds. As fabric_q shares its tagset with admin_q, while reconnecting to the remote target we need a tag for the connect command, but the only reserved tag was held by the keep-alive command waiting inside admin_q. As a result, we failed to reconnect admin_q forever. In order to fix this issue, I think we should keep two reserved tags for the admin queue. Fixes: ed01fee283a0 ("nvme-fabrics: only reserve a single tag") Signed-off-by: Chunguang Xu <chunguang.xu@shopee.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-26 18:20:09 -04:00
Daniel Wagner	fad689fce0	nvmet-fc: take ref count on tgtport before delete assoc [ Upstream commit fe506a74589326183297d5abdda02d0c76ae5a8b ] We have to ensure that the tgtport is not going away before we have removed all the associations. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-01 13:34:52 +01:00
Daniel Wagner	eaf0971fda	nvmet-fc: avoid deadlock on delete association path [ Upstream commit 710c69dbaccdac312e32931abcb8499c1525d397 ] When deleting an association, the shutdown path deadlocks because we try to flush the nvmet_wq nested. Avoid this deadlock by deferring the put work into its own work item. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-01 13:34:52 +01:00
Daniel Wagner	399b70e8ea	nvmet-fc: abort command when there is no binding [ Upstream commit 3146345c2e9c2f661527054e402b0cfad80105a4 ] When the target port has no active port binding, there is no point in trying to process the command as it has to fail anyway. Instead of adding checks to all commands, abort the command early. Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-01 13:34:52 +01:00
Daniel Wagner	f2879398c2	nvmet-fc: hold reference on hostport match [ Upstream commit ca121a0f7515591dba0eb5532bfa7ace4dc153ce ] The hostport data structure is shared between the associations, which is why we keep track of the users via a refcount. So we should not decrement the refcount on a match and free the hostport several times. Reported by KASAN. Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-01 13:34:52 +01:00
Daniel Wagner	ccd49adde0	nvmet-fc: defer cleanup using RCU properly [ Upstream commit 4049dc96b8de7aeb3addcea039446e464726a525 ] When the target executes a disconnect and the host triggers a reconnect immediately, the reconnect command still finds an existing association. The reconnect crashes later on because nvmet_fc_delete_target_assoc blindly removes resources while the reconnect code wants to use them. To address this, nvmet_fc_find_target_assoc should not be able to look up an association which is being removed. The association list is already under RCU lifetime management, so let's properly use it and remove the association from the list and wait for a grace period before cleaning up everything. This means we also can drop the RCU management on the queues, because this is now handled via the association itself. A second step splits the execution context so that the initial disconnect command can complete without running the reconnect code in the same context. As usual, this is done by deferring the ->done to a workqueue. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-01 13:34:52 +01:00
Daniel Wagner	2baa7272f2	nvmet-fc: release reference on target port [ Upstream commit c691e6d7e13dab81ac8c7489c83b5dea972522a5 ] In case we return early out of __nvmet_fc_finish_ls_req() we still have to release the reference on the target port. Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-01 13:34:52 +01:00
Daniel Wagner	95a9ff3307	nvmet-fcloop: swap the list_add_tail arguments [ Upstream commit dcfad4ab4d6733f2861cd241d8532a0004fc835a ] The first argument of the list_add_tail() function is the new element, which is added to the list given as the second argument. Swap the arguments to allow processing more than one element at a time. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-01 13:34:52 +01:00
Daniel Wagner	baa6b7eb8c	nvme-fc: do not wait in vain when unloading module [ Upstream commit 70fbfc47a392b98e5f8dba70c6efc6839205c982 ] The module exit path has a race between deleting all controllers and freeing 'left over IDs'. To prevent a double free, a synchronization between nvme_delete_ctrl and ida_destroy was added by the initial commit. There is some logic that tries to prevent hanging forever in wait_for_completion, though it does not handle all cases. E.g. blktests is able to reproduce the situation where the module unload hangs forever. If we completely rely on the cleanup code executed from the nvme_delete_ctrl path, all IDs will be freed eventually. This makes calling ida_destroy unnecessary. We only have to ensure that all nvme_delete_ctrl code has been executed before we leave nvme_fc_exit_module. This is done by flushing the nvme_delete_wq workqueue. While at it, remove the unused nvme_fc_wq workqueue too. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-01 13:34:52 +01:00
Guixin Liu	307fc03dc4	nvmet-tcp: fix nvme tcp ida memory leak [ Upstream commit 47c5dd66c1840524572dcdd956f4af2bdb6fbdff ] The nvmet_tcp_queue_ida should be destroyed when the nvmet-tcp module exits. Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-01 13:34:51 +01:00
Maurizio Lombardi	2ed3d35328	nvmet-tcp: Fix the H2C expected PDU len calculation [ Upstream commit 9a1abc24850eb759e36a2f8869161c3b7254c904 ] The nvmet_tcp_handle_h2c_data_pdu() function should take into consideration the possibility that the header digest and/or the data digests are enabled when calculating the expected PDU length, before comparing it to the value stored in cmd->pdu_len. Fixes: efa56305908b ("nvmet-tcp: Fix a kernel panic when host sends an invalid H2C PDU length") Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-25 15:35:55 -08:00
Arnd Bergmann	79e9dfd7f8	nvme: trace: avoid memcpy overflow warning [ Upstream commit a7de1dea76cd6a3707707af4ea2f8bc3cdeaeb11 ] A previous patch introduced a struct_group() in nvme_common_command to help stringop fortification figure out the length of the fields, but one function is not currently using them: In file included from drivers/nvme/target/core.c:7: In file included from include/linux/string.h:254: include/linux/fortify-string.h:592:4: error: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror,-Wattribute-warning] __read_overflow2_field(q_size_field, size); ^ Change this one to use the correct field name to avoid the warning. Fixes: 5c629dc9609dc ("nvme: use struct group for generic command dwords") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-25 15:35:55 -08:00
Arnd Bergmann	4652eb8176	nvmet: re-fix tracing strncpy() warning [ Upstream commit 4ee7ffeb4ce50c80bc4504db6f39b25a2df6bcf4 ] An earlier patch had tried to address a warning about a string copy with missing zero termination: drivers/nvme/target/trace.h:52:3: warning: ‘strncpy’ specified bound 32 equals destination size [-Wstringop-truncation] The new version causes a different warning with some compiler versions, notably gcc-9 and gcc-10, and also misses the zero padding that was apparently done intentionally in the original code: drivers/nvme/target/trace.h:56:2: error: 'strncpy' specified bound depends on the length of the source argument [-Werror=stringop-overflow=] Change it to use strscpy_pad() with the original length, which will give a properly padded and zero-terminated string as well as avoiding the warning. Fixes: d86481e924a7 ("nvmet: use min of device_path and disk len") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-25 15:35:55 -08:00
Maurizio Lombardi	2f00fd8d50	nvmet-tcp: fix a crash in nvmet_req_complete() [ Upstream commit 0849a5441358cef02586fb2d60f707c0db195628 ] In nvmet_tcp_handle_h2c_data_pdu(), if the host sends a data_offset different from rbytes_done, the driver ends up calling nvmet_req_complete() passing a status error. The problem is that at this point cmd->req is not yet initialized, so the kernel will crash after dereferencing a NULL pointer. Fix the bug by replacing the call to nvmet_req_complete() with nvmet_tcp_fatal_error(). Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver") Reviewed-by: Keith Busch <kbsuch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-25 15:35:54 -08:00
Maurizio Lombardi	24e0576018	nvmet-tcp: Fix a kernel panic when host sends an invalid H2C PDU length [ Upstream commit efa56305908ba20de2104f1b8508c6a7401833be ] If the host sends an H2CData command with an invalid DATAL, the kernel may crash in nvmet_tcp_build_pdu_iovec(). Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 lr : nvmet_tcp_io_work+0x6ac/0x718 [nvmet_tcp] Call trace: process_one_work+0x174/0x3c8 worker_thread+0x2d0/0x3e8 kthread+0x104/0x110 Fix the bug by raising a fatal error if DATAL isn't coherent with the packet size. Also, the PDU length should never exceed the MAXH2CDATA parameter which has been communicated to the host in nvmet_tcp_handle_icreq(). Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver") Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-25 15:35:54 -08:00
Bitao Hu	c52d545c1e	nvme: fix deadlock between reset and scan [ Upstream commit 839a40d1e730977d4448d141fa653517c2959a88 ] If controller reset occurs when allocating namespace, both nvme_reset_work and nvme_scan_work will hang, as shown below. Test Scripts: for ((t=1;t<=128;t++)) do nsid=`nvme create-ns /dev/nvme1 -s 14537724 -c 14537724 -f 0 -m 0 \ -d 0 \| awk -F: '{print($NF);}'` nvme attach-ns /dev/nvme1 -n $nsid -c 0 done nvme reset /dev/nvme1 We will find that both nvme_reset_work and nvme_scan_work hung: INFO: task kworker/u249:4:17848 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u249:4 state:D stack: 0 pid:17848 ppid: 2 flags:0x00000028 Workqueue: nvme-reset-wq nvme_reset_work [nvme] Call trace: __switch_to+0xb4/0xfc __schedule+0x22c/0x670 schedule+0x4c/0xd0 blk_mq_freeze_queue_wait+0x84/0xc0 nvme_wait_freeze+0x40/0x64 [nvme_core] nvme_reset_work+0x1c0/0x5cc [nvme] process_one_work+0x1d8/0x4b0 worker_thread+0x230/0x440 kthread+0x114/0x120 INFO: task kworker/u249:3:22404 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u249:3 state:D stack: 0 pid:22404 ppid: 2 flags:0x00000028 Workqueue: nvme-wq nvme_scan_work [nvme_core] Call trace: __switch_to+0xb4/0xfc __schedule+0x22c/0x670 schedule+0x4c/0xd0 rwsem_down_write_slowpath+0x32c/0x98c down_write+0x70/0x80 nvme_alloc_ns+0x1ac/0x38c [nvme_core] nvme_validate_or_alloc_ns+0xbc/0x150 [nvme_core] nvme_scan_ns_list+0xe8/0x2e4 [nvme_core] nvme_scan_work+0x60/0x500 [nvme_core] process_one_work+0x1d8/0x4b0 worker_thread+0x260/0x440 kthread+0x114/0x120 INFO: task nvme:28428 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
task:nvme state:D stack: 0 pid:28428 ppid: 27119 flags:0x00000000 Call trace: __switch_to+0xb4/0xfc __schedule+0x22c/0x670 schedule+0x4c/0xd0 schedule_timeout+0x160/0x194 do_wait_for_common+0xac/0x1d0 __wait_for_common+0x78/0x100 wait_for_completion+0x24/0x30 __flush_work.isra.0+0x74/0x90 flush_work+0x14/0x20 nvme_reset_ctrl_sync+0x50/0x74 [nvme_core] nvme_dev_ioctl+0x1b0/0x250 [nvme_core] __arm64_sys_ioctl+0xa8/0xf0 el0_svc_common+0x88/0x234 do_el0_svc+0x7c/0x90 el0_svc+0x1c/0x30 el0_sync_handler+0xa8/0xb0 el0_sync+0x148/0x180 The reason for the hang is that nvme_reset_work occurs while nvme_scan_work is still running. nvme_scan_work may add new ns into ctrl->namespaces list after nvme_reset_work frozen all ns->q in ctrl->namespaces list. The newly added ns is not frozen, so nvme_wait_freeze will wait forever. Unfortunately, ctrl->namespaces_rwsem is held by nvme_reset_work, so nvme_scan_work will also wait forever. Now we are deadlocked! PROCESS1 PROCESS2 ============== ============== nvme_scan_work ... nvme_reset_work nvme_validate_or_alloc_ns nvme_dev_disable nvme_alloc_ns nvme_start_freeze down_write ... nvme_ns_add_to_ctrl_list ... up_write nvme_wait_freeze ... down_read nvme_alloc_ns blk_mq_freeze_queue_wait down_write Fix by marking the ctrl with say NVME_CTRL_FROZEN flag set in nvme_start_freeze and cleared in nvme_unfreeze. Then the scan can check it before adding the new namespace (under the namespaces_rwsem). Signed-off-by: Bitao Hu <yaoma@linux.alibaba.com> Reviewed-by: Guixin Liu <kanie@linux.alibaba.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-20 11:51:41 +01:00
Nitesh Shetty	946fd64ba3	nvme: prevent potential spectre v1 gadget [ Upstream commit 20dc66f2d76b4a410df14e4675e373b718babc34 ] This patch fixes the smatch warning, "nvmet_ns_ana_grpid_store() warn: potential spectre issue 'nvmet_ana_group_enabled' [w] (local cap)" Prevent the contents of kernel memory from being leaked to user space via speculative execution by using array_index_nospec. Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-20 11:51:41 +01:00
Keith Busch	8b2a6a3692	nvme-ioctl: move capable() admin check to the end [ Upstream commit 7be866b1cf0bf1dfa74480fe8097daeceda68622 ] This can be an expensive call on some kernel configs. Move it to the end after checking the cheaper ways to determine if the command is allowed. Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-20 11:51:41 +01:00
Keith Busch	8884a56d21	nvme: ensure reset state check ordering [ Upstream commit e6e7f7ac03e40795346f1b2994a05f507ad8d345 ] A different CPU may be setting the ctrl->state value, so ensure proper barriers to prevent optimizing to a stale state. Normally it isn't a problem to observe the wrong state as it is merely advisory to take a quicker path during initialization and error recovery, but seeing an old state can report unexpected ENETRESET errors when a reset request was in fact successful. Reported-by: Minh Hoang <mh2022@meta.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-20 11:51:41 +01:00
Keith Busch	cc5b051eeb	nvme: introduce helper function to get ctrl state [ Upstream commit 5c687c287c46fadb14644091823298875a5216aa ] The controller state is typically written by another CPU, so reading it should ensure no optimizations are taken. This is a repeated pattern in the driver, so start with adding a convenience function that returns the controller state with READ_ONCE(). Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-20 11:51:41 +01:00
Keith Busch	a4848c45a3	nvme-core: check for too small lba shift [ Upstream commit 74fbc88e161424b3b96a22b23a8e3e1edab9d05c ] The block layer doesn't support logical block sizes smaller than 512 bytes. The nvme spec doesn't support that small either, but the driver isn't checking to make sure the device responded with usable data. Failing to catch this will result in a kernel bug, either from a division by zero when stacking, or a zero length bio. Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-20 11:51:39 +01:00
Maurizio Lombardi	75cc56afb2	nvme-core: fix a memory leak in nvme_ns_info_from_identify() [ Upstream commit e3139cef8257fcab1725441e2fd5fd0ccb5481b1 ] In case of error, free the nvme_id_ns structure that was allocated by nvme_identify_ns(). Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-20 11:51:38 +01:00
Keith Busch	c62b9a2daf	Revert "nvme-fc: fix race between error recovery and creating association" commit d3e8b1858734bf46cda495be4165787b9a3981a6 upstream. The commit was found to possibly sleep in an invalid context and is blocking regression testing. This reverts commit ee6fdc5055e916b1dd497f11260d4901c4c1e55e. Link: https://lore.kernel.org/linux-nvme/hkhl56n665uvc6t5d6h3wtx7utkcorw4xlwi7d2t2bnonavhe6@xaan6pu43ap6/ Link: https://lists.infradead.org/pipermail/linux-nvme/2023-December/043756.html Reported-by: Daniel Wagner <dwagner@suse.de> Reported-by: Maurizio Lombardi <mlombard@redhat.com> Cc: Michael Liang <mliang@purestorage.com> Tested-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-01-05 15:19:43 +01:00
Maurizio Lombardi	fedbc8732f	nvme-pci: fix sleeping function called from interrupt context [ Upstream commit f6fe0b2d35457c10ec37acc209d19726bdc16dbd ] the nvme_handle_cqe() interrupt handler calls nvme_complete_async_event() but the latter may call nvme_auth_stop() which is a blocking function. Sleeping functions can't be called in interrupt context BUG: sleeping function called from invalid context in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/15 Call Trace: <IRQ> __cancel_work_timer+0x31e/0x460 ? nvme_change_ctrl_state+0xcf/0x3c0 [nvme_core] ? nvme_change_ctrl_state+0xcf/0x3c0 [nvme_core] nvme_complete_async_event+0x365/0x480 [nvme_core] nvme_poll_cq+0x262/0xe50 [nvme] Fix the bug by moving nvme_auth_stop() to fw_act_work (executed by the nvme_wq workqueue) Fixes: f50fff73d620 ("nvme: implement In-Band authentication") Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-01 12:42:35 +00:00
Hannes Reinecke	9514925a9a	nvme: catch errors from nvme_configure_metadata() [ Upstream commit cd9aed606088d36a7ffff3e808db4e76b1854285 ] nvme_configure_metadata() is issuing I/O, so we might incur an I/O error which will cause the connection to be reset. But in that case any further probing will race with reset and cause UAF errors. So return a status from nvme_configure_metadata() and abort probing if there was an I/O error. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-20 17:01:57 +01:00
Mark O'Donovan	89fc9028e8	nvme-auth: set explanation code for failure2 msgs [ Upstream commit 38ce1570e2c46e7e9af983aa337edd7e43723aa2 ] Some error cases were not setting an auth-failure-reason-code-explanation. This means an AUTH_Failure2 message will be sent with an explanation value of 0 which is a reserved value. Signed-off-by: Mark O'Donovan <shiftee@posteo.net> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-20 17:01:57 +01:00
Georg Gottleuber	dd864f6ee0	nvme-pci: Add sleep quirk for Kingston drives commit 107b4e063d78c300b21e2d5291b1aa94c514ea5b upstream. Some Kingston NV1 and A2000 are wasting a lot of power on specific TUXEDO platforms in s2idle sleep if 'Simple Suspend' is used. This patch applies a new quirk 'Force No Simple Suspend' to achieve a low power sleep without 'Simple Suspend'. Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Signed-off-by: Georg Gottleuber <ggo@tuxedocomputers.com> Cc: <stable@vger.kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-13 18:45:20 +01:00
Ewan D. Milne	2bda53caf1	nvme: check for valid nvme_identify_ns() before using it commit d8b90d600aff181936457f032d116dbd8534db06 upstream. When scanning namespaces, it is possible to get valid data from the first call to nvme_identify_ns() in nvme_alloc_ns(), but not from the second call in nvme_update_ns_info_block(). In particular, if the NSID becomes inactive between the two commands, a storage device may return a buffer filled with zero as per 4.1.5.1. In this case, we can get a kernel crash due to a divide-by-zero in blk_stack_limits() because ns->lba_shift will be set to zero. PID: 326 TASK: ffff95fec3cd8000 CPU: 29 COMMAND: "kworker/u98:10" #0 [ffffad8f8702f9e0] machine_kexec at ffffffff91c76ec7 #1 [ffffad8f8702fa38] __crash_kexec at ffffffff91dea4fa #2 [ffffad8f8702faf8] crash_kexec at ffffffff91deb788 #3 [ffffad8f8702fb00] oops_end at ffffffff91c2e4bb #4 [ffffad8f8702fb20] do_trap at ffffffff91c2a4ce #5 [ffffad8f8702fb70] do_error_trap at ffffffff91c2a595 #6 [ffffad8f8702fbb0] exc_divide_error at ffffffff928506e6 #7 [ffffad8f8702fbd0] asm_exc_divide_error at ffffffff92a00926 [exception RIP: blk_stack_limits+434] RIP: ffffffff92191872 RSP: ffffad8f8702fc80 RFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff95efa0c91800 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001 RBP: 00000000ffffffff R8: ffff95fec7df35a8 R9: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffff95fed33c09a8 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffffad8f8702fce0] nvme_update_ns_info_block at ffffffffc06d3533 [nvme_core] #9 [ffffad8f8702fd18] nvme_scan_ns at ffffffffc06d6fa7 [nvme_core] This happened when the check for valid data was moved out of nvme_identify_ns() into one of the callers. Fix this by checking in both callers. 
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218186 Fixes: 0dd6fff2aad4 ("nvme: bring back auto-removal of deleted namespaces during sequential scan") Cc: stable@vger.kernel.org Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:52:18 +01:00
Christoph Hellwig	2291653c27	nvmet: nul-terminate the NQNs passed in the connect command [ Upstream commit 1c22e0295a5eb571c27b53c7371f95699ef705ff ] The host and subsystem NQNs are passed in the connect command payload and interpreted as nul-terminated strings. Ensure they actually are nul-terminated before using them. Fixes: a07b4970f464 ("nvmet: add a generic NVMe target") Reported-by: Alon Zahavi <zahavi.alon@gmail.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-03 07:33:06 +01:00
Hannes Reinecke	399d76d330	nvme: blank out authentication fabrics options if not configured [ Upstream commit c7ca9757bda35ff9ce27ab42f2cb8b84d983e6ad ] If the config option NVME_HOST_AUTH is not selected we should not accept the corresponding fabrics options. This allows userspace to detect if NVMe authentication has been enabled for the kernel. Cc: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Fixes: f50fff73d620 ("nvme: implement In-Band authentication") Signed-off-by: Hannes Reinecke <hare@suse.de> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-03 07:33:06 +01:00
Anuj Gupta	04e446f54f	nvme: fix error-handling for io_uring nvme-passthrough [ Upstream commit 1147dd0503564fa0e03489a039f9e0c748a03db4 ] Driver may return an error before submitting the command to the device. Ensure that such error is propagated up. Fixes: 456cba386e94 ("nvme: wire-up uring-cmd support for io-passthru on char-device.") Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-20 11:59:35 +01:00
Keith Busch	5c3f406646	nvme-pci: add BOGUS_NID for Intel 0a54 device These ones claim cmic and nmic capable, so need special consideration to ignore their duplicate identifiers. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217981 Reported-by: welsh@cassens.com Signed-off-by: Keith Busch <kbusch@kernel.org>	2023-10-18 14:08:39 -07:00
Maurizio Lombardi	f965b281fd	nvmet-auth: complete a request only after freeing the dhchap pointers It may happen that the work to destroy a queue (for example nvmet_tcp_release_queue_work()) is started while an auth-send or auth-receive command is still completing. nvmet_sq_destroy() will block, waiting for all the references to the sq to be dropped, the last reference is then dropped when nvmet_req_complete() is called. When this happens, both nvmet_sq_destroy() and nvmet_execute_auth_send()/_receive() will free the dhchap pointers by calling nvmet_auth_sq_free(). Since there isn't any lock, the two threads may race against each other, causing double frees and memory corruptions, as reported by KASAN. Reproduced by stress blktests nvme/041 nvme/042 nvme/043 nvme nvme2: qid 0: authenticated with hash hmac(sha512) dhgroup ffdhe4096 ================================================================== BUG: KASAN: double-free in kfree+0xec/0x4b0 Call Trace: <TASK> kfree+0xec/0x4b0 nvmet_auth_sq_free+0xe1/0x160 [nvmet] nvmet_execute_auth_send+0x482/0x16d0 [nvmet] process_one_work+0x8e5/0x1510 Allocated by task 191846: __kasan_kmalloc+0x81/0xa0 nvmet_auth_ctrl_sesskey+0xf6/0x380 [nvmet] nvmet_auth_reply+0x119/0x990 [nvmet] Freed by task 143270: kfree+0xec/0x4b0 nvmet_auth_sq_free+0xe1/0x160 [nvmet] process_one_work+0x8e5/0x1510 Fix this bug by calling nvmet_req_complete() only after freeing the pointers, so we will prevent the race by holding the sq reference. V2: remove redundant code Fixes: db1312dd9548 ("nvmet: implement basic In-Band Authentication") Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2023-10-18 14:08:39 -07:00
Keith Busch	2b32c76e2b	nvme: sanitize metadata bounce buffer for reads User can request more metadata bytes than the device will write. Ensure kernel buffer is initialized so we're not leaking unsanitized memory on the copy-out. Fixes: 0b7f1f26f95a51a ("nvme: use the block layer for userspace passthrough metadata") Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2023-10-18 14:08:39 -07:00
Martin Wilck	4ae55a7dce	nvme-auth: use chap->s2 to indicate bidirectional authentication Commit 546dea18c999 ("nvme-auth: check chap ctrl_key once constructed") replaced the condition "if (ctrl->ctrl_key)" (indicating bidirectional auth) by "if (chap->ctrl_key)", because ctrl->ctrl_key is a resource shared with sysfs. But chap->ctrl_key is set in nvme_auth_process_dhchap_challenge() depending on the DHVLEN in the DH-HMAC-CHAP Challenge message received from the controller, and will thus be non-NULL for every DH-HMAC-CHAP exchange, even if unidirectional auth was requested. This will lead to a protocol violation by sending a Success2 message in the unidirectional case (per NVMe base spec 2.0, the authentication transaction ends after the Success1 message for unidirectional auth). Use chap->s2 instead, which is non-zero if and only if the host requested bi-directional authentication from the controller. Fixes: 546dea18c999 ("nvme-auth: check chap ctrl_key once constructed") Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2023-10-10 08:06:06 -07:00
Sagi Grimberg	d920abd1e7	nvmet-tcp: Fix a possible UAF in queue intialization setup From Alon: "Due to a logical bug in the NVMe-oF/TCP subsystem in the Linux kernel, a malicious user can cause a UAF and a double free, which may lead to RCE (may also lead to an LPE in case the attacker already has local privileges)." Hence, when a queue initialization fails after the ahash requests are allocated, it is guaranteed that the queue removal async work will be called, hence leave the deallocation to the queue removal. Also, be extra careful not to continue processing the socket, so set queue rcv_state to NVMET_TCP_RECV_ERR upon a socket error. Cc: stable@vger.kernel.org Reported-by: Alon Zahavi <zahavi.alon@gmail.com> Tested-by: Alon Zahavi <zahavi.alon@gmail.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2023-10-10 08:03:22 -07:00
Maurizio Lombardi	3820c4fdc2	nvme-rdma: do not try to stop unallocated queues Trying to stop a queue which hasn't been allocated will result in a warning due to calling mutex_lock() against an uninitialized mutex. DEBUG_LOCKS_WARN_ON(lock->magic != lock) WARNING: CPU: 4 PID: 104150 at kernel/locking/mutex.c:579 Call trace: RIP: 0010:__mutex_lock+0x1173/0x14a0 nvme_rdma_stop_queue+0x1b/0xa0 [nvme_rdma] nvme_rdma_teardown_io_queues.part.0+0xb0/0x1d0 [nvme_rdma] nvme_rdma_delete_ctrl+0x50/0x100 [nvme_rdma] nvme_do_delete_ctrl+0x149/0x158 [nvme_core] Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2023-10-05 09:37:41 -07:00
Jens Axboe	c266ae774e	nvme fixes for Linux 6.6 - nvme-tcp iov len fix (Varun) - nvme-hwmon const qualifier for safety (Krzysztof) - nvme-fc null pointer checks (Nigel) - nvme-pci no numa node fix (Pratyush) - nvme timeout fix for non-compliant controllers (Keith) Merge tag 'nvme-6.6-2023-09-14' of git://git.infradead.org/nvme into block-6.6 Pull NVMe fixes from Keith: "nvme fixes for Linux 6.6 - nvme-tcp iov len fix (Varun) - nvme-hwmon const qualifier for safety (Krzysztof) - nvme-fc null pointer checks (Nigel) - nvme-pci no numa node fix (Pratyush) - nvme timeout fix for non-compliant controllers (Keith)" * tag 'nvme-6.6-2023-09-14' of git://git.infradead.org/nvme: nvme: avoid bogus CRTO values nvme-pci: do not set the NUMA node of device if it has none nvme-fc: Prevent null pointer dereference in nvme_fc_io_getuuid() nvme: host: hwmon: constify pointers to hwmon_channel_info nvmet-tcp: pass iov_len instead of sg->length to bvec_set_page()	2023-09-14 16:20:31 -06:00
Keith Busch	6cc834ba62	nvme: avoid bogus CRTO values Some devices are reporting controller ready mode support, but return 0 for CRTO. These devices require a much higher time to ready than that, so they are failing to initialize after the driver started preferring that value over CAP.TO. The spec requires that CAP.TO match the appropriate CRTO value, or be set to 0xff if CRTO is larger than that. This means that CAP.TO can be used to validate if CRTO is reliable, and provides an appropriate fallback for setting the timeout value if not. Use whichever is larger. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217863 Reported-by: Cláudio Sampaio <patola@gmail.com> Reported-by: Felix Yan <felixonmars@archlinux.org> Tested-by: Felix Yan <felixonmars@archlinux.org> Based-on-a-patch-by: Felix Yan <felixonmars@archlinux.org> Cc: stable@vger.kernel.org Signed-off-by: Keith Busch <kbusch@kernel.org>	2023-09-14 13:09:52 -07:00
Pratyush Yadav	dad651b2a4	nvme-pci: do not set the NUMA node of device if it has none If a device has no NUMA node information associated with it, the driver puts the device in node first_memory_node (say node 0). Not having a NUMA node and being associated with node 0 are completely different things and it makes little sense to mix the two. Signed-off-by: Pratyush Yadav <ptyadav@amazon.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2023-09-12 09:06:58 -07:00
Linus Torvalds	3d3dfeb3ae	for-6.6/block-2023-08-28 Merge tag 'for-6.6/block-2023-08-28' of git://git.kernel.dk/linux Pull block updates from Jens Axboe: "Pretty quiet round for this release. This contains: - Add support for zoned storage to ublk (Andreas, Ming) - Series improving performance for drivers that mark themselves as needing a blocking context for issue (Bart) - Cleanup the flush logic (Chengming) - sed opal keyring support (Greg) - Fixes and improvements to the integrity support (Jinyoung) - Add some exports for bcachefs that we can hopefully delete again in the future (Kent) - deadline throttling fix (Zhiguo) - Series allowing building the kernel without buffer_head support (Christoph) - Sanitize the bio page adding flow (Christoph) - Write back cache fixes (Christoph) - MD updates via Song: - Fix perf regression for raid0 large sequential writes (Jan) - Fix split bio iostat for raid0 (David) - Various raid1 fixes (Heinz, Xueshi) - raid6test build fixes (WANG) - Deprecate bitmap file support (Christoph) - Fix deadlock with md sync thread (Yu) - Refactor md io accounting (Yu) - Various non-urgent fixes (Li, Yu, Jack) - Various fixes and cleanups (Arnd, Azeem, Chengming, Damien, Li, Ming, Nitesh, Ruan, Tejun, Thomas, Xu)" * tag 'for-6.6/block-2023-08-28' of git://git.kernel.dk/linux: (113 commits) block: use strscpy() to instead of strncpy() block: sed-opal: keyring support for SED keys block: sed-opal: Implement IOC_OPAL_REVERT_LSP block: sed-opal: Implement IOC_OPAL_DISCOVERY blk-mq: prealloc tags when increase tagset nr_hw_queues blk-mq: delete redundant tagset map update when fallback blk-mq: fix tags leak when shrink nr_hw_queues ublk: zoned: support REQ_OP_ZONE_RESET_ALL md: raid0: account for split bio in iostat accounting md/raid0: Fix performance regression for large sequential writes md/raid0: Factor out helper for mapping and submitting a bio md raid1: allow writebehind to work on any leg device set WriteMostly md/raid1: hold the barrier until handle_read_error() finishes md/raid1: free the r1bio before waiting for blocked rdev md/raid1: call free_r1bio() before allow_barrier() in raid_end_bio_io() blk-cgroup: Fix NULL deref caused by blkg_policy_data being installed before init drivers/rnbd: restore sysfs interface to rnbd-client md/raid5-cache: fix null-ptr-deref for r5l_flush_stripe_to_raid() raid6: test: only check for Altivec if building on powerpc hosts raid6: test: make sure all intermediate and artifact files are .gitignored ...	2023-08-29 20:21:42 -07:00
Nigel Kirkland	8ae5b3a685	nvme-fc: Prevent null pointer dereference in nvme_fc_io_getuuid() The nvme_fc_fcp_op structure describing an AEN operation is initialized with a null request structure pointer. An FC LLDD may make a call to nvme_fc_io_getuuid passing a pointer to an nvmefc_fcp_req for an AEN operation. Add validation of the request structure pointer before dereference. Signed-off-by: Nigel Kirkland <nkirkland2304@gmail.com> Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2023-08-21 13:29:17 -07:00
Krzysztof Kozlowski	71be868472	nvme: host: hwmon: constify pointers to hwmon_channel_info Statically allocated array of pointed to hwmon_channel_info can be made const for safety. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Keith Busch <kbusch@kernel.org>	2023-08-21 12:54:02 -07:00
Varun Prakash	1f0bbf2894	nvmet-tcp: pass iov_len instead of sg->length to bvec_set_page() iov_len is the valid data length, so pass iov_len instead of sg->length to bvec_set_page(). Fixes: 5bfaba275ae6 ("nvmet-tcp: don't map pages which can't come from HIGHMEM") Signed-off-by: Rakshana Sridhar <rakshanas@chelsio.com> Signed-off-by: Varun Prakash <varun@chelsio.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2023-08-21 12:54:02 -07:00
Linus Torvalds	360e694282	block-6.5-2023-08-11 Merge tag 'block-6.5-2023-08-11' of git://git.kernel.dk/linux Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - Fixes for request_queue state (Ming) - Another uuid quirk (August) - RCU poll fix for NVMe (Ming) - Fix for an IO stall with polled IO (me) - Fix for blk-iocost stats enable/disable accounting (Chengming) - Regression fix for large pages for zram (Christoph) * tag 'block-6.5-2023-08-11' of git://git.kernel.dk/linux: nvme: core: don't hold rcu read lock in nvme_ns_chr_uring_cmd_iopoll blk-iocost: fix queue stats accounting block: don't make REQ_POLLED imply REQ_NOWAIT block: get rid of unused plug->nowait flag zram: take device and not only bvec offset into account nvme-pci: add NVME_QUIRK_BOGUS_NID for Samsung PM9B1 256G and 512G nvme-rdma: fix potential unbalanced freeze & unfreeze nvme-tcp: fix potential unbalanced freeze & unfreeze nvme: fix possible hang when removing a controller during error recovery	2023-08-11 12:14:08 -07:00
Ming Lei	a7a7dabb5d	nvme: core: don't hold rcu read lock in nvme_ns_chr_uring_cmd_iopoll Now nvme_ns_chr_uring_cmd_iopoll() has switched to request based io polling, and the associated NS is guaranteed to be live in case of io polling, so request is guaranteed to be valid because blk-mq uses pre-allocated request pool. Remove the rcu read lock in nvme_ns_chr_uring_cmd_iopoll(), which isn't needed any more after switching to request based io polling. Fix "BUG: sleeping function called from invalid context" because set_page_dirty_lock() from blk_rq_unmap_user() may sleep. Fixes: 585079b6e425 ("nvme: wire up async polling for io passthrough commands") Reported-by: Guangwu Zhang <guazhang@redhat.com> Cc: Kanchan Joshi <joshi.k@samsung.com> Cc: Anuj Gupta <anuj20.g@samsung.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Tested-by: Guangwu Zhang <guazhang@redhat.com> Link: https://lore.kernel.org/r/20230809020440.174682-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2023-08-11 08:12:32 -06:00
Jinyoung Choi	80814b8e35	bio-integrity: update the payload size in bio_integrity_add_page() Previously, the bip's bi_size has been set before an integrity pages were added. If a problem occurs in the process of adding pages for bip, the bi_size mismatch problem must be dealt with. When the page is successfully added to bvec, the bi_size is updated. The parts affected by the change were also contained in this commit. Cc: Christoph Hellwig <hch@lst.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jinyoung Choi <j-young.choi@samsung.com> Tested-by: "Martin K. Petersen" <martin.petersen@oracle.com> Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20230803024956epcms2p38186a17392706650c582d38ef3dbcd32@epcms2p3 Signed-off-by: Jens Axboe <axboe@kernel.dk>	2023-08-09 16:05:35 -06:00
August Wikerfors	688b419c57	nvme-pci: add NVME_QUIRK_BOGUS_NID for Samsung PM9B1 256G and 512G The Samsung PM9B1 512G SSD found in some Lenovo Yoga 7 14ARB7 laptop units reports eui as 0001000200030004 when resuming from s2idle, causing the device to be removed with this error in dmesg: nvme nvme0: identifiers changed for nsid 1 To fix this, add a quirk to ignore namespace identifiers for this device. Signed-off-by: August Wikerfors <git@augustwikerfors.se> Signed-off-by: Keith Busch <kbusch@kernel.org>	2023-08-01 13:28:29 -07:00
Ming Lei	29b434d1e4	nvme-rdma: fix potential unbalanced freeze & unfreeze Move start_freeze into nvme_rdma_configure_io_queues(), and there is at least two benefits: 1) fix unbalanced freeze and unfreeze, since re-connection work may fail or be broken by removal 2) IO during error recovery can be failfast quickly because nvme fabrics unquiesces queues after teardown. One side-effect is that !mpath request may timeout during connecting because of queue topo change, but that looks not one big deal: 1) same problem exists with current code base 2) compared with !mpath, mpath use case is dominant Fixes: 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic") Cc: stable@vger.kernel.org Signed-off-by: Ming Lei <ming.lei@redhat.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>	2023-07-21 00:53:33 -07:00

3236 Commits