linux

iv/linux

Author	SHA1	Message	Date
Uday Shankar	931c97a54c	scsi: core: Restrict legal sdev_state transitions via sysfs [ Upstream commit 2331ce6126be8864b39490e705286b66e2344aac ] Userspace can currently write to sysfs to transition sdev_state to RUNNING or OFFLINE from any source state. This causes issues because proper transitioning out of some states involves steps besides just changing sdev_state, so allowing userspace to change sdev_state regardless of the source state can result in inconsistencies; e.g. with ISCSI we can end up with sdev_state == SDEV_RUNNING while the device queue is quiesced. Any task attempting I/O on the device will then hang, and in more recent kernels, iscsid will hang as well. More detail about this bug is provided in my first attempt: https://groups.google.com/g/open-iscsi/c/PNKca4HgPDs/m/CXaDkntOAQAJ Link: https://lore.kernel.org/r/20220924000241.2967323-1-ushankar@purestorage.com Signed-off-by: Uday Shankar <ushankar@purestorage.com> Suggested-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-11-10 18:15:34 +01:00
James Smart	21d65b3516	scsi: lpfc: Rework MIB Rx Monitor debug info logic [ Upstream commit bd269188ea94e40ab002cad7b0df8f12b8f0de54 ] The kernel test robot reported the following sparse warning: arch/arm64/include/asm/cmpxchg.h:88:1: sparse: sparse: cast truncates bits from constant value (369 becomes 69) On arm64, atomic_xchg only works on 8-bit byte fields. Thus, the macro usage of LPFC_RXMONITOR_TABLE_IN_USE can be unintentionally truncated leading to all logic involving the LPFC_RXMONITOR_TABLE_IN_USE macro to not work properly. Replace the Rx Table atomic_t indexing logic with a new lpfc_rx_info_monitor structure that holds a circular ring buffer. For locking semantics, a spinlock_t is used. Link: https://lore.kernel.org/r/20220819011736.14141-4-jsmart2021@gmail.com Fixes: 17b27ac59224 ("scsi: lpfc: Add rx monitoring statistics") Cc: <stable@vger.kernel.org> # v5.15+ Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-11-10 18:15:24 +01:00
James Smart	d70705e131	scsi: lpfc: Adjust CMF total bytes and rxmonitor [ Upstream commit a6269f837045acb02904f31f05acde847ec8f8a7 ] Calculate any extra bytes needed to account for timer accuracy. If we are less than LPFC_CMF_INTERVAL, then calculate the adjustment needed for total to reflect a full LPFC_CMF_INTERVAL. Add additional info to rxmonitor, and adjust some log formatting. Link: https://lore.kernel.org/r/20211204002644.116455-7-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: bd269188ea94 ("scsi: lpfc: Rework MIB Rx Monitor debug info logic") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-11-10 18:15:23 +01:00
James Smart	9ebc6e8ad1	scsi: lpfc: Adjust bytes received vales during cmf timer interval [ Upstream commit d5ac69b332d8859d1f8bd5d4dee31f3267f6b0d2 ] The newly added congestion mgmt framework is seeing unexpected congestion FPINs and signals. In analysis, time values given to the adapter are not at hard time intervals. Thus the drift vs the transfer count seen is affecting how the framework manages things. Adjust counters to cover the drift. Link: https://lore.kernel.org/r/20210910233159.115896-11-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: bd269188ea94 ("scsi: lpfc: Rework MIB Rx Monitor debug info logic") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-11-10 18:15:23 +01:00
Yu Kuai	aad798a0b3	scsi: sd: Revert "scsi: sd: Remove a local variable" This reverts commit 84f7a9de0602704bbec774a6c7f7c8c4994bee9c. Because it introduces a problem that rq->__data_len is set to the wrong value. before the patch: 1) nr_bytes = rq->__data_len 2) rq->__data_len = sdp->sector_size 3) scsi_init_io() 4) rq->__data_len = nr_bytes after the patch: 1) rq->__data_len = sdp->sector_size 2) scsi_init_io() 3) rq->__data_len = rq->__data_len -> __data_len is wrong It will cause that io can only complete one segment each time, and the io will requeue in scsi_io_completion_action(), which will cause severe performance degradation. Scsi write same is removed in commit e383e16e84e9 ("scsi: sd: Remove WRITE_SAME support") from mainline, hence this patch is only needed for stable kernels. Fixes: 84f7a9de0602 ("scsi: sd: Remove a local variable") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:59:20 +09:00
James Smart	0b0d169723	Revert "scsi: lpfc: SLI path split: Refactor lpfc_iocbq" This reverts commit 1c5e670d6a5a8e7e99b51f45e79879f7828bd2ec. LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path Split mods were a 14-patch set, which refactors the driver from to split the sli-3 hw (now eol) from the sli-4 hw and use sli4 structures natively. The patches are highly inter-related. Given only some of the patches were included, it created a situation where FLOGI's fail, thus SLI Ports can't start communication. Reverting this patch as its one of the partial Path Split patches that was included. Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:59:14 +09:00
James Smart	7a0fce24de	Revert "scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4" This reverts commit c56cc7fefc3159049f94fb1318e48aa60cabf703. LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path Split mods were a 14-patch set, which refactors the driver from to split the sli-3 hw (now eol) from the sli-4 hw and use sli4 structures natively. The patches are highly inter-related. Given only some of the patches were included, it created a situation where FLOGI's fail, thus SLI Ports can't start communication. Reverting this patch as its one of the partial Path Split patches that was included. Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:59:14 +09:00
James Smart	7a36c9de43	Revert "scsi: lpfc: SLI path split: Refactor SCSI paths" This reverts commit b4543dbea84c8b64566dd0d626d63f8cbe977f61. LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path Split mods were a 14-patch set, which refactors the driver from to split the sli-3 hw (now eol) from the sli-4 hw and use sli4 structures natively. The patches are highly inter-related. Given only some of the patches were included, it created a situation where FLOGI's fail, thus SLI Ports can't start communication. Reverting this patch as its one of the partial Path Split patches that was included. NOTE: fixed a git revert error which caused a new line to be inserted: line 5755 of lpfc_scsi.c in lpfc_queuecommand + atomic_inc(&ndlp->cmd_pending); Removed the line Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:59:14 +09:00
James Smart	eb8be2dbfb	Revert "scsi: lpfc: Fix locking for lpfc_sli_iocbq_lookup()" This reverts commit 9a570069cdbbc28c4b5b4632d5c9369371ec739c. LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path Split mods were a 14-patch set, which refactors the driver from to split the sli-3 hw (now eol) from the sli-4 hw and use sli4 structures natively. The patches are highly inter-related. Given only some of the patches were included, it created a situation where FLOGI's fail, thus SLI Ports can't start communication. Reverting this patch as its a fix specific to the Path Split patches, which were partially included and now being pulled from 5.15. Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:59:14 +09:00
James Smart	065bf71a8a	Revert "scsi: lpfc: Fix element offset in __lpfc_sli_release_iocbq_s4()" This reverts commit 6e99860de6f4e286c386f533c4d35e7e283803d4. LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path Split mods were a 14-patch set, which refactors the driver from to split the sli-3 hw (now eol) from the sli-4 hw and use sli4 structures natively. The patches are highly inter-related. Given only some of the patches were included, it created a situation where FLOGI's fail, thus SLI Ports can't start communication. Reverting this patch as its a fix specific to the Path Split patches, which were partially included and now being pulled from 5.15. Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:59:14 +09:00
James Smart	97dc9076ea	Revert "scsi: lpfc: Resolve some cleanup issues following SLI path refactoring" This reverts commit 17bf429b913b9e7f8d2353782e24ed3a491bb2d8. LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path Split mods were a 14-patch set, which refactors the driver from to split the sli-3 hw (now eol) from the sli-4 hw and use sli4 structures natively. The patches are highly inter-related. Given only some of the patches were included, it created a situation where FLOGI's fail, thus SLI Ports can't start communication. Reverting this patch as its a fix specific to the Path Split patches, which were partially included and now being pulled from 5.15. Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:59:13 +09:00
Manish Rangankar	a2f0934e6b	scsi: qla2xxx: Use transport-defined speed mask for supported_speeds commit 0b863257c17c5f57a41e0a48de140ed026957a63 upstream. One of the sysfs values reported for supported_speeds was not valid (20Gb/s reported instead of 64Gb/s). Instead of driver internal speed mask definition, use speed mask defined in transport_fc for reporting host->supported_speeds. Link: https://lore.kernel.org/r/20220927115946.17559-1-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:59:12 +09:00
Rafael Mendonca	9749595feb	scsi: lpfc: Fix memory leak in lpfc_create_port() [ Upstream commit dc8e483f684a24cc06e1d5fa958b54db58855093 ] Commit 5e633302ace1 ("scsi: lpfc: vmid: Add support for VMID in mailbox command") introduced allocations for the VMID resources in lpfc_create_port() after the call to scsi_host_alloc(). Upon failure on the VMID allocations, the new code would branch to the 'out' label, which returns NULL without unwinding anything, thus skipping the call to scsi_host_put(). Fix the problem by creating a separate label 'out_free_vmid' to unwind the VMID resources and make the 'out_put_shost' label call only scsi_host_put(), as was done before the introduction of allocations for VMID. Fixes: 5e633302ace1 ("scsi: lpfc: vmid: Add support for VMID in mailbox command") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Link: https://lore.kernel.org/r/20220916035908.712799-1-rafaelmendsr@gmail.com Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-29 10:12:56 +02:00
Letu Ren	9d54de8660	scsi: 3w-9xxx: Avoid disabling device if failing to enable it [ Upstream commit 7eff437b5ee1309b34667844361c6bbb5c97df05 ] The original code will "goto out_disable_device" and call pci_disable_device() if pci_enable_device() fails. The kernel will generate a warning message like "3w-9xxx 0000:00:05.0: disabling already-disabled device". We shouldn't disable a device that failed to be enabled. A simple return is fine. Link: https://lore.kernel.org/r/20220829110115.38789-1-fantasquex@gmail.com Reported-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: Letu Ren <fantasquex@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-26 12:35:46 +02:00
Mike Christie	a26b065875	scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername() [ Upstream commit 57569c37f0add1b6489e1a1563c71519daf732cf ] Fix a NULL pointer crash that occurs when we are freeing the socket at the same time we access it via sysfs. The problem is that: 1. iscsi_sw_tcp_conn_get_param() and iscsi_sw_tcp_host_get_param() take the frwd_lock and do sock_hold() then drop the frwd_lock. sock_hold() does a get on the "struct sock". 2. iscsi_sw_tcp_release_conn() does sockfd_put() which does the last put on the "struct socket" and that does __sock_release() which sets the sock->ops to NULL. 3. iscsi_sw_tcp_conn_get_param() and iscsi_sw_tcp_host_get_param() then call kernel_getpeername() which accesses the NULL sock->ops. Above we do a get on the "struct sock", but we needed a get on the "struct socket". Originally, we just held the frwd_lock the entire time but in commit bcf3a2953d36 ("scsi: iscsi: iscsi_tcp: Avoid holding spinlock while calling getpeername()") we switched to refcount based because the network layer changed and started taking a mutex in that path, so we could no longer hold the frwd_lock. Instead of trying to maintain multiple refcounts, this just has us use a mutex for accessing the socket in the interface code paths. Link: https://lore.kernel.org/r/20220907221700.10302-1-michael.christie@oracle.com Fixes: bcf3a2953d36 ("scsi: iscsi: iscsi_tcp: Avoid holding spinlock while calling getpeername()") Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-26 12:35:16 +02:00
Mike Christie	e87fb1fcf8	scsi: iscsi: Run recv path from workqueue [ Upstream commit f1d269765ee29da56b32818b7a08054484ed89f2 ] We don't always want to run the recv path from the network softirq because when we have to have multiple sessions sharing the same CPUs, some sessions can eat up the NAPI softirq budget and affect other sessions or users. Allow us to queue the recv handling to the iscsi workqueue so we can have the scheduler/wq code try to balance the work and CPU use across all sessions' worker threads. Note: It wasn't the original intent of the change but a nice side effect is that for some workloads/configs we get a nice performance boost. For a simple read heavy test: fio --direct=1 --filename=/dev/dm-0 --rw=randread --bs=256K --ioengine=libaio --iodepth=128 --numjobs=4 where the iscsi threads, fio jobs, and rps_cpus share CPUs we see a 32% throughput boost. We also see increases for small I/O IOPs tests but it's not as high. Link: https://lore.kernel.org/r/20220616224557.115234-4-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 57569c37f0ad ("scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername()") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-26 12:35:16 +02:00
Mike Christie	c2af03a7c1	scsi: iscsi: Add recv workqueue helpers [ Upstream commit 8af809966c0b34cfacd8da9a412689b8e9910354 ] Add helpers to allow the drivers to run their recv paths from libiscsi's workqueue. Link: https://lore.kernel.org/r/20220616224557.115234-3-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 57569c37f0ad ("scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername()") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-26 12:35:15 +02:00
Mike Christie	d6aafc21be	scsi: iscsi: Rename iscsi_conn_queue_work() [ Upstream commit 4b9f8ce4d5e823e42944c5a0a4842b0f936365ad ] Rename iscsi_conn_queue_work() to iscsi_conn_queue_xmit() to reflect that it handles queueing of xmits only. Link: https://lore.kernel.org/r/20220616224557.115234-2-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 57569c37f0ad ("scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername()") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-26 12:35:15 +02:00
Duoming Zhou	e45a1516d2	scsi: libsas: Fix use-after-free bug in smp_execute_task_sg() [ Upstream commit 46ba53c30666717cb06c2b3c5d896301cd00d0c0 ] When executing SMP task failed, the smp_execute_task_sg() calls del_timer() to delete "slow_task->timer". However, if the timer handler sas_task_internal_timedout() is running, the del_timer() in smp_execute_task_sg() will not stop it and a UAF will happen. The process is shown below: (thread 1) \| (thread 2) smp_execute_task_sg() \| sas_task_internal_timedout() ... \| del_timer() \| ... \| ... sas_free_task(task) \| kfree(task->slow_task) //FREE\| \| task->slow_task->... //USE Fix by calling del_timer_sync() in smp_execute_task_sg(), which makes sure the timer handler have finished before the "task->slow_task" is deallocated. Link: https://lore.kernel.org/r/20220920144213.10536-1-duoming@zju.edu.cn Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver") Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-26 12:35:15 +02:00
Saurav Kashyap	e0b1c16fda	scsi: qedf: Populate sysfs attributes for vport commit 592642e6b11e620e4b43189f8072752429fc8dc3 upstream. Few vport parameters were displayed by systool as 'Unknown' or 'NULL'. Copy speed, supported_speed, frame_size and update port_type for NPIV port. Link: https://lore.kernel.org/r/20220919134434.3513-1-njavali@marvell.com Cc: stable@vger.kernel.org Tested-by: Guangwu Zhang <guazhang@redhat.com> Reviewed-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-26 12:34:26 +02:00
Linus Torvalds	76efb4897b	scsi: stex: Properly zero out the passthrough command structure commit 6022f210461fef67e6e676fd8544ca02d1bcfa7a upstream. The passthrough structure is declared off of the stack, so it needs to be set to zero before copied back to userspace to prevent any unintentional data leakage. Switch things to be statically allocated which will fill the unused fields with 0 automatically. Link: https://lore.kernel.org/r/YxrjN3OOw2HHl9tx@kroah.com Cc: stable@kernel.org Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Reported-by: hdthky <hdthky0@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-15 07:59:01 +02:00
Letu Ren	1030659dac	scsi: qedf: Fix a UAF bug in __qedf_probe() [ Upstream commit fbfe96869b782364caebae0445763969ddb6ea67 ] In __qedf_probe(), if qedf->cdev is NULL which means qed_ops->common->probe() failed, then the program will goto label err1, and scsi_host_put() will free lport->host pointer. Because the memory qedf points to is allocated by libfc_host_alloc(), it will be freed by scsi_host_put(). However, the if statement below label err0 only checks whether qedf is NULL but doesn't check whether the memory has been freed. So a UAF bug can occur. There are two ways to reach the statements below err0. The first one is described as before, "qedf" should be set to NULL. The second one is goto "err0" directly. In the latter scenario qedf hasn't been changed and it has the initial value NULL. As a result the if statement is not reachable in any situation. The KASAN logs are as follows: [ 2.312969] BUG: KASAN: use-after-free in __qedf_probe+0x5dcf/0x6bc0 [ 2.312969] [ 2.312969] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 2.312969] Call Trace: [ 2.312969] dump_stack_lvl+0x59/0x7b [ 2.312969] print_address_description+0x7c/0x3b0 [ 2.312969] ? __qedf_probe+0x5dcf/0x6bc0 [ 2.312969] __kasan_report+0x160/0x1c0 [ 2.312969] ? __qedf_probe+0x5dcf/0x6bc0 [ 2.312969] kasan_report+0x4b/0x70 [ 2.312969] ? kobject_put+0x25d/0x290 [ 2.312969] kasan_check_range+0x2ca/0x310 [ 2.312969] __qedf_probe+0x5dcf/0x6bc0 [ 2.312969] ? selinux_kernfs_init_security+0xdc/0x5f0 [ 2.312969] ? trace_rpm_return_int_rcuidle+0x18/0x120 [ 2.312969] ? rpm_resume+0xa5c/0x16e0 [ 2.312969] ? qedf_get_generic_tlv_data+0x160/0x160 [ 2.312969] local_pci_probe+0x13c/0x1f0 [ 2.312969] pci_device_probe+0x37e/0x6c0 Link: https://lore.kernel.org/r/20211112120641.16073-1-fantasquex@gmail.com Reported-by: Zheyu Ma <zheyuma97@gmail.com> Acked-by: Saurav Kashyap <skashyap@marvell.com> Co-developed-by: Wende Tan <twd2.me@gmail.com> Signed-off-by: Wende Tan <twd2.me@gmail.com> Signed-off-by: Letu Ren <fantasquex@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-12 09:53:27 +02:00
Sreekanth Reddy	8b2ab46b6c	scsi: mpt3sas: Fix return value check of dma_get_required_mask() [ Upstream commit e0e0747de0ea3dd87cdbb0393311e17471a9baf1 ] Fix the incorrect return value check of dma_get_required_mask(). Due to this incorrect check, the driver was always setting the DMA mask to 63 bit. Link: https://lore.kernel.org/r/20220913120538.18759-2-sreekanth.reddy@broadcom.com Fixes: ba27c5cf286d ("scsi: mpt3sas: Don't change the DMA coherent mask after allocations") Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-28 11:11:48 +02:00
Rafael Mendonca	89df49e561	scsi: qla2xxx: Fix memory leak in __qlt_24xx_handle_abts() [ Upstream commit 601be20fc6a1b762044d2398befffd6bf236cebf ] Commit 8f394da36a36 ("scsi: qla2xxx: Drop TARGET_SCF_LOOKUP_LUN_FROM_TAG") made the __qlt_24xx_handle_abts() function return early if tcm_qla2xxx_find_cmd_by_tag() didn't find a command, but it missed to clean up the allocated memory for the management command. Link: https://lore.kernel.org/r/20220914024924.695604-1-rafaelmendsr@gmail.com Fixes: 8f394da36a36 ("scsi: qla2xxx: Drop TARGET_SCF_LOOKUP_LUN_FROM_TAG") Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-28 11:11:47 +02:00
Hannes Reinecke	87cd4c02bd	scsi: lpfc: Return DID_TRANSPORT_DISRUPTED instead of DID_REQUEUE [ Upstream commit c0a50cd389c3ed54831e240023dd12bafa56b3a6 ] When the driver hits an internal error condition returning DID_REQUEUE the I/O will be retried on the same ITL nexus. This will inhibit multipathing, resulting in endless retries even if the error could have been resolved by using a different ITL nexus. Return DID_TRANSPORT_DISRUPTED to allow for multipath to engage and route I/O to another ITL nexus. Link: https://lore.kernel.org/r/20220824060033.138661-1-hare@suse.de Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-23 14:15:50 +02:00
Yang Yingliang	1dcc308898	scsi: lpfc: Add missing destroy_workqueue() in error path commit da6d507f5ff328f346b3c50e19e19993027b8ffd upstream. Add the missing destroy_workqueue() before return from lpfc_sli4_driver_resource_setup() in the error path. Link: https://lore.kernel.org/r/20220823044237.285643-1-yangyingliang@huawei.com Fixes: 3cee98db2610 ("scsi: lpfc: Fix crash on driver unload in wq free") Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-15 11:30:03 +02:00
Sreekanth Reddy	6229fa494a	scsi: mpt3sas: Fix use-after-free warning commit 991df3dd5144f2e6b1c38b8d20ed3d4d21e20b34 upstream. Fix the following use-after-free warning which is observed during controller reset: refcount_t: underflow; use-after-free. WARNING: CPU: 23 PID: 5399 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0 Link: https://lore.kernel.org/r/20220906134908.1039-2-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-15 11:30:03 +02:00
Bart Van Assche	c501891293	scsi: ufs: core: Reduce the power mode change timeout [ Upstream commit 8f2c96420c6ec3dcb18c8be923e24c6feaa5ccf6 ] The current power mode change timeout (180 s) is so large that it can cause a watchdog timer to fire. Reduce the power mode change timeout to 10 seconds. Link: https://lore.kernel.org/r/20220811234401.1957911-1-bvanassche@acm.org Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:01 +02:00
Guixin Liu	bbfd857abb	scsi: megaraid_sas: Fix double kfree() [ Upstream commit 8c499e49240bd93628368c3588975cfb94169b8b ] When allocating log_to_span fails, kfree(instance->ctrl_context) is called twice. Remove redundant call. Link: https://lore.kernel.org/r/1659424729-46502-1-git-send-email-kanie@linux.alibaba.com Acked-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:00 +02:00
Tony Battersby	8179f0e085	scsi: qla2xxx: Disable ATIO interrupt coalesce for quad port ISP27XX [ Upstream commit 53661ded2460b414644532de6b99bd87f71987e9 ] This partially reverts commit d2b292c3f6fd ("scsi: qla2xxx: Enable ATIO interrupt handshake for ISP27XX") For some workloads where the host sends a batch of commands and then pauses, ATIO interrupt coalesce can cause some incoming ATIO entries to be ignored for extended periods of time, resulting in slow performance, timeouts, and aborted commands. Disable interrupt coalesce and re-enable the dedicated ATIO MSI-X interrupt. Link: https://lore.kernel.org/r/97dcf365-89ff-014d-a3e5-1404c6af511c@cybernetics.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-09-15 11:30:00 +02:00
Saurabh Sengar	cd2a50d0a0	scsi: storvsc: Remove WQ_MEM_RECLAIM from storvsc_error_wq commit d957e7ffb2c72410bcc1a514153a46719255a5da upstream. storvsc_error_wq workqueue should not be marked as WQ_MEM_RECLAIM as it doesn't need to make forward progress under memory pressure. Marking this workqueue as WQ_MEM_RECLAIM may cause deadlock while flushing a non-WQ_MEM_RECLAIM workqueue. In the current state it causes the following warning: [ 14.506347] ------------[ cut here ]------------ [ 14.506354] workqueue: WQ_MEM_RECLAIM storvsc_error_wq_0:storvsc_remove_lun is flushing !WQ_MEM_RECLAIM events_freezable_power_:disk_events_workfn [ 14.506360] WARNING: CPU: 0 PID: 8 at <-snip->kernel/workqueue.c:2623 check_flush_dependency+0xb5/0x130 [ 14.506390] CPU: 0 PID: 8 Comm: kworker/u4:0 Not tainted 5.4.0-1086-azure #91~18.04.1-Ubuntu [ 14.506391] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 05/09/2022 [ 14.506393] Workqueue: storvsc_error_wq_0 storvsc_remove_lun [ 14.506395] RIP: 0010:check_flush_dependency+0xb5/0x130 <-snip-> [ 14.506408] Call Trace: [ 14.506412] __flush_work+0xf1/0x1c0 [ 14.506414] __cancel_work_timer+0x12f/0x1b0 [ 14.506417] ? kernfs_put+0xf0/0x190 [ 14.506418] cancel_delayed_work_sync+0x13/0x20 [ 14.506420] disk_block_events+0x78/0x80 [ 14.506421] del_gendisk+0x3d/0x2f0 [ 14.506423] sr_remove+0x28/0x70 [ 14.506427] device_release_driver_internal+0xef/0x1c0 [ 14.506428] device_release_driver+0x12/0x20 [ 14.506429] bus_remove_device+0xe1/0x150 [ 14.506431] device_del+0x167/0x380 [ 14.506432] __scsi_remove_device+0x11d/0x150 [ 14.506433] scsi_remove_device+0x26/0x40 [ 14.506434] storvsc_remove_lun+0x40/0x60 [ 14.506436] process_one_work+0x209/0x400 [ 14.506437] worker_thread+0x34/0x400 [ 14.506439] kthread+0x121/0x140 [ 14.506440] ? process_one_work+0x400/0x400 [ 14.506441] ? kthread_park+0x90/0x90 [ 14.506443] ret_from_fork+0x35/0x40 [ 14.506445] ---[ end trace 2d9633159fdc6ee7 ]--- Link: https://lore.kernel.org/r/1659628534-17539-1-git-send-email-ssengar@linux.microsoft.com Fixes: 436ad9413353 ("scsi: storvsc: Allow only one remove lun work item to be issued per lun") Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:51 +02:00
Kiwoong Kim	2c72bead9b	scsi: ufs: core: Enable link lost interrupt commit 6d17a112e9a63ff6a5edffd1676b99e0ffbcd269 upstream. Link lost is treated as fatal error with commit c99b9b230149 ("scsi: ufs: Treat link loss as fatal error"), but the event isn't registered as interrupt source. Enable it. Link: https://lore.kernel.org/r/1659404551-160958-1-git-send-email-kwmad.kim@samsung.com Fixes: c99b9b230149 ("scsi: ufs: Treat link loss as fatal error") Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:51 +02:00
Quinn Tran	4438d54ce7	scsi: qla2xxx: edif: Fix dropped IKE message [ Upstream commit c019cd656e717349ff22d0c41d6fbfc773f48c52 ] This patch fixes IKE message being dropped due to error in processing Purex IOCB and Continuation IOCBs. Link: https://lore.kernel.org/r/20220713052045.10683-6-njavali@marvell.com Fixes: fac2807946c1 ("scsi: qla2xxx: edif: Add extraction of auth_els from the wire") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-08-31 17:16:35 +02:00
Arun Easi	bcfe37c788	scsi: qla2xxx: Fix response queue handler reading stale packets [ Upstream commit b1f707146923335849fb70237eec27d4d1ae7d62 ] On some platforms, the current logic of relying on finding new packet solely based on signature pattern can lead to driver reading stale packets. Though this is a bug in those platforms, reduce such exposures by limiting reading packets until the IN pointer. Two module parameters are introduced: ql2xrspq_follow_inptr: When set, on newer adapters that has queue pointer shadowing, look for response packets only until response queue in pointer. When reset, response packets are read based on a signature pattern logic (old way). ql2xrspq_follow_inptr_legacy: Like ql2xrspq_follow_inptr, but for those adapters where there is no queue pointer shadowing. Link: https://lore.kernel.org/r/20220713052045.10683-5-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-08-31 17:16:35 +02:00
Ren Zhijie	d66d392c72	scsi: ufs: ufs-mediatek: Fix build error and type mismatch commit f54912b228a8df6c0133e31bc75628677bb8c6e5 upstream. If CONFIG_PM_SLEEP is not set. make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-, will fail: drivers/ufs/host/ufs-mediatek.c: In function ‘ufs_mtk_vreg_fix_vcc’: drivers/ufs/host/ufs-mediatek.c:688:46: warning: format ‘%u’ expects argument of type ‘unsigned int’, but argument 4 has type ‘long unsigned int’ [-Wformat=] snprintf(vcc_name, MAX_VCC_NAME, "vcc-opt%u", res.a1); ~^ ~~~~~~ %lu drivers/ufs/host/ufs-mediatek.c: In function ‘ufs_mtk_system_suspend’: drivers/ufs/host/ufs-mediatek.c:1371:8: error: implicit declaration of function ‘ufshcd_system_suspend’; did you mean ‘ufs_mtk_system_suspend’? [-Werror=implicit-function-declaration] ret = ufshcd_system_suspend(dev); ^~~~~~~~~~~~~~~~~~~~~ ufs_mtk_system_suspend drivers/ufs/host/ufs-mediatek.c: In function ‘ufs_mtk_system_resume’: drivers/ufs/host/ufs-mediatek.c:1386:9: error: implicit declaration of function ‘ufshcd_system_resume’; did you mean ‘ufs_mtk_system_resume’? [-Werror=implicit-function-declaration] return ufshcd_system_resume(dev); ^~~~~~~~~~~~~~~~~~~~ ufs_mtk_system_resume cc1: some warnings being treated as errors The declaration of func "ufshcd_system_suspend()" depends on CONFIG_PM_SLEEP, so the function wrapper ufs_mtk_system_suspend() should wrapped by CONFIG_PM_SLEEP too. Link: https://lore.kernel.org/r/20220619115432.205504-1-renzhijie2@huawei.com Fixes: 3fd23b8dfb54 ("scsi: ufs: ufs-mediatek: Fix the timing of configuring device regulators") Reported-by: Hulk Robot <hulkci@huawei.com> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Ren Zhijie <renzhijie2@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [only take the suspend/resume portion of the commit - gregkh] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-25 11:40:46 +02:00
James Smart	9c8e2e6072	scsi: lpfc: Fix possible memory leak when failing to issue CMF WQE [ Upstream commit 2f67dc7970bce3529edce93a0a14234d88b3fcd5 ] There is no corresponding free routine if lpfc_sli4_issue_wqe fails to issue the CMF WQE in lpfc_issue_cmf_sync_wqe. If ret_val is non-zero, then free the iocbq request structure. Link: https://lore.kernel.org/r/20220701211425.2708-6-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-08-25 11:40:35 +02:00
James Smart	b92506dc51	scsi: lpfc: Prevent buffer overflow crashes in debugfs with malformed user input [ Upstream commit f8191d40aa612981ce897e66cda6a88db8df17bb ] Malformed user input to debugfs results in buffer overflow crashes. Adapt input string lengths to fit within internal buffers, leaving space for NULL terminators. Link: https://lore.kernel.org/r/20220701211425.2708-3-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-08-25 11:40:34 +02:00
Po-Wen Kao	4d6bab8d36	scsi: ufs: ufs-mediatek: Fix the timing of configuring device regulators [ Upstream commit 3fd23b8dfb54d9b74eba6dfdd3225db3ac116785 ] Currently the LPM configurations of device regulators may not work since VCC is not disabled yet while ufs_mtk_vreg_set_lpm() is executed. Fix this by changing the timing of invoking ufs_mtk_vreg_set_lpm(). Link: https://lore.kernel.org/r/20220616053725.5681-5-stanley.chu@mediatek.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Po-Wen Kao <powen.kao@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-08-25 11:40:32 +02:00
James Smart	17bf429b91	scsi: lpfc: Resolve some cleanup issues following SLI path refactoring commit e27f05147bff21408c1b8410ad8e90cd286e7952 upstream. Following refactoring and consolidation in SLI processing, fix up some minor issues related to SLI path: - Correct the setting of LPFC_EXCHANGE_BUSY flag in response IOCB. - Fix some typographical errors. - Fix duplicate log messages. Link: https://lore.kernel.org/r/20220603174329.63777-4-jsmart2021@gmail.com Fixes: 1b64aa9eae28 ("scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4") Cc: <stable@vger.kernel.org> # v5.18 Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-17 14:24:31 +02:00
James Smart	6e99860de6	scsi: lpfc: Fix element offset in __lpfc_sli_release_iocbq_s4() commit 84c6f99e39074d45f75986e42ca28e27c140fd0d upstream. The prior commit that moved from iocb elements to explicit wqe elements missed a name change. Correct __lpfc_sli_release_iocbq_s4() to reference wqe rather than iocb. Link: https://lore.kernel.org/r/20220506035519.50908-2-jsmart2021@gmail.com Fixes: a680a9298e7b ("scsi: lpfc: SLI path split: Refactor lpfc_iocbq") Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-17 14:24:31 +02:00
James Smart	9a570069cd	scsi: lpfc: Fix locking for lpfc_sli_iocbq_lookup() commit c26bd6602e1d348bfa754dc55e5608c922dd2801 upstream. The rules changed for lpfc_sli_iocbq_lookup() vs locking. Prior, the routine properly took out the lock. In newly refactored code, the locks must be held when calling the routine. Fix lpfc_sli_process_sol_iocb() to take the locks before calling the routine. Fix lpfc_sli_handle_fast_ring_event() to not release the locks to call the routine. Link: https://lore.kernel.org/r/20220323205545.81814-3-jsmart2021@gmail.com Fixes: 1b64aa9eae28 ("scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4") Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-17 14:24:31 +02:00
James Smart	2b5ef6430c	scsi: lpfc: Remove extra atomic_inc on cmd_pending in queuecommand after VMID [ Upstream commit 0948a9c5386095baae4012190a6b65aba684a907 ] VMID introduced an extra increment of cmd_pending, causing double-counting of the I/O. The normal increment ios performed in lpfc_get_scsi_buf. Link: https://lore.kernel.org/r/20220701211425.2708-5-jsmart2021@gmail.com Fixes: 33c79741deaf ("scsi: lpfc: vmid: Introduce VMID in I/O path") Cc: <stable@vger.kernel.org> # v5.14+ Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-08-17 14:24:21 +02:00
James Smart	b4543dbea8	scsi: lpfc: SLI path split: Refactor SCSI paths [ Upstream commit 3512ac0942938d6977e7999ee69765d948d2faf1 ] This patch refactors the SCSI paths to use SLI-4 as the primary interface. - Conversion away from using SLI-3 iocb structures to set/access fields in common routines. Use the new generic get/set routines that were added. This move changes code from indirect structure references to using local variables with the generic routines. - Refactor routines when setting non-generic fields, to have both SLI3 and SLI4 specific sections. This replaces the set-as-SLI3 then translate to SLI4 behavior of the past. Link: https://lore.kernel.org/r/20220225022308.16486-14-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-08-17 14:24:21 +02:00
James Smart	c56cc7fefc	scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4 [ Upstream commit 1b64aa9eae28ac598a03ed3d62a63ac5e5b295fc ] Convert the SLI4 fast and slow paths to use native SLI4 wqe constructs instead of iocb SLI3-isms. Includes the following: - Create simple get_xxx and set_xxx routines to wrapper access to common elements in both SLI3 and SLI4 commands - allowing calling routines to avoid sli-rev-specific structures to access the elements. - using the wqe in the job structure as the primary element - use defines from SLI-4, not SLI-3 - Removal of iocb to wqe conversion from fast and slow path - Add below routines to handle fast path lpfc_prep_embed_io - prepares the wqe for fast path lpfc_wqe_bpl2sgl - manages bpl to sgl conversion lpfc_sli_wqe2iocb - converts a WQE to IOCB for SLI-3 path - Add lpfc_sli3_iocb2wcqecmpl in completion path to convert an SLI-3 iocb completion to wcqe completion - Refactor some of the code that works on both revs for clarity Link: https://lore.kernel.org/r/20220225022308.16486-3-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-08-17 14:24:21 +02:00
James Smart	1c5e670d6a	scsi: lpfc: SLI path split: Refactor lpfc_iocbq [ Upstream commit a680a9298e7b4ff344aca3456177356b276e5038 ] Currently, SLI3 and SLI4 data paths use the same lpfc_iocbq structure. This is a "common" structure but many of the components refer to sli-rev specific entities which can lead the developer astray as to what they actually mean, should be set to, or when they should be used. This first patch prepares the lpfc_iocbq structure so that elements common to both SLI3 and SLI4 data paths are more appropriately named, making it clear they apply generically. Fieldnames based on 'iocb' (sli3) or 'wqe' (sli4) which are actually generic to the paths are renamed to 'cmd': - iocb_flag is renamed to cmd_flag - lpfc_vmid_iocb_tag is renamed to lpfc_vmid_tag - fabric_iocb_cmpl is renamed to fabric_cmd_cmpl - wait_iocb_cmpl is renamed to wait_cmd_cmpl - iocb_cmpl and wqe_cmpl are combined and renamed to cmd_cmpl - rsvd2 member is renamed to num_bdes due to pre-existing usage The structure name itself will retain the iocb reference as changing to a more relevant "job" or "cmd" title induces many hundreds of line changes for only a name change. lpfc_post_buffer is also renamed to lpfc_sli3_post_buffer to indicate use in the SLI3 path only. Link: https://lore.kernel.org/r/20220225022308.16486-2-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-08-17 14:24:21 +02:00
James Smart	eb36ec3039	scsi: lpfc: Fix EEH support for NVMe I/O [ Upstream commit 25ac2c970be32993f1dff607f8354f3c053d42bc ] Injecting errors on the PCI slot while the driver is handling NVMe I/O will cause crashes and hangs. There are several rather difficult scenarios occurring. The main issue is that the adapter can report a PCI error before or simultaneously to the PCI subsystem reporting the error. Both paths have different entry points and currently there is no interlock between them. Thus multiple teardown paths are competing and all heck breaks loose. Complicating things is the NVMs path. To a large degree, I/O was able to be shutdown for a full FC port on the SCSI stack. But on NVMe, there isn't a similar call. At best, it works on a per-controller basis, but even at the controller level, it's a controller "reset" call. All of which means I/O is still flowing on different CPUs with reset paths expecting hw access (mailbox commands) to execute properly. The following modifications are made: - A new flag is set in PCI error entrypoints so the driver can track being called by that path. - An interlock is added in the SLI hw error path and the PCI error path such that only one of the paths proceeds with the teardown logic. - RPI cleanup is patched such that RPIs are marked unregistered w/o mbx cmds in cases of hw error. - If entering the SLI port re-init calls, a case where SLI error teardown was quick and beat the PCI calls now reporting error, check whether the SLI port is still live on the PCI bus. - In the PCI reset code to bring the adapter back, recheck the IRQ settings. Different checks for SLI3 vs SLI4. - In I/O completions, that may be called as part of the cleanup or underway just before the hw error, check the state of the adapter. If in error, shortcut handling that would expect further adapter completions as the hw error won't be sending them. - In routines waiting on I/O completions, which may have been in progress prior to the hw error, detect the device is being torn down and abort from their waits and just give up. This points to a larger issue in the driver on ref-counting for data structures, as it doesn't have ref-counting on q and port structures. We'll do this fix for now as it would be a major rework to be done differently. - Fix the NVMe cleanup to simulate NVMe I/O completions if I/O is being failed back due to hw error. - In I/O buf allocation, done at the start of new I/Os, check hw state and fail if hw error. Link: https://lore.kernel.org/r/20210910233159.115896-10-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-08-17 14:24:20 +02:00
Arun Easi	15f67058a1	scsi: qla2xxx: Fix losing FCP-2 targets during port perturbation tests commit 58d1c124cd79ea686b512043c5bd515590b2ed95 upstream. When a mix of FCP-2 (tape) and non-FCP-2 targets are present, FCP-2 target state was incorrectly transitioned when both of the targets were gone. Fix this by ignoring state transition for FCP-2 targets. Link: https://lore.kernel.org/r/20220616053508.27186-7-njavali@marvell.com Fixes: 44c57f205876 ("scsi: qla2xxx: Changes to support FCP2 Target") Cc: stable@vger.kernel.org Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-17 14:24:18 +02:00
Arun Easi	6f1d5e6979	scsi: qla2xxx: Fix losing target when it reappears during delete commit 118b0c863c8f5629cc5271fc24d72d926e0715d9 upstream. FC target disappeared during port perturbation tests due to a race that tramples target state. Fix the issue by adding state checks before proceeding. Link: https://lore.kernel.org/r/20220616053508.27186-8-njavali@marvell.com Fixes: 44c57f205876 ("scsi: qla2xxx: Changes to support FCP2 Target") Cc: stable@vger.kernel.org Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-17 14:24:18 +02:00
Arun Easi	420e449e21	scsi: qla2xxx: Fix losing FCP-2 targets on long port disable with I/Os commit 2416ccd3815ba1613e10a6da0a24ef21acfe5633 upstream. FCP-2 devices were not coming back online once they were lost, login retries exhausted, and then came back up. Fix this by accepting RSCN when the device is not online. Link: https://lore.kernel.org/r/20220616053508.27186-10-njavali@marvell.com Fixes: 44c57f205876 ("scsi: qla2xxx: Changes to support FCP2 Target") Cc: stable@vger.kernel.org Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-17 14:24:17 +02:00
Quinn Tran	3f1102898b	scsi: qla2xxx: Wind down adapter after PCIe error commit d3117c83ba316b3200d9f2fe900f2b9a5525a25c upstream. Put adapter into a wind down state if OS does not make any attempt to recover the adapter after PCIe error. Link: https://lore.kernel.org/r/20220616053508.27186-4-njavali@marvell.com Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-17 14:24:17 +02:00

1 2 3 4 5 ...

22256 Commits