linux

iv/linux

Author	SHA1	Message	Date
Colin Ian King	5860d9fb56	scsi: lpfc: Return NULL rather than a plain 0 integer Function lpfc_sli4_perform_vport_cvl() returns a pointer to struct lpfc_nodelist so returning a plain 0 integer isn't good practice. Fix this by returning a NULL instead. Link: https://lore.kernel.org/r/20210925224113.183040-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-28 23:21:36 -04:00
Cai Huoqing	9f80eca441	scsi: aic7xxx: Fix a function name in comments Use dma_alloc_coherent() instead of pci_alloc_consistent(). Link: https://lore.kernel.org/r/20210925132931.95-1-caihuoqing@baidu.com Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-28 23:19:18 -04:00
Cai Huoqing	8d807a0680	scsi: lpfc: Fix a function name in comments Use dma_map_sg() instead of pci_map_sg() in comments. Link: https://lore.kernel.org/r/20210925125324.1760-3-caihuoqing@baidu.com Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-28 23:18:06 -04:00
Len Baker	568778f557	scsi: advansys: Prefer struct_size() over open-coded arithmetic As noted in the "Deprecated Interfaces, Language Features, Attributes, and Conventions" documentation [1], size calculations (especially multiplication) should not be performed in memory allocator (or similar) function arguments due to the risk of them overflowing. This could lead to values wrapping around and a smaller allocation being made than the caller was expecting. Using those allocations could lead to linear overflows of heap memory and other misbehaviors. Use the struct_size() helper to do the arithmetic instead of the argument "size + count * size" in the kzalloc() function. This code was detected with the help of Coccinelle and audited and fixed manually. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments Link: https://lore.kernel.org/r/20210925114205.11377-1-len.baker@gmx.com Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Len Baker <len.baker@gmx.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-28 22:54:55 -04:00
Krzysztof Kozlowski	ce580e47e8	scsi: ufs: exynos: Unify naming Use "Samsung" and "Exynos", not the uppercase versions. Link: https://lore.kernel.org/r/20210924132658.109814-2-krzysztof.kozlowski@canonical.com Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-28 22:45:43 -04:00
James Smart	efe1dc571a	scsi: lpfc: Fix mailbox command failure during driver initialization Contention for the mailbox interface may occur during driver initialization (immediately after a function reset), between mailbox commands initiated via ioctl (bsg) and those driver requested by the driver. After setting SLI_ACTIVE flag for a port, there is a window in which the driver will allow an ioctl to be initiated while the adapter is initializing and issuing mailbox commands via polling. The polling logic then gets confused. Correct by having thread setting SLI_ACTIVE spot an active mailbox command and allow it complete before proceeding. Link: https://lore.kernel.org/r/20210921143008.64212-1-jsmart2021@gmail.com Co-developed-by: Nigel Kirkland <nkirkland2304@gmail.com> Signed-off-by: Nigel Kirkland <nkirkland2304@gmail.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-22 00:22:12 -04:00
Tong Zhang	cbd9a3347c	scsi: dc395: Fix error case unwinding dc395x_init_one()->adapter_init() might fail. In this case, the acb is already cleaned up by adapter_init(), no need to do that in adapter_uninit(acb) again. [ 1.252251] dc395x: adapter init failed [ 1.254900] RIP: 0010:adapter_uninit+0x94/0x170 [dc395x] [ 1.260307] Call Trace: [ 1.260442] dc395x_init_one.cold+0x72a/0x9bb [dc395x] Link: https://lore.kernel.org/r/20210907040702.1846409-1-ztong0001@gmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reviewed-by: Finn Thain <fthain@linux-m68k.org> Signed-off-by: Tong Zhang <ztong0001@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-22 00:19:51 -04:00
Avri Altman	322c4b29ee	scsi: ufs: core: Add temperature notification exception handling The device may notify the host of an extreme temperature by using the exception event mechanism. The exception can be raised when the device’s Tcase temperature is either too high or too low. It is essentially up to the platform to decide what further actions need to be taken. leave a placeholder for a designated vop for that. Link: https://lore.kernel.org/r/20210915060407.40-3-avri.altman@wdc.com Reviewed-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Daejun Park <daejun7.park@samsung.com> Signed-off-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-22 00:07:42 -04:00
Avri Altman	e88e2d3220	scsi: ufs: core: Probe for temperature notification support Probe the dExtendedUFSFeaturesSupport register for the device's temperature notification support and, if supported, add a hardware monitor device. Link: https://lore.kernel.org/r/20210915060407.40-2-avri.altman@wdc.com Reviewed-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Daejun Park <daejun7.park@samsung.com> Signed-off-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-22 00:07:42 -04:00
Dmitry Bogdanov	e76b7c5e25	scsi: efct: Decrease area under spinlock Under the session level spinlock node->active_ios_lock in efct_scsi_io_alloc() we are taking another spinlock for the port. This leads to contention between sessions and even between I/Os in the same session. Reduce the locked region to active_ios list for which active_ios_lock is intended. Spinlock CPU usage decreases from 18% down to 13%. IOPS are increased from 220 kIOPS to 264 kIOPS for one LUN. Link: https://lore.kernel.org/r/20210914105539.6942-4-d.bogdanov@yadro.com Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Ram Vegesna <ram.vegesna@broadcom.com> Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-22 00:04:56 -04:00
Dmitry Bogdanov	ee3dce9f38	scsi: efct: Fix nport free nport_free for an empty nport hangs the state machine waiting for mbox completion if nport is not yet attached thinking that it is attaching right now. Add a check for nport attaching state and complete nport free. Link: https://lore.kernel.org/r/20210914105539.6942-3-d.bogdanov@yadro.com Reviewed-by: Ram Vegesna <ram.vegesna@broadcom.com> Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-22 00:04:55 -04:00
Dmitry Bogdanov	8d4efd0040	scsi: efct: Add state in nport sm trace printout Similar to other state machine traces and to make debug easier, add the state name to nport sm trace printout. Link: https://lore.kernel.org/r/20210914105539.6942-2-d.bogdanov@yadro.com Reviewed-by: Ram Vegesna <ram.vegesna@broadcom.com> Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-22 00:04:55 -04:00
Bart Van Assche	a7c0520669	scsi: core: Remove include <scsi/scsi_host.h> from scsi_cmnd.h There are no dependencies in <scsi/scsi_cmnd.h> on the <scsi/scsi_host.h> header file. Hence remove the scsi_host.h include directive from scsi_cmnd.h. This include directive was introduced in February 2021 by commit af1830956dc3 ("scsi: core: Add mq_poll support to SCSI layer"). Link: https://lore.kernel.org/r/20210917212751.2676054-1-bvanassche@acm.org Cc: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-21 23:54:44 -04:00
Li Feng	7e642ca037	scsi: target: Remove unused function arguments The se_cmd is unused in these functions, just remove it. Link: https://lore.kernel.org/r/20210913083045.3670648-1-fengli@smartx.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Li Feng <fengli@smartx.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:41:24 -04:00
Peter Wang	aba3b0757b	scsi: ufs: ufs-mediatek: Change dbg select by check IP version Mediatek UFS dbg select setting is changed in new IP version. Check the IP version before setting dbg select. Link: https://lore.kernel.org/r/1630918387-8333-1-git-send-email-peter.wang@mediatek.com Signed-off-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:38:58 -04:00
Daejun Park	351b3a849a	scsi: ufs: ufshpb: Use proper power management API In ufshpb, pm_runtime_{get,put}_sync() are used to avoid unwanted runtime suspend during query requests. Whereas commit b294ff3e3449 ("scsi: ufs: core: Enable power management for wlun") modified the driver core to use ufshcd_rpm_{get,put}_sync() APIs. Switch to these APIs in HPB module as well. Link: https://lore.kernel.org/r/20210902003534epcms2p1937a0f0eeb48a441cb69f5ef13ff8430@epcms2p1 Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Daejun Park <daejun7.park@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:38:58 -04:00
ChanWoo Lee	c4adf171e8	scsi: ufs: ufs-qcom: Remove unneeded variable 'err' 'err' is never set in this functon. Remove the declaration and just return 0. Link: https://lore.kernel.org/r/20210907044111.29632-1-cw9316.lee@samsung.com Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:38:58 -04:00
Muneendra Kumar	e9d73bfa8e	scsi: documentation: Document Fibre Channel sysfs node for appid Update documentation for sysfs node within /sys/class/fc/fc_udev_device/. Link: https://lore.kernel.org/r/20210913015853.2086512-1-muneendra.kumar@broadcom.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:38:58 -04:00
Len Baker	0a5e20fc8c	scsi: elx: libefc: Prefer kcalloc() over open coded arithmetic As noted in the "Deprecated Interfaces, Language Features, Attributes, and Conventions" documentation [1], size calculations (especially multiplication) should not be performed in memory allocator (or similar) function arguments due to the risk of them overflowing. This could lead to values wrapping around and a smaller allocation being made than the caller was expecting. Using those allocations could lead to linear overflows of heap memory and other misbehaviors. Use the purpose specific kcalloc() function instead of the argument count * size in the kzalloc() function. [1] https://www.kernel.org/doc/html/v5.14/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments Link: https://lore.kernel.org/r/20210905062448.6587-1-len.baker@gmx.com Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Len Baker <len.baker@gmx.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:22 -04:00
James Smart	0d6b26795b	scsi: lpfc: Update lpfc version to 14.0.0.2 Update lpfc version to 14.0.0.2. Link: https://lore.kernel.org/r/20210910233159.115896-15-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:22 -04:00
James Smart	315b3fd135	scsi: lpfc: Improve PBDE checks during SGL processing The PBDE feature, setting payload buffer address explicitly in the WQE so it doesn't have to be fetched from the SGL, only makes sense when there is a single buffer for the I/O. When there are multiple buffers it actually hurts performance as the SGL subsequently has to be fetched. Rework the SGL logic to only use PBDE when a single buffer. Link: https://lore.kernel.org/r/20210910233159.115896-14-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:22 -04:00
James Smart	afd63fa511	scsi: lpfc: Zero CGN stats only during initial driver load and stat reset Currently congestion management framework results are cleared whenever the framework settings changed (such as it being turned off then back on). This unfortunately means prior stats, rolled up to higher time windows lose meaning. Change such that stats are not cleared. Thus they pause and resume with prior values still being considered. Link: https://lore.kernel.org/r/20210910233159.115896-13-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:21 -04:00
James Smart	3ea998cbf9	scsi: lpfc: Fix I/O block after enabling managed congestion mode If the congestion management framework dynamically enables, it may do so while I/O is in flight. The updates of cmf info due to inflight I/O completing may happen before values have been initialized. Fix by ensure cmf_max_bytes_per_interval is initialized when checking bandwidth utilization for SCSI layer blocking. Link: https://lore.kernel.org/r/20210910233159.115896-12-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:21 -04:00
James Smart	d5ac69b332	scsi: lpfc: Adjust bytes received vales during cmf timer interval The newly added congestion mgmt framework is seeing unexpected congestion FPINs and signals. In analysis, time values given to the adapter are not at hard time intervals. Thus the drift vs the transfer count seen is affecting how the framework manages things. Adjust counters to cover the drift. Link: https://lore.kernel.org/r/20210910233159.115896-11-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:21 -04:00
James Smart	25ac2c970b	scsi: lpfc: Fix EEH support for NVMe I/O Injecting errors on the PCI slot while the driver is handling NVMe I/O will cause crashes and hangs. There are several rather difficult scenarios occurring. The main issue is that the adapter can report a PCI error before or simultaneously to the PCI subsystem reporting the error. Both paths have different entry points and currently there is no interlock between them. Thus multiple teardown paths are competing and all heck breaks loose. Complicating things is the NVMs path. To a large degree, I/O was able to be shutdown for a full FC port on the SCSI stack. But on NVMe, there isn't a similar call. At best, it works on a per-controller basis, but even at the controller level, it's a controller "reset" call. All of which means I/O is still flowing on different CPUs with reset paths expecting hw access (mailbox commands) to execute properly. The following modifications are made: - A new flag is set in PCI error entrypoints so the driver can track being called by that path. - An interlock is added in the SLI hw error path and the PCI error path such that only one of the paths proceeds with the teardown logic. - RPI cleanup is patched such that RPIs are marked unregistered w/o mbx cmds in cases of hw error. - If entering the SLI port re-init calls, a case where SLI error teardown was quick and beat the PCI calls now reporting error, check whether the SLI port is still live on the PCI bus. - In the PCI reset code to bring the adapter back, recheck the IRQ settings. Different checks for SLI3 vs SLI4. - In I/O completions, that may be called as part of the cleanup or underway just before the hw error, check the state of the adapter. If in error, shortcut handling that would expect further adapter completions as the hw error won't be sending them. - In routines waiting on I/O completions, which may have been in progress prior to the hw error, detect the device is being torn down and abort from their waits and just give up. This points to a larger issue in the driver on ref-counting for data structures, as it doesn't have ref-counting on q and port structures. We'll do this fix for now as it would be a major rework to be done differently. - Fix the NVMe cleanup to simulate NVMe I/O completions if I/O is being failed back due to hw error. - In I/O buf allocation, done at the start of new I/Os, check hw state and fail if hw error. Link: https://lore.kernel.org/r/20210910233159.115896-10-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:21 -04:00
James Smart	cd8a36a90b	scsi: lpfc: Fix FCP I/O flush functionality for TMF routines A prior patch inadvertently caused lpfc_sli_sum_iocb() to exclude counting of outstanding aborted I/Os and ABORT IOCBs. Thus, lpfc_reset_flush_io_context() called from any TMF routine does not properly wait to flush all outstanding FCP IOCBs leading to a block layer crash on an invalid scsi_cmnd->request pointer. kernel BUG at ../block/blk-core.c:1489! RIP: 0010:blk_requeue_request+0xaf/0xc0 ... Call Trace: <IRQ> __scsi_queue_insert+0x90/0xe0 [scsi_mod] blk_done_softirq+0x7e/0x90 __do_softirq+0xd2/0x280 irq_exit+0xd5/0xe0 do_IRQ+0x4c/0xd0 common_interrupt+0x87/0x87 </IRQ> Fix by separating out the LPFC_IO_FCP, LPFC_IO_ON_TXCMPLQ, LPFC_DRIVER_ABORTED, and CMD_ABORT_XRI_CN \|\| CMD_CLOSE_XRI_CN checks into a new lpfc_sli_validate_fcp_iocb_for_abort() routine when determining to build an ABORT iocb. Restore lpfc_reset_flush_io_context() functionality by including counting of outstanding aborted IOCBs and ABORT IOCBs in lpfc_sli_sum_iocb(). Link: https://lore.kernel.org/r/20210910233159.115896-9-jsmart2021@gmail.com Fixes: e1364711359f ("scsi: lpfc: Fix illegal memory access on Abort IOCBs") Cc: <stable@vger.kernel.org> # v5.12+ Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:21 -04:00
James Smart	b507357f79	scsi: lpfc: Fix NVMe I/O failover to non-optimized path Currently, we hold off unregistering with NVMe transport layer until GID_FT or ADISC completes upon receipt of RSCN. In the ADISC discovery routine, for nodes not found in the GID_FT response, the nodes are unregistered from the SCSI transport but not UNREG_RPI'd. Meaning outstanding WQEs continue to be outstanding and were not failed back to the OS. If an NVMe device, this mean there wasn't initial termination of the I/Os so they could be issued on a different NVMe path. Fix by unregistering the RPI so that I/O is cancelled. Link: https://lore.kernel.org/r/20210910233159.115896-8-jsmart2021@gmail.com Fixes: 0614568361b0 ("scsi: lpfc: Delay unregistering from transport until GIDFT or ADISC completes") Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:21 -04:00
James Smart	a864ee709b	scsi: lpfc: Don't remove ndlp on PRLI errors in P2P mode In pt-2-pt mode, the initiator does not log into the target after a PRLI error. In pt-2-pt mode, the target responded to the PRLI by sending a LOGO. The LOGO causes all ELS and I/Os to be aborted. This caused the PRLI to fail. The PRLI completion path caused the discovery node to be dropped to avoid being stick in an UNUSED (not logged in) state. As the node was dropped there is no retry of the login and as it is pt-2-pt, there is no RSCN to retrigger discovery. Thus the other end is not seen by the OS. Fix by ensuring the discovery node is not dropped if connecting pt-2-pt. This will cause PLOGI to be retried. Link: https://lore.kernel.org/r/20210910233159.115896-7-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:21 -04:00
James Smart	3a874488d2	scsi: lpfc: Fix rediscovery of tape device after LIP On link up and node discovery, a remote port is registered with the SCSI transport and the driver sets fc4_xpt_flags to track transport registration. A link down event causes the driver to deregister with the SCSI transport, starting the devloss timer, and calls a local unreg routine to clear the login state. Part of the login state is the fc4_xpt_flags. However, with tape devices that support sequence level error recovery, which wants to preserve the login, the local unreg routine is skipped, thus the flags aren't cleared. A subsequent link up, ADISC is performed and the lpfc_nlp_reg_node() routine is called. As the fc4_xpt_flags is not clear, it's believed the node is already registered with the transport. Unfortunately, the registration was already terminated. Eventually the devloss tmo timer expires and tears down the device. Fix by ensuring the tape device, known by the ADISC flag, is always unregistered if the link drops. Link: https://lore.kernel.org/r/20210910233159.115896-6-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:21 -04:00
James Smart	88f7702984	scsi: lpfc: Fix hang on unload due to stuck fport node A test scenario encountered an unload hang while an FLOGI ELS was in flight when a link down condition occurred. The driver fails unload as it never releases the fport node. For most nodes, when the link drops, devloss tmo is started and the timeout will cause the final node release. For the Fport, as it has not yet registered with the SCSI transport, there is no devloss timer to be started, so there is no final release. Additionally, the link down sequence causes ABORTS to be issued for pending ELS's. The completions from the ABORTS perform the release of node references. However, as the adapter is being reset to be unloaded, those completions will never occur. Fix by the following: - In the ELS cleanup, recognize when unloading and place the ELS's on a different list that immediately cleans up/completes the ELS's. It's recognized that this condition primarily affects only the fport, with other ports having normal clean up logic that handles things. - Resolve the devloss issue by, when cleaning up nodes on after link down, recognizing when the fabric node does not have a completed state (its state is UNUSED) and removing a reference so the node can delete after the ELS reference is released. Link: https://lore.kernel.org/r/20210910233159.115896-5-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:21 -04:00
James Smart	20d2279f90	scsi: lpfc: Fix premature rpi release for unsolicited TPLS and LS_RJT A test scenario has a target issuing a TPLS after accepting the driver's PRLI. TPLS is not supported by the driver so it rejects the ELS. However, the reject was only happening on the primary N_Port. If the TPLS was to a NPIV vport, not only would it reject the ELS, but it would act on the TPLS, starting devloss, then unregister from the SCSI transport and release the node. When devloss expired, it would access the node again and cause a page faul. Fix by altering the NPIV code to recognize that a correctly registered node can reject unsolicited ELS I/O and to not unregister with the SCSI transport and tear the node down. Add a check of the fc4_xpt_flags so that only a zero value allows the unreg and teardown. Link: https://lore.kernel.org/r/20210910233159.115896-4-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:20 -04:00
James Smart	982fc3965d	scsi: lpfc: Don't release final kref on Fport node while ABTS outstanding In a rarely executed path, FLOGI failure, there is a refcounting error. If FLOGI completed with an error, typically a timeout, the initial completion handler would remove the job reference. However, the job completion isn't the actual end of the job/exchange as the timeout usually initiates an ABTS, and upon that ABTS completion, a final completion is sent. The driver removes the reference again in the final completion. Thus the imbalance. In the buggy cases, if there was a link bounce while the delayed response is outstanding, the fport node may be referenced again but there was no additional reference as it is already present. The delayed completion then occurs and removes the last reference freeing the node and causing issues in the link up processed that is using the node. Fix this scenario by removing the snippet that removed the reference in the initial FLOGI completion. The bad snippet was poorly trying to identify the FLOGI as OK to do so by realizing the node was not registered with either SCSI or NVMe transport. Link: https://lore.kernel.org/r/20210910233159.115896-3-jsmart2021@gmail.com Fixes: 618e2ee146d4 ("scsi: lpfc: Fix FLOGI failure due to accessing a freed node") Cc: <stable@vger.kernel.org> # v5.13+ Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:20 -04:00
James Smart	99154581b0	scsi: lpfc: Fix list_add() corruption in lpfc_drain_txq() When parsing the txq list in lpfc_drain_txq(), the driver attempts to pass the requests to the adapter. If such an attempt fails, a local "fail_msg" string is set and a log message output. The job is then added to a completions list for cancellation. Processing of any further jobs from the txq list continues, but since "fail_msg" remains set, jobs are added to the completions list regardless of whether a wqe was passed to the adapter. If successfully added to txcmplq, jobs are added to both lists resulting in list corruption. Fix by clearing the fail_msg string after adding a job to the completions list. This stops the subsequent jobs from being added to the completions list unless they had an appropriate failure. Link: https://lore.kernel.org/r/20210910233159.115896-2-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:20 -04:00
Colin Ian King	914418f369	scsi: qla2xxx: Remove redundant initialization of pointer req The pointer req is being initialized with a value that is never read, it is being updated later on. The assignment is redundant and can be removed. Link: https://lore.kernel.org/r/20210910114610.44752-1-colin.king@canonical.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Addresses-Coverity: ("Unused value")	2021-09-14 23:33:20 -04:00
Nilesh Javali	b0fe235dad	scsi: qla2xxx: Update version to 10.02.07.100-k Link: https://lore.kernel.org/r/20210908164622.19240-11-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:20 -04:00
Quinn Tran	3d33b303d4	scsi: qla2xxx: Fix use after free in eh_abort path In eh_abort path driver prematurely exits the call to upper layer. Check whether command is aborted / completed by firmware before exiting the call. 9 [ffff8b1ebf803c00] page_fault at ffffffffb0389778 [exception RIP: qla2x00_status_entry+0x48d] RIP: ffffffffc04fa62d RSP: ffff8b1ebf803cb0 RFLAGS: 00010082 RAX: 00000000ffffffff RBX: 00000000000e0000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 00000000000013d8 RDI: fffff3253db78440 RBP: ffff8b1ebf803dd0 R8: ffff8b1ebcd9b0c0 R9: 0000000000000000 R10: ffff8b1e38a30808 R11: 0000000000001000 R12: 00000000000003e9 R13: 0000000000000000 R14: ffff8b1ebcd9d740 R15: 0000000000000028 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 10 [ffff8b1ebf803cb0] enqueue_entity at ffffffffafce708f 11 [ffff8b1ebf803d00] enqueue_task_fair at ffffffffafce7b88 12 [ffff8b1ebf803dd8] qla24xx_process_response_queue at ffffffffc04fc9a6 [qla2xxx] 13 [ffff8b1ebf803e78] qla24xx_msix_rsp_q at ffffffffc04ff01b [qla2xxx] 14 [ffff8b1ebf803eb0] __handle_irq_event_percpu at ffffffffafd50714 Link: https://lore.kernel.org/r/20210908164622.19240-10-njavali@marvell.com Fixes: f45bca8c5052 ("scsi: qla2xxx: Fix double scsi_done for abort path") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Co-developed-by: David Jeffery <djeffery@redhat.com> Signed-off-by: David Jeffery <djeffery@redhat.com> Co-developed-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:20 -04:00
Manish Rangankar	3a4e1f3b3a	scsi: qla2xxx: Move heartbeat handling from DPC thread to workqueue DPC thread gets restricted due to a no-op mailbox, which is a blocking call and has a high execution frequency. To free up the DPC thread we move no-op handling to the workqueue. Also, modified qla_do_heartbeat() to send no-op MBC if we don’t have any active interrupts, but there are still I/Os outstanding with firmware. Link: https://lore.kernel.org/r/20210908164622.19240-9-njavali@marvell.com Fixes: d94d8158e184 ("scsi: qla2xxx: Add heartbeat check") Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:20 -04:00
Shreyas Deodhar	38c61709e6	scsi: qla2xxx: Call process_response_queue() in Tx path Process responses in Tx path if any available for better performance. Link: https://lore.kernel.org/r/20210908164622.19240-8-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Shreyas Deodhar <sdeodhar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:20 -04:00
Arun Easi	3ef68d4f0c	scsi: qla2xxx: Fix kernel crash when accessing port_speed sysfs file Kernel crashes when accessing port_speed sysfs file. The issue happens on a CNA when the local array was accessed beyond bounds. Fix this by changing the lookup. BUG: unable to handle kernel paging request at 0000000000004000 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 15 PID: 455213 Comm: sosreport Kdump: loaded Not tainted 4.18.0-305.7.1.el8_4.x86_64 #1 RIP: 0010:string_nocheck+0x12/0x70 Code: 00 00 4c 89 e2 be 20 00 00 00 48 89 ef e8 86 9a 00 00 4c 01 e3 eb 81 90 49 89 f2 48 89 ce 48 89 f8 48 c1 fe 30 66 85 f6 74 4f <44> 0f b6 0a 45 84 c9 74 46 83 ee 01 41 b8 01 00 00 00 48 8d 7c 37 RSP: 0018:ffffb5141c1afcf0 EFLAGS: 00010286 RAX: ffff8bf4009f8000 RBX: ffff8bf4009f9000 RCX: ffff0a00ffffff04 RDX: 0000000000004000 RSI: ffffffffffffffff RDI: ffff8bf4009f8000 RBP: 0000000000004000 R08: 0000000000000001 R09: ffffb5141c1afb84 R10: ffff8bf4009f9000 R11: ffffb5141c1afce6 R12: ffff0a00ffffff04 R13: ffffffffc08e21aa R14: 0000000000001000 R15: ffffffffc08e21aa FS: 00007fc4ebfff700(0000) GS:ffff8c717f7c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000004000 CR3: 000000edfdee6006 CR4: 00000000001706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: string+0x40/0x50 vsnprintf+0x33c/0x520 scnprintf+0x4d/0x90 qla2x00_port_speed_show+0xb5/0x100 [qla2xxx] dev_attr_show+0x1c/0x40 sysfs_kf_seq_show+0x9b/0x100 seq_read+0x153/0x410 vfs_read+0x91/0x140 ksys_read+0x4f/0xb0 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x65/0xca Link: https://lore.kernel.org/r/20210908164622.19240-7-njavali@marvell.com Fixes: 4910b524ac9e ("scsi: qla2xxx: Add support for setting port speed") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:20 -04:00
Quinn Tran	527d46e0b0	scsi: qla2xxx: edif: Use link event to wake up app Authentication application may be running and in the past tried to probe driver (app_start) but was unsuccessful. This could be due to the bsg layer not being ready to service the request. On a successful link up, driver will use the netlink Link Up event to notify the app to retry the app_start call. In another case, app does not poll for new NPIV host. This link up event would notify app of the presence of a new SCSI host. Link: https://lore.kernel.org/r/20210908164622.19240-6-njavali@marvell.com Fixes: 4de067e5df12 ("scsi: qla2xxx: edif: Add N2N support for EDIF") Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:19 -04:00
Arun Easi	e6e22e6cc2	scsi: qla2xxx: Fix crash in NVMe abort path System crash was seen when I/O was run against an NVMe target and aborts were occurring. Crash stack is: -- relevant crash stack -- BUG: kernel NULL pointer dereference, address: 0000000000000010 : #6 [ffffae1f8666bdd0] page_fault at ffffffffa740122e [exception RIP: qla_nvme_abort_work+339] RIP: ffffffffc0f592e3 RSP: ffffae1f8666be80 RFLAGS: 00010297 RAX: 0000000000000000 RBX: ffff9b581fc8af80 RCX: ffffffffc0f83bd0 RDX: 0000000000000001 RSI: ffff9b5839c6c7c8 RDI: 0000000008000000 RBP: ffff9b6832f85000 R8: ffffffffc0f68160 R9: ffffffffc0f70652 R10: ffffae1f862ffdc8 R11: 0000000000000300 R12: 000000000000010d R13: 0000000000000000 R14: ffff9b5839cea000 R15: 0ffff9b583fab170 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ffffae1f8666be98] process_one_work at ffffffffa6aba184 #8 [ffffae1f8666bed8] worker_thread at ffffffffa6aba39d #9 [ffffae1f8666bf10] kthread at ffffffffa6ac06ed The crash was due to a stale SRB structure access after it was aborted. Fix the issue by removing stale access. Link: https://lore.kernel.org/r/20210908164622.19240-5-njavali@marvell.com Fixes: 2cabf10dbbe3 ("scsi: qla2xxx: Fix hang on NVMe command timeouts") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:19 -04:00
Saurav Kashyap	8192817efb	scsi: qla2xxx: Check for firmware capability before creating QPair Add firmware capability check of multiQ specifically for ISP25XX before creating qpair. Link: https://lore.kernel.org/r/20210908164622.19240-4-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:19 -04:00
Saurav Kashyap	52cca50d35	scsi: qla2xxx: Display 16G only as supported speeds for 3830c card This card is unique and doesn't support lower speeds, hence update the fdmi field to display 16G only. Link: https://lore.kernel.org/r/20210908164622.19240-3-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:19 -04:00
Bikash Hazarika	9e1c320696	scsi: qla2xxx: Add support for mailbox passthru This interface will allow user space applications to send a mailbox command to the firmware. Link: https://lore.kernel.org/r/20210908164622.19240-2-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Bikash Hazarika <bhazarika@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 23:33:19 -04:00
Ajish Koshy	51e6ed83bb	scsi: pm80xx: Fix memory leak during rmmod Driver failed to release all memory allocated. This would lead to memory leak during driver removal. Properly free memory when the module is removed. Link: https://lore.kernel.org/r/20210906170404.5682-5-Ajish.Koshy@microchip.com Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Ajish Koshy <Ajish.Koshy@microchip.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 22:29:11 -04:00
Viswas G	c29737d03c	scsi: pm80xx: Correct inbound and outbound queue logging Correct inbound queue and outbound queue size in 'ib_log' and 'ob_log' sysfs entries. Link: https://lore.kernel.org/r/20210906170404.5682-4-Ajish.Koshy@microchip.com Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ajish Koshy <Ajish.Koshy@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 22:29:11 -04:00
Ajish Koshy	b27a40534e	scsi: pm80xx: Fix lockup in outbound queue management Commit 1f02beff224e ("scsi: pm80xx: Remove global lock from outbound queue processing") introduced a lock per outbound queue. Prior to that change the driver was using a global lock for all outbound queues. While processing the I/O responses and events the driver takes the outbound queue spinlock and is supposed to release it in pm8001_ccb_task_free_done() before calling command done(). Since the older code was using a global lock, pm8001_ccb_task_free_done() was releasing the global spin lock. The change that split the lock per outbound queue did not consider this and pm8001_ccb_task_free_done() was still releasing the global lock. Link: https://lore.kernel.org/r/20210906170404.5682-3-Ajish.Koshy@microchip.com Fixes: 1f02beff224e ("scsi: pm80xx: Remove global lock from outbound queue processing") Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Ajish Koshy <Ajish.Koshy@microchip.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 22:29:11 -04:00
Ajish Koshy	08d0a99213	scsi: pm80xx: Fix incorrect port value when registering a device During phyup event, the firmware provides the phy_id and port_id and driver is supposed to use these during device handle registration. Previously the driver was using the port id value from libsas during device handle registration. Since id can be different from the one assigned by firmware, this can lead to wrong device registration and drives not showing up. Use firmware assigned port id during device registration. Link: https://lore.kernel.org/r/20210906170404.5682-2-Ajish.Koshy@microchip.com Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Ajish Koshy <Ajish.Koshy@microchip.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 22:29:11 -04:00
Ding Hui	e018f03d6c	scsi: libiscsi: Move ehwait initialization to iscsi_session_setup() Commit ec29d0ac29be ("scsi: iscsi: Fix conn use after free during resets") moved member ehwait from 'conn' to 'session', but left the initialization of ehwait in iscsi_conn_setup(). Although a session can only have 1 conn currently, it is better to initialize ehwait in iscsi_session_setup() in case we implement handling multiple conns in the future. Link: https://lore.kernel.org/r/20210911135159.20543-1-dinghui@sangfor.com.cn Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-14 00:02:45 -04:00
John Garry	ce4fc333e5	scsi: libsas: Co-locate exports with symbols It is standard practice to co-locate export declarations with the symbol which is being exported. Or at least in the same file - see sas_phy_reset(). Modify libsas to follow this practice consistently. Link: https://lore.kernel.org/r/1631530296-32358-1-git-send-email-john.garry@huawei.com Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-13 23:57:39 -04:00

1 2 3 4 5 ...

1042720 Commits