linux

iv/linux

Author	SHA1	Message	Date
Can Guo	637822e63b	scsi: ufs: core: Cancel rpm_dev_flush_recheck_work during system suspend During ufs system suspend, leaving rpm_dev_flush_recheck_work running or pending is risky because concurrency may happen between system suspend/resume and runtime resume routine. Fix this by cancelling rpm_dev_flush_recheck_work synchronously during system suspend. Link: https://lore.kernel.org/r/1619408921-30426-3-git-send-email-cang@codeaurora.org Fixes: `51dd905bd2` ("scsi: ufs: Fix WriteBooster flush during runtime suspend") Reviewed-by: Daejun Park <daejun7.park@samsung.com> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-28 23:15:17 -04:00
Can Guo	23043dd87b	scsi: ufs: core: Do not put UFS power into LPM if link is broken During resume, if link is broken due to AH8 failure, make sure ufshcd_resume() does not put UFS power back into LPM. Link: https://lore.kernel.org/r/1619408921-30426-2-git-send-email-cang@codeaurora.org Fixes: `4db7a23605` ("scsi: ufs: Fix concurrency of error handler and other error recovery paths") Reviewed-by: Daejun Park <daejun7.park@samsung.com> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-28 23:15:17 -04:00
Anastasia Kovaleva	fcb16d9a8e	scsi: qla2xxx: Prevent PRLI in target mode In a case when the initiator in P2P mode by some circumstances does not send PRLI, the target, in a case when the target port's WWPN is less than initiator's, changes the discovery state in DSC_GNL. When gnl completes it sends PRLI to the initiator. Usually the initiator in P2P mode always sends PRLI. We caught this issue on Linux stable v5.4.6 https://www.spinics.net/lists/stable/msg458515.html. Fix this particular corner case in the behaviour of the P2P mod target login state machine. Link: https://lore.kernel.org/r/20210422153414.4022-1-a.kovaleva@yadro.com Fixes: `a9ed06d4e6` ("scsi: qla2xxx: Allow PLOGI in target mode") Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-28 22:55:01 -04:00
Bikash Hazarika	000e68faef	scsi: qla2xxx: Add marginal path handling support Add support for eh_should_retry_cmd callback in qla2xxx host template. Link: https://lore.kernel.org/r/20210427050914.7270-1-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Bikash Hazarika <bhazarika@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-28 22:55:01 -04:00
Keoseong Park	2f1137140f	scsi: ufs: core: Fix a typo in ufs-sysfs.c Fix the following typo: ufschd_uic_link_state_to_string() -> ufshcd_uic_link_state_to_string() ufschd_ufs_dev_pwr_mode_to_string() -> ufshcd_ufs_dev_pwr_mode_to_string() Link: https://lore.kernel.org/r/1381713434.61619509208911.JavaMail.epsvc@epcpadp3 Signed-off-by: Keoseong Park <keosung.park@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-28 22:55:00 -04:00
James Smart	e4ec10228f	scsi: lpfc: Fix bad memory access during VPD DUMP mailbox command The dump command for reading a region passes a requested read length specified in words (4-byte units). The response overwrites the same field with the actual number of bytes read. The mailbox handler for DUMP which reads VPD data (region 23) is treating the response field as if it were still a word_cnt, thus multiplying it by 4 to set the read's "length". Given the read value was calculated based on the size of the read buffer, the longer response length runs off the end of the buffer. Fix by reworking the code to use the response field as a byte count. Link: https://lore.kernel.org/r/20210421234511.102206-1-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-26 22:58:38 -04:00
James Smart	83adbba746	scsi: lpfc: Fix DMA virtual address ptr assignment in bsg lpfc_bsg_ct_unsol_event() routine acts assigns a ct_request to the wrong structure address, resulting in a bad address that results in bsg related timeouts. Correct the ct_request assignment to use the kernel virtual buffer address (not the control structure address). Link: https://lore.kernel.org/r/20210421234448.102132-1-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-26 22:57:27 -04:00
James Smart	e136471135	scsi: lpfc: Fix illegal memory access on Abort IOCBs In devloss timer handler and in backend calls to terminate remote port I/O, there is logic to walk through all active IOCBs and validate them to potentially trigger an abort request. This logic is causing illegal memory accesses which leads to a crash. Abort IOCBs, which may be on the list, do not have an associated lpfc_io_buf struct. The driver is trying to map an lpfc_io_buf struct on the IOCB and which results in a bogus address thus the issue. Fix by skipping over ABORT IOCBs (CLOSE IOCBs are ABORTS that don't send ABTS) in the IOCB scan logic. Link: https://lore.kernel.org/r/20210421234433.102079-1-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-26 22:56:31 -04:00
Bart Van Assche	41e70e3006	scsi: sd: Introduce a new local variable in sd_check_events() Instead of using 'retval' to represent first a SCSI status and later whether or not a disk change event occurred, introduce a new variable for the latter purpose. Link: https://lore.kernel.org/r/20210415220826.29438-17-bvanassche@acm.org Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:41 -04:00
Bart Van Assche	22dc227e8f	scsi: dc395x: Open-code status_byte(u8) calls The dc395x driver is one of the two drivers that passes an u8 argument to status_byte() instead of an s32 argument. Open-code status_byte() in preparation of changing SCSI status values into a structure. Link: https://lore.kernel.org/r/20210415220826.29438-16-bvanassche@acm.org Cc: Hannes Reinecke <hare@suse.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:41 -04:00
Bart Van Assche	3940ebf7ba	scsi: 53c700: Open-code status_byte(u8) calls The 53c700 driver is one of the two drivers that passes an u8 argument to status_byte() instead of an s32 argument. Open-code status_byte in preparation of changing SCSI status values into a structure. Link: https://lore.kernel.org/r/20210415220826.29438-15-bvanassche@acm.org Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:41 -04:00
Bart Van Assche	c64aab41c5	scsi: smartpqi: Remove unused functions This was detected by building the kernel with clang and W=1. Link: https://lore.kernel.org/r/20210415220826.29438-14-bvanassche@acm.org Cc: Don Brace <don.brace@microchip.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:41 -04:00
Bart Van Assche	11417cd5e2	scsi: qla4xxx: Remove an unused function This was detected by building the kernel with clang and W=1. Link: https://lore.kernel.org/r/20210415220826.29438-13-bvanassche@acm.org Cc: Nilesh Javali <njavali@marvell.com> Acked-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:41 -04:00
Bart Van Assche	40d1373b60	scsi: myrs: Remove unused functions This was detected by building the kernel with clang and W=1. Link: https://lore.kernel.org/r/20210415220826.29438-12-bvanassche@acm.org Cc: Hannes Reinecke <hare@kernel.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:41 -04:00
Bart Van Assche	3690ad6708	scsi: myrb: Remove unused functions This was detected by building the kernel with clang and W=1. Link: https://lore.kernel.org/r/20210415220826.29438-11-bvanassche@acm.org Cc: Hannes Reinecke <hare@kernel.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:40 -04:00
Bart Van Assche	3ad0b1da0d	scsi: mpt3sas: Fix two kernel-doc headers Fix the following warnings: drivers/scsi/mpt3sas/mpt3sas_base.c:5430: warning: Excess function parameter 'ct' description in '_base_allocate_pcie_sgl_pool' drivers/scsi/mpt3sas/mpt3sas_base.c:5493: warning: Excess function parameter 'ctr' description in '_base_allocate_chain_dma_pool' Link: https://lore.kernel.org/r/20210415220826.29438-10-bvanassche@acm.org Fixes: `d6adc251dd` ("scsi: mpt3sas: Force PCIe scatterlist allocations to be within same 4 GB region") Fixes: `7dd847dae1` ("scsi: mpt3sas: Force chain buffer allocations to be within same 4 GB region") Cc: Sathya Prakash <sathya.prakash@broadcom.com> Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:40 -04:00
Bart Van Assche	be5aeee30e	scsi: fcoe: Suppress a compiler warning Suppress the following compiler warning: warning: cast to smaller integer type 'enum fip_mode' from 'void *' [-Wvoid-pointer-to-enum-cast] enum fip_mode fip_mode = (enum fip_mode)kp->arg; ^~~~~~~~~~~~~~~~~~~~~~ Link: https://lore.kernel.org/r/20210415220826.29438-9-bvanassche@acm.org Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:40 -04:00
Bart Van Assche	90d6697810	scsi: libfc: Fix a format specifier Since the 'mfs' member has been declared as 'u32' in include/scsi/libfc.h, use the %u format specifier instead of %hu. This patch fixes the following clang compiler warning: warning: format specifies type 'unsigned short' but the argument has type 'u32' (aka 'unsigned int') [-Wformat] "lport->mfs:%hu\n", mfs, lport->mfs); ~~~ ^~~~~~~~~~ %u Link: https://lore.kernel.org/r/20210415220826.29438-8-bvanassche@acm.org Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:40 -04:00
Bart Van Assche	56853f0e61	scsi: aacraid: Remove an unused function This was detected by building the kernel with clang and W=1. Link: https://lore.kernel.org/r/20210415220826.29438-7-bvanassche@acm.org Cc: aacraid@microsemi.com Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:40 -04:00
Bart Van Assche	b8e162f9e7	scsi: core: Introduce enum scsi_disposition Improve readability of the code in the SCSI core by introducing an enumeration type for the values used internally that decide how to continue processing a SCSI command. The eh_*_handler return values have not been changed because that would involve modifying all SCSI drivers. The output of the following command has been inspected to verify that no out-of-range values are assigned to a variable of type enum scsi_disposition: KCFLAGS=-Wassign-enum make CC=clang W=1 drivers/scsi/ Link: https://lore.kernel.org/r/20210415220826.29438-6-bvanassche@acm.org Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:40 -04:00
Bart Van Assche	280e91b026	scsi: core: Modify the scsi_send_eh_cmnd() return value for the SDEV_BLOCK case The comment above scsi_send_eh_cmnd() says: "Returns SUCCESS or FAILED or NEEDS_RETRY". This patch makes all values returned by scsi_send_eh_cmnd() match the documentation of this function. This change does not affect the behavior of scsi_eh_tur() nor of scsi_eh_try_stu() nor of the scsi_request_sense() callers. See also commit `bbe9fb0d04` ("scsi: Avoid that .queuecommand() gets called for a blocked SCSI device"; v5.3). Link: https://lore.kernel.org/r/20210415220826.29438-5-bvanassche@acm.org Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:40 -04:00
Bart Van Assche	0d2810cd62	scsi: core: Rename scsi_softirq_done() into scsi_complete() Commit `320ae51fee` ("blk-mq: new multi-queue block IO queueing mechanism"; v3.13) introduced a code path that calls the blk-mq completion function from interrupt context. scsi-mq was introduced by commit `d285203cf6` ("scsi: add support for a blk-mq based I/O path."; v3.17). Since the introduction of scsi-mq, scsi_softirq_done() can be called from interrupt context. That made the name of the function misleading, rename it to scsi_complete(). Link: https://lore.kernel.org/r/20210415220826.29438-4-bvanassche@acm.org Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:40 -04:00
Bart Van Assche	76fc0df9a0	scsi: core: Make the scsi_alloc_sgtables() documentation more accurate The current scsi_alloc_sgtables() documentation does not accurately explain what this function does. Hence improve the documentation of this function. Link: https://lore.kernel.org/r/20210415220826.29438-2-bvanassche@acm.org Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:44:39 -04:00
Viswas G	1f02beff22	scsi: pm80xx: Remove global lock from outbound queue processing Introduce spin lock for outbound queue. With this, driver need not acquire HBA global lock for outbound queue processing. Link: https://lore.kernel.org/r/20210415103352.3580-9-Viswas.G@microchip.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:29:00 -04:00
Viswas G	b431472bc8	scsi: pm80xx: Reset PI and CI memory during re-initialization Producer index(PI) outbound queue and consumer index(CI) for Outbound queue are in DMA memory. During resume(), the stale PI and CI Values will lead to unexpected behavior. These values should be reset to 0 during driver reinitialization. Link: https://lore.kernel.org/r/20210415103352.3580-8-Viswas.G@microchip.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:29:00 -04:00
Ruksar Devadi	4f5deeb40f	scsi: pm80xx: Completing pending I/O after fatal error When controller runs into fatal error, I/Os get stuck with no response, handler event is defined to complete the pending I/Os (SAS task and internal task) and also perform the cleanup for the drives. Link: https://lore.kernel.org/r/20210415103352.3580-7-Viswas.G@microchip.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:28:59 -04:00
Vishakha Channapattan	b0c306e621	scsi: pm80xx: Add sysfs attribute to track iop1 count A new sysfs variable 'ctl_iop1_count' is being introduced that tells if the controller is alive by indicating controller ticks. If on subsequent run we see the ticks changing that indicates that controller is not dead. Using the 'ctl_iop1_count' sysfs variable we can see ticks incrementing: linux-9saw:~# cat /sys/class/scsi_host/host*/ctl_iop1_count 0x00000069 0x0000006b 0x0000006d 0x00000072 Link: https://lore.kernel.org/r/20210415103352.3580-6-Viswas.G@microchip.com Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Vishakha Channapattan <vishakhavc@google.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com> Signed-off-by: Radha Ramachandran <radha@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:28:59 -04:00
Vishakha Channapattan	0602624ace	scsi: pm80xx: Add sysfs attribute to track iop0 count A new sysfs variable 'ctl_iop0_count' is being introduced that tells if the controller is alive by indicating controller ticks. If on subsequent run we see the ticks changing that indicates that controller is not dead. Using the 'ctl_iop0_count' sysfs variable we can see ticks incrementing: linux-9saw:~# cat /sys/class/scsi_host/host*/ctl_iop0_count 0x000000a3 0x000001db 0x000001e4 0x000001e7 Link: https://lore.kernel.org/r/20210415103352.3580-5-Viswas.G@microchip.com Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Vishakha Channapattan <vishakhavc@google.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com> Signed-off-by: Radha Ramachandran <radha@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:28:59 -04:00
Vishakha Channapattan	dd49ded8aa	scsi: pm80xx: Add sysfs attribute to track RAAE count A new sysfs variable 'ctl_raae_count' is being introduced that tells if the controller is alive by indicating controller ticks. If on subsequent run we see the ticks changing in RAAE count that indicates that controller is not dead. Using the 'ctl_raae_count' sysfs variable we can see ticks incrementing: linux-9saw:~# cat /sys/class/scsi_host/host*/ctl_raae_count 0x00002245 0x00002253 0x0000225e Link: https://lore.kernel.org/r/20210415103352.3580-4-Viswas.G@microchip.com Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Vishakha Channapattan <vishakhavc@google.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com> Signed-off-by: Radha Ramachandran <radha@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:28:59 -04:00
Vishakha Channapattan	a4c55e16c5	scsi: pm80xx: Add sysfs attribute to check controller hmi error A new sysfs variable 'ctl_hmi_error' is being introduced to give the error details if the MPI initialization fails Using the 'ctl_hmi_error' sysfs variable we can check the error details: linux-2dq0:~# cat /sys/class/scsi_host/host*/ctl_hmi_error 0x00000000 0x00000000 0x00000000 Link: https://lore.kernel.org/r/20210415103352.3580-3-Viswas.G@microchip.com Signed-off-by: Vishakha Channapattan <vishakhavc@google.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:28:59 -04:00
Vishakha Channapattan	4ddbea1b6f	scsi: pm80xx: Add sysfs attribute to check MPI state A new sysfs variable 'ctl_mpi_state' is being introduced to check the state of MPI. Using the 'ctl_mpi_state' sysfs variable we can check the MPI state: linux-2dq0:~# cat /sys/class/scsi_host/host*/ctl_mpi_state MPI is successfully initialized Link: https://lore.kernel.org/r/20210415103352.3580-2-Viswas.G@microchip.com Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Vishakha Channapattan <vishakhavc@google.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com> Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com> Signed-off-by: Radha Ramachandran <radha@google.com> Signed-off-by: kernel test robot <lkp@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:28:59 -04:00
Roman Bolshakov	f02d4086a8	scsi: qla2xxx: Reserve extra IRQ vectors Commit `a6dcfe0848` ("scsi: qla2xxx: Limit interrupt vectors to number of CPUs") lowers the number of allocated MSI-X vectors to the number of CPUs. That breaks vector allocation assumptions in qla83xx_iospace_config(), qla24xx_enable_msix() and qla2x00_iospace_config(). Either of the functions computes maximum number of qpairs as: ha->max_qpairs = ha->msix_count - 1 (MB interrupt) - 1 (default response queue) - 1 (ATIO, in dual or pure target mode) max_qpairs is set to zero in case of two CPUs and initiator mode. The number is then used to allocate ha->queue_pair_map inside qla2x00_alloc_queues(). No allocation happens and ha->queue_pair_map is left NULL but the driver thinks there are queue pairs available. qla2xxx_queuecommand() tries to find a qpair in the map and crashes: if (ha->mqenable) { uint32_t tag; uint16_t hwq; struct qla_qpair *qpair = NULL; tag = blk_mq_unique_tag(cmd->request); hwq = blk_mq_unique_tag_to_hwq(tag); qpair = ha->queue_pair_map[hwq]; # <- HERE if (qpair) return qla2xxx_mqueuecommand(host, cmd, qpair); } BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 72 Comm: kworker/u4:3 Tainted: G W 5.10.0-rc1+ #25 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Workqueue: scsi_wq_7 fc_scsi_scan_rport [scsi_transport_fc] RIP: 0010:qla2xxx_queuecommand+0x16b/0x3f0 [qla2xxx] Call Trace: scsi_queue_rq+0x58c/0xa60 blk_mq_dispatch_rq_list+0x2b7/0x6f0 ? __sbitmap_get_word+0x2a/0x80 __blk_mq_sched_dispatch_requests+0xb8/0x170 blk_mq_sched_dispatch_requests+0x2b/0x50 __blk_mq_run_hw_queue+0x49/0xb0 __blk_mq_delay_run_hw_queue+0xfb/0x150 blk_mq_sched_insert_request+0xbe/0x110 blk_execute_rq+0x45/0x70 __scsi_execute+0x10e/0x250 scsi_probe_and_add_lun+0x228/0xda0 __scsi_scan_target+0xf4/0x620 ? __pm_runtime_resume+0x4f/0x70 scsi_scan_target+0x100/0x110 fc_scsi_scan_rport+0xa1/0xb0 [scsi_transport_fc] process_one_work+0x1ea/0x3b0 worker_thread+0x28/0x3b0 ? process_one_work+0x3b0/0x3b0 kthread+0x112/0x130 ? kthread_park+0x80/0x80 ret_from_fork+0x22/0x30 The driver should allocate enough vectors to provide every CPU it's own HW queue and still handle reserved (MB, RSP, ATIO) interrupts. The change fixes the crash on dual core VM and prevents unbalanced QP allocation where nr_hw_queues is two less than the number of CPUs. Link: https://lore.kernel.org/r/20210412165740.39318-1-r.bolshakov@yadro.com Fixes: `a6dcfe0848` ("scsi: qla2xxx: Limit interrupt vectors to number of CPUs") Cc: Daniel Wagner <daniel.wagner@suse.com> Cc: Himanshu Madhani <himanshu.madhani@oracle.com> Cc: Quinn Tran <qutran@marvell.com> Cc: Nilesh Javali <njavali@marvell.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: stable@vger.kernel.org # 5.11+ Reported-by: Aleksandr Volkov <a.y.volkov@yadro.com> Reported-by: Aleksandr Miloserdov <a.miloserdov@yadro.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:06:03 -04:00
Don Brace	5cad5a5072	scsi: smartpqi: Fix device pointer variable reference static checker issue Dan Carpenter found a possible NULL pointer dereference issue in function pqi_sas_port_add_rphy(): drivers/scsi/smartpqi/smartpqi_sas_transport.c:97 pqi_sas_port_add_rphy() warn: variable dereferenced before check 'pqi_sas_port->device' (see line 95) Correct issue by moving reference of pqi_sas_port->device after the check for the device pointer being non-NULL. Link: https://www.mail-archive.com/kbuild@lists.01.org/msg06329.html Link: https://lore.kernel.org/r/161850493026.7302.10032784239320437353.stgit@brunhilda Fixes: `ec504b23df` ("scsi: smartpqi: Add phy ID support for the physical drives") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:03:32 -04:00
Don Brace	667298ceaf	scsi: smartpqi: Fix blocks_per_row static checker issue Dan Carpenter found a possible divide by 0 issue in the smartpqi driver in functions pci_get_aio_common_raid_map_values() and pqi_calc_aio_r5_or_r6(). The variable rmd->blocks_per_row is used as a divisor and could be 0. Using rmd->blocks_per_row as a divisor without checking it for 0 first. Correct these possible divide by 0 conditions by insuring that rmd->blocks_per_row is not zero before usage. The check for non-0 was too late to prevent a divide by 0 condition. Add in a comment to explain why the check for non-zero is necessary. If the member is 0, return PQI_RAID_BYPASS_INELIGIBLE before any division is performed. Link: https://lore.kernel.org/linux-scsi/YG%2F5kWHHAr7w5dU5@mwanda/ Link: https://lore.kernel.org/r/161850492435.7302.392780350442938047.stgit@brunhilda Fixes: `6702d2c40f` ("scsi: smartpqi: Add support for RAID5 and RAID6 writes") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-15 22:03:32 -04:00
Brian King	15cfef8623	scsi: ibmvfc: Fix invalid state machine BUG_ON() This fixes an issue hitting the BUG_ON() in ibmvfc_do_work(). When going through a host action of IBMVFC_HOST_ACTION_RESET, we change the action to IBMVFC_HOST_ACTION_TGT_DEL, then drop the host lock, and reset the CRQ, which changes the host state to IBMVFC_NO_CRQ. If, prior to setting the host state to IBMVFC_NO_CRQ, ibmvfc_init_host() is called, it can then end up changing the host action to IBMVFC_HOST_ACTION_INIT. If we then change the host state to IBMVFC_NO_CRQ, we will then hit the BUG_ON(). Make a couple of changes to avoid this. Leave the host action to be IBMVFC_HOST_ACTION_RESET or IBMVFC_HOST_ACTION_REENABLE until after we drop the host lock and reset or reenable the CRQ. Also harden the host state machine to ensure we cannot leave the reset / reenable state until we've finished processing the reset or reenable. Link: https://lore.kernel.org/r/20210413001009.902400-1-tyreld@linux.ibm.com Fixes: `73ee5d8672` ("[SCSI] ibmvfc: Fix soft lockup on resume") Signed-off-by: Brian King <brking@linux.vnet.ibm.com> [tyreld: added fixes tag] Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> [mkp: fix comment checkpatch warnings] Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-13 01:39:14 -04:00
James Smart	cf270817ca	scsi: lpfc: Copyright updates for 12.8.0.9 patches Update copyrights to 2021 for files modified in the 12.8.0.9 patch set. Link: https://lore.kernel.org/r/20210412013127.2387-17-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-13 01:39:14 -04:00
James Smart	3ebd25b0a4	scsi: lpfc: Update lpfc version to 12.8.0.9 Update lpfc version to 12.8.0.9 Link: https://lore.kernel.org/r/20210412013127.2387-16-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-13 01:39:14 -04:00
James Smart	5b1f5089b6	scsi: lpfc: Eliminate use of LPFC_DRIVER_NAME in lpfc_attr.c During code inspection, several cases of creating a dynamic attribute names in logs messages using a define was found. This is unnecessary. Place the native symbol name in the log messages. Link: https://lore.kernel.org/r/20210412013127.2387-15-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-13 01:39:14 -04:00
James Smart	f115612528	scsi: lpfc: Standardize discovery object logging format Code inspection showed lpfc was using three different pointer formats when logging discovery object pointers. Standardize the pointer format to x%px. Note: %px use is limited to discovery objects in order to aid core analysis. Link: https://lore.kernel.org/r/20210412013127.2387-14-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-13 01:39:14 -04:00
James Smart	3bfab8a026	scsi: lpfc: Fix various trivial errors in comments and log messages Clean up minor issues spotted by tools and code review: - Spelling Errors - Spurious characters and errors in function headers - nvme_info wqerr and err fields source data reversed - Extraneous new line in log message 0466 - Spacing error in log message 0109 - Messages 0140 and 0141 have portname and nodename reversed - Incorrect function labelling in comment Link: https://lore.kernel.org/r/20210412013127.2387-13-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-13 01:39:14 -04:00
James Smart	b62232ba8c	scsi: lpfc: Remove unsupported mbox PORT_CAPABILITIES logic SLI-4 does not contain a PORT_CAPABILITIES mailbox command (only SLI-3 does, and SLI-3 doesn't use it), yet there are SLI-4 code paths that have code to issue the command. The command will always fail. Remove the code for the mailbox command and leave only the resulting "failure path" logic. Link: https://lore.kernel.org/r/20210412013127.2387-12-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-13 01:39:14 -04:00
James Smart	d3de0d11a2	scsi: lpfc: Fix lpfc_hdw_queue attribute being ignored The lpfc_hdw_queue attribute is to set the number of hardware queues to be created on the adapter. Normally, the value is set to a default, which allows the hw queue count to be sized dynamically based on adapter capabilities, CPU/platform architecture, or CPU type. Currently, when lpfc_hdw_queue is set to a specific value, is has no effect and the dynamic sizing occurs. The routine checking whether parameters are default or not ignores the lpfc_hdw_queue setting and invokes the dynamic logic. Fix the routine to additionally check the lpfc_hdw_queue attribute value before using dynamic scaling. Additionally, SLI-3 supports only a small number of queues with dedicated functions, thus it needs to be exempted from the variable scaling and set to the expected values. Link: https://lore.kernel.org/r/20210412013127.2387-11-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-13 01:39:14 -04:00
James Smart	a314dec37c	scsi: lpfc: Fix missing FDMI registrations after Mgmt Svc login FDMI registration needs to be performed after every login with the FC Mgmt service. The flag the driver is using to track registration is cleared on link up, but never on Mgmt service logout/re-login. Fix by clearing the flag whenever a new login is completed with the FC Mgmt service. While perusing the flag use, logging was performed as if FDMI registration occurred on vports. However, it is limited to the physical port only. Revise the logging to reflect physical port based. Link: https://lore.kernel.org/r/20210412013127.2387-10-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-13 01:39:14 -04:00
James Smart	a1a553e31a	scsi: lpfc: Fix silent memory allocation failure in lpfc_sli4_bsg_link_diag_test() In the unlikely case of a failure to allocate an LPFC_MBOXQ_t structure, no return status is set, thus the routine never logs an error and returns success to the callee. Fix by setting a return code on failure. Link: https://lore.kernel.org/r/20210412013127.2387-9-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-13 01:39:13 -04:00
James Smart	724f6b43a3	scsi: lpfc: Fix use-after-free on unused nodes after port swap During target port swap, the swap logic ignores the DROPPED flag in the nodes. As a node then moves into the UNUSED state, the reference count will be dropped. If a node is later reused and moved out of the UNUSED state, an access can result in a use-after-free assert. Fix by having the port swap logic propagate the DROPPED flag when switching nodes. This will avoid reference from being dropped. Link: https://lore.kernel.org/r/20210412013127.2387-8-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-13 01:39:13 -04:00
James Smart	304ee43238	scsi: lpfc: Fix error handling for mailboxes completed in MBX_POLL mode In SLI-4, when performing a mailbox command with MBX_POLL, the driver uses the BMBX register to send the command rather than the MQ. A flag is set indicating the BMBX register is active and saves the mailbox job struct (mboxq) in the mbox_active element of the adapter. The routine then waits for completion or timeout. The mailbox job struct is not freed by the routine. In cases of timeout, the adapter will be reset. The lpfc_sli_mbox_sys_flush() routine will clean up the mbox in preparation for the reset. It clears the BMBX active flag and marks the job structure as MBX_NOT_FINISHED. But, it never frees the mboxq job structure. Expectation in both normal completion and timeout cases is that the issuer of the mbx command will free the structure. Unfortunately, not all calling paths are freeing the memory in cases of error. All calling paths were looked at and updated, if missing, to free the mboxq memory regardless of completion status. Link: https://lore.kernel.org/r/20210412013127.2387-7-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-13 01:39:13 -04:00
James Smart	4e76d4a9a2	scsi: lpfc: Fix lack of device removal on port swaps with PRLIs During target port-swap testing with link flips, the initiator could encounter PRLI errors. If the target node disappears permanently, the ndlp is found stuck in UNUSED state with ref count of 1. The rmmod of the driver will hang waiting for this node to be freed. While handling a link error in PRLI completion path, the code intends to skip triggering the discovery state machine. However this is causing the final reference release path to be skipped. This causes the node to be stuck with ref count of 1 Fix by ensuring the code path triggers the device removal event on the node state machine. Link: https://lore.kernel.org/r/20210412013127.2387-6-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-13 01:39:13 -04:00
James Smart	a789241e49	scsi: lpfc: Fix NMI crash during rmmod due to circular hbalock dependency Remove hbalock dependency for lpfc_abts_els_sgl_list and lpfc_abts_nvmet_ctx_list. The lists are adaquately synchronized with the sgl_list_lock and abts_nvmet_buf_list_lock. Link: https://lore.kernel.org/r/20210412013127.2387-5-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-13 01:39:13 -04:00
James Smart	f866eb06c0	scsi: lpfc: Fix reference counting errors in lpfc_cmpl_els_rsp() Call traces are being seen that result from a nodelist structure ref counting error. They are typically seen after transmission of an LS_RJT ELS response. Aged code in lpfc_cmpl_els_rsp() calls lpfc_nlp_not_used() which, if the ndlp reference count is exactly 1, will decrement the reference count. Previously lpfc_nlp_put() was within lpfc_els_free_iocb(), and the 'put' within the free would only be invoked if cmdiocb->context1 was not NULL. Since the nodelist structure reference count is decremented when exiting lpfc_cmpl_els_rsp() the lpfc_nlp_not_used() calls are no longer required. Calling them is causing the reference count issue. Fix by removing the lpfc_nlp_not_used() calls. Link: https://lore.kernel.org/r/20210412013127.2387-4-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-13 01:39:13 -04:00
James Smart	fffd18ec65	scsi: lpfc: Fix crash when a REG_RPI mailbox fails triggering a LOGO response Fix a crash caused by a double put on the node when the driver completed an ACC for an unsolicted abort on the same node. The second put was executed by lpfc_nlp_not_used() and is wrong because the completion routine executes the nlp_put when the iocbq was released. Additionally, the driver is issuing a LOGO then immediately calls lpfc_nlp_set_state to put the node into NPR. This call does nothing. Remove the lpfc_nlp_not_used call and additional set_state in the completion routine. Remove the lpfc_nlp_set_state post issue_logo. Isn't necessary. Link: https://lore.kernel.org/r/20210412013127.2387-3-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-13 01:39:13 -04:00

1 2 3 4 5 ...

21214 Commits