linux

iv/linux

Author	SHA1	Message	Date
Quinn Tran	f557970849	scsi: qla2xxx: Fix firmware resource tracking commit e370b64c7db96384a0886a09a9d80406e4c663d7 upstream. The storage was not draining I/Os and the work load was not spread out across different CPUs evenly. This led to firmware resource counters getting overrun on the busy CPU. This overrun prevented error recovery from happening in a timely manner. By switching the counter to atomic, it allows the count to be little more accurate to prevent the overrun. Cc: stable@vger.kernel.org Fixes: da7c21b72aa8 ("scsi: qla2xxx: Fix command flush during TMF") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230821130045.34850-4-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-19 12:27:56 +02:00
Quinn Tran	3a9d4db2d2	scsi: qla2xxx: Error code did not return to upper layer commit 0ba0b018f94525a6b32f5930f980ce9b62b72e6f upstream. TMF was returned with an error code. The error code was not preserved to be returned to upper layer. Instead, the error code from the Marker was returned. Preserve error code from TMF and return it to upper layer. Cc: stable@vger.kernel.org Fixes: da7c21b72aa8 ("scsi: qla2xxx: Fix command flush during TMF") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230821130045.34850-6-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-19 12:27:56 +02:00
Nilesh Javali	c7355cbb9c	scsi: qla2xxx: Fix smatch warn for qla_init_iocb_limit() commit b496953dd0444001b12f425ea07d78c1f47e3193 upstream. Fix indentation for warning reported by smatch: drivers/scsi/qla2xxx/qla_init.c:4199 qla_init_iocb_limit() warn: inconsistent indenting Fixes: efa74a62aaa2 ("scsi: qla2xxx: Adjust IOCB resource on qpair create") Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230821130045.34850-8-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-19 12:27:56 +02:00
Quinn Tran	974887e1d6	scsi: qla2xxx: Flush mailbox commands on chip reset commit 6d0b65569c0a10b27c49bacd8d25bcd406003533 upstream. Fix race condition between Interrupt thread and Chip reset thread in trying to flush the same mailbox. With the race condition, the "ha->mbx_intr_comp" will get an extra complete() call. The extra complete call create erroneous mailbox timeout condition when the next mailbox is sent where the mailbox call does not wait for interrupt to arrive. Instead, it advances without waiting. Add lock protection around the check for mailbox completion. Cc: stable@vger.kernel.org Fixes: b2000805a975 ("scsi: qla2xxx: Flush mailbox commands on chip reset") Signed-off-by: Quinn Tran <quinn.tran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230821130045.34850-3-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-19 12:27:55 +02:00
Manish Rangankar	98643561d8	scsi: qla2xxx: Remove unsupported ql2xenabledif option commit e9105c4b7a9208a21a9bda133707624f12ddabc2 upstream. User accidently passed module parameter ql2xenabledif=1 which is unsupported. However, driver still initialized which lead to guard tag errors during device discovery. Remove unsupported ql2xenabledif=1 option and validate the user input. Cc: stable@vger.kernel.org Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230821130045.34850-7-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-19 12:27:55 +02:00
Quinn Tran	1f0e3814ad	scsi: qla2xxx: Fix TMF leak through commit 5d3148d8e8b05f084e607ac3bd55a4c317a9f934 upstream. Task management can retry up to 5 times when FW resource becomes bottle neck. Between the retries, there is a short sleep. Current code assumes the chip has not reset or session has not changed. Check for chip reset or session change before sending Task management. Cc: stable@vger.kernel.org Fixes: 9803fb5d2759 ("scsi: qla2xxx: Fix task management cmd failure") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-9-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-19 12:27:55 +02:00
Quinn Tran	e6aabf0654	scsi: qla2xxx: Fix session hang in gnl commit 39d22740712c7563a2e18c08f033deeacdaf66e7 upstream. Connection does not resume after a host reset / chip reset. The cause of the blockage is due to the FCF_ASYNC_ACTIVE left on. The gnl command was interrupted by the chip reset. On exiting the command, this flag should be turn off to allow relogin to reoccur. Clear this flag to prevent blockage. Cc: stable@vger.kernel.org Fixes: 17e64648aa47 ("scsi: qla2xxx: Correct fcport flags handling") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-7-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-19 12:27:55 +02:00
Quinn Tran	addaa136f1	scsi: qla2xxx: Turn off noisy message log commit 8ebaa45163a3fedc885c1dc7d43ea987a2f00a06 upstream. Some consider noisy log as test failure. Turn off noisy message log. Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-8-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-19 12:27:55 +02:00
Quinn Tran	01e3440ce0	scsi: qla2xxx: Fix erroneous link up failure commit 5b51f35d127e7bef55fa869d2465e2bca4636454 upstream. Link up failure occurred where driver failed to see certain events from FW indicating link up (AEN 8011) and fabric login completion (AEN 8014). Without these 2 events, driver would not proceed forward to scan the fabric. The cause of this is due to delay in the receive of interrupt for Mailbox 60 that causes qla to set the fw_started flag late. The late setting of this flag causes other interrupts to be dropped. These dropped interrupts happen to be the link up (AEN 8011) and fabric login completion (AEN 8014). Set fw_started flag early to prevent interrupts being dropped. Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-6-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-19 12:27:55 +02:00
Quinn Tran	ddb8fa0598	scsi: qla2xxx: Fix command flush during TMF commit da7c21b72aa86e990af5f73bce6590b8d8d148d0 upstream. For each TMF request, driver iterates through each qpair and flushes commands associated to the TMF. At the end of the qpair flush, a Marker is used to complete the flush transaction. This process was repeated for each qpair. The multiple flush and marker for this TMF request seems to cause confusion for FW. Instead, 1 flush is sent to FW. Driver would wait for FW to go through all the I/Os on each qpair to be read then return. Driver then closes out the transaction with a Marker. Cc: stable@vger.kernel.org Fixes: d90171dd0da5 ("scsi: qla2xxx: Multi-que support for TMF") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-5-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-19 12:27:55 +02:00
Quinn Tran	6e44a7e2a0	scsi: qla2xxx: fix inconsistent TMF timeout commit 009e7fe4a1ed52276b332842a6b6e23b07200f2d upstream. Different behavior were experienced of session being torn down vs not when TMF is timed out. When FW detects the time out, the session is torn down. When driver detects the time out, the session is not torn down. Allow TMF error to return to upper layer without session tear down. Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-10-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-19 12:27:55 +02:00
Quinn Tran	cd06c45b32	scsi: qla2xxx: Fix deletion race condition commit 6dfe4344c168c6ca20fe7640649aacfcefcccb26 upstream. System crash when using debug kernel due to link list corruption. The cause of the link list corruption is due to session deletion was allowed to queue up twice. Here's the internal trace that show the same port was allowed to double queue for deletion on different cpu. 20808683956 015 qla2xxx [0000:13:00.1]-e801:4: Scheduling sess ffff93ebf9306800 for deletion 50:06:0e:80:12:48:ff:50 fc4_type 1 20808683957 027 qla2xxx [0000:13:00.1]-e801:4: Scheduling sess ffff93ebf9306800 for deletion 50:06:0e:80:12:48:ff:50 fc4_type 1 Move the clearing/setting of deleted flag lock. Cc: stable@vger.kernel.org Fixes: 726b85487067 ("qla2xxx: Add framework for async fabric discovery") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-2-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-19 12:27:55 +02:00
Quinn Tran	820010cfe5	scsi: qla2xxx: Limit TMF to 8 per function commit a8ec192427e0516436e61f9ca9eb49c54eadfe0a upstream. Per FW recommendation, 8 TMF's can be outstanding for each function. Previously, it allowed 8 per target. Limit TMF to 8 per function. Cc: stable@vger.kernel.org Fixes: 6a87679626b5 ("scsi: qla2xxx: Fix task management cmd fail due to unavailable resource") Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-4-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-19 12:27:55 +02:00
Quinn Tran	faf7e224b4	scsi: qla2xxx: Adjust IOCB resource on qpair create commit efa74a62aaa2429c04fe6cb277b3bf6739747d86 upstream. During NVMe queue creation, a new qpair is created. FW resource limit needs to be re-adjusted to take into account the new qpair. Otherwise, NVMe command can not go through. This issue was discovered while testing/forcing FW execution to fail at load time. Add call to readjust IOCB and exchange limit. In addition, get FW state command and require FW to be running. Otherwise, error is generated. Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230714070104.40052-3-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-19 12:27:55 +02:00
Ranjan Kumar	316f398429	scsi: mpt3sas: Perform additional retries if doorbell read returns 0 commit 4ca10f3e31745d35249a727ecd108eb58f0a8c5e upstream. The driver retries certain register reads 3 times if the returned value is 0. This was done because the controller could return 0 for certain registers if other registers were being accessed concurrently by the BMC. In certain systems with increased BMC interactions, the register values returned can be 0 for longer than 3 retries. Change the retry count from 3 to 30 for the affected registers to prevent problems with out-of-band management. Fixes: b899202901a8 ("scsi: mpt3sas: Add separate function for aero doorbell reads") Cc: stable@vger.kernel.org Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20230829090020.5417-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-13 09:43:01 +02:00
Nilesh Javali	6c4f87e523	Revert "scsi: qla2xxx: Fix buffer overrun" commit 641671d97b9199f1ba35ccc2222d4b189a6a5de5 upstream. Revert due to Get PLOGI Template failed. This reverts commit b68710a8094fdffe8dd4f7a82c82649f479bb453. Cc: stable@vger.kernel.org Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230821130045.34850-9-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-09-13 09:43:01 +02:00
Chengfeng Ye	5a5fb3b175	scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock [ Upstream commit 1a1975551943f681772720f639ff42fbaa746212 ] There is a long call chain that &fip->ctlr_lock is acquired by isr fnic_isr_msix_wq_copy() under hard IRQ context. Thus other process context code acquiring the lock should disable IRQ, otherwise deadlock could happen if the IRQ preempts the execution while the lock is held in process context on the same CPU. [ISR] fnic_isr_msix_wq_copy() -> fnic_wq_copy_cmpl_handler() -> fnic_fcpio_cmpl_handler() -> fnic_fcpio_flogi_reg_cmpl_handler() -> fnic_flush_tx() -> fnic_send_frame() -> fcoe_ctlr_els_send() -> spin_lock_bh(&fip->ctlr_lock) [Process Context] 1. fcoe_ctlr_timer_work() -> fcoe_ctlr_flogi_send() -> spin_lock_bh(&fip->ctlr_lock) 2. fcoe_ctlr_recv_work() -> fcoe_ctlr_recv_handler() -> fcoe_ctlr_recv_els() -> fcoe_ctlr_announce() -> spin_lock_bh(&fip->ctlr_lock) 3. fcoe_ctlr_recv_work() -> fcoe_ctlr_recv_handler() -> fcoe_ctlr_recv_els() -> fcoe_ctlr_flogi_retry() -> spin_lock_bh(&fip->ctlr_lock) 4. -> fcoe_xmit() -> fcoe_ctlr_els_send() -> spin_lock_bh(&fip->ctlr_lock) spin_lock_bh() is not enough since fnic_isr_msix_wq_copy() is a hardirq. These flaws were found by an experimental static analysis tool I am developing for irq-related deadlock. The patch fix the potential deadlocks by spin_lock_irqsave() to disable hard irq. Fixes: 794d98e77f59 ("[SCSI] libfcoe: retry rejected FLOGI to another FCF if possible") Signed-off-by: Chengfeng Ye <dg573847474@gmail.com> Link: https://lore.kernel.org/r/20230817074708.7509-1-dg573847474@gmail.com Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:56 +02:00
Tony Battersby	f06c7d823a	scsi: core: Use 32-bit hostnum in scsi_host_lookup() [ Upstream commit 62ec2092095b678ff89ce4ba51c2938cd1e8e630 ] Change scsi_host_lookup() hostnum argument type from unsigned short to unsigned int to match the type used everywhere else. Fixes: 6d49f63b415c ("[SCSI] Make host_no an unsigned int") Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://lore.kernel.org/r/a02497e7-c12b-ef15-47fc-3f0a0b00ffce@cybernetics.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:56 +02:00
Oleksandr Natalenko	af6fd0b3bc	scsi: qedf: Do not touch __user pointer in qedf_dbg_fp_int_cmd_read() directly [ Upstream commit 25dbc20deab5165f847b4eb42f376f725a986ee8 ] The qedf_dbg_fp_int_cmd_read() function invokes sprintf() directly on a __user pointer, which may crash the kernel. Avoid doing that by vmalloc()'ating a buffer for scnprintf() and then calling simple_read_from_buffer() which does a proper copy_to_user() call. Fixes: 61d8658b4a43 ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.") Link: https://lore.kernel.org/lkml/20230724120241.40495-1-oleksandr@redhat.com/ Link: https://lore.kernel.org/linux-scsi/20230726101236.11922-1-skashyap@marvell.com/ Cc: Saurav Kashyap <skashyap@marvell.com> Cc: Rob Evers <revers@redhat.com> Cc: Johannes Thumshirn <Johannes.Thumshirn@wdc.com> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Jozef Bacik <jobacik@redhat.com> Cc: Laurence Oberman <loberman@redhat.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: GR-QLogic-Storage-Upstream@marvell.com Cc: linux-scsi@vger.kernel.org Reviewed-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Tested-by: Laurence Oberman <loberman@redhat.com> Acked-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com> Link: https://lore.kernel.org/r/20230731084034.37021-4-oleksandr@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:52 +02:00
Oleksandr Natalenko	dd8ce1c9ff	scsi: qedf: Do not touch __user pointer in qedf_dbg_debug_cmd_read() directly [ Upstream commit 31b5991a9a91ba97237ac9da509d78eec453ff72 ] The qedf_dbg_debug_cmd_read() function invokes sprintf() directly on a __user pointer, which may crash the kernel. Avoid doing that by using a small on-stack buffer for scnprintf() and then calling simple_read_from_buffer() which does a proper copy_to_user() call. Fixes: 61d8658b4a43 ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.") Link: https://lore.kernel.org/lkml/20230724120241.40495-1-oleksandr@redhat.com/ Link: https://lore.kernel.org/linux-scsi/20230726101236.11922-1-skashyap@marvell.com/ Cc: Saurav Kashyap <skashyap@marvell.com> Cc: Rob Evers <revers@redhat.com> Cc: Johannes Thumshirn <Johannes.Thumshirn@wdc.com> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Jozef Bacik <jobacik@redhat.com> Cc: Laurence Oberman <loberman@redhat.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: GR-QLogic-Storage-Upstream@marvell.com Cc: linux-scsi@vger.kernel.org Reviewed-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Tested-by: Laurence Oberman <loberman@redhat.com> Acked-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com> Link: https://lore.kernel.org/r/20230731084034.37021-3-oleksandr@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:52 +02:00
Oleksandr Natalenko	472f2497a4	scsi: qedf: Do not touch __user pointer in qedf_dbg_stop_io_on_error_cmd_read() directly [ Upstream commit 7d3d20dee4f648ec44e9717d5f647d594d184433 ] The qedf_dbg_stop_io_on_error_cmd_read() function invokes sprintf() directly on a __user pointer, which may crash the kernel. Avoid doing that by using a small on-stack buffer for scnprintf() and then calling simple_read_from_buffer() which does a proper copy_to_user() call. Fixes: 61d8658b4a43 ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.") Link: https://lore.kernel.org/lkml/20230724120241.40495-1-oleksandr@redhat.com/ Link: https://lore.kernel.org/linux-scsi/20230726101236.11922-1-skashyap@marvell.com/ Cc: Saurav Kashyap <skashyap@marvell.com> Cc: Rob Evers <revers@redhat.com> Cc: Johannes Thumshirn <Johannes.Thumshirn@wdc.com> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Jozef Bacik <jobacik@redhat.com> Cc: Laurence Oberman <loberman@redhat.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: GR-QLogic-Storage-Upstream@marvell.com Cc: linux-scsi@vger.kernel.org Reviewed-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Tested-by: Laurence Oberman <loberman@redhat.com> Acked-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com> Link: https://lore.kernel.org/r/20230731084034.37021-2-oleksandr@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:52 +02:00
Lin Ma	25feffb3fb	scsi: qla4xxx: Add length check when parsing nlattrs [ Upstream commit 47cd3770e31df942e2bb925a9a855c79ed0662eb ] There are three places that qla4xxx parses nlattrs: - qla4xxx_set_chap_entry() - qla4xxx_iface_set_param() - qla4xxx_sysfs_ddb_set_param() and each of them directly converts the nlattr to specific pointer of structure without length checking. This could be dangerous as those attributes are not validated and a malformed nlattr (e.g., length 0) could result in an OOB read that leaks heap dirty data. Add the nla_len check before accessing the nlattr data and return EINVAL if the length check fails. Fixes: 26ffd7b45fe9 ("[SCSI] qla4xxx: Add support to set CHAP entries") Fixes: 1e9e2be3ee03 ("[SCSI] qla4xxx: Add flash node mgmt support") Fixes: 00c31889f751 ("[SCSI] qla4xxx: fix data alignment and use nl helpers") Signed-off-by: Lin Ma <linma@zju.edu.cn> Link: https://lore.kernel.org/r/20230723080053.3714534-1-linma@zju.edu.cn Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:52 +02:00
Lin Ma	1806edae97	scsi: be2iscsi: Add length check when parsing nlattrs [ Upstream commit ee0268f230f66cb472df3424f380ea668da2749a ] beiscsi_iface_set_param() parses nlattr with nla_for_each_attr and assumes every attributes can be viewed as struct iscsi_iface_param_info. This is not true because there is no any nla_policy to validate the attributes passed from the upper function iscsi_set_iface_params(). Add the nla_len check before accessing the nlattr data and return EINVAL if the length check fails. Fixes: 0e43895ec1f4 ("[SCSI] be2iscsi: adding functionality to change network settings using iscsiadm") Signed-off-by: Lin Ma <linma@zju.edu.cn> Link: https://lore.kernel.org/r/20230723075938.3713864-1-linma@zju.edu.cn Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:52 +02:00
Lin Ma	85b8c282d1	scsi: iscsi: Add strlen() check in iscsi_if_set{_host}_param() [ Upstream commit ce51c817008450ef4188471db31639d42d37a5e1 ] The functions iscsi_if_set_param() and iscsi_if_set_host_param() convert an nlattr payload to type char* and then call C string handling functions like sscanf and kstrdup: char data = (char)ev + sizeof(*ev); ... sscanf(data, "%d", &value); However, since the nlattr is provided by the user-space program and the nlmsg skb is allocated with GFP_KERNEL instead of GFP_ZERO flag (see netlink_alloc_large_skb() in netlink_sendmsg()), dirty data on the heap can lead to an OOB access for those string handling functions. By investigating how the bug is introduced, we find it is really interesting as the old version parsing code starting from commit fd7255f51a13 ("[SCSI] iscsi: add sysfs attrs for uspace sync up") treated the nlattr as integer bytes instead of string and had length check in iscsi_copy_param(): if (ev->u.set_param.len != sizeof(uint32_t)) BUG(); But, since the commit a54a52caad4b ("[SCSI] iscsi: fixup set/get param functions"), the code treated the nlattr as C string while forgetting to add any strlen checks(), opening the possibility of an OOB access. Fix the potential OOB by adding the strlen() check before accessing the buf. If the data passes this check, all low-level set_param handlers can safely treat this buf as legal C string. Fixes: fd7255f51a13 ("[SCSI] iscsi: add sysfs attrs for uspace sync up") Fixes: 1d9bf13a9cf9 ("[SCSI] iscsi class: add iscsi host set param event") Signed-off-by: Lin Ma <linma@zju.edu.cn> Link: https://lore.kernel.org/r/20230723075820.3713119-1-linma@zju.edu.cn Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:51 +02:00
Lin Ma	bb8d101b83	scsi: iscsi: Add length check for nlattr payload [ Upstream commit 971dfcb74a800047952f5288512b9c7ddedb050a ] The current NETLINK_ISCSI netlink parsing loop checks every nlmsg to make sure the length is bigger than sizeof(struct iscsi_uevent) and then calls iscsi_if_recv_msg(). nlh = nlmsg_hdr(skb); if (nlh->nlmsg_len < sizeof(nlh) + sizeof(ev) \|\| skb->len < nlh->nlmsg_len) { break; } ... err = iscsi_if_recv_msg(skb, nlh, &group); Hence, in iscsi_if_recv_msg() the nlmsg_data can be safely converted to iscsi_uevent as the length is already checked. However, in other cases the length of nlattr payload is not checked before the payload is converted to other data structures. One example is iscsi_set_path() which converts the payload to type iscsi_path without any checks: params = (struct iscsi_path )((char )ev + sizeof(ev)); Whereas iscsi_if_transport_conn() correctly checks the pdu_len: pdu_len = nlh->nlmsg_len - sizeof(nlh) - sizeof(*ev); if ((ev->u.send_pdu.hdr_size > pdu_len) .. err = -EINVAL; To sum up, some code paths called in iscsi_if_recv_msg() do not check the length of the data (see below picture) and directly convert the data to another data structure. This could result in an out-of-bound reads and heap dirty data leakage. _________ nlmsg_len(nlh) _______________ / \ +----------+--------------+---------------------------+ \| nlmsghdr \| iscsi_uevent \| data \| +----------+--------------+---------------------------+ \ / iscsi_uevent->u.set_param.len Fix the issue by adding the length check before accessing it. To clean up the code, an additional parameter named rlen is added. The rlen is calculated at the beginning of iscsi_if_recv_msg() which avoids duplicated calculation. Fixes: ac20c7bf070d ("[SCSI] iscsi_transport: Added Ping support") Fixes: 43514774ff40 ("[SCSI] iscsi class: Add new NETLINK_ISCSI messages for cnic/bnx2i driver.") Fixes: 1d9bf13a9cf9 ("[SCSI] iscsi class: add iscsi host set param event") Fixes: 01cb225dad8d ("[SCSI] iscsi: add target discvery event to transport class") Fixes: 264faaaa1254 ("[SCSI] iscsi: add transport end point callbacks") Fixes: fd7255f51a13 ("[SCSI] iscsi: add sysfs attrs for uspace sync up") Signed-off-by: Lin Ma <linma@zju.edu.cn> Link: https://lore.kernel.org/r/20230725024529.428311-1-linma@zju.edu.cn Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:51 +02:00
Wenchao Hao	2737d82760	scsi: iscsi: Rename iscsi_set_param() to iscsi_if_set_param() [ Upstream commit 0c26a2d7c98039e913e63f9250fde738a3f88a60 ] There are two iscsi_set_param() functions defined in libiscsi.c and scsi_transport_iscsi.c respectively which is confusing. Rename the one in scsi_transport_iscsi.c to iscsi_if_set_param(). Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Link: https://lore.kernel.org/r/20221122181105.4123935-1-haowenchao@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 971dfcb74a80 ("scsi: iscsi: Add length check for nlattr payload") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:51 +02:00
Xingui Yang	dd0dadb938	scsi: hisi_sas: Fix normally completed I/O analysed as failed [ Upstream commit f5393a5602cacfda2014e0ff8220e5a7564e7cd1 ] The PIO read command has no response frame and the struct iu[1024] won't be filled. I/Os which are normally completed will be treated as failed in sas_ata_task_done() when iu contains abnormal dirty data. Consequently ending_fis should not be filled by iu when the response frame hasn't been written to memory. Fixes: d380f55503ed ("scsi: hisi_sas: Don't bother clearing status buffer IU in task prep") Signed-off-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1689045300-44318-2-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:50 +02:00
Xingui Yang	ab0719d7b6	scsi: hisi_sas: Fix warnings detected by sparse [ Upstream commit c0328cc595124579328462fc45d7a29a084cf357 ] This patch fixes the following warning: drivers/scsi/hisi_sas/hisi_sas_v3_hw.c:2168:43: sparse: sparse: restricted __le32 degrades to integer Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202304161254.NztCVZIO-lkp@intel.com/ Signed-off-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1684118481-95908-4-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: f5393a5602ca ("scsi: hisi_sas: Fix normally completed I/O analysed as failed") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:50 +02:00
Justin Tee	790587097c	scsi: lpfc: Fix incorrect big endian type assignment in bsg loopback path [ Upstream commit 9cefd6e7e0a77b0fbca5c793f6fb6821b0962775 ] The kernel test robot reported sparse warnings regarding incorrect type assignment for __be16 variables in bsg loopback path. Change the flagged lines to use the be16_to_cpu() and cpu_to_be16() macros appropriately. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230614175944.3577-1-justintee8345@gmail.com Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202306110819.sDIKiGgg-lkp@intel.com/ Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:25 +02:00
Michael Kelley	1a7f80f33a	scsi: storvsc: Always set no_report_opcodes [ Upstream commit 31d16e712bdcaee769de4780f72ff8d6cd3f0589 ] Hyper-V synthetic SCSI devices do not support the MAINTENANCE_IN SCSI command, so scsi_report_opcode() always fails, resulting in messages like this: hv_storvsc <guid>: tag#205 cmd 0xa3 status: scsi 0x2 srb 0x86 hv 0xc0000001 The recently added support for command duration limits calls scsi_report_opcode() four times as each device comes online, which significantly increases the number of messages logged in a system with many disks. Fix the problem by always marking Hyper-V synthetic SCSI devices as not supporting scsi_report_opcode(). With this setting, the MAINTENANCE_IN SCSI command is not issued and no messages are logged. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1686343101-18930-1-git-send-email-mikelley@microsoft.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:25 +02:00
Sagar Biradar	7d1ac3c2eb	scsi: aacraid: Reply queue mapping to CPUs based on IRQ affinity [ Upstream commit 9dc704dcc09eae7d21b5da0615eb2ed79278f63e ] Fix the I/O hang that arises because of the MSIx vector not having a mapped online CPU upon receiving completion. SCSI cmds take the blk_mq route, which is setup during init. Reserved cmds fetch the vector_no from mq_map after init is complete. Before init, they have to use 0 - as per the norm. Reviewed-by: Gilbert Wu <gilbert.wu@microchip.com> Signed-off-by: Sagar Biradar <Sagar.Biradar@microchip.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230519230834.27436-1-sagar.biradar@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:25 +02:00
Chengfeng Ye	c61d104612	scsi: qedi: Fix potential deadlock on &qedi_percpu->p_work_lock [ Upstream commit dd64f80587190265ca8a0f4be6c64c2fda6d3ac2 ] As &qedi_percpu->p_work_lock is acquired by hard IRQ qedi_msix_handler(), other acquisitions of the same lock under process context should disable IRQ, otherwise deadlock could happen if the IRQ preempts the execution while the lock is held in process context on the same CPU. qedi_cpu_offline() is one such function which acquires the lock in process context. [Deadlock Scenario] qedi_cpu_offline() ->spin_lock(&p->p_work_lock) <irq> ->qedi_msix_handler() ->edi_process_completions() ->spin_lock_irqsave(&p->p_work_lock, flags); (deadlock here) This flaw was found by an experimental static analysis tool I am developing for IRQ-related deadlocks. The tentative patch fix the potential deadlock by spin_lock_irqsave() under process context. Signed-off-by: Chengfeng Ye <dg573847474@gmail.com> Link: https://lore.kernel.org/r/20230726125655.4197-1-dg573847474@gmail.com Acked-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:24 +02:00
Justin Tee	24d9cc9335	scsi: lpfc: Remove reftag check in DIF paths [ Upstream commit 8eebf0e84f0614cebc7347f7bbccba4056d77d42 ] When preparing protection DIF I/O for DMA, the driver obtains reference tags from scsi_prot_ref_tag(). Previously, there was a wrong assumption that an all 0xffffffff value meant error and thus the driver failed the I/O. This patch removes the evaluation code and accepts whatever the upper layer returns. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230803211932.155745-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-09-13 09:42:24 +02:00
Zhu Wang	70461151d0	scsi: core: raid_class: Remove raid_component_add() commit 60c5fd2e8f3c42a5abc565ba9876ead1da5ad2b7 upstream. The raid_component_add() function was added to the kernel tree via patch "[SCSI] embryonic RAID class" (2005). Remove this function since it never has had any callers in the Linux kernel. And also raid_component_release() is only used in raid_component_add(), so it is also removed. Signed-off-by: Zhu Wang <wangzhu9@huawei.com> Link: https://lore.kernel.org/r/20230822015254.184270-1-wangzhu9@huawei.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Fixes: 04b5b5cb0136 ("scsi: core: Fix possible memory leak if device_add() fails") Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-30 16:11:12 +02:00
Zhu Wang	774cb3de7a	scsi: snic: Fix double free in snic_tgt_create() commit 1bd3a76880b2bce017987cf53780b372cf59528e upstream. Commit 41320b18a0e0 ("scsi: snic: Fix possible memory leak if device_add() fails") fixed the memory leak caused by dev_set_name() when device_add() failed. However, it did not consider that 'tgt' has already been released when put_device(&tgt->dev) is called. Remove kfree(tgt) in the error path to avoid double free of 'tgt' and move put_device(&tgt->dev) after the removed kfree(tgt) to avoid a use-after-free. Fixes: 41320b18a0e0 ("scsi: snic: Fix possible memory leak if device_add() fails") Signed-off-by: Zhu Wang <wangzhu9@huawei.com> Link: https://lore.kernel.org/r/20230819083941.164365-1-wangzhu9@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-30 16:11:12 +02:00
Nilesh Javali	38b0020f68	scsi: qedf: Fix firmware halt over suspend and resume commit ef222f551e7c4e2008fc442ffc9edcd1a7fd8f63 upstream. While performing certain power-off sequences, PCI drivers are called to suspend and resume their underlying devices through PCI PM (power management) interface. However the hardware does not support PCI PM suspend/resume operations so system wide suspend/resume leads to bad MFW (management firmware) state which causes various follow-up errors in driver when communicating with the device/firmware. To fix this driver implements PCI PM suspend handler to indicate unsupported operation to the PCI subsystem explicitly, thus avoiding system to go into suspended/standby mode. Fixes: 61d8658b4a43 ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.") Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230807093725.46829-1-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-16 18:27:31 +02:00
Nilesh Javali	a9518f4a49	scsi: qedi: Fix firmware halt over suspend and resume commit 1516ee035df32115197cd93ae3619dba7b020986 upstream. While performing certain power-off sequences, PCI drivers are called to suspend and resume their underlying devices through PCI PM (power management) interface. However the hardware does not support PCI PM suspend/resume operations so system wide suspend/resume leads to bad MFW (management firmware) state which causes various follow-up errors in driver when communicating with the device/firmware. To fix this driver implements PCI PM suspend handler to indicate unsupported operation to the PCI subsystem explicitly, thus avoiding system to go into suspended/standby mode. Fixes: ace7f46ba5fd ("scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework.") Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230807093725.46829-2-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-16 18:27:31 +02:00
Karan Tilak Kumar	fb004497b3	scsi: fnic: Replace return codes in fnic_clean_pending_aborts() commit 5a43b07a87835660f91d88a4db11abfea8c523b7 upstream. fnic_clean_pending_aborts() was returning a non-zero value irrespective of failure or success. This caused the caller of this function to assume that the device reset had failed, even though it would succeed in most cases. As a consequence, a successful device reset would escalate to host reset. Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com> Tested-by: Karan Tilak Kumar <kartilak@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Link: https://lore.kernel.org/r/20230727193919.2519-1-kartilak@cisco.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-16 18:27:30 +02:00
Zhu Wang	b191ff1f07	scsi: core: Fix possible memory leak if device_add() fails commit 04b5b5cb0136ce970333a9c6cec7e46adba1ea3a upstream. If device_add() returns error, the name allocated by dev_set_name() needs be freed. As the comment of device_add() says, put_device() should be used to decrease the reference count in the error path. So fix this by calling put_device(), then the name can be freed in kobject_cleanp(). Fixes: ee959b00c335 ("SCSI: convert struct class_device to struct device") Signed-off-by: Zhu Wang <wangzhu9@huawei.com> Link: https://lore.kernel.org/r/20230803020230.226903-1-wangzhu9@huawei.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-16 18:27:30 +02:00
Zhu Wang	7723a5d5d1	scsi: snic: Fix possible memory leak if device_add() fails commit 41320b18a0e0dfb236dba4edb9be12dba1878156 upstream. If device_add() returns error, the name allocated by dev_set_name() needs be freed. As the comment of device_add() says, put_device() should be used to give up the reference in the error path. So fix this by calling put_device(), then the name can be freed in kobject_cleanp(). Fixes: c8806b6c9e82 ("snic: driver for Cisco SCSI HBA") Signed-off-by: Zhu Wang <wangzhu9@huawei.com> Acked-by: Narsimhulu Musini <nmusini@cisco.com> Link: https://lore.kernel.org/r/20230801111421.63651-1-wangzhu9@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-16 18:27:30 +02:00
Alexandra Diupina	9fdb273ede	scsi: 53c700: Check that command slot is not NULL commit 8366d1f1249a0d0bba41d0bd1298d63e5d34c7f7 upstream. Add a check for the command slot value to avoid dereferencing a NULL pointer. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Co-developed-by: Vladimir Telezhnikov <vtelezhnikov@astralinux.ru> Signed-off-by: Vladimir Telezhnikov <vtelezhnikov@astralinux.ru> Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru> Link: https://lore.kernel.org/r/20230728123521.18293-1-adiupina@astralinux.ru Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-16 18:27:30 +02:00
Michael Kelley	ed70fa5629	scsi: storvsc: Fix handling of virtual Fibre Channel timeouts commit 175544ad48cbf56affeef2a679c6a4d4fb1e2881 upstream. Hyper-V provides the ability to connect Fibre Channel LUNs to the host system and present them in a guest VM as a SCSI device. I/O to the vFC device is handled by the storvsc driver. The storvsc driver includes a partial integration with the FC transport implemented in the generic portion of the Linux SCSI subsystem so that FC attributes can be displayed in /sys. However, the partial integration means that some aspects of vFC don't work properly. Unfortunately, a full and correct integration isn't practical because of limitations in what Hyper-V provides to the guest. In particular, in the context of Hyper-V storvsc, the FC transport timeout function fc_eh_timed_out() causes a kernel panic because it can't find the rport and dereferences a NULL pointer. The original patch that added the call from storvsc_eh_timed_out() to fc_eh_timed_out() is faulty in this regard. In many cases a timeout is due to a transient condition, so the situation can be improved by just continuing to wait like with other I/O requests issued by storvsc, and avoiding the guaranteed panic. For a permanent failure, continuing to wait may result in a hung thread instead of a panic, which again may be better. So fix the panic by removing the storvsc call to fc_eh_timed_out(). This allows storvsc to keep waiting for a response. The change has been tested by users who experienced a panic in fc_eh_timed_out() due to transient timeouts, and it solves their problem. In the future we may want to deprecate the vFC functionality in storvsc since it can't be fully fixed. But it has current users for whom it is working well enough, so it should probably stay for a while longer. Fixes: 3930d7309807 ("scsi: storvsc: use default I/O timeout handler for FC devices") Cc: stable@vger.kernel.org Signed-off-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1690606764-79669-1-git-send-email-mikelley@microsoft.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-16 18:27:30 +02:00
Tony Battersby	0e1605ec5b	scsi: core: Fix legacy /proc parsing buffer overflow commit 9426d3cef5000824e5f24f80ed5f42fb935f2488 upstream. (lightly modified commit message mostly by Linus Torvalds) The parsing code for /proc/scsi/scsi is disgusting and broken. We should have just used 'sscanf()' or something simple like that, but the logic may actually predate our kernel sscanf library routine for all I know. It certainly predates both git and BK histories. And we can't change it to be something sane like that now, because the string matching at the start is done case-insensitively, and the separator parsing between numbers isn't done at all, so any separator will work, including a possible terminating NUL character. This interface is root-only, and entirely for legacy use, so there is absolutely no point in trying to tighten up the parsing. Because any separator has traditionally worked, it's entirely possible that people have used random characters rather than the suggested space. So don't bother to try to pretty it up, and let's just make a minimal patch that can be back-ported and we can forget about this whole sorry thing for another two decades. Just make it at least not read past the end of the supplied data. Link: https://lore.kernel.org/linux-scsi/b570f5fe-cb7c-863a-6ed9-f6774c219b88@cybernetics.com/ Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin K Petersen <martin.petersen@oracle.com> Cc: James Bottomley <jejb@linux.ibm.com> Cc: Willy Tarreau <w@1wt.eu> Cc: stable@kernel.org Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Martin K Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-16 18:27:30 +02:00
Michael Kelley	f62faadc79	scsi: storvsc: Limit max_sectors for virtual Fibre Channel devices commit 010c1e1c5741365dbbf44a5a5bb9f30192875c4c upstream. The Hyper-V host is queried to get the max transfer size that it supports, and this value is used to set max_sectors for the synthetic SCSI controller. However, this max transfer size may be too large for virtual Fibre Channel devices, which are limited to 512 Kbytes. If a larger transfer size is used with a vFC device, Hyper-V always returns an error, and storvsc logs a message like this where the SRB status and SCSI status are both zero: hv_storvsc <GUID>: tag#197 cmd 0x8a status: scsi 0x0 srb 0x0 hv 0xc0000001 Add logic to limit the max transfer size to 512 Kbytes for vFC devices. Fixes: 1d3e0980782f ("scsi: storvsc: Correct reporting of Hyper-V I/O size limits") Cc: stable@vger.kernel.org Signed-off-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1689887102-32806-1-git-send-email-mikelley@microsoft.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-08-11 12:08:19 +02:00
Dan Carpenter	fec55ec035	scsi: qla2xxx: Fix end of loop test commit 339020091e246e708c1381acf74c5f8e3fe4d2b5 upstream. This loop will exit successfully when "found" is false or in the failure case it times out with "wait_iter" set to -1. The test for timeouts is impossible as is. Fixes: b843adde8d49 ("scsi: qla2xxx: Fix mem access after free") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/cea5a62f-b873-4347-8f8e-c67527ced8d2@kili.mountain Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-07-23 13:49:49 +02:00
Manish Rangankar	f459d586fd	scsi: qla2xxx: Remove unused nvme_ls_waitq wait queue commit 20fce500b232b970e40312a9c97e7f3b6d7a709c upstream. System crash when qla2x00_start_sp(sp) returns error code EGAIN and wake_up gets called for uninitialized wait queue sp->nvme_ls_waitq. qla2xxx [0000:37:00.1]-2121:5: Returning existing qpair of ffff8ae2c0513400 for idx=0 qla2xxx [0000:37:00.1]-700e:5: qla2x00_start_sp failed = 11 BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 09/03/2021 Workqueue: nvme-wq nvme_fc_connect_ctrl_work [nvme_fc] RIP: 0010:__wake_up_common+0x4c/0x190 RSP: 0018:ffff95f3e0cb7cd0 EFLAGS: 00010086 RAX: 0000000000000000 RBX: ffff8b08d3b26328 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff8b08d3b26320 RBP: 0000000000000001 R08: 0000000000000000 R09: ffffffffffffffe8 R10: 0000000000000000 R11: ffff95f3e0cb7a60 R12: ffff95f3e0cb7d20 R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8b2fdf6c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000002f1e410002 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: __wake_up_common_lock+0x7c/0xc0 qla_nvme_ls_req+0x355/0x4c0 [qla2xxx] ? __nvme_fc_send_ls_req+0x260/0x380 [nvme_fc] ? nvme_fc_send_ls_req.constprop.42+0x1a/0x45 [nvme_fc] ? nvme_fc_connect_ctrl_work.cold.63+0x1e3/0xa7d [nvme_fc] Remove unused nvme_ls_waitq wait queue. nvme_ls_waitq logic was removed previously in the commits tagged Fixed: below. Fixes: 219d27d7147e ("scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commands") Fixes: 5621b0dd7453 ("scsi: qla2xxx: Simpify unregistration of FC-NVMe local/remote ports") Cc: stable@vger.kernel.org Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230615074633.12721-1-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-07-23 13:49:49 +02:00
Shreyas Deodhar	b06d1b5253	scsi: qla2xxx: Pointer may be dereferenced commit 00eca15319d9ce8c31cdf22f32a3467775423df4 upstream. Klocwork tool reported pointer 'rport' returned from call to function fc_bsg_to_rport() may be NULL and will be dereferenced. Add a fix to validate rport before dereferencing. Cc: stable@vger.kernel.org Signed-off-by: Shreyas Deodhar <sdeodhar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230607113843.37185-7-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-07-23 13:49:49 +02:00
Bikash Hazarika	b88b1241fb	scsi: qla2xxx: Correct the index of array commit b1b9d3825df4c757d653d0b1df66f084835db9c3 upstream. Klocwork reported array 'port_dstate_str' of size 10 may use index value(s) 10..15. Add a fix to correct the index of array. Cc: stable@vger.kernel.org Signed-off-by: Bikash Hazarika <bhazarika@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230607113843.37185-8-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-07-23 13:49:49 +02:00
Nilesh Javali	e466930717	scsi: qla2xxx: Check valid rport returned by fc_bsg_to_rport() commit af73f23a27206ffb3c477cac75b5fcf03410556e upstream. Klocwork reported warning of rport maybe NULL and will be dereferenced. rport returned by call to fc_bsg_to_rport() could be NULL and dereferenced. Check valid rport returned by fc_bsg_to_rport(). Cc: stable@vger.kernel.org Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230607113843.37185-5-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-07-23 13:49:49 +02:00
Bikash Hazarika	ce2cdbe530	scsi: qla2xxx: Fix potential NULL pointer dereference commit 464ea494a40c6e3e0e8f91dd325408aaf21515ba upstream. Klocwork tool reported 'cur_dsd' may be dereferenced. Add fix to validate pointer before dereferencing the pointer. Cc: stable@vger.kernel.org Signed-off-by: Bikash Hazarika <bhazarika@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230607113843.37185-3-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-07-23 13:49:49 +02:00

1 2 3 4 5 ...

23595 Commits