linux

iv/linux

Author	SHA1	Message	Date
Johan Hovold	b03f7ed9af	scsi: ufs: core: Fix devfreq deadlocks [ Upstream commit ba81043753fffbc2ad6e0c5ff2659f12ac2f46b4 ] There is a lock inversion and rwsem read-lock recursion in the devfreq target callback which can lead to deadlocks. Specifically, ufshcd_devfreq_scale() already holds a clk_scaling_lock read lock when toggling the write booster, which involves taking the dev_cmd mutex before taking another clk_scaling_lock read lock. This can lead to a deadlock if another thread: 1) tries to acquire the dev_cmd and clk_scaling locks in the correct order, or 2) takes a clk_scaling write lock before the attempt to take the clk_scaling read lock a second time. Fix this by dropping the clk_scaling_lock before toggling the write booster as was done before commit 0e9d4ca43ba8 ("scsi: ufs: Protect some contexts from unexpected clock scaling"). While the devfreq callbacks are already serialised, add a second serialising mutex to handle the unlikely case where a callback triggered through the devfreq sysfs interface is racing with a request to disable clock scaling through the UFS controller 'clkscale_enable' sysfs attribute. This could otherwise lead to the write booster being left disabled after having disabled clock scaling. Also take the new mutex in ufshcd_clk_scaling_allow() to make sure that any pending write booster update has completed on return. Note that this currently only affects Qualcomm platforms since commit 87bd05016a64 ("scsi: ufs: core: Allow host driver to disable wb toggling during clock scaling"). The lock inversion (i.e. 1 above) was reported by lockdep as: ====================================================== WARNING: possible circular locking dependency detected 6.1.0-next-20221216 #211 Not tainted ------------------------------------------------------ kworker/u16:2/71 is trying to acquire lock: ffff076280ba98a0 (&hba->dev_cmd.lock){+.+.}-{3:3}, at: ufshcd_query_flag+0x50/0x1c0 but task is already holding lock: ffff076280ba9cf0 (&hba->clk_scaling_lock){++++}-{3:3}, at: ufshcd_devfreq_scale+0x2b8/0x380 which lock already depends on the new lock. [ +0.011606] the existing dependency chain (in reverse order) is: -> #1 (&hba->clk_scaling_lock){++++}-{3:3}: lock_acquire+0x68/0x90 down_read+0x58/0x80 ufshcd_exec_dev_cmd+0x70/0x2c0 ufshcd_verify_dev_init+0x68/0x170 ufshcd_probe_hba+0x398/0x1180 ufshcd_async_scan+0x30/0x320 async_run_entry_fn+0x34/0x150 process_one_work+0x288/0x6c0 worker_thread+0x74/0x450 kthread+0x118/0x120 ret_from_fork+0x10/0x20 -> #0 (&hba->dev_cmd.lock){+.+.}-{3:3}: __lock_acquire+0x12a0/0x2240 lock_acquire.part.0+0xcc/0x220 lock_acquire+0x68/0x90 __mutex_lock+0x98/0x430 mutex_lock_nested+0x2c/0x40 ufshcd_query_flag+0x50/0x1c0 ufshcd_query_flag_retry+0x64/0x100 ufshcd_wb_toggle+0x5c/0x120 ufshcd_devfreq_scale+0x2c4/0x380 ufshcd_devfreq_target+0xf4/0x230 devfreq_set_target+0x84/0x2f0 devfreq_update_target+0xc4/0xf0 devfreq_monitor+0x38/0x1f0 process_one_work+0x288/0x6c0 worker_thread+0x74/0x450 kthread+0x118/0x120 ret_from_fork+0x10/0x20 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&hba->clk_scaling_lock); lock(&hba->dev_cmd.lock); lock(&hba->clk_scaling_lock); lock(&hba->dev_cmd.lock); * DEADLOCK * Fixes: 0e9d4ca43ba8 ("scsi: ufs: Protect some contexts from unexpected clock scaling") Cc: stable@vger.kernel.org # 5.12 Cc: Can Guo <quic_cang@quicinc.com> Tested-by: Andrew Halaney <ahalaney@redhat.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20230116161201.16923-1-johan+linaro@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-02-01 08:27:26 +01:00
Alexey V. Vissarionov	f0227eca97	scsi: hpsa: Fix allocation size for scsi_host_alloc() [ Upstream commit bbbd25499100c810ceaf5193c3cfcab9f7402a33 ] The 'h' is a pointer to struct ctlr_info, so it's just 4 or 8 bytes, while the structure itself is much bigger. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: edd163687ea5 ("hpsa: add driver for HP Smart Array controllers.") Link: https://lore.kernel.org/r/20230118031255.GE15213@altlinux.org Signed-off-by: Alexey V. Vissarionov <gremlin@altlinux.org> Acked-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-02-01 08:27:22 +01:00
Yihang Li	d4b717e34d	scsi: hisi_sas: Set a port invalid only if there are no devices attached when refreshing port id [ Upstream commit f58c89700630da6554b24fd3df293a24874c10c1 ] Currently the driver sets the port invalid if one phy in the port is not enabled, which may cause issues in expander situation. In directly attached situation, if phy up doesn't occur in time when refreshing port id, the port is incorrectly set to invalid which will also cause disk lost. Therefore set a port invalid only if there are no devices attached to the port. Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1672805000-141102-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-02-01 08:27:18 +01:00
Wenchao Hao	28e4e8ca9e	scsi: iscsi: Fix multiple iSCSI session unbind events sent to userspace [ Upstream commit a3be19b91ea7121d388084e8c07f5b1b982eb40c ] It was observed that the kernel would potentially send ISCSI_KEVENT_UNBIND_SESSION multiple times. Introduce 'target_state' in iscsi_cls_session() to make sure session will send only one unbind session event. This introduces a regression wrt. the issue fixed in commit 13e60d3ba287 ("scsi: iscsi: Report unbind session event when the target has been removed"). If iscsid dies for any reason after sending an unbind session to kernel, once iscsid is restarted, the kernel's ISCSI_KEVENT_UNBIND_SESSION event is lost and userspace is then unable to logout. However, the session is actually in invalid state (its target_id is INVALID) so iscsid should not sync this session during restart. Consequently we need to check the session's target state during iscsid restart. If session is in unbound state, do not sync this session and perform session teardown. This is OK because once a session is unbound, we can not recover it any more (mainly because its target id is INVALID). Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Link: https://lore.kernel.org/r/20221126010752.231917-1-haowenchao@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-02-01 08:27:16 +01:00
Sreekanth Reddy	63c2fa09b8	scsi: mpt3sas: Remove scsi_dma_map() error messages commit 0c25422d34b4726b2707d5f38560943155a91b80 upstream. When scsi_dma_map() fails by returning a sges_left value less than zero, the amount of logging produced can be extremely high. In a recent end-user environment, 1200 messages per second were being sent to the log buffer. This eventually overwhelmed the system and it stalled. These error messages are not needed. Remove them. Link: https://lore.kernel.org/r/20220303140203.12642-1-sreekanth.reddy@broadcom.com Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-01-18 11:48:58 +01:00
Peter Wang	7c26d21872	scsi: ufs: core: WLUN suspend SSU/enter hibern8 fail recovery [ Upstream commit 1a5665fc8d7a000671ebd3fe69c6f9acf1e0dcd9 ] When SSU/enter hibern8 fail in WLUN suspend flow, trigger the error handler and return busy to break the suspend. Otherwise the consumer will get stuck in runtime suspend status. Fixes: b294ff3e3449 ("scsi: ufs: core: Enable power management for wlun") Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20221208072520.26210-1-peter.wang@mediatek.com Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-01-18 11:48:53 +01:00
Bart Van Assche	513fdf0b8e	scsi: ufs: Stop using the clock scaling lock in the error handler [ Upstream commit 5675c381ea51360b4968b78f23aefda73e3de90d ] Instead of locking and unlocking the clock scaling lock, surround the command queueing code with an RCU reader lock and call synchronize_rcu(). This patch prepares for removal of the clock scaling lock. Link: https://lore.kernel.org/r/20211203231950.193369-16-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 1a5665fc8d7a ("scsi: ufs: core: WLUN suspend SSU/enter hibern8 fail recovery") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-01-18 11:48:53 +01:00
Shin'ichiro Kawasaki	13259b60b7	scsi: mpi3mr: Refer CONFIG_SCSI_MPI3MR in Makefile [ Upstream commit f0a43ba6c66cc0688e2748d986a1459fdd3442ef ] When Kconfig item CONFIG_SCSI_MPI3MR was introduced for mpi3mr driver, the Makefile of the driver was not modified to refer the Kconfig item. As a result, mpi3mr.ko is built regardless of the Kconfig item value y or m. Also, if 'make localmodconfig' can not find the Kconfig item in the Makefile, then it does not generate CONFIG_SCSI_MPI3MR=m even when mpi3mr.ko is loaded on the system. Refer to the Kconfig item to avoid the issues. Fixes: c4f7ac64616e ("scsi: mpi3mr: Add mpi30 Rev-R headers and Kconfig") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20221207023659.2411785-1-shinichiro.kawasaki@wdc.com Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-01-18 11:48:53 +01:00
Arun Easi	d3871af13a	scsi: qla2xxx: Fix crash when I/O abort times out commit 68ad83188d782b2ecef2e41ac245d27e0710fe8e upstream. While performing CPU hotplug, a crash with the following stack was seen: Call Trace: qla24xx_process_response_queue+0x42a/0x970 [qla2xxx] qla2x00_start_nvme_mq+0x3a2/0x4b0 [qla2xxx] qla_nvme_post_cmd+0x166/0x240 [qla2xxx] nvme_fc_start_fcp_op.part.0+0x119/0x2e0 [nvme_fc] blk_mq_dispatch_rq_list+0x17b/0x610 __blk_mq_sched_dispatch_requests+0xb0/0x140 blk_mq_sched_dispatch_requests+0x30/0x60 __blk_mq_run_hw_queue+0x35/0x90 __blk_mq_delay_run_hw_queue+0x161/0x180 blk_execute_rq+0xbe/0x160 __nvme_submit_sync_cmd+0x16f/0x220 [nvme_core] nvmf_connect_admin_queue+0x11a/0x170 [nvme_fabrics] nvme_fc_create_association.cold+0x50/0x3dc [nvme_fc] nvme_fc_connect_ctrl_work+0x19/0x30 [nvme_fc] process_one_work+0x1e8/0x3c0 On abort timeout, completion was called without checking if the I/O was already completed. Verify that I/O and abort request are indeed outstanding before attempting completion. Fixes: 71c80b75ce8f ("scsi: qla2xxx: Do command completion on abort timeout") Reported-by: Marco Patalano <mpatalan@redhat.com> Tested-by: Marco Patalano <mpatalan@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20221129092634.15347-1-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-12-31 13:14:47 +01:00
Nathan Chancellor	4f6b206998	scsi: elx: libefc: Fix second parameter type in state callbacks [ Upstream commit 3d75e766b58a7410d4e835c534e1b4664a8f62d0 ] With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG), indirect call targets are validated against the expected function pointer prototype to make sure the call target is valid to help mitigate ROP attacks. If they are not identical, there is a failure at run time, which manifests as either a kernel panic or thread getting killed. A proposed warning in clang aims to catch these at compile time, which reveals: drivers/scsi/elx/libefc/efc_node.c:811:22: error: incompatible function pointer types assigning to 'void ()(struct efc_sm_ctx , u32, void )' (aka 'void ()(struct efc_sm_ctx , unsigned int, void )') from 'void ()(struct efc_sm_ctx , enum efc_sm_event, void )' [-Werror,-Wincompatible-function-pointer-types-strict] ctx->current_state = state; ^ ~~~~~ drivers/scsi/elx/libefc/efc_node.c:878:21: error: incompatible function pointer types assigning to 'void ()(struct efc_sm_ctx , u32, void )' (aka 'void ()(struct efc_sm_ctx , unsigned int, void )') from 'void ()(struct efc_sm_ctx , enum efc_sm_event, void )' [-Werror,-Wincompatible-function-pointer-types-strict] node->nodedb_state = state; ^ ~~~~~ drivers/scsi/elx/libefc/efc_node.c:905:6: error: incompatible function pointer types assigning to 'void ()(struct efc_sm_ctx , enum efc_sm_event, void )' from 'void ()(struct efc_sm_ctx , u32, void )' (aka 'void ()(struct efc_sm_ctx , unsigned int, void )') [-Werror,-Wincompatible-function-pointer-types-strict] pf = node->nodedb_state; ^ ~~~~~~~~~~~~~~~~~~ drivers/scsi/elx/libefc/efc_device.c:455:22: error: incompatible function pointer types assigning to 'void ()(struct efc_sm_ctx , u32, void )' (aka 'void ()(struct efc_sm_ctx , unsigned int, void )') from 'void (struct efc_sm_ctx , enum efc_sm_event, void )' [-Werror,-Wincompatible-function-pointer-types-strict] node->nodedb_state = __efc_d_init; ^ ~~~~~~~~~~~~ drivers/scsi/elx/libefc/efc_sm.c:41:22: error: incompatible function pointer types assigning to 'void ()(struct efc_sm_ctx , u32, void )' (aka 'void ()(struct efc_sm_ctx , unsigned int, void )') from 'void ()(struct efc_sm_ctx , enum efc_sm_event, void )' [-Werror,-Wincompatible-function-pointer-types-strict] ctx->current_state = state; ^ ~~~~~ The type of the second parameter in the prototypes of ->current_state() and ->nodedb_state() ('u32') does not match the implementations, which have a second parameter type of 'enum efc_sm_event'. Update the prototypes to have the correct second parameter type, clearing up all the warnings and CFI failures. Link: https://github.com/ClangBuiltLinux/linux/issues/1750 Reported-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20221102161906.2781508-1-nathan@kernel.org Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:44 +01:00
Bart Van Assche	23f0e9f863	scsi: ufs: Reduce the START STOP UNIT timeout [ Upstream commit dcd5b7637c6d442d957f73780a03047413ed3a10 ] Reduce the START STOP UNIT command timeout to one second since on Android devices a kernel panic is triggered if an attempt to suspend the system takes more than 20 seconds. One second should be enough for the START STOP UNIT command since this command completes in less than a millisecond for the UFS devices I have access to. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-7-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:44 +01:00
Justin Tee	2cf66428a2	scsi: lpfc: Fix hard lockup when reading the rx_monitor from debugfs [ Upstream commit c44e50f4a0ec00c2298f31f91bc2c3e9bbd81c7e ] During I/O and simultaneous cat of /sys/kernel/debug/lpfc/fnX/rx_monitor, a hard lockup similar to the call trace below may occur. The spin_lock_bh in lpfc_rx_monitor_report is not protecting from timer interrupts as expected, so change the strength of the spin lock to _irq. Kernel panic - not syncing: Hard LOCKUP CPU: 3 PID: 110402 Comm: cat Kdump: loaded exception RIP: native_queued_spin_lock_slowpath+91 [IRQ stack] native_queued_spin_lock_slowpath at ffffffffb814e30b _raw_spin_lock at ffffffffb89a667a lpfc_rx_monitor_record at ffffffffc0a73a36 [lpfc] lpfc_cmf_timer at ffffffffc0abbc67 [lpfc] __hrtimer_run_queues at ffffffffb8184250 hrtimer_interrupt at ffffffffb8184ab0 smp_apic_timer_interrupt at ffffffffb8a026ba apic_timer_interrupt at ffffffffb8a01c4f [End of IRQ stack] apic_timer_interrupt at ffffffffb8a01c4f lpfc_rx_monitor_report at ffffffffc0a73c80 [lpfc] lpfc_rx_monitor_read at ffffffffc0addde1 [lpfc] full_proxy_read at ffffffffb83e7fc3 vfs_read at ffffffffb833fe71 ksys_read at ffffffffb83402af do_syscall_64 at ffffffffb800430b entry_SYSCALL_64_after_hwframe at ffffffffb8a000ad Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20221017164323.14536-2-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:43 +01:00
Gaosheng Cui	1895e908b3	scsi: snic: Fix possible UAF in snic_tgt_create() [ Upstream commit e118df492320176af94deec000ae034cc92be754 ] Smatch reports a warning as follows: drivers/scsi/snic/snic_disc.c:307 snic_tgt_create() warn: '&tgt->list' not removed from list If device_add() fails in snic_tgt_create(), tgt will be freed, but tgt->list will not be removed from snic->disc.tgt_list, then list traversal may cause UAF. Remove from snic->disc.tgt_list before free(). Fixes: c8806b6c9e82 ("snic: driver for Cisco SCSI HBA") Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Link: https://lore.kernel.org/r/20221117035100.2944812-1-cuigaosheng1@huawei.com Acked-by: Narsimhulu Musini <nmusini@cisco.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:25 +01:00
Chen Zhongjin	09a60f908d	scsi: fcoe: Fix transport not deattached when fcoe_if_init() fails [ Upstream commit 4155658cee394b22b24c6d64e49247bf26d95b92 ] fcoe_init() calls fcoe_transport_attach(&fcoe_sw_transport), but when fcoe_if_init() fails, &fcoe_sw_transport is not detached and leaves freed &fcoe_sw_transport on fcoe_transports list. This causes panic when reinserting module. BUG: unable to handle page fault for address: fffffbfff82e2213 RIP: 0010:fcoe_transport_attach+0xe1/0x230 [libfcoe] Call Trace: <TASK> do_one_initcall+0xd0/0x4e0 load_module+0x5eee/0x7210 ... Fixes: 78a582463c1e ("[SCSI] fcoe: convert fcoe.ko to become an fcoe transport provider driver") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Link: https://lore.kernel.org/r/20221115092442.133088-1-chenzhongjin@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:25 +01:00
Shang XiaoJing	e59da17205	scsi: ipr: Fix WARNING in ipr_init() [ Upstream commit e6f108bffc3708ddcff72324f7d40dfcd0204894 ] ipr_init() will not call unregister_reboot_notifier() when pci_register_driver() fails, which causes a WARNING. Call unregister_reboot_notifier() when pci_register_driver() fails. notifier callback ipr_halt [ipr] already registered WARNING: CPU: 3 PID: 299 at kernel/notifier.c:29 notifier_chain_register+0x16d/0x230 Modules linked in: ipr(+) xhci_pci_renesas xhci_hcd ehci_hcd usbcore led_class gpu_sched drm_buddy video wmi drm_ttm_helper ttm drm_display_helper drm_kms_helper drm drm_panel_orientation_quirks agpgart cfbft CPU: 3 PID: 299 Comm: modprobe Tainted: G W 6.1.0-rc1-00190-g39508d23b672-dirty #332 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 RIP: 0010:notifier_chain_register+0x16d/0x230 Call Trace: <TASK> __blocking_notifier_chain_register+0x73/0xb0 ipr_init+0x30/0x1000 [ipr] do_one_initcall+0xdb/0x480 do_init_module+0x1cf/0x680 load_module+0x6a50/0x70a0 __do_sys_finit_module+0x12f/0x1c0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: f72919ec2bbb ("[SCSI] ipr: implement shutdown changes and remove obsolete write cache parameter") Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Link: https://lore.kernel.org/r/20221113064513.14028-1-shangxiaojing@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:25 +01:00
Yang Yingliang	c444f58fda	scsi: scsi_debug: Fix possible name leak in sdebug_add_host_helper() [ Upstream commit e6d773f93a49e0eda88a903a2a6542ca83380eb1 ] Afer commit 1fa5ae857bb1 ("driver core: get rid of struct device's bus_id string array"), the name of device is allocated dynamically, it needs be freed when device_register() returns error. As comment of device_register() says, one should use put_device() to give up the reference in the error path. Fix this by calling put_device(), then the name can be freed in kobject_cleanup(), and sdbg_host is freed in sdebug_release_adapter(). When the device release is not set, it means the device is not initialized. We can not call put_device() in this case. Use kfree() to free memory. Fixes: 1fa5ae857bb1 ("driver core: get rid of struct device's bus_id string array") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221112131010.3757845-1-yangyingliang@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:25 +01:00
Yang Yingliang	4e4968dfb5	scsi: fcoe: Fix possible name leak when device_register() fails [ Upstream commit 47b6a122c7b69a876c7ee2fc064a26b09627de9d ] If device_register() returns an error, the name allocated by dev_set_name() needs to be freed. As the comment of device_register() says, one should use put_device() to give up the reference in the error path. Fix this by calling put_device(), then the name can be freed in kobject_cleanup(). The 'fcf' is freed in fcoe_fcf_device_release(), so the kfree() in the error path can be removed. The 'ctlr' is freed in fcoe_ctlr_device_release(), so don't use the error label, just return NULL after calling put_device(). Fixes: 9a74e884ee71 ("[SCSI] libfcoe: Add fcoe_sysfs") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221112094310.3633291-1-yangyingliang@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:25 +01:00
Harshit Mogalapalli	0f5006d7d1	scsi: scsi_debug: Fix a warning in resp_report_zones() [ Upstream commit 07f2ca139d9a7a1ba71c4c03997c8de161db2346 ] As 'alloc_len' is user controlled data, if user tries to allocate memory larger than(>=) MAX_ORDER, then kcalloc() will fail, it creates a stack trace and messes up dmesg with a warning. Add __GFP_NOWARN in order to avoid too large allocation warning. This is detected by static analysis using smatch. Fixes: 7db0e0c8190a ("scsi: scsi_debug: Fix buffer size of REPORT ZONES command") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20221112070612.2121535-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:25 +01:00
Harshit Mogalapalli	2432719b1a	scsi: scsi_debug: Fix a warning in resp_verify() [ Upstream commit ed0f17b748b20271cb568c7ca0b23b120316a47d ] As 'vnum' is controlled by user, so if user tries to allocate memory larger than(>=) MAX_ORDER, then kcalloc() will fail, it creates a stack trace and messes up dmesg with a warning. Add __GFP_NOWARN in order to avoid too large allocation warning. This is detected by static analysis using smatch. Fixes: c3e2fe9222d4 ("scsi: scsi_debug: Implement VERIFY(10), add VERIFY(16)") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20221112070031.2121068-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:25 +01:00
Chen Zhongjin	038359eecc	scsi: efct: Fix possible memleak in efct_device_init() [ Upstream commit bb0cd225dd37df1f4a22e36dad59ff33178ecdfc ] In efct_device_init(), when efct_scsi_reg_fc_transport() fails, efct_scsi_tgt_driver_exit() is not called to release memory for efct_scsi_tgt_driver_init() and causes memleak: unreferenced object 0xffff8881020ce000 (size 2048): comm "modprobe", pid 465, jiffies 4294928222 (age 55.872s) backtrace: [<0000000021a1ef1b>] kmalloc_trace+0x27/0x110 [<000000004c3ed51c>] target_register_template+0x4fd/0x7b0 [target_core_mod] [<00000000f3393296>] efct_scsi_tgt_driver_init+0x18/0x50 [efct] [<00000000115de533>] 0xffffffffc0d90011 [<00000000d608f646>] do_one_initcall+0xd0/0x4e0 [<0000000067828cf1>] do_init_module+0x1cc/0x6a0 ... Fixes: 4df84e846624 ("scsi: elx: efct: Driver initialization routines") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Link: https://lore.kernel.org/r/20221111074046.57061-1-chenzhongjin@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:25 +01:00
Yang Yingliang	23053a7926	scsi: hpsa: Fix possible memory leak in hpsa_add_sas_device() [ Upstream commit fda34a5d304d0b98cc967e8763b52221b66dc202 ] If hpsa_sas_port_add_rphy() returns an error, the 'rphy' allocated in sas_end_device_alloc() needs to be freed. Address this by calling sas_rphy_free() in the error path. Fixes: d04e62b9d63a ("hpsa: add in sas transport class") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221111043012.1074466-1-yangyingliang@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:25 +01:00
Yang Yingliang	2ab6d5927c	scsi: hpsa: Fix error handling in hpsa_add_sas_host() [ Upstream commit 4ef174a3ad9b5d73c1b6573e244ebba2b0d86eac ] hpsa_sas_port_add_phy() does: ... sas_phy_add() -> may return error here sas_port_add_phy() ... Whereas hpsa_free_sas_phy() does: ... sas_port_delete_phy() sas_phy_delete() ... If hpsa_sas_port_add_phy() returns an error, hpsa_free_sas_phy() can not be called to free the memory because the port and the phy have not been added yet. Replace hpsa_free_sas_phy() with sas_phy_free() and kfree() to avoid kernel crash in this case. Fixes: d04e62b9d63a ("hpsa: add in sas transport class") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221110151129.394389-1-yangyingliang@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:24 +01:00
Yang Yingliang	6a92129c8f	scsi: mpt3sas: Fix possible resource leaks in mpt3sas_transport_port_add() [ Upstream commit 78316e9dfc24906dd474630928ed1d3c562b568e ] In mpt3sas_transport_port_add(), if sas_rphy_add() returns error, sas_rphy_free() needs be called to free the resource allocated in sas_end_device_alloc(). Otherwise a kernel crash will happen: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000108 CPU: 45 PID: 37020 Comm: bash Kdump: loaded Tainted: G W 6.1.0-rc1+ #189 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : device_del+0x54/0x3d0 lr : device_del+0x37c/0x3d0 Call trace: device_del+0x54/0x3d0 attribute_container_class_device_del+0x28/0x38 transport_remove_classdev+0x6c/0x80 attribute_container_device_trigger+0x108/0x110 transport_remove_device+0x28/0x38 sas_rphy_remove+0x50/0x78 [scsi_transport_sas] sas_port_delete+0x30/0x148 [scsi_transport_sas] do_sas_phy_delete+0x78/0x80 [scsi_transport_sas] device_for_each_child+0x68/0xb0 sas_remove_children+0x30/0x50 [scsi_transport_sas] sas_rphy_remove+0x38/0x78 [scsi_transport_sas] sas_port_delete+0x30/0x148 [scsi_transport_sas] do_sas_phy_delete+0x78/0x80 [scsi_transport_sas] device_for_each_child+0x68/0xb0 sas_remove_children+0x30/0x50 [scsi_transport_sas] sas_remove_host+0x20/0x38 [scsi_transport_sas] scsih_remove+0xd8/0x420 [mpt3sas] Because transport_add_device() is not called when sas_rphy_add() fails, the device is not added. When sas_rphy_remove() is subsequently called to remove the device in the remove() path, a NULL pointer dereference happens. Fixes: f92363d12359 ("[SCSI] mpt3sas: add new driver supporting 12GB SAS") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221109032403.1636422-1-yangyingliang@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:24 +01:00
Yuan Can	bfe10a1d9f	scsi: hpsa: Fix possible memory leak in hpsa_init_one() [ Upstream commit 9c9ff300e0de07475796495d86f449340d454a0c ] The hpda_alloc_ctlr_info() allocates h and its field reply_map. However, in hpsa_init_one(), if alloc_percpu() failed, the hpsa_init_one() jumps to clean1 directly, which frees h and leaks the h->reply_map. Fix by calling hpda_free_ctlr_info() to release h->replay_map and h instead free h directly. Fixes: 8b834bff1b73 ("scsi: hpsa: fix selection of reply queue") Signed-off-by: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20221122015751.87284-1-yuancan@huawei.com Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:24 +01:00
Harshit Mogalapalli	0328ca389a	scsi: scsi_debug: Fix a warning in resp_write_scat() [ Upstream commit 216e179724c1d9f57a8ababf8bd7aaabef67f01b ] As 'lbdof_blen' is coming from user, if the size in kzalloc() is >= MAX_ORDER then we hit a warning. Call trace: sg_ioctl sg_ioctl_common scsi_ioctl sg_scsi_ioctl blk_execute_rq blk_mq_sched_insert_request blk_mq_run_hw_queue __blk_mq_delay_run_hw_queue __blk_mq_run_hw_queue blk_mq_sched_dispatch_requests __blk_mq_sched_dispatch_requests blk_mq_dispatch_rq_list scsi_queue_rq scsi_dispatch_cmd scsi_debug_queuecommand schedule_resp resp_write_scat If you try to allocate a memory larger than(>=) MAX_ORDER, then kmalloc() will definitely fail. It creates a stack trace and messes up dmesg. The user controls the size here so if they specify a too large size it will fail. Add __GFP_NOWARN in order to avoid too large allocation warning. This is detected by static analysis using smatch. Fixes: 481b5e5c7949 ("scsi: scsi_debug: add resp_write_scat function") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20221111100526.1790533-1-harshit.m.mogalapalli@oracle.com Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:23 +01:00
Bart Van Assche	2a26849d79	scsi: qla2xxx: Fix set-but-not-used variable warnings [ Upstream commit 4fb2169d66b837a2986f569f5d5b81f79e6e4a4c ] Fix the following two compiler warnings: drivers/scsi/qla2xxx/qla_init.c: In function ‘qla24xx_async_abort_cmd’: drivers/scsi/qla2xxx/qla_init.c:171:17: warning: variable ‘bail’ set but not used [-Wunused-but-set-variable] 171 \| uint8_t bail; \| ^~~~ drivers/scsi/qla2xxx/qla_init.c: In function ‘qla2x00_async_tm_cmd’: drivers/scsi/qla2xxx/qla_init.c:2023:17: warning: variable ‘bail’ set but not used [-Wunused-but-set-variable] 2023 \| uint8_t bail; \| ^~~~ Cc: Arun Easi <arun.easi@qlogic.com> Cc: Giridhar Malavali <giridhar.malavali@qlogic.com> Fixes: feafb7b1714c ("[SCSI] qla2xxx: Fix vport delete issues") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221031224818.2607882-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:23 +01:00
Bart Van Assche	8d50ccfbe2	scsi: core: Fix a race between scsi_done() and scsi_timeout() [ Upstream commit 978b7922d3dca672b41bb4b8ce6c06ab77112741 ] If there is a race between scsi_done() and scsi_timeout() and if scsi_timeout() loses the race, scsi_timeout() should not reset the request timer. Hence change the return value for this case from BLK_EH_RESET_TIMER into BLK_EH_DONE. Although the block layer holds a reference on a request (req->ref) while calling a timeout handler, restarting the timer (blk_add_timer()) while a request is being completed is racy. Reviewed-by: Mike Christie <michael.christie@oracle.com> Cc: Keith Busch <kbusch@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Hannes Reinecke <hare@suse.de> Reported-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: 15f73f5b3e59 ("blk-mq: move failure injection out of blk_mq_complete_request") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221018202958.1902564-2-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-31 13:14:21 +01:00
Zhou Guanghui	97f47617e8	scsi: iscsi: Fix possible memory leak when device_register() failed [ Upstream commit f014165faa7b953b81dcbf18835936e5f8d01f2a ] If device_register() returns error, the name allocated by the dev_set_name() need be freed. As described in the comment of device_register(), we should use put_device() to give up the reference in the error path. Fix this by calling put_device(), the name will be freed in the kobject_cleanup(), and this patch modified resources will be released by calling the corresponding callback function in the device_release(). Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com> Link: https://lore.kernel.org/r/20221110033729.1555-1-zhouguanghui1@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-02 17:41:11 +01:00
Michael Kelley	f9bc4a18e7	scsi: storvsc: Fix handling of srb_status and capacity change events [ Upstream commit b8a5376c321b4669f7ffabc708fd30c3970f3084 ] Current handling of the srb_status is incorrect. Commit 52e1b3b3daa9 ("scsi: storvsc: Correctly handle multiple flags in srb_status") is based on srb_status being a set of flags, when in fact only the 2 high order bits are flags and the remaining 6 bits are an integer status. Because the integer values of interest mostly look like flags, the code actually works when treated that way. But in the interest of correctness going forward, fix this by treating the low 6 bits of srb_status as an integer status code. Add handling for SRB_STATUS_INVALID_REQUEST, which was the original intent of commit 52e1b3b3daa9. Furthermore, treat the ERROR, ABORTED, and INVALID_REQUEST srb status codes as essentially equivalent for the cases we care about. There's no harm in doing so, and it isn't always clear which status code current or older versions of Hyper-V report for particular conditions. Treating the srb status codes as equivalent has the additional benefit of ensuring that capacity change events result in an immediate rescan so that the new size is known to Linux. Existing code checks SCSI sense data for capacity change events when the srb status is ABORTED. But capacity change events are also being observed when Hyper-V reports the srb status as ERROR. Without the immediate rescan, the new size isn't known until something else causes a rescan (such as running fdisk to expand a partition), and in the meantime, tools such as "lsblk" continue to report the old size. Fixes: 52e1b3b3daa9 ("scsi: storvsc: Correctly handle multiple flags in srb_status") Reported-by: Juan Tian <juantian@microsoft.com> Signed-off-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1668019722-1983-1-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-02 17:41:03 +01:00
Bart Van Assche	b90e6234f5	scsi: scsi_debug: Make the READ CAPACITY response compliant with ZBC [ Upstream commit ecb8c2580d37dbb641451049376d80c8afaa387f ] From ZBC-1: - RC BASIS = 0: The RETURNED LOGICAL BLOCK ADDRESS field indicates the highest LBA of a contiguous range of zones that are not sequential write required zones starting with the first zone. - RC BASIS = 1: The RETURNED LOGICAL BLOCK ADDRESS field indicates the LBA of the last logical block on the logical unit. The current scsi_debug READ CAPACITY response does not comply with the above if there are one or more sequential write required zones. SCSI initiators need a way to retrieve the largest valid LBA from SCSI devices. Reporting the largest valid LBA if there are one or more sequential zones requires to set the RC BASIS field in the READ CAPACITY response to one. Hence this patch. Cc: Douglas Gilbert <dgilbert@interlog.com> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com> Suggested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221102193248.3177608-1-bvanassche@acm.org Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-02 17:41:02 +01:00
Brian King	cdbba6a4de	scsi: ibmvfc: Avoid path failures during live migration [ Upstream commit 62fa3ce05d5d73c5eccc40b2db493f55fecfc446 ] Fix an issue reported when performing a live migration when multipath is configured with a short fast fail timeout of 5 seconds and also to have no_path_retry set to fail. In this scenario, all paths would go into the devloss state while the ibmvfc driver went through discovery to log back in. On a loaded system, the discovery might take longer than 5 seconds, which was resulting in all paths being marked failed, which then resulted in a read only filesystem. This patch changes the migration code in ibmvfc to avoid deleting rports at all in this scenario, so we avoid losing all paths. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20221026181356.148517-1-brking@linux.vnet.ibm.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-02 17:41:02 +01:00
Yuan Can	71beab7119	scsi: scsi_debug: Fix possible UAF in sdebug_add_host_helper() [ Upstream commit e208a1d795a08d1ac0398c79ad9c58106531bcc5 ] If device_register() fails in sdebug_add_host_helper(), it will goto clean and sdbg_host will be freed, but sdbg_host->host_list will not be removed from sdebug_host_list, then list traversal may cause UAF. Fix it. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20221117084421.58918-1-yuancan@huawei.com Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-11-26 09:24:50 +01:00
Yang Yingliang	2f21d653c6	scsi: scsi_transport_sas: Fix error handling in sas_phy_add() [ Upstream commit 5d7bebf2dfb0dc97aac1fbace0910e557ecdb16f ] If transport_add_device() fails in sas_phy_add(), the kernel will crash trying to delete the device in transport_remove_device() called from sas_remove_host(). Unable to handle kernel NULL pointer dereference at virtual address 0000000000000108 CPU: 61 PID: 42829 Comm: rmmod Kdump: loaded Tainted: G W 6.1.0-rc1+ #173 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : device_del+0x54/0x3d0 lr : device_del+0x37c/0x3d0 Call trace: device_del+0x54/0x3d0 attribute_container_class_device_del+0x28/0x38 transport_remove_classdev+0x6c/0x80 attribute_container_device_trigger+0x108/0x110 transport_remove_device+0x28/0x38 sas_phy_delete+0x30/0x60 [scsi_transport_sas] do_sas_phy_delete+0x6c/0x80 [scsi_transport_sas] device_for_each_child+0x68/0xb0 sas_remove_children+0x40/0x50 [scsi_transport_sas] sas_remove_host+0x20/0x38 [scsi_transport_sas] hisi_sas_remove+0x40/0x68 [hisi_sas_main] hisi_sas_v2_remove+0x20/0x30 [hisi_sas_v2_hw] platform_remove+0x2c/0x60 Fix this by checking and handling return value of transport_add_device() in sas_phy_add(). Fixes: c7ebbbce366c ("[SCSI] SAS transport class") Suggested-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221107124828.115557-1-yangyingliang@huawei.com Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-11-26 09:24:35 +01:00
Uday Shankar	931c97a54c	scsi: core: Restrict legal sdev_state transitions via sysfs [ Upstream commit 2331ce6126be8864b39490e705286b66e2344aac ] Userspace can currently write to sysfs to transition sdev_state to RUNNING or OFFLINE from any source state. This causes issues because proper transitioning out of some states involves steps besides just changing sdev_state, so allowing userspace to change sdev_state regardless of the source state can result in inconsistencies; e.g. with ISCSI we can end up with sdev_state == SDEV_RUNNING while the device queue is quiesced. Any task attempting I/O on the device will then hang, and in more recent kernels, iscsid will hang as well. More detail about this bug is provided in my first attempt: https://groups.google.com/g/open-iscsi/c/PNKca4HgPDs/m/CXaDkntOAQAJ Link: https://lore.kernel.org/r/20220924000241.2967323-1-ushankar@purestorage.com Signed-off-by: Uday Shankar <ushankar@purestorage.com> Suggested-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-11-10 18:15:34 +01:00
James Smart	21d65b3516	scsi: lpfc: Rework MIB Rx Monitor debug info logic [ Upstream commit bd269188ea94e40ab002cad7b0df8f12b8f0de54 ] The kernel test robot reported the following sparse warning: arch/arm64/include/asm/cmpxchg.h:88:1: sparse: sparse: cast truncates bits from constant value (369 becomes 69) On arm64, atomic_xchg only works on 8-bit byte fields. Thus, the macro usage of LPFC_RXMONITOR_TABLE_IN_USE can be unintentionally truncated leading to all logic involving the LPFC_RXMONITOR_TABLE_IN_USE macro to not work properly. Replace the Rx Table atomic_t indexing logic with a new lpfc_rx_info_monitor structure that holds a circular ring buffer. For locking semantics, a spinlock_t is used. Link: https://lore.kernel.org/r/20220819011736.14141-4-jsmart2021@gmail.com Fixes: 17b27ac59224 ("scsi: lpfc: Add rx monitoring statistics") Cc: <stable@vger.kernel.org> # v5.15+ Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-11-10 18:15:24 +01:00
James Smart	d70705e131	scsi: lpfc: Adjust CMF total bytes and rxmonitor [ Upstream commit a6269f837045acb02904f31f05acde847ec8f8a7 ] Calculate any extra bytes needed to account for timer accuracy. If we are less than LPFC_CMF_INTERVAL, then calculate the adjustment needed for total to reflect a full LPFC_CMF_INTERVAL. Add additional info to rxmonitor, and adjust some log formatting. Link: https://lore.kernel.org/r/20211204002644.116455-7-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: bd269188ea94 ("scsi: lpfc: Rework MIB Rx Monitor debug info logic") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-11-10 18:15:23 +01:00
James Smart	9ebc6e8ad1	scsi: lpfc: Adjust bytes received vales during cmf timer interval [ Upstream commit d5ac69b332d8859d1f8bd5d4dee31f3267f6b0d2 ] The newly added congestion mgmt framework is seeing unexpected congestion FPINs and signals. In analysis, time values given to the adapter are not at hard time intervals. Thus the drift vs the transfer count seen is affecting how the framework manages things. Adjust counters to cover the drift. Link: https://lore.kernel.org/r/20210910233159.115896-11-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: bd269188ea94 ("scsi: lpfc: Rework MIB Rx Monitor debug info logic") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-11-10 18:15:23 +01:00
Yu Kuai	aad798a0b3	scsi: sd: Revert "scsi: sd: Remove a local variable" This reverts commit 84f7a9de0602704bbec774a6c7f7c8c4994bee9c. Because it introduces a problem that rq->__data_len is set to the wrong value. before the patch: 1) nr_bytes = rq->__data_len 2) rq->__data_len = sdp->sector_size 3) scsi_init_io() 4) rq->__data_len = nr_bytes after the patch: 1) rq->__data_len = sdp->sector_size 2) scsi_init_io() 3) rq->__data_len = rq->__data_len -> __data_len is wrong It will cause that io can only complete one segment each time, and the io will requeue in scsi_io_completion_action(), which will cause severe performance degradation. Scsi write same is removed in commit e383e16e84e9 ("scsi: sd: Remove WRITE_SAME support") from mainline, hence this patch is only needed for stable kernels. Fixes: 84f7a9de0602 ("scsi: sd: Remove a local variable") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:59:20 +09:00
James Smart	0b0d169723	Revert "scsi: lpfc: SLI path split: Refactor lpfc_iocbq" This reverts commit 1c5e670d6a5a8e7e99b51f45e79879f7828bd2ec. LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path Split mods were a 14-patch set, which refactors the driver from to split the sli-3 hw (now eol) from the sli-4 hw and use sli4 structures natively. The patches are highly inter-related. Given only some of the patches were included, it created a situation where FLOGI's fail, thus SLI Ports can't start communication. Reverting this patch as its one of the partial Path Split patches that was included. Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:59:14 +09:00
James Smart	7a0fce24de	Revert "scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4" This reverts commit c56cc7fefc3159049f94fb1318e48aa60cabf703. LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path Split mods were a 14-patch set, which refactors the driver from to split the sli-3 hw (now eol) from the sli-4 hw and use sli4 structures natively. The patches are highly inter-related. Given only some of the patches were included, it created a situation where FLOGI's fail, thus SLI Ports can't start communication. Reverting this patch as its one of the partial Path Split patches that was included. Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:59:14 +09:00
James Smart	7a36c9de43	Revert "scsi: lpfc: SLI path split: Refactor SCSI paths" This reverts commit b4543dbea84c8b64566dd0d626d63f8cbe977f61. LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path Split mods were a 14-patch set, which refactors the driver from to split the sli-3 hw (now eol) from the sli-4 hw and use sli4 structures natively. The patches are highly inter-related. Given only some of the patches were included, it created a situation where FLOGI's fail, thus SLI Ports can't start communication. Reverting this patch as its one of the partial Path Split patches that was included. NOTE: fixed a git revert error which caused a new line to be inserted: line 5755 of lpfc_scsi.c in lpfc_queuecommand + atomic_inc(&ndlp->cmd_pending); Removed the line Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:59:14 +09:00
James Smart	eb8be2dbfb	Revert "scsi: lpfc: Fix locking for lpfc_sli_iocbq_lookup()" This reverts commit 9a570069cdbbc28c4b5b4632d5c9369371ec739c. LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path Split mods were a 14-patch set, which refactors the driver from to split the sli-3 hw (now eol) from the sli-4 hw and use sli4 structures natively. The patches are highly inter-related. Given only some of the patches were included, it created a situation where FLOGI's fail, thus SLI Ports can't start communication. Reverting this patch as its a fix specific to the Path Split patches, which were partially included and now being pulled from 5.15. Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:59:14 +09:00
James Smart	065bf71a8a	Revert "scsi: lpfc: Fix element offset in __lpfc_sli_release_iocbq_s4()" This reverts commit 6e99860de6f4e286c386f533c4d35e7e283803d4. LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path Split mods were a 14-patch set, which refactors the driver from to split the sli-3 hw (now eol) from the sli-4 hw and use sli4 structures natively. The patches are highly inter-related. Given only some of the patches were included, it created a situation where FLOGI's fail, thus SLI Ports can't start communication. Reverting this patch as its a fix specific to the Path Split patches, which were partially included and now being pulled from 5.15. Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:59:14 +09:00
James Smart	97dc9076ea	Revert "scsi: lpfc: Resolve some cleanup issues following SLI path refactoring" This reverts commit 17bf429b913b9e7f8d2353782e24ed3a491bb2d8. LTS 5.15 pulled in several lpfc "SLI Path split" patches. The Path Split mods were a 14-patch set, which refactors the driver from to split the sli-3 hw (now eol) from the sli-4 hw and use sli4 structures natively. The patches are highly inter-related. Given only some of the patches were included, it created a situation where FLOGI's fail, thus SLI Ports can't start communication. Reverting this patch as its a fix specific to the Path Split patches, which were partially included and now being pulled from 5.15. Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:59:13 +09:00
Manish Rangankar	a2f0934e6b	scsi: qla2xxx: Use transport-defined speed mask for supported_speeds commit 0b863257c17c5f57a41e0a48de140ed026957a63 upstream. One of the sysfs values reported for supported_speeds was not valid (20Gb/s reported instead of 64Gb/s). Instead of driver internal speed mask definition, use speed mask defined in transport_fc for reporting host->supported_speeds. Link: https://lore.kernel.org/r/20220927115946.17559-1-njavali@marvell.com Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:59:12 +09:00
Rafael Mendonca	9749595feb	scsi: lpfc: Fix memory leak in lpfc_create_port() [ Upstream commit dc8e483f684a24cc06e1d5fa958b54db58855093 ] Commit 5e633302ace1 ("scsi: lpfc: vmid: Add support for VMID in mailbox command") introduced allocations for the VMID resources in lpfc_create_port() after the call to scsi_host_alloc(). Upon failure on the VMID allocations, the new code would branch to the 'out' label, which returns NULL without unwinding anything, thus skipping the call to scsi_host_put(). Fix the problem by creating a separate label 'out_free_vmid' to unwind the VMID resources and make the 'out_put_shost' label call only scsi_host_put(), as was done before the introduction of allocations for VMID. Fixes: 5e633302ace1 ("scsi: lpfc: vmid: Add support for VMID in mailbox command") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Link: https://lore.kernel.org/r/20220916035908.712799-1-rafaelmendsr@gmail.com Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-29 10:12:56 +02:00
Letu Ren	9d54de8660	scsi: 3w-9xxx: Avoid disabling device if failing to enable it [ Upstream commit 7eff437b5ee1309b34667844361c6bbb5c97df05 ] The original code will "goto out_disable_device" and call pci_disable_device() if pci_enable_device() fails. The kernel will generate a warning message like "3w-9xxx 0000:00:05.0: disabling already-disabled device". We shouldn't disable a device that failed to be enabled. A simple return is fine. Link: https://lore.kernel.org/r/20220829110115.38789-1-fantasquex@gmail.com Reported-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: Letu Ren <fantasquex@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-26 12:35:46 +02:00
Mike Christie	a26b065875	scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername() [ Upstream commit 57569c37f0add1b6489e1a1563c71519daf732cf ] Fix a NULL pointer crash that occurs when we are freeing the socket at the same time we access it via sysfs. The problem is that: 1. iscsi_sw_tcp_conn_get_param() and iscsi_sw_tcp_host_get_param() take the frwd_lock and do sock_hold() then drop the frwd_lock. sock_hold() does a get on the "struct sock". 2. iscsi_sw_tcp_release_conn() does sockfd_put() which does the last put on the "struct socket" and that does __sock_release() which sets the sock->ops to NULL. 3. iscsi_sw_tcp_conn_get_param() and iscsi_sw_tcp_host_get_param() then call kernel_getpeername() which accesses the NULL sock->ops. Above we do a get on the "struct sock", but we needed a get on the "struct socket". Originally, we just held the frwd_lock the entire time but in commit bcf3a2953d36 ("scsi: iscsi: iscsi_tcp: Avoid holding spinlock while calling getpeername()") we switched to refcount based because the network layer changed and started taking a mutex in that path, so we could no longer hold the frwd_lock. Instead of trying to maintain multiple refcounts, this just has us use a mutex for accessing the socket in the interface code paths. Link: https://lore.kernel.org/r/20220907221700.10302-1-michael.christie@oracle.com Fixes: bcf3a2953d36 ("scsi: iscsi: iscsi_tcp: Avoid holding spinlock while calling getpeername()") Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-26 12:35:16 +02:00
Mike Christie	e87fb1fcf8	scsi: iscsi: Run recv path from workqueue [ Upstream commit f1d269765ee29da56b32818b7a08054484ed89f2 ] We don't always want to run the recv path from the network softirq because when we have to have multiple sessions sharing the same CPUs, some sessions can eat up the NAPI softirq budget and affect other sessions or users. Allow us to queue the recv handling to the iscsi workqueue so we can have the scheduler/wq code try to balance the work and CPU use across all sessions' worker threads. Note: It wasn't the original intent of the change but a nice side effect is that for some workloads/configs we get a nice performance boost. For a simple read heavy test: fio --direct=1 --filename=/dev/dm-0 --rw=randread --bs=256K --ioengine=libaio --iodepth=128 --numjobs=4 where the iscsi threads, fio jobs, and rps_cpus share CPUs we see a 32% throughput boost. We also see increases for small I/O IOPs tests but it's not as high. Link: https://lore.kernel.org/r/20220616224557.115234-4-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 57569c37f0ad ("scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername()") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-26 12:35:16 +02:00
Mike Christie	c2af03a7c1	scsi: iscsi: Add recv workqueue helpers [ Upstream commit 8af809966c0b34cfacd8da9a412689b8e9910354 ] Add helpers to allow the drivers to run their recv paths from libiscsi's workqueue. Link: https://lore.kernel.org/r/20220616224557.115234-3-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 57569c37f0ad ("scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername()") Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-26 12:35:15 +02:00

1 2 3 4 5 ...

22289 Commits