Commit Graph

22206 Commits

Author SHA1 Message Date
Quinn Tran
9fd26c633e scsi: qla2xxx: edif: Fix EDIF bsg
Various EDIF bsgs did not properly fill out the reply_payload_rcv_len
field. This causes app to parse empty data in the return payload.

Link: https://lore.kernel.org/r/20211026115412.27691-13-njavali@marvell.com
Fixes: 7ebb336e45 ("scsi: qla2xxx: edif: Add start + stop bsgs")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:52:00 -04:00
Quinn Tran
36f468bfe9 scsi: qla2xxx: edif: Fix inconsistent check of db_flags
db_flags field is a bit field. Replace value check with bit flag check.

Link: https://lore.kernel.org/r/20211026115412.27691-12-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:52:00 -04:00
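
The difference between a value check and a bit flag check, as described in 36f468bfe9, can be shown with a small user-space sketch. The flag names `DEMO_DB_ACTIVE` and `DEMO_DB_SHUTDOWN` are hypothetical and are not the driver's identifiers; only the comparison pattern is the point.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; the real qla2xxx names and values may differ. */
#define DEMO_DB_ACTIVE   (1u << 0)
#define DEMO_DB_SHUTDOWN (1u << 1)

/* Value check: true only when ACTIVE is the sole bit set. */
static bool demo_active_by_value(uint32_t db_flags)
{
	return db_flags == DEMO_DB_ACTIVE;
}

/* Bit flag check: true whenever the ACTIVE bit is set. */
static bool demo_active_by_bit(uint32_t db_flags)
{
	return (db_flags & DEMO_DB_ACTIVE) != 0;
}
```

With both bits set, the value check reports "not active" while the bit check reports "active", which is the kind of inconsistency a bit field invites when it is compared as a whole value.
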
Quinn Tran
0f6d600a26 scsi: qla2xxx: edif: Increase ELS payload
Currently, firmware limits ELS payload to FC frame size/2112.  This patch
adjusts memory buffer size to be able to handle max ELS payload.

Link: https://lore.kernel.org/r/20211026115412.27691-11-njavali@marvell.com
Fixes: 84318a9f01 ("scsi: qla2xxx: edif: Add send, receive, and accept for auth_els")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:52:00 -04:00
Quinn Tran
91f6f5fbe8 scsi: qla2xxx: edif: Reduce connection thrash
On ipsec start by remote port, target port may use RSCN to trigger
initiator to relogin. If driver is already in the process of a relogin,
then ignore the RSCN and allow the current relogin to continue. This
reduces thrashing of the connection.

Link: https://lore.kernel.org/r/20211026115412.27691-10-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:52:00 -04:00
Quinn Tran
6c9998ce4b scsi: qla2xxx: edif: Tweak trace message
Modify trace messages for additional debuggability.

Link: https://lore.kernel.org/r/20211026115412.27691-9-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:52:00 -04:00
Quinn Tran
8062b742d3 scsi: qla2xxx: edif: Replace list_for_each_safe with list_for_each_entry_safe
This patch is per review comment by Hannes Reinecke from previous
submission to replace list_for_each_safe with list_for_each_entry_safe.

Link: https://lore.kernel.org/r/20211026115412.27691-8-njavali@marvell.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:52:00 -04:00
Quinn Tran
b1af26c245 scsi: qla2xxx: edif: Flush stale events and msgs on session down
On session down, driver will flush all stale messages and doorbell
events. This prevents authentication application from having to process
stale data.

Link: https://lore.kernel.org/r/20211026115412.27691-7-njavali@marvell.com
Fixes: 4de067e5df ("scsi: qla2xxx: edif: Add N2N support for EDIF")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Co-developed-by: Karunakara Merugu <kmerugu@marvell.com>
Signed-off-by: Karunakara Merugu <kmerugu@marvell.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:51:59 -04:00
Quinn Tran
b492d6a488 scsi: qla2xxx: edif: Fix app start delay
The current driver pauses unnecessarily, waiting for each session to reach a
certain state before allowing the app start call to return. In larger
environments, this introduces a long delay. Originally the delay was meant to
synchronize app and driver. However, with the current implementation the
two sides use various events to synchronize their state.

The same is applied to the authentication failure call.

Link: https://lore.kernel.org/r/20211026115412.27691-6-njavali@marvell.com
Fixes: 4de067e5df ("scsi: qla2xxx: edif: Add N2N support for EDIF")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:51:59 -04:00
Quinn Tran
8e6d5df3cb scsi: qla2xxx: edif: Fix app start fail
On app start, all sessions need to be reset to see if secure connection can
be made. Fix the broken check which prevents that process.

Link: https://lore.kernel.org/r/20211026115412.27691-5-njavali@marvell.com
Fixes: 4de067e5df ("scsi: qla2xxx: edif: Add N2N support for EDIF")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:51:59 -04:00
Quinn Tran
0b7a9fd934 scsi: qla2xxx: Turn off target reset during issue_lip
When user uses issue_lip to do link bounce, driver sends additional target
reset to remote device before resetting the link. The target reset would
affect other paths with active I/Os. This patch will remove the unnecessary
target reset.

Link: https://lore.kernel.org/r/20211026115412.27691-4-njavali@marvell.com
Fixes: 5854771e31 ("[SCSI] qla2xxx: Add ISPFX00 specific bus reset routine")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:51:59 -04:00
Quinn Tran
c98c5daaa2 scsi: qla2xxx: Fix gnl list corruption
The current code deletes and adds list elements both inside and outside of
lock protection. This patch moves the deletion behind the lock.

list_add double add: new=ffff9130b5eb89f8, prev=ffff9130b5eb89f8,
    next=ffff9130c6a715f0.
 ------------[ cut here ]------------
 kernel BUG at lib/list_debug.c:31!
 invalid opcode: 0000 [#1] SMP PTI
 CPU: 1 PID: 182395 Comm: kworker/1:37 Kdump: loaded Tainted: G W  OE
 --------- -  - 4.18.0-193.el8.x86_64 #1
 Hardware name: HP ProLiant DL160 Gen8, BIOS J03 02/10/2014
 Workqueue: qla2xxx_wq qla2x00_iocb_work_fn [qla2xxx]
 RIP: 0010:__list_add_valid+0x41/0x50
 Code: 85 94 00 00 00 48 39 c7 74 0b 48 39 d7 74 06 b8 01 00 00 00 c3 48 89 f2
 4c 89 c1 48 89 fe 48 c7 c7 60 83 ad 97 e8 4d bd ce ff <0f> 0b 0f 1f 00 66 2e
 0f 1f 84 00 00 00 00 00 48 8b 07 48 8b 57 08
 RSP: 0018:ffffaba306f47d68 EFLAGS: 00010046
 RAX: 0000000000000058 RBX: ffff9130b5eb8800 RCX: 0000000000000006
 RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffff9130b7456a00
 RBP: ffff9130c6a70a58 R08: 000000000008d7be R09: 0000000000000001
 R10: 0000000000000000 R11: 0000000000000001 R12: ffff9130c6a715f0
 R13: ffff9130b5eb8824 R14: ffff9130b5eb89f8 R15: ffff9130b5eb89f8
 FS:  0000000000000000(0000) GS:ffff9130b7440000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007efcaaef11a0 CR3: 000000005200a002 CR4: 00000000000606e0
 Call Trace:
  qla24xx_async_gnl+0x113/0x3c0 [qla2xxx]
  ? qla2x00_iocb_work_fn+0x53/0x80 [qla2xxx]
  ? process_one_work+0x1a7/0x3b0
  ? worker_thread+0x30/0x390
  ? create_worker+0x1a0/0x1a0
  ? kthread+0x112/0x130

Link: https://lore.kernel.org/r/20211026115412.27691-3-njavali@marvell.com
Fixes: 726b854870 ("qla2xxx: Add framework for async fabric discovery")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:51:59 -04:00
Quinn Tran
bb2ca6b3f0 scsi: qla2xxx: Relogin during fabric disturbance
For RSCN of type "Area, Domain, or Fabric", which indicate a portion or
entire fabric was disturbed, current driver does not set the scan_need flag
to indicate a session was affected by the disturbance. This in turn can
lead to I/O timeout and delay of relogin. Hence initiate relogin in the
event of fabric disturbance.

Link: https://lore.kernel.org/r/20211026115412.27691-2-njavali@marvell.com
Fixes: 1560bafdff ("scsi: qla2xxx: Use complete switch scan for RSCN events")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:51:59 -04:00
Christophe JAILLET
2c2934c80e scsi: elx: Use 'bitmap_zalloc()' when applicable
'sli4->ext[i].use_map' is a bitmap. Use 'bitmap_zalloc()' to simplify code,
improve the semantic and avoid some open-coded arithmetic in allocator
arguments.

Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.

Link: https://lore.kernel.org/r/2a0a83949fb896a0a236dcca94dfdc8486d489f5.1635104793.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:28:33 -04:00
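
The "open-coded arithmetic" that 2c2934c80e refers to is the rounding from a bit count to a number of `unsigned long` words. A minimal user-space sketch of that arithmetic follows; the `demo_*` helpers are hypothetical stand-ins built on `calloc()`/`free()` and are not the kernel's `bitmap_zalloc()`/`bitmap_free()`.

```c
#include <limits.h>
#include <stdlib.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned longs needed to hold nbits bits, rounded up. */
static size_t demo_bits_to_longs(size_t nbits)
{
	return (nbits + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG;
}

/* Zeroed bitmap allocation; the caller no longer repeats the rounding. */
static unsigned long *demo_bitmap_zalloc(size_t nbits)
{
	return calloc(demo_bits_to_longs(nbits), sizeof(unsigned long));
}

/* Matching release helper, kept symmetric with the allocator. */
static void demo_bitmap_free(unsigned long *bitmap)
{
	free(bitmap);
}
```

Keeping the allocation and release helpers paired is the same consistency argument the commit makes for switching `kfree()` to `bitmap_free()`.
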
Bart Van Assche
1ea7d80263 scsi: ufs: core: Micro-optimize ufshcd_map_sg()
Replace two cpu_to_le32() calls by a single cpu_to_le64() call.

Additionally, issue a warning if the length of a scatter-gather list
element exceeds what is allowed by the UFSHCI specification.

Link: https://lore.kernel.org/r/20211020214024.2007615-11-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:50 -04:00
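
Why one 64-bit little-endian store can replace two 32-bit ones, as 1ea7d80263 does, can be checked in user space. The sketch below assumes the two 32-bit fields are adjacent with the low half first; the `demo_*` helpers are hypothetical and write bytes explicitly instead of using the kernel's `cpu_to_le32()`/`cpu_to_le64()`.

```c
#include <stdint.h>

/* Store a 32-bit value in little-endian byte order. */
static void demo_put_le32(uint8_t *dst, uint32_t v)
{
	for (int i = 0; i < 4; i++)
		dst[i] = (uint8_t)(v >> (8 * i));
}

/* Store a 64-bit value in little-endian byte order. */
static void demo_put_le64(uint8_t *dst, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		dst[i] = (uint8_t)(v >> (8 * i));
}

/* Two 32-bit stores: low half first, then high half. */
static void demo_store_split(uint8_t out[8], uint64_t addr)
{
	demo_put_le32(out, (uint32_t)addr);
	demo_put_le32(out + 4, (uint32_t)(addr >> 32));
}

/* One 64-bit store that produces the same eight bytes. */
static void demo_store_whole(uint8_t out[8], uint64_t addr)
{
	demo_put_le64(out, addr);
}
```

Both functions fill the buffer with identical bytes for any address, so merging the two fields changes the number of conversions but not the in-memory layout.
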
Bart Van Assche
9a868c8ad3 scsi: ufs: core: Add a compile-time structure size check
Before modifying struct ufshcd_sg_entry, add a compile-time structure size
check.

Link: https://lore.kernel.org/r/20211020214024.2007615-10-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:50 -04:00
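
A compile-time structure size check of the kind 9a868c8ad3 describes can be sketched with C11 `static_assert`. The struct below is a hypothetical 16-byte descriptor with illustrative field names, not the real `struct ufshcd_sg_entry`, and the kernel may use its own macro for the check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-byte descriptor; field names are illustrative only. */
struct demo_sg_entry {
	uint32_t base_addr;
	uint32_t upper_addr;
	uint32_t reserved;
	uint32_t size;
};

/* Compilation fails if the layout ever stops being 16 bytes. */
static_assert(sizeof(struct demo_sg_entry) == 16,
	      "demo_sg_entry must be 16 bytes");
```

Adding the check before changing the struct means a later edit that alters the size breaks the build instead of silently changing what the hardware sees.
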
Bart Van Assche
3ad317a1f9 scsi: ufs: core: Remove three superfluous casts
Casting an int explicitly to u16 when passed as an argument to a function
is not necessary.

Since prd_table and ucd_prdt_ptr both have type struct ufshcd_sg_entry *,
remove the casts from assignments of these two to each other.

This patch does not change any functionality.

Link: https://lore.kernel.org/r/20211020214024.2007615-9-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:50 -04:00
Bart Van Assche
7340faae94 scsi: ufs: core: Add debugfs attributes for triggering the UFS EH
Make it easier to test the impact of the UFS error handler on software that
submits SCSI commands to the UFS driver.

Link: https://lore.kernel.org/r/20211020214024.2007615-8-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:50 -04:00
Bart Van Assche
e0022c6c29 scsi: ufs: core: Make it easier to add new debugfs attributes
Introduce an array for debugfs attributes to make it easier to add new
debugfs attributes. Change the value of the inode.i_private pointer for
debugfs attributes from a pointer to the HBA data structure to a pointer to
the attribute description for the stats attribute. Store the HBA pointer in
the private data of the parent inode instead.

Link: https://lore.kernel.org/r/20211020214024.2007615-7-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:50 -04:00
Bart Van Assche
267a59f6a5 scsi: ufs: core: Export ufshcd_schedule_eh_work()
Make it possible to call ufshcd_schedule_eh_work() from other source files
than ufshcd.c. Additionally, convert a source code comment into a
lockdep_assert_held() call.

Link: https://lore.kernel.org/r/20211020214024.2007615-6-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:50 -04:00
Bart Van Assche
4693fad7d6 scsi: ufs: core: Log error handler activity
Kernel logs are hard to comprehend without information about what the UFS
error handler is doing. Hence this patch that logs information about error
handler activity.

Link: https://lore.kernel.org/r/20211020214024.2007615-5-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:50 -04:00
Bart Van Assche
957d63e77a scsi: ufs: core: Improve static type checking
Introduce an enumeration type for the overall command status to allow the
compiler to perform more static type checking.

Link: https://lore.kernel.org/r/20211020214024.2007615-4-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:49 -04:00
Bart Van Assche
91bb765cca scsi: ufs: core: Improve source code comments
Make the descriptions above data structures that come from the UFS
specification match the terminology from that specification. This makes it
easier to find these data structures while reading the UFS specification.

Link: https://lore.kernel.org/r/20211020214024.2007615-3-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:49 -04:00
Bart Van Assche
1168252357 scsi: ufs: Revert "Retry aborted SCSI commands instead of completing these successfully"
Commit 73dc3c4ac7 ("scsi: ufs: Retry aborted SCSI commands instead of
completing these successfully") is not necessary. If a SCSI command is
aborted successfully the UFS controller has not modified the command status
and the command status still has the value assigned by
ufshcd_prepare_req_desc_hdr(), namely OCS_INVALID_COMMAND_STATUS. The
function ufshcd_transfer_rsp_status() requeues commands that have an
invalid command status. Hence revert commit 73dc3c4ac7.

Link: https://lore.kernel.org/r/20211020214024.2007615-2-bvanassche@acm.org
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-26 23:24:49 -04:00
James Smart
83c3a7beae scsi: lpfc: Update lpfc version to 14.0.0.3
Update lpfc version to 14.0.0.3.

Link: https://lore.kernel.org/r/20211020211417.88754-9-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-20 23:33:46 -04:00
James Smart
af984c8729 scsi: lpfc: Allow fabric node recovery if recovery is in progress before devloss
A link bounce to a slow fabric may observe FDISC response delays lasting
longer than devloss tmo.  Current logic decrements the final fabric node
kref during a devloss tmo event.  This results in a NULL ptr dereference
crash if the FDISC completes for that fabric node after devloss tmo.

Fix by adding the NLP_IN_RECOV_POST_DEV_LOSS flag, which is set when
devloss tmo triggers and we've noticed that fabric node recovery has
already started or finished in between the time lpfc_dev_loss_tmo_callbk
queues lpfc_dev_loss_tmo_handler.  If fabric node recovery succeeds, then
the driver reverses the devloss tmo marked kref put with a kref get.  If
fabric node recovery fails, then the final kref put relies on the ELS
timing out or the REG_LOGIN cmpl routine.

Link: https://lore.kernel.org/r/20211020211417.88754-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-20 23:33:46 -04:00
James Smart
1854f53ccd scsi: lpfc: Fix link down processing to address NULL pointer dereference
If an FC link down transition occurs while PLOGIs are outstanding to fabric well
known addresses, outstanding ABTS requests may result in a NULL pointer
dereference. Driver unload requests may hang with repeated "2878" log
messages.

The Link down processing results in ABTS requests for outstanding ELS
requests. The Abort WQEs are sent for the ELSs before the driver had set
the link state to down. Thus the driver is sending the Abort with the
expectation that an ABTS will be sent on the wire. The Abort request is
stalled waiting for the link to come up. In some conditions the driver may
auto-complete the ELSs thus if the link does come up, the Abort completions
may reference an invalid structure.

Fix by ensuring that the Abort sets the flag to avoid link traffic if issued due
to conditions where the link failed.

Link: https://lore.kernel.org/r/20211020211417.88754-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-20 23:33:46 -04:00
James Smart
15af02d8a5 scsi: lpfc: Allow PLOGI retry if previous PLOGI was aborted
A remote nport can stop responding to PLOGI beyond the ELS I/O timeout
under some fault conditions.  When this happens, the non-response triggers
a dev_loss_tmo event from the transport which causes the driver to abort
the PLOGI and stop any retries. This was due to a policy in the ELS
completion handler whenever an ELS was terminated due to driver request.

Revise the ELS completion path to detect PLOGIs that were aborted and
allow retries.

Link: https://lore.kernel.org/r/20211020211417.88754-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-20 23:33:45 -04:00
James Smart
79b20becce scsi: lpfc: Fix use-after-free in lpfc_unreg_rpi() routine
An error is detected with the following report when unloading the driver:
  "KASAN: use-after-free in lpfc_unreg_rpi+0x1b1b"

The NLP_REG_LOGIN_SEND nlp_flag is set in lpfc_reg_fab_ctrl_node(), but the
flag is not cleared upon completion of the login.

This allows a second call to lpfc_unreg_rpi() to proceed with nlp_rpi set
to LPFC_RPI_ALLOW_ERROR.  This results in a use after free access when used
as an rpi_ids array index.

Fix by clearing the NLP_REG_LOGIN_SEND nlp_flag in
lpfc_mbx_cmpl_fc_reg_login().

Link: https://lore.kernel.org/r/20211020211417.88754-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-20 23:33:45 -04:00
James Smart
7a1dda9436 scsi: lpfc: Correct sysfs reporting of loop support after SFP status change
Applications determine loop support in part by querying the 'pls' sysfs
node. Reporting of 'pls' (Private Loop Support) is derived from the
descriptor returned by the COMMON_GET_SLI4_PARAMETERS mailbox command,
which is issued during initialization or after a reset.

The value of this field may change if there is a dynamic SFP change.  The
driver currently will not pick up the change as there was no reset
scenario.

Rework to commonize the sending of the COMMON_GET_SLI4_PARAMETERS
command. Add the calling of the routine after receipt of an async event
indicating an SFP change.

Link: https://lore.kernel.org/r/20211020211417.88754-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-20 23:33:45 -04:00
James Smart
d305c253af scsi: lpfc: Wait for successful restart of SLI3 adapter during host sg_reset
A prior patch introduced HBA_NEEDS_CFG_PORT flag logic, but in
lpfc_sli_brdrestart_s3() code path, right after HBA_NEEDS_CFG_PORT is set,
the phba->hba_flag is cleared in lpfc_sli_brdreset().

Fix by calling lpfc_sli_chipset_init() to wait for successful restart of
the HBA in lpfc_host_reset_handler() after lpfc_sli_brdrestart().

lpfc_sli_chipset_init() sets the HBA_NEEDS_CFG_PORT flag so that the
lpfc_sli_hba_setup() routine from lpfc_online() will execute
lpfc_sli_config_port() initialization step when the brdrestart is
successful.

Link: https://lore.kernel.org/r/20211020211417.88754-3-jsmart2021@gmail.com
Fixes: d2f2547efd ("scsi: lpfc: Fix auto sli_mode and its effect on CONFIG_PORT for SLI3")
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-20 23:33:45 -04:00
James Smart
a516074c20 scsi: lpfc: Revert LOG_TRACE_EVENT back to LOG_INIT prior to driver_resource_setup()
In cases when lpfc_enable_pci_dev() fails, lpfc_printf_log() with
LOG_TRACE_EVENT set will call lpfc_dmp_dbg() which uses the
phba->port_list_lock.

However, phba->port_list_lock does not get initialized until
lpfc_setup_driver_resource_phase1().  Thus, any initialization routine with
LOG_TRACE_EVENT log message prior to lpfc_setup_driver_resource_phase1()
will crash.

Revert LOG_TRACE_EVENT back to LOG_INIT for all log messages in routines
prior to lpfc_setup_driver_resource_phase1().

Link: https://lore.kernel.org/r/20211020211417.88754-2-jsmart2021@gmail.com
CC: Zheyu Ma <zheyuma97@gmail.com>
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-20 23:33:45 -04:00
Srinivas Kandagatla
b6ca770ae7 scsi: ufs: ufshcd-pltfrm: Fix memory leak due to probe defer
UFS drivers that probe defer will end up leaking memory allocated for clk
and regulator names via kstrdup() because the structure that is holding
this memory is allocated via devm_* variants which will be freed during
probe defer but the names are never freed.

Use same devm_* variant of kstrdup to free the memory allocated to name
when driver probe defers.

Kmemleak found around 11 leaks on Qualcomm Dragon Board RB5:

unreferenced object 0xffff66f243fb2c00 (size 128):
  comm "kworker/u16:0", pid 7, jiffies 4294893319 (age 94.848s)
  hex dump (first 32 bytes):
    63 6f 72 65 5f 63 6c 6b 00 76 69 72 74 75 61 6c  core_clk.virtual
    2f 77 6f 72 6b 71 75 65 75 65 2f 73 63 73 69 5f  /workqueue/scsi_
  backtrace:
    [<000000006f788cd1>] slab_post_alloc_hook+0x88/0x410
    [<00000000cfd1372b>] __kmalloc_track_caller+0x138/0x230
    [<00000000a92ab17b>] kstrdup+0xb0/0x110
    [<0000000037263ab6>] ufshcd_pltfrm_init+0x1a8/0x500
    [<00000000a20a5caa>] ufs_qcom_probe+0x20/0x58
    [<00000000a5e43067>] platform_probe+0x6c/0x118
    [<00000000ef686e3f>] really_probe+0xc4/0x330
    [<000000005b18792c>] __driver_probe_device+0x88/0x118
    [<00000000a5d295e8>] driver_probe_device+0x44/0x158
    [<000000007e83f58d>] __device_attach_driver+0xb4/0x128
    [<000000004bfa4470>] bus_for_each_drv+0x68/0xd0
    [<00000000b89a83bc>] __device_attach+0xec/0x170
    [<00000000ada2beea>] device_initial_probe+0x14/0x20
    [<0000000079921612>] bus_probe_device+0x9c/0xa8
    [<00000000d268bf7c>] deferred_probe_work_func+0x90/0xd0
    [<000000009ef64bfa>] process_one_work+0x29c/0x788
unreferenced object 0xffff66f243fb2c80 (size 128):
  comm "kworker/u16:0", pid 7, jiffies 4294893319 (age 94.848s)
  hex dump (first 32 bytes):
    62 75 73 5f 61 67 67 72 5f 63 6c 6b 00 00 00 00  bus_aggr_clk....
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

With this patch no memory leaks are reported.

Link: https://lore.kernel.org/r/20210914092214.6468-1-srinivas.kandagatla@linaro.org
Fixes: aa49761309 ("ufs: Add regulator enable support")
Fixes: c6e79dacd8 ("ufs: Add clock initialization support")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-20 23:13:37 -04:00
Arnd Bergmann
bb4a8dcb4e scsi: ufs: mediatek: Avoid sched_clock() misuse
sched_clock() is not meant to be used in portable driver code, and assuming
a particular clock frequency is not how this is meant to be used. It also
causes a build failure because of a missing header inclusion:

drivers/scsi/ufs/ufs-mediatek.c:321:12: error: implicit declaration of function 'sched_clock' [-Werror,-Wimplicit-function-declaration]
        timeout = sched_clock() + retry_ms * 1000000UL;

A better interface to use here is ktime_get_mono_fast_ns(), which works mostly
like ktime_get() but is safe to use inside of a suspend callback.

Link: https://lore.kernel.org/r/20211018132022.2281589-1-arnd@kernel.org
Fixes: 9561f58442 ("scsi: ufs: mediatek: Support vops pre suspend to disable auto-hibern8")
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-20 23:09:58 -04:00
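
The idea behind bb4a8dcb4e, computing a deadline from a clock that is documented to return nanoseconds, has a rough user-space analogue. This sketch assumes POSIX `clock_gettime(CLOCK_MONOTONIC)` in place of the kernel's `ktime_get_mono_fast_ns()`; the `demo_*` names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds; the unit is fixed by the API. */
static uint64_t demo_mono_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Deadline retry_ms milliseconds after now_ns, in nanoseconds. */
static uint64_t demo_deadline_ns(uint64_t now_ns, uint32_t retry_ms)
{
	return now_ns + (uint64_t)retry_ms * 1000000ULL;
}

/* True once the deadline has been reached. */
static bool demo_expired(uint64_t now_ns, uint64_t deadline_ns)
{
	return now_ns >= deadline_ns;
}
```

Because the clock's unit is part of its contract, the millisecond-to-nanosecond conversion does not depend on any assumption about the underlying counter frequency.
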
Jiapeng Chong
0ae8f47851 scsi: mpt3sas: Make mpt3sas_dev_attrs static
This symbol is not used outside of mpt3sas_ctl.c, mark it static.

Fixes the following sparse warning:

drivers/scsi/mpt3sas/mpt3sas_ctl.c:3988:18: warning: symbol
'mpt3sas_dev_attrs' was not declared. Should it be static?

Link: https://lore.kernel.org/r/1634639239-2892-1-git-send-email-jiapeng.chong@linux.alibaba.com
Fixes: 1bb3ca27d2 ("scsi: mpt3sas: Switch to attribute groups")
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-20 22:57:27 -04:00
Sreekanth Reddy
3d8fa78ebd scsi: scsi_transport_sas: Add 22.5 Gbps link rate definitions
Add 22.5 Gbps link rate definitions.

Link: https://lore.kernel.org/r/20211018070611.26428-1-sreekanth.reddy@broadcom.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-19 14:07:19 -04:00
Christoph Hellwig
e6ab611352 scsi: aha1542: Use memcpy_{from,to}_bvec()
Use the memcpy_{from,to}_bvec() helpers instead of open coding them.

Link: https://lore.kernel.org/r/20211018060802.1815982-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-19 14:07:19 -04:00
Luis Chamberlain
e9d658c217 scsi: sr: Add error handling support for add_disk()
We never checked for errors on add_disk() as this function returned
void. Now that this is fixed, use the shiny new error handling.

Just put the cdrom kref and have the unwinding be done by
sr_kref_release().

Link: https://lore.kernel.org/r/20211015233028.2167651-3-mcgrof@kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:51:34 -04:00
Luis Chamberlain
2a7a891f4c scsi: sd: Add error handling support for add_disk()
We never checked for errors on add_disk() as this function returned
void. Now that this is fixed, use the shiny new error handling.

As with the error handling for device_add() we follow the same logic and
just put the device so that cleanup is done via the scsi_disk_release().

Link: https://lore.kernel.org/r/20211015233028.2167651-2-mcgrof@kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:51:34 -04:00
Peter Wang
25d542a853 scsi: ufs: ufs-mediatek: Fix wrong location for ref-clk delay
Fix the location of delay for ref-clk gating and ungating in
ufs_mtk_setup_ref_clk().

Link: https://lore.kernel.org/r/20211016005802.7729-4-stanley.chu@mediatek.com
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:35 -04:00
Stanley Chu
1eaff502a8 scsi: ufs: ufs-mediatek: Fix build error caused by use of sched_clock()
Add proper header for using sched_clock().

Link: https://lore.kernel.org/r/20211016005802.7729-3-stanley.chu@mediatek.com
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:35 -04:00
Stanley Chu
fc65e933fb scsi: ufs: ufs-mediatek: Introduce default delay for reference clock
Introduce default delay time for gating or ungating reference clock instead
of ambiguous magic numbers.

The defined value is suitable for all current MediaTek UFS platforms.

Link: https://lore.kernel.org/r/20211016005802.7729-2-stanley.chu@mediatek.com
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:35 -04:00
Ye Bin
f347c26836 scsi: scsi_debug: Fix out-of-bound read in resp_report_tgtpgs()
The following issue was observed running syzkaller:

BUG: KASAN: slab-out-of-bounds in memcpy include/linux/string.h:377 [inline]
BUG: KASAN: slab-out-of-bounds in sg_copy_buffer+0x150/0x1c0 lib/scatterlist.c:831
Read of size 2132 at addr ffff8880aea95dc8 by task syz-executor.0/9815

CPU: 0 PID: 9815 Comm: syz-executor.0 Not tainted 4.19.202-00874-gfc0fe04215a9 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xe4/0x14a lib/dump_stack.c:118
 print_address_description+0x73/0x280 mm/kasan/report.c:253
 kasan_report_error mm/kasan/report.c:352 [inline]
 kasan_report+0x272/0x370 mm/kasan/report.c:410
 memcpy+0x1f/0x50 mm/kasan/kasan.c:302
 memcpy include/linux/string.h:377 [inline]
 sg_copy_buffer+0x150/0x1c0 lib/scatterlist.c:831
 fill_from_dev_buffer+0x14f/0x340 drivers/scsi/scsi_debug.c:1021
 resp_report_tgtpgs+0x5aa/0x770 drivers/scsi/scsi_debug.c:1772
 schedule_resp+0x464/0x12f0 drivers/scsi/scsi_debug.c:4429
 scsi_debug_queuecommand+0x467/0x1390 drivers/scsi/scsi_debug.c:5835
 scsi_dispatch_cmd+0x3fc/0x9b0 drivers/scsi/scsi_lib.c:1896
 scsi_request_fn+0x1042/0x1810 drivers/scsi/scsi_lib.c:2034
 __blk_run_queue_uncond block/blk-core.c:464 [inline]
 __blk_run_queue+0x1a4/0x380 block/blk-core.c:484
 blk_execute_rq_nowait+0x1c2/0x2d0 block/blk-exec.c:78
 sg_common_write.isra.19+0xd74/0x1dc0 drivers/scsi/sg.c:847
 sg_write.part.23+0x6e0/0xd00 drivers/scsi/sg.c:716
 sg_write+0x64/0xa0 drivers/scsi/sg.c:622
 __vfs_write+0xed/0x690 fs/read_write.c:485
 vfs_write+0x184/0x4c0 fs/read_write.c:549
 ksys_write+0x107/0x240 fs/read_write.c:599
 do_syscall_64+0xc2/0x560 arch/x86/entry/common.c:293
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

We get 'alen' from the command, and its type is int. If userspace passes a
large length, 'alen' becomes negative.

Switch n, alen, and rlen to u32.

Link: https://lore.kernel.org/r/20211013033913.2551004-3-yebin10@huawei.com
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:35 -04:00
Ye Bin
4e3ace0051 scsi: scsi_debug: Fix out-of-bound read in resp_readcap16()
The following warning was observed running syzkaller:

[ 3813.830724] sg_write: data in/out 65466/242 bytes for SCSI command 0x9e-- guessing data in;
[ 3813.830724]    program syz-executor not setting count and/or reply_len properly
[ 3813.836956] ==================================================================
[ 3813.839465] BUG: KASAN: stack-out-of-bounds in sg_copy_buffer+0x157/0x1e0
[ 3813.841773] Read of size 4096 at addr ffff8883cf80f540 by task syz-executor/1549
[ 3813.846612] Call Trace:
[ 3813.846995]  dump_stack+0x108/0x15f
[ 3813.847524]  print_address_description+0xa5/0x372
[ 3813.848243]  kasan_report.cold+0x236/0x2a8
[ 3813.849439]  check_memory_region+0x240/0x270
[ 3813.850094]  memcpy+0x30/0x80
[ 3813.850553]  sg_copy_buffer+0x157/0x1e0
[ 3813.853032]  sg_copy_from_buffer+0x13/0x20
[ 3813.853660]  fill_from_dev_buffer+0x135/0x370
[ 3813.854329]  resp_readcap16+0x1ac/0x280
[ 3813.856917]  schedule_resp+0x41f/0x1630
[ 3813.858203]  scsi_debug_queuecommand+0xb32/0x17e0
[ 3813.862699]  scsi_dispatch_cmd+0x330/0x950
[ 3813.863329]  scsi_request_fn+0xd8e/0x1710
[ 3813.863946]  __blk_run_queue+0x10b/0x230
[ 3813.864544]  blk_execute_rq_nowait+0x1d8/0x400
[ 3813.865220]  sg_common_write.isra.0+0xe61/0x2420
[ 3813.871637]  sg_write+0x6c8/0xef0
[ 3813.878853]  __vfs_write+0xe4/0x800
[ 3813.883487]  vfs_write+0x17b/0x530
[ 3813.884008]  ksys_write+0x103/0x270
[ 3813.886268]  __x64_sys_write+0x77/0xc0
[ 3813.886841]  do_syscall_64+0x106/0x360
[ 3813.887415]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

This issue can be reproduced with the following syzkaller log:

r0 = openat(0xffffffffffffff9c, &(0x7f0000000040)='./file0\x00', 0x26e1, 0x0)
r1 = syz_open_procfs(0xffffffffffffffff, &(0x7f0000000000)='fd/3\x00')
open_by_handle_at(r1, &(0x7f00000003c0)=ANY=[@ANYRESHEX], 0x602000)
r2 = syz_open_dev$sg(&(0x7f0000000000), 0x0, 0x40782)
write$binfmt_aout(r2, &(0x7f0000000340)=ANY=[@ANYBLOB="00000000deff000000000000000000000000000000000000000000000000000047f007af9e107a41ec395f1bded7be24277a1501ff6196a83366f4e6362bc0ff2b247f68a972989b094b2da4fb3607fcf611a22dd04310d28c75039d"], 0x126)

In resp_readcap16() we get "int alloc_len" value -1104926854, and then pass
the huge arr_len to fill_from_dev_buffer(), but arr is only 32 bytes. This
leads to OOB in sg_copy_buffer().

To solve this issue, define alloc_len as u32.
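
As a user-space illustration only (this is not the driver code), the sketch
below decodes the four allocation-length bytes from the reproducer above
(be 24 27 7a) and clamps them against a 32-byte buffer, once with a signed and
once with an unsigned type. The names be32(), clamp_signed(), clamp_unsigned()
and ARR_SZ are hypothetical, and the signed conversion assumes the usual
two's-complement wraparound.

```c
#include <assert.h>
#include <stdint.h>

#define ARR_SZ 32 /* hypothetical stand-in for the 32-byte response array */

/* Decode a big-endian 32-bit field, the way a CDB length field is laid out. */
static uint32_t be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Old pattern: signed length. A value above INT32_MAX turns negative,
 * wins the signed "min", and becomes huge again when used as a size.
 */
static uint32_t clamp_signed(const uint8_t *field)
{
	int32_t alloc_len = (int32_t)be32(field);
	int32_t len = alloc_len < ARR_SZ ? alloc_len : ARR_SZ;

	return (uint32_t)len;
}

/* Fixed pattern: unsigned length, so the clamp always caps at ARR_SZ. */
static uint32_t clamp_unsigned(const uint8_t *field)
{
	uint32_t alloc_len = be32(field);

	return alloc_len < ARR_SZ ? alloc_len : ARR_SZ;
}

/* Usage: the bytes from the reproducer give the value quoted above. */
static void alloc_len_demo(void)
{
	const uint8_t field[4] = { 0xbe, 0x24, 0x27, 0x7a };

	assert((int32_t)be32(field) == -1104926854);
	assert(clamp_signed(field) > ARR_SZ);    /* copy length exceeds buffer */
	assert(clamp_unsigned(field) == ARR_SZ); /* copy length is bounded */
}
```

With a small, valid length both variants agree; they only diverge once the
top bit of the field is set.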

Link: https://lore.kernel.org/r/20211013033913.2551004-2-yebin10@huawei.com
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:35 -04:00
Colin Ian King
8ecfb16c9b scsi: 3w-xxx: Remove redundant initialization of variable retval
The variable retval is initialized with a value that is never read because
it is updated immediately afterwards. The assignment is redundant and can
be removed.

Link: https://lore.kernel.org/r/20211013182834.137410-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Addresses-Coverity: ("Unused value")
2021-10-18 22:38:34 -04:00
MichelleJin
b3ef4a0e40 scsi: fcoe: Use netif_is_bond_master() instead of open code
'netdev->priv_flags & IFF_BONDING && netdev->flags & IFF_MASTER' is defined
as netif_is_bond_master() in netdevice.h. Replace the open-coded check with
the helper to clean up the code.
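
For illustration, the shape of such a helper can be sketched in user space as
follows. The struct, the function name and the flag values are stand-ins
invented for this example; the kernel's real definitions live in its own
headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flag bits, NOT the kernel's real values. */
#define DEMO_IFF_MASTER  (1u << 0)
#define DEMO_IFF_BONDING (1u << 1)

/* Stand-in for the two flag words the open-coded check inspects. */
struct demo_net_device {
	unsigned int flags;
	unsigned int priv_flags;
};

/* True only when the device is both a master and a bonding device. */
static bool demo_is_bond_master(const struct demo_net_device *dev)
{
	return (dev->flags & DEMO_IFF_MASTER) &&
	       (dev->priv_flags & DEMO_IFF_BONDING);
}

/* Usage: only the combination of both flags counts as a bond master. */
static void bond_master_demo(void)
{
	struct demo_net_device bond = { DEMO_IFF_MASTER, DEMO_IFF_BONDING };
	struct demo_net_device other_master = { DEMO_IFF_MASTER, 0 };

	assert(demo_is_bond_master(&bond));
	assert(!demo_is_bond_master(&other_master));
}
```

Callers then test one named predicate instead of repeating the two-flag
expression.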

Link: https://lore.kernel.org/r/20211015142006.540773-1-shjy180909@gmail.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: MichelleJin <shjy180909@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:34 -04:00
Tyrel Datwyler
3319a8ba82 scsi: ibmvscsi: Use GFP_KERNEL with dma_alloc_coherent() in initialize_event_pool()
During driver probe we allocate a dma region for our event pool.
Currently, zero is passed for the gfp_flags parameter. Driver probe
callbacks are run in process context and we hold no locks so we can sleep
here if necessary.

Fix by passing GFP_KERNEL explicitly to dma_alloc_coherent().

Link: https://lore.kernel.org/r/1547089149-20577-1-git-send-email-tyreld@linux.vnet.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:34 -04:00
Dan Carpenter
30e99f05f8 scsi: mpi3mr: Use scnprintf() instead of snprintf()
I intended to move from snprintf() to scnprintf() in the previous patch but
I messed up and did not do that.  The result of my bug is that this
function could trigger a WARN() if the buffer is too large.
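
The difference between the two return values can be sketched in user space.
snprintf() returns the length the output would have had without truncation,
so an offset built from it can move past the end of the buffer. The helper
demo_scnprintf() below is a hypothetical stand-in that, like scnprintf(),
returns only the number of characters actually stored.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * User-space stand-in for scnprintf(): returns the number of characters
 * stored in buf, excluding the trailing NUL, and never more than size - 1.
 */
static int demo_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	if (size == 0)
		return 0;

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (n < 0)
		return 0;
	return (size_t)n < size ? n : (int)(size - 1);
}

/* Usage: a 10-character string formatted into an 8-byte buffer. */
static void scnprintf_demo(void)
{
	char buf[8];
	int would_be = snprintf(buf, sizeof(buf), "%s", "0123456789");
	int stored = demo_scnprintf(buf, sizeof(buf), "%s", "0123456789");

	assert(would_be == 10);      /* larger than the buffer itself */
	assert(stored == 7);         /* what actually fit */
	assert(buf[stored] == '\0'); /* safe to use as an offset */
}
```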

Link: https://lore.kernel.org/r/20211013083005.GA8592@kili
Fixes: 76a4f7cc59 ("scsi: mpi3mr: Clean up mpi3mr_print_ioc_info()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:34 -04:00
Martin Kepplinger
c4da120575 scsi: sd: Print write through due to no caching mode page as warning
For SD card readers it is extremely common not to have a cache.
Consequently, the following messages do not point to a real error one could
try to fix but rather describe how the disk works:

  sd 0:0:0:0: [sda] No Caching mode page found
  sd 0:0:0:0: [sda] Assuming drive cache: write through

Print these messages as warnings instead of errors.

Link: https://lore.kernel.org/r/20211013075050.3870354-1-martin.kepplinger@puri.sm
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-18 22:38:34 -04:00
Bart Van Assche
a47c6b713e scsi: core: Remove two host template members that are no longer used
All SCSI drivers have been converted to use shost_groups and sdev_groups
instead of shost_attrs or sdev_attrs. Hence remove shost_attrs and
sdev_attrs. Additionally, remove the 'lld_attr_group' members and also
the scsi_convert_dev_attrs() function.

Link: https://lore.kernel.org/r/20211012233558.4066756-47-bvanassche@acm.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-16 21:45:59 -04:00
Bart Van Assche
7500be6291 scsi: snic: Switch to attribute groups
struct device supports attribute groups directly but does not support
struct device_attribute directly. Hence switch to attribute groups.

Link: https://lore.kernel.org/r/20211012233558.4066756-44-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-10-16 21:45:59 -04:00