IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Under the session level spinlock node->active_ios_lock in
efct_scsi_io_alloc() we are taking another spinlock for the port. This
leads to contention between sessions and even between I/Os in the same
session.
Reduce the locked region to active_ios list for which active_ios_lock is
intended. Spinlock CPU usage decreases from 18% down to 13%. IOPS are
increased from 220 kIOPS to 264 kIOPS for one LUN.
Link: https://lore.kernel.org/r/20210914105539.6942-4-d.bogdanov@yadro.com
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
nport_free for an empty nport hangs the state machine waiting for mbox
completion if nport is not yet attached thinking that it is attaching right
now. Add a check for nport attaching state and complete nport free.
Link: https://lore.kernel.org/r/20210914105539.6942-3-d.bogdanov@yadro.com
Reviewed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Similar to other state machine traces and to make debug easier, add the
state name to nport sm trace printout.
Link: https://lore.kernel.org/r/20210914105539.6942-2-d.bogdanov@yadro.com
Reviewed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There are no dependencies in <scsi/scsi_cmnd.h> on the <scsi/scsi_host.h>
header file. Hence remove the scsi_host.h include directive from
scsi_cmnd.h. This include directive was introduced in February 2021 by
commit af1830956dc3 ("scsi: core: Add mq_poll support to SCSI layer").
Link: https://lore.kernel.org/r/20210917212751.2676054-1-bvanassche@acm.org
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The se_cmd is unused in these functions, just remove it.
Link: https://lore.kernel.org/r/20210913083045.3670648-1-fengli@smartx.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Li Feng <fengli@smartx.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In ufshpb, pm_runtime_{get,put}_sync() are used to avoid unwanted runtime
suspend during query requests. Whereas commit b294ff3e3449 ("scsi: ufs:
core: Enable power management for wlun") modified the driver core to use
ufshcd_rpm_{get,put}_sync() APIs.
Switch to these APIs in HPB module as well.
Link: https://lore.kernel.org/r/20210902003534epcms2p1937a0f0eeb48a441cb69f5ef13ff8430@epcms2p1
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
As noted in the "Deprecated Interfaces, Language Features, Attributes, and
Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing. This could lead to
values wrapping around and a smaller allocation being made than the caller
was expecting. Using those allocations could lead to linear overflows of
heap memory and other misbehaviors.
Use the purpose specific kcalloc() function instead of the argument count *
size in the kzalloc() function.
[1] https://www.kernel.org/doc/html/v5.14/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments
Link: https://lore.kernel.org/r/20210905062448.6587-1-len.baker@gmx.com
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Len Baker <len.baker@gmx.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update lpfc version to 14.0.0.2.
Link: https://lore.kernel.org/r/20210910233159.115896-15-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The PBDE feature, setting payload buffer address explicitly in the WQE so
it doesn't have to be fetched from the SGL, only makes sense when there is
a single buffer for the I/O. When there are multiple buffers it actually
hurts performance as the SGL subsequently has to be fetched.
Rework the SGL logic to only use PBDE when a single buffer.
Link: https://lore.kernel.org/r/20210910233159.115896-14-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently congestion management framework results are cleared whenever the
framework settings changed (such as it being turned off then back on). This
unfortunately means prior stats, rolled up to higher time windows lose
meaning.
Change such that stats are not cleared. Thus they pause and resume with
prior values still being considered.
Link: https://lore.kernel.org/r/20210910233159.115896-13-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If the congestion management framework dynamically enables, it may do so
while I/O is in flight. The updates of cmf info due to inflight I/O
completing may happen before values have been initialized.
Fix by ensure cmf_max_bytes_per_interval is initialized when checking
bandwidth utilization for SCSI layer blocking.
Link: https://lore.kernel.org/r/20210910233159.115896-12-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The newly added congestion mgmt framework is seeing unexpected congestion
FPINs and signals. In analysis, time values given to the adapter are not
at hard time intervals. Thus the drift vs the transfer count seen is
affecting how the framework manages things.
Adjust counters to cover the drift.
Link: https://lore.kernel.org/r/20210910233159.115896-11-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Injecting errors on the PCI slot while the driver is handling NVMe I/O will
cause crashes and hangs.
There are several rather difficult scenarios occurring. The main issue is
that the adapter can report a PCI error before or simultaneously to the PCI
subsystem reporting the error. Both paths have different entry points and
currently there is no interlock between them. Thus multiple teardown paths
are competing and all heck breaks loose.
Complicating things is the NVMs path. To a large degree, I/O was able to be
shutdown for a full FC port on the SCSI stack. But on NVMe, there isn't a
similar call. At best, it works on a per-controller basis, but even at the
controller level, it's a controller "reset" call. All of which means I/O is
still flowing on different CPUs with reset paths expecting hw access
(mailbox commands) to execute properly.
The following modifications are made:
- A new flag is set in PCI error entrypoints so the driver can track being
called by that path.
- An interlock is added in the SLI hw error path and the PCI error path
such that only one of the paths proceeds with the teardown logic.
- RPI cleanup is patched such that RPIs are marked unregistered w/o mbx
cmds in cases of hw error.
- If entering the SLI port re-init calls, a case where SLI error teardown
was quick and beat the PCI calls now reporting error, check whether the
SLI port is still live on the PCI bus.
- In the PCI reset code to bring the adapter back, recheck the IRQ
settings. Different checks for SLI3 vs SLI4.
- In I/O completions, that may be called as part of the cleanup or
underway just before the hw error, check the state of the adapter. If
in error, shortcut handling that would expect further adapter
completions as the hw error won't be sending them.
- In routines waiting on I/O completions, which may have been in progress
prior to the hw error, detect the device is being torn down and abort
from their waits and just give up. This points to a larger issue in the
driver on ref-counting for data structures, as it doesn't have
ref-counting on q and port structures. We'll do this fix for now as it
would be a major rework to be done differently.
- Fix the NVMe cleanup to simulate NVMe I/O completions if I/O is being
failed back due to hw error.
- In I/O buf allocation, done at the start of new I/Os, check hw state and
fail if hw error.
Link: https://lore.kernel.org/r/20210910233159.115896-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A prior patch inadvertently caused lpfc_sli_sum_iocb() to exclude counting
of outstanding aborted I/Os and ABORT IOCBs. Thus,
lpfc_reset_flush_io_context() called from any TMF routine does not properly
wait to flush all outstanding FCP IOCBs leading to a block layer crash on
an invalid scsi_cmnd->request pointer.
kernel BUG at ../block/blk-core.c:1489!
RIP: 0010:blk_requeue_request+0xaf/0xc0
...
Call Trace:
<IRQ>
__scsi_queue_insert+0x90/0xe0 [scsi_mod]
blk_done_softirq+0x7e/0x90
__do_softirq+0xd2/0x280
irq_exit+0xd5/0xe0
do_IRQ+0x4c/0xd0
common_interrupt+0x87/0x87
</IRQ>
Fix by separating out the LPFC_IO_FCP, LPFC_IO_ON_TXCMPLQ,
LPFC_DRIVER_ABORTED, and CMD_ABORT_XRI_CN || CMD_CLOSE_XRI_CN checks into a
new lpfc_sli_validate_fcp_iocb_for_abort() routine when determining to
build an ABORT iocb.
Restore lpfc_reset_flush_io_context() functionality by including counting
of outstanding aborted IOCBs and ABORT IOCBs in lpfc_sli_sum_iocb().
Link: https://lore.kernel.org/r/20210910233159.115896-9-jsmart2021@gmail.com
Fixes: e1364711359f ("scsi: lpfc: Fix illegal memory access on Abort IOCBs")
Cc: <stable@vger.kernel.org> # v5.12+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently, we hold off unregistering with NVMe transport layer until GID_FT
or ADISC completes upon receipt of RSCN. In the ADISC discovery routine,
for nodes not found in the GID_FT response, the nodes are unregistered from
the SCSI transport but not UNREG_RPI'd. Meaning outstanding WQEs continue
to be outstanding and were not failed back to the OS. If an NVMe device,
this mean there wasn't initial termination of the I/Os so they could be
issued on a different NVMe path.
Fix by unregistering the RPI so that I/O is cancelled.
Link: https://lore.kernel.org/r/20210910233159.115896-8-jsmart2021@gmail.com
Fixes: 0614568361b0 ("scsi: lpfc: Delay unregistering from transport until GIDFT or ADISC completes")
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In pt-2-pt mode, the initiator does not log into the target after a PRLI
error. In pt-2-pt mode, the target responded to the PRLI by sending a
LOGO. The LOGO causes all ELS and I/Os to be aborted. This caused the PRLI
to fail. The PRLI completion path caused the discovery node to be dropped
to avoid being stick in an UNUSED (not logged in) state. As the node was
dropped there is no retry of the login and as it is pt-2-pt, there is no
RSCN to retrigger discovery. Thus the other end is not seen by the OS.
Fix by ensuring the discovery node is not dropped if connecting pt-2-pt.
This will cause PLOGI to be retried.
Link: https://lore.kernel.org/r/20210910233159.115896-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On link up and node discovery, a remote port is registered with the SCSI
transport and the driver sets fc4_xpt_flags to track transport
registration.
A link down event causes the driver to deregister with the SCSI transport,
starting the devloss timer, and calls a local unreg routine to clear the
login state. Part of the login state is the fc4_xpt_flags. However, with
tape devices that support sequence level error recovery, which wants to
preserve the login, the local unreg routine is skipped, thus the flags
aren't cleared.
A subsequent link up, ADISC is performed and the lpfc_nlp_reg_node()
routine is called. As the fc4_xpt_flags is not clear, it's believed the
node is already registered with the transport. Unfortunately, the
registration was already terminated. Eventually the devloss tmo timer
expires and tears down the device.
Fix by ensuring the tape device, known by the ADISC flag, is always
unregistered if the link drops.
Link: https://lore.kernel.org/r/20210910233159.115896-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A test scenario encountered an unload hang while an FLOGI ELS was in flight
when a link down condition occurred. The driver fails unload as it never
releases the fport node.
For most nodes, when the link drops, devloss tmo is started and the timeout
will cause the final node release. For the Fport, as it has not yet
registered with the SCSI transport, there is no devloss timer to be
started, so there is no final release. Additionally, the link down
sequence causes ABORTS to be issued for pending ELS's. The completions from
the ABORTS perform the release of node references. However, as the adapter
is being reset to be unloaded, those completions will never occur.
Fix by the following:
- In the ELS cleanup, recognize when unloading and place the ELS's on a
different list that immediately cleans up/completes the ELS's. It's
recognized that this condition primarily affects only the fport, with
other ports having normal clean up logic that handles things.
- Resolve the devloss issue by, when cleaning up nodes on after link down,
recognizing when the fabric node does not have a completed state (its
state is UNUSED) and removing a reference so the node can delete after
the ELS reference is released.
Link: https://lore.kernel.org/r/20210910233159.115896-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A test scenario has a target issuing a TPLS after accepting the driver's
PRLI. TPLS is not supported by the driver so it rejects the ELS. However,
the reject was only happening on the primary N_Port. If the TPLS was to a
NPIV vport, not only would it reject the ELS, but it would act on the TPLS,
starting devloss, then unregister from the SCSI transport and release the
node. When devloss expired, it would access the node again and cause a page
faul.
Fix by altering the NPIV code to recognize that a correctly registered node
can reject unsolicited ELS I/O and to not unregister with the SCSI
transport and tear the node down. Add a check of the fc4_xpt_flags so that
only a zero value allows the unreg and teardown.
Link: https://lore.kernel.org/r/20210910233159.115896-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In a rarely executed path, FLOGI failure, there is a refcounting error. If
FLOGI completed with an error, typically a timeout, the initial completion
handler would remove the job reference. However, the job completion isn't
the actual end of the job/exchange as the timeout usually initiates an
ABTS, and upon that ABTS completion, a final completion is sent. The driver
removes the reference again in the final completion. Thus the imbalance.
In the buggy cases, if there was a link bounce while the delayed response
is outstanding, the fport node may be referenced again but there was no
additional reference as it is already present. The delayed completion then
occurs and removes the last reference freeing the node and causing issues
in the link up processed that is using the node.
Fix this scenario by removing the snippet that removed the reference in the
initial FLOGI completion. The bad snippet was poorly trying to identify the
FLOGI as OK to do so by realizing the node was not registered with either
SCSI or NVMe transport.
Link: https://lore.kernel.org/r/20210910233159.115896-3-jsmart2021@gmail.com
Fixes: 618e2ee146d4 ("scsi: lpfc: Fix FLOGI failure due to accessing a freed node")
Cc: <stable@vger.kernel.org> # v5.13+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When parsing the txq list in lpfc_drain_txq(), the driver attempts to pass
the requests to the adapter. If such an attempt fails, a local "fail_msg"
string is set and a log message output. The job is then added to a
completions list for cancellation.
Processing of any further jobs from the txq list continues, but since
"fail_msg" remains set, jobs are added to the completions list regardless
of whether a wqe was passed to the adapter. If successfully added to
txcmplq, jobs are added to both lists resulting in list corruption.
Fix by clearing the fail_msg string after adding a job to the completions
list. This stops the subsequent jobs from being added to the completions
list unless they had an appropriate failure.
Link: https://lore.kernel.org/r/20210910233159.115896-2-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The pointer req is being initialized with a value that is never read, it is
being updated later on. The assignment is redundant and can be removed.
Link: https://lore.kernel.org/r/20210910114610.44752-1-colin.king@canonical.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Addresses-Coverity: ("Unused value")
DPC thread gets restricted due to a no-op mailbox, which is a blocking call
and has a high execution frequency. To free up the DPC thread we move no-op
handling to the workqueue. Also, modified qla_do_heartbeat() to send no-op
MBC if we don’t have any active interrupts, but there are still I/Os
outstanding with firmware.
Link: https://lore.kernel.org/r/20210908164622.19240-9-njavali@marvell.com
Fixes: d94d8158e184 ("scsi: qla2xxx: Add heartbeat check")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Process responses in Tx path if any available for better performance.
Link: https://lore.kernel.org/r/20210908164622.19240-8-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Shreyas Deodhar <sdeodhar@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Authentication application may be running and in the past tried to probe
driver (app_start) but was unsuccessful. This could be due to the bsg layer
not being ready to service the request. On a successful link up, driver
will use the netlink Link Up event to notify the app to retry the app_start
call.
In another case, app does not poll for new NPIV host. This link up event
would notify app of the presence of a new SCSI host.
Link: https://lore.kernel.org/r/20210908164622.19240-6-njavali@marvell.com
Fixes: 4de067e5df12 ("scsi: qla2xxx: edif: Add N2N support for EDIF")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
System crash was seen when I/O was run against an NVMe target and aborts
were occurring.
Crash stack is:
-- relevant crash stack --
BUG: kernel NULL pointer dereference, address: 0000000000000010
:
#6 [ffffae1f8666bdd0] page_fault at ffffffffa740122e
[exception RIP: qla_nvme_abort_work+339]
RIP: ffffffffc0f592e3 RSP: ffffae1f8666be80 RFLAGS: 00010297
RAX: 0000000000000000 RBX: ffff9b581fc8af80 RCX: ffffffffc0f83bd0
RDX: 0000000000000001 RSI: ffff9b5839c6c7c8 RDI: 0000000008000000
RBP: ffff9b6832f85000 R8: ffffffffc0f68160 R9: ffffffffc0f70652
R10: ffffae1f862ffdc8 R11: 0000000000000300 R12: 000000000000010d
R13: 0000000000000000 R14: ffff9b5839cea000 R15: 0ffff9b583fab170
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffffae1f8666be98] process_one_work at ffffffffa6aba184
#8 [ffffae1f8666bed8] worker_thread at ffffffffa6aba39d
#9 [ffffae1f8666bf10] kthread at ffffffffa6ac06ed
The crash was due to a stale SRB structure access after it was aborted.
Fix the issue by removing stale access.
Link: https://lore.kernel.org/r/20210908164622.19240-5-njavali@marvell.com
Fixes: 2cabf10dbbe3 ("scsi: qla2xxx: Fix hang on NVMe command timeouts")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This card is unique and doesn't support lower speeds, hence update the fdmi
field to display 16G only.
Link: https://lore.kernel.org/r/20210908164622.19240-3-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This interface will allow user space applications to send a mailbox command
to the firmware.
Link: https://lore.kernel.org/r/20210908164622.19240-2-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bikash Hazarika <bhazarika@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Driver failed to release all memory allocated. This would lead to memory
leak during driver removal.
Properly free memory when the module is removed.
Link: https://lore.kernel.org/r/20210906170404.5682-5-Ajish.Koshy@microchip.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Ajish Koshy <Ajish.Koshy@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Correct inbound queue and outbound queue size in 'ib_log' and 'ob_log'
sysfs entries.
Link: https://lore.kernel.org/r/20210906170404.5682-4-Ajish.Koshy@microchip.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ajish Koshy <Ajish.Koshy@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 1f02beff224e ("scsi: pm80xx: Remove global lock from outbound queue
processing") introduced a lock per outbound queue. Prior to that change the
driver was using a global lock for all outbound queues.
While processing the I/O responses and events the driver takes the outbound
queue spinlock and is supposed to release it in pm8001_ccb_task_free_done()
before calling command done(). Since the older code was using a global
lock, pm8001_ccb_task_free_done() was releasing the global spin lock. The
change that split the lock per outbound queue did not consider this and
pm8001_ccb_task_free_done() was still releasing the global lock.
Link: https://lore.kernel.org/r/20210906170404.5682-3-Ajish.Koshy@microchip.com
Fixes: 1f02beff224e ("scsi: pm80xx: Remove global lock from outbound queue processing")
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Ajish Koshy <Ajish.Koshy@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During phyup event, the firmware provides the phy_id and port_id and driver
is supposed to use these during device handle registration. Previously the
driver was using the port id value from libsas during device handle
registration. Since id can be different from the one assigned by firmware,
this can lead to wrong device registration and drives not showing up.
Use firmware assigned port id during device registration.
Link: https://lore.kernel.org/r/20210906170404.5682-2-Ajish.Koshy@microchip.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Ajish Koshy <Ajish.Koshy@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit ec29d0ac29be ("scsi: iscsi: Fix conn use after free during resets")
moved member ehwait from 'conn' to 'session', but left the initialization
of ehwait in iscsi_conn_setup().
Although a session can only have 1 conn currently, it is better to
initialize ehwait in iscsi_session_setup() in case we implement handling
multiple conns in the future.
Link: https://lore.kernel.org/r/20210911135159.20543-1-dinghui@sangfor.com.cn
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It is standard practice to co-locate export declarations with the symbol
which is being exported. Or at least in the same file - see
sas_phy_reset().
Modify libsas to follow this practice consistently.
Link: https://lore.kernel.org/r/1631530296-32358-1-git-send-email-john.garry@huawei.com
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The hisi_hba debugfs_dump_index member should increased after a dump
insertion completed, and not before it has started, so fix the code to do
so.
Link: https://lore.kernel.org/r/1629799260-120116-6-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some usage of del_timer() in the driver is potentially unsafe.
When running the sas_task->slow_task timer in
hisi_sas_exec_internal_tmf_task(), execution may be blocked in function
hisi_sas_task_exec(); so it is possible that the timer is running when the
callback to disable the timer is running. This could be dangerous, as we
immediately release resources which the timer callback uses after disabling
the timer. The same situation may be found at other sites, such as
_hisi_sas_internal_task_abort().
Change calls to del_timer() to del_timer_sync() as necessary, to ensure any
timer has finished when disabling.
Also remove calls to timer_pending() prior to del_timer() as it is not
necessary.
Link: https://lore.kernel.org/r/1629799260-120116-5-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
HISI_SAS_RESET_BIT means that the controller is being reset, and so the
name is a bit vague. Rename it to HISI_SAS_RESETTING_BIT.
Link: https://lore.kernel.org/r/1629799260-120116-4-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use managed PCI functions such as pcim_enable_device() and
pcim_iomap_regions() to simplify exception handling code.
Link: https://lore.kernel.org/r/1629799260-120116-2-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
- Add missing fields and remove some duplicate fields when printing a perf_event_attr.
- Fix hybrid config terms list corruption.
- Update kernel header copies, some resulted in new kernel features being
automagically added to 'perf trace' syscall/tracepoint argument id->string translators.
- Add a file generated during the documentation build to .gitignore.
- Add an option to build without libbfd, as some distros, like Debian consider
its ABI unstable.
- Add support to print a textual representation of IBS raw sample data in 'perf report'.
- Fix bpf 'perf test' sample mismatch reporting
- Fix passing arguments to stackcollapse report in a 'perf script' python script.
- Allow build-id with trailing zeros.
- Look for ImageBase in PE file to compute .text offset.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYT0+hwAKCRCyPKLppCJ+
JxcPAQDO+iCKK/sF3TVN8f0T8xkFD6y8krBXPAtQHCAhVBeiqAD9F4R0VMX6nwy3
8rJnsNd2ODjywgFBO4uPy0N2fxBWjwo=
=/hH1
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-for-v5.15-2021-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull more perf tools updates from Arnaldo Carvalho de Melo:
- Add missing fields and remove some duplicate fields when printing a
perf_event_attr.
- Fix hybrid config terms list corruption.
- Update kernel header copies, some resulted in new kernel features
being automagically added to 'perf trace' syscall/tracepoint argument
id->string translators.
- Add a file generated during the documentation build to .gitignore.
- Add an option to build without libbfd, as some distros, like Debian
consider its ABI unstable.
- Add support to print a textual representation of IBS raw sample data
in 'perf report'.
- Fix bpf 'perf test' sample mismatch reporting
- Fix passing arguments to stackcollapse report in a 'perf script'
python script.
- Allow build-id with trailing zeros.
- Look for ImageBase in PE file to compute .text offset.
* tag 'perf-tools-for-v5.15-2021-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (25 commits)
tools headers UAPI: Update tools's copy of drm.h headers
tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
tools headers UAPI: Sync linux/fs.h with the kernel sources
tools headers UAPI: Sync linux/in.h copy with the kernel sources
perf tools: Add an option to build without libbfd
perf tools: Allow build-id with trailing zeros
perf tools: Fix hybrid config terms list corruption
perf tools: Factor out copy_config_terms() and free_config_terms()
perf tools: Fix perf_event_attr__fprintf() missing/dupl. fields
perf tools: Ignore Documentation dependency file
perf bpf: Provide a weak btf__load_from_kernel_by_id() for older libbpf versions
tools include UAPI: Update linux/mount.h copy
perf beauty: Cover more flags in the move_mount syscall argument beautifier
tools headers UAPI: Sync linux/prctl.h with the kernel sources
tools include UAPI: Sync sound/asound.h copy with the kernel sources
tools headers UAPI: Sync linux/kvm.h with the kernel sources
tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources
perf report: Add support to print a textual representation of IBS raw sample data
perf report: Add tools/arch/x86/include/asm/amd-ibs.h
perf env: Add perf_env__cpuid, perf_env__{nr_}pmu_mappings
...