IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
commit 49b729f58e7a98a006a8a0c1dcca8a1a4f58d2a8 upstream.
This done routine will delete the timer and check for its return value and
decrease the reference count accordingly. This prevents boot hangs reported
after commit 31e6cdbe0eae ("scsi: qla2xxx: Implement ref count for SRB")
was merged.
Link: https://lore.kernel.org/r/20220208093946.4471-1-njavali@marvell.com
Fixes: 31e6cdbe0eae ("scsi: qla2xxx: Implement ref count for SRB")
Reported-by: Ewan Milne <emilne@redhat.com>
Tested-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d2646eed7b19a206912f49101178cbbaa507256c upstream.
While a session is in the middle of a relogin, a late RSCN can be delivered
from switch. RSCN trigger fabric scan where the scan logic can trigger
another session login while a login is in progress. Reduce the extra
trigger to prevent multiple logins to the same session.
Link: https://lore.kernel.org/r/20220310092604.22950-10-njavali@marvell.com
Fixes: bee8b84686c4 ("scsi: qla2xxx: Reduce redundant ADISC command for RSCNs")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c13ce47c64ea8f14e77eecb40d1e7c2ac667f898 upstream.
For N2N topology, ELS Passthrough is used to send PLOGI. On failure of ELS
pass through PLOGI, driver flipped over to using LLIOCB PLOGI for N2N. This
is not consistent. Delete the session to restart the connection where ELS
pass through PLOGI would be used consistently.
Link: https://lore.kernel.org/r/20220310092604.22950-7-njavali@marvell.com
Fixes: c76ae845ea83 ("scsi: qla2xxx: Add error handling for PLOGI ELS passthrough")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c85ab7d9e27a80e48d5b7d7fb2fe2b0fdb2de523 upstream.
At NVMe ELS request time, request structure is DMA mapped and never
unmapped. Fix this by calling the unmap on ELS completion.
Link: https://lore.kernel.org/r/20220310092604.22950-5-njavali@marvell.com
Fixes: e84067d74301 ("scsi: qla2xxx: Add FC-NVMe F/W initialization and transport registration")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c02aada06d19a215c8291bd968a99a270e96f734 upstream.
User experienced device lost. The log shows Get port data base command was
queued up, failed, and requeued again. Every time it is requeued, it set
the FCF_ASYNC_ACTIVE. This prevents any recovery code from occurring
because driver thinks a recovery is in progress for this session. In
essence, this session is hung. The reason it gets into this place is the
session deletion got in front of this call due to link perturbation.
Break the requeue cycle and exit. The session deletion code will trigger a
session relogin.
Link: https://lore.kernel.org/r/20220310092604.22950-8-njavali@marvell.com
Fixes: 726b85487067 ("qla2xxx: Add framework for async fabric discovery")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 58ca5999e0367d131de82a75257fbfd5aed0195d upstream.
User experienced no task management error while target device is responding
with error. The RSP_CODE field in the status IOCB is in little endian.
Driver assumes it's big endian and it picked up erroneous data.
Convert the data back to big endian as is on the wire.
Link: https://lore.kernel.org/r/20220310092604.22950-2-njavali@marvell.com
Fixes: faef62d13463 ("[SCSI] qla2xxx: Fix Task Management command asynchronous handling")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6a45c8e137d4e2c72eecf1ac7cf64f2fdfcead99 upstream.
User experienced some of the LUN failed to get rediscovered after long
cable pull test. The issue is triggered by a race condition between driver
setting session online state vs starting the LUN scan process at the same
time. Current code set the online state after notifying the session is
available. In this case, trigger to start the LUN scan process happened
before driver could set the session in online state. LUN scan ends up with
failure due to the session online check was failing.
Set the online state before reporting of the availability of the session.
Link: https://lore.kernel.org/r/20220310092604.22950-3-njavali@marvell.com
Fixes: aecf043443d3 ("scsi: qla2xxx: Fix Remote port registration")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cfbafad7c6032d449a5a07f2d273acd2437bbc6a upstream.
While allocating firmware dump, check if dump is already collected and do
not re-allocate the buffer.
Link: https://lore.kernel.org/r/20220110050218.3958-17-njavali@marvell.com
Cc: stable@vger.kernel.org
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0d6a536cb1fcabb6c3e9c94871c8d0b29bb5813b upstream.
This is an update to the original 28xx adapter enablement. Add a bunch of
conditionals that are applicable for 28xx.
Link: https://lore.kernel.org/r/20220110050218.3958-16-njavali@marvell.com
Fixes: ecc89f25e225 ("scsi: qla2xxx: Add Device ID for ISP28XX")
Cc: stable@vger.kernel.org
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4c103a802c69fca63976af6b372ccba39ed74370 upstream.
28XX adapters are capable of detecting both T10 PI tag escape values as
well as IP guard. This was missed due to the adapter type missed in the
corresponding macros. Fix this by adding support for 28xx in those macros.
Link: https://lore.kernel.org/r/20220110050218.3958-14-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Tested-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 73825fd7a37c1a685e9e9e27c9dc91ef1f3e2971 upstream.
Silence compile warning due to unaligned memory access.
qla_edif.c:713:45: warning: taking address of packed member 'u' of class or
structure 'auth_complete_cmd' may result in an unaligned pointer value
[-Waddress-of-packed-member]
fcport = qla2x00_find_fcport_by_pid(vha, &appplogiok.u.d_id);
Link: https://lore.kernel.org/r/20220110050218.3958-13-njavali@marvell.com
Cc: stable@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8ad4be3d15cf144b5834bdb00d5bbe4050938dc7 upstream.
A device logout in loop topology initiates a device connection teardown
which loses the FW device handle. In loop topo, the device handle is not
regrabbed leading to device login failures and eventually to loss of the
device. Fix this by taking the main login path that does it.
Link: https://lore.kernel.org/r/20220110050218.3958-11-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 725d3a0d31a51c0debf970011e05f585e805165b upstream.
Fix stuck sessions in get port database. When a thread is in the process of
re-establishing a session, a flag is set to prevent multiple threads /
triggers from doing the same task. This flag was left on, where any attempt
to relogin was locked out. Clear this flag, if the attempt has failed.
Link: https://lore.kernel.org/r/20220110050218.3958-4-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 31e6cdbe0eae37badceb5e0d4f06cf051432fd77 upstream.
The timeout handler and the done function are racing. When
qla2x00_async_iocb_timeout() starts to run it can be preempted by the
normal response path (via the firmware?). qla24xx_async_gpsc_sp_done()
releases the SRB unconditionally. When scheduling back to
qla2x00_async_iocb_timeout() qla24xx_async_abort_cmd() will access an freed
sp->qpair pointer:
qla2xxx [0000:83:00.0]-2871:0: Async-gpsc timeout - hdl=63d portid=234500 50:06:0e:80:08:77:b6:21.
qla2xxx [0000:83:00.0]-2853:0: Async done-gpsc res 0, WWPN 50:06:0e:80:08:77:b6:21
qla2xxx [0000:83:00.0]-2854:0: Async-gpsc OUT WWPN 20:45:00:27:f8:75:33:00 speeds=2c00 speed=0400.
qla2xxx [0000:83:00.0]-28d8:0: qla24xx_handle_gpsc_event 50:06:0e:80:08:77:b6:21 DS 7 LS 6 rc 0 login 1|1 rscn 1|0 lid 5
BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
IP: qla24xx_async_abort_cmd+0x1b/0x1c0 [qla2xxx]
Obvious solution to this is to introduce a reference counter. One reference
is taken for the normal code path (the 'good' case) and one for the timeout
path. As we always race between the normal good case and the timeout/abort
handler we need to serialize it. Also we cannot assume any order between
the handlers. Since this is slow path we can use proper synchronization via
locks.
When we are able to cancel a timer (del_timer returns 1) we know there
can't be any error handling in progress because the timeout handler hasn't
expired yet, thus we can safely decrement the refcounter by one.
If we are not able to cancel the timer, we know an abort handler is
running. We have to make sure we call sp->done() in the abort handlers
before calling kref_put().
Link: https://lore.kernel.org/r/20220110050218.3958-3-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Co-developed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d4523bd6fd5d3afa9f08a86038a8a92176089f5b upstream.
Move common open-coded asynchronous command initializing code such as
setting up the timer and the done callback into one function. This is a
preparation step and allows us later on to change the low level error flow
handling at a central place.
Link: https://lore.kernel.org/r/20220110050218.3958-2-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 208cc9fe6f21112b5cc6cb87065fb8ab66e79316 ]
The driver must perform its 4GB boundary check using the pool's DMA address
instead of using the virtual address.
Link: https://lore.kernel.org/r/20220303140230.13098-1-sreekanth.reddy@broadcom.com
Fixes: d6adc251dd2f ("scsi: mpt3sas: Force PCIe scatterlist allocations to be within same 4 GB region")
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c4e070457a93705e56ed06b3910d9e5fe56d3be3 ]
Currently the permission of parameter prot_mask is 0x0, which means that
the member does not appear in sysfs. Change it as other module parameters
to 0444 for world-readable.
[mkp: s/v3/v2/]
Link: https://lore.kernel.org/r/1645703489-87194-2-git-send-email-john.garry@huawei.com
Fixes: d6a9000b81be ("scsi: hisi_sas: Add support for DIF feature for v2 hw")
Reported-by: Yihang Li <liyihang6@hisilicon.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1d6736c3e162061dc811c76e605f35ef3234bffa ]
NCQ NON DATA is an NCQ command with the DMA_NONE DMA direction and so a
register-device-to-host-FIS response is expected for it.
However, for an IO_SUCCESS case, mpi_sata_completion() expects a
set-device-bits-FIS for any ata task with an use_ncq field true, which
includes NCQ NON DATA commands.
Fix this to correctly treat NCQ NON DATA commands as non-data by also
testing for the DMA_NONE DMA direction.
Link: https://lore.kernel.org/r/20220220031810.738362-16-damien.lemoal@opensource.wdc.com
Fixes: dbf9bfe61571 ("[SCSI] pm8001: add SAS/SATA HBA driver")
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aa028141ab0bc62c44a84d42f09db35d82df82a2 ]
In the pm8001_chip_sata_req() and pm80xx_chip_sata_req() functions, all
tasks with a DMA direction of DMA_NONE (no data transfer) are initialized
using the ATAP value 0x04. However, NCQ NON DATA commands, while being
DMA_NONE commands are NCQ commands and need to be initialized using the
value 0x07 for ATAP, similarly to other NCQ commands.
Make sure that NCQ NON DATA command tasks are initialized similarly to
other NCQ commands by also testing the task "use_ncq" field in addition to
the DMA direction. While at it, reorganize the code into a chain of if -
else if - else to avoid useless affectations and debug messages.
Link: https://lore.kernel.org/r/20220220031810.738362-15-damien.lemoal@opensource.wdc.com
Fixes: dbf9bfe61571 ("[SCSI] pm8001: add SAS/SATA HBA driver")
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fd6d0e376211d7ed759db96b0fbd9a1cee67d462 ]
Make sure that the __le32 fields of struct sata_cmd are manipulated after
applying the correct endian conversion. That is, use cpu_to_le32() for
assigning values and le32_to_cpu() for consulting a field value. In
particular, make sure that the calculations for the 4G boundary check are
done using CPU endianness and *not* little endian values. With these fixes,
many sparse warnings are removed.
While at it, fix some code identation and add blank lines after variable
declarations and in some other places to make this code more readable.
Link: https://lore.kernel.org/r/20220220031810.738362-12-damien.lemoal@opensource.wdc.com
Fixes: 0ecdf00ba6e5 ("[SCSI] pm80xx: 4G boundary fix.")
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 970404cc5744b1033b6ee601be4ef0e2d1fbcf72 ]
Make sure that the __le32 fields of struct ssp_ini_io_start_req are
manipulated after applying the correct endian conversion. That is, use
cpu_to_le32() for assigning values and le32_to_cpu() for consulting a field
value. In particular, make sure that the calculations for the 4G boundary
check are done using CPU endianness and *not* little endian values. With
these fixes, many sparse warnings are removed.
While at it, add blank lines after variable declarations and in some other
places to make this code more readable.
Link: https://lore.kernel.org/r/20220220031810.738362-11-damien.lemoal@opensource.wdc.com
Fixes: 0ecdf00ba6e5 ("[SCSI] pm80xx: 4G boundary fix.")
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f8b12dfb476dad38ce755aaf5e2df46f06f1822e ]
All fields of the kek_mgmt_req structure have the type __le32. So make sure
to use cpu_to_le32() to initialize them. This suppresses the sparse
warning:
warning: incorrect type in assignment (different base types)
expected restricted __le32 [addressable] [assigned] [usertype] new_curidx_ksop
got int
Link: https://lore.kernel.org/r/20220220031810.738362-10-damien.lemoal@opensource.wdc.com
Fixes: f5860992db55 ("[SCSI] pm80xx: Added SPCv/ve specific hardware functionalities and relevant changes in common files")
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ca374f5d92b8ae778f6a37dd3e7ed809bbf7a953 ]
All fields of the SASProtocolTimerConfig structure have the __le32 type.
As such, use cpu_to_le32() to initialize them. This change suppresses many
sparse warnings:
warning: incorrect type in assignment (different base types)
expected restricted __le32 [addressable] [usertype] pageCode
got int
Note that the check to limit the value of the STP_IDLE_TMO field is removed
as this field is initialized using the fixed (and small) value defined by
the STP_IDLE_TIME macro.
The pm8001_dbg() calls printing the values of the SASProtocolTimerConfig
structure fileds are changed to use le32_to_cpu() to present the values in
human readable form.
Link: https://lore.kernel.org/r/20220220031810.738362-9-damien.lemoal@opensource.wdc.com
Fixes: a6cb3d012b98 ("[SCSI] pm80xx: thermal, sas controller config and error handling update")
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bb225b12dbcc82d53d637d10b8d70b64494f8c16 ]
The fields of the set_ctrl_cfg_req structure have the __le32 type, so use
cpu_to_le32() to assign them. This removes the sparse warnings:
warning: incorrect type in assignment (different base types)
expected restricted __le32
got unsigned int
Link: https://lore.kernel.org/r/20220220031810.738362-8-damien.lemoal@opensource.wdc.com
Fixes: 842784e0d15b ("pm80xx: Update For Thermal Page Code")
Fixes: f5860992db55 ("[SCSI] pm80xx: Added SPCv/ve specific hardware functionalities and relevant changes in common files")
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cd2268a180117aa8ebb23e090ba204324b2d0e93 ]
The ds_ads_m field of struct ssp_ini_tm_start_req has the type __le32.
Assigning a value to it should thus use cpu_to_le32(). This fixes the
sparse warning:
warning: incorrect type in assignment (different base types)
expected restricted __le32 [addressable] [assigned] [usertype] ds_ads_m
got int
Link: https://lore.kernel.org/r/20220220031810.738362-7-damien.lemoal@opensource.wdc.com
Fixes: dbf9bfe61571 ("[SCSI] pm8001: add SAS/SATA HBA driver")
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1a37b6738b58d86f6b144b3fc754ace0f2e0166d ]
Since the sata_cmd struct is zeroed out before its fields are initialized,
there is no need for using "|=" to initialize the ncqtag_atap_dir_m
field. Using a standard assignment removes the sparse warning:
warning: invalid assignment: |=
Also, since the ncqtag_atap_dir_m field has type __le32, use cpu_to_le32()
to generate the assigned value.
Link: https://lore.kernel.org/r/20220220031810.738362-5-damien.lemoal@opensource.wdc.com
Fixes: c6b9ef5779c3 ("[SCSI] pm80xx: NCQ error handling changes")
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3032ed77a28913203a4fe0ab8f05752331af79b3 ]
Report both the command flags and command state instead of only the
command state.
Link: https://lore.kernel.org/r/20220218195117.25689-22-bvanassche@acm.org
Fixes: 4d7007b49d52 ("[SCSI] fnic: Fnic Trace Utility")
Cc: Hiral Patel <hiralpat@cisco.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 8454563e4c2aafbfb81a383ab423ea8b9b430a25 upstream.
To detect for the DMA_NONE (no data transfer) DMA direction,
sas_ata_qc_issue() tests if the command protocol is ATA_PROT_NODATA. This
test does not include the ATA_CMD_NCQ_NON_DATA command as this command
protocol is defined as ATA_PROT_NCQ_NODATA (equal to ATA_PROT_FLAG_NCQ) and
not as ATA_PROT_NODATA.
To include both NCQ and non-NCQ commands when testing for the DMA_NONE DMA
direction, use "!ata_is_data()".
Link: https://lore.kernel.org/r/20220220031810.738362-2-damien.lemoal@opensource.wdc.com
Fixes: 176ddd89171d ("scsi: libsas: Reset num_scatter if libata marks qc as NODATA")
Cc: stable@vger.kernel.org
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 07e0984b96ec1ba8c6de1c092b986b00ea0c114c upstream.
In the original FPIN commit, stats were incremented by the event_count.
Event_count is the minimum # of events that must occur before an FPIN is
sent. Thus, its not the actual number of events, and could be significantly
off (too low) as it doesn't reflect anything not reported. Rather than
attempt to count events, have the statistic count how many FPINS cross the
threshold and were reported.
Link: https://lore.kernel.org/r/20220301175536.60250-1-jsmart2021@gmail.com
Fixes: 3dcfe0de5a97 ("scsi: fc: Parse FPIN packets and update statistics")
Cc: <stable@vger.kernel.org> # v5.11+
Cc: Shyam Sundar <ssundar@marvell.com>
Cc: Nilesh Javali <njavali@marvell.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 71bb9ab6e3511b7bb98678a19eb8cf1ccbf3ca2f upstream.
Kernel messages produced during runtime PM can cause a never-ending cycle
because user space utilities (e.g. journald or rsyslog) write the messages
back to storage, causing runtime resume, more messages, and so on.
Messages that tell of things that are expected to happen, are arguably
unnecessary, so suppress them.
UFS driver messages are changes to from dev_err() to dev_dbg() which means
they will not display unless activated by dynamic debug of building with
-DDEBUG.
sdev->silence_suspend is set to skip messages from sd_suspend_common()
"Synchronizing SCSI cache", "Stopping disk" and scsi_report_sense()
"Power-on or device reset occurred" message (Note, that message appears
when the LUN is accessed after runtime PM, not during runtime PM)
Example messages from Ubuntu 21.10:
$ dmesg | tail
[ 1620.380071] ufshcd 0000:00:12.5: ufshcd_print_pwr_info:[RX, TX]: gear=[1, 1], lane[1, 1], pwr[SLOWAUTO_MODE, SLOWAUTO_MODE], rate = 0
[ 1620.408825] ufshcd 0000:00:12.5: ufshcd_print_pwr_info:[RX, TX]: gear=[4, 4], lane[2, 2], pwr[FAST MODE, FAST MODE], rate = 2
[ 1620.409020] ufshcd 0000:00:12.5: ufshcd_find_max_sup_active_icc_level: Regulator capability was not set, actvIccLevel=0
[ 1620.409524] sd 0:0:0:0: Power-on or device reset occurred
[ 1622.938794] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 1622.939184] ufs_device_wlun 0:0:0:49488: Power-on or device reset occurred
[ 1625.183175] ufshcd 0000:00:12.5: ufshcd_print_pwr_info:[RX, TX]: gear=[1, 1], lane[1, 1], pwr[SLOWAUTO_MODE, SLOWAUTO_MODE], rate = 0
[ 1625.208041] ufshcd 0000:00:12.5: ufshcd_print_pwr_info:[RX, TX]: gear=[4, 4], lane[2, 2], pwr[FAST MODE, FAST MODE], rate = 2
[ 1625.208311] ufshcd 0000:00:12.5: ufshcd_find_max_sup_active_icc_level: Regulator capability was not set, actvIccLevel=0
[ 1625.209035] sd 0:0:0:0: Power-on or device reset occurred
Note for stable: depends on patch "scsi: core: sd: Add silence_suspend flag
to suppress some PM messages".
Link: https://lore.kernel.org/r/20220228113652.970857-3-adrian.hunter@intel.com
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af4edb1d50c6d1044cb34bc43621411b7ba2cffe upstream.
Kernel messages produced during runtime PM can cause a never-ending cycle
because user space utilities (e.g. journald or rsyslog) write the messages
back to storage, causing runtime resume, more messages, and so on.
Messages that tell of things that are expected to happen are arguably
unnecessary, so add a flag to suppress them. This flag is used by the UFS
driver.
Link: https://lore.kernel.org/r/20220228113652.970857-2-adrian.hunter@intel.com
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b81e0c2372e65e5627864ba034433b64b2fc73f5 upstream.
Drop various include not actually used in genhd.h itself, and
move the remaning includes closer together.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20210920123328.1399408-15-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>a
Reported-by: "H. Nikolaus Schaller" <hns@goldelico.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: "Maciej W. Rozycki" <macro@orcam.me.uk>
[ resolves MIPS build failure by luck, root cause needs to be fixed in
Linus's tree properly, but this is needed for now to fix the build - gregkh ]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 33172ab50a53578a95691310f49567c9266968b0 upstream.
It isn't enough to check whether a grant is still being in use by
calling gnttab_query_foreign_access(), as a mapping could be realized
by the other side just after having called that function.
In case the call was done in preparation of revoking a grant it is
better to do so via gnttab_try_end_foreign_access() and check the
success of that operation instead.
This is CVE-2022-23038 / part of XSA-396.
Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f10f582d28220f50099d3f561116256267821429 upstream.
This fixes a deadlock added with commit b40f3894e39e ("scsi: qedi: Complete
TMF works before disconnect")
Bug description from Jia-Ju Bai:
qedi_process_tmf_resp()
spin_lock(&session->back_lock); --> Line 201 (Lock A)
spin_lock(&qedi_conn->tmf_work_lock); --> Line 230 (Lock B)
qedi_process_cmd_cleanup_resp()
spin_lock_bh(&qedi_conn->tmf_work_lock); --> Line 752 (Lock B)
spin_lock_bh(&conn->session->back_lock); --> Line 784 (Lock A)
When qedi_process_tmf_resp() and qedi_process_cmd_cleanup_resp() are
concurrently executed, the deadlock can occur.
This patch fixes the deadlock by not holding the tmf_work_lock in
qedi_process_cmd_cleanup_resp while holding the back_lock. The
tmf_work_lock is only needed while we remove the tmf_work from the
work_list.
Link: https://lore.kernel.org/r/20220208185448.6206-1-michael.christie@oracle.com
Fixes: b40f3894e39e ("scsi: qedi: Complete TMF works before disconnect")
Cc: Manish Rangankar <mrangankar@marvell.com>
Cc: Nilesh Javali <njavali@marvell.com>
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7f4c5a26f735dea4bbc0eb8eb9da99cda95a8563 upstream.
When connected point to point, the driver does not know the FC4's supported
by the other end. In Fabrics, it can query the nameserver. Thus the driver
must send PRLIs for the FC4s it supports and enable support based on the
acc(ept) or rej(ect) of the respective FC4 PRLI. Currently the driver
supports SCSI and NVMe PRLIs.
Unfortunately, although the behavior is per standard, many devices have
come to expect only SCSI PRLIs. In this particular example, the NVMe PRLI
is properly RJT'd but the target decided that it must LOGO after seeing the
unexpected NVMe PRLI. The LOGO causes the sequence to restart and login is
now in an infinite failure loop.
Fix the problem by having the driver, on a pt2pt link, remember NVMe PRLI
accept or reject status across logout as long as the link stays "up". When
retrying login, if the prior NVMe PRLI was rejected, it will not be sent on
the next login.
Link: https://lore.kernel.org/r/20220212163120.15385-1-jsmart2021@gmail.com
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 945c3cca05d78351bba29fa65d93834cb7934c7b upstream.
The following deadlock has been observed on a test setup:
- All tags allocated
- The SCSI error handler calls ufshcd_eh_host_reset_handler()
- ufshcd_eh_host_reset_handler() queues work that calls
ufshcd_err_handler()
- ufshcd_err_handler() locks up as follows:
Workqueue: ufs_eh_wq_0 ufshcd_err_handler.cfi_jt
Call trace:
__switch_to+0x298/0x5d8
__schedule+0x6cc/0xa94
schedule+0x12c/0x298
blk_mq_get_tag+0x210/0x480
__blk_mq_alloc_request+0x1c8/0x284
blk_get_request+0x74/0x134
ufshcd_exec_dev_cmd+0x68/0x640
ufshcd_verify_dev_init+0x68/0x35c
ufshcd_probe_hba+0x12c/0x1cb8
ufshcd_host_reset_and_restore+0x88/0x254
ufshcd_reset_and_restore+0xd0/0x354
ufshcd_err_handler+0x408/0xc58
process_one_work+0x24c/0x66c
worker_thread+0x3e8/0xa4c
kthread+0x150/0x1b4
ret_from_fork+0x10/0x30
Fix this lockup by making ufshcd_exec_dev_cmd() allocate a reserved
request.
Link: https://lore.kernel.org/r/20211203231950.193369-10-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d77ea8226b3be23b0b45aa42851243b62a27bda1 upstream.
Commit 7252a3603015 ("scsi: ufs: Avoid busy-waiting by eliminating tag
conflicts") guarantees that 'tag' is not in use by any SCSI command.
Remove the check that returns early if a conflict occurs.
Link: https://lore.kernel.org/r/20211203231950.193369-6-bvanassche@acm.org
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Acked-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit df7abcaa1246e2537ab4016077b5443bb3c09378 ]
Currently a use-after-free may occur if a sas_task is aborted by the upper
layer before we handle the I/O completion in mpi_ssp_completion() or
mpi_sata_completion().
In this case, the following are the two steps in handling those I/O
completions:
- Call complete() to inform the upper layer handler of completion of
the I/O.
- Release driver resources associated with the sas_task in
pm8001_ccb_task_free() call.
When complete() is called, the upper layer may free the sas_task. As such,
we should not touch the associated sas_task afterwards, but we do so in the
pm8001_ccb_task_free() call.
Fix by swapping the complete() and pm8001_ccb_task_free() calls ordering.
Link: https://lore.kernel.org/r/1643289172-165636-4-git-send-email-john.garry@huawei.com
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 61f162aa4381845acbdc7f2be4dfb694d027c018 ]
Currently a use-after-free may occur if a TMF sas_task is aborted before we
handle the IO completion in mpi_ssp_completion(). The abort occurs due to
timeout.
When the timeout occurs, the SAS_TASK_STATE_ABORTED flag is set and the
sas_task is freed in pm8001_exec_internal_tmf_task().
However, if the I/O completion occurs later, the I/O completion still
thinks that the sas_task is available. Fix this by clearing the ccb->task
if the TMF times out - the I/O completion handler does nothing if this
pointer is cleared.
Link: https://lore.kernel.org/r/1643289172-165636-3-git-send-email-john.garry@huawei.com
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit edb854a3680bacc9ef9b91ec0c5ff6105886f6f3 ]
We currently use ->cmd_per_lun as initial queue depth for setting up the
budget_map. Martin Wilck reported that it is common for the queue_depth to
be subsequently updated in slave_configure() based on detected hardware
characteristics.
As a result, for some drivers, the static host template settings for
cmd_per_lun and can_queue won't actually get used in practice. And if the
default values are used to allocate the budget_map, memory may be consumed
unnecessarily.
Fix the issue by reallocating the budget_map after ->slave_configure()
returns. At that time the device queue_depth should accurately reflect what
the hardware needs.
Link: https://lore.kernel.org/r/20220127153733.409132-1-ming.lei@redhat.com
Cc: Bart Van Assche <bvanassche@acm.org>
Reported-by: Martin Wilck <martin.wilck@suse.com>
Suggested-by: Martin Wilck <martin.wilck@suse.com>
Tested-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>