IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Use the module_pci_driver() macro to make the code simpler by eliminating
module_init() and module_exit() calls.
Link: https://lore.kernel.org/r/20200917071044.1909268-1-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This has no change in behavior, but improves the accounting a bit.
Link: https://lore.kernel.org/r/20201005084130.143273-11-hch@lst.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Move this trivial functionality into scsi_prepare_cmd() instead of
splitting it over multiple small functions, and update the comments to
better document passthrough commands as the special case.
Link: https://lore.kernel.org/r/20201005084130.143273-10-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Rename scsi_init_io() to scsi_alloc_sgtables(), and ensure callers call
scsi_free_sgtables() to cleanup failures close to scsi_init_io() instead of
leaking it down the generic I/O submission path.
Link: https://lore.kernel.org/r/20201005084130.143273-9-hch@lst.de
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The old name is rather confusing now that the the legacy prep_fn is gone.
Link: https://lore.kernel.org/r/20201005084130.143273-8-hch@lst.de
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The old name is rather confusing now that the the legacy prep_fn is gone.
Link: https://lore.kernel.org/r/20201005084130.143273-7-hch@lst.de
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We only need to detect the command size for ioctl request from userspace,
which is limited to the passthrough path. Move the check there instead of
doing it for all queuecommand invocations.
Link: https://lore.kernel.org/r/20201005084130.143273-4-hch@lst.de
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is no good reason to keep this functionality as a separate function,
just merge it into the only caller.
Link: https://lore.kernel.org/r/20201005084130.143273-3-hch@lst.de
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This function is only used by code built into scsi_mod.ko.
Link: https://lore.kernel.org/r/20201005084130.143273-2-hch@lst.de
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently the PHY state is set according to the state of the PHYs after
reset. This is invalid as the PHYs are already re-initialized.
Set PHY state according to the state before the reset instead of after.
Link: https://lore.kernel.org/r/1601649038-25534-8-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently sas_resume_ha() is called while resuming the controller to wait
for all suspended PHYs to come up and all the libsas events to be
completed.
There is a scenario which will cause task hung: For direct attach with two
disks connected with two PHYs, disable phy0 before suspending the disk on
phy1 and the controller, then enable phy0 and resume the controller, and
task hung occurs as follows:
[ 591.901463] hisi_sas_v3_hw 0000:b4:02.0: resuming from operating state [D0]
[ 593.113525] hisi_sas_v3_hw 0000:b4:02.0: neither _PS0 nor _PR0 is defined
[ 593.120301] hisi_sas_v3_hw 0000:b4:02.0: waiting up to 25 seconds for 1 phy to resume
[ 593.120836] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata)
[ 593.134680] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy1 link_rate=10(sata)
[ 593.134733] sas: phy-2:0 added to port-2:0, phy_mask:0x1 (5000000000000200)
[ 593.148350] sas: DOING DISCOVERY on port 0, pid:948
[ 593.153227] hisi_sas_v3_hw 0000:b4:02.0: dev[3:5] found
[ 593.159840] sas: Enter sas_scsi_recover_host busy: 0 failed: 0
[ 593.165663] sas: ata7: end_device-2:0: dev error handler
[ 593.165730] sas: ata2: end_device-2:1: dev error handler
[ 593.172532] hisi_sas_v3_hw 0000:b4:02.0: phydown: phy0 phy_state=0x2
[ 593.182570] hisi_sas_v3_hw 0000:b4:02.0: ignore flutter phy0 down
[ 593.331277] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata)
[ 593.498956] ata7.00: ATA-11: SAMSUNG MZ7LH960HAJR-00005, HXT7404Q, max UDMA/133
[ 593.506235] ata7.00: 1875385008 sectors, multi 16: LBA48 NCQ (depth 32)
[ 593.514295] ata7.00: configured for UDMA/133
[ 593.518557] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
[ 593.528613] sas: ata7: end_device-2:0: model:SAMSUNG MZ7LH960HAJR-00005
serial:S45NNA0M712225
[ 593.537520] device_link_add 316: dev=2:0:2:0 supplier:2 consumer:0
[ 593.543674] device_link_add 324
[ 593.546801] device_link_add 352
[ 593.549930] device_link_add 406
[ 593.553058] device_link_add 440: dev=2:0:2:0 supplier:2 consumer:0
[ 593.559208] device_link_add 444
[ 593.562335] device_link_add 455
[ 593.565517] scsi 2:0:2:0: Direct-Access ATA SAMSUNG MZ7LH960 404Q PQ: 0
ANSI: 5
[ 620.057464] phy-2:1: resume timeout
[ 738.841445] INFO: task kworker/u256:0:8 blocked for more than 120 seconds.
[ 738.848295] Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744
[ 738.854361] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 738.862155] kworker/u256:0 D 0 8 2 0x00000028
[ 738.867626] Workqueue: 0000:b4:02.0_event_q sas_port_event_worker
[ 738.873693] Call trace:
[ 738.876133] __switch_to+0xf4/0x148
[ 738.879613] __schedule+0x270/0x5d8
[ 738.883091] schedule+0x78/0x110
[ 738.886307] schedule_timeout+0x1ac/0x280
[ 738.890299] wait_for_completion+0x94/0x138
[ 738.894472] flush_workqueue+0x114/0x438
[ 738.898377] sas_porte_bytes_dmaed+0x400/0x500
[ 738.902801] sas_port_event_worker+0x28/0x40
[ 738.907053] process_one_work+0x1e8/0x360
[ 738.911046] worker_thread+0x44/0x478
[ 738.914698] kthread+0x150/0x158
[ 738.917915] ret_from_fork+0x10/0x1c
[ 738.921534] INFO: task kworker/u256:1:948 blocked for more than 120 seconds.
[ 738.928550] Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744
[ 738.934614] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 738.942408] kworker/u256:1 D 0 948 2 0x00000028
[ 738.947873] Workqueue: 0000:b4:02.0_disco_q sas_discover_domain
[ 738.953766] Call trace:
[ 738.956203] __switch_to+0xf4/0x148
[ 738.959678] __schedule+0x270/0x5d8
[ 738.963152] schedule+0x78/0x110
[ 738.966368] rpm_resume+0xcc/0x550
[ 738.969757] __pm_runtime_resume+0x3c/0x88
[ 738.973836] rpm_get_suppliers+0x50/0x148
[ 738.977829] __pm_runtime_set_status+0x124/0x2f0
[ 738.982427] scsi_sysfs_add_sdev+0x1a0/0x2a8
[ 738.986679] scsi_probe_and_add_lun+0x888/0xab0
[ 738.991190] __scsi_scan_target+0xec/0x520
[ 738.995268] scsi_scan_target+0x11c/0x128
[ 738.999261] sas_rphy_add+0x15c/0x1e8
[ 739.002907] sas_probe_devices+0xe4/0x150
[ 739.006899] sas_discover_domain+0x33c/0x588
[ 739.011150] process_one_work+0x1e8/0x360
[ 739.015143] worker_thread+0x44/0x478
[ 739.018789] kthread+0x150/0x158
[ 739.022003] ret_from_fork+0x10/0x1c
...
If an extra phy0 up happens during resume of the SAS controller, it will
emit a new libsas event (event PORTE_BYTES_DMAED and event
DISCE_DISCOVER_DOMAIN). We will call function scsi_sysfs_add_sdev() in
event DISCE_DISCOVER_DOMAIN, which will call __pm_runtime_set_status() to
resume supplier (host controller). For runtime PM core, if device is in the
resuming state, the later resume request of the device will wait for
previous resume request to complete synchronously. At that point in time
the state of the controller is still resuming as it waits for all libsas
events to be completed, while libsas event DISCE_DISCOVER_DOMAIN is blocked
as the state of the controller is resuming which causes a deadlock.
To avoid the issue, filter out new PHY up events while the controller is
suspended.
Link: https://lore.kernel.org/r/1601649038-25534-7-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Runtime PM of SCSI devices is already supported in SCSI layer, we can
suspend/resume every SCSI device separately. But if there is no link
between hisi_hba and SCSI devices or SCSI targets it will cause issues if
the controller is suspended while SCSI devices are still resuming. Only
when all the SCSI devices under the controller are suspended, the
controller can be suspended. Add the device link between SCSI devices
and the controller.
Link: https://lore.kernel.org/r/1601649038-25534-6-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
To support system suspend/resume or runtime suspend/resume, need to use the
function pci_set_power_state() to change the power state which requires at
least method _PS0 or _PR0 be filled by platform for v3 hw. So check whether
the method is supported, if not, print a warning.
A Kconfig dependency is added as there is no stub for
acpi_device_power_manageable().
Link: https://lore.kernel.org/r/1601649038-25534-5-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For v3 hw we will add support for runtime PM which is only supported in new
framework. Legacy PM support and new framework are not allowed to be used
together. Switch to new framework to support suspend and resume.
Link: https://lore.kernel.org/r/1601649038-25534-3-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A call trace is observed when running function level reset with online CPUs
less than 16 and MSI auto-affinity enabled.
[16538.348038] Call trace:
[16538.348422] pci_irq_vector+0x98/0xc0
[16538.348947] disable_host_v3_hw+0x8c/0x288 [hisi_sas_v3_hw]
[16538.349706] hisi_sas_reset_prepare_v3_hw+0x60/0x88 [hisi_sas_v3_hw]
[16538.350631] pci_dev_save_and_disable+0x38/0x68
[16538.351290] pci_reset_function+0x44/0x88
[16538.351846] reset_store+0x6c/0xb8
[16538.352429] dev_attr_store+0x44/0x60
[16538.353035] sysfs_kf_write+0x58/0x80
[16538.353558] kernfs_fop_write+0x140/0x230
[16538.354175] __vfs_write+0x48/0x80
[16538.354675] vfs_write+0xb8/0x1d8
[16538.355145] ksys_write+0x74/0xf8
[16538.355615] __arm64_sys_write+0x24/0x30
[16538.356240] el0_svc_common.constprop.4+0x80/0x1f0
[16538.356905] do_el0_svc+0x2c/0x38
[16538.357408] el0_svc+0x14/0x40
[16538.357848] el0_sync_handler+0xbc/0x2ec
[16538.358388] el0_sync+0x140/0x180
The reason is that if we use pci_alloc_irq_vectors_affinity() to allocate
IRQs, the number of CQ IRQs can only be less than or equal to the number of
online CPUs, but we use hisi_hba->queue_count (always 16) to iterate during
interrupt_disable_v3_hw().
Use hisi_hba->cq_nvecs to replace hisi_hba->queue_count to avoid
synchronize IRQ on a CPU which does not exist.
Link: https://lore.kernel.org/r/1601649038-25534-2-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes coccicheck warning:
drivers/scsi/lpfc/lpfc_attr.c:5341:5-11: Unneeded variable: "status". Return "- EINVAL" on line 5342
Link: https://lore.kernel.org/r/20200916022859.349089-1-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes coccicheck warning:
drivers/scsi/qla4xxx/ql4_init.c:1173:5-11: Unneeded variable: "status". Return "QLA_ERROR" on line 1195
Link: https://lore.kernel.org/r/20200916022749.348923-1-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If we fail to issue the iocb in lpfc_gen_req() we need to drop the nodelist
reference.
Link: https://lore.kernel.org/r/20200910084059.138507-1-hare@suse.de
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The be_fill_queue() function can only fail when "eq_vaddress" is NULL and
since it's non-NULL here that means the function call can't fail. But
imagine if it could, then in that situation we would want to store the
"paddr" so that dma memory can be released.
Link: https://lore.kernel.org/r/20200928091300.GD377727@mwanda
Fixes: bfead3b2cb ("[SCSI] be2iscsi: Adding msix and mcc_rings V3")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On fan failure event from MFW, bring down active connections and unload
the firmware context.
Link: https://lore.kernel.org/r/20200924070338.8270-1-mrangankar@marvell.com
Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Corrects drivers/target/target_core_user.c:688:6: warning: 'page' may be
used uninitialized.
Link: https://lore.kernel.org/r/20200924001920.43594-1-john.p.donnelly@oracle.com
Fixes: 3c58f73723 ("scsi: target: tcmu: Optimize use of flush_dcache_page")
Cc: Mike Christie <michael.christie@oracle.com>
Acked-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix the following warnings:
[drivers/scsi/fnic/fnic_debugfs.c:123]: (warning) %u in format string (no. 1)
requires 'unsigned int' but the argument type is 'int'.
[drivers/scsi/fnic/fnic_debugfs.c:125]: (warning) %u in format string (no. 1)
requires 'unsigned int' but the argument type is 'int'.
[drivers/scsi/fnic/fnic_debugfs.c:127]: (warning) %u in format string (no. 1)
requires 'unsigned int' but the argument type is 'int'.
Link: https://lore.kernel.org/r/20200930021919.2832860-2-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Acked-by: Karan Tilak Kumar <kartilak@cisco.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
ret is always zero or error so the assignment is redundant.
Link: https://lore.kernel.org/r/20200925060754.156599-1-jingxiangfeng@huawei.com
Acked-by: Karan Tilak Kumar <kartilak@cisco.com>
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The fnic drivers assigns an ioreq structure to each command and severs this
assignment once scsi_done() has been called and the command has been
completed.
When traversing commands to terminate outstanding I/O we should not call
scsi_done() on commands which do not have a corresponding ioreq structure;
these commands have either never entered the driver or have already been
completed.
[mkp: fixed unused label warning]
Link: https://lore.kernel.org/r/20200515112647.49260-1-hare@suse.de
Reported-by: kbuild test robot <lkp@intel.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Acked-by: Satish Kharat <satishkh@cisco.com>
Acked-by: Karan Tilak Kumar <kartilak@cisco.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For older versions of gcc, the array = {0}; will cause warnings:
drivers/scsi/ufs/ufshcd-crypto.c: In function 'ufshcd_crypto_keyslot_program':
drivers/scsi/ufs/ufshcd-crypto.c:62:8: warning: missing braces around initializer [-Wmissing-braces]
union ufs_crypto_cfg_entry cfg = { 0 };
^
drivers/scsi/ufs/ufshcd-crypto.c:62:8: warning: (near initialization for 'cfg.reg_val') [-Wmissing-braces]
drivers/scsi/ufs/ufshcd-crypto.c: In function 'ufshcd_clear_keyslot':
drivers/scsi/ufs/ufshcd-crypto.c:103:8: warning: missing braces around initializer [-Wmissing-braces]
union ufs_crypto_cfg_entry cfg = { 0 };
^
2 warnings generated
Link: https://lore.kernel.org/r/20201002063538.1250-1-shipujin.t@gmail.com
Fixes: 70297a8ac7 ("scsi: ufs: UFS crypto API")
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Pujin Shi <shipujin.t@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix the following warning:
[drivers/scsi/qla2xxx/qla_dbg.c:2451]: (warning) %ld in format string (no. 4)
requires 'long' but the argument type is 'unsigned long'.
Link: https://lore.kernel.org/r/20200930022515.2862532-4-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix the following warnings:
[drivers/scsi/qla2xxx/qla_os.c:4882]: (warning) %ld in format string (no. 2)
requires 'long' but the argument type is 'unsigned long'.
[drivers/scsi/qla2xxx/qla_os.c:5011]: (warning) %ld in format string (no. 1)
requires 'long' but the argument type is 'unsigned long'.
Link: https://lore.kernel.org/r/20200930022515.2862532-3-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix the following warnings:
[drivers/scsi/qla2xxx/tcm_qla2xxx.c:884]: (warning) %u in format string (no. 1)
requires 'unsigned int' but the argument type is 'signed int'.
[drivers/scsi/qla2xxx/tcm_qla2xxx.c:885]: (warning) %u in format string (no. 1)
requires 'unsigned int' but the argument type is 'signed int'.
[drivers/scsi/qla2xxx/tcm_qla2xxx.c:886]: (warning) %u in format string (no. 1)
requires 'unsigned int' but the argument type is 'signed int'.
[drivers/scsi/qla2xxx/tcm_qla2xxx.c:887]: (warning) %u in format string (no. 1)
requires 'unsigned int' but the argument type is 'signed int'.
[drivers/scsi/qla2xxx/tcm_qla2xxx.c:888]: (warning) %u in format string (no. 1)
requires 'unsigned int' but the argument type is 'signed int'.
Link: https://lore.kernel.org/r/20200930022515.2862532-2-yebin10@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some iSCSI targets went with the traditional "export N ports" approach and
then allowed the initiator to multipath over them. Other targets went the
opposite direction and export a single port, and then software on the
target side performs load balancing and failover to other targets via an
iSCSI specific feature or IP takover.
The problem for the 2nd type of config is we quickly run out of our five
retries and get I/O errors. In these setups we want to reduce resource use
on the initiator side so we only wanted the one session and no
dm-multipath. To handle traditional multipath operations like failover we
do IP takover on the target side. So we would have an iSCSI target running
on node1. Some monitoring software decides it's dead or the node is
overloaded so it starts the iSCSI target on node2. The problem is for the
failover case where we might have the equivalent of a dm-multipath
temporary all paths down, or we just have to try more than 5 nodes before
finding a good one.
To handle this type of issue allow the user to configure the disk cmd
retries from -1 to the current max of 5. -1 means infinite retries and
should be used for setups where some other setting is going to control when
to fail. For example iSCSI has the replacement/recovery timeout and fc
(some users have used FC with NPIV and done something similar as IP
takover) has dev_loss_tmo/fast_io_fail which will eventually expire and
fail I/O.
Link: https://lore.kernel.org/r/1601566554-26752-3-git-send-email-michael.christie@oracle.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add infinite retry support to SCSI midlayer by combining common checks for
retries into some helper functions, and then checking for the
-1/SCSI_CMD_RETRIES_NO_LIMIT.
Link: https://lore.kernel.org/r/1601566554-26752-2-git-send-email-michael.christie@oracle.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
trace-cmd report doesn't show events from target subsystem because
scsi_command_size() leaks through event format string:
[target:target_sequencer_start] function scsi_command_size not defined
[target:target_cmd_complete] function scsi_command_size not defined
Addition of scsi_command_size() to plugin_scsi.c in trace-cmd doesn't
help because an expression is used inside TP_printk(). trace-cmd event
parser doesn't understand minus sign inside [ ]:
Error: expected ']' but read '-'
Rather than duplicating kernel code in plugin_scsi.c, provide a dedicated
field for CONTROL byte.
Link: https://lore.kernel.org/r/20200929125957.83069-1-r.bolshakov@yadro.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Driver was using a shorter timeout waiting for PLOGI from the peer in
point-to-point configurations. Some devices takes some time (~4 seconds) to
initiate the PLOGI. This peer initiating PLOGI is when the peer has a
higher P-WWN.
Increase the wait time based on N2N R_A_TOV.
Link: https://lore.kernel.org/r/20200929102152.32278-7-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On unload, session cleanup prematurely gave the signal for driver unload
path to advance.
Link: https://lore.kernel.org/r/20200929102152.32278-6-njavali@marvell.com
Fixes: 726b854870 ("qla2xxx: Add framework for async fabric discovery")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Normally, the MPI firmware is reset when an MPI dump is collected. If an
unsaved MPI dump exists in the driver, though, an alternate mechanism is
used. This mechanism, which was not fully correct, is not recommended and
instead an MPI dump template walk is suggested to perform the MPI reset.
To allow for the MPI dump template walk, extra space is reserved in the MPI
dump buffer which gets used only when there is already an MPI dump in
place.
Link: https://lore.kernel.org/r/20200929102152.32278-5-njavali@marvell.com
Fixes: cbb01c2f2f ("scsi: qla2xxx: Fix MPI failure AEN (8200) handling")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When printing the message:
"MPI Heartbeat stop. MPI reset is not needed.."
..the wrong register was checked leading to always printing that MPI reset
is not needed, even when it is needed. Fix the MPI reset message.
Link: https://lore.kernel.org/r/20200929102152.32278-4-njavali@marvell.com
Fixes: cbb01c2f2f ("scsi: qla2xxx: Fix MPI failure AEN (8200) handling")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Current code uses wrong mailbox option to extract bbc from firmware. This
field is nested inside of PLOGI payload. Extract bbc from PLOGI template
payload.
Link: https://lore.kernel.org/r/20200929102152.32278-3-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since the version string has been modified, sscanf() returns 4 instead of
6.
Link: https://lore.kernel.org/r/20200929102152.32278-2-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>