16641 Commits

Author SHA1 Message Date
James Smart
c221768bd4 scsi: lpfc: Fix 16gb hbas failing cq create.
The lancer G5 chip family fails the CQ create with 16k page size.  The
hardware incorrectly reports it supports large page sizes when it is
actually limited to 4k pages.

A prior patch resolved this for the A0 chip revision only.  This patch
excludes all revisions of the G5 asic from using large page sizes. As
knowing the actual chip revision is unnecessary, the now unused definitions
are removed

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:33 -04:00
James Smart
7438273fa2 scsi: lpfc: Fix crash in blk_mq layer when executing modprobe -r lpfc
modprobe -r lpfc produces the following:

Call Trace:
 __blk_mq_run_hw_queue+0xa2/0xb0
 __blk_mq_delay_run_hw_queue+0x9d/0xb0
 ? blk_mq_hctx_has_pending+0x32/0x80
 blk_mq_run_hw_queue+0x50/0xd0
 blk_mq_sched_insert_request+0x110/0x1b0
 blk_execute_rq_nowait+0x76/0x180
 nvme_keep_alive_work+0x8a/0xd0 [nvme_core]
 process_one_work+0x17f/0x440
 worker_thread+0x126/0x3c0
 ? manage_workers.isra.24+0x2a0/0x2a0
 kthread+0xd1/0xe0
 ? insert_kthread_work+0x40/0x40
 ret_from_fork_nospec_begin+0x21/0x21
 ? insert_kthread_work+0x40/0x40

However, rmmod lpfc would run correctly.

When an nvme remoteport is unregistered with the host nvme transport, it
needs to set the remoteport->dev_loss_tmo value 0 to indicate an immediate
termination of device loss and prevent any further keep alives to that
rport.  The driver was never setting dev_loss_tmo causing the nvme
transport to continue to send the keep alive.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:33 -04:00
James Smart
4d5e789a2e scsi: lpfc: correct oversubscription of nvme io requests for an adapter
Under large configurations, the driver would start to log message 6065 -
NVME out of buffers (exchanges).

The driver is using the ndlp cmd_qdepth value when determining the max
outstanding ios for an adapter. This value, by default, is set to 65536,
which exceeds the maximum exchange counts supported on an adapter. The ndlp
cmd_qdepth has no relevance and outstanding io count should be capped at
the max exchange count with IO requests beyond that level getting bounced
back with an EBUSY status so that they are retried by the block layer.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:33 -04:00
James Smart
dc19e3b4a8 scsi: lpfc: Fix MDS diagnostics failure (Rx < Tx)
MDS diagnostics fail because of frame count mismatch.

Unavailability of SGL is the trigger for this issue. If ELS SGL is not
available to process MDS frame, IOCB is put in FCP txq but not attempted to
post afterwards. So, driver stops processing incoming frames as it runs out
of IOCB.  lpfc_drain_txq attempts to submit IOCBS that are queued in ELS
txq but MDS frames are posted to FCP WQ.

Attempt to submit IOCBs that are present in FCP txq when MDS loopback is
running.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiang Chen
3e1fb1b8ab scsi: hisi_sas: Mark PHY as in reset for nexus reset
When issuing a nexus reset for directly attached device, we want to ignore
the PHY down events so libsas will not deform and reform the port.

In the case that the attached SAS changes for the reset, libsas will deform
and form a port.

For scenario that the PHY does not come up after a timeout period, then
report the PHY down to libsas.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
d87e72fb4f scsi: hisi_sas: Fix return value when get_free_slot() failed
It is an step of executing task to get free slot. If the step fails, we
will cleanup LLDD resources and should return failure to upper layer or
internal caller to abort task execution of this time.

But in the current code, the caller of get_free_slot() doesn't return
failure when get_free_slot() failed. This patch is to fix it.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
31709548d2 scsi: hisi_sas: Terminate STP reject quickly for v2 hw
For v2 hw, STP link from target is rejected after host reset because of a
SoC bug. The STP reject will be terminated after we have sent IO from each
PHY of a port.

This is not an problem before, as we don't need to setup STP link from
target immediately after host reset. But now, it is.  Because we want to
send soft-reset immediately after host reset.

In order to terminate STP reject quickly, this patch send ATA reset command
through each PHY of a port. Notes: ATA reset command don't need target's
response.

Besides, we do abort dev for each device before terminating STP reject.
This is a quirk of v2 hw.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
b09fcd09e9 scsi: hisi_sas: Add v2 hw force PHY function for internal ATA command
This patch adds a force PHY function for internal ATA command for v2 hw.

Because there is an SoC bug in v2 hw, and need send an IO through each PHY
of a port to work around a bug which occurs after a controller reset.

This force PHY function will be used in the later patch.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
78bd2b4f6e scsi: hisi_sas: Include TMF elements in struct hisi_sas_slot
In future scenarios we will want to use the TMF struct for more task types
than SSP.

As such, we can add struct hisi_sas_tmf_task directly into struct
hisi_sas_slot, and this will mean we can remove the TMF parameters from the
task prep functions.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
a865ae14ff scsi: hisi_sas: Try wait commands before before controller reset
We may reset the controller in many scenarios, such as SCSI EH and HW
errors. There should be no IO which returns from target when SCSI EH is
active. But for other scenarios, there may be.  It is not necessary to make
such IOs fail.

This patch adds an function of trying to wait for any commands, or IO, to
complete before host reset. If no more CQ returned from host controller in
100ms, we assume no more IO can return, and then stop waiting. We wait 5s
at most.

The HW has a register CQE_SEND_CNT to indicate the total number of CQs that
has been reported to driver. We can use this register and it is reliable to
resd this register in such scenarios that require host reset.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
6175abdeae scsi: hisi_sas: Init disks after controller reset
After the controller is reset, it is possible that the disks attached still
have outstanding IO to complete.

Thus, when the PHYs come back up after controller reset, it is possible
that these IOs complete at some unknown point later.

We want to ensure that all IOs are complete after the controller reset so
that all associated IPTT and other resources can be recycled safely.

To achieve this, re-init the disks by TMF or softreset (in case of ATA
devices).

If the init fails - maybe because the device was removed or link has not
come up - then do not release the device resources, but rather rely on SCSI
EH to handle the timeout for these resources later on.

This patch also does some cleanup to hisi_sas_init_disk(), including
removing superfluous cases in the switch statement.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiang Chen
235bfc7ff6 scsi: hisi_sas: Create a scsi_host_template per HW module
When a SCSI host is registered, the SCSI mid-layer takes a reference to a
module in Scsi_host.hostt.module. In doing this, we are prevented from
removing the driver module for the host in dangerous scenario, like when a
disk is mounted.

Currently there is only one scsi_host_template (sht) for all HW versions,
and this is the main.c module. So this means that we can possibly remove
the HW module in this dangerous scenario, as SCSI mid-layer is only
referencing the main.c module.

To fix this, create a sht per module, referencing that same module to
create the Scsi host.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiang Chen
d5a60dfdb3 scsi: hisi_sas: Reset disks when discovered
When a disk is discovered, it may be in an error state, or there may be
residual commands remaining in the disk.

To ensure any disk is in good state after discovery, reset via TMF (for SAS
disk) or softreset (for a SATA disk).

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiaofei Tan
428f1b3424 scsi: hisi_sas: Add LED feature for v3 hw
This patch implements LED feature of directly attached disk for v3 hw.

In fact, this hw has created an SGPIO component for LED feature, and we can
control LEDs just by internal registers.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:32 -04:00
Xiang Chen
1b86518581 scsi: hisi_sas: Change common allocation mode of device id
To reduce possibility of hitting unknown SoC bugs and aid debugging and
test, change allocation mode of device id from last used device id instead
of lowest available index.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:31 -04:00
Xiang Chen
fa3be0f231 scsi: hisi_sas: change slot index allocation mode
Currently we find the lowest available empty bit in the IPTT bitmap to
allocate the IPTT for a command.

To reduce possibility of hitting unknown SoC bugs and also aid in the
debugging of those same bugs, change the allocation mode.

The next allocation method is to use the next free slot adjacent to the
most recently allocated slot, in a round-robin fashion.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:31 -04:00
John Garry
757db2dae2 scsi: hisi_sas: Introduce hisi_sas_phy_set_linkrate()
There is much common code and functionality between the HW versions to set
the PHY linkrate.

As such, this patch factors out the common code into a generic function
hisi_sas_phy_set_linkrate().

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:31 -04:00
Wei Yongjun
eb217359eb scsi: hisi_sas: fix a typo in hisi_sas_task_prep()
Fix a typo in hisi_sas_task_prep().

Fixes: 7eee4b921822 ("scsi: hisi_sas: relocate smp sg map")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 21:36:51 -04:00
Colin Ian King
9af3c47924 scsi: pm80xx: fix spelling mistake "UNSORPORTED" -> "SUPPORTED"
Trivial fix to spelling mistake in pm8001_printk message text; also I
believe NOT_UNSUPPORTED should probably be NOT_SUPPORTED. Also fix the
indent of the pm8001_printk statement.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 21:31:07 -04:00
Douglas Gilbert
e37c7d9a03 scsi: core: sanitize++ in progress
Commit 505aa4b6a883 ("scsi: sd: Defer spinning up drive while SANITIZE is
in progress") may not be sufficient, especially if the SCSI SANITIZE
command is sent via the bsg or sg pass-throughs, since they don't use the
sd driver.

Add "Sanitize in progress" plus some other recent "... in progress"
additional sense codes into the scsi mid-level so they are treated in a
similar fashion to "Format in progress".

[mkp: checkpatch]

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 21:29:46 -04:00
Bart Van Assche
c9ddf73476 scsi: scsi_transport_srp: Fix shost to rport translation
Since an SRP remote port is attached as a child to shost->shost_gendev
and as the only child, the translation from the shost pointer into an
rport pointer must happen by looking up the shost child that is an
rport. This patch fixes the following KASAN complaint:

BUG: KASAN: slab-out-of-bounds in srp_timed_out+0x57/0x110 [scsi_transport_srp]
Read of size 4 at addr ffff880035d3fcc0 by task kworker/1:0H/19

CPU: 1 PID: 19 Comm: kworker/1:0H Not tainted 4.16.0-rc3-dbg+ #1
Workqueue: kblockd blk_mq_timeout_work
Call Trace:
dump_stack+0x85/0xc7
print_address_description+0x65/0x270
kasan_report+0x231/0x350
srp_timed_out+0x57/0x110 [scsi_transport_srp]
scsi_times_out+0xc7/0x3f0 [scsi_mod]
blk_mq_terminate_expired+0xc2/0x140
bt_iter+0xbc/0xd0
blk_mq_queue_tag_busy_iter+0x1c7/0x350
blk_mq_timeout_work+0x325/0x3f0
process_one_work+0x441/0xa50
worker_thread+0x76/0x6c0
kthread+0x1b2/0x1d0
ret_from_fork+0x24/0x30

Fixes: e68ca75200fe ("scsi_transport_srp: Reduce failover time")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 21:23:38 -04:00
David S. Miller
5b79c2af66 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Lots of easy overlapping changes in the confict
resolutions here.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-26 19:46:15 -04:00
Linus Torvalds
b68ea0ee03 for-linus-20180524
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJbBramAAoJEPfTWPspceCm2gsP/R5p/Vo8Ml303cdgCOrw+Q+T
 W/Wp6hGmNUZ7HjMlfF85rF1bqpgGoj226qIibHvKi7eYwUScDNgJwcrrWnNazbGx
 Rnl/+NQ/H/38pBvvDJKjth9n0LY/O1geemhnWzYZmqmZJe3jFNTDpw5pGy4RkTpJ
 4DCuXg2n81x+kDt7Nslb0dYONkrc1yUjelHS/sJKlUoPs1MvsICGsWNS+Lw0WMIj
 Ls282QNs1Z6Y1gcM18UXsTNJSXQFTBmsbH3CIBukHckP2wjZrMEvgM+uxIORqwID
 0DJWBSVNJMxrsyZtkMMkkz2wMjPjSvx3HjD/ULpglJD6nGSwRAiQr6XIxARSEEDL
 3prQyhEeVGSioOjt5IR2Llt9QDYVhb1GJqqHYmDZux0SMCVg1Pv7mclrudpXGpzq
 v2mSJqnwLofOuTFrSh6afgxClD/yh1UNKByf4Ni78PagAgNS4SjEIz8VDG4m1iNu
 UgWMbM+vqpZhgIlSDsmnqFEVyGwCWwUOHGCeshQXIy9xaWbCtmaGQpNh4Tho273O
 wH6F/jwCnw1BcUeiwOQi+qcwEf1z6nRTGGb3hZt7u2Tj6r8HZzNHtkRaFXsIrnvA
 Vc3Y9+YzRWBVV6wzqPC5K+alvGs5ZXLj7BIxQorEkoMkNCYWiPsiBS/rtc0OYiST
 b2rC+5eYzquEZ42aqIIn
 =QBaF
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20180524' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Two fixes that should go into this release:

   - a loop writeback error clearing fix from Jeff

   - the sr sense fix from myself"

* tag 'for-linus-20180524' of git://git.kernel.dk/linux-block:
  loop: clear wb_err in bd_inode when detaching backing file
  sr: pass down correctly sized SCSI sense buffer
2018-05-24 08:53:20 -07:00
Ganesh Goudar
1d19023fa6 cxgb4: change the port capability bits definition
MDI Port Capabilities bit definitions were inconsistent with
regard to the MDI enum values. 2 bits used to define MDI in
the port capabilities are not really separable, it's a 2-bit
field with 4 different values. Change the port capability bit
definitions to be "AUTO" and "STRAIGHT" in order to get them
to line up with the enum's.

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-23 16:00:56 -04:00
Manish Rangankar
269afb3603 qedi: Add get_generic_tlv_data handler.
Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 23:29:54 -04:00
Manish Rangankar
534bbdf883 qedi: Add support for populating ethernet TLVs.
This patch adds callbacks for providing the ethernet protocol driver TLVs.

Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 23:29:54 -04:00
Chad Dupuis
8673daf4f5 qedf: Add get_generic_tlv_data handler.
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 23:29:54 -04:00
Chad Dupuis
642a0b37e6 qedf: Add support for populating ethernet TLVs.
This patch adds callbacks for providing the ethernet protocol driver TLVs.

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 23:29:54 -04:00
David S. Miller
6f6e434aa2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
S390 bpf_jit.S is removed in net-next and had changes in 'net',
since that code isn't used any more take the removal.

TLS data structures split the TX and RX components in 'net-next',
put the new struct members from the bug fix in 'net' into the RX
part.

The 'net-next' tree had some reworking of how the ERSPAN code works in
the GRE tunneling code, overlapping with a one-line headroom
calculation fix in 'net'.

Overlapping changes in __sock_map_ctx_update_elem(), keep the bits
that read the prog members via READ_ONCE() into local variables
before using them.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-21 16:01:54 -04:00
Jens Axboe
f7068114d4 sr: pass down correctly sized SCSI sense buffer
We're casting the CDROM layer request_sense to the SCSI sense
buffer, but the former is 64 bytes and the latter is 96 bytes.
As we generally allocate these on the stack, we end up blowing
up the stack.

Fix this by wrapping the scsi_execute() call with a properly
sized sense buffer, and copying back the bits for the CDROM
layer.

Cc: stable@vger.kernel.org
Reported-by: Piotr Gabriel Kosinski <pg.kosinski@gmail.com>
Reported-by: Daniel Shapira <daniel@twistlock.com>
Tested-by: Kees Cook <keescook@chromium.org>
Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-21 12:21:14 -06:00
Colin Ian King
017b3f8a10 scsi: snic: fix a couple of spelling mistakes: "COMPLETE"
Trivial fix to spelling mistakes/typos:
"SNIC_IOREQ_ABTS_COMPELTE" -> "SNIC_IOREQ_ABTS_COMPLETE"
"SNIC_IOREQ_LR_COMPELTE"   -> "SNIC_IOREQ_LR_COMPLETE"
"SNIC_IOREQ_CMD_COMPELTE"  -> "SNIC_IOREQ_CMD_COMPLETE"

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:22:49 -04:00
Christophe Jaillet
51b910c3c7 scsi: qlogicpti: Fix an error handling path in 'qpti_sbus_probe()'
The 'free_irq()' call is not at the right place in the error handling
path.  The changed order has been introduced in commit 3d4253d9afab
("[SCSI] qlogicpti: Convert to new SBUS device framework.")

Fixes: 3d4253d9afab ("[SCSI] qlogicpti: Convert to new SBUS device framework.")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:22:48 -04:00
Vijay Viswanath
10e5e37581 scsi: ufs: Add clock ungating to a separate workqueue
UFS driver can receive a request during memory reclaim by kswapd.  So
when ufs driver puts the ungate work in queue, and if there are no idle
workers, kthreadd is invoked to create a new kworker. Since kswapd task
holds a mutex which kthreadd also needs, this can cause a deadlock
situation. So ungate work must be done in a separate work queue with
WQ_MEM_RECLAIM flag enabled.  Such a workqueue will have a rescue thread
which will be called when the above deadlock condition is possible.

Signed-off-by: Vijay Viswanath <vviswana@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:22:47 -04:00
Venkat Gopalakrishnan
7f6ba4f12e scsi: ufs: make sure all interrupts are processed
As multiple requests are submitted to the ufs host controller in
parallel there could be instances where the command completion interrupt
arrives later for a request that is already processed earlier as the
corresponding doorbell was cleared when handling the previous
interrupt. Read the interrupt status in a loop after processing the
received interrupt to catch such interrupts and handle it.

Signed-off-by: Venkat Gopalakrishnan <venkatg@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:22:47 -04:00
Subhash Jadavani
69a6fff068 scsi: ufs: ufs-qcom: remove broken hci version quirk
UFSHCD_QUIRK_BROKEN_UFS_HCI_VERSION is only applicable for QCOM UFS host
controller version 2.x.y and this has been fixed from version 3.x.y
onwards, hence this change removes this quirk for version 3.x.y onwards.

[mkp: applied by hand]

Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:22:35 -04:00
Subhash Jadavani
38135535dc scsi: ufs: add reference counting for scsi block requests
Currently we call the scsi_block_requests()/scsi_unblock_requests()
whenever we want to block/unblock scsi requests but as there is no
reference counting, nesting of these calls could leave us in undesired
state sometime. Consider following call flow sequence:

1. func1() calls scsi_block_requests() but calls func2() before
   calling scsi_unblock_requests()
2. func2() calls scsi_block_requests()
3. func2() calls scsi_unblock_requests()
4. func1() calls scsi_unblock_requests()

As there is no reference counting, we will have scsi requests unblocked
after #3 instead of it to be unblocked only after #4. Though we may not
have failures seen with this, we might run into some failures in future.
Better solution would be to fix this by adding reference counting.

Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:20:48 -04:00
Subhash Jadavani
b334456ec2 scsi: ufs: ufshcd: fix possible unclocked register access
Vendor specific setup_clocks ops may depend on clocks managed by ufshcd
driver so if the vendor specific setup_clocks callback is called when
the required clocks are turned off, it results into unclocked register
access.

This change make sure that required clocks are enabled before vendor
specific setup_clocks callback is called.

Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Venkat Gopalakrishnan <venkatg@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:20:48 -04:00
Maya Erez
2e3611e954 scsi: ufs: fix exception event handling
The device can set the exception event bit in one of the response UPIU,
for example to notify the need for urgent BKOPs operation.  In such a
case, the host driver calls ufshcd_exception_event_handler to handle
this notification.  When trying to check the exception event status (for
finding the cause for the exception event), the device may be busy with
additional SCSI commands handling and may not respond within the 100ms
timeout.

To prevent that, we need to block SCSI commands during handling of
exception events and allow retransmissions of the query requests, in
case of timeout.

Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Maya Erez <merez@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:20:48 -04:00
Kees Cook
60d6d22d85 scsi: dpt_i2o: Remove VLA usage
On the quest to remove all VLAs from the kernel[1] this moves the
sg_list variable off the stack, as already done for other allocated
buffers in adpt_i2o_passthru(). Additionally consolidates the error path
for kfree().

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 12:03:51 -04:00
Bjorn Andersson
092b45583c scsi: ufs: Use freq table with devfreq
devfreq requires that the client operates on actual frequencies, not only 0
and UMAX_INT and as such UFS brok with the introduction of f1d981eaecf8 ("PM
/ devfreq: Use the available min/max frequency").

This patch registers the frequencies of the first clock with devfreq and use
these to determine if we're trying to step up or down.

Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> [for devfreq & OPP part]
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:28:18 -04:00
Bjorn Andersson
deac444f4e scsi: ufs: Extract devfreq registration
Failing to register with devfreq leaves hba->devfreq assigned, which causes
the error path to dereference the ERR_PTR(). Rather than bolting on more
conditionals, move the call of devm_devfreq_add_device() into it's own
function and only update hba->devfreq once it's successfully registered.

The subsequent patch builds upon this to make UFS actually work again, as
it's been broken since f1d981eaecf8 ("PM / devfreq: Use the available
min/max frequency")

Also switch to use DEVFREQ_GOV_SIMPLE_ONDEMAND constant.

Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:28:18 -04:00
Michael Kelley
1b25a8c4d2 scsi: storvsc: Avoid allocating memory for temp cpumasks
Current code allocates 240 Kbytes (in typical configs) for each synthetic
SCSI controller to use as temp cpumask variables.  Recode to avoid needing
the temp cpumask variables and remove the memory allocation.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:28:17 -04:00
Uma Krishnan
cd43c221bb scsi: cxlflash: Isolate external module dependencies
Depending on the underlying transport, cxlflash has a dependency on either
the CXL or OCXL drivers, which are enabled via their Kconfig option.
Instead of having a module wide dependency on these config options, it is
better to isolate the object modules that are dependent on the CXL and OCXL
drivers and adjust the module dependencies accordingly.

This commit isolates the object files that are dependent on CXL and/or
OCXL. The cxl/ocxl fops used in the core driver are tucked under an ifdef to
avoid compilation errors.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:10 -04:00
Uma Krishnan
de5d35aff5 scsi: cxlflash: Abstract hardware dependent assignments
As a staging cleanup to support transport specific builds of the cxlflash
module, relocate device dependent assignments to header files. This will
avoid littering the core driver with conditional compilation logic.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:10 -04:00
Uma Krishnan
5e12397a97 scsi: cxlflash: Add include guards to backend.h
The new header file, backend.h, that was recently added is missing the
include guards. This commit adds the guards.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:10 -04:00
Matthew R. Ochs
e63a8d886d scsi: cxlflash: Use local mutex for AFU serialization
AFUs can only process a single AFU command at a time. This is enforced with
a global mutex situated within the AFU send routine. As this mutex has a
global scope, it has the potential to unnecessarily block commands destined
for other AFUs.

Instead of using a global mutex, transition the mutex to be per-AFU. This
will allow commands to only be blocked by siblings of the same AFU.

Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:10 -04:00
Uma Krishnan
32a9ae415b scsi: cxlflash: Acquire semaphore before invoking ioctl services
When a superpipe process that makes use of virtual LUNs is terminated or
killed abruptly, there is a possibility that the cxlflash driver could hang
and deprive other operations on the adapter.

The release fop registered to be invoked on a context close, detaches every
LUN associated with the context. The underlying service to detach the LUN
assumes it has been called with the read semaphore held, and releases the
semaphore before any operation that could be time consuming.

When invoked without holding the read semaphore, an opportunity is created
for the semaphore's count to become negative when it is temporarily released
during one of these potential lengthy operations. This negative count
results in subsequent acquisition attempts taking forever, leading to the
hang.

To support the current design point of holding the semaphore on the ioctl()
paths, the release fop should acquire it before invoking any ioctl services.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:10 -04:00
Uma Krishnan
d58188c306 scsi: cxlflash: Limit the debug logs in the IO path
The kernel log can get filled with debug messages from send_cmd_ioarrin()
when dynamic debug is enabled for the cxlflash module and there is a lot of
legacy I/O traffic.

While these messages are necessary to debug issues that involve command
tracking, the abundance of data can overwrite other useful data in the
log. The best option available is to limit the messages that should serve
most of the common use cases.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:10 -04:00
Uma Krishnan
e0f76ad130 scsi: cxlflash: Yield to active send threads
The following Oops may be encountered if the device is reset, i.e. EEH
recovery, while there is heavy I/O traffic:

59:mon> t
[c000200db64bb680] c008000009264c40 cxlflash_queuecommand+0x3b8/0x500
					[cxlflash]
[c000200db64bb770] c00000000090d3b0 scsi_dispatch_cmd+0x130/0x2f0
[c000200db64bb7f0] c00000000090fdd8 scsi_request_fn+0x3c8/0x8d0
[c000200db64bb900] c00000000067f528 __blk_run_queue+0x68/0xb0
[c000200db64bb930] c00000000067ab80 __elv_add_request+0x140/0x3c0
[c000200db64bb9b0] c00000000068daac blk_execute_rq_nowait+0xec/0x1a0
[c000200db64bba00] c00000000068dbb0 blk_execute_rq+0x50/0xe0
[c000200db64bba50] c0000000006b2040 sg_io+0x1f0/0x520
[c000200db64bbaf0] c0000000006b2e94 scsi_cmd_ioctl+0x534/0x610
[c000200db64bbc20] c000000000926208 sd_ioctl+0x118/0x280
[c000200db64bbcc0] c00000000069f7ac blkdev_ioctl+0x7fc/0xe30
[c000200db64bbd20] c000000000439204 block_ioctl+0x84/0xa0
[c000200db64bbd40] c0000000003f8514 do_vfs_ioctl+0xd4/0xa00
[c000200db64bbde0] c0000000003f8f04 SyS_ioctl+0xc4/0x130
[c000200db64bbe30] c00000000000b184 system_call+0x58/0x6c

When there is no room to send the I/O request, the cached room is refreshed
by reading the memory mapped command room value from the AFU. The AFU
register mapping is refreshed during a reset, creating a race condition that
can lead to the Oops above.

During a device reset, the AFU should not be unmapped until all the active
send threads quiesce. An atomic counter, cmds_active, is currently used to
track internal AFU commands and quiesce during reset. This same counter can
also be used for the active send threads.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:09 -04:00
Xiaofei Tan
2f6bca202b scsi: hisi_sas: add check of device in hisi_sas_task_exec()
Currently we don't check that device is not gone before dereferencing
its elements in the function hisi_sas_task_exec() (specifically, the DQ
pointer).

This patch fixes this issue by filling in the DQ pointer in
hisi_sas_task_prep() after we check that the device pointer is still
safe to reference.

[mkp: typo]

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18 11:22:09 -04:00