14805 Commits

Author SHA1 Message Date
Hannes Reinecke
1bc0eb0446 scsi: sg: protect accesses to 'reserved' page array
The 'reserved' page array is used as a short-cut for mapping data,
saving us to allocate pages per request. However, the 'reserved' array
is only capable of holding one request, so this patch introduces a mutex
for protect 'sg_fd' against concurrent accesses.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-11 20:55:20 -04:00
Hannes Reinecke
136e57bf43 scsi: sg: remove 'save_scat_len'
Unused.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-11 20:55:20 -04:00
Hannes Reinecke
745dfa0d8e scsi: sg: disable SET_FORCE_LOW_DMA
The ioctl SET_FORCE_LOW_DMA has never worked since the initial git
check-in, and the respective setting is nowadays handled correctly. So
disable it entirely.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-11 20:55:20 -04:00
Guilherme G. Piccoli
911e572e98 scsi: aacraid: fix PCI error recovery path
During a PCI error recovery, if aac_check_health() is not aware that a
PCI error happened and we have an offline PCI channel, it might trigger
some errors (like NULL pointer dereference) and inhibit the error
recovery process to complete.

This patch makes the health check procedure aware of PCI channel issues,
and in case of error recovery process, the function
aac_adapter_check_health() returns -1 and let the recovery process to
complete successfully. This patch was tested on upstream kernel
v4.11-rc5 in PowerPC ppc64le architecture with adapter 9005:028d
(VID:DID) - the error recovery procedure was able to recover fine.

Fixes: 5c63f7f710bd ("aacraid: Added EEH support")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-11 20:45:59 -04:00
Colin Ian King
1fdcd2d1da scsi: qla2xxx: remove some redundant pointer assignments
There are several local or function parameter pointers that are being
assigned NULL after a kfree where and these have no effect and hence can
be removed.

Fixes various cppcheck warnings:

"Assignment of function parameter has no effect outside the
function. Did you forget dereferencing it"

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-11 20:42:43 -04:00
NeilBrown
717a94b5fc sched/core: Remove 'task' parameter and rename tsk_restore_flags() to current_restore_flags()
It is not safe for one thread to modify the ->flags
of another thread as there is no locking that can protect
the update.

So tsk_restore_flags(), which takes a task pointer and modifies
the flags, is an invitation to do the wrong thing.

All current users pass "current" as the task, so no developers have
accepted that invitation.  It would be best to ensure it remains
that way.

So rename tsk_restore_flags() to current_restore_flags() and don't
pass in a task_struct pointer.  Always operate on current->flags.

Signed-off-by: NeilBrown <neilb@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-11 09:06:32 +02:00
Linus Torvalds
78d91a75b4 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "Here's a pull request for 4.11-rc, fixing a set of issues mostly
  centered around the new scheduling framework. These have been brewing
  for a while, but split up into what we absolutely need in 4.11, and
  what we can defer until 4.12. These are well tested, on both single
  queue and multiqueue setups, and with and without shared tags. They
  fix several hangs that have happened in testing.

  This is obviously larger than I would have preferred at this point in
  time, but I don't think we can shave much off this and still get the
  desired results.

  In detail, this pull request contains:

   - a set of five fixes for NVMe, mostly from Christoph and one from
     Roland.

   - a series from Bart, fixing issues with dm-mq and SCSI shared tags
     and scheduling. Note that one of those patches commit messages may
     read like an optimization, but it is in fact an important fix for
     queue restarts in particular.

   - a series from Omar, most importantly fixing a hang with multiple
     hardware queues when we fail to get a driver tag. Another important
     fix in there is for resizing hardware queues, which nbd does when
     handling multiple sockets for one connection.

   - fixing an imbalance in putting the ctx for hctx request allocations
     from Minchan"

* 'for-linus' of git://git.kernel.dk/linux-block:
  blk-mq: Restart a single queue if tag sets are shared
  dm rq: Avoid that request processing stalls sporadically
  scsi: Avoid that SCSI queues get stuck
  blk-mq: Introduce blk_mq_delay_run_hw_queue()
  blk-mq: remap queues when adding/removing hardware queues
  blk-mq-sched: fix crash in switch error path
  blk-mq-sched: set up scheduler tags when bringing up new queues
  blk-mq-sched: refactor scheduler initialization
  blk-mq: use the right hctx when getting a driver tag fails
  nvmet: fix byte swap in nvmet_parse_io_cmd
  nvmet: fix byte swap in nvmet_execute_write_zeroes
  nvmet: add missing byte swap in nvmet_get_smart_log
  nvme: add missing byte swap in nvme_setup_discard
  nvme: Correct NVMF enum values to match NVMe-oF rev 1.0
  block: do not put mq context in blk_mq_alloc_request_hctx
2017-04-08 11:56:58 -07:00
Martin K. Petersen
bcd069bb25 scsi: sd: Remove LBPRZ dependency for discards
Separating discards and zeroout operations allows us to remove the LBPRZ
block zeroing constraints from discards and honor the device preferences
for UNMAP commands.

If supported by the device, we'll also choose UNMAP over one of the
WRITE SAME variants for discards.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-08 11:25:38 -06:00
Martin K. Petersen
e6bd931284 scsi: sd: Separate zeroout and discard command choices
Now that zeroout and discards are distinct operations we need to
separate the policy of choosing the appropriate command. Create a
zeroing_mode which can be one of:

write:			Zeroout assist not present, use regular WRITE
writesame:		Allow WRITE SAME(10/16) with a zeroed payload
writesame_16_unmap:	Allow WRITE SAME(16) with UNMAP
writesame_10_unmap:	Allow WRITE SAME(10) with UNMAP

The last two are conditional on the device being thin provisioned with
LBPRZ=1 and LBPWS=1 or LBPWS10=1 respectively.

Whether to set the UNMAP bit or not depends on the REQ_NOUNMAP flag. And
if none of the _unmap variants are supported, regular WRITE SAME will be
used if the device supports it.

The zeroout_mode is exported in sysfs and the detected mode for a given
device can be overridden using the string constants above.

With this change in place we can now issue WRITE SAME(16) with UNMAP set
for block zeroing applications that require hard guarantees and
logical_block_size granularity. And at the same time use the UNMAP
command with the device's preferred granulary and alignment for discard
operations.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-08 11:25:38 -06:00
Christoph Hellwig
48920ff2a5 block: remove the discard_zeroes_data flag
Now that we use the proper REQ_OP_WRITE_ZEROES operation everywhere we can
kill this hack.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-08 11:25:38 -06:00
Christoph Hellwig
e4b878371b sd: implement unmapping Write Zeroes
Try to use a write same with unmap bit variant if the device supports it
and the caller allows for it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-08 11:25:38 -06:00
Christoph Hellwig
02d261034f sd: implement REQ_OP_WRITE_ZEROES
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-08 11:25:38 -06:00
Christoph Hellwig
81d926e8b5 sd: split sd_setup_discard_cmnd
Split sd_setup_discard_cmnd into one function per provisioning type.  While
this creates some very slight duplication of boilerplate code it keeps the
code modular for additions of new provisioning types, and for reusing the
write same functions for the upcoming scsi implementation of the Write Zeroes
operation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-08 11:25:38 -06:00
Martin K. Petersen
7c856152cb scsi: sd: Fix capacity calculation with 32-bit sector_t
We previously made sure that the reported disk capacity was less than
0xffffffff blocks when the kernel was not compiled with large sector_t
support (CONFIG_LBDAF). However, this check assumed that the capacity
was reported in units of 512 bytes.

Add a sanity check function to ensure that we only enable disks if the
entire reported capacity can be expressed in terms of sector_t.

Cc: <stable@vger.kernel.org>
Reported-by: Steve Magnani <steve.magnani@digidescorp.com>
Cc: Bart Van Assche <Bart.VanAssche@sandisk.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-07 17:07:16 -04:00
Sawan Chandak
bf6061b17a scsi: qla2xxx: Add fix to read correct register value for ISP82xx.
Add fix to read correct register value for ISP82xx, during check for
register disconnect.ISP82xx has different base register.

Fixes: a465537ad1a4 ("qla2xxx: Disable the adapter and skip error recovery in case of register disconnect")
Signed-off-by: Sawan Chandak <sawan.chandak@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-07 17:07:15 -04:00
Chad Dupuis
8eaf7dfcfc scsi: qedf: Fix crash due to unsolicited FIP VLAN response.
We need to initialize qedf->fipvlan_compl in __qedf_probe so that if we
receive an unsolicited FIP VLAN response, the system doesn't crash due
to trying to complete an uninitialized completion.

Also add a check to see if there are any waiters on the completion so we
don't inadvertantly kick start the discovery process due to the
unsolicited frame.

Fixed the crash:

<1>BUG: unable to handle kernel NULL pointer dereference at (null)
<1>IP: [<ffffffff8105ed71>] __wake_up_common+0x31/0x90
<4>PGD 0
<4>Oops: 0000 [#1] SMP
<4>last sysfs file: /sys/devices/system/cpu/online
<4>CPU 7
<4>Modules linked in: autofs4 nfs lockd fscache auth_rpcgss nfs_acl sunrpc target_core_iblock target_core_file target_core_pscsi target_core_mod configfs bnx2fc cnic fcoe 8021q garp stp llc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 vfat fat uinput ipmi_devintf microcode power_meter acpi_ipmi ipmi_si ipmi_msghandler iTCO_wdt iTCO_vendor_support dcdbas sg joydev sb_edac edac_core lpc_ich mfd_core shpchp tg3 ptp pps_core ext4 jbd2 mbcache sr_mod cdrom sd_mod crc_t10dif qedi(U) iscsi_boot_sysfs libiscsi scsi_transport_iscsi uio qedf(U) libfcoe libfc scsi_transport_fc scsi_tgt qede(U) qed(U) ahci megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
<4>
<4>Pid: 1485, comm: qedf_11_ll2 Not tainted 2.6.32-642.el6.x86_64 #1 Dell Inc. PowerEdge R730/0599V5
<4>RIP: 0010:[<ffffffff8105ed71>]  [<ffffffff8105ed71>] __wake_up_common+0x31/0x90
<4>RSP: 0018:ffff881068a83d50  EFLAGS: 00010086
<4>RAX: ffffffffffffffe8 RBX: ffff88106bf42de0 RCX: 0000000000000000
<4>RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff88106bf42de0
<4>RBP: ffff881068a83d90 R08: 0000000000000000 R09: 00000000fffffffe
<4>R10: 0000000000000000 R11: 000000000000000b R12: 0000000000000286
<4>R13: ffff88106bf42de8 R14: 0000000000000000 R15: 0000000000000000
<4>FS:  0000000000000000(0000) GS:ffff88089c460000(0000) knlGS:0000000000000000
<4>CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
<4>CR2: 0000000000000000 CR3: 0000000001a8d000 CR4: 00000000001407e0
<4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>Process qedf_11_ll2 (pid: 1485, threadinfo ffff881068a80000, task ffff881068a70040)
<4>Stack:
<4> ffff88106ef00090 0000000300000001 ffff881068a83d90 ffff88106bf42de0
<4><d> 0000000000000286 ffff88106bf42dd8 ffff88106bf40a50 0000000000000002
<4><d> ffff881068a83dc0 ffffffff810634c7 ffff881000000003 000000000000000b
<4>Call Trace:
<4> [<ffffffff810634c7>] complete+0x47/0x60
<4> [<ffffffffa01d37e7>] qedf_fip_recv+0x1c7/0x450 [qedf]
<4> [<ffffffffa01cb3cb>] qedf_ll2_recv_thread+0x33b/0x510 [qedf]
<4> [<ffffffffa01cb090>] ? qedf_ll2_recv_thread+0x0/0x510 [qedf]
<4> [<ffffffff810a662e>] kthread+0x9e/0xc0
<4> [<ffffffff8100c28a>] child_rip+0xa/0x20
<4> [<ffffffff810a6590>] ? kthread+0x0/0xc0
<4> [<ffffffff8100c280>] ? child_rip+0x0/0x20
<4>Code: 41 56 41 55 41 54 53 48 83 ec 18 0f 1f 44 00 00 89 75 cc 89 55 c8 4c 8d 6f 08 48 8b 57 08 41 89 cf 4d 89 c6 48 8d 42 e8 49 39 d5 <48> 8b 58 18 74 3f 48 83 eb 18 eb 0a 0f 1f 00 48 89 d8 48 8d 5a
<1>RIP  [<ffffffff8105ed71>] __wake_up_common+0x31/0x90
<4> RSP <ffff881068a83d50>
<4>CR2: 0000000000000000

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-07 17:07:15 -04:00
Martin K. Petersen
a00a786251 scsi: sr: Sanity check returned mode data
Kefeng Wang discovered that old versions of the QEMU CD driver would
return mangled mode data causing us to walk off the end of the buffer in
an attempt to parse it. Sanity check the returned mode sense data.

Cc: <stable@vger.kernel.org>
Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-07 17:07:14 -04:00
Fam Zheng
6780414519 scsi: sd: Consider max_xfer_blocks if opt_xfer_blocks is unusable
If device reports a small max_xfer_blocks and a zero opt_xfer_blocks, we
end up using BLK_DEF_MAX_SECTORS, which is wrong and r/w of that size
may get error.

[mkp: tweaked to avoid setting rw_max twice and added typecast]

Cc: <stable@vger.kernel.org> # v4.4+
Fixes: ca369d51b3e ("block/sd: Fix device-imposed transfer length limits")
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-07 17:05:15 -04:00
Jens Axboe
65f619d253 Merge branch 'for-linus' into for-4.12/block
We've added a considerable amount of fixes for stalls and issues
with the blk-mq scheduling in the 4.11 series since forking
off the for-4.12/block branch. We need to do improvements on
top of that for 4.12, so pull in the previous fixes to make
our lives easier going forward.

Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-07 12:45:20 -06:00
Bart Van Assche
36e3cf2739 scsi: Avoid that SCSI queues get stuck
If a .queue_rq() function returns BLK_MQ_RQ_QUEUE_BUSY then the block
driver that implements that function is responsible for rerunning the
hardware queue once requests can be queued again successfully.

commit 52d7f1b5c2f3 ("blk-mq: Avoid that requeueing starts stopped
queues") removed the blk_mq_stop_hw_queue() call from scsi_queue_rq()
for the BLK_MQ_RQ_QUEUE_BUSY case. Hence change all calls to functions
that are intended to rerun a busy queue such that these examine all
hardware queues instead of only stopped queues.

Since no other functions than scsi_internal_device_block() and
scsi_internal_device_unblock() should ever stop or restart a SCSI
queue, change the blk_mq_delay_queue() call into a
blk_mq_delay_run_hw_queue() call.

Fixes: commit 52d7f1b5c2f3 ("blk-mq: Avoid that requeueing starts stopped queues")
Fixes: commit 7e79dadce222 ("blk-mq: stop hardware queue in blk_mq_delay_queue()")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Long Li <longli@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-07 12:27:08 -06:00
Nicholas Mc Guire
bf5ea6fba7 scsi: qla4xxx: drop redundant init_completion
The redundant init_completion() here seems to be a cut&past error as
struct scsi_qla_host only has 4 completion elements to initialize, thus
the duplicate init_completion(disable_acb_comp) is simply removed.

Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06 13:14:32 -04:00
Hannes Reinecke
a06586325f scsi: make asynchronous aborts mandatory
There hasn't been any reports for HBAs where asynchronous abort
would not work, so we should make it mandatory and remove
the fallback.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06 13:07:33 -04:00
Hannes Reinecke
2171b6d08b scsi: make scsi_eh_scmd_add() always succeed
scsi_eh_scmd_add() currently only will fail if no
error handler thread is started (which will never be the
case) or if the state machine encounters an illegal transition.

But if we're encountering an invalid state transition
chances is we cannot fixup things with the error handler.
So better add a WARN_ON for illegal host states and
make scsi_dh_scmd_add() a void function.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06 13:07:33 -04:00
Hannes Reinecke
8e8c9d01c5 scsi: make eh_eflags persistent
If a failed command is retried and fails again we need
to enter SCSI EH, otherwise we will never be able to
recover the command.
To detect this situation we must not clear scmd->eh_eflags
when EH finishes but rather make it persistent throughout
the lifetime of the command.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06 13:07:33 -04:00
Christoph Hellwig
909657615d scsi: libsas: allow async aborts
We now first try to call ->eh_abort_handler from a work queue, but libsas
was always failing that for no good reason.  Allow async aborts.

Reviewed-by: Johannes Thumshirn <jth@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06 13:07:32 -04:00
Hannes Reinecke
1bcb93047e scsi: always send command aborts
When a command has timed out we always should be sending an
abort; with the previous code a failed abort might signal
SCSI EH to start, and all other timed out commands will
never be aborted, even though they might belong to a
different ITL nexus.

Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06 13:07:32 -04:00
Hannes Reinecke
e8f8d50e07 scsi: sd: Return SUCCESS in sd_eh_action() after device offline
If sd_eh_action() decides to take the device offline there is
no point in returning FAILED, as taking the device offline
is the ultimate step in SCSI EH anyway.
So further escalation via SCSI EH is not likely to make a
difference and we can as well return SUCCESS.

Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06 13:07:32 -04:00
Hannes Reinecke
7a38dc0bfb scsi: scsi_error: count medium access timeout only once per EH run
The current medium access timeout counter will be increased for
each command, so if there are enough failed commands we'll hit
the medium access timeout for even a single device failure and
the following kernel message is displayed:

sd H:C:T:L: [sdXY] Medium access timeout failure. Offlining disk!

Fix this by making the timeout per EH run, ie the counter will
only be increased once per device and EH run.

Fixes: 18a4d0a ("[SCSI] Handle disk devices which can not process medium access commands")
Cc: Ewan Milne <emilne@redhat.com>
Cc: Lawrence Obermann <loberman@redhat.com>
Cc: Benjamin Block <bblock@linux.vnet.ibm.com>
Cc: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06 13:07:32 -04:00
Christoph Hellwig
104d9c7f94 scsi: csiostor: switch to pci_alloc_irq_vectors
And get automatic MSI-X affinity for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06 12:57:03 -04:00
Mauricio Faria de Oliveira
75106523f3 scsi: ses: don't get power status of SES device slot on probe
The commit 08024885a2a3 ("ses: Add power_status to SES device slot")
introduced the 'power_status' attribute to enclosure components and
the associated callbacks.

There are 2 callbacks available to get the power status of a device:
1) ses_get_power_status() for 'struct enclosure_component_callbacks'
2) get_component_power_status() for the sysfs device attribute
(these are available for kernel-space and user-space, respectively.)

However, despite both methods being available to get power status
on demand, that commit also introduced a call to get power status
in ses_enclosure_data_process().

This dramatically increased the total probe time for SCSI devices
on larger configurations, because ses_enclosure_data_process() is
called several times during the SCSI devices probe and loops over
the component devices (but that is another problem, another patch).

That results in a tremendous continuous hammering of SCSI Receive
Diagnostics commands to the enclosure-services device, which does
delay the total probe time for the SCSI devices __significantly__:

  Originally, ~34 minutes on a system attached to ~170 disks:

    [ 9214.490703] mpt3sas version 13.100.00.00 loaded
    ...
    [11256.580231] scsi 17:0:177:0: qdepth(16), tagged(1), simple(0),
                   ordered(0), scsi_level(6), cmd_que(1)

  With this patch, it decreased to ~2.5 minutes -- a 13.6x faster

    [ 1002.992533] mpt3sas version 13.100.00.00 loaded
    ...
    [ 1151.978831] scsi 11:0:177:0: qdepth(16), tagged(1), simple(0),
                   ordered(0), scsi_level(6), cmd_que(1)

Back to the commit discussion.. on the ses_get_power_status() call
introduced in ses_enclosure_data_process(): impact of removing it.

That may possibly be in place to initialize the power status value
on device probe.  However, those 2 functions available to retrieve
that value _do_ automatically refresh/update it.  So the potential
benefit would be a direct access of the 'power_status' field which
does not use the callbacks...

But the only reader of 'struct enclosure_component::power_status'
is the get_component_power_status() callback for sysfs attribute,
and it _does_ check for and call the .get_power_status callback,
(which indeed is defined and implemented by that commit), so the
power status value is, again, automatically updated.

So, the remaining potential for a direct/non-callback access to
the power_status attribute would be out-of-tree modules -- well,
for those, if they are for whatever reason interested in values
that are set during device probe and not up-to-date by the time
they need it.. well, that would be curious.

Well, to handle that more properly, set the initial power state
value to '-1' (i.e., uninitialized) instead of '1' (power 'on'),
and check for it in that callback which may do an direct access
to the field value _if_ a callback function is not defined.

Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Fixes: 08024885a2a3 ("ses: Add power_status to SES device slot")
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06 12:48:05 -04:00
Bart Van Assche
c02465fa13 scsi: osd_uld: Check scsi_device_get() return value
scsi_device_get() can fail. Hence check its return value.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: Boaz Harrosh <ooo@electrozaur.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06 12:43:12 -04:00
David S. Miller
6f14f443d3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Mostly simple cases of overlapping changes (adding code nearby,
a function whose name changes, for example).

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06 08:24:51 -07:00
Al Viro
bf7af0cea8 esas2r: don't open-code memdup_user()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-04-06 02:11:00 -04:00
Christoph Hellwig
64c7f1d157 block, scsi: move the retries field to struct scsi_request
Instead of bloating the generic struct request with it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-05 12:05:08 -06:00
Johannes Thumshirn
8690218a4c scsi: sas: remove sas_domain_release_transport
sas_domain_release_transport is unused since at least v3.13, remove it.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-04 20:16:38 -04:00
Milan P Gandhi
5a68a1c29f scsi: qla2xxx: Fix typo in driver
Signed-off-by: Milan P Gandhi <mgandhi@redhat.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-04 20:13:57 -04:00
Arnd Bergmann
44a5b97712 scsi: advansys: fix uninitialized data access
gcc-7.0.1 now warns about a previously unnoticed access of uninitialized
struct members:

drivers/scsi/advansys.c: In function 'AscMsgOutSDTR':
drivers/scsi/advansys.c:3860:26: error: '*((void *)&sdtr_buf+5)' may be used uninitialized in this function [-Werror=maybe-uninitialized]
         ((ushort)s_buffer[i + 1] << 8) | s_buffer[i]);
                          ^
drivers/scsi/advansys.c:3860:26: error: '*((void *)&sdtr_buf+7)' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/scsi/advansys.c:3860:26: error: '*((void *)&sdtr_buf+5)' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/scsi/advansys.c:3860:26: error: '*((void *)&sdtr_buf+7)' may be used uninitialized in this function [-Werror=maybe-uninitialized]

The code has existed in this exact form at least since v2.6.12, and the
warning seems correct. This uses named initializers to ensure we
initialize all members of the structure.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-04 19:39:39 -04:00
Al Viro
bee3f412d6 Merge branch 'parisc-4.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux into uaccess.parisc 2017-04-02 10:33:48 -04:00
Linus Torvalds
7d34ddbe47 SCSI fixes on 20170401
Thirteen small fixes: The hopefully final effort to get the lpfc nvme
 kconfig problems sorted, there's one important sg fix (user can induce
 read after end of buffer) and one minor enhancement (adding an extra
 PCI ID to qedi). The rest are a set of minor fixes, which mostly occur
 as user visible in error legs or on specific devices.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJY39E7AAoJEAVr7HOZEZN4WskQALCWMroFglXHofUbwMrKH4gv
 on+6k1k+z0iiUiidpfdPb0gL1D4fSBcDl60fQbSUFFlGiM1/MTpyv10QhqlER1al
 74ZLJ6cavw52WBLJccQBCGeYBIDdvAGPLfcoIHPfa4+6I2yXe1dsAY6T3qDUHsGW
 amEG62qjbpUkrPhyQq+ehTxU4itam2JH17eTis4xVCG0vXuvlp4igecbErzwOZu7
 zhpTvJZezsfiCXmPGyqbyRU1IRU5WglznwiZ7duNtTIFD8vQ9dugs/QH88VL31rh
 25uWiJMn9waC2o4wHuRzHb5VOFQxkhanAc0y+f3I4pxTdX4d5yN7TeNtxUxZM1z7
 CEB4QFVns8YF68WZaodCVqn06uX4REwdIs6n7KTsQT9JGQbnmEFoGLNe1/wCgdGZ
 16gH+0visFCnZQpCDbuFsUcddFglAT1EtvNLbPxKk3sKxAwqZJ+e5Lon7CX9s89f
 rPlvRb68Nw/ctxgXffM2ecRddpvHTeRgy1XBv/STMhGOzJV5k6S3nXPZfyq5kWdH
 Fv9MUu3qu2rVplPKydrOlXkz40a2cl/jS0M8UXueoJwE/JkvoiwquzThLO1BB3W/
 5Dc1NVii67qPlEJ8mAsNYiPnZww7t8IRlHlD+H7/pSo0RE2C4jhNmoZMyEjwlmex
 Fq13DkTbBhIZ0mNCwQ2J
 =umUo
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Thirteen small fixes: The hopefully final effort to get the lpfc nvme
  kconfig problems sorted, there's one important sg fix (user can induce
  read after end of buffer) and one minor enhancement (adding an extra
  PCI ID to qedi). The rest are a set of minor fixes, which mostly occur
  as user visible in error legs or on specific devices"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: remove the duplicated checking for supporting clkscaling
  scsi: lpfc: fix building without debugfs support
  scsi: lpfc: Fix PT2PT PRLI reject
  scsi: hpsa: fix volume offline state
  scsi: libsas: fix ata xfer length
  scsi: scsi_dh_alua: Warn if the first argument of alua_rtpg_queue() is NULL
  scsi: scsi_dh_alua: Ensure that alua_activate() calls the completion function
  scsi: scsi_dh_alua: Check scsi_device_get() return value
  scsi: sg: check length passed to SG_NEXT_CMD_LEN
  scsi: ufshcd-platform: remove the useless cast in ERR_PTR/IS_ERR
  scsi: qedi: Add PCI device-ID for QL41xxx adapters.
  scsi: aacraid: Fix potential null access
  scsi: qla2xxx: Fix crash in qla2xxx_eh_abort on bad ptr
2017-04-01 20:07:31 -07:00
Eric Biggers
f363b089be blk-mq: constify struct blk_mq_ops
Constify all instances of blk_mq_ops, as they are never modified.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-03-31 08:28:58 -06:00
Christoph Hellwig
831488669a scsi: be2iscsi: switch to pci_alloc_irq_vectors
And get automatic MSI-X affinity for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-30 11:37:16 -04:00
Szymon Mielczarek
dbd34a61c3 Revert "scsi: ufs: add queries retry mechanism"
This reverts commit 61e073590b82a539654626ecae91b8fab11db3f3.

The patch introduced redundant query retries as we already had such
mechanism provided with _retry functions.  Both ufshcd_read_desc and
ufshcd_read_unit_desc_param functions call ufshcd_query_descriptor_retry
wrapper.

Signed-off-by: Szymon Mielczarek <szymonx.mielczarek@intel.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-29 22:51:40 -04:00
Don Brace
5765d180fd scsi: hpsa: change driver version
Reviewed-by: Gerry Morong <gerry.morong@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-29 22:50:09 -04:00
Don Brace
7f1974a76d scsi: hpsa: update pci ids
Reviewed-by: Gerry Morong <gerry.morong@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-29 22:49:48 -04:00
Arnd Bergmann
8bb74d3661 scsi: hisi_sas: fix SATA dependency
Removing the 'select SCSI_SAS_LIBSAS' statement in Kconfig resulted in a
link failure in configurations that have hisi_sas built-in but libsas as
a loadable module:

drivers/scsi/built-in.o: In function `hisi_sas_scan_finished':
hisi_sas_main.c:(.text+0x37ce9): undefined reference to `sas_drain_work'
drivers/scsi/built-in.o: In function `hisi_sas_slave_configure':
hisi_sas_main.c:(.text+0x37d17): undefined reference to `sas_slave_configure'
hisi_sas_main.c:(.text+0x37d40): undefined reference to `sas_change_queue_depth'
drivers/scsi/built-in.o: In function `hisi_sas_remove':

All other libsas users have the 'select' statement, so we should do the
same here for consistency. For all I can tell, the patch that added the
sata softreset does not actually introduce a dependency on SCSI_SAS_ATA
but instead adds calls into libata itself, so we can express that with a
more specific dependency.

We cannot have 'select SCSI_SAS_LIBSAS; depends on SCSI_SAS_ATA' as that
would cause a dependency loop.

Fixes: 7c594f0407de ("scsi: hisi_sas: add softreset function for SATA disk")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-29 22:44:53 -04:00
Tomohiro Kusumi
d985c6eaa8 scsi: ufs: just use sizeof() for snprintf()
Not much reason to use ARRAY_SIZE() when we know it's for a C string.

Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-29 22:42:21 -04:00
Tomohiro Kusumi
0b9961be10 scsi: ufs: remove deprecated enum for hw interrupt
These flags are no longer needed after 2fbd009b in 2013.

Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-29 22:42:21 -04:00
Tomohiro Kusumi
f6b254513c scsi: ufs: add missing macros for register bits from UFSHCI spec
Add macros for register bits that can be found in JESD223C (v2.1).

Not all registers are defined in ufshci.h (i.e. some are unused whether
macros are defined or undefined), but all the bits for those registers
that are already defined should appear here.

Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-29 22:42:21 -04:00
Tomohiro Kusumi
9c490d2dd3 scsi: ufs: non functional macro fix
Not having () isn't likely to do any harm in this case, but all the
other macros below do have it. Also add "are" in a comment.

Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-29 22:42:21 -04:00
Tomohiro Kusumi
4a8eec2bac scsi: ufs: use existing macro CONTROLLER_ENABLE to test register bit
Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-29 22:42:21 -04:00