linux/drivers/scsi
Mauricio Faria de Oliveira 785a470496 scsi: ipr: do not set DID_PASSTHROUGH on CHECK CONDITION
On a dual controller setup with multipath enabled, some MEDIUM ERRORs
caused both paths to be failed; I/O was then queued/blocked because the
'queue_if_no_path' feature is enabled by default on IPR controllers.

In the example below, 'queue_if_no_path' is disabled so that the I/O
failure is visible in the sg_dd program.  Notice that after the sg_dd
test-case, both paths are in 'failed' state, and both path/priority
groups are in 'enabled' state (not 'active'), which would block I/O
with 'queue_if_no_path'.

    # sg_dd if=/dev/dm-2 bs=4096 count=1 dio=1 verbose=4 blk_sgio=0
    <...>
    read(unix): count=4096, res=-1
    sg_dd: reading, skip=0 : Input/output error
    <...>

    # dmesg
    [...] sd 2:2:16:0: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
    [...] sd 2:2:16:0: [sds] Sense Key : Medium Error [current]
    [...] sd 2:2:16:0: [sds] Add. Sense: Unrecovered read error - recommend rewrite the data
    [...] sd 2:2:16:0: [sds] CDB: Read(10) 28 00 00 00 00 00 00 00 20 00
    [...] blk_update_request: I/O error, dev sds, sector 0
    [...] device-mapper: multipath: Failing path 65:32.
    <...>
    [...] device-mapper: multipath: Failing path 65:224.

    # multipath -l
    1IBM_IPR-0_59C2AE0000001F80 dm-2 IBM     ,IPR-0   59C2AE00
    size=5.2T features='0' hwhandler='1 alua' wp=rw
    |-+- policy='service-time 0' prio=0 status=enabled
    | `- 2:2:16:0 sds  65:32  failed undef running
    `-+- policy='service-time 0' prio=0 status=enabled
      `- 1:2:7:0  sdae 65:224 failed undef running

This is not the desired behavior. dm-multipath explicitly checks for
the MEDIUM ERROR case (and a few others) so as not to fail the path
(e.g., I/O to other sectors could still complete without problems).
See dm-mpath.c :: do_end_io_bio(): when noretry_error() matches the
error, fail_path() is not called.

The problem trace is:

1) ipr_scsi_done()  // SENSE KEY/CHECK CONDITION detected, go to..
2) ipr_erp_start()  // ipr_is_gscsi() and masked_ioasc OK, go to..
3) ipr_gen_sense()  // masked_ioasc is IPR_IOASC_MED_DO_NOT_REALLOC,
                    // so set DID_PASSTHROUGH.

4) scsi_decide_disposition()  // check for DID_PASSTHROUGH and return
                              // early on, faking a DID_OK.. *instead*
                              // of reaching scsi_check_sense().

                              // Had it reached the latter, that would
                              // set host_byte to DID_MEDIUM_ERROR.

5) scsi_finish_command()
6) scsi_io_completion()
7) __scsi_error_from_host_byte()  // DID_MEDIUM_ERROR would be converted
                                  // to -ENODATA here.
<...>
8) dm_softirq_done()
9) multipath_end_io()
10) do_end_io()
11) noretry_error()  // -ENODATA is one of the errors checked here, so
                     // fail_path() would not be called.
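
The trace above can be modeled with a small sketch. This is a toy Python model, not kernel code: the function names are hypothetical stand-ins for steps 4, 7 and 10-11. The -EIO fallback is an assumption inferred from the "I/O error" line in the first dmesg output. Modeling the patched driver as leaving the host byte at DID_OK is also an assumption. Only -ENODATA is modeled in the no-retry set; the "few others" mentioned earlier are left out.

```python
# Toy model of the completion path above; not kernel code.
# Host byte values follow the kernel's SCSI definitions; 0x13 also appears
# as "hostbyte=0x13" in the patched dmesg output.
DID_OK = 0x00
DID_PASSTHROUGH = 0x0A
DID_MEDIUM_ERROR = 0x13

SENSE_KEY_MEDIUM_ERROR = 0x03

# Linux errno values.
EIO = 5
ENODATA = 61

# Assumption: only -ENODATA is modeled; the commit message says
# noretry_error() also covers "a few others", which are omitted here.
NORETRY_ERRORS = {-ENODATA}


def decide_disposition(host_byte, sense_key):
    """Step 4: return the host byte that the later steps will see."""
    if host_byte == DID_PASSTHROUGH:
        # Early return: the sense data is never examined.
        return DID_OK
    if sense_key == SENSE_KEY_MEDIUM_ERROR:
        # Stand-in for what scsi_check_sense() would do.
        return DID_MEDIUM_ERROR
    return host_byte


def error_from_host_byte(host_byte):
    """Step 7: map the host byte of a failed command to an errno."""
    if host_byte == DID_MEDIUM_ERROR:
        return -ENODATA
    # Assumption: anything else is reported as a generic I/O error,
    # matching the "I/O error" line in the first dmesg output.
    return -EIO


def path_gets_failed(error):
    """Steps 10-11: fail_path() is skipped for no-retry errors."""
    return error not in NORETRY_ERRORS


def complete_medium_error(driver_host_byte):
    """Run a MEDIUM ERROR completion; return (errno, path failed?)."""
    host_byte = decide_disposition(driver_host_byte, SENSE_KEY_MEDIUM_ERROR)
    error = error_from_host_byte(host_byte)
    return error, path_gets_failed(error)


if __name__ == "__main__":
    print(complete_medium_error(DID_PASSTHROUGH))  # driver sets DID_PASSTHROUGH
    print(complete_medium_error(DID_OK))           # driver leaves host byte alone
```

In this model the DID_PASSTHROUGH case yields (-5, True), i.e. a generic I/O error that fails the path, while the other case yields (-61, False), i.e. -ENODATA with the path left alone.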

With this patch applied, the I/O is failed but the paths are not.  This
multipath device continues accepting more I/O requests without blocking.
(Note the different host byte/driver byte handling per SCSI layer.)

    # dmesg
    [...] sd 2:2:7:0: [sdaf] Done: SUCCESS Result: hostbyte=0x13 driverbyte=DRIVER_OK
    [...] sd 2:2:7:0: [sdaf] CDB: Read(10) 28 00 00 00 00 00 00 00 40 00
    [...] sd 2:2:7:0: [sdaf] Sense Key : Medium Error [current]
    [...] sd 2:2:7:0: [sdaf] Add. Sense: Unrecovered read error - recommend rewrite the data
    [...] blk_update_request: critical medium error, dev sdaf, sector 0
    [...] blk_update_request: critical medium error, dev dm-6, sector 0
    [...] sd 2:2:7:0: [sdaf] Done: SUCCESS Result: hostbyte=0x13 driverbyte=DRIVER_OK
    [...] sd 2:2:7:0: [sdaf] CDB: Read(10) 28 00 00 00 00 00 00 00 10 00
    [...] sd 2:2:7:0: [sdaf] Sense Key : Medium Error [current]
    [...] sd 2:2:7:0: [sdaf] Add. Sense: Unrecovered read error - recommend rewrite the data
    [...] blk_update_request: critical medium error, dev sdaf, sector 0
    [...] blk_update_request: critical medium error, dev dm-6, sector 0
    [...] Buffer I/O error on dev dm-6, logical block 0, async page read

    # multipath -l 1IBM_IPR-0_59C2AE0000001F80
    1IBM_IPR-0_59C2AE0000001F80 dm-6 IBM     ,IPR-0   59C2AE00
    size=5.2T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
    |-+- policy='service-time 0' prio=0 status=active
    | `- 2:2:7:0  sdaf 65:240 active undef running
    `-+- policy='service-time 0' prio=0 status=enabled
      `- 1:2:7:0  sdh  8:112  active undef running

Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-11 21:40:19 -04:00
aacraid scsi: aacraid: fix PCI error recovery path 2017-04-11 20:45:59 -04:00
aic7xxx treewide: Fix printk() message errors 2016-12-14 10:54:27 +01:00
aic94xx scsi: aic94xx: Add a missing call to kfree 2016-11-29 11:21:49 -05:00
arcmsr Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
arm scsi: ncr5380: Use correct types for DMA routines 2016-11-08 17:29:48 -05:00
be2iscsi scsi: remove eh_timed_out methods in the transport template 2017-02-06 19:10:03 -05:00
bfa SCSI misc on 20170220 2017-02-21 11:51:42 -08:00
bnx2fc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-02-22 10:15:09 -08:00
bnx2i scsi: remove eh_timed_out methods in the transport template 2017-02-06 19:10:03 -05:00
csiostor scsi: remove eh_timed_out methods in the transport template 2017-02-06 19:10:03 -05:00
cxgbi Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-02-22 10:15:09 -08:00
cxlflash scsi: merge __scsi_execute into scsi_execute 2017-02-23 16:57:19 -05:00
device_handler scsi: scsi_dh_alua: Warn if the first argument of alua_rtpg_queue() is NULL 2017-03-19 13:16:37 -04:00
dpt
esas2r scsi: esas2r: Fix format string type mistakes 2017-01-09 23:52:26 -05:00
fcoe Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-02-22 10:15:09 -08:00
fnic SCSI misc on 20170220 2017-02-21 11:51:42 -08:00
hisi_sas scsi: hisi_sas: decrease running_req in hisi_sas_slot_task_free() 2017-01-20 19:10:42 -05:00
ibmvscsi scsi: remove eh_timed_out methods in the transport template 2017-02-06 19:10:03 -05:00
ibmvscsi_tgt ibmvscsis: Add SGL limit 2017-02-08 10:51:24 -08:00
isci scsi: isci: switch to pci_alloc_irq_vectors 2016-12-01 08:36:17 -05:00
libfc block: split scsi_request out of struct request 2017-01-27 15:08:35 -07:00
libsas scsi: libsas: fix ata xfer length 2017-03-20 09:45:08 -04:00
lpfc scsi: lpfc: fix building without debugfs support 2017-03-23 11:28:43 -04:00
megaraid scsi: megaraid_sas: Driver version upgrade 2017-03-13 22:59:53 -04:00
mpt3sas scsi: mpt3sas: Avoid sleeping in interrupt context 2017-03-01 21:52:13 -05:00
mvsas SCSI misc on 20161213 2016-12-14 10:49:33 -08:00
osd scsi: make the sense header argument to scsi_test_unit_ready mandatory 2017-02-22 19:35:24 -05:00
pcmcia scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
pm8001 scsi: pm8001: switch to pci_irq_alloc_vectors 2017-02-06 19:12:30 -05:00
qedf scsi: qedf: Fix crash due to unsolicited FIP VLAN response. 2017-04-07 17:07:15 -04:00
qedi scsi: qedi: Add PCI device-ID for QL41xxx adapters. 2017-03-15 19:00:57 -04:00
qla2xxx scsi: qla2xxx: Add fix to read correct register value for ISP82xx. 2017-04-07 17:07:15 -04:00
qla4xxx scsi: qla4xxx: remove two unused MSI-X related #defines 2017-01-11 22:54:45 -05:00
smartpqi scsi: smartpqi: fix time handling 2017-02-22 18:41:42 -05:00
snic SCSI misc on 20170220 2017-02-21 11:51:42 -08:00
sym53c8xx_2 scsi: sym53c8xx_2: Use complete() instead complete_all() 2016-09-14 13:19:29 -04:00
ufs scsi: ufs: remove the duplicated checking for supporting clkscaling 2017-03-27 21:45:41 -04:00
.gitignore
3w-9xxx.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
3w-9xxx.h scsi: Update 3ware driver email addresses 2016-12-14 15:25:12 -05:00
3w-sas.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
3w-sas.h scsi: Update 3ware driver email addresses 2016-12-14 15:25:12 -05:00
3w-xxxx.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
3w-xxxx.h scsi: Update 3ware driver email addresses 2016-12-14 15:25:12 -05:00
53c700_d.h_shipped
53c700.c scsi: remove current_cmnd field from struct scsi_device 2016-07-13 22:33:23 -04:00
53c700.h scsi: remove current_cmnd field from struct scsi_device 2016-07-13 22:33:23 -04:00
53c700.scr
a100u2w.c scsi: a100u2w: trivial typo in printk 2015-08-07 15:03:42 +02:00
a100u2w.h
a2091.c
a2091.h
a3000.c
a3000.h
a4000t.c
advansys.c scsi: advansys: fix build warning for PCI=n 2016-11-08 17:29:58 -05:00
aha152x.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
aha152x.h
aha1542.c scsi: aha1542: avoid uninitialized variable warnings 2016-02-23 21:27:02 -05:00
aha1542.h aha1542: fix include guard and remove useless changelog 2015-04-09 18:08:31 -07:00
aha1740.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
aha1740.h scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
am53c974.c am53c974: Fix crash during modprobe 2015-04-17 10:13:56 -07:00
atari_scsi.c scsi: atari_scsi: Reset DMA during bus reset only under ST-DMA lock 2017-01-31 21:39:22 -05:00
atp870u.c atp870u: Introduce atp870_init() 2015-11-25 22:08:55 -05:00
atp870u.h atp870u: Remove scam_on from struct atp_unit 2015-11-25 22:08:52 -05:00
BusLogic.c scsi: replace seq_printf with seq_puts 2015-02-02 09:57:45 -08:00
BusLogic.h
bvme6000_scsi.c
ch.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2015-04-14 09:50:27 -07:00
constants.c scsi: fix upper bounds check of sense key in scsi_sense_key_string() 2016-08-16 00:49:32 -04:00
dc395x.c scsi: print single-character strings with seq_putc 2015-02-02 09:57:46 -08:00
dc395x.h
dmx3191d.c scsi: dmx3191d: use module_pci_driver 2016-11-16 20:43:50 -05:00
dpt_i2o.c scsi: dpt_i2o: double free if adpt_i2o_online_hba() fails 2017-01-05 00:21:12 -05:00
dpti.h
eata_generic.h
eata_pio.c eata_pio: missing break statement 2016-05-10 22:01:07 -04:00
eata_pio.h
eata.c scsi: drop reason argument from ->change_queue_depth 2014-11-24 14:45:27 +01:00
esp_scsi.c scsi: use host wide tags by default 2015-11-09 17:11:57 -08:00
esp_scsi.h esp_scsi: correctly detect am53c974 2014-11-24 16:13:16 +01:00
fdomain.c scsi: fdomain: drop fdomain_pci_tbl when built-in 2016-02-23 21:27:02 -05:00
fdomain.h
FlashPoint.c FlashPoint: fix build warning 2015-11-09 16:32:14 -08:00
g_NCR5380.c scsi: ncr5380: Reduce #include files 2017-01-31 21:38:15 -05:00
gdth_ioctl.h
gdth_proc.c gdth: replace struct timeval with ktime_get_real_seconds() 2016-02-25 21:16:49 -05:00
gdth_proc.h
gdth.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
gdth.h
gvp11.c
gvp11.h
hosts.c scsi: allocate scsi_cmnd structures as part of struct request 2017-01-27 15:08:35 -07:00
hpsa_cmd.h scsi: hpsa: update check for logical volume status 2017-03-15 13:36:22 -04:00
hpsa.c scsi: hpsa: fix volume offline state 2017-03-23 10:12:29 -04:00
hpsa.h scsi: hpsa: limit outstanding rescans 2017-03-15 13:37:10 -04:00
hptiop.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
hptiop.h hptiop: Support HighPoint RR36xx HBAs and Support SAS tape and SAS media changer 2015-08-12 13:14:57 -07:00
imm.c imm: check parport_claim 2016-02-25 21:10:53 -05:00
imm.h
initio.c SCSI: initio: remove duplicate module device table 2015-11-20 11:39:03 -05:00
initio.h
ipr.c scsi: ipr: do not set DID_PASSTHROUGH on CHECK CONDITION 2017-04-11 21:40:19 -04:00
ipr.h scsi: ipr: Use pci_irq_allocate_vectors 2016-11-08 17:29:46 -05:00
ips.c scsi: ips: don't use custom hex_asc_upper[] table 2016-11-08 17:29:57 -05:00
ips.h Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
iscsi_boot_sysfs.c ibft: Expose iBFT acpi header via sysfs 2016-05-16 11:14:29 -04:00
iscsi_tcp.c scsi: remove eh_timed_out methods in the transport template 2017-02-06 19:10:03 -05:00
iscsi_tcp.h iscsi_tcp: Use ahash 2016-01-27 20:36:10 +08:00
jazz_esp.c
Kconfig scsi: lpfc: Finalize Kconfig options for nvme 2017-03-15 13:37:18 -04:00
lasi700.c
libiscsi_tcp.c iscsi_tcp: Use ahash 2016-01-27 20:36:10 +08:00
libiscsi.c scsi: libiscsi: add lock around task lists to fix list corruption regression 2017-02-28 22:05:22 -05:00
mac53c94.c PCI: Remove includes of asm/pci-bridge.h 2016-02-05 16:29:28 -06:00
mac53c94.h
mac_esp.c
mac_scsi.c scsi: ncr5380: Resolve various static checker warnings 2017-01-31 21:38:35 -05:00
Makefile scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework. 2017-02-22 19:10:59 -05:00
megaraid.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
megaraid.h
mesh.c PCI: Remove includes of asm/pci-bridge.h 2016-02-05 16:29:28 -06:00
mesh.h
mvme16x_scsi.c
mvme147.c
mvme147.h
mvumi.c scsi: mvumi: remove fake transport template 2017-02-06 19:08:17 -05:00
mvumi.h
ncr53c8xx.c scsi: drop reason argument from ->change_queue_depth 2014-11-24 14:45:27 +01:00
ncr53c8xx.h
NCR53c406a.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
NCR5380.c scsi: ncr5380: Improve target selection robustness 2017-01-31 21:38:58 -05:00
NCR5380.h scsi: ncr5380: Clean up dead code and redundant macro usage 2017-01-31 21:37:44 -05:00
NCR_D700.c
NCR_D700.h
NCR_Q720.c
NCR_Q720.h
nsp32_debug.c
nsp32_io.h
nsp32.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
nsp32.h
osst_detect.h
osst_options.h
osst.c block: fold cmd_type into the REQ_OP_ space 2017-01-31 14:00:44 -07:00
osst.h
pmcraid.c scsi: pmcraid: switch to pci_alloc_irq_vectors 2017-01-09 23:47:00 -05:00
pmcraid.h scsi: pmcraid: switch to pci_alloc_irq_vectors 2017-01-09 23:47:00 -05:00
ppa.c scsi: ppa: use new parport device model 2016-02-23 21:27:02 -05:00
ppa.h
ps3rom.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
qla1280.c qla1280: Don't allocate 512kb of host tags 2016-04-30 09:25:26 -07:00
qla1280.h
qlogicfas408.c
qlogicfas408.h
qlogicfas.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
qlogicpti.c qlogicpti: Return correct error code 2016-03-01 20:06:49 -05:00
qlogicpti.h qlogicpti: Fix compiler warnings 2016-11-28 15:51:31 -05:00
raid_class.c
script_asm.pl
scsi_common.c scsi: always zero sshdr in scsi_normalize_sense 2017-02-22 19:33:00 -05:00
scsi_debug.c scsi: scsi_debug: Add OPTIMAL TRANSFER LENGTH GRANULARITY option. 2017-01-31 22:08:44 -05:00
scsi_devinfo.c scsi: scsi_devinfo: remove synchronous ALUA for NETAPP devices 2016-12-07 18:13:52 -05:00
scsi_dh.c scsi: use 'scsi_device_from_queue()' for scsi_dh 2017-02-22 18:41:42 -05:00
scsi_error.c SCSI misc on 20170220 2017-02-21 11:51:42 -08:00
scsi_ioctl.c scsi: make the sense header argument to scsi_test_unit_ready mandatory 2017-02-22 19:35:24 -05:00
scsi_lib_dma.c
scsi_lib.c scsi: mpt3sas: Avoid sleeping in interrupt context 2017-03-01 21:52:13 -05:00
scsi_logging.c scsi_logging: return void for dev_printk() functions 2015-02-04 08:00:24 -08:00
scsi_logging.h
scsi_module.c
scsi_netlink.c
scsi_pm.c scsi: Set request queue runtime PM status back to active on resume 2016-02-19 10:52:45 -05:00
scsi_priv.h scsi: mpt3sas: Avoid sleeping in interrupt context 2017-03-01 21:52:13 -05:00
scsi_proc.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
scsi_sas_internal.h scsi_transport_sas: add 'scsi_target_id' sysfs attribute 2016-03-14 21:05:04 -04:00
scsi_scan.c scsi: Remove one useless stack variable 2016-10-11 18:02:09 -04:00
scsi_sysctl.c
scsi_sysfs.c scsi: avoid a permanent stop of the scsi device's request queue 2016-12-14 15:51:17 -05:00
scsi_trace.c scsi-trace: define ZBC_IN and ZBC_OUT 2016-04-11 16:57:09 -04:00
scsi_transport_api.h
scsi_transport_fc.c SCSI misc on 20170220 2017-02-21 11:51:42 -08:00
scsi_transport_iscsi.c block/bsg: move queue creation into bsg_setup_queue 2017-01-27 15:08:35 -07:00
scsi_transport_sas.c block: split scsi_request out of struct request 2017-01-27 15:08:35 -07:00
scsi_transport_spi.c scsi: merge __scsi_execute into scsi_execute 2017-02-23 16:57:19 -05:00
scsi_transport_srp.c scsi: remove tsk_mgmt_response and it_nexus_response transport methods 2017-02-06 19:10:41 -05:00
scsi_typedefs.h
scsi.c block: introduce blk_rq_is_passthrough 2017-01-31 14:00:34 -07:00
scsi.h
scsicam.c
sd_dif.c scsi: sd: Move DIF protection types to t10-pi.h 2016-09-15 09:51:14 -04:00
sd_zbc.c sd_zbc: Force use of READ16/WRITE16 2016-11-14 13:16:42 -07:00
sd.c scsi: sd: Fix capacity calculation with 32-bit sector_t 2017-04-07 17:07:16 -04:00
sd.h sd: Implement support for ZBC devices 2016-10-18 19:49:11 -06:00
sense_codes.h scsi: move Additional Sense Codes to separate file 2016-04-11 16:57:09 -04:00
ses.c scsi: ses: Fix SAS device detection in enclosure 2017-01-17 13:58:57 -05:00
sg.c scsi: sg: check length passed to SG_NEXT_CMD_LEN 2017-03-16 19:46:33 -04:00
sgiwd93.c
sim710.c scsi: sim710: fix build warning 2016-02-23 21:27:02 -05:00
sni_53c710.c
sr_ioctl.c scsi: merge __scsi_execute into scsi_execute 2017-02-23 16:57:19 -05:00
sr_vendor.c
sr.c scsi: sr: Sanity check returned mode data 2017-04-07 17:07:14 -04:00
sr.h
st_options.h
st.c block: fold cmd_type into the REQ_OP_ space 2017-01-31 14:00:44 -07:00
st.h st: Remove obsolete scsi_tape.max_pfn 2015-11-18 11:59:09 -05:00
stex.c stex: Add S3/S4 support 2016-02-23 21:27:02 -05:00
storvsc_drv.c scsi: storvsc: Workaround for virtual DVD SCSI version 2017-03-07 20:20:12 -05:00
sun3_scsi_vme.c
sun3_scsi.c SCSI misc on 20170220 2017-02-21 11:51:42 -08:00
sun3x_esp.c arch, drivers: don't include <asm/io.h> directly, use <linux/io.h> instead 2015-08-10 23:07:05 -04:00
sun_esp.c
sym53c416.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
sym53c416.h
virtio_scsi.c scsi: virtio_scsi: Reject commands when virtqueue is broken 2017-01-20 19:17:18 -05:00
vmw_pvscsi.c scsi: vmw_pvscsi: handle the return value from pci_alloc_irq_vectors correctly 2017-03-06 22:27:33 -05:00
vmw_pvscsi.h scsi: vmw_pvscsi: switch to pci_alloc_irq_vectors 2017-01-11 22:31:03 -05:00
wd33c93.c scsi: print single-character strings with seq_putc 2015-02-02 09:57:46 -08:00
wd33c93.h
wd719x.c drivers/scsi/wd719x.c: remove last declaration using DEFINE_PCI_DEVICE_TABLE 2016-09-01 17:52:01 -07:00
wd719x.h scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
xen-scsifront.c xen/scsifront: don't request a slot on the ring until request is ready 2016-12-09 10:59:13 +01:00
zalon.c
zorro7xx.c