2017-11-14 20:38:05 +03:00
// SPDX-License-Identifier: GPL-2.0
2005-04-17 02:20:36 +04:00
/*
2008-06-10 20:20:58 +04:00
* zfcp device driver
2005-04-17 02:20:36 +04:00
*
2008-06-10 20:20:58 +04:00
* Module interface and handling of zfcp data structures.
2005-04-17 02:20:36 +04:00
*
scsi: zfcp: Move allocation of the shost object to after xconf- and xport-data
At the moment we allocate and register the Scsi_Host object corresponding
to a zfcp adapter (FCP device) very early in the life cycle of the adapter
- even before we fully discover and initialize the underlying
firmware/hardware. This had the advantage that we could already use the
Scsi_Host object, and fill in all its information during said discover and
initialize.
Due to commit 737eb78e82d5 ("block: Delay default elevator initialization")
(first released in v5.4), we noticed a regression that would prevent us
from using any storage volume if zfcp is configured with support for DIF or
DIX (zfcp.dif=1 || zfcp.dix=1). Doing so would result in an illegal memory
access as soon as the first request is sent with such a configuration. As
an example of a crash resulting from this:
scsi host0: scsi_eh_0: sleeping
scsi host0: zfcp
qdio: 0.0.1900 ZFCP on SC 4bd using AI:1 QEBSM:0 PRI:1 TDD:1 SIGA: W AP
scsi 0:0:0:0: scsi scan: INQUIRY pass 1 length 36
Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 0000000000000000 TEID: 0000000000000483
Fault in home space mode while using kernel ASCE.
AS:0000000035c7c007 R3:00000001effcc007 S:00000001effd1000 P:000000000000003d
Oops: 0004 ilc:3 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: ...
CPU: 1 PID: 783 Comm: kworker/u760:5 Kdump: loaded Not tainted 5.6.0-rc2-bb-next+ #1
Hardware name: ...
Workqueue: scsi_wq_0 fc_scsi_scan_rport [scsi_transport_fc]
Krnl PSW : 0704e00180000000 000003ff801fcdae (scsi_queue_rq+0x436/0x740 [scsi_mod])
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 0fffffffffffffff 0000000000000000 0000000187150120 0000000000000000
000003ff80223d20 000000000000018e 000000018adc6400 0000000187711000
000003e0062337e8 00000001ae719000 0000000187711000 0000000187150000
00000001ab808100 0000000187150120 000003ff801fcd74 000003e0062336a0
Krnl Code: 000003ff801fcd9e: e310a35c0012 lt %r1,860(%r10)
000003ff801fcda4: a7840010 brc 8,000003ff801fcdc4
#000003ff801fcda8: e310b2900004 lg %r1,656(%r11)
>000003ff801fcdae: d71710001000 xc 0(24,%r1),0(%r1)
000003ff801fcdb4: e310b2900004 lg %r1,656(%r11)
000003ff801fcdba: 41201018 la %r2,24(%r1)
000003ff801fcdbe: e32010000024 stg %r2,0(%r1)
000003ff801fcdc4: b904002b lgr %r2,%r11
Call Trace:
[<000003ff801fcdae>] scsi_queue_rq+0x436/0x740 [scsi_mod]
([<000003ff801fcd74>] scsi_queue_rq+0x3fc/0x740 [scsi_mod])
[<00000000349c9970>] blk_mq_dispatch_rq_list+0x390/0x680
[<00000000349d1596>] blk_mq_sched_dispatch_requests+0x196/0x1a8
[<00000000349c7a04>] __blk_mq_run_hw_queue+0x144/0x160
[<00000000349c7ab6>] __blk_mq_delay_run_hw_queue+0x96/0x228
[<00000000349c7d5a>] blk_mq_run_hw_queue+0xd2/0xe0
[<00000000349d194a>] blk_mq_sched_insert_request+0x192/0x1d8
[<00000000349c17b8>] blk_execute_rq_nowait+0x80/0x90
[<00000000349c1856>] blk_execute_rq+0x6e/0xb0
[<000003ff801f8ac2>] __scsi_execute+0xe2/0x1f0 [scsi_mod]
[<000003ff801fef98>] scsi_probe_and_add_lun+0x358/0x840 [scsi_mod]
[<000003ff8020001c>] __scsi_scan_target+0xc4/0x228 [scsi_mod]
[<000003ff80200254>] scsi_scan_target+0xd4/0x100 [scsi_mod]
[<000003ff802d8b96>] fc_scsi_scan_rport+0x96/0xc0 [scsi_transport_fc]
[<0000000034245ce8>] process_one_work+0x458/0x7d0
[<00000000342462a2>] worker_thread+0x242/0x448
[<0000000034250994>] kthread+0x15c/0x170
[<0000000034e1979c>] ret_from_fork+0x30/0x38
INFO: lockdep is turned off.
Last Breaking-Event-Address:
[<000003ff801fbc36>] scsi_add_cmd_to_list+0x9e/0xa8 [scsi_mod]
Kernel panic - not syncing: Fatal exception: panic_on_oops
While this issue is exposed by the commit named above, this is only by
accident. The real issue has existed for longer - basically since it's
possible to use blk-mq via scsi-mq, and blk-mq pre-allocates all requests
for a tag-set during initialization of the same. For a given Scsi_Host
object this is done when adding the object to the midlayer
(`scsi_add_host()` and such). In `scsi_mq_setup_tags()` the midlayer
calculates how much memory is required for a single scsi_cmnd, and its
additional data, which also might include space for additional protection
data - depending on whether the Scsi_Host has any form of protection
capabilities (`scsi_host_get_prot()`).
The problem is thus: because zfcp does this step before we actually
know whether the firmware/hardware has these capabilities, we don't set any
protection capabilities in the Scsi_Host object. And so, no space is
allocated for additional protection data for requests in the Scsi_Host
tag-set.
Once we go through discover and initialize the FCP device firmware/hardware
fully (this is done via the firmware commands "Exchange Config Data" and
"Exchange Port Data") we find out whether it actually supports DIF and DIX,
and we set the corresponding capabilities in the Scsi_Host object (in
`zfcp_scsi_set_prot()`). Now the Scsi_Host potentially has protection
capabilities, but the already allocated requests in the tag-set don't have
any space allocated for that.
When we then trigger target scanning or add scsi_devices manually, the
midlayer will use requests from that tag-set, and before sending most
requests, it will also call `scsi_mq_prep_fn()`. To prepare the scsi_cmnd
this function will check again whether the used Scsi_Host has any
protection capabilities - and now it potentially has - and if so, it will
try to initialize the assumed to be preallocated structures and thus it
causes the crash, like shown above.
Before delaying the default elevator initialization with the commit named
above, we always would also allocate an elevator for any scsi_device before
ever sending any requests - in contrast to now, where we do it after
device-probing. That elevator in turn would have its own tag-set, and that
is initialized after we went through discovery and initialization of the
underlying firmware/hardware. So requests from that tag-set can be
allocated properly, and if used - unless the user changes/disables the
default elevator - this would hide the underlying issue.
To fix this for any configuration - with or without an elevator - we move
the allocation and registration of the Scsi_Host object for a given FCP
device to after the first complete discovery and initialization of the
underlying firmware/hardware. By doing that we can make all basic
properties of the Scsi_Host known to the midlayer by the time we call
`scsi_add_host()`, including whether we have any protection capabilities.
To do that we have to delay all the accesses that we would have done in the
past during discovery and initialization, and do them instead once we are
finished with it. The previous patches ramp up to this by fencing and
factoring out all these accesses, and make it possible to re-do them later
on. In addition we also make use of the diagnostic buffers we recently
added with
commit 92953c6e0aa7 ("scsi: zfcp: signal incomplete or error for sync exchange config/port data")
commit 7e418833e689 ("scsi: zfcp: diagnostics buffer caching and use for exchange port data")
commit 088210233e6f ("scsi: zfcp: add diagnostics buffer for exchange config data")
(first released in v5.5), because these already cache all the information
we need for that "re-do operation" - the cached information is always
updated during xconf or xport data, so it won't be stale.
In addition to the move and re-do, this patch also updates the
function-documentation of `zfcp_scsi_adapter_register()` and changes how it
reports if a Scsi_Host object already exists. In that case future
recovery-operations can skip this step completely and behave much like they
would do in the past - zfcp does not release a once allocated Scsi_Host
object unless the corresponding FCP device is deconstructed completely.
Link: https://lore.kernel.org/r/030dd6da318bbb529f0b5268ec65cebcd20fc0a3.1588956679.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
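The ordering problem described in the commit message above can be sketched in plain userspace C. This is only a toy model under stated assumptions: `toy_host`, `toy_add_host()`, `toy_prep_is_safe()` and the two size constants are hypothetical names and values, not the midlayer's real structures or functions. It shows why a per-request size that is fixed at registration time goes stale when the capability is discovered afterwards.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's real structures. */
struct toy_host {
	bool prot_capable;	/* models "has protection capabilities" */
	size_t cmd_size;	/* per-request size, fixed at registration */
};

#define TOY_CMD_BASE	64	/* assumed base size of one command */
#define TOY_PROT_SIZE	24	/* assumed extra space for protection data */

/* Registration sizes every pre-allocated request once, from current state. */
static void toy_add_host(struct toy_host *h)
{
	h->cmd_size = TOY_CMD_BASE + (h->prot_capable ? TOY_PROT_SIZE : 0);
}

/* Request preparation is safe only if the space it expects was allocated. */
static bool toy_prep_is_safe(const struct toy_host *h)
{
	size_t needed = TOY_CMD_BASE + (h->prot_capable ? TOY_PROT_SIZE : 0);

	return h->cmd_size >= needed;
}

/* Old order: register first, learn the capabilities afterwards. */
static bool toy_old_order(void)
{
	struct toy_host h = { .prot_capable = false };

	toy_add_host(&h);
	h.prot_capable = true;	/* discovered too late */
	return toy_prep_is_safe(&h);
}

/* New order: finish discovery, then register. */
static bool toy_new_order(void)
{
	struct toy_host h = { .prot_capable = false };

	h.prot_capable = true;	/* discovered first */
	toy_add_host(&h);
	return toy_prep_is_safe(&h);
}
```

In this model `toy_old_order()` reports an unsafe preparation step and `toy_new_order()` a safe one, which mirrors the move of the Scsi_Host registration to after the exchange of config and port data.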
2020-05-08 20:23:35 +03:00
* Copyright IBM Corp. 2002, 2020
2005-04-17 02:20:36 +04:00
*/
2006-05-22 20:14:08 +04:00
/*
* Driver authors:
* Martin Peschke (originator of the driver)
* Raimund Schroeder
* Aron Zeh
* Wolfgang Taphorn
* Stefan Bader
* Heiko Carstens (kernel 2.6 port of the driver)
* Andreas Herrmann
* Maxim Shchetynin
* Volker Sameske
* Ralph Wuerthner
2008-06-10 20:20:58 +04:00
* Michael Loehr
* Swen Schillig
* Christof Schmitt
* Martin Petermann
* Sven Schuetz
2013-04-26 19:34:54 +04:00
* Steffen Maier
2019-10-25 19:12:44 +03:00
* Benjamin Block
2006-05-22 20:14:08 +04:00
*/
2008-12-25 15:39:53 +03:00
# define KMSG_COMPONENT "zfcp"
# define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
2008-12-25 15:38:50 +03:00
# include <linux/seq_file.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 11:04:11 +03:00
# include <linux/slab.h>
2011-07-30 11:25:15 +04:00
# include <linux/module.h>
2005-04-17 02:20:36 +04:00
# include "zfcp_ext.h"
2009-11-24 18:54:10 +03:00
# include "zfcp_fc.h"
2010-02-17 13:18:50 +03:00
# include "zfcp_reqlist.h"
2019-10-25 19:12:44 +03:00
# include "zfcp_diag.h"
2005-04-17 02:20:36 +04:00
2008-12-25 15:38:55 +03:00
# define ZFCP_BUS_ID_SIZE 20
2006-05-22 20:14:08 +04:00
MODULE_AUTHOR("IBM Deutschland Entwicklung GmbH - linux390@de.ibm.com");
2008-07-02 12:56:37 +04:00
MODULE_DESCRIPTION("FCP HBA driver");
2005-04-17 02:20:36 +04:00
MODULE_LICENSE("GPL");
2008-12-19 18:56:57 +03:00
static char *init_device;
module_param_named(device, init_device, charp, 0400);
2005-04-17 02:20:36 +04:00
MODULE_PARM_DESC(device, "specify initial device");
2010-12-02 17:16:18 +03:00
static struct kmem_cache * __init zfcp_cache_hw_align(const char *name,
						      unsigned long size)
2009-08-18 17:43:15 +04:00
{
	return kmem_cache_create(name, size, roundup_pow_of_two(size), 0, NULL);
}
2008-12-19 18:56:57 +03:00
static void __init zfcp_init_device_configure(char *busid, u64 wwpn, u64 lun)
2005-04-17 02:20:36 +04:00
{
2009-11-24 18:54:00 +03:00
	struct ccw_device *cdev;
2005-04-17 02:20:36 +04:00
	struct zfcp_adapter *adapter;
	struct zfcp_port *port;
2009-11-24 18:54:00 +03:00
	cdev = get_ccwdev_by_busid(&zfcp_ccw_driver, busid);
	if (!cdev)
2009-09-24 12:23:22 +04:00
		return;
2009-11-24 18:54:00 +03:00
	if (ccw_device_set_online(cdev))
		goto out_ccw_device;
2005-04-17 02:20:36 +04:00
2009-11-24 18:54:00 +03:00
	adapter = zfcp_ccw_adapter_by_cdev(cdev);
2008-07-02 12:56:37 +04:00
	if (!adapter)
2009-11-24 18:54:00 +03:00
		goto out_ccw_device;
2009-09-24 12:23:22 +04:00
	port = zfcp_get_port_by_wwpn(adapter, wwpn);
	if (!port)
2005-04-17 02:20:36 +04:00
		goto out_port;
2010-08-30 12:55:09 +04:00
	flush_work(&port->rport_work);
2008-10-01 14:42:25 +04:00
2010-09-08 16:39:52 +04:00
	zfcp_unit_add(port, lun);
2010-02-17 13:18:56 +03:00
	put_device(&port->dev);
2010-09-08 16:39:52 +04:00
2008-07-02 12:56:37 +04:00
out_port:
2009-11-24 18:54:00 +03:00
	zfcp_ccw_adapter_put(adapter);
out_ccw_device:
	put_device(&cdev->dev);
2005-04-17 02:20:36 +04:00
	return;
}
2008-12-19 18:56:57 +03:00
static void __init zfcp_init_device_setup(char *devstr)
{
	char *token;
2009-10-13 12:44:07 +04:00
	char *str, *str_saved;
2008-12-19 18:56:57 +03:00
	char busid[ZFCP_BUS_ID_SIZE];
	u64 wwpn, lun;
	/* duplicate devstr and keep the original for sysfs presentation */
2010-07-16 17:37:34 +04:00
	str_saved = kstrdup(devstr, GFP_KERNEL);
2009-10-13 12:44:07 +04:00
	str = str_saved;
2008-12-19 18:56:57 +03:00
	if (!str)
		return;
	token = strsep(&str, ",");
	if (!token || strlen(token) >= ZFCP_BUS_ID_SIZE)
		goto err_out;
s390/dasd,zfcp: fix gcc 8 stringop-truncation warnings
ccw "busid" should always be NUL-terminated, as evident from e.g.
get_ccwdev_by_busid doing "return (strcmp(bus_id, dev_name(dev)) == 0)".
Replace all strncpy initializing busid with strlcpy. This fixes the
following gcc 8 warnings:
drivers/s390/scsi/zfcp_aux.c:104:2: warning: 'strncpy' specified bound 20
equals destination size [-Wstringop-truncation]
strncpy(busid, token, ZFCP_BUS_ID_SIZE);
drivers/s390/block/dasd_eer.c:316:2: warning: 'strncpy' specified bound 10
equals destination size [-Wstringop-truncation]
strncpy(header.busid, dev_name(&device->cdev->dev), DASD_EER_BUSID_SIZE);
drivers/s390/block/dasd_eer.c:359:2: warning: 'strncpy' specified bound 10
equals destination size [-Wstringop-truncation]
strncpy(header.busid, dev_name(&device->cdev->dev), DASD_EER_BUSID_SIZE);
drivers/s390/block/dasd_devmap.c:429:3: warning: 'strncpy' specified bound
20 equals destination size [-Wstringop-truncation]
strncpy(new->bus_id, bus_id, DASD_BUS_ID_SIZE);
Acked-by: Stefan Haberland <sth@linux.ibm.com>
Acked-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
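The warning quoted in the commit message above is about a copy whose bound equals the destination size: if the source is that long or longer, no terminating NUL is written. A minimal userspace sketch of a copy that always terminates follows. `copy_terminated()` and `is_terminated()` are hypothetical helpers written for this illustration; they only imitate the general idea of a size-bounded, always-terminating copy and are not the kernel's implementation.

```c
#include <stddef.h>
#include <string.h>

#define BUS_ID_SIZE 20	/* same value as ZFCP_BUS_ID_SIZE in this file */

/*
 * Hypothetical helper: copy at most size - 1 bytes and always terminate.
 * Returns the length of src, so a result >= size signals truncation.
 */
static size_t copy_terminated(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size > 0) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

/* Returns 1 if buf holds a NUL within its first size bytes, else 0. */
static int is_terminated(const char *buf, size_t size)
{
	return memchr(buf, '\0', size) != NULL;
}
```

In `zfcp_init_device_setup()` the token length is already checked against `ZFCP_BUS_ID_SIZE` before the copy, so no truncation is expected there; the always-terminating copy mainly makes that guarantee visible to the compiler.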
2018-06-17 12:56:17 +03:00
	strlcpy(busid, token, ZFCP_BUS_ID_SIZE);
2008-12-19 18:56:57 +03:00
	token = strsep(&str, ",");
2013-08-22 19:49:32 +04:00
	if (!token || kstrtoull(token, 0, (unsigned long long *) &wwpn))
2008-12-19 18:56:57 +03:00
		goto err_out;
	token = strsep(&str, ",");
2013-08-22 19:49:32 +04:00
	if (!token || kstrtoull(token, 0, (unsigned long long *) &lun))
2008-12-19 18:56:57 +03:00
		goto err_out;
2009-10-13 12:44:07 +04:00
	kfree(str_saved);
2008-12-19 18:56:57 +03:00
	zfcp_init_device_configure(busid, wwpn, lun);
	return;
2009-10-13 12:44:07 +04:00
err_out:
	kfree(str_saved);
2008-12-19 18:56:57 +03:00
	pr_err("%s is not a valid SCSI device\n", devstr);
}
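The parsing done by `zfcp_init_device_setup()` above (split a copy of the parameter into bus ID, WWPN and LUN, and reject anything malformed) can be approximated in userspace. The following is a rough sketch under assumptions: `toy_dev`, `next_field()`, `parse_u64()` and `toy_parse_device()` are hypothetical names, and `strtoull()` stands in for the kernel's `kstrtoull()`, which is stricter about signs and leading whitespace.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BUS_ID_SIZE 20	/* same value as ZFCP_BUS_ID_SIZE in this file */

/* Hypothetical result type for this illustration. */
struct toy_dev {
	char busid[BUS_ID_SIZE];
	uint64_t wwpn;
	uint64_t lun;
};

/* Cut the next comma-separated field off *cursor; NULL when none is left. */
static char *next_field(char **cursor)
{
	char *field = *cursor;
	char *comma;

	if (!field)
		return NULL;
	comma = strchr(field, ',');
	if (comma) {
		*comma = '\0';
		*cursor = comma + 1;
	} else {
		*cursor = NULL;
	}
	return field;
}

/* Parse a whole field as a number; base 0 accepts a 0x prefix. */
static int parse_u64(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long v;

	if (!s || !*s)
		return -1;
	errno = 0;
	v = strtoull(s, &end, 0);
	if (errno || *end != '\0')
		return -1;
	*out = v;
	return 0;
}

/* Returns 0 on success, -1 if the string is not "busid,wwpn,lun". */
static int toy_parse_device(const char *devstr, struct toy_dev *dev)
{
	char copy[128];
	char *cursor = copy;
	char *token;
	size_t len = strlen(devstr);

	if (len >= sizeof(copy))
		return -1;
	memcpy(copy, devstr, len + 1);	/* work on a copy, keep the original */

	token = next_field(&cursor);
	if (!token || strlen(token) >= BUS_ID_SIZE)
		return -1;
	memcpy(dev->busid, token, strlen(token) + 1);	/* length checked above */

	if (parse_u64(next_field(&cursor), &dev->wwpn))
		return -1;
	if (parse_u64(next_field(&cursor), &dev->lun))
		return -1;
	return 0;
}
```

A call such as `toy_parse_device("0.0.1900,0x500507630300c562,0x4010403200000000", &dev)` would fill all three fields; the WWPN and LUN values here are made up for the example. A string with a missing or non-numeric field is rejected, matching the `err_out` path above.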
2008-07-02 12:56:37 +04:00
static int __init zfcp_module_init(void)
2005-04-17 02:20:36 +04:00
{
2006-09-19 00:28:49 +04:00
	int retval = -ENOMEM;
scsi: zfcp: make DIX experimental, disabled, and independent of DIF
Introduce separate zfcp module parameters to individually select support
for: DIF which should work (zfcp.dif, which used to be DIF+DIX, disabled)
or DIX+DIF which can cause trouble (zfcp.dix, new, disabled).
If DIX is enabled, we warn on zfcp driver initialization. As before, this
also reduces the maximum I/O request size to half, to support the worst
case of merged single sector requests with one protection data scatter
gather element per sector. This can impact the maximum throughput.
In DIF-only mode (zfcp.dif=1 zfcp.dix=0), we can use the full maximum I/O
request size as there is no protection data for zfcp.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Fedor Loshakov <loshakov@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-29 15:09:56 +03:00
	if (zfcp_experimental_dix)
		pr_warn("DIX is enabled. It is experimental and might cause problems\n");
2011-02-22 21:54:44 +03:00
	zfcp_fsf_qtcb_cache = zfcp_cache_hw_align("zfcp_fsf_qtcb",
						  sizeof(struct fsf_qtcb));
	if (!zfcp_fsf_qtcb_cache)
2009-08-18 17:43:15 +04:00
		goto out_qtcb_cache;
2011-02-22 21:54:41 +03:00
	zfcp_fc_req_cache = zfcp_cache_hw_align("zfcp_fc_req",
						sizeof(struct zfcp_fc_req));
	if (!zfcp_fc_req_cache)
		goto out_fc_cache;
2009-11-24 18:54:14 +03:00
2011-02-22 21:54:46 +03:00
	zfcp_scsi_transport_template =
2006-09-19 00:28:49 +04:00
		fc_attach_transport(&zfcp_transport_functions);
2011-02-22 21:54:46 +03:00
	if (!zfcp_scsi_transport_template)
2006-09-19 00:28:49 +04:00
		goto out_transport;
2011-02-22 21:54:46 +03:00
	scsi_transport_reserve_device(zfcp_scsi_transport_template,
2010-09-08 16:39:51 +04:00
				      sizeof(struct zfcp_scsi_dev));
2009-11-24 18:54:04 +03:00
	retval = ccw_driver_register(&zfcp_ccw_driver);
2005-04-17 02:20:36 +04:00
	if (retval) {
2008-12-25 15:39:53 +03:00
		pr_err("The zfcp device driver could not register with "
2008-10-01 14:42:15 +04:00
		       "the common I/O layer\n");
2005-04-17 02:20:36 +04:00
		goto out_ccw_register;
	}
2008-12-19 18:56:57 +03:00
	if (init_device)
		zfcp_init_device_setup(init_device);
	return 0;
2005-04-17 02:20:36 +04:00
2008-07-02 12:56:37 +04:00
out_ccw_register:
2011-02-22 21:54:46 +03:00
	fc_release_transport(zfcp_scsi_transport_template);
2008-07-02 12:56:37 +04:00
out_transport:
2011-02-22 21:54:41 +03:00
	kmem_cache_destroy(zfcp_fc_req_cache);
out_fc_cache:
2011-02-22 21:54:44 +03:00
	kmem_cache_destroy(zfcp_fsf_qtcb_cache);
2009-08-18 17:43:15 +04:00
out_qtcb_cache:
2005-04-17 02:20:36 +04:00
	return retval;
}
2008-07-02 12:56:37 +04:00
module_init(zfcp_module_init);
2005-04-17 02:20:36 +04:00
2009-11-24 18:54:04 +03:00
static void __exit zfcp_module_exit(void)
{
	ccw_driver_unregister(&zfcp_ccw_driver);
2011-02-22 21:54:46 +03:00
	fc_release_transport(zfcp_scsi_transport_template);
2011-02-22 21:54:41 +03:00
	kmem_cache_destroy(zfcp_fc_req_cache);
2011-02-22 21:54:44 +03:00
	kmem_cache_destroy(zfcp_fsf_qtcb_cache);
2009-11-24 18:54:04 +03:00
}
module_exit(zfcp_module_exit);
2005-04-17 02:20:36 +04:00
/**
 * zfcp_get_port_by_wwpn - find port in port list of adapter by wwpn
 * @adapter: pointer to adapter to search for port
 * @wwpn: wwpn to search for
2008-07-02 12:56:37 +04:00
 *
 * Returns: pointer to zfcp_port or NULL
2005-04-17 02:20:36 +04:00
 */
2008-07-02 12:56:37 +04:00
struct zfcp_port *zfcp_get_port_by_wwpn(struct zfcp_adapter *adapter,
2008-10-01 14:42:18 +04:00
					u64 wwpn)
2005-04-17 02:20:36 +04:00
{
2009-11-24 18:53:58 +03:00
	unsigned long flags;
2005-04-17 02:20:36 +04:00
	struct zfcp_port *port;
2009-11-24 18:53:58 +03:00
	read_lock_irqsave(&adapter->port_list_lock, flags);
	list_for_each_entry(port, &adapter->port_list, list)
2009-11-24 18:54:05 +03:00
		if (port->wwpn == wwpn) {
2010-02-17 13:18:56 +03:00
			if (!get_device(&port->dev))
2009-11-24 18:54:05 +03:00
				port = NULL;
2009-11-24 18:53:58 +03:00
			read_unlock_irqrestore(&adapter->port_list_lock, flags);
2008-07-02 12:56:37 +04:00
			return port;
2009-11-24 18:53:58 +03:00
		}
	read_unlock_irqrestore(&adapter->port_list_lock, flags);
2008-07-02 12:56:37 +04:00
	return NULL;
2005-04-17 02:20:36 +04:00
}
2008-07-02 12:56:37 +04:00
static int zfcp_allocate_low_mem_buffers(struct zfcp_adapter *adapter)
2005-04-17 02:20:36 +04:00
{
2009-08-18 17:43:15 +04:00
	adapter->pool.erp_req =
		mempool_create_kmalloc_pool(1, sizeof(struct zfcp_fsf_req));
	if (!adapter->pool.erp_req)
2005-04-17 02:20:36 +04:00
		return -ENOMEM;
2009-08-18 17:43:20 +04:00
	adapter->pool.gid_pn_req =
		mempool_create_kmalloc_pool(1, sizeof(struct zfcp_fsf_req));
	if (!adapter->pool.gid_pn_req)
		return -ENOMEM;
2009-08-18 17:43:15 +04:00
	adapter->pool.scsi_req =
		mempool_create_kmalloc_pool(1, sizeof(struct zfcp_fsf_req));
	if (!adapter->pool.scsi_req)
2005-04-17 02:20:36 +04:00
		return -ENOMEM;
2009-08-18 17:43:15 +04:00
	adapter->pool.scsi_abort =
		mempool_create_kmalloc_pool(1, sizeof(struct zfcp_fsf_req));
	if (!adapter->pool.scsi_abort)
2005-04-17 02:20:36 +04:00
		return -ENOMEM;
2009-08-18 17:43:15 +04:00
	adapter->pool.status_read_req =
2008-07-02 12:56:37 +04:00
		mempool_create_kmalloc_pool(FSF_STATUS_READS_RECOM,
2006-03-26 13:37:47 +04:00
					    sizeof(struct zfcp_fsf_req));
2009-08-18 17:43:15 +04:00
	if (!adapter->pool.status_read_req)
		return -ENOMEM;
	adapter->pool.qtcb_pool =
2011-02-22 21:54:44 +03:00
		mempool_create_slab_pool(4, zfcp_fsf_qtcb_cache);
2009-08-18 17:43:15 +04:00
	if (!adapter->pool.qtcb_pool)
2005-04-17 02:20:36 +04:00
		return -ENOMEM;
2011-02-22 21:54:40 +03:00
	BUILD_BUG_ON(sizeof(struct fsf_status_read_buffer) > PAGE_SIZE);
	adapter->pool.sr_data =
		mempool_create_page_pool(FSF_STATUS_READS_RECOM, 0);
	if (!adapter->pool.sr_data)
2005-04-17 02:20:36 +04:00
		return -ENOMEM;
2009-11-24 18:54:10 +03:00
	adapter->pool.gid_pn =
2011-02-22 21:54:42 +03:00
		mempool_create_slab_pool(1, zfcp_fc_req_cache);
2009-11-24 18:54:10 +03:00
	if (!adapter->pool.gid_pn)
2005-04-17 02:20:36 +04:00
		return -ENOMEM;
	return 0;
}
2008-07-02 12:56:37 +04:00
static void zfcp_free_low_mem_buffers(struct zfcp_adapter *adapter)
2005-04-17 02:20:36 +04:00
{
2018-11-08 17:44:37 +03:00
	mempool_destroy(adapter->pool.erp_req);
	mempool_destroy(adapter->pool.scsi_req);
	mempool_destroy(adapter->pool.scsi_abort);
	mempool_destroy(adapter->pool.qtcb_pool);
	mempool_destroy(adapter->pool.status_read_req);
	mempool_destroy(adapter->pool.sr_data);
	mempool_destroy(adapter->pool.gid_pn);
2005-04-17 02:20:36 +04:00
}
2008-07-02 12:56:37 +04:00
/**
 * zfcp_status_read_refill - refill the long running status_read_requests
 * @adapter: ptr to struct zfcp_adapter for which the buffers should be refilled
 *
2018-12-06 19:31:21 +03:00
 * Return:
 * * 0 on success meaning at least one status read is pending
 * * 1 if posting failed and not a single status read buffer is pending,
 *     also triggers adapter reopen recovery
2008-07-02 12:56:37 +04:00
 */
2008-05-19 14:17:37 +04:00
int zfcp_status_read_refill(struct zfcp_adapter *adapter)
{
2018-12-06 19:31:20 +03:00
	while (atomic_add_unless(&adapter->stat_miss, -1, 0))
2009-08-18 17:43:19 +04:00
		if (zfcp_fsf_status_read(adapter->qdio)) {
2018-12-06 19:31:20 +03:00
			atomic_inc(&adapter->stat_miss); /* undo add -1 */
2010-04-30 20:09:36 +04:00
			if (atomic_read(&adapter->stat_miss) >=
			    adapter->stat_read_buf_num) {
2010-12-02 17:16:16 +03:00
				zfcp_erp_adapter_reopen(adapter, 0, "axsref1");
2008-07-02 12:56:33 +04:00
				return 1;
			}
2008-05-19 14:17:37 +04:00
			break;
2018-12-06 19:31:20 +03:00
		}
2008-05-19 14:17:37 +04:00
	return 0;
}
static void _zfcp_status_read_scheduler(struct work_struct *work)
{
	zfcp_status_read_refill(container_of(work, struct zfcp_adapter,
					     stat_work));
}
2020-10-28 21:30:52 +03:00
static void zfcp_version_change_lost_work(struct work_struct *work)
{
	struct zfcp_adapter *adapter = container_of(work, struct zfcp_adapter,
						    version_change_lost_work);
	zfcp_fsf_exchange_config_data_sync(adapter->qdio, NULL);
}
2008-12-25 15:38:50 +03:00
static void zfcp_print_sl(struct seq_file *m, struct service_level *sl)
{
	struct zfcp_adapter *adapter =
		container_of(sl, struct zfcp_adapter, service_level);
	seq_printf(m, "zfcp: %s microcode level %x\n",
		   dev_name(&adapter->ccw_device->dev),
		   adapter->fsf_lic_version);
}
2009-08-18 17:43:17 +04:00
static int zfcp_setup_adapter_work_queue(struct zfcp_adapter *adapter)
{
	char name[TASK_COMM_LEN];
	snprintf(name, sizeof(name), "zfcp_q_%s",
		 dev_name(&adapter->ccw_device->dev));
2016-08-30 23:27:20 +03:00
	adapter->work_queue = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
2009-08-18 17:43:17 +04:00
	if (adapter->work_queue)
		return 0;
	return -ENOMEM;
}
static void zfcp_destroy_adapter_work_queue(struct zfcp_adapter *adapter)
{
	if (adapter->work_queue)
		destroy_workqueue(adapter->work_queue);
	adapter->work_queue = NULL;
}
2008-07-02 12:56:37 +04:00
/**
 * zfcp_adapter_enqueue - enqueue a new adapter to the list
 * @ccw_device: pointer to the struct ccw_device
 *
2009-11-24 18:54:00 +03:00
 * Returns: struct zfcp_adapter *
2005-04-17 02:20:36 +04:00
 * Enqueues an adapter at the end of the adapter list in the driver data.
 * All adapter internal structures are set up.
 * Proc-fs entries are also created.
 */
2009-11-24 18:54:00 +03:00
struct zfcp_adapter *zfcp_adapter_enqueue(struct ccw_device *ccw_device)
2005-04-17 02:20:36 +04:00
{
	struct zfcp_adapter *adapter;
2009-11-24 18:53:59 +03:00
	if (!get_device(&ccw_device->dev))
2009-11-24 18:54:00 +03:00
		return ERR_PTR(-ENODEV);
2005-04-17 02:20:36 +04:00
2008-07-02 12:56:37 +04:00
	adapter = kzalloc(sizeof(struct zfcp_adapter), GFP_KERNEL);
2009-11-24 18:53:59 +03:00
	if (!adapter) {
		put_device(&ccw_device->dev);
2009-11-24 18:54:00 +03:00
		return ERR_PTR(-ENOMEM);
2009-11-24 18:53:59 +03:00
	}
	kref_init(&adapter->ref);
2005-04-17 02:20:36 +04:00
	ccw_device->handler = NULL;
	adapter->ccw_device = ccw_device;
2009-11-24 18:53:59 +03:00
	INIT_WORK(&adapter->stat_work, _zfcp_status_read_scheduler);
zfcp: auto port scan resiliency
This patch improves the Fibre Channel port scan behaviour of the zfcp lldd.
Without it the zfcp device driver may churn up the storage area network by
excessive scanning and scan bursts, particularly in big virtual server
environments, potentially resulting in interference of virtual servers and
reduced availability of storage connectivity.
The two main issues as to the zfcp device driver's automatic port scan in
virtual server environments are frequency and simultaneity.
On the one hand, there is no point in allowing lots of port scans
in a row. It makes sense, though, to make sure that a scan is conducted
eventually if there has been any indication for potential SAN changes.
On the other hand, lots of virtual servers receiving the same indication
for a SAN change had better not attempt to conduct a scan instantly,
that is, at the same time.
Hence this patch has a two-fold approach for better port scanning:
the introduction of a rate limit to amend frequency issues, and the
introduction of a short random backoff to amend simultaneity issues.
Both approaches boil down to deferred port scans, with delays
comprising parts for both approaches.
The new port scan behaviour is summarised best by:
                                               NEW:     NEW:
                          no_auto_port_rescan  random   rate   flush
                                               backoff  limit  =wait
adapter resume/thaw       yes                  yes      no     yes*
adapter online (user)     no                   yes      no     yes*
port rescan (user)        no                   no       no     yes
adapter recovery (user)   yes                  yes      yes    no
adapter recovery (other)  yes                  yes      yes    no
incoming ELS              yes                  yes      yes    no
incoming ELS lost         yes                  yes      yes    no
Implementation is straight-forward by converting an existing worker to
a delayed worker. But care is needed whenever that worker is going to be
flushed (in order to make sure work has been completed), since a flush
operation cancels the timer set up for deferred execution (see * above).
There is a small race window whenever a port scan work starts
running up to the point in time of storing the time stamp for that port
scan. The impact is negligible. Closing that gap isn't trivial, though, and
would destroy the beauty of a simple work-to-delayed-work conversion.
Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
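The two-fold approach in the commit message above (a rate limit plus a random backoff, both realised as a deferred scan) can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: `toy_scanner`, `toy_ratelimit_delay()`, `toy_scan_delay()` and `toy_scan_started()` are hypothetical, time is a plain counter without wraparound handling, and simply adding the two parts is not claimed to be the driver's actual arithmetic.

```c
#include <stdint.h>

/* Hypothetical state; the driver keeps a comparable "next scan" time stamp. */
struct toy_scanner {
	uint64_t next_scan;	/* earliest time the next scan may start */
	uint64_t interval;	/* assumed minimum spacing between scans */
};

/* Rate-limit part: how long until a scan is allowed again. */
static uint64_t toy_ratelimit_delay(const struct toy_scanner *s, uint64_t now)
{
	return now < s->next_scan ? s->next_scan - now : 0;
}

/*
 * Total delay for a deferred scan. The caller supplies the random backoff
 * (for example from a PRNG) so that this function stays deterministic.
 */
static uint64_t toy_scan_delay(const struct toy_scanner *s, uint64_t now,
			       uint64_t backoff)
{
	return toy_ratelimit_delay(s, now) + backoff;
}

/* Record the time stamp when a scan starts. */
static void toy_scan_started(struct toy_scanner *s, uint64_t now)
{
	s->next_scan = now + s->interval;
}
```

With an interval of 100 and a scan started at time 50, a trigger at time 60 with a backoff of 7 would be deferred by 97 time units in this model, while a trigger at time 200 with no backoff would run immediately.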
2014-11-13 16:59:48 +03:00
	INIT_DELAYED_WORK(&adapter->scan_work, zfcp_fc_scan_ports);
2011-02-22 21:54:48 +03:00
	INIT_WORK(&adapter->ns_up_work, zfcp_fc_sym_name_update);
2020-10-28 21:30:52 +03:00
	INIT_WORK(&adapter->version_change_lost_work,
		  zfcp_version_change_lost_work);
2005-04-17 02:20:36 +04:00
2014-11-13 16:59:48 +03:00
	adapter->next_port_scan = jiffies;
scsi: zfcp: fix erp_action use-before-initialize in REC action trace
v4.10 commit 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN
recovery") extended accessing parent pointer fields of struct
zfcp_erp_action for tracing. If an erp_action has never been enqueued
before, these parent pointer fields are uninitialized and NULL. Examples
are zfcp objects freshly added to the parent object's children list,
before enqueueing their first recovery subsequently. In
zfcp_erp_try_rport_unblock(), we iterate such list. Accessing erp_action
fields can cause a NULL pointer dereference. Since the kernel can read
from lowcore on s390, it does not immediately cause a kernel page
fault. Instead it can cause hangs on trying to acquire the wrong
erp_action->adapter->dbf->rec_lock in zfcp_dbf_rec_action_lvl()
                     ^bogus^
while already holding other locks with IRQs disabled.
Real life example from attaching lots of LUNs in parallel on many CPUs:
crash> bt 17723
PID: 17723 TASK: ... CPU: 25 COMMAND: "zfcperp0.0.1800"
LOWCORE INFO:
-psw : 0x0404300180000000 0x000000000038e424
-function : _raw_spin_lock_wait_flags at 38e424
...
#0 [fdde8fc90] zfcp_dbf_rec_action_lvl at 3e0004e9862 [zfcp]
#1 [fdde8fce8] zfcp_erp_try_rport_unblock at 3e0004dfddc [zfcp]
#2 [fdde8fd38] zfcp_erp_strategy at 3e0004e0234 [zfcp]
#3 [fdde8fda8] zfcp_erp_thread at 3e0004e0a12 [zfcp]
#4 [fdde8fe60] kthread at 173550
#5 [fdde8feb8] kernel_thread_starter at 10add2
zfcp_adapter
zfcp_port
zfcp_unit <address>, 0x404040d600000000
scsi_device NULL, returning early!
zfcp_scsi_dev.status = 0x40000000
0x40000000 ZFCP_STATUS_COMMON_RUNNING
crash> zfcp_unit <address>
struct zfcp_unit {
  erp_action = {
    adapter = 0x0,
    port = 0x0,
    unit = 0x0,
  },
}
zfcp_erp_action is always fully embedded into its container object. Such
container object is never moved in its object tree (only add or delete).
Hence, erp_action parent pointers can never change.
To fix the issue, initialize the erp_action parent pointers before
adding the erp_action container to any list and thus before it becomes
accessible from outside of its initializing function.
In order to also close the time window between zfcp_erp_setup_act()
memsetting the entire erp_action to zero and setting the parent pointers
again, drop the memset and instead explicitly initialize individually
all erp_action fields except for parent pointers. To be extra careful
not to introduce any other unintended side effect, even keep zeroing the
erp_action fields for list and timer. Also double-check with
WARN_ON_ONCE that erp_action parent pointers never change, so we get to
know when we would deviate from previous behavior.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN recovery")
Cc: <stable@vger.kernel.org> #2.6.32+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
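
The ordering rule from this commit message (set the parent pointers first, publish the container on a list last) can be sketched in plain C. The types below are simplified, hypothetical stand-ins and not the driver's structures, and the locking that protects the real lists is left out.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's objects. */
struct adapter;
struct port;

struct erp_action {
	struct adapter *adapter;	/* parent pointers: set once, never changed */
	struct port *port;
};

struct port {
	struct port *next;		/* children list of the adapter */
	struct erp_action erp_action;	/* embedded, lives as long as the port */
};

struct adapter {
	struct port *ports;
	struct erp_action erp_action;
};

void adapter_init(struct adapter *a)
{
	a->ports = NULL;
	a->erp_action.adapter = a;
	a->erp_action.port = NULL;
}

/* Initialise the parent pointers first, publish on the list last. */
void port_add(struct adapter *a, struct port *p)
{
	p->erp_action.adapter = a;
	p->erp_action.port = p;
	p->next = a->ports;
	a->ports = p;			/* only now can list walkers see p */
}

/* A list walker, such as a tracing helper, can rely on the pointers. */
int all_parents_valid(const struct adapter *a)
{
	const struct port *p;

	for (p = a->ports; p; p = p->next)
		if (p->erp_action.adapter != a || p->erp_action.port != p)
			return 0;
	return 1;
}
```

Because the pointers are written before the object becomes reachable, a walker never observes a child whose embedded erp_action still has NULL parents.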
2017-10-13 16:40:07 +03:00
	adapter->erp_action.adapter = adapter;
2019-10-25 19:12:44 +03:00
	if (zfcp_diag_adapter_setup(adapter))
		goto failed;
2009-08-18 17:43:22 +04:00
	if (zfcp_qdio_setup(adapter))
2009-11-24 18:53:59 +03:00
		goto failed;
2005-04-17 02:20:36 +04:00
2008-06-10 20:20:57 +04:00
	if (zfcp_allocate_low_mem_buffers(adapter))
2009-11-24 18:53:59 +03:00
		goto failed;
2005-04-17 02:20:36 +04:00
2010-02-17 13:18:50 +03:00
	adapter->req_list = zfcp_reqlist_alloc();
	if (!adapter->req_list)
2009-11-24 18:53:59 +03:00
		goto failed;
2008-07-02 12:56:37 +04:00
2009-08-18 17:43:21 +04:00
	if (zfcp_dbf_adapter_register(adapter))
2009-11-24 18:53:59 +03:00
		goto failed;
2008-07-02 12:56:37 +04:00
2009-08-18 17:43:17 +04:00
	if (zfcp_setup_adapter_work_queue(adapter))
2009-11-24 18:53:59 +03:00
		goto failed;
2009-08-18 17:43:17 +04:00
2009-08-18 17:43:22 +04:00
	if (zfcp_fc_gs_setup(adapter))
2009-11-24 18:53:59 +03:00
		goto failed;
2009-08-18 17:43:22 +04:00
2009-11-24 18:53:58 +03:00
	rwlock_init(&adapter->port_list_lock);
	INIT_LIST_HEAD(&adapter->port_list);
2010-07-16 17:37:39 +04:00
	INIT_LIST_HEAD(&adapter->events.list);
	INIT_WORK(&adapter->events.work, zfcp_fc_post_event);
	spin_lock_init(&adapter->events.list_lock);
2009-08-18 17:43:25 +04:00
	init_waitqueue_head(&adapter->erp_ready_wq);
2008-07-02 12:56:37 +04:00
	init_waitqueue_head(&adapter->erp_done_wqh);
2005-04-17 02:20:36 +04:00
2008-07-02 12:56:37 +04:00
	INIT_LIST_HEAD(&adapter->erp_ready_head);
	INIT_LIST_HEAD(&adapter->erp_running_head);
2005-04-17 02:20:36 +04:00
2005-12-01 04:46:32 +03:00
	rwlock_init(&adapter->erp_lock);
2005-04-17 02:20:36 +04:00
	rwlock_init(&adapter->abort_lock);
2009-08-18 17:43:27 +04:00
	if (zfcp_erp_thread_setup(adapter))
2009-11-24 18:53:59 +03:00
		goto failed;
2005-04-17 02:20:36 +04:00
2008-12-25 15:38:50 +03:00
	adapter->service_level.seq_print = zfcp_print_sl;
2005-04-17 02:20:36 +04:00
	dev_set_drvdata(&ccw_device->dev, adapter);
2021-04-14 20:08:02 +03:00
	if (device_add_groups(&ccw_device->dev, zfcp_sysfs_adapter_attr_groups))
		goto err_sysfs;
scsi: zfcp: introduce sysfs interface for diagnostics of local SFP transceiver
This adds an interface to read the diagnostics of the local SFP transceiver
of an FCP-Channel from userspace. This comes in the form of new sysfs
entries that are attached to the CCW device representing the FCP
device. Each type of data gets its own sysfs entry; the whole collection of
entries is pooled into a new child-directory of the CCW device node:
"diagnostics".
Adds sysfs entries for:

 * sfp_invalid: boolean value evaluating to whether the following 5
                fields are invalid; {0, 1}; 1 - invalid
 * temperature: transceiver temp.; unit 1/256°C;
                range [-128°C, +128°C]
 * vcc: supply voltage; unit 100μV; range [0, 6.55V]
 * tx_bias: transmitter laser bias current; unit 2μA;
            range [0, 131mA]
 * tx_power: coupled TX output power; unit 0.1μW; range [0, 6.5mW]
 * rx_power: received optical power; unit 0.1μW; range [0, 6.5mW]
 * optical_port: boolean value evaluating to whether the FCP-Channel has
                 an optical port; {0, 1}; 1 - optical
 * fec_active: boolean value evaluating to whether 16G FEC is active;
               {0, 1}; 1 - active
 * port_tx_type: nibble describing the port type; {0, 1, 2, 3};
                 0 - unknown, 1 - short wave,
                 2 - long wave LC 1310nm, 3 - long wave LL 1550nm
 * connector_type: two bits describing the connector type; {0, 1};
                   0 - unknown, 1 - SFP+

This is only supported if the FCP-Channel in turn supports reporting the
SFP Diagnostic Data, otherwise read() on these new entries will return
EOPNOTSUPP (this affects only adapters older than FICON Express8S, on
Mainframe generations older than z14). Other possible errors for read()
include ENOLINK, ENODEV and ENOMEM.
With this patch the userspace-interface will only read data stored in
the corresponding "diagnostic buffer" (that was stored during completion
of a previous Exchange Port Data command). Implicit updating will
follow later in this series.
Link: https://lore.kernel.org/r/1f9cce7c829c881e7d71a3f10c5b57f3dd84ab32.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
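
As a worked example of the units listed in this commit message, the helpers below convert raw readings into conventional units. This is only a sketch: the function names are hypothetical, and it assumes the raw values are 16-bit integers in the stated units (signed for temperature), which is what the documented ranges suggest.

```c
#include <stdint.h>

/* Hypothetical helpers; raw values are assumed to be 16-bit quantities. */

/* temperature: signed, unit 1/256 degree Celsius */
double sfp_temperature_celsius(int16_t raw)
{
	return raw / 256.0;
}

/* vcc: unit 100 microvolts, result in volts */
double sfp_vcc_volts(uint16_t raw)
{
	return raw * 100e-6;
}

/* tx_bias: unit 2 microamperes, result in milliamperes */
double sfp_tx_bias_milliamps(uint16_t raw)
{
	return raw * 2e-3;
}

/* tx_power and rx_power: unit 0.1 microwatts, result in milliwatts */
double sfp_power_milliwatts(uint16_t raw)
{
	return raw * 1e-4;
}
```

With these scale factors the largest 16-bit raw value maps to about 6.55 V, 131 mA and 6.55 mW, which matches the upper ends of the ranges given in the list.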
2019-10-25 19:12:47 +03:00
2010-04-30 20:09:33 +04:00
	/* report size limit per scatter-gather segment */
	adapter->ccw_device->dev.dma_parms = &adapter->dma_parms;
2013-04-26 19:34:54 +04:00
	adapter->stat_read_buf_num = FSF_STATUS_READS_RECOM;
2020-05-08 20:23:35 +03:00
	return adapter;
2005-04-17 02:20:36 +04:00
2021-04-14 20:08:02 +03:00
err_sysfs:
2009-11-24 18:53:59 +03:00
failed:
2021-04-14 20:08:01 +03:00
	/* TODO: make this more fine-granular */
	cancel_delayed_work_sync(&adapter->scan_work);
	cancel_work_sync(&adapter->stat_work);
	cancel_work_sync(&adapter->ns_up_work);
	cancel_work_sync(&adapter->version_change_lost_work);
	zfcp_destroy_adapter_work_queue(adapter);
	zfcp_fc_wka_ports_force_offline(adapter->gs);
	zfcp_scsi_adapter_unregister(adapter);
	zfcp_erp_thread_kill(adapter);
	zfcp_dbf_adapter_unregister(adapter);
	zfcp_qdio_destroy(adapter->qdio);
	zfcp_ccw_adapter_put(adapter); /* final put to release */
2009-11-24 18:54:00 +03:00
	return ERR_PTR(-ENOMEM);
}
void zfcp_adapter_unregister(struct zfcp_adapter *adapter)
{
	struct ccw_device *cdev = adapter->ccw_device;
2014-11-13 16:59:48 +03:00
	cancel_delayed_work_sync(&adapter->scan_work);
2009-11-24 18:54:00 +03:00
	cancel_work_sync(&adapter->stat_work);
2011-02-22 21:54:48 +03:00
	cancel_work_sync(&adapter->ns_up_work);
2020-10-28 21:30:52 +03:00
	cancel_work_sync(&adapter->version_change_lost_work);
2009-11-24 18:54:00 +03:00
	zfcp_destroy_adapter_work_queue(adapter);
	zfcp_fc_wka_ports_force_offline(adapter->gs);
2011-02-22 21:54:46 +03:00
	zfcp_scsi_adapter_unregister(adapter);
2021-04-14 20:08:02 +03:00
	device_remove_groups(&cdev->dev, zfcp_sysfs_adapter_attr_groups);
2009-11-24 18:54:00 +03:00
	zfcp_erp_thread_kill(adapter);
2010-12-02 17:16:16 +03:00
	zfcp_dbf_adapter_unregister(adapter);
2009-11-24 18:54:00 +03:00
	zfcp_qdio_destroy(adapter->qdio);
	zfcp_ccw_adapter_put(adapter); /* final put to release */
2005-04-17 02:20:36 +04:00
}
2008-07-02 12:56:37 +04:00
/**
2009-11-24 18:53:59 +03:00
 * zfcp_adapter_release - remove the adapter from the resource list
 * @ref: pointer to struct kref
2005-04-17 02:20:36 +04:00
 * locks: adapter list write lock is assumed to be held by caller
 */
2009-11-24 18:53:59 +03:00
void zfcp_adapter_release(struct kref *ref)
2005-04-17 02:20:36 +04:00
{
2009-11-24 18:53:59 +03:00
	struct zfcp_adapter *adapter = container_of(ref, struct zfcp_adapter,
						    ref);
2009-11-24 18:54:00 +03:00
	struct ccw_device *cdev = adapter->ccw_device;
2005-04-17 02:20:36 +04:00
2009-11-24 18:53:59 +03:00
	dev_set_drvdata(&adapter->ccw_device->dev, NULL);
2009-08-18 17:43:22 +04:00
	zfcp_fc_gs_destroy(adapter);
2005-04-17 02:20:36 +04:00
	zfcp_free_low_mem_buffers(adapter);
2019-10-25 19:12:44 +03:00
	zfcp_diag_adapter_free(adapter);
2008-07-02 12:56:37 +04:00
	kfree(adapter->req_list);
2006-01-05 11:59:34 +03:00
	kfree(adapter->fc_stats);
	kfree(adapter->stats_reset_data);
2005-04-17 02:20:36 +04:00
	kfree(adapter);
2009-11-24 18:54:00 +03:00
	put_device(&cdev->dev);
2009-11-24 18:53:59 +03:00
}
static void zfcp_port_release(struct device *dev)
2008-07-02 12:56:38 +04:00
{
2010-02-17 13:18:56 +03:00
	struct zfcp_port *port = container_of(dev, struct zfcp_port, dev);
2009-11-24 18:53:59 +03:00
2009-11-24 18:54:00 +03:00
	zfcp_ccw_adapter_put(port->adapter);
2009-11-24 18:53:59 +03:00
	kfree(port);
2008-07-02 12:56:38 +04:00
}
2005-04-17 02:20:36 +04:00
/**
 * zfcp_port_enqueue - enqueue port to port list of adapter
 * @adapter: adapter where remote port is added
 * @wwpn: WWPN of the remote port to be enqueued
 * @status: initial status for the port
 * @d_id: destination id of the remote port to be enqueued
2008-07-02 12:56:37 +04:00
 * Returns: pointer to enqueued port on success, ERR_PTR on error
2005-04-17 02:20:36 +04:00
 *
 * All port internal structures are set up and the sysfs entry is generated.
 * d_id is used to enqueue ports with a well known address like the Directory
 * Service for nameserver lookup.
 */
2008-10-01 14:42:18 +04:00
struct zfcp_port *zfcp_port_enqueue(struct zfcp_adapter *adapter, u64 wwpn,
2008-07-02 12:56:37 +04:00
				     u32 status, u32 d_id)
2005-04-17 02:20:36 +04:00
{
2005-08-27 22:07:54 +04:00
	struct zfcp_port *port;
2009-11-24 18:53:59 +03:00
	int retval = -ENOMEM;
	kref_get(&adapter->ref);
2009-08-18 17:43:30 +04:00
2009-11-24 18:53:58 +03:00
	port = zfcp_get_port_by_wwpn(adapter, wwpn);
	if (port) {
2010-02-17 13:18:56 +03:00
		put_device(&port->dev);
2009-11-24 18:53:59 +03:00
		retval = -EEXIST;
		goto err_out;
2009-08-18 17:43:30 +04:00
	}
2005-04-17 02:20:36 +04:00
2008-07-02 12:56:37 +04:00
	port = kzalloc(sizeof(struct zfcp_port), GFP_KERNEL);
2005-04-17 02:20:36 +04:00
	if (!port)
2009-11-24 18:53:59 +03:00
		goto err_out;
2005-04-17 02:20:36 +04:00
2009-11-24 18:53:58 +03:00
	rwlock_init(&port->unit_list_lock);
	INIT_LIST_HEAD(&port->unit_list);
2012-09-04 17:23:34 +04:00
	atomic_set(&port->units, 0);
2009-11-24 18:53:58 +03:00
2009-08-18 17:43:20 +04:00
	INIT_WORK(&port->gid_pn_work, zfcp_fc_port_did_lookup);
2009-03-02 15:09:01 +03:00
	INIT_WORK(&port->test_link_work, zfcp_fc_link_test_work);
2009-03-02 15:09:08 +03:00
	INIT_WORK(&port->rport_work, zfcp_scsi_rport_work);
2005-04-17 02:20:36 +04:00
	port->adapter = adapter;
2008-07-02 12:56:37 +04:00
	port->d_id = d_id;
	port->wwpn = wwpn;
2009-03-02 15:09:08 +03:00
	port->rport_task = RPORT_NONE;
2010-02-17 13:18:56 +03:00
	port->dev.parent = &adapter->ccw_device->dev;
2013-04-26 18:13:48 +04:00
	port->dev.groups = zfcp_port_attr_groups;
2010-02-17 13:18:56 +03:00
	port->dev.release = zfcp_port_release;
2005-04-17 02:20:36 +04:00
2017-10-13 16:40:07 +03:00
	port->erp_action.adapter = adapter;
	port->erp_action.port = port;
2010-02-17 13:18:56 +03:00
	if (dev_set_name(&port->dev, "0x%016llx", (unsigned long long)wwpn)) {
2009-08-18 17:43:30 +04:00
		kfree(port);
2009-11-24 18:53:59 +03:00
		goto err_out;
2009-08-18 17:43:30 +04:00
	}
2009-11-24 18:53:59 +03:00
	retval = -EINVAL;
2005-04-17 02:20:36 +04:00
2010-02-17 13:18:56 +03:00
	if (device_register(&port->dev)) {
		put_device(&port->dev);
2009-11-24 18:53:59 +03:00
		goto err_out;
2009-08-18 17:43:29 +04:00
	}
2005-04-17 02:20:36 +04:00
2009-11-24 18:53:58 +03:00
	write_lock_irq(&adapter->port_list_lock);
	list_add_tail(&port->list, &adapter->port_list);
	write_unlock_irq(&adapter->port_list_lock);
2015-04-24 02:12:32 +03:00
	atomic_or(status | ZFCP_STATUS_COMMON_RUNNING, &port->status);
2008-07-02 12:56:37 +04:00
2005-04-17 02:20:36 +04:00
	return port;
2009-11-24 18:53:59 +03:00
err_out:
2009-11-24 18:54:00 +03:00
	zfcp_ccw_adapter_put(adapter);
2009-11-24 18:53:59 +03:00
	return ERR_PTR(retval);
2005-04-17 02:20:36 +04:00
}