IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
change_queue_depth will adjust device queuedepth upon receiving
"SAM_STAT_TASK_SET_FULL" scsi status from the target.
Also added ql4xqfulltracking command line param to enable or disable
queuefull tracking. One can disabling queuefull tracking to ensure
user set scsi device queuedepth is not altered.
Signed-off-by: Tej Parkash <tej.parkash@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Allow ddb state to change to DDB_DS_NO_CONNECTION_ACTIVE or
DDB_DS_SESSION_FAILED before issuing clear ddb mailbox cmd,
because clear ddb mailbox cmd fails if the ddb state is not
equal to DDB_DS_NO_CONNECTION_ACTIVE or DDB_DS_SESSION_FAILED.
Signed-off-by: Manish Rangankar <manish.rangankar@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Update the session and connection parameter before sending
connection logged in event to iscsiadm because in some
scenario logout may come in just after we send the logged
in event to user, which free up session, connection and ddb,
but DPC is still updating session and connect parameter
which can lead to panic.
Signed-off-by: Manish Rangankar <manish.rangankar@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Check for Firmware Hang (AF_FW_RECOVERY) after mailbox command
has gained access to ensure that the mailbox command does not
wait un-necessarily during a firmware recovery and prevent
premature mailbox timeout which will lead to back to back reset's.
Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch has the SW FCoE driver and the bnx2fc
driver make use of the new fcoe_sysfs API added
earlier in this patch series.
After this patch a fcoe_ctlr_device is allocated with
private data in this order.
+------------------+ +------------------+
| fcoe_ctlr_device | | fcoe_ctlr_device |
+------------------+ +------------------+
| fcoe_ctlr | | fcoe_ctlr |
+------------------+ +------------------+
| fcoe_interface | | bnx2fc_interface |
+------------------+ +------------------+
libfcoe also takes part in this new model since it
discovers and manages fcoe_fcf instances. The memory
allocation is different for FCFs. I didn't want to
impact libfcoe's fcoe_fcf processing, so this patch
creates fcoe_fcf_device instances for each discovered
fcoe_fcf. The two are paired using a (void * priv)
member of the fcoe_ctlr_device. This allows libfcoe
to continue maintaining its list of fcoe_fcf instances
and simply attaches and detaches them from existing
or new fcoe_fcf_device instances.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch adds a 'fcoe bus' infrastructure to the kernel
that is driven by changes to libfcoe which allow LLDs to
present FIP (FCoE Initialization Protocol) discovered
entities and their attributes to user space via sysfs.
This patch adds the following APIs-
fcoe_ctlr_device_add
fcoe_ctlr_device_delete
fcoe_fcf_device_add
fcoe_fcf_device_delete
They allow the LLD to expose the FCoE ENode Controller
and any discovered FCFs (Fibre Channel Forwarders, e.g.
FCoE switches) to the user. Each of these new devices
has their own bus_type so that they are grouped together
for easy lookup from a user space application. Each
new class has an attribute_group to expose attributes
for any created instances. The attributes are-
fcoe_ctlr_device
* fcf_dev_loss_tmo
* lesb_link_fail
* lesb_vlink_fail
* lesb_miss_fka
* lesb_symb_err
* lesb_err_block
* lesb_fcs_error
fcoe_fcf_device
* fabric_name
* switch_name
* priority
* selected
* fc_map
* vfid
* mac
* fka_peroid
* fabric_state
* dev_loss_tmo
A device loss infrastructre similar to the FC Transport's
is also added by this patch. It is nice to have so that a
link flapping adapter doesn't continually advance the count
used to identify the discovered FCF. FCFs will exist in a
"Disconnected" state until either the timer expires or the
FCF is rediscovered and becomes "Connected."
This patch generates a few checkpatch.pl WARNINGS that
I'm not sure what to do about. They're macros modeled
around the FC Transport attribute building macros, which
have the same 'feature' where the caller can ommit a cast
in the argument list and no cast occurs in the code. I'm
not sure how to keep the code condensed while keeping the
macros. Any advice would be appreciated.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Currently the fcoe_ctlr associated with an interface is allocated
as a member of struct bnx2fc_interface. This causes problems when
when later patches attempt to use the new fcoe_sysfs APIs which
allow us to allocate the bnx2fc_interface as private data to a
fcoe_ctlr_device instance. The problem is that libfcoe wants to
be able use pointer math to find a fcoe_ctlr's fcoe_ctlr_device
as well as finding a fcoe_ctlr_device's assocated fcoe_ctlr. To
do this we need to allocate the fcoe_ctlr_device, with private
data for the LLD. The private data will contain the fcoe_ctlr
and its private data will be the bnx2fc_interface.
+-------------------+
| fcoe_ctlr_device |
+-------------------+
| fcoe_ctlr |
+-------------------+
| bnx2fc_interface |
+-------------------+
This prep work will allow us to go from a fcoe_ctlr_device
instance to its fcoe_ctlr as well as from a fcoe_ctlr to its
fcoe_ctlr_device once the fcoe_sysfs API is in use (later
patches in this series).
Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Currently the fcoe_ctlr associated with an interface is allocated
as a member of struct fcoe_interface. This causes problems when
attempting to use the new fcoe_sysfs APIs which allow us to allocate
the fcoe_interface as private data to the fcoe_ctlr_device instance.
The problem is that libfcoe wants to be able use pointer math to find a
fcoe_ctlr's fcoe_ctlr_device as well as finding a fcoe_ctlr_device's
assocated fcoe_ctlr. To do this we need to allocate the
fcoe_ctlr_device, with private data for the LLD. The private data
contains the fcoe_ctlr and its private data is the fcoe_interface.
This patch only allocates the fcoe_interface with the fcoe_ctlr, the
fcoe_ctlr_device will be added in a later patch, which will complete
the below diagram-
+------------------+
| fcoe_ctlr_device |
+------------------+
| fcoe_ctlr |
+------------------+
| fcoe_interface |
+------------------+
This prep work will allow us to go from a fcoe_ctlr_device instance
to its fcoe_ctlr as well as from a fcoe_ctlr to its fcoe_ctlr_device
once the fcoe_sysfs API is in use (later patches in this series).
Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
block congestion control doesn't have any concept of fairness across
multiple queues. This means that if SCSI reports the host as busy in
the queue congestion control it can result in an unfair starvation
situation in dm-mp if there are multiple multipath devices on the same
host. For example:
http://www.redhat.com/archives/dm-devel/2012-May/msg00123.html
The fix for this is to report only the sdev busy state (and ignore the
host busy state) in the block congestion control call back.
The host is still congested, but the SCSI subsystem will sort out the
congestion in a fair way because it knows the relation between the
queues and the host.
[jejb: fixed up trailing whitespace]
Reported-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Tested-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Avoid dereferencing a NULL pointer if scsi_host_alloc is failed.
Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
In the original code, if dma_pool_alloc() fails then we call
dma_pool_free(). It causes an error, possibly a NULL dereference.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
This patch adds support for tcm_qla2xxx fabric module for target-core
using the new qla_target.c LLD logic. This includes support for explict
NodeACLs via configfs using tcm_qla2xxx_setup_nacl_from_rport() from libfc
struct fc_host->rports, and demo-mode support for virtual LUN=0 access.
This patch also adds support for using tcm_qla2xxx_lport->lport_fcport_map
and ->lport_loopid_map of btree_head32 to track struct se_node_acl pointers
for individual 24-bit Port ID and 16-bit Loop ID values w/ qla_target_template
->find_sess_by_s_id() and ->find_sess_by_loop_id() used in a number of
locations into the primary I/O dispatch logic in qla_target.c LLD code.
The main piece for FC Nexus setup is in tcm_qla2xxx_check_initiator_node_acl(),
which calls tcm_qla2xxx_set_sess_by_[s_id,loop_id]() to setup our
lport->lport_fcport_map and lport_loopid_map pointers respectively, and
register the new nexus with TCM via __transport_register_session().
(nab: Add qla_tgt_mgmt_cmd usage with TARGET_SCF_ACK_KREF during TMRs +
change tcm_qla2xxx_nacl->nport_id to u32 (DanC))
(danc: tcm_qla2xxx: checking for NULL instead of IS_ERR())
(roland: Fix up v3.5 breakage for removal of transport_do_task_sg_chain +
Add hook so qla_target code can shutdown sessions)
(steveh: Convert FC address map from flat array to btree)
(randy: fix qla2xxx printk format warnings for size_t)
(joern: Make most of tcm_qla2xxx static + remove unnecessary
workqueue_struct prototypes + use WWN_SIZE instead of hard-coded
constants)
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Add LLD target mode for >= 24xx series HW. This code was originally based on
external qla2x00t module based on 8.02.01-k4, and has been refactored to
push the bulk of code into mainline qla2xxx.ko LLD -> qla_target.c.
The implementation uses internal workqueues for I/O context submission
into tcm_qla2xxx code, and includes the struct qla_tgt_func_tmpl API for
external interaction to allow qla2xxx LDD to function without direct
target-core dependencies:
It also enables qla_target.c usage within existing qla2xxx LLD code.
This includes:
*) Addition of target mode specific members to existing data
structures in qla_def.h and struct qla_hw_data->tgt_ops using
qla_target.h:struct qla_tgt_func_tmpl
*) Addition of struct qla_tgt_func_tmpl and direct calls into
qla_target.c logic w/ qlt_* prefixed functions.
*) Addition of qla_iocb.c:qla2x00_req_pkt() for ring processing, and
qla2x00_issue_marker() for handling request/response queue processing
for target mode operation
*) Addition of various qla_tgt_mode_enabled() logic checks in
qla24xx_nvram_config(), qla2x00_initialize_adapter(), qla2x00_rff_id(),
qla2x00_abort_isp(), qla24xx_modify_vp_config(), and
qla2x00_vp_abort_isp().
By default the new qlini_mode module parameter is setting initiator-mode
to 'enabled' in order for 'modprobe qla2xxx' to continue to function as
expected in initiator only mode. Enabling target-mode operation will
currently require a:
modprobe qla2xxx qlini_mode="disabled"
in order to explictly disabled initiator mode and allow target-mode
to be enabled via tcm_qla2xxx configfs fabric callers.
(nab: Convert to qlini_mode='enabled' by default in qla_target.c)
(joern: Remove loop_id from qla_tgt_make_local_sess() arguments +
Remove unused s_id + fix s_id endianness bug +
simplify qla_tgt_abort_work)
(gerard: fix section __exit mismatch in qla_tgt_exit)
(arun: Capture ATIO queue during firmware dump + Send SCR in target mode +
Target mode review comments)
(roland: Don't create duplicate target sessions to address tearing down
ACLs with IO in flight + Add missing call to qlt_fc_port_deleted
call during qla2x00_schedule_rport_del timeout)
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
commit 491118dff9aeb207408bd42aa4897bc2c145747f
Author: Saurav Kashyap <saurav.kashyap@qlogic.com>
Date: Tue Aug 16 11:31:50 2011 -0700
[SCSI] qla2xxx: During loopdown perform Diagnostic loopback.
The LOOP_DOWN test is not needed.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
The field vp_idx in struct fc_port is a redundant/mirror copy of
the same field in struct scsi_qla_host;
struct fc_port has a pointer vha to scsi_qla_host which allows
the original copy of vp_idx to be readily accessed.
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Both the target-id and LUN are munged in the original printk().
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Currently stats is part of ha data structure, common for physical and virtual
ports. Moved the stats to vha, each port will have its own stat.
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Add an extra layer of logging granularity for messages that are necessary in
some circumstances but may flood the kernel log buffer with too many messages
otherwise.
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
If interrupt registration failed we could crash the machine as we were trying
to deference some pointers which weren't allocated yet. Move the allocation
a little earlier and make some checks to the free resource code to make sure
that we don't try to free a resource that was never allocated.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Correct spelling "occured" to "occurred" in
drivers/scsi/qla2xxx/qla_mbx.c
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Optimized queuecommand handler's to eliminate double head-room checks.
The checks are moved inside the 1st if-loop otherwise you would end up checking twice when there is
enough head room.
Signed-off-by: Chetan Loke <loke.chetan@gmail.com>
Reviewed-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Reviewed-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Replace "Inconisistent" with "Inconsistent" in drivers/scsi/qla2xxx/qla_init.c
Signed-off-by: Raul Porcel <armin76@gentoo.org>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
For scsi devices which use scsi bus runtime callback, runtime suspend
will call scsi_dev_type_suspend, and if the drv->suspend failed, the
device will still be in active state. But since scsi_device_quiesce is
called, the device will not be able to respond any more commands.
So add a check here to see if err occured, if so, bring the device back
to normal state with scsi_device_resume.
Signed-off-by: Aaron Lu <aaron.lu@amd.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Made changes to set the fc_host sysfs entries supported_speeds,
supported_classes etc., during the vport creation from the
FC transport template.
Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When the task management IO times out, or a flush operation is performed while
task management IO is pending, driver is not cleaning up the IO. This patch
cleans up the IO for the above cases.
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When IO abort times out during eh_abort or a flush operation is performed while
abort is pending, the driver is not cleaning up the IO and thus not reducing
the IO reference count. With this change, as part of explicit logout, the IO is
cleaned up.
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Commit 907c07d45199f954ddcf66c2c9763c87d012cb15 added more cases to do FLOGI
retry on receiving bad response. Remove the code that drops the packet and
allow the stack to handle bad FLOGI response.
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQEcBAABAgAGBQJPuiSaAAoJEDeqqVYsXL0MJ8wH/2QYwxCtTzwgBE4DSUrZ/mnO
ygiiausG7gNY845hAmXhoEqhYBe1GA/fvfSXOdurAPrFmfu2HvvPEyKmu3soWxLM
rrXP7JNRjHOSz+GIktZECg6K9iobldl0zCxdn515ATnBEOZVom5v+uBE13sfg5uP
iOS73JF7h2VRcAYuw8jsVTdc/rnH2nG4TsbW2B+Hp3Ti1pFSnyHbbNuE2FJ9bEX4
gTBtsYYRZPWl24WuhmmS6LHyGqL+rcU/wKj4+rAdNQwsh+MBgcMDhGQ1UWg/3OGN
N8wur7AEQnyvsYdufjFNmHBux4TcdCVZISsWYb3frctJ5XVtYViMjlbmMMy1T7s=
=rxQP
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI misc update from James Bottomley:
"The patch contains the usual assortment of driver updates (be2iscsi,
bfa, bnx2i, fcoe, hpsa, isci, lpfc, megaraid, mpt2sas, pm8001, sg)
plus an assortment of other changes and fixes. Also new is the fact
that the isci update is delivered as a git merge (with signed tag)."
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (158 commits)
isci: End the RNC resumption wait when the RNC is destroyed.
isci: Fixed RNC bug that lost the suspension or resumption during destroy
isci: Fix RNC AWAIT_SUSPENSION->INVALIDATING transition.
isci: Manage the IREQ_NO_AUTO_FREE_TAG under scic_lock.
isci: Remove obviated host callback list.
isci: Check IDEV_GONE before performing abort path operations.
isci: Restore the ATAPI device RNC management code.
isci: Don't wait for an RNC suspend if it's being destroyed.
isci: Change the phy control and link reset interface for HW reasons.
isci: Added timeouts to RNC suspensions in the abort path.
isci: Add protocol indicator for TMF requests.
isci: Directly control IREQ_ABORT_PATH_ACTIVE when completing TMFs.
isci: Wait for RNC resumption before leaving the abort path.
isci: Fix RNC suspend call for SCI_RESUMING state.
isci: Manage tag releases differently when aborting tasks.
isci: Callbacks to libsas occur under scic_lock and are synchronized.
isci: When in the abort path, defeat other resume calls until done.
isci: Implement waiting for suspend in the abort path.
isci: Make sure all TCs are terminated and cleaned in LUN reset.
isci: Manage the LLHANG timer enable/disable per-device.
...
Pull m68k updates from Geert Uytterhoeven.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Setup CROSS_COMPILE at the top
m68k: Correct the Atari ALLOWINT definition
m68k/video: Create <asm/vga.h>
m68k: Make sure {read,write}s[bwl]() are always defined
m68k/mm: Port OOM changes to do_page_fault()
scsi/atari: Make more functions static
scsi/atari: Revive "atascsi=" setup option
net/ariadne: Improve debug prints
m68k/atari: Change VME irq numbers from unsigned long to unsigned int
m68k/amiga: Use arch_initcall() for registering platform devices
m68k/amiga: Add error checks when registering platform devices
m68k/amiga: Mark z_dev_present() __init
m68k: Remove unused MAX_NOINT_IPL definition
1/ Rework remote-node-context (RNC) handling for proper management of
the silicon state machine in error handling and hot-plug conditions.
Further details below, suffice to say if the RNC is mismanaged the
silicon state machines may lock up.
2/ Refactor the initialization code to be reused for suspend/resume support
3/ Miscellaneous bug fixes to address discovery issues and hardware
compatibility.
RNC rework details from Jeff Skirvin:
In the controller, devices as they appear on a SAS domain (or
direct-attached SATA devices) are represented by memory structures known
as "Remote Node Contexts" (RNCs). These structures are transferred from
main memory to the controller using a set of register commands; these
commands include setting up the context ("posting"), removing the
context ("invalidating"), and commands to control the scheduling of
commands and connections to that remote device ("suspensions" and
"resumptions"). There is a similar path to control RNC scheduling from
the protocol engine, which interprets the results of command and data
transmission and reception.
In general, the controller chooses among non-suspended RNCs to find one
that has work requiring scheduling the transmission of command and data
frames to a target. Likewise, when a target tries to return data back
to the initiator, the state of the RNC is used by the controller to
determine how to treat the incoming request. As an example, if the RNC
is in the state "TX/RX Suspended", incoming SSP connection requests from
the target will be rejected by the controller hardware. When an RNC is
"TX Suspended", it will not be selected by the controller hardware to
start outgoing command or data operations (with certain priority-based
exceptions).
As mentioned above, there are two sources for management of the RNC
states: commands from driver software, and the result of transmission
and reception conditions of commands and data signaled by the controller
hardware. As an example of the latter, if an outgoing SSP command ends
with a OPEN_REJECT(BAD_DESTINATION) status, the RNC state will
transition to the "TX Suspended" state, and this is signaled by the
controller hardware in the status to the completion of the pending
command as well as signaled in a controller hardware event. Examples of
the former are included in the patch changelogs.
Driver software is required to suspend the RNC in a "TX/RX Suspended"
condition before any outstanding commands can be terminated. Failure to
guarantee this can lead to a complete hardware hang condition. Earlier
versions of the driver software did not guarantee that an RNC was
correctly managed before I/O termination, and so operated in an unsafe
way.
Further, the driver performed unnecessary contortions to preserve the
remote device command state and so was more complicated than it needed
to be. A simplifying driver assumption is that once an I/O has entered
the error handler path without having completed in the target, the
requirement on the driver is that all use of the sas_task must end.
Beyond that, recovery of operation is dependent on libsas and other
components to reset, rediscover and reconfigure the device before normal
operation can restart. In the driver, this simplifying assumption meant
that the RNC management could be reduced to entry into the suspended
state, terminating the targeted I/O request, and resuming the RNC as
needed for device-specific management such as an SSP Abort Task or LUN
Reset Management request.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJPtYFXAAoJEB7SkWpmfYgCJkcP/1VvsuuitNy/YM9P1tb/RvQ7
ytJzjGtWiAABHVWwjgB+Ng7hUTaP2r6l8KeNfwxpwXyNdBAUNysYEUBHAfPsKzKz
espTmw3wVCnREajgKXZwFp9aTj8DcYFB6vKcC/ddACt3uRNjjA9En36+6797r8Vg
YdebyjFX2FxwoUj0icTUiV/OXgb8w723imnCl8bfOhhFRi4eFZ4EJ23AdMkUya1i
uYePAPJPSQJuU/87gNIx4JcR0qHJ1ziGPEY+XC47CzEeXbBTSPgWOwanQ6KPoXRJ
XVxamfcKAjRdtwQ4m1vYBSE32RTdrjhujVbkGiPi6QaEbCLjhLSCIYyuS3XMckV+
TCZ16o5kd/I6ZtZOeP4zZRGnNBkOPzY44qiJeKffjWDhTrFacx4XWJB/ftWPgEA5
N2zFH3RM4sY0FUJ3I/Qe5CERNdCXMtcj+UAf3nHpAIVcv46Lp+qoSkdEx05uuuiN
+D/dSlubktuvuzmB5WisL3qrjNEkkLTAGQpZs1j0ojLEBm0XAgV5EzqmHiZ0GOPD
OQNFxeei9SlqgtKIIP0bymRispPrG2HVCOvExYMxzKR6fjxofZLAs/aWOsdhxgMq
TlAyZJ6OmGI+KX68HHzoMpT9iquvmP64WGkfHzCx296BfSKiruLh/Jzt5gGwv+Z1
5tlpnUr9dUTxx7qkQXvj
=HYvO
-----END PGP SIGNATURE-----
Merge tag 'isci-for-3.5' into misc
isci update for 3.5
1/ Rework remote-node-context (RNC) handling for proper management of
the silicon state machine in error handling and hot-plug conditions.
Further details below, suffice to say if the RNC is mismanaged the
silicon state machines may lock up.
2/ Refactor the initialization code to be reused for suspend/resume support
3/ Miscellaneous bug fixes to address discovery issues and hardware
compatibility.
RNC rework details from Jeff Skirvin:
In the controller, devices as they appear on a SAS domain (or
direct-attached SATA devices) are represented by memory structures known
as "Remote Node Contexts" (RNCs). These structures are transferred from
main memory to the controller using a set of register commands; these
commands include setting up the context ("posting"), removing the
context ("invalidating"), and commands to control the scheduling of
commands and connections to that remote device ("suspensions" and
"resumptions"). There is a similar path to control RNC scheduling from
the protocol engine, which interprets the results of command and data
transmission and reception.
In general, the controller chooses among non-suspended RNCs to find one
that has work requiring scheduling the transmission of command and data
frames to a target. Likewise, when a target tries to return data back
to the initiator, the state of the RNC is used by the controller to
determine how to treat the incoming request. As an example, if the RNC
is in the state "TX/RX Suspended", incoming SSP connection requests from
the target will be rejected by the controller hardware. When an RNC is
"TX Suspended", it will not be selected by the controller hardware to
start outgoing command or data operations (with certain priority-based
exceptions).
As mentioned above, there are two sources for management of the RNC
states: commands from driver software, and the result of transmission
and reception conditions of commands and data signaled by the controller
hardware. As an example of the latter, if an outgoing SSP command ends
with a OPEN_REJECT(BAD_DESTINATION) status, the RNC state will
transition to the "TX Suspended" state, and this is signaled by the
controller hardware in the status to the completion of the pending
command as well as signaled in a controller hardware event. Examples of
the former are included in the patch changelogs.
Driver software is required to suspend the RNC in a "TX/RX Suspended"
condition before any outstanding commands can be terminated. Failure to
guarantee this can lead to a complete hardware hang condition. Earlier
versions of the driver software did not guarantee that an RNC was
correctly managed before I/O termination, and so operated in an unsafe
way.
Further, the driver performed unnecessary contortions to preserve the
remote device command state and so was more complicated than it needed
to be. A simplifying driver assumption is that once an I/O has entered
the error handler path without having completed in the target, the
requirement on the driver is that all use of the sas_task must end.
Beyond that, recovery of operation is dependent on libsas and other
components to reset, rediscover and reconfigure the device before normal
operation can restart. In the driver, this simplifying assumption meant
that the RNC management could be reduced to entry into the suspended
state, terminating the targeted I/O request, and resuming the RNC as
needed for device-specific management such as an SSP Abort Task or LUN
Reset Management request.
While the RNC is suspended for I/O cleanup, the remote device can be
stopped and the RNC setup for destruction. These changes accomodate that
case in the abort path.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This fix corrects the saving of resume parameters when the destruction
of the RNC has already been directed, and makes sure not to overwrite
the RNC destruction callbacks.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The RNC state machine would incorrectly transition from
SCI_RNC_AWAIT_SUSPENSION directly to SCI_RNC_INVALIDATING when a destruct
request was made. This would skip the increment of the suspension count
and the abort of pending TCs (although the invalidating state would at
least cleanup outstanding TCs).
Instead, the RNC will transition to SCI_RNC_SUSPENDED and then start the
destruction process.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Since there is a possibilty of a timeout waiting for the RNC suspension,
handle the exit case from the task termination under scic_lock, and leave
the tag allocated if the termination timed-out.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Since the callbacks to libsas now occur under scic_lock, there is no
longer any reason to save the completed requests in a separate list
for completion to libsas.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In the link fail path, set IDEV_GONE for every device on the domain
when the last link in the port fails.
In the abort path functions like isci_reset_device, make sure that
there has not already been a detected domain failure with the device
by checking IDEV_GONE, before performing any kind of hard reset, SMP
phy control, or TMF operation.
The check for IDEV_GONE makes sure that the device in the abort path
really has control of the port with which it is associated. This
prevents starting hard resets at incorrect times and scheduling
unnecessary LUN resets for SATA devices.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The ATAPI specific and STP general RNC suspension code had been
incorrectly removed from the remote device code.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Make sure that the wait for suspend can handle the RNC destruction case.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>