51 Commits

Author SHA1 Message Date
Brian King
309bd27121 [SCSI] scsi: Device scanning oops for offlined devices (resend)
If a device gets offlined as a result of the Inquiry sent
during scanning, the following oops can occur. After the
disk gets put into the SDEV_OFFLINE state, the error handler
sends back the failed inquiry, which wakes the thread doing
the scan. This starts a race between the scanning thread
freeing the scsi device and the error handler calling
scsi_run_host_queues to restart the host. Since the disk
is in the SDEV_OFFLINE state, scsi_device_get will still
work, which results in __scsi_iterate_devices getting
a reference to the scsi disk when it shouldn't.

The following execution thread causes the oops:

CPU 0 (scan)				CPU 1 (eh)

---------------------------------------------------------
scsi_probe_and_add_lun
                        ....
                                        scsi_eh_offline_sdevs
                                        scsi_eh_flush_done_q
scsi_destroy_sdev
scsi_device_dev_release
                                        scsi_restart_operations
                                         scsi_run_host_queues
                                          __scsi_iterate_devices
                                           get_device
scsi_device_dev_release_usercontext
                                          scsi_run_queue
                                            <---OOPS--->

The patch fixes this by changing the state of the sdev to SDEV_DEL
before doing the final put_device, which should prevent the race
from occurring.

Original oops follows:

Badness in kref_get at lib/kref.c:32
Call Trace:
[C00000002F4476D0] [C00000000000EE20] .show_stack+0x68/0x1b0 (unreliable)
[C00000002F447770] [C00000000037515C] .program_check_exception+0x1cc/0x5a8
[C00000002F447840] [C00000000000446C] program_check_common+0xec/0x100
 Exception: 700 at .kref_get+0x10/0x28
    LR = .kobject_get+0x20/0x3c
[C00000002F447B30] [C00000002F447BC0] 0xc00000002f447bc0 (unreliable)
[C00000002F447BB0] [C000000000254BDC] .get_device+0x20/0x3c
[C00000002F447C30] [D000000000063188] .scsi_device_get+0x34/0xdc [scsi_mod]
[C00000002F447CC0] [D0000000000633EC] .__scsi_iterate_devices+0x50/0xbc [scsi_mod]
[C00000002F447D60] [D00000000006A910] .scsi_run_host_queues+0x34/0x5c [scsi_mod]
[C00000002F447DF0] [D000000000069054] .scsi_error_handler+0xdb4/0xe44 [scsi_mod]
[C00000002F447EE0] [C00000000007B4E0] .kthread+0x128/0x178
[C00000002F447F90] [C000000000025E84] .kernel_thread+0x4c/0x68
Unable to handle kernel paging request for <7>PCI: Enabling device: (0002:41:01.1), cmd 143
data at address 0x000001b8
Faulting instruction address: 0xd0000000000698e4
sym1: <1010-66> rev 0x1 at pci 0002:41:01.1 irq 216
sym1: No NVRAM, ID 7, Fast-80, LVD, parity checking
sym1: SCSI BUS has been reset.
scsi2 : sym-2.2.2
cpu 0x0: Vector: 300 (Data Access) at [c00000002f447a30]
    pc: d0000000000698e4: .scsi_run_queue+0x2c/0x218 [scsi_mod]
    lr: d00000000006a904: .scsi_run_host_queues+0x28/0x5c [scsi_mod]
    sp: c00000002f447cb0
   msr: 9000000000009032
   dar: 1b8
 dsisr: 40000000
  current = 0xc0000000045fecd0
  paca    = 0xc00000000048ee80
    pid   = 1123, comm = scsi_eh_1
enter ? for help
[c00000002f447d60] d00000000006a904 .scsi_run_host_queues+0x28/0x5c [scsi_mod]
[c00000002f447df0] d000000000069054 .scsi_error_handler+0xdb4/0xe44 [scsi_mod]
[c00000002f447ee0] c00000000007b4e0 .kthread+0x128/0x178
[c00000002f447f90] c000000000025e84 .kernel_thread+0x4c/0x68

Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-06-28 12:39:56 -04:00
Christoph Hellwig
beb4048750 [SCSI] remove scsi_request infrastructure
With Achim patch the last user (gdth) is switched away from scsi_request
so we an kill it now.  Also disables some code in i2o_scsi that was
broken since the sg driver stopped using scsi_requests.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-06-10 16:24:40 -05:00
Amit Arora
091686d3b5 [SCSI] Return -EINVAL when "id == max_id" in scsi_scan_host_selected()
The scsi_scan_host_selected() should return -EINVAL when the id is equal
to the max_id. Currently it uses ">" when comparing with max_id, and
hence leaves the border case when "id==max_id".
The channel and lun have values valid from 0 up to,
and including, max_channel or max_lun. But, the valid values for id
range from 0 to max_id-1. This patch fixes the problem.

Signed-off-by: Amit Arora <aarora@in.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-05-28 13:01:23 -04:00
akpm@osdl.org
c5f2e6404c [SCSI] scsi_scan.c: fix compile warnings
drivers/scsi/scsi_scan.c: In function `scsi_probe_and_add_lun':
drivers/scsi/scsi_scan.c:926: warning: unused variable `vend'
drivers/scsi/scsi_scan.c:926: warning: unused variable `mod'
drivers/scsi/scsi_scan.c: At top level:
drivers/scsi/scsi_scan.c:829: warning: `scsi_inq_str' defined but not used

Fix those, tighten up the (somewhat poorly-designed) logging macro and fix
some coding-style warts.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-15 09:04:40 -05:00
James Bottomley
84d891d672 Merge ../scsi-rc-fixes-2.6
Conflicts:

	include/scsi/scsi_devinfo.h

Same number for two BLIST flags:  BLIST_MAX_512 and BLIST_ATTACH_PQ3

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-14 15:47:45 -05:00
Kurt Garloff
13f7e5acc8 [SCSI] BLIST_ATTACH_PQ3 flags
Some devices report a peripheral qualifier of 3 for LUN 0; with the original
code, we would still try a REPORT_LUNS scan (if SCSI level is >= 3 or if we
have the BLIST_REPORTLUNS2 passed in), but NOT any sequential scan.
Also, the device at LUN 0 (which is not connected according to the PQ) is not
registered with the OS.

Unfortunately, SANs exist that are SCSI-2 and do NOT support REPORT_LUNS, but
report a unknown device with PQ 3 on LUN 0. We still need to scan them, and
most probably we even need BLIST_SPARSELUN (and BLIST_LARGELUN). See the bug
reference for an infamous example.

This is patch 3/3:
3. Implement the blacklist flag BLIST_ATTACH_PQ3 that makes the scsi
   scanning code register PQ3 devices and continues scanning; only sg
   will attach thanks to scsi_bus_match().

Signed-off-by: Kurt Garloff <garloff@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-14 13:56:56 -05:00
Kurt Garloff
6c7154c97e [SCSI] Better log messages for PQ3 devs
Some devices report a peripheral qualifier of 3 for LUN 0; with the original
code, we would still try a REPORT_LUNS scan (if SCSI level is >= 3 or if we
have the BLIST_REPORTLUNS2 passed in), but NOT any sequential scan.
Also, the device at LUN 0 (which is not connected according to the PQ) is not
registered with the OS.

Unfortunately, SANs exist that are SCSI-2 and do NOT support REPORT_LUNS, but
report a unknown device with PQ 3 on LUN 0. We still need to scan them, and
most probably we even need BLIST_SPARSELUN (and BLIST_LARGELUN). See the bug
reference for an infamous example.

This patch 2/3:
If a PQ3 device is found, log a message that describes the device
(INQUIRY DATA and C:B:T:U tuple) and make a suggestion for blacklisting
it.

Signed-off-by: Kurt Garloff <garloff@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-14 13:56:03 -05:00
Kurt Garloff
4186ab1973 [SCSI] Try LUN 1 and use bflags
Some devices report a peripheral qualifier of 3 for LUN 0; with the original
code, we would still try a REPORT_LUNS scan (if SCSI level is >= 3 or if we
have the BLIST_REPORTLUNS2 passed in), but NOT any sequential scan.
Also, the device at LUN 0 (which is not connected according to the PQ) is not
registered with the OS.

Unfortunately, SANs exist that are SCSI-2 and do NOT support REPORT_LUNS, but
report a unknown device with PQ 3 on LUN 0. We still need to scan them, and
most probably we even need BLIST_SPARSELUN (and BLIST_LARGELUN). See the bug
reference for an infamous example.

This is patch 1/3:
If we end up in sequential scan, at least try LUN 1 for devices
that reported a PQ of 3 for LUN 0.
Also return blacklist flags, even for PQ3 devices.

Signed-off-by: Kurt Garloff <garloff@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-14 13:53:43 -05:00
James Bottomley
4d7db04a7a [SCSI] add SCSI_UNKNOWN and LUN transfer limit restrictions
Original From: Ingo Flaschberger <if@xip.at>

To support the RA4100 array from Compaq.

This patch now correctly handles SCSI_UNKNOWN types with regard to
BLIST_REPORTLUNS2 (allow it) and cdb[1] LUN inclusion (don't).

It also allows a BLIST_MAX_512 flag to restrict the maximum transfer
length to 512 blocks (apparently this is an RA4100 problem).

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:31 -05:00
Mike Anderson
a50a5e3792 [SCSI] scsi: move target_destroy call
This patch moves the calling of target_destroy next to the list_del. This
closed a race being seen while doing a device add on the aic7xxx.

Signed-off-by: Mike Anderson <andmike@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-03-14 14:36:00 -06:00
James Bottomley
f33b5d783b Merge ../linux-2.6 2006-03-14 14:18:01 -06:00
Dave Jones
93f5608989 [SCSI] fix two leaks in scsi_alloc_sdev failure paths
If the scsi_alloc_queue or the slave_alloc calls in scsi_alloc_device fail,
we forget to release the locally allocated sdev on the failure path.

Coverity #609

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-03-12 09:25:40 -06:00
Brian King
32f9579250 [SCSI] scsi: Handle device_add failure in scsi_alloc_target
Fixes scsi to handle device_add failure in scsi_alloc_target.
Without this patch, if this call were to fail, we can oops
when we free the target.

Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-02-27 23:38:59 -06:00
James Bottomley
ffedb45225 [SCSI] fix scsi process problems and clean up the target reap issues
In order to use the new execute_in_process_context() API, you have to
provide it with the work storage, which I do in SCSI in scsi_device and
scsi_target, but which also means that we can no longer queue up the
target reaps, so instead I moved the target to a state model which
allows target_alloc to detect if we've received a dying target and wait
for it to be gone.  Hopefully, this should also solve the target
namespace race.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-02-27 23:37:45 -06:00
Alan Stern
1bfc5d9d5e [SCSI] Recognize missing LUNs for non-standard devices
Some non-standard SCSI targets or protocols, such as USB UFI, report "no
LUN present" by setting the Peripheral Device Type to 0x1f and the
Peripheral Qualifier to 0 (not 3 as the standard requires) in the INQUIRY
response.  This patch (as650b) adds a new target flag and code to
accomodate such targets.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-02-27 23:24:09 -06:00
Matthew Wilcox
a97a83a06b [SCSI] fix uninitialized variable error
in __scsi_add_device, sdev may be uninitialised if
scsi_host_scan_allowed() returns false.  Fix by initialising at the
top of the routine.  Also rely on the fact that
scsi_probe_and_add_lun() only actually fills in the sdev pointer if
the SCSI_SCAN_LUN_PRESENT case (so no need to check the return value).

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-02-27 22:55:07 -06:00
Greg KH
5e3c34c1e9 [SCSI] Remove devfs support from the SCSI subsystem
As devfs has been disabled from the kernel tree for a number of months
now (5 to be exact), here's a patch against 2.6.16-rc1-git1 that removes
support for it from the SCSI subsystem.

The patch also removes the scsi_disk devfs_name field as it's no longer
needed.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-02-27 22:55:04 -06:00
Jes Sorensen
24669f75a3 [SCSI] SCSI core kmalloc2kzalloc
Change the core SCSI code to use kzalloc rather than kmalloc+memset
where possible.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-02-27 22:55:02 -06:00
Christoph Hellwig
938050916f [SCSI] scsi: handle ->slave_configure return value
When >slave_configure fails the scsi midlayer should handle it.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-02-27 21:26:45 -06:00
James Bottomley
65110b2168 [SCSI] fix wrong context bugs in SCSI
There's a bug in releasing scsi_device where the release function
actually frees the block queue.  However, the block queue release
calls flush_work(), which requires process context (the scsi_device
structure may release from irq context).  Update the release function
to invoke via the execute_in_process_context() API.

Also clean up the scsi_target structure releasing via this API.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-02-14 11:15:11 -06:00
Christoph Hellwig
e02f3f5922 [SCSI] remove target parent limitiation
When James Smart fixed the issue of the userspace scan atributes
crashing the system with the FC transport class he added a patch to
let the transport class check if the parent is valid for a given
transport class.

When adding support for the integrated raid of fusion sas devices
we ran into a problem with that, as it didn't allow adding virtual
raid volumes without the transport class knowing about it.

So this patch adds a user_scan attribute instead, that takes over from
scsi_scan_host_selected if the transport class sets it and thus lets
the transport class control the user-initiated scanning.  As this
plugs the hole about user-initiated scanning the target_parent hook
goes away and we rely on callers of the scanning routines to do
something sensible.

For SAS this meant I had to switch from a spinlock to a mutex to
synchronize the topology linked lists, in FC they were completely
unsynchronized which seems wrong.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-01-14 10:55:05 -06:00
Arjan van de Ven
0b95067238 [SCSI] turn most scsi semaphores into mutexes
the scsi layer is using semaphores in a mutex way, this patch converts
these into using mutexes instead

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-01-12 11:53:11 -06:00
Linus Torvalds
f61ea1b0c8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 2006-01-04 16:30:12 -08:00
James Bottomley
04333393b9 [PATCH] Fix Fibre Channel boot oops
The oops is characteristic of the underlying device being removed from
visibility before the class device, and sure enough we do device_del()
before transport_unregister() in the scsi_target_reap() routines.  I've
no idea why this is suddenly showing up, since the code has been in
there since that function was first invented.  However, I've confirmed
this fixes Andrew Vasquez's boot oops.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-12-26 10:17:43 -08:00
James Bottomley
863a930a40 [SCSI] fix scsi_reap_target() device_del from atomic context
scsi_reap_target() was desgined to be called from any context.
However it must do a device_del() of the target device, which may only
be called from user context.  Thus we have to reimplement
scsi_reap_target() via a workqueue.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-12-17 10:48:08 -06:00
Arjan van de Ven
0ad78200ba [SCSI] Mark some core scsi data structures const
patch below marks a few scsi core datastructures as const, so that they end up
in the .rodata section and don't cacheline share with things that get dirtied

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-12-13 18:11:01 -07:00
Brian King
66e0522526 [PATCH] Fix SCSI scanning slab corruption
There is a double free in the scsi scan code if a LLDD's slave_alloc()
call fails.  There is a direct call to scsi_free_queue and then the
following put_device calls the release function, which also frees the
queue.

Remove the redundant scsi_free_queue.

Signed-off-by: Brian King <brking@us.ibm.com>
Tested-by: Nathan Lynch <ntl@pobox.com>
[ Also removed some strange whitespace artifacts in that area ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-12-12 12:35:39 -08:00
Christoph Hellwig
f64a181d89 [SCSI] remove Scsi_Device typedef
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-11-09 15:48:20 -05:00
Alan Stern
2ef8919830 [SCSI] Fix refcount leak in scsi_report_lun_scan
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-11-08 16:13:34 -05:00
Jeff Garzik
3bf743e7c8 [SCSI] use {sdev,scmd,starget,shost}_printk in generic code
rejections fixed and
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-10-28 20:52:11 -05:00
Jeff Garzik
13ec92b33e [SCSI] kill unused scsi_scan_single_target()
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-10-28 20:29:18 -05:00
James Bottomley
9ccfc756a7 [SCSI] move the mid-layer printk's over to shost/starget/sdev_printk
This should eliminate (at least in the mid layer) to make numeric
assumptions about any of the enumeration variables.  As a side effect,
it will also make all the messages consistent and line us up nicely for
the error logging strategy (if it ever shows itself again).

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-10-28 14:23:02 -05:00
James Bottomley
6f3a20242d [SCSI] allow REPORT LUN scanning even for LUN 0 PQ of 3
Currently we just ignore the device, which means there are a few
arrays out there that we don't find.

This patch updates the scsi_report_lun_scan() to take a target instead
of a device so it can be called on a return of
SCSI_SCAN_TARGET_PRESENT, which is what a PQ 3 device returns.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-09-25 12:01:48 -05:00
Alan Stern
a64358db12 [SCSI] SCSI scanning and removal fixes
This patch (as545) fixes the list traversals in __scsi_remove_target and
scsi_forget_host.  In each case the existing code list_for_each_entry_safe
in an _unsafe_ manner, because the list was not protected from outside
modification while the iteration was running.

The new scsi_forget_host routine takes the moderately controversial step
of iterating over devices for removal rather than iterating over targets.
This makes more sense to me because the current scheme treats targets as
second-class citizens, created and removed on demand, rather than as
objects corresponding to actual hardware.  (Also I couldn't figure out any
safe way to iterate over the target list, since it's not so easy to tell
when a target has already been removed.)

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-09-18 15:22:06 -05:00
James Bottomley
146f7262ee [SCSI] Alter the scsi_add_device() API to conform to what users expect
The original API returned either an ERR_PTR() or a refcounted sdev.
Unfortunately, if it's successful, you need to do a scsi_device_put() on
the sdev otherwise the refcounting is wrong.

Everyone seems to expect that scsi_add_device() should be callable
without doing the ref put, so alter the API so it is (we still have
__scsi_add_device with the original behaviour).

The only actual caller that needs altering is the one in firewire ...
not because it gets this right, but because it acts on the error if one
is returned.

Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-09-10 14:43:25 -05:00
Alan Stern
b70d37bf61 [SCSI] Fix module removal/device add race
This patch (as546) fixes an oops-causing failure to check the return code
from scsi_device_get.  The call can return an error if the LLD is being
unloaded from memory.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-09-10 11:21:02 -05:00
Alan Stern
e517d3133f [SCSI] add missing scan mutex to scsi_scan_target()
This patch (as543) adds a private entry point to scsi_scan_target, for use
when the caller already owns the scan_mutex, and updates the kerneldoc for
that routine (which was badly out-of-date).  It converts scsi_scan_channel
to use the new entry point.  Lastly, it modifies scsi_get_host_dev to make
it acquire the scan_mutex, necessary since the routine adds a new
scsi_device even if it doesn't do any actual scanning.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-09-09 10:24:31 -05:00
James Bottomley
ea73a9f239 [SCSI] convert sd to scsi_execute_req (and update the scsi_execute_req API)
This one removes struct scsi_request entirely from sd.  In the process,
I noticed we have no callers of scsi_wait_req who don't immediately
normalise the sense, so I updated the API to make it take a struct
scsi_sense_hdr instead of simply a big sense buffer.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-08-28 11:33:52 -05:00
James Bottomley
7a93aef7fb Merge HEAD from ../scsi-misc-2.6-tmp 2005-08-28 11:18:35 -05:00
James Bottomley
392160335c [SCSI] use scatter lists for all block pc requests and simplify hw handlers
Original From: Mike Christie <michaelc@cs.wisc.edu>

Add scsi_execute_req() as a replacement for scsi_wait_req()

Fixed up various pieces (added REQ_SPECIAL and caught req use after
free)

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-08-28 10:46:40 -05:00
James.Smart@Emulex.Com
5c44cd2afa [SCSI] fix target scanning oops with fc transport class
We have some nasty issues with 2.6.12-rc6. Any request to scan on
the lpfc or qla2xxx FC adapters will oops. What is happening is the
system is defaulting to non-transport registered targets, which
inherit the parent of the scan. On this second scan, performed by
the attribute, the parent becomes the shost instead of the rport.
The slave functions in the 2 FC adapters use starget_to_rport()
routines, which incorrectly map the shost as an rport pointer.

Additionally, this pointed out other weaknesses:
- If the target structure is torn down outside of the transport,
  we have no method for it to be regenerated at the proper parent.
- We have race conditions on the target being allocated by both
  the midlayer scan (parent=shost) and by the fc transport
  (parent=rport).

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-08-08 17:14:55 -05:00
Mike Anderson
82f29467a0 [SCSI] host state model update: mediate host add/remove race
Add support to not allow additions to a host when it is being removed.

Signed-off-by: Mike Anderson <andmike@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-30 11:13:01 -05:00
Alan Stern
b24b103345 [PATCH] scsi_scan: check return code from scsi_sysfs_add_sdev
Adds a missing check for an error return code from scsi_sysfs_add_sdev.
This resolves entry #4863 in the OSDL bugzilla.  Although in that bug
report the failure occurred because of a confusion over scanning vs.
rescanning, in general add_sdev can fail for a number of reasons (the
simplest being insufficient memory) and the caller should cope properly.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:25:51 -07:00
James.Smart@Emulex.Com
2f4701d827 [SCSI] add int_to_scsilun() function
One of the issues we had was reverting the midlayers lun value
into the 8byte lun value that we wanted to send to the device.
Historically, there's been some combination of byte swapping,
setting high/low, etc. There's also been no common thread between
how our driver did it and others.  I also got very confused as
to why byteswap routines were being used.

Anyway, this patch is a LLDD-callable function that reverts the
midlayer's lun value, stored in an int, to the 8-byte quantity
(note: this is not the real 8byte quantity, just the same amount
that scsilun_to_int() was able to convert and store originally).

This also solves the dilemma of the thread:
http://marc.theaimsgroup.com/?l=linux-kernel&m=112116767118981&w=2

A patch for the lpfc driver to use this function will be along
in a few days (batched with other patches).

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:21:27 -04:00
James Bottomley
3237ee78fc merge by hand (fix up qla_os.c merge error) 2005-06-17 18:42:23 -05:00
Nathan Lynch
c92715b3c2 [SCSI] fix slab corruption during ipr probe
With CONFIG_DEBUG_SLAB=y I see slab corruption messages during boot on
pSeries machines with IPR adapters with any 2.6.12-rc kernel.

The change which seems to have introduced the problem is "SCSI: revamp
target scanning routines" and may be found at:
http://marc.theaimsgroup.com/?l=bk-commits-head&m=111093946426333&w=2

In order to revert that in a 2.6.12-rc1 tree, I had to revert "target
code updates to support scanned targets" first:
http://marc.theaimsgroup.com/?l=bk-commits-head&m=111094132524649&w=2

With both patches reverted, the corruption messages go away.

ipr: IBM Power RAID SCSI Device Driver version: 2.0.13 (February 21,
2005)
ipr 0001:d0:01.0: Found IOA with IRQ: 167
ipr 0001:d0:01.0: Starting IOA initialization sequence.
ipr 0001:d0:01.0: Adapter firmware version: 020A005C
ipr 0001:d0:01.0: IOA initialized.
scsi0 : IBM 570B Storage Adapter
  Vendor: IBM       Model: VSBPD4E1  U4SCSI  Rev: 4770
  Type:   Enclosure                          ANSI SCSI revision: 02
  Vendor: IBM   H0  Model: HUS103036FL3800   Rev: RPQF
  Type:   Direct-Access                      ANSI SCSI revision: 04
  Vendor: IBM   H0  Model: HUS103036FL3800   Rev: RPQF
  Type:   Direct-Access                      ANSI SCSI revision: 04
  Vendor: IBM   H0  Model: HUS103036FL3800   Rev: RPQF
  Type:   Direct-Access                      ANSI SCSI revision: 04
  Vendor: IBM   H0  Model: HUS103036FL3800   Rev: RPQF
  Type:   Direct-Access                      ANSI SCSI revision: 04
  Vendor: IBM       Model: VSBPD4E1  U4SCSI  Rev: 4770
  Type:   Enclosure                          ANSI SCSI revision: 02
Slab corruption: start=c0000001e8de5268, len=512
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c00000000029c3a0>](.scsi_target_dev_release+0x28/0x50)
080: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6a
Prev obj: start=c0000001e8de5050, len=512
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<0000000000000000>](0x0)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=c0000001e8de5480, len=512
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c000000000228d7c>](.as_init_queue+0x5c/0x228)
000: c0 00 00 01 e8 83 26 08 00 00 00 00 00 00 00 00
010: 00 00 00 00 00 00 00 00 c0 00 00 01 e8 de 54 98
Slab corruption: start=c0000001e8de5268, len=512
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<c00000000029c3a0>](.scsi_target_dev_release+0x28/0x50)
080: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6a
Prev obj: start=c0000001e8de5050, len=512
Redzone: 0x5a2cf071/0x5a2cf071.
Last user: [<0000000000000000>](0x0)
000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=c0000001e8de5480, len=512
Redzone: 0x170fc2a5/0x170fc2a5.
Last user: [<c000000000228d7c>](.as_init_queue+0x5c/0x228)
000: c0 00 00 01 e8 83 26 08 00 00 00 00 00 00 00 00
010: 00 00 00 00 00 00 00 00 c0 00 00 01 e8 de 54 98
...

I did some digging and the problem seems to be a refcounting issue in
__scsi_add_device.  The target gets freed in scsi_target_reap, and
then __scsi_add_device tries to do another device_put on it.

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-06-03 09:38:55 -05:00
James Bottomley
a283bd37d0 [SCSI] Add target alloc/destroy callbacks to the host template
This gives the HBA driver notice when a target is created and
destroyed to allow it to manage its own target based allocations
accordingly.

This is a much reduced verson of the original patch sent in by
James.Smart@Emulex.com

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-05-26 11:27:53 -04:00
Al Viro
631e8a1398 [SCSI] TYPE_RBC cache fixes (sbp2.c affected)
a) TYPE_SDAD renamed to TYPE_RBC and taken to scsi.h
	b) in sbp2.c remapping of TYPE_RPB to TYPE_DISK turned off
	c) relevant places in midlayer and sd.c taught to accept TYPE_RBC
	d) sd.c::sd_read_cache_type() looks into page 6 when dealing with
TYPE_RBC - these guys have writeback cache flag there and are not guaranteed
to have page 8 at all.
	e) sd_read_cache_type() got an extra sanity check - it checks that
it got the page it asked for before using its contents.  And screams if
mismatch had happened.  Rationale: there are broken devices out there that
are "helpful" enough to go for "I don't have a page you've asked for, here,
have another one".  For example, PL3507 had been caught doing just that...
	f) sbp2 sets sdev->use_10_for_rw and sdev->use_10_for_ms instead
of bothering to remap READ6/WRITE6/MOD_SENSE, so most of the conversions
in there are gone now.

	Incidentally, I wonder if USB storage devices that have no
mode page 8 are simply RBC ones.  I haven't touched that, but it might
be interesting to check...

Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-05-26 08:41:15 -05:00
Al Viro
bc86120a85 [PATCH] SCSI GFP fixes
Somebody forgot that | has higher priority than ?:.  As the result,
allocation is done with bogus flags - instead of GFP_ATOMIC + possibly
GFP_DMA we always get GFP_DMA and no GFP_ATOMIC. 

Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-24 12:28:34 -07:00
152587deb8 [PATCH] fix NMI lockup with CFQ scheduler
The current problem seen is that the queue lock is actually in the
SCSI device structure, so when that structure is freed on device
release, we go boom if the queue tries to access the lock again.

The fix here is to move the lock from the scsi_device to the queue.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-04-16 20:10:09 -05:00