linux

iv/linux

Author	SHA1	Message	Date
Martin K. Petersen	ca369d51b3	block/sd: Fix device-imposed transfer length limits Commit `4f258a4634` ("sd: Fix maximum I/O size for BLOCK_PC requests") had the unfortunate side-effect of removing an implicit clamp to BLK_DEF_MAX_SECTORS for REQ_TYPE_FS requests in the block layer code. This caused problems for some SMR drives. Debugging this issue revealed a few problems with the existing infrastructure since the block layer didn't know how to deal with device-imposed limits, only limits set by the I/O controller. - Introduce a new queue limit, max_dev_sectors, which is used by the ULD to signal the maximum sectors for a REQ_TYPE_FS request. - Ensure that max_dev_sectors is correctly stacked and taken into account when overriding max_sectors through sysfs. - Rework sd_read_block_limits() so it saves the max_xfer and opt_xfer values for later processing. - In sd_revalidate() set the queue's max_dev_sectors based on the MAXIMUM TRANSFER LENGTH value in the Block Limits VPD. If this value is not reported, fall back to a cap based on the CDB TRANSFER LENGTH field size. - In sd_revalidate(), use OPTIMAL TRANSFER LENGTH from the Block Limits VPD--if reported and sane--to signal the preferred device transfer size for FS requests. Otherwise use BLK_DEF_MAX_SECTORS. - blk_limits_max_hw_sectors() is no longer used and can be removed. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=93581 Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: sweeneygj@gmx.com Tested-by: Arzeets <anatol.pomozov@gmail.com> Tested-by: David Eisner <david.eisner@oriel.oxon.org> Tested-by: Mario Kicherer <dev@kicherer.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2015-11-25 21:38:58 -05:00
Hannes Reinecke	22e0d99415	scsi: introduce sdev_prefix_printk() Like scmd_printk(), but the device name is passed in as a string. Can be used by eg ULDs which do not have access to the scsi_cmnd structure. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-11-12 11:15:57 +01:00
Martin K. Petersen	c611529e7c	sd: Honor block layer integrity handling flags A set of flags introduced in the block layer enable better control over how protection information is handled. These flags are useful for both error injection and data recovery purposes. Checking can be enabled and disabled for controller and disk, and the guard tag format is now a per-I/O property. Update sd_protect_op to communicate the relevant information to the low-level device driver via a set of flags in scsi_cmnd. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-09-30 15:17:35 -06:00
Martin K. Petersen	bcdb247c6b	sd: Limit transfer length Until now the per-command transfer length has exclusively been gated by the max_sectors parameter in the scsi_host template. Given that the size of this parameter has been bumped to an unsigned int we have to be careful not to exceed the target device's capabilities. If the if the device specifies a Maximum Transfer Length in the Block Limits VPD we'll use that value. Otherwise we'll use 0xffffffff for devices that have use_16_for_rw set and 0xffff for the rest. We then combine the chosen disk limit with max_sectors in the host template. The smaller of the two will be used to set the max_hw_sectors queue limit. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2014-07-17 22:07:33 +02:00
Martin K. Petersen	b2bff6ceb6	[SCSI] sd: Quiesce mode sense error messages Messages about discovered disk properties are only printed once unless they are found to have changed. Errors encountered during mode sense, however, are printed every time we revalidate. Quiesce mode sense errors so they are only printed during the first scan. [jejb: checkpatch fixes] Bugzilla: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=733565 Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2014-03-27 08:26:33 -07:00
James Bottomley	7e660100d8	[SCSI] Derive the FLUSH_TIMEOUT from the basic I/O timeout Rather than having a separate constant for specifying the timeout on FLUSH operations, use the basic I/O timeout value that is already configurable on a per target basis to derive the FLUSH timeout. Looking at the current definitions of these timeout values, the FLUSH operation is supposed to have a value that is twice the normal timeout value. This patch preserves this relationship while leveraging the flexibility of specifying the I/O timeout. Based on a prior patch by KY Srinivasan <kys@microsoft.com> Reviewed-by: KY Srinivasan <kys@microsoft.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-10-25 09:58:16 +01:00
Martin K. Petersen	66c28f9712	[SCSI] sd: Update WRITE SAME heuristics SATA drives located behind a SAS controller would incorrectly receive WRITE SAME commands. Tweak the heuristics so that: - If REPORT SUPPORTED OPERATION CODES is provided we will use that to choose between WRITE SAME(16), WRITE SAME(10) and disabled. This also fixes an issue with the old code which would issue WRITE SAME(10) despite the command not being whitelisted in REPORT SUPPORTED OPERATION CODES. - If REPORT SUPPORTED OPERATION CODES is not provided we will fall back to WRITE SAME(10) unless the device has an ATA Information VPD page. The assumption is that a SATL which is smart enough to implement WRITE SAME would also provide REPORT SUPPORTED OPERATION CODES. To facilitate the new heuristics scsi_report_opcode() has been modified to so we can distinguish between "operation not supported" and "RSOC not supported". Reported-by: H. Peter Anvin <hpa@zytor.com> Tested-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-06-26 17:56:18 -07:00
James Bottomley	39c60a0948	[SCSI] sd: fix array cache flushing bug causing performance problems Some arrays synchronize their full non volatile cache when the sd driver sends a SYNCHRONIZE CACHE command. Unfortunately, they can have Terrabytes of this and we send a SYNCHRONIZE CACHE for every barrier if an array reports it has a writeback cache. This leads to massive slowdowns on journalled filesystems. The fix is to allow userspace to turn off the writeback cache setting as a temporary measure (i.e. without doing the MODE SELECT to write it back to the device), so even though the device reported it has a writeback cache, the user, knowing that the cache is non volatile and all they care about is filesystem correctness, can turn that bit off in the kernel and avoid the performance ruinous (and safety irrelevant) SYNCHRONIZE CACHE commands. The way you do this is add a 'temporary' prefix when performing the usual cache setting operations, so echo temporary write through > /sys/class/scsi_disk/<disk>/cache_type Reported-by: Ric Wheeler <rwheeler@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2013-05-02 16:15:54 -07:00
Martin K. Petersen	5db44863b6	[SCSI] sd: Implement support for WRITE SAME Implement support for WRITE SAME(10) and WRITE SAME(16) in the SCSI disk driver. - We set the default maximum to 0xFFFF because there are several devices out there that only support two-byte block counts even with WRITE SAME(16). We only enable transfers bigger than 0xFFFF if the device explicitly reports MAXIMUM WRITE SAME LENGTH in the BLOCK LIMITS VPD. - max_write_same_blocks can be overriden per-device basis in sysfs. - The UNMAP discovery heuristics remain unchanged but the discard limits are tweaked to match the "real" WRITE SAME commands. - In the error handling logic we now distinguish between WRITE SAME with and without UNMAP set. The discovery process heuristics are: - If the device reports a SCSI level of SPC-3 or greater we'll issue READ SUPPORTED OPERATION CODES to find out whether WRITE SAME(16) is supported. If that's the case we will use it. - If the device supports the block limits VPD and reports a MAXIMUM WRITE SAME LENGTH bigger than 0xFFFF we will use WRITE SAME(16). - Otherwise we will use WRITE SAME(10) unless the target LBA is beyond 0xFFFFFFFF or the block count exceeds 0xFFFF. - no_write_same is set for ATA, FireWire and USB. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-11-13 22:45:42 -08:00
Martin K. Petersen	8c579ab69d	[SCSI] sd: Avoid remapping bad reference tags It does not make sense to translate ref tags with unexpected values. Instead we simply ignore them and let the upper layers catch the problem. Ref tags that contain the expected value are still remapped. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-09-24 12:10:59 +04:00
Martin K. Petersen	18a4d0a22e	[SCSI] Handle disk devices which can not process medium access commands We have experienced several devices which fail in a fashion we do not currently handle gracefully in SCSI. After a failure these devices will respond to the SCSI primary command set (INQUIRY, TEST UNIT READY, etc.) but any command accessing the storage medium will time out. The following patch adds an callback that can be used by upper level drivers to inspect the results of an error handling command. This in turn has been used to implement additional checking in the SCSI disk driver. If a medium access command fails twice but TEST UNIT READY succeeds both times in the subsequent error handling we will offline the device. The maximum number of failed commands required to take a device offline can be tweaked in sysfs. Also add a new error flag to scsi_debug which allows this scenario to be easily reproduced. [jejb: fix up integer parsing to use kstrtouint] Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2012-02-19 10:14:52 -06:00
Dave Kleikamp	21208ae5a2	[SCSI] sd: remove arbitrary SD_MAX_DISKS namespace limit There is no reason to limit the SCSI disk namespace to sdXXX. Add new error messages to sd_probe() in the unlikely event that either ida_get_new() or sd_format_disk_name() fail. Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>	2011-10-30 12:58:11 +04:00
Martin K. Petersen	c98a0eb0e9	[SCSI] sd: Logical Block Provisioning update SBC3r26 contains many changes to the Logical Block Provisioning interfaces (formerly known as Thin Provisioning ditto). This patch implements support for both the old and new schemes using the same heuristic as before (whether the LBP VPD page is present). The new code also allows the provisioning mode (i.e. choice of command) to be overridden on a per-device basis via sysfs. Two additional modes are supported in this version: - WRITE SAME(10) with the UNMAP bit set - WRITE SAME(10) without the UNMAP bit set. This allows us to support devices that predate the TP/LBP enhancements in SBC3 and which work by way zero-detection Switching between modes has been consolidated in a helper function that also updates the block layer topology according to the limitations of the chosen command. I experimented with trying WRITE SAME(16) if UNMAP fails, WRITE SAME(10) if WRITE SAME(16) fails, etc. but found several devices that got cranky. So for now we'll disable discard if one of the commands fail. The user still has the option of selecting a different mode in sysfs. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2011-03-14 18:37:34 -05:00
Tejun Heo	2bae0093ca	[SCSI] sd: implement sd_check_events() Replace sd_media_change() with sd_check_events(). * Move media removed logic into set_media_not_present() and media_not_present() and set sdev->changed iff an existing media is removed or the device indicates UNIT_ATTENTION. * Make sd_check_events() sets sdev->changed if previously missing media becomes present. * Event is reported only if sdev->changed is set. This makes media presence event reported if scsi_disk->media_present actually changed or the device indicated UNIT_ATTENTION. For backward compatibility, SDEV_EVT_MEDIA_CHANGE is generated each time sd_check_events() detects media change event. [jejb: fix boot failure] Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Jens Axboe <jaxboe@fusionio.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2011-01-14 09:17:34 -06:00
Martin K. Petersen	526f7c7950	[SCSI] sd: Fix overflow with big physical blocks The hw_sector_size variable could overflow if a device reported huge physical blocks. Switch to the more accurate physical_block_size terminology and make sure we use an unsigned int to match the range permitted by READ CAPACITY(16). Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-10-11 17:33:20 -05:00
Martin K. Petersen	045d3fe766	[SCSI] sd: Update thin provisioning support Add support for the Thin Provisioning VPD page and use the TPU and TPWS bits to switch between UNMAP and WRITE SAME(16) for discards. If no TP VPD page is present we fall back to old scheme where the max descriptor count combined with the max lba count are used trigger UNMAP. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-09-17 13:07:55 -04:00
Mike Christie	e3b3e62467	[SCSI] scsi/block: increase flush/sync timeout We have been seeing the flush request timeout with a wide range of hardware from tgt+iser to FC targets from a major vendor. After discussions about if the value should be configurable and what the best value should be, this patch just increases the flush/sync cache timeout to 1 minute. 2 minutes was determined to be too long, and making it configurable was troublesome for users. This patch was made over Linus's tree. It is not made over scsi-misc or scsi-rc-fixes, because Linus's had block layer changes that my patch was built over. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: Jens Axboe <jaxboe@fusionio.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2010-09-05 14:57:04 -03:00
Arnd Bergmann	409f3499a2	scsi/sd: remove big kernel lock Every user of the BKL in the sd driver is the result of the pushdown from the block layer into the open/close/ioctl functions. The only place that used to rely on the BKL is the sdkp->openers variable, which gets converted into an atomic_t. Nothing else seems to rely on the BKL, since the functions do not touch global data without holding another lock, and the open/close functions are still protected from concurrent execution using the bdev->bd_mutex. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: linux-scsi@vger.kernel.org Cc: "James E.J. Bottomley" <James.Bottomley@suse.de> Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-08-07 18:26:08 +02:00
Martin K. Petersen	e339c1a7c0	[SCSI] sd: WRITE SAME(16) / UNMAP support Implement a function for handling discard requests that sends either WRITE SAME(16) or UNMAP(10) depending on parameters indicated by the device in the block limits VPD. Extract unmap constraints and report them to the block layer. Based in part by a patch by Christoph Hellwig <hch@lst.de>. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2009-12-10 08:54:15 -06:00
Martin K. Petersen	4e7392ec58	[SCSI] sd: Support disks formatted with DIF Type 2 Disks formatted with DIF Type 2 reject READ/WRITE 6/10/12/16 commands when protection is enabled. Only the 32-byte variants are supported. Implement support for issusing 32-byte READ/WRITE and enable Type 2 drives in the protection type detection logic. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2009-10-02 09:47:04 -05:00
Martin K. Petersen	35e1a5d90b	[SCSI] sd: Detach DIF from block integrity infrastructure So far we have only issued DIF commands if CONFIG_BLK_DEV_INTEGRITY is enabled. However, communication between initiator and target should be independent of protection information DMA. There are DIF-only host adapters coming out that will be able to take advantage of this. Move the relevant DIF bits to sd.c. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2009-10-02 09:46:39 -05:00
Martin K. Petersen	ea09bcc9c2	sd: Physical block size and alignment support Extract physical block size and lowest aligned LBA from READ CAPACITY(16) response and adjust queue parameters. Report physical block size and alignment when applicable. [jejb: fix up trailing whitespace] Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-06-21 10:52:37 -05:00
Martin K. Petersen	70a9b87346	[SCSI] sd: Make revalidate less chatty sd_revalidate ends up being called several times during device setup. With this patch we print everything during the first scan. Subsequent invocations will only print a message if the parameter in question has actually changed (LUN capacity has increased, etc.). Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-03-17 21:43:52 -04:00
James Bottomley	4c393e6e45	[SCSI] sd: fix compile failure with CONFIG_BLK_DEV_INTEGRITY=n Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-10-15 08:41:28 -04:00
Martin K. Petersen	9e06688e7d	[SCSI] sd: Correctly handle all combinations of DIF and DIX The old detection code couldn't handle all possible combinations of DIX and DIF. This version does, giving priority to DIX if the controller is capable. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-10-13 09:28:58 -04:00
Linus Torvalds	18351070b8	Re-introduce "[SCSI] extend the last_sector_bug flag to cover more sectors" This re-introduces commit `2b14290078`, which was reverted due to the regression it caused by commit `fca082c9f1`. That regression was not root-caused by the original commit, it was just uncovered by it, and the real fix was done by Alan Stern in commit `580da34847` ("Fix USB storage hang on command abort"). We can thus re-introduce the change that was confirmed by Alan Jenkins to be still required by his odd card reader. Cc: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-05 21:42:21 -07:00
Linus Torvalds	fca082c9f1	Revert "[SCSI] extend the last_sector_bug flag to cover more sectors" This reverts commit `2b14290078`, since it seems to break some other USB storage devices (at least a JMicron USB to ATA bridge). As such, while it apparently fixes some cardreaders, it would need to be made conditional on the exact reader it fixes in order to avoid causing regressions. Cc: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-04 16:36:20 -07:00
Alan Jenkins	2b14290078	[SCSI] extend the last_sector_bug flag to cover more sectors The last_sector_bug flag was added to work around a bug in certain usb cardreaders, where they would crash if a multiple sector read included the last sector. The original implementation avoids this by e.g. splitting an 8 sector read which includes the last sector into a 7 sector read, and a single sector read for the last sector. The flag is enabled for all USB devices. This revealed a second bug in other usb cardreaders, which crash when they get a multiple sector read which stops 1 sector short of the last sector. Affected hardware includes the Kingston "MobileLite" external USB cardreader and the internal USB cardreader on the Asus EeePC. Extend the last_sector_bug workaround to ensure that any access which touches the last 8 hardware sectors of the device is a single sector long. Requests are shrunk as necessary to meet this constraint. This gives us a safety margin against potential unknown or future bugs affecting multi-sector access to the end of the device. The two known bugs only affect the last 2 sectors. However, they suggest that these devices are prone to fencepost errors and that multi-sector access to the end of the device is not well tested. Popular OS's use multi-sector accesses, but they rarely read the last few sectors. Linux (with udev & vol_id) automatically reads sectors from the end of the device on insertion. It is assumed that single sector accesses are more thoroughly tested during development. Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Tested-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-27 10:16:13 -04:00
Martin K. Petersen	af55ff675a	[SCSI] sd: Support for SCSI disk (SBC) Data Integrity Field Support for controllers and disks that implement DIF protection information: - During command preparation the RDPROTECT/WRPROTECT must be set correctly if the target has DIF enabled. - READ(6) and WRITE(6) are not supported when DIF is on. - The controller must be told how to handle the I/O via the protection operation field in scsi_cmnd. - Refactor the I/O completion code that extracts failed LBA from the returned sense data and handle DIF failures correctly. - sd_dif.c implements the functions required to prepare and complete requests with protection information attached. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-26 15:14:56 -04:00
Martin K. Petersen	e0597d7001	[SCSI] sd: Identify DIF protection type and application tag ownership If a disk is formatted with protection information (Inquiry bit PROTECT=1) it is required to support Read Capacity(16). Force use of the 16-bit command in this case and extract the P_TYPE field which indicates whether the disk is formatted using DIF Type 1, 2 or 3. The ATO (App Tag Own) bit in the Control Mode Page indicates whether the storage device or the initiator own the contents of the DIF application tag. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-26 15:14:55 -04:00
Martin K. Petersen	5b635da11e	[SCSI] sd: Move scsi_disk() accessor function to sd.h Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-12 08:22:31 -05:00
Martin K. Petersen	aa91696e56	[SCSI] sd: Move sd.h header file Christoph objected to having sd.h in include/scsi since it is internal to the sd driver. Move it to drivers/scsi/sd.h. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-12 08:22:31 -05:00

32 Commits