2006-01-11 12:17:46 -08:00
# include <linux/capability.h>
2005-04-16 15:20:36 -07:00
# include <linux/blkdev.h>
2011-05-26 16:00:52 -04:00
# include <linux/export.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 17:04:11 +09:00
# include <linux/gfp.h>
2005-04-16 15:20:36 -07:00
# include <linux/blkpg.h>
2006-01-08 01:02:50 -08:00
# include <linux/hdreg.h>
2016-01-06 12:03:42 -08:00
# include <linux/badblocks.h>
2005-04-16 15:20:36 -07:00
# include <linux/backing-dev.h>
2011-09-16 02:31:11 -04:00
# include <linux/fs.h>
2006-03-23 20:00:26 +01:00
# include <linux/blktrace_api.h>
2015-10-15 14:10:48 +02:00
# include <linux/pr.h>
2005-04-16 15:20:36 -07:00
# include <asm/uaccess.h>
static int blkpg_ioctl ( struct block_device * bdev , struct blkpg_ioctl_arg __user * arg )
{
struct block_device * bdevp ;
struct gendisk * disk ;
2012-08-01 12:24:18 +02:00
struct hd_struct * part , * lpart ;
2005-04-16 15:20:36 -07:00
struct blkpg_ioctl_arg a ;
struct blkpg_partition p ;
2008-09-03 09:03:02 +02:00
struct disk_part_iter piter ;
2005-04-16 15:20:36 -07:00
long long start , length ;
2008-09-03 09:01:09 +02:00
int partno ;
2005-04-16 15:20:36 -07:00
	if (!capable(CAP_SYS_ADMIN))
		return -EACCES;
	if (copy_from_user(&a, arg, sizeof(struct blkpg_ioctl_arg)))
		return -EFAULT;
	if (copy_from_user(&p, a.data, sizeof(struct blkpg_partition)))
		return -EFAULT;
	disk = bdev->bd_disk;
	if (bdev != bdev->bd_contains)
		return -EINVAL;
2008-09-03 09:01:09 +02:00
partno = p . pno ;
2008-08-25 19:56:15 +09:00
if ( partno < = 0 )
2005-04-16 15:20:36 -07:00
return - EINVAL ;
switch ( a . op ) {
case BLKPG_ADD_PARTITION :
start = p . start > > 9 ;
length = p . length > > 9 ;
2012-08-01 12:24:18 +02:00
/* check for fit in a hd_struct */
if ( sizeof ( sector_t ) = = sizeof ( long ) & &
2005-04-16 15:20:36 -07:00
sizeof ( long long ) > sizeof ( long ) ) {
long pstart = start , plength = length ;
if ( pstart ! = start | | plength ! = length
2012-09-17 11:47:13 +01:00
| | pstart < 0 | | plength < 0 | | partno > 65535 )
2005-04-16 15:20:36 -07:00
return - EINVAL ;
}
2008-08-25 19:30:16 +09:00
2006-03-23 03:00:28 -08:00
mutex_lock ( & bdev - > bd_mutex ) ;
2008-08-25 19:30:16 +09:00
2005-04-16 15:20:36 -07:00
/* overlap? */
2008-09-03 09:03:02 +02:00
disk_part_iter_init ( & piter , disk ,
DISK_PITER_INCL_EMPTY ) ;
while ( ( part = disk_part_iter_next ( & piter ) ) ) {
if ( ! ( start + length < = part - > start_sect | |
start > = part - > start_sect + part - > nr_sects ) ) {
disk_part_iter_exit ( & piter ) ;
2006-03-23 03:00:28 -08:00
mutex_unlock ( & bdev - > bd_mutex ) ;
2005-04-16 15:20:36 -07:00
return - EBUSY ;
}
}
2008-09-03 09:03:02 +02:00
disk_part_iter_exit ( & piter ) ;
2005-04-16 15:20:36 -07:00
/* all seems OK */
2008-11-10 15:29:58 +09:00
part = add_partition ( disk , partno , start , length ,
2010-08-31 15:47:05 -05:00
ADDPART_FLAG_NONE , NULL ) ;
2006-03-23 03:00:28 -08:00
mutex_unlock ( & bdev - > bd_mutex ) ;
2013-11-06 15:56:39 +08:00
return PTR_ERR_OR_ZERO ( part ) ;
2005-04-16 15:20:36 -07:00
case BLKPG_DEL_PARTITION :
2008-09-03 09:03:02 +02:00
part = disk_get_part ( disk , partno ) ;
if ( ! part )
2005-04-16 15:20:36 -07:00
return - ENXIO ;
2008-09-03 09:03:02 +02:00
bdevp = bdget ( part_devt ( part ) ) ;
disk_put_part ( part ) ;
2005-04-16 15:20:36 -07:00
if ( ! bdevp )
return - ENOMEM ;
2008-09-03 09:03:02 +02:00
2006-12-08 02:36:13 -08:00
mutex_lock ( & bdevp - > bd_mutex ) ;
2005-04-16 15:20:36 -07:00
if ( bdevp - > bd_openers ) {
2006-03-23 03:00:28 -08:00
mutex_unlock ( & bdevp - > bd_mutex ) ;
2005-04-16 15:20:36 -07:00
bdput ( bdevp ) ;
return - EBUSY ;
}
/* all seems OK */
fsync_bdev ( bdevp ) ;
2007-05-06 14:49:54 -07:00
invalidate_bdev ( bdevp ) ;
2005-04-16 15:20:36 -07:00
2007-02-20 13:58:18 -08:00
mutex_lock_nested ( & bdev - > bd_mutex , 1 ) ;
2008-09-03 09:01:09 +02:00
delete_partition ( disk , partno ) ;
2006-03-23 03:00:28 -08:00
mutex_unlock ( & bdev - > bd_mutex ) ;
mutex_unlock ( & bdevp - > bd_mutex ) ;
2005-04-16 15:20:36 -07:00
bdput ( bdevp ) ;
2012-08-01 12:24:18 +02:00
return 0 ;
	case BLKPG_RESIZE_PARTITION:
		start = p.start >> 9;
		/* new partition length, converted from bytes to 512-byte sectors */
		length = p.length >> 9;
		/* check for fit in a hd_struct */
		if (sizeof(sector_t) == sizeof(long) &&
		    sizeof(long long) > sizeof(long)) {
			long pstart = start, plength = length;
			if (pstart != start || plength != length
			    || pstart < 0 || plength < 0)
				return -EINVAL;
		}
		part = disk_get_part(disk, partno);
		if (!part)
			return -ENXIO;
		bdevp = bdget(part_devt(part));
		if (!bdevp) {
			disk_put_part(part);
			return -ENOMEM;
		}
		mutex_lock(&bdevp->bd_mutex);
		mutex_lock_nested(&bdev->bd_mutex, 1);
		if (start != part->start_sect) {
			mutex_unlock(&bdevp->bd_mutex);
			mutex_unlock(&bdev->bd_mutex);
			bdput(bdevp);
			disk_put_part(part);
			return -EINVAL;
		}
		/* overlap? */
		disk_part_iter_init(&piter, disk,
				    DISK_PITER_INCL_EMPTY);
		while ((lpart = disk_part_iter_next(&piter))) {
			if (lpart->partno != partno &&
			    !(start + length <= lpart->start_sect ||
			      start >= lpart->start_sect + lpart->nr_sects)) {
				disk_part_iter_exit(&piter);
				mutex_unlock(&bdevp->bd_mutex);
				mutex_unlock(&bdev->bd_mutex);
				bdput(bdevp);
				disk_put_part(part);
				return -EBUSY;
			}
		}
		disk_part_iter_exit(&piter);
		part_nr_sects_write(part, (sector_t)length);
		i_size_write(bdevp->bd_inode, p.length);
		mutex_unlock(&bdevp->bd_mutex);
		mutex_unlock(&bdev->bd_mutex);
		bdput(bdevp);
		disk_put_part(part);
2005-04-16 15:20:36 -07:00
return 0 ;
default :
return - EINVAL ;
}
}
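The "overlap?" loops in blkpg_ioctl() treat every partition as a half-open sector range [start, start + length). The predicate can be sketched in userspace as below. This is only an illustration: `sector_range` and `ranges_overlap` are hypothetical names, not kernel APIs.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the start_sect/nr_sects pair of a partition. */
struct sector_range {
	long long start;	/* first sector */
	long long length;	/* number of sectors */
};

/*
 * Two half-open ranges [start, start + length) are disjoint when one ends
 * at or before the point where the other begins; anything else overlaps.
 * This mirrors the negated condition used in the "overlap?" loops.
 */
bool ranges_overlap(struct sector_range a, struct sector_range b)
{
	return !(a.start + a.length <= b.start ||
		 a.start >= b.start + b.length);
}
```

Ranges that merely touch (one ends exactly where the next begins) do not count as overlapping, which is why a new partition may start on the sector right after an existing one.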
2015-05-06 12:26:22 +08:00
/*
 * This is an exported API for the block driver, and will not
 * acquire bd_mutex. This API should be used when the caller
 * already holds bd_mutex.
 */
int __blkdev_reread_part(struct block_device *bdev)
2005-04-16 15:20:36 -07:00
{
struct gendisk * disk = bdev - > bd_disk ;
2011-08-23 20:01:04 +02:00
if ( ! disk_part_scan_enabled ( disk ) | | bdev ! = bdev - > bd_contains )
2005-04-16 15:20:36 -07:00
return - EINVAL ;
if ( ! capable ( CAP_SYS_ADMIN ) )
return - EACCES ;
2015-05-06 12:26:22 +08:00
lockdep_assert_held ( & bdev - > bd_mutex ) ;
return rescan_partitions ( disk , bdev ) ;
}
EXPORT_SYMBOL ( __blkdev_reread_part ) ;
/*
 * This is an exported API for the block driver, and will
 * try to acquire bd_mutex. If bd_mutex has been held already
 * in current context, please call __blkdev_reread_part().
block: replace trylock with mutex_lock in blkdev_reread_part()
The only possible problem of using mutex_lock() instead of trylock
is about deadlock.
If there aren't any locks held before calling blkdev_reread_part(),
deadlock can't be caused by this conversion.
If there are locks held before calling blkdev_reread_part(),
and if these locks aren't required in open, close handler and I/O
path, deadlock shouldn't be caused either.
Both user space's ioctl(BLKRRPART) and md_setup_drive() from
init/do_mounts_md.c belong to the 1st case, so the conversion is safe
for the two cases.
For loop, the previous patches in this patchset have fixed the ABBA lock
dependency, so the conversion is OK.
For nbd, tx_lock is held when calling the function:
- both open and release won't hold the lock
- when blkdev_reread_part() is run, I/O thread has been stopped
already, so tx_lock won't be acquired in I/O path at that time.
- so the conversion won't cause deadlock for nbd
For dasd, dasd_open(), dasd_release() and the request function don't
acquire any mutex/semaphore, so the conversion should be safe.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-06 12:26:27 +08:00
 *
 * Make sure the held locks in current context aren't required
 * in open()/close() handler and I/O path for avoiding ABBA deadlock:
 * - bd_mutex is held before calling block driver's open/close
 *   handler
 * - reading partition table may submit I/O to the block device
2015-05-06 12:26:22 +08:00
*/
int blkdev_reread_part ( struct block_device * bdev )
{
int res ;
2015-05-06 12:26:27 +08:00
mutex_lock ( & bdev - > bd_mutex ) ;
2015-05-06 12:26:22 +08:00
res = __blkdev_reread_part ( bdev ) ;
2006-03-23 03:00:28 -08:00
mutex_unlock ( & bdev - > bd_mutex ) ;
2015-05-06 12:26:22 +08:00
2005-04-16 15:20:36 -07:00
return res ;
}
2015-05-06 12:26:22 +08:00
EXPORT_SYMBOL ( blkdev_reread_part ) ;
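__blkdev_reread_part() and blkdev_reread_part() form a locked/unlocked pair: the first expects bd_mutex to be held by the caller, the second takes it itself. A rough userspace sketch of that split follows, with a pthread mutex standing in for bd_mutex. The names `toy_bdev`, `toy_reread_part_locked` and `toy_reread_part` are hypothetical, and the "rescan" is reduced to a counter.

```c
#include <pthread.h>

/* Hypothetical stand-in for struct block_device. */
struct toy_bdev {
	pthread_mutex_t mutex;	/* plays the role of bd_mutex */
	int rescans;		/* number of completed rescans */
};

/* Caller must already hold dev->mutex, like __blkdev_reread_part(). */
int toy_reread_part_locked(struct toy_bdev *dev)
{
	dev->rescans++;
	return 0;
}

/* Takes dev->mutex itself, like blkdev_reread_part(). */
int toy_reread_part(struct toy_bdev *dev)
{
	int res;

	pthread_mutex_lock(&dev->mutex);
	res = toy_reread_part_locked(dev);
	pthread_mutex_unlock(&dev->mutex);
	return res;
}
```

Calling the locking variant while already holding the same non-recursive mutex would self-deadlock, which is why callers that hold the lock need the `_locked` form.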
2005-04-16 15:20:36 -07:00
2015-10-15 14:10:47 +02:00
static int blk_ioctl_discard ( struct block_device * bdev , fmode_t mode ,
unsigned long arg , unsigned long flags )
2008-08-11 15:58:42 +01:00
{
2015-10-15 14:10:47 +02:00
	uint64_t range[2];
	uint64_t start, len;

	if (!(mode & FMODE_WRITE))
		return -EBADF;

	if (copy_from_user(range, (void __user *)arg, sizeof(range)))
		return -EFAULT;

	start = range[0];
	len = range[1];
2010-08-11 14:17:49 -07:00
2008-08-11 15:58:42 +01:00
	if (start & 511)
		return -EINVAL;
	if (len & 511)
		return -EINVAL;
	start >>= 9;
	len >>= 9;
2010-11-08 14:39:12 +01:00
if ( start + len > ( i_size_read ( bdev - > bd_inode ) > > 9 ) )
2008-08-11 15:58:42 +01:00
return - EINVAL ;
2010-08-11 14:17:49 -07:00
return blkdev_issue_discard ( bdev , start , len , GFP_KERNEL , flags ) ;
2008-08-11 15:58:42 +01:00
}
2015-10-15 14:10:47 +02:00
static int blk_ioctl_zeroout ( struct block_device * bdev , fmode_t mode ,
unsigned long arg )
2012-09-18 12:19:29 -04:00
{
2015-10-15 14:10:47 +02:00
	uint64_t range[2];
	uint64_t start, len;

	if (!(mode & FMODE_WRITE))
		return -EBADF;

	if (copy_from_user(range, (void __user *)arg, sizeof(range)))
		return -EFAULT;

	start = range[0];
	len = range[1];
2012-09-18 12:19:29 -04:00
	if (start & 511)
		return -EINVAL;
	if (len & 511)
		return -EINVAL;
	start >>= 9;
	len >>= 9;

	if (start + len > (i_size_read(bdev->bd_inode) >> 9))
		return -EINVAL;
2015-01-20 20:06:30 -05:00
return blkdev_issue_zeroout ( bdev , start , len , GFP_KERNEL , false ) ;
2012-09-18 12:19:29 -04:00
}
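blk_ioctl_discard() and blk_ioctl_zeroout() share the same argument validation: both byte values must be multiples of 512, and the resulting sector range must fit inside the device. A minimal userspace sketch of just that arithmetic follows. `byte_range_to_sectors` is a hypothetical helper, and the device size is passed in rather than read from an inode.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Validate a byte range {start, len} against a device of dev_bytes bytes,
 * following the same checks as the two ioctl helpers. On success the range
 * is stored in 512-byte sectors and 0 is returned; otherwise -EINVAL.
 */
int byte_range_to_sectors(uint64_t start, uint64_t len, uint64_t dev_bytes,
			  uint64_t *start_sect, uint64_t *nr_sects)
{
	if (start & 511)	/* start not sector aligned */
		return -EINVAL;
	if (len & 511)		/* length not a whole number of sectors */
		return -EINVAL;
	start >>= 9;
	len >>= 9;
	if (start + len > (dev_bytes >> 9))	/* runs past end of device */
		return -EINVAL;
	*start_sect = start;
	*nr_sects = len;
	return 0;
}
```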
2005-04-16 15:20:36 -07:00
static int put_ushort(unsigned long arg, unsigned short val)
{
	return put_user(val, (unsigned short __user *)arg);
}

static int put_int(unsigned long arg, int val)
{
	return put_user(val, (int __user *)arg);
}
2009-10-03 20:52:01 +02:00
static int put_uint(unsigned long arg, unsigned int val)
{
	return put_user(val, (unsigned int __user *)arg);
}
2005-04-16 15:20:36 -07:00
static int put_long(unsigned long arg, long val)
{
	return put_user(val, (long __user *)arg);
}

static int put_ulong(unsigned long arg, unsigned long val)
{
	return put_user(val, (unsigned long __user *)arg);
}

static int put_u64(unsigned long arg, u64 val)
{
	return put_user(val, (u64 __user *)arg);
}
2007-08-29 20:34:12 -04:00
int __blkdev_driver_ioctl ( struct block_device * bdev , fmode_t mode ,
unsigned cmd , unsigned long arg )
{
struct gendisk * disk = bdev - > bd_disk ;
[PATCH] beginning of methods conversion
To keep the size of changesets sane we split the switch by drivers;
to keep the damn thing bisectable we do the following:
1) rename the affected methods, add ones with correct
prototypes, make (few) callers handle both. That's this changeset.
2) for each driver convert to new methods. *ALL* drivers
are converted in this series.
3) kill the old (renamed) methods.
Note that it _is_ a flagday; all in-tree drivers are converted and by the
end of this series no trace of old methods remain. The only reason why
we do that this way is to keep the damn thing bisectable and allow per-driver
debugging if anything goes wrong.
New methods:
open(bdev, mode)
release(disk, mode)
ioctl(bdev, mode, cmd, arg) /* Called without BKL */
compat_ioctl(bdev, mode, cmd, arg)
locked_ioctl(bdev, mode, cmd, arg) /* Called with BKL, legacy */
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-02 09:09:22 -05:00
if ( disk - > fops - > ioctl )
return disk - > fops - > ioctl ( bdev , mode , cmd , arg ) ;
2007-08-29 20:34:12 -04:00
return - ENOTTY ;
}
/*
 * For the record: _GPL here is only because somebody decided to slap it
 * on the previous export. Sheer idiocy, since it wasn't copyrightable
 * at all and could be open-coded without any exports by anybody who cares.
 */
EXPORT_SYMBOL_GPL(__blkdev_driver_ioctl);
2015-10-15 14:10:48 +02:00
static int blkdev_pr_register(struct block_device *bdev,
		struct pr_registration __user *arg)
{
	const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
	struct pr_registration reg;

	if (!capable(CAP_SYS_ADMIN))
		return -EPERM;
	if (!ops || !ops->pr_register)
		return -EOPNOTSUPP;
	if (copy_from_user(&reg, arg, sizeof(reg)))
		return -EFAULT;

	if (reg.flags & ~PR_FL_IGNORE_KEY)
		return -EOPNOTSUPP;
	return ops->pr_register(bdev, reg.old_key, reg.new_key, reg.flags);
}

static int blkdev_pr_reserve(struct block_device *bdev,
		struct pr_reservation __user *arg)
{
	const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
	struct pr_reservation rsv;

	if (!capable(CAP_SYS_ADMIN))
		return -EPERM;
	if (!ops || !ops->pr_reserve)
		return -EOPNOTSUPP;
	if (copy_from_user(&rsv, arg, sizeof(rsv)))
		return -EFAULT;

	if (rsv.flags & ~PR_FL_IGNORE_KEY)
		return -EOPNOTSUPP;
	return ops->pr_reserve(bdev, rsv.key, rsv.type, rsv.flags);
}

static int blkdev_pr_release(struct block_device *bdev,
		struct pr_reservation __user *arg)
{
	const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
	struct pr_reservation rsv;

	if (!capable(CAP_SYS_ADMIN))
		return -EPERM;
	if (!ops || !ops->pr_release)
		return -EOPNOTSUPP;
	if (copy_from_user(&rsv, arg, sizeof(rsv)))
		return -EFAULT;

	if (rsv.flags)
		return -EOPNOTSUPP;
	return ops->pr_release(bdev, rsv.key, rsv.type);
}

static int blkdev_pr_preempt(struct block_device *bdev,
		struct pr_preempt __user *arg, bool abort)
{
	const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
	struct pr_preempt p;

	if (!capable(CAP_SYS_ADMIN))
		return -EPERM;
	if (!ops || !ops->pr_preempt)
		return -EOPNOTSUPP;
	if (copy_from_user(&p, arg, sizeof(p)))
		return -EFAULT;

	if (p.flags)
		return -EOPNOTSUPP;
	return ops->pr_preempt(bdev, p.old_key, p.new_key, p.type, abort);
}

static int blkdev_pr_clear(struct block_device *bdev,
		struct pr_clear __user *arg)
{
	const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
	struct pr_clear c;

	if (!capable(CAP_SYS_ADMIN))
		return -EPERM;
	if (!ops || !ops->pr_clear)
		return -EOPNOTSUPP;
	if (copy_from_user(&c, arg, sizeof(c)))
		return -EFAULT;

	if (c.flags)
		return -EOPNOTSUPP;
	return ops->pr_clear(bdev, c.key);
}
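Each persistent-reservation helper rejects flag bits it does not understand before calling into the driver: register and reserve accept only PR_FL_IGNORE_KEY, while release, preempt and clear accept no flags at all. That check can be sketched on its own as below. `TOY_FL_IGNORE_KEY` and `check_flags` are hypothetical names, and the bit value is an assumption made for the example, not the value from <linux/pr.h>.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag bit; stands in for PR_FL_IGNORE_KEY. */
#define TOY_FL_IGNORE_KEY (1u << 0)

/*
 * Return 0 if every bit set in flags is also set in allowed,
 * -EOPNOTSUPP if the caller passed any unknown bit.
 */
int check_flags(uint32_t flags, uint32_t allowed)
{
	if (flags & ~allowed)
		return -EOPNOTSUPP;
	return 0;
}
```

Passing `allowed == 0` reproduces the stricter `if (flags)` test used by the release, preempt and clear paths.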
2012-01-05 15:40:12 -08:00
/*
 * Is it an unrecognized ioctl? The correct returns are either
 * ENOTTY (final) or ENOIOCTLCMD ("I don't know this one, try a
 * fallback"). ENOIOCTLCMD gets turned into ENOTTY by the ioctl
 * code before returning.
 *
 * Confused drivers sometimes return EINVAL, which is wrong. It
 * means "I understood the ioctl command, but the parameters to
 * it were wrong".
 *
 * We should aim to just fix the broken drivers, the EINVAL case
 * should go away.
 */
static inline int is_unrecognized_ioctl(int ret)
{
	return	ret == -EINVAL ||
		ret == -ENOTTY ||
		ret == -ENOIOCTLCMD;
}
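is_unrecognized_ioctl() exists so that generic code can offer a command to the driver first and only fall back to its own handling when the driver did not recognize it. A small userspace sketch of that pattern follows. All `toy_*` names are hypothetical; `TOY_ENOIOCTLCMD` copies the kernel-internal value 515 because ENOIOCTLCMD is not exposed by the userspace <errno.h>.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_ENOIOCTLCMD 515	/* kernel-internal errno, redefined for the sketch */

typedef int (*toy_ioctl_fn)(unsigned cmd);

int toy_is_unrecognized(int ret)
{
	return ret == -EINVAL || ret == -ENOTTY || ret == -TOY_ENOIOCTLCMD;
}

/* Pretend driver: only understands command 1. */
int toy_driver(unsigned cmd)
{
	return cmd == 1 ? 7 : -ENOTTY;
}

/* Pretend generic handler: only understands command 2. */
int toy_generic(unsigned cmd)
{
	return cmd == 2 ? 0 : -ENOTTY;
}

/*
 * Offer cmd to the driver first; use the generic handler only when the
 * driver is missing or reports the command as unrecognized.
 */
int toy_dispatch(toy_ioctl_fn driver, toy_ioctl_fn generic, unsigned cmd)
{
	int ret = driver ? driver(cmd) : -ENOTTY;

	if (!toy_is_unrecognized(ret))
		return ret;
	return generic(cmd);
}
```

Treating -EINVAL as "unrecognized" is the compromise the comment describes: it keeps confused drivers working, at the cost of occasionally retrying a command the driver really did understand.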
2015-11-30 10:20:29 -08:00
#ifdef CONFIG_FS_DAX
bool blkdev_dax_capable(struct block_device *bdev)
{
	struct gendisk *disk = bdev->bd_disk;

	if (!disk->fops->direct_access)
		return false;

	/*
	 * If the partition is not aligned on a page boundary, we can't
	 * do dax I/O to it.
	 */
	if ((bdev->bd_part->start_sect % (PAGE_SIZE / 512))
	    || (bdev->bd_part->nr_sects % (PAGE_SIZE / 512)))
		return false;
2016-01-06 12:03:42 -08:00
	/*
	 * If the device has known bad blocks, force all I/O through the
	 * driver / page cache.
	 *
	 * TODO: support finer grained dax error handling
	 */
	if (disk->bb && disk->bb->count)
		return false;
2015-11-30 10:20:29 -08:00
return true ;
}
# endif
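The alignment test in blkdev_dax_capable() asks whether both the first sector and the sector count of the partition are multiples of the number of 512-byte sectors in one page. A userspace sketch of that check follows. `TOY_PAGE_SIZE` is an assumed 4096-byte page (the real PAGE_SIZE depends on the architecture), and `toy_partition_page_aligned` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/*
 * True when a partition described in 512-byte sectors starts and ends
 * on a page boundary.
 */
bool toy_partition_page_aligned(uint64_t start_sect, uint64_t nr_sects)
{
	const uint64_t sectors_per_page = TOY_PAGE_SIZE / 512;

	return (start_sect % sectors_per_page) == 0 &&
	       (nr_sects % sectors_per_page) == 0;
}
```

With 4 KiB pages this means multiples of 8 sectors, so a partition starting at the legacy offset of sector 63 fails the check while one starting at sector 2048 passes.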
2015-10-15 14:10:47 +02:00
static int blkdev_flushbuf ( struct block_device * bdev , fmode_t mode ,
unsigned cmd , unsigned long arg )
2005-06-23 00:10:15 -07:00
{
2015-10-15 14:10:47 +02:00
int ret ;
2005-06-23 00:10:15 -07:00
2015-10-15 14:10:47 +02:00
if ( ! capable ( CAP_SYS_ADMIN ) )
return - EACCES ;
2005-06-23 00:10:15 -07:00
2015-10-15 14:10:47 +02:00
ret = __blkdev_driver_ioctl ( bdev , mode , cmd , arg ) ;
if ( ! is_unrecognized_ioctl ( ret ) )
return ret ;
2005-06-23 00:10:15 -07:00
2015-10-15 14:10:47 +02:00
fsync_bdev ( bdev ) ;
invalidate_bdev ( bdev ) ;
return 0 ;
}
2008-08-11 15:58:42 +01:00
2015-10-15 14:10:47 +02:00
static int blkdev_roset ( struct block_device * bdev , fmode_t mode ,
unsigned cmd , unsigned long arg )
{
int ret , n ;
2008-08-11 15:58:42 +01:00
2015-10-15 14:10:47 +02:00
ret = __blkdev_driver_ioctl ( bdev , mode , cmd , arg ) ;
if ( ! is_unrecognized_ioctl ( ret ) )
return ret ;
if ( ! capable ( CAP_SYS_ADMIN ) )
return - EACCES ;
if ( get_user ( n , ( int __user * ) arg ) )
return - EFAULT ;
set_device_ro ( bdev , n ) ;
return 0 ;
}
2008-08-11 15:58:42 +01:00
2015-10-15 14:10:47 +02:00
static int blkdev_getgeo ( struct block_device * bdev ,
struct hd_geometry __user * argp )
{
struct gendisk * disk = bdev - > bd_disk ;
struct hd_geometry geo ;
int ret ;
2008-08-11 15:58:42 +01:00
2015-10-15 14:10:47 +02:00
	if (!argp)
		return -EINVAL;
	if (!disk->fops->getgeo)
		return -ENOTTY;

	/*
	 * We need to set the startsect first, the driver may
	 * want to override it.
	 */
	memset(&geo, 0, sizeof(geo));
	geo.start = get_start_sect(bdev);
	ret = disk->fops->getgeo(bdev, &geo);
	if (ret)
		return ret;
	if (copy_to_user(argp, &geo, sizeof(geo)))
		return -EFAULT;
	return 0;
}
2012-09-18 12:19:29 -04:00
2015-10-15 14:10:47 +02:00
/* set the logical block size */
static int blkdev_bszset ( struct block_device * bdev , fmode_t mode ,
int __user * argp )
{
int ret , n ;
2012-09-18 12:19:29 -04:00
2015-10-15 14:10:47 +02:00
if ( ! capable ( CAP_SYS_ADMIN ) )
return - EACCES ;
if ( ! argp )
return - EINVAL ;
if ( get_user ( n , argp ) )
return - EFAULT ;
2012-09-18 12:19:29 -04:00
2015-10-15 14:10:47 +02:00
if ( ! ( mode & FMODE_EXCL ) ) {
bdgrab ( bdev ) ;
if ( blkdev_get ( bdev , mode | FMODE_EXCL , & bdev ) < 0 )
return - EBUSY ;
2012-09-18 12:19:29 -04:00
}
2008-08-11 15:58:42 +01:00
2015-10-15 14:10:47 +02:00
ret = set_blocksize ( bdev , n ) ;
if ( ! ( mode & FMODE_EXCL ) )
blkdev_put ( bdev , mode | FMODE_EXCL ) ;
return ret ;
}
2006-01-08 01:02:50 -08:00
2015-10-15 14:10:47 +02:00
/*
* always keep this in sync with compat_blkdev_ioctl ( )
*/
int blkdev_ioctl(struct block_device *bdev, fmode_t mode, unsigned cmd,
		 unsigned long arg)
{
	struct backing_dev_info *bdi;
	void __user *argp = (void __user *)arg;
	loff_t size;
	unsigned int max_sectors;

	switch (cmd) {
	case BLKFLSBUF:
		return blkdev_flushbuf(bdev, mode, cmd, arg);
	case BLKROSET:
		return blkdev_roset(bdev, mode, cmd, arg);
	case BLKDISCARD:
		return blk_ioctl_discard(bdev, mode, arg, 0);
	case BLKSECDISCARD:
		return blk_ioctl_discard(bdev, mode, arg,
				BLKDEV_DISCARD_SECURE);
	case BLKZEROOUT:
		return blk_ioctl_zeroout(bdev, mode, arg);
	case HDIO_GETGEO:
		return blkdev_getgeo(bdev, argp);
2008-09-18 15:53:24 -04:00
case BLKRAGET :
case BLKFRAGET :
if ( ! arg )
return - EINVAL ;
bdi = blk_get_backing_dev_info ( bdev ) ;
mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.
This promise never materialized, and it is unlikely it ever will.
We have many places where PAGE_CACHE_SIZE is assumed to be equal to
PAGE_SIZE. And it's a constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.
The only adjustment after coccinelle is revert of changes to
PAGE_CACHE_ALIGN definition: we are going to drop it later.
There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-01 15:29:47 +03:00
return put_long ( arg , ( bdi - > ra_pages * PAGE_SIZE ) / 512 ) ;
2008-09-18 15:53:24 -04:00
case BLKROGET :
return put_int ( arg , bdev_read_only ( bdev ) ! = 0 ) ;
2009-10-03 20:52:01 +02:00
case BLKBSZGET : /* get block device soft block size (cf. BLKSSZGET) */
2008-09-18 15:53:24 -04:00
return put_int ( arg , block_size ( bdev ) ) ;
2009-10-03 20:52:01 +02:00
case BLKSSZGET : /* get block device logical block size */
2009-05-22 17:17:49 -04:00
return put_int ( arg , bdev_logical_block_size ( bdev ) ) ;
2009-10-03 20:52:01 +02:00
case BLKPBSZGET : /* get block device physical block size */
return put_uint ( arg , bdev_physical_block_size ( bdev ) ) ;
case BLKIOMIN :
return put_uint ( arg , bdev_io_min ( bdev ) ) ;
case BLKIOOPT :
return put_uint ( arg , bdev_io_opt ( bdev ) ) ;
case BLKALIGNOFF :
return put_int ( arg , bdev_alignment_offset ( bdev ) ) ;
2009-12-03 09:24:48 +01:00
case BLKDISCARDZEROES :
return put_uint ( arg , bdev_discard_zeroes_data ( bdev ) ) ;
2008-09-18 15:53:24 -04:00
case BLKSECTGET :
2014-05-25 21:43:33 +09:00
max_sectors = min_t ( unsigned int , USHRT_MAX ,
queue_max_sectors ( bdev_get_queue ( bdev ) ) ) ;
return put_ushort ( arg , max_sectors ) ;
2012-01-11 16:29:31 +01:00
case BLKROTATIONAL :
return put_ushort ( arg , ! blk_queue_nonrot ( bdev_get_queue ( bdev ) ) ) ;
2008-09-18 15:53:24 -04:00
case BLKRASET :
case BLKFRASET :
if ( ! capable ( CAP_SYS_ADMIN ) )
return - EACCES ;
bdi = blk_get_backing_dev_info ( bdev ) ;
2016-04-01 15:29:47 +03:00
bdi - > ra_pages = ( arg * 512 ) / PAGE_SIZE ;
2008-09-18 15:53:24 -04:00
return 0 ;
case BLKBSZSET :
2015-10-15 14:10:47 +02:00
return blkdev_bszset ( bdev , mode , argp ) ;
2008-09-18 15:53:24 -04:00
case BLKPG :
2015-10-15 14:10:47 +02:00
return blkpg_ioctl ( bdev , argp ) ;
2008-09-18 15:53:24 -04:00
case BLKRRPART :
2015-10-15 14:10:47 +02:00
return blkdev_reread_part ( bdev ) ;
2008-09-18 15:53:24 -04:00
case BLKGETSIZE :
2010-11-08 14:39:12 +01:00
size = i_size_read ( bdev - > bd_inode ) ;
2008-09-18 15:53:24 -04:00
if ( ( size > > 9 ) > ~ 0UL )
return - EFBIG ;
return put_ulong ( arg , size > > 9 ) ;
case BLKGETSIZE64 :
2010-11-08 14:39:12 +01:00
return put_u64 ( arg , i_size_read ( bdev - > bd_inode ) ) ;
2008-09-18 15:53:24 -04:00
case BLKTRACESTART :
case BLKTRACESTOP :
case BLKTRACESETUP :
case BLKTRACETEARDOWN :
2015-10-15 14:10:47 +02:00
return blk_trace_ioctl ( bdev , cmd , argp ) ;
2015-11-30 10:20:29 -08:00
	case BLKDAXGET:
		return put_int(arg, !!(bdev->bd_inode->i_flags & S_DAX));
2015-10-15 14:10:48 +02:00
case IOC_PR_REGISTER :
return blkdev_pr_register ( bdev , argp ) ;
case IOC_PR_RESERVE :
return blkdev_pr_reserve ( bdev , argp ) ;
case IOC_PR_RELEASE :
return blkdev_pr_release ( bdev , argp ) ;
case IOC_PR_PREEMPT :
return blkdev_pr_preempt ( bdev , argp , false ) ;
case IOC_PR_PREEMPT_ABORT :
return blkdev_pr_preempt ( bdev , argp , true ) ;
case IOC_PR_CLEAR :
return blkdev_pr_clear ( bdev , argp ) ;
2008-09-18 15:53:24 -04:00
default :
2015-10-15 14:10:47 +02:00
return __blkdev_driver_ioctl ( bdev , mode , cmd , arg ) ;
2008-09-18 15:53:24 -04:00
}
2005-04-16 15:20:36 -07:00
}
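BLKRAGET/BLKFRAGET and BLKRASET/BLKFRASET expose the readahead window in 512-byte sectors, while the backing device stores it in pages, so blkdev_ioctl() converts in both directions. The arithmetic can be sketched as below. `TOY_PAGE_SIZE` is an assumed 4096-byte page, and both function names are hypothetical.

```c
#define TOY_PAGE_SIZE 4096ul	/* assumption: 4 KiB pages */

/* Pages to 512-byte sectors, as reported by BLKRAGET. */
unsigned long ra_pages_to_sectors(unsigned long ra_pages)
{
	return (ra_pages * TOY_PAGE_SIZE) / 512;
}

/* 512-byte sectors to whole pages, as stored by BLKRASET (rounds down). */
unsigned long ra_sectors_to_pages(unsigned long sectors)
{
	return (sectors * 512) / TOY_PAGE_SIZE;
}
```

Because the set path rounds down to whole pages, a request smaller than one page (fewer than 8 sectors under this assumption) ends up as a readahead of zero pages.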
[PATCH] Fix root hole in raw device
[Patch] Fix raw device ioctl pass-through
Raw character devices are supposed to pass ioctls through to the block
devices they are bound to. Unfortunately, they are using the wrong
function for this: ioctl_by_bdev(), instead of blkdev_ioctl().
ioctl_by_bdev() performs a set_fs(KERNEL_DS) before calling the ioctl,
redirecting the user-space buffer access to the kernel address space.
This is, needless to say, a bad thing.
This was noticed first on s390, where raw IO was non-functioning. The
s390 driver config does not actually allow raw IO to be enabled, which
was the first part of the problem. Secondly, the s390 kernel address
space is distinct from user, causing legal raw ioctls to fail. I've
reproduced this on a kernel built with 4G:4G split on x86, which fails
in the same way (-EFAULT if the address does not exist kernel-side;
returns success without actually populating the user buffer if it does.)
The patch below fixes both the config and address-space problems. It's
based closely on a patch by Jan Glauber <jang@de.ibm.com>, which has
been tested on s390 at IBM. I've tested it on x86 4G:4G (split address
space) and x86_64 (common address space).
Kernel-address-space access has been assigned CAN-2005-1264.
Signed-off-by: Stephen Tweedie <sct@redhat.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-05-13 23:31:19 -04:00
EXPORT_SYMBOL_GPL ( blkdev_ioctl ) ;
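From userspace, the commands handled by blkdev_ioctl() are reached through ioctl(2) on an open block device. A short sketch follows that queries the device size (BLKGETSIZE64, answered through put_u64()) and the logical block size (BLKSSZGET, answered through put_int()). `query_blockdev` is a hypothetical helper, and the sketch assumes a Linux system with the kernel UAPI headers installed.

```c
#include <fcntl.h>
#include <linux/fs.h>	/* BLKGETSIZE64, BLKSSZGET */
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Query the size in bytes and the logical block size of the block device
 * at path. Returns 0 on success and -1 on any failure (open or ioctl).
 */
int query_blockdev(const char *path, uint64_t *size_bytes, int *logical_bs)
{
	int fd = open(path, O_RDONLY);
	int ret = -1;

	if (fd < 0)
		return -1;
	if (ioctl(fd, BLKGETSIZE64, size_bytes) == 0 &&
	    ioctl(fd, BLKSSZGET, logical_bs) == 0)
		ret = 0;
	close(fd);
	return ret;
}
```

Opening a real device node such as a disk or partition usually requires elevated privileges. On a path that does not exist or is not a block device, the helper simply returns -1.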