/*
 * Internal header file for device mapper
 *
 * Copyright (C) 2001, 2002 Sistina Software
 * Copyright (C) 2004-2006 Red Hat, Inc. All rights reserved.
 *
 * This file is released under the LGPL.
 */

#ifndef DM_INTERNAL_H
#define DM_INTERNAL_H

#include <linux/fs.h>
#include <linux/device-mapper.h>
#include <linux/list.h>
#include <linux/blkdev.h>
#include <linux/hdreg.h>

/*
 * Suspend feature flags
 */
#define DM_SUSPEND_LOCKFS_FLAG		(1 << 0)
#define DM_SUSPEND_NOFLUSH_FLAG		(1 << 1)
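
/*
 * Illustrative sketch only, not part of the original header: the two
 * flags occupy separate bits, so a caller can combine them with | and
 * test each one with &. The variable name suspend_flags is an
 * assumption made for this example.
 *
 *	unsigned suspend_flags = DM_SUSPEND_LOCKFS_FLAG |
 *				 DM_SUSPEND_NOFLUSH_FLAG;
 *	int do_lockfs = (suspend_flags & DM_SUSPEND_LOCKFS_FLAG) != 0;
 *	int noflush = (suspend_flags & DM_SUSPEND_NOFLUSH_FLAG) != 0;
 */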

/*
 * Type of table and mapped_device's mempool
 */
#define DM_TYPE_NONE		0
#define DM_TYPE_BIO_BASED	1
#define DM_TYPE_REQUEST_BASED	2

/*
 * List of devices that a metadevice uses and should open/close.
 */
struct dm_dev_internal {
	struct list_head list;
	atomic_t count;
	struct dm_dev dm_dev;
};

struct dm_table;
struct dm_md_mempools;

/*-----------------------------------------------------------------
 * Internal table functions.
 *---------------------------------------------------------------*/
void dm_table_destroy(struct dm_table *t);
void dm_table_event_callback(struct dm_table *t,
			     void (*fn)(void *), void *context);
struct dm_target *dm_table_get_target(struct dm_table *t, unsigned int index);
struct dm_target *dm_table_find_target(struct dm_table *t, sector_t sector);
int dm_calculate_queue_limits(struct dm_table *table,
			      struct queue_limits *limits);
void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
			       struct queue_limits *limits);
struct list_head *dm_table_get_devices(struct dm_table *t);
void dm_table_presuspend_targets(struct dm_table *t);
void dm_table_postsuspend_targets(struct dm_table *t);
int dm_table_resume_targets(struct dm_table *t);
int dm_table_any_congested(struct dm_table *t, int bdi_bits);
int dm_table_any_busy_target(struct dm_table *t);
int dm_table_set_type(struct dm_table *t);
unsigned dm_table_get_type(struct dm_table *t);
bool dm_table_bio_based(struct dm_table *t);
bool dm_table_request_based(struct dm_table *t);
int dm_table_alloc_md_mempools(struct dm_table *t);
void dm_table_free_md_mempools(struct dm_table *t);
struct dm_md_mempools *dm_table_get_md_mempools(struct dm_table *t);

/*
 * To check the return value from dm_table_find_target().
 */
#define dm_target_is_valid(t) ((t)->table)
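
/*
 * Illustrative sketch only, not part of the original header: a caller
 * might guard a lookup as shown below. This assumes, as the macro
 * suggests, that a failed lookup yields a target whose table pointer
 * is NULL. The names t and sector, and the choice of -EIO as the error
 * code, are assumptions made for this example.
 *
 *	struct dm_target *ti = dm_table_find_target(t, sector);
 *
 *	if (!dm_target_is_valid(ti))
 *		return -EIO;
 */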

/*
 * To check whether the target type is request-based or not (bio-based).
 */
#define dm_target_request_based(t) ((t)->type->map_rq != NULL)

/*-----------------------------------------------------------------
 * A registry of target types.
 *---------------------------------------------------------------*/
int dm_target_init(void);
void dm_target_exit(void);
struct target_type *dm_get_target_type(const char *name);
void dm_put_target_type(struct target_type *tt);
int dm_target_iterate(void (*iter_func)(struct target_type *tt,
					void *param), void *param);

int dm_split_args(int *argc, char ***argvp, char *input);

/*
 * The device-mapper can be driven through one of two interfaces;
 * ioctl or filesystem, depending on which patch you have applied.
 */
int dm_interface_init(void);
void dm_interface_exit(void);

/*
 * sysfs interface
 */
int dm_sysfs_init(struct mapped_device *md);
void dm_sysfs_exit(struct mapped_device *md);
struct kobject *dm_kobject(struct mapped_device *md);
struct mapped_device *dm_get_from_kobject(struct kobject *kobj);

/*
 * Targets for linear and striped mappings
 */
int dm_linear_init(void);
void dm_linear_exit(void);

int dm_stripe_init(void);
void dm_stripe_exit(void);

int dm_open_count(struct mapped_device *md);
int dm_lock_for_deletion(struct mapped_device *md);

void dm_kobject_uevent(struct mapped_device *md, enum kobject_action action,
		       unsigned cookie);

int dm_kcopyd_init(void);
void dm_kcopyd_exit(void);

/*
 * Mempool operations
 */
struct dm_md_mempools *dm_alloc_md_mempools(unsigned type);
void dm_free_md_mempools(struct dm_md_mempools *pools);

#endif