IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Change dm-linear to implement direct_access function,
linear_direct_access(), which maps sector and calls direct_access
function of its physical target device.
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Change mapped device to implement direct_access function,
dm_blk_direct_access(), which calls a target direct_access function.
'struct target_type' is extended to have target direct_access interface.
This function limits direct accessible size to the dm_target's limit
with max_io_len().
Add dm_table_supports_dax() to iterate all targets and associated block
devices to check for DAX support. To add DAX support to a DM target the
target must only implement the direct_access function.
Add a new dm type, DM_TYPE_DAX_BIO_BASED, which indicates that mapped
device supports DAX and is bio based. This new type is used to assure
that all target devices have DAX support and remain that way after
QUEUE_FLAG_DAX is set in mapped device.
At initial table load, QUEUE_FLAG_DAX is set to mapped device when setting
DM_TYPE_DAX_BIO_BASED to the type. Any subsequent table load to the
mapped device must have the same type, or else it fails per the check in
table_load().
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Instead of a flag and an index just make sure an index of 0 means
no need to free the bvec array. Also move the constants related
to the bvec pools together and use a consistent naming scheme for
them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
These two are confusing leftover of the old world order, combining
values of the REQ_OP_ and REQ_ namespaces. For callers that don't
special case we mostly just replace bi_rw with bio_data_dir or
op_is_write, except for the few cases where a switch over the REQ_OP_
values makes more sense. Any check for READA is replaced with an
explicit check for REQ_RAHEAD. Also remove the READA alias for
REQ_RAHEAD.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The discard passdown was being issued after the block was unmapped,
which meant the block could be reprovisioned whilst the passdown discard
was still in flight.
We can only identify unshared blocks (safe to do a passdown a discard
to) once they're unmapped and their ref count hits zero. Block ref
counts are now used to guard against concurrent allocation of these
blocks that are being discarded. So now we unmap the block, issue
passdown discards, and the immediately increment ref counts for regions
that have been discarded via passed down (this is safe because
allocation occurs within the same thread). We then decrement ref counts
once the passdown discard IO is complete -- signaling these blocks may
now be allocated.
This fixes the potential for corruption that was reported here:
https://www.redhat.com/archives/dm-devel/2016-June/msg00311.html
Reported-by: Dennis Yang <dennisyang@qnap.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
dm_btree_find_next_single() can short-circuit the search for a block
with a return of -ENODATA if all entries are higher than the search key
passed to lower_bound().
This hasn't been a problem because of the way the btree has been used by
DM thinp. But it must be fixed now in preparation for fixing the race
in DM thinp's handling of simultaneous block discard vs allocation.
Otherwise, once that fix is in place, some of the blocks in a discard
would not be unmapped as expected.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
RAID10 random read performance is lower than expected due to excessive spinlock
utilisation which is required mostly for rebuild/resync. Simplify allow_barrier
as it's in IO path and encounters a lot of unnecessary congestion.
As lower_barrier just takes a lock in order to decrement a counter, convert
counter (nr_pending) into atomic variable and remove the spin lock. There is
also a congestion for wake_up (it uses lock internally) so call it only when
it's really needed. As wake_up is not called constantly anymore, ensure process
waiting to raise a barrier is notified when there are no more waiting IOs.
Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Changeset 6791875e2e53 has added early return from a function so there is no
sysfs notification for 'active' and 'clean' state change.
Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
md loads raidX modules and increments module refcount each time level
has changed but does not decrement it. You are unable to unload raid0
module after reshape because raid0 reshape changes level to raid4
and back to raid0.
Signed-off-by: Aleksey Obitotskiy <aleksey.obitotskiy@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
The md code stores the exact time of the last error in the
last_read_error variable using a timespec structure. It only
ever uses the seconds portion of that though, so we can
use a scalar for it.
There won't be an overflow in 2038 here, because it already
used monotonic time and 32-bit is enough for that, but I've
decided to use time64_t for consistency in the conversion.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Shaohua Li <shli@fb.com>
raid_io_hints() was retrieving the number of data stripes used for the
calculation of io_opt from struct r5conf, which is not defined for raid0
mappings.
Base the calculation on the in-core raid_set structure instead.
Also, adjust to use to_bytes() for the sector -> bytes conversion
throughout.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Use 'unsigned int' where appropriate.
Return negative errors.
Correct an indentation.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
So far we tried to rely on the SCSI 'all target ports' bit to register
all path, but for many setups this didn't work properly as the different
paths are seen as separate initiators to the target instead of multiple
ports of the same initiator. Because of that we'll stop setting the
'all target ports' bit in SCSI, and let device mapper handle iterating
over the device for each path and register them manually.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Commit d548b34b062 ("dm: reduce the queue delay used in dm_request_fn
from 100ms to 10ms") always intended the value to be 10 msecs -- it
just expressed it in jiffies because earlier commit 7eaceaccab ("block:
remove per-queue plugging") did.
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Fixes: d548b34b062 ("dm: reduce the queue delay used in dm_request_fn from 100ms to 10ms")
Cc: stable@vger.kernel.org # 4.1+ -- stable@ backports must be applied to drivers/md/dm.c
Add "delta_disks" constructor argument support to raid1 to allow for
consistent userspace disk addition/removal handling.
Fix raid_status() to report all raid disks with status and table output
on disk adding reshapes, not just the ones listed on the mddev; optimize
its rebuild and writemostly output.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Enhance rs_reshape_requested() check function to be more transparent and
fix its raid10 check.
Streamline the constructor by factoring out reshaping preparation into
fucntion rs_prepare_reshape().
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Resizing a RAID set during recovery can be allowed, because the MD
resynchronization thread will either stop any ongoing recovery in case
of shrinking below the current recovery position or carry on recovery
to the new size if the set is growing.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Add function rs_setup_recovery() to allow for defined setup of RAID set
recovery in the constructor.
Will be called with dev_sectors={0, rdev->sectors, MaxSectors} to
recover a new or enforced sync, grown or not to be synhronized RAID set
respectively.
Prevents recovery on raid0, which doesn't support it.
Enforces recovery on raid6 to ensure properly defined Syndromes
mandatory for that MD personality are being created.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Check return value of kthread_run() in dm_old_init_request_queue().
Reported-by: Minfei Huang <mnghuan@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
We have assigned sb->block_size before the switch,
so remove the redundant one.
Reviewed-by: Coly Li <colyli@suse.de>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Acked-by: Eric Wheeler <bcache@lists.ewheeler.net>
Signed-off-by: Jens Axboe <axboe@fb.com>
There is no return in continue_at(), update the documentation.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Cache_sb is not used in cache_alloc, and we have copied
sb info to cache->sb already, remove it.
Reviewed-by: Coly Li <colyli@suse.de>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
do_div was replaced with div64_u64 at some point, causing a bug with
block calculation due to incompatible semantics of the two functions.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Fixes: a739ff3f543a ("dm verity: add support for forward error correction")
Cc: stable@vger.kernel.org # v4.5+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Merge the two DM_PARAMS_[KV]MALLOC flags into a single flag.
Doing so avoids the crashes seen with previous attempts to consolidate
buffer management to use kvfree() without first flagging that memory had
actually been allocated.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Avoid that sparse complains about assigning a __le64 value to a u64
variable. Remove the (u64) casts since these are superfluous. This
patch does not change the behavior of the source code.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
A newly introduced function has 'const int' as the return type,
but as "make W=1" reports, that has no meaning:
drivers/md/dm-raid.c:510:18: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers]
This changes the return type to plain 'int'.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 33e53f06850f ("dm raid: introduce extended superblock and new raid types to support takeover/reshaping")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Superblock updates where bogus causing some takovers/reshapes to fail.
Introduce new runtime flag (RT_FLAG_KEEP_RS_FROZEN) to keep a raid set
frozen when a layout change was requested. Userpace will immediately
reload the table w/o the flags requesting such change once they made it
to the superblocks and any change of recovery/reshape offsets has to be
avoided until after read.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>