iv/linux

Author	SHA1	Message	Date
Mike Snitzer	7979d90757	dm vdo logger: remove log level to string conversion code Was only used by sysfs code, can be reinstated if/when needed. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:57 -05:00
Mike Snitzer	25315e967a	dm vdo: add 'log_level' module parameter Expose control over dm-vdo's log-level in terms of a module param. It can be read and written via /sys/module/dm_vdo/parameters/log_level. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:56 -05:00
Mike Snitzer	a9da0fb6d8	dm vdo: remove all sysfs interfaces Also update target major version number. All info is (or will be) accessible through alternative interfaces (e.g. "dmsetup message", module params, etc). Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:56 -05:00
Mike Snitzer	4e4152482b	dm vdo target: eliminate inappropriate uses of UDS_SUCCESS Most should be VDO_SUCCESS. But comparing the return from kstrtouint() with UDS_SUCCESS (happens to be 0) makes no sense. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:56 -05:00
Matthew Sakai	e60167367e	dm vdo indexer: update ASSERT and ASSERT_LOG_ONLY usage Update indexer uses of ASSERT and ASSERT_LOG_ONLY to VDO_ASSERT and VDO_ASSERT_LOG_ONLY, respectively. Remove ASSERT and ASSERT_LOG_ONLY. Also rename uds_assertion_failed to vdo_assertion_failed. Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-03-04 15:07:56 -05:00
Mike Snitzer	fc03f73760	dm vdo encodings: update some stale comments Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:56 -05:00
Mike Snitzer	6a79248b42	dm vdo permassert: audit all of ASSERT to test for VDO_SUCCESS Also rename ASSERT to VDO_ASSERT and ASSERT_LOG_ONLY to VDO_ASSERT_LOG_ONLY. But re-introduce ASSERT and ASSERT_LOG_ONLY as a placeholder for the benefit of dm-vdo/indexer. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:56 -05:00
Mike Snitzer	a958c53af7	dm-vdo funnel-workqueue: return VDO_SUCCESS from make_simple_work_queue Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:56 -05:00
Mike Snitzer	34edf9e28c	dm vdo thread-utils: return VDO_SUCCESS on vdo_create_thread success Update all callers to check for VDO_SUCCESS. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:56 -05:00
Mike Snitzer	6c43cf2488	dm vdo int-map: return VDO_SUCCESS on success Update all callers to check for VDO_SUCCESS (most already did). Also fix whitespace for update_mapping() parameters. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:56 -05:00
Mike Snitzer	2de70388b3	dm vdo: check for VDO_SUCCESS return value from memory-alloc functions VDO_SUCCESS and UDS_SUCCESS were used interchangeably; update all callers of VDO's memory-alloc functions to consistently check for VDO_SUCCESS. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:56 -05:00
Mike Snitzer	97d3380396	dm vdo memory-alloc: return VDO_SUCCESS on success Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:56 -05:00
Matthew Sakai	ee8f6ec1b1	dm vdo errors: remove unused error codes Also define VDO_SUCCESS in a more central location, and rename error block constants for clarity. Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-03-04 15:07:56 -05:00
Mike Snitzer	8f89115efc	dm vdo memory-alloc: rename vdo_do_allocation to __vdo_do_allocation __vdo_do_allocation shouldn't be used outside of memory-alloc.h, so add hidden prefix. Also, tabify the vdo_allocate_extended macro. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:56 -05:00
Mike Snitzer	0eea6b6e78	dm vdo memory-alloc: change from uds_ to vdo_ namespace Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:56 -05:00
Bruce Johnston	6008d526b0	dm-vdo: change unnamed enums to defines Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-03-04 15:07:56 -05:00
Matthew Sakai	04530b487b	dm vdo: remove outdated pointer_map reference Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-03-04 15:07:56 -05:00
Matthew Sakai	e1e510fcad	dm vdo: update module comments Update outdated comments referring to separate VDO and UDS modules. Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-03-04 15:07:55 -05:00
Matthew Sakai	bbe434d94e	dm vdo indexer delta-index: fix typos in comments Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-03-04 15:07:55 -05:00
Jiapeng Chong	eebd4e1630	dm vdo: fix various function names referenced in comment blocks No functional modification involved. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-03-04 15:07:55 -05:00
Mike Snitzer	17b1a73fea	dm vdo: move indexer files into sub-directory The goal is to assist high-level understanding of which code is conceptually specific to VDO's indexer. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:55 -05:00
Matthew Sakai	61234f0bda	dm vdo: remove unnecessary indexer.h includes Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-03-04 15:07:55 -05:00
Chung Chung	81c751ad1b	dm vdo: clean up scnprintf usage Ignore scnprintf return status since it is not necessary. Change write_* functions type from int to void since we no longer return any result. Also, clean up any code that checks or uses any scnprintf return results. Check the uds_allocate return code, which was previously ignored; return and log an error when uds_allocate fails. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Chung Chung <cchung@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-03-04 15:07:55 -05:00
Mike Snitzer	20be466c7a	dm vdo: include <asm/current.h> to resolve current being undeclared Reported when building on loongarch. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:55 -05:00
Mike Snitzer	444d3f0bfd	dm vdo indexer-volume: fix missing mutex_lock in process_entry Must mutex_lock after dm_bufio_read, before dm_bufio_read error handling, otherwise process_entry error path will return without volume->read_threads_mutex held. This fixes potential double mutex_unlock. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:55 -05:00
Mike Snitzer	b259c1a60c	dm vdo flush: initialize return to NULL in allocate_flush Otherwise, error path could result in allocate_flush's subsequent check for flush being non-NULL leading to false positive. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-03-04 15:07:55 -05:00
Dan Carpenter	672fc9b8c0	dm vdo slab-depot: delete unnecessary check in allocate_components This is a duplicate check so it can't be true. Delete it. Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-03-04 15:07:55 -05:00
Mike Snitzer	924553644a	dm vdo memory-alloc: simplify allocations_allowed() Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-04 15:07:55 -05:00
Susan LeGendre-McGhee	dcd1332bb5	dm vdo: remove internal ticket references Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-03-04 15:07:55 -05:00
Tejun Heo	c375b22333	dm-verity: Convert from tasklet to BH workqueue The only generic interface to execute asynchronously in the BH context is tasklet; however, it's marked deprecated and has some design flaws. To replace tasklets, BH workqueue support was recently added. A BH workqueue behaves similarly to regular workqueues except that the queued work items are executed in the BH context. This commit converts dm-verity from tasklet to BH workqueue. It backfills tasklet code that was removed with commit `0a9bab391e` ("dm-crypt, dm-verity: disable tasklets") and tweaks to use BH workqueue (and does some renaming). This is a minimal conversion which doesn't rename the related names including the "try_verify_in_tasklet" option. If this patch is applied, a follow-up patch would be necessary. I couldn't decide whether the option name would need to be updated too. Signed-off-by: Tejun Heo <tj@kernel.org> [snitzer: rename 'use_tasklet' to 'use_bh_wq' and 'in_tasklet' to 'in_bh'] Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-03-02 10:30:36 -05:00
Tejun Heo	fb6ad4aec1	dm-crypt: Convert from tasklet to BH workqueue The only generic interface to execute asynchronously in the BH context is tasklet; however, it's marked deprecated and has some design flaws. To replace tasklets, BH workqueue support was recently added. A BH workqueue behaves similarly to regular workqueues except that the queued work items are executed in the BH context. This commit converts dm-crypt from tasklet to BH workqueue. It backfills tasklet code that was removed with commit `0a9bab391e` ("dm-crypt, dm-verity: disable tasklets") and tweaks to use BH workqueue. Like a regular workqueue, a BH workqueue allows freeing the currently executing work item. Converting from tasklet to BH workqueue removes the need for deferring bio_endio() again to a work item, which was buggy anyway. I tested this lightly with "--perf-no_read_workqueue --perf-no_write_workqueue" + some code modifications, but would really appreciate if someone who knows the code base better could take a look. Signed-off-by: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/82b964f0-c2c8-a2c6-5b1f-f3145dc2c8e5@redhat.com [snitzer: rebase on top of commit `0a9bab391e` reduced this commit's changes] Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-03-02 10:30:36 -05:00
Christoph Hellwig	8e0ef41286	dm: use queue_limits_set Use queue_limits_set which validates the limits and takes care of updating the readahead settings instead of directly assigning them to the queue. For that make sure all limits are actually updated before the assignment. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mike Snitzer <snitzer@kernel.org> Link: https://lore.kernel.org/r/20240228225653.947152-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2024-03-01 08:54:42 -07:00
Mike Snitzer	6a87a8a258	dm vdo thread-device: rename all methods to reflect vdo-only use Also moved vdo_init()'s call to vdo_initialize_thread_device_registry next to other registry initialization. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-01 09:26:24 -05:00
Mike Snitzer	82b354ffe2	dm vdo thread-registry: rename all methods to reflect vdo-only use Otherwise, uds_ prefix is misleading (vdo_ is the new catch-all for code that is used by vdo-only or _both_ vdo and the indexer code). Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-01 09:26:20 -05:00
Mike Snitzer	cb6f8b7500	dm vdo thread-utils: cleanup included headers Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-01 09:26:11 -05:00
Mike Snitzer	650e3107bc	dm vdo thread-utils: further cleanup of thread functions Change thread function prefix from "uds_" to "vdo_" and fix vdo_join_threads() to return void. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-01 09:26:07 -05:00
Mike Snitzer	fe6e4ccbe8	dm vdo thread-utils: remove all uds_*_mutex wrappers Just use mutex_init, mutex_lock and mutex_unlock. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-01 09:26:03 -05:00
Mike Snitzer	7f2e494ddd	dm vdo thread-utils: push uds_*_cond interface down to indexer Only used by indexer components. Also return void from uds_init_cond(), remove uds_destroy_cond(), and fix up all callers. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-01 09:25:58 -05:00
Mike Snitzer	877f36b764	dm vdo: fold thread-cond-var.c into thread-utils Further cleanup is needed for thread-utils interfaces given many functions should return void or be removed entirely because they amount to obfuscation via wrappers. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-01 09:25:54 -05:00
Mike Snitzer	8e6333af19	dm vdo indexer: rename uds.h to indexer.h Also remove unnecessary include from funnel-queue.c. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-01 09:25:49 -05:00
Mike Snitzer	c2f54aa2b2	dm vdo: rename uds-threads.[ch] to thread-utils.[ch] Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-01 09:25:45 -05:00
Mike Snitzer	eef7cf5e22	dm vdo indexer sparse-cache: cleanup threads_barrier code Rename 'barrier' to 'threads_barrier', remove useless uds_destroy_barrier(), return void from remaining methods and clean up uds_make_sparse_cache() accordingly. Also remove uds_ prefix from the 2 remaining threads_barrier functions. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-01 09:25:41 -05:00
Mike Snitzer	0593855a83	dm vdo uds-threads: push 'barrier' down to sparse-cache The sparse-cache is the only user of the 'barrier' data structure, so just move it private to it. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-01 09:25:36 -05:00
Mike Snitzer	2d98aa1780	dm vdo uds-threads: eliminate uds_*_semaphore interfaces The implementation of thread 'barrier' data structure does not require overdone private semaphore wrappers. Also rename the barrier structure's 'mutex' member (a semaphore) to 'lock'. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-01 09:25:32 -05:00
Mike Snitzer	9d87418945	dm vdo: make uds_*_semaphore interface private to uds-threads.c Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-01 09:25:23 -05:00
Mike Snitzer	50944062f7	dm vdo block-map: rename page state name from "UDS_FREE" to "FREE" Only used for log message, but no need for "UDS_" prefix. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-03-01 09:25:16 -05:00
Harshit Mogalapalli	f304f6b443	dm vdo volume-index: fix an assert statement in start_restoring_volume_sub_index() Use "==" instead of "=" in ASSERT() statement. Fixes: ef074a31e88e ("dm vdo: implement the volume index") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-03-01 09:25:09 -05:00
Yu Kuai	0091c5a269	md/raid1: factor out helpers to choose the best rdev from read_balance() The way that best rdev is chosen: 1) If the read is sequential from one rdev: - if rdev is rotational, use this rdev; - if rdev is non-rotational, use this rdev until total read length exceeds disk opt io size; 2) If the read is not sequential: - if there is idle disk, use it, otherwise: - if the array has non-rotational disk, choose the rdev with minimal inflight IO; - if all the underlying disks are rotational, choose the rdev with closest IO; There are no functional changes, just to make code cleaner and prepare for following refactor. Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-12-yukuai1@huaweicloud.com	2024-02-29 22:49:46 -08:00
Yu Kuai	ba58f57fdf	md/raid1: factor out the code to manage sequential IO There is no functional change for now, make read_balance() cleaner and prepare to fix problems and refactor the handler of sequential IO. Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-11-yukuai1@huaweicloud.com	2024-02-29 22:49:46 -08:00
Yu Kuai	9f3ced7922	md/raid1: factor out choose_bb_rdev() from read_balance() read_balance() is hard to understand because there are too many status and branches, and it's overlong. This patch factor out the case to read the rdev with bad blocks from read_balance(), there are no functional changes. Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-10-yukuai1@huaweicloud.com	2024-02-29 22:49:46 -08:00
Yu Kuai	dfa8ecd167	md/raid1: factor out choose_slow_rdev() from read_balance() read_balance() is hard to understand because there are too many status and branches, and it's overlong. This patch factor out the case to read the slow rdev from read_balance(), there are no functional changes. Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-9-yukuai1@huaweicloud.com	2024-02-29 22:49:46 -08:00
Yu Kuai	31a7333175	md/raid1: factor out read_first_rdev() from read_balance() read_balance() is hard to understand because there are too many status and branches, and it's overlong. This patch factor out the case to read the first rdev from read_balance(), there are no functional changes. Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-8-yukuai1@huaweicloud.com	2024-02-29 22:49:46 -08:00
Yu Kuai	f109207629	md/raid1-10: factor out a new helper raid1_should_read_first() If resync is in progress, read_balance() should find the first usable disk, otherwise, data could be inconsistent after resync is done. raid1 and raid10 implement the same checking, hence factor out the checking to make code cleaner. Noted that raid1 is using 'mddev->recovery_cp', which is updated after all resync IO is done, while raid10 is using 'conf->next_resync', which is inaccurate because raid10 update it before submitting resync IO. Fortunately, raid10 read IO can't concurrent with resync IO, hence there is no problem. And this patch also switch raid10 to use 'mddev->recovery_cp'. Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-7-yukuai1@huaweicloud.com	2024-02-29 22:49:46 -08:00
Yu Kuai	f29841ff3b	md/raid1-10: add a helper raid1_check_read_range() The checking and handling of bad blocks appear many times during read_balance() in raid1 and raid10. This helper will be used in later patches to simplify read_balance() a lot. Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-6-yukuai1@huaweicloud.com	2024-02-29 22:49:46 -08:00
Yu Kuai	257ac239ff	md/raid1: fix choose next idle in read_balance() Commit `12cee5a8a2` ("md/raid1: prevent merging too large request") add the case choose next idle in read_balance(): read_balance: for_each_rdev if(next_seq_sect == this_sector || dist == 0) -> sequential reads best_disk = disk; if (...) choose_next_idle = 1 continue; for_each_rdev -> iterate next rdev if (pending == 0) best_disk = disk; -> choose the next idle disk break; if (choose_next_idle) -> keep using this rdev if there are no other idle disk continue However, commit `2e52d449bc` ("md/raid1: add failfast handling for reads.") remove the code: - /* If device is idle, use it */ - if (pending == 0) { - best_disk = disk; - break; - } Hence choose next idle will never work now, fix this problem by following: 1) don't set best_disk in this case, read_balance() will choose the best disk after iterating all the disks; 2) add 'pending' so that other idle disk will be chosen; 3) add a new local variable 'sequential_disk' to record the disk, and if there is no other idle disk, 'sequential_disk' will be chosen; Fixes: `2e52d449bc` ("md/raid1: add failfast handling for reads.") Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-5-yukuai1@huaweicloud.com	2024-02-29 22:49:45 -08:00
Yu Kuai	2c27d09d3a	md/raid1: record nonrot rdevs while adding/removing rdevs to conf For raid1, each read will iterate all the rdevs from conf and check if any rdev is non-rotational, then choose rdev with minimal IO inflight if so, or rdev with closest distance otherwise. Disk nonrot info can be changed through sysfs entry: /sys/block/[disk_name]/queue/rotational However, consider that this should only be used for testing, and user really shouldn't do this in real life. Record the number of non-rotational disks in conf, to avoid checking each rdev in IO fast path and simplify read_balance() a little bit. Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-4-yukuai1@huaweicloud.com	2024-02-29 22:49:45 -08:00
Yu Kuai	969d6589ab	md/raid1: factor out helpers to add rdev to conf There are no functional changes, just make code cleaner and prepare to record disk non-rotational information while adding and removing rdev to conf Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-3-yukuai1@huaweicloud.com	2024-02-29 22:49:45 -08:00
Yu Kuai	3a0f007b69	md: add a new helper rdev_has_badblock() The current api is_badblock() must pass in 'first_bad' and 'bad_sectors', however, many caller just want to know if there are badblocks or not, and these caller must define two local variable that will never be used. Add a new helper rdev_has_badblock() that will only return if there are badblocks or not, remove unnecessary local variables and replace is_badblock() with the new helper in many places. There are no functional changes, and the new helper will also be used later to refactor read_balance(). Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240229095714.926789-2-yukuai1@huaweicloud.com	2024-02-29 22:49:45 -08:00
Gui-Dong Han	dfd2bf4367	md/raid5: fix atomicity violation in raid5_cache_count In raid5_cache_count(): if (conf->max_nr_stripes < conf->min_nr_stripes) return 0; return conf->max_nr_stripes - conf->min_nr_stripes; The current check is ineffective, as the values could change immediately after being checked. In raid5_set_cache_size(): ... conf->min_nr_stripes = size; ... while (size > conf->max_nr_stripes) conf->min_nr_stripes = conf->max_nr_stripes; ... Due to intermediate value updates in raid5_set_cache_size(), concurrent execution of raid5_cache_count() and raid5_set_cache_size() may lead to inconsistent reads of conf->max_nr_stripes and conf->min_nr_stripes. The current checks are ineffective as values could change immediately after being checked, raising the risk of conf->min_nr_stripes exceeding conf->max_nr_stripes and potentially causing an integer overflow. This possible bug is found by an experimental static analysis tool developed by our team. This tool analyzes the locking APIs to extract function pairs that can be concurrently executed, and then analyzes the instructions in the paired functions to identify possible concurrency bugs including data races and atomicity violations. The above possible bug is reported when our tool analyzes the source code of Linux 6.2. To resolve this issue, it is suggested to introduce local variables 'min_stripes' and 'max_stripes' in raid5_cache_count() to ensure the values remain stable throughout the check. Adding locks in raid5_cache_count() fails to resolve atomicity violations, as raid5_set_cache_size() may hold intermediate values of conf->min_nr_stripes while unlocked. With this patch applied, our tool no longer reports the bug, with the kernel configuration allyesconfig for x86_64. Due to the lack of associated hardware, we cannot test the patch in runtime testing, and just verify it according to the code logic. Fixes: `edbe83ab4c` ("md/raid5: allow the stripe_cache to grow and shrink.") Cc: stable@vger.kernel.org Signed-off-by: Gui-Dong Han <2045gemini@gmail.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240112071017.16313-1-2045gemini@gmail.com Signed-off-by: Song Liu <song@kernel.org>	2024-02-27 14:25:34 -08:00
Heming Zhao	ecbd8ebb51	md/md-bitmap: fix incorrect usage for sb_index Commit `d7038f9518` ("md-bitmap: don't use ->index for pages backing the bitmap file") removed page->index from bitmap code, but left wrong code logic for clustered-md. current code never set slot offset for cluster nodes, will sometimes cause crash in clustered env. Call trace (partly): md_bitmap_file_set_bit+0x110/0x1d8 [md_mod] md_bitmap_startwrite+0x13c/0x240 [md_mod] raid1_make_request+0x6b0/0x1c08 [raid1] md_handle_request+0x1dc/0x368 [md_mod] md_submit_bio+0x80/0xf8 [md_mod] __submit_bio+0x178/0x300 submit_bio_noacct_nocheck+0x11c/0x338 submit_bio_noacct+0x134/0x614 submit_bio+0x28/0xdc submit_bh_wbc+0x130/0x1cc submit_bh+0x1c/0x28 Fixes: `d7038f9518` ("md-bitmap: don't use ->index for pages backing the bitmap file") Cc: stable@vger.kernel.org # v6.6+ Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240223121128.28985-1-heming.zhao@suse.com	2024-02-26 13:34:44 -08:00
Li Nan	e9b0a1556c	md: check mddev->pers before calling md_set_readonly() If 'mddev->pers' is NULL, there is nothing to do in md_set_readonly(). Except for md_ioctl(), the other two callers of md_set_readonly() have already checked 'mddev->pers'. To simplify the code, move the check of 'mddev->pers' to the caller. Signed-off-by: Li Nan <linan122@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240226031444.3606764-10-linan666@huaweicloud.com	2024-02-26 10:22:22 -08:00
Li Nan	650b2e69ff	md: clean up openers check in do_md_stop() and md_set_readonly() Before stopping or setting readonly, mddev_set_closing_and_sync_blockdev() is always called to check the openers. So no longer need to check it again in do_md_stop() and md_set_readonly(). Clean it up. Signed-off-by: Li Nan <linan122@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240226031444.3606764-9-linan666@huaweicloud.com	2024-02-26 10:22:22 -08:00
Li Nan	99b902ac17	md: sync blockdev before stopping raid or setting readonly Commit `a05b7ea03d` ("md: avoid crash when stopping md array races with closing other open fds.") added sync_block before stopping raid and setting readonly. Later in commit `260fa034ef` ("md: avoid deadlock when dirty buffers during md_stop.") it is moved to ioctl. array_state_store() was ignored. Add sync blockdev to array_state_store() now. Signed-off-by: Li Nan <linan122@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240226031444.3606764-8-linan666@huaweicloud.com	2024-02-26 10:22:22 -08:00
Li Nan	f74aaf614e	md: factor out a helper to sync mddev There are no functional changes, prepare to sync mddev in array_state_store(). Signed-off-by: Li Nan <linan122@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240226031444.3606764-7-linan666@huaweicloud.com	2024-02-26 10:22:22 -08:00
Li Nan	9674f54e41	md: Don't clear MD_CLOSING when the raid is about to stop The raid should not be opened anymore when it is about to be stopped. However, other processes can open it again if the flag MD_CLOSING is cleared before exiting. From now on, this flag will not be cleared when the raid will be stopped. Fixes: `065e519e71` ("md: MD_CLOSING needs to be cleared after called md_set_readonly or do_md_stop") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240226031444.3606764-6-linan666@huaweicloud.com	2024-02-26 10:22:22 -08:00
Li Nan	91b26a39fb	md: return directly before setting did_set_md_closing There is nothing to do at 'out' before setting 'did_set_md_closing' in md_ioctl(). Return directly, and it will help us to remove 'did_set_md_closing' later. Signed-off-by: Li Nan <linan122@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240226031444.3606764-5-linan666@huaweicloud.com	2024-02-26 10:22:22 -08:00
Li Nan	9dd8702e7c	md: clean up invalid BUG_ON in md_ioctl 'disk->private_data' is set to mddev in md_alloc() and never set to NULL, and users need to open mddev before submitting ioctl. So mddev must not have been freed during ioctl, and there is no need to check mddev here. Clean up it. Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240226031444.3606764-4-linan666@huaweicloud.com	2024-02-26 10:22:22 -08:00
Li Nan	4e26593944	md: changed the switch of RAID_VERSION to if There is only one case of this 'switch'. Change it to 'if'. Signed-off-by: Li Nan <linan122@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240226031444.3606764-3-linan666@huaweicloud.com	2024-02-26 10:22:22 -08:00
Li Nan	2fe4ffc3ec	md: merge the check of capabilities into md_ioctl_valid() There is no functional change. Just to make code cleaner. Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240226031444.3606764-2-linan666@huaweicloud.com	2024-02-26 10:22:22 -08:00
Christian Brauner	3789fb8746	bcache: port block device access to files Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-13-adbd023e19cc@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-02-25 12:05:24 +01:00
Christian Brauner	a28d893eb3	md: port block device access to file Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-4-adbd023e19cc@kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-02-25 12:05:22 +01:00
Linus Torvalds	f2e367d6ad	Merge tag 'for-6.8/dm-fix-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fix from Mike Snitzer: - Fix DM integrity and verity targets to not use excessive stack when they recheck in the error path. * tag 'for-6.8/dm-fix-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm-integrity, dm-verity: reduce stack usage for recheck	2024-02-24 09:55:29 -08:00
Arnd Bergmann	66ad2fbcdb	dm-integrity, dm-verity: reduce stack usage for recheck The newly added integrity_recheck() function has another larger stack allocation, just like its caller integrity_metadata(). When it gets inlined, the combination of the two exceeds the warning limit for 32-bit architectures and possibly risks an overflow when this is called from a deep call chain through a file system: drivers/md/dm-integrity.c:1767:13: error: stack frame size (1048) exceeds limit (1024) in 'integrity_metadata' [-Werror,-Wframe-larger-than] 1767 \| static void integrity_metadata(struct work_struct *w) Since the caller at this point is done using its checksum buffer, just reuse the same buffer in the new function to avoid the double allocation. [Mikulas: add "noinline" to integrity_recheck and verity_recheck. These functions are only called on error, so they shouldn't bloat the stack frame or code size of the caller.] Fixes: `c88f5e553f` ("dm-integrity: recheck the integrity tag after a failure") Fixes: `9177f3c0de` ("dm-verity: recheck the hash after a failure") Cc: stable@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-24 10:53:57 -05:00
Linus Torvalds	e7768e65cd	Merge tag 'for-6.8/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Stable fixes for 3 DM targets (integrity, verity and crypt) to address systemic failure that can occur if user provided pages map to the same block. - Fix DM crypt to not allow modifying data that being encrypted for authenticated encryption. - Fix DM crypt and verity targets to align their respective bvec_iter struct members to avoid the need for byte level access (due to __packed attribute) that is costly on some arches (like RISC). * tag 'for-6.8/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm-crypt, dm-integrity, dm-verity: bump target version dm-verity, dm-crypt: align "struct bvec_iter" correctly dm-crypt: recheck the integrity tag after a failure dm-crypt: don't modify the data when using authenticated encryption dm-verity: recheck the hash after a failure dm-integrity: recheck the integrity tag after a failure	2024-02-23 09:23:54 -08:00
Pierre Gondois	b20a229c28	bcache: use of hlist_count_nodes() Make use of the newly added hlist_count_nodes(). Link: https://lkml.kernel.org/r/20240104164937.424320-4-pierre.gondois@arm.com Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Acked-by: Coly Li <colyli@suse.de> Acked-by: Marco Elver <elver@google.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Arve Hjønnevåg <arve@android.com> Cc: Carlos Llamas <cmllamas@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Kees Cook <keescook@chromium.org> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Martijn Coenen <maco@android.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Todd Kjos <tkjos@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-02-22 15:38:51 -08:00
Mathieu Desnoyers	c29290728d	dm: treat alloc_dax() -EOPNOTSUPP failure as non-fatal In preparation for checking whether the architecture has data cache aliasing within alloc_dax(), modify the error handling of dm alloc_dev() to treat alloc_dax() -EOPNOTSUPP failure as non-fatal. Link: https://lkml.kernel.org/r/20240215144633.96437-5-mathieu.desnoyers@efficios.com Fixes: `d92576f116` ("dax: does not work correctly with virtual aliasing caches") Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@armlinux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Chinner <david@fromorbit.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: kernel test robot <lkp@intel.com> Cc: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-02-22 15:27:19 -08:00
Linus Torvalds	ffd2cb6b71	Merge tag 'block-6.8-2024-02-22' of git://git.kernel.dk/linux Pull block fixes from Jens Axboe: "Mostly just fixlets for md, but also a sed-opal parsing fix" * tag 'block-6.8-2024-02-22' of git://git.kernel.dk/linux: block: sed-opal: handle empty atoms when parsing response md: Don't suspend the array for interrupted reshape md: Don't register sync_thread for reshape directly md: Make sure md_do_sync() will set MD_RECOVERY_DONE md: Don't ignore read-only array in md_check_recovery() md: Don't ignore suspended array in md_check_recovery() md: Fix missing release of 'active_io' for flush	2024-02-22 11:57:30 -08:00
Mike Snitzer	fa34e5893f	dm: update relevant MODULE_AUTHOR entries to latest dm-devel mailing list Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 14:22:55 -05:00
Mike Snitzer	86ab1b84b2	dm ioctl: update DM_DRIVER_EMAIL to new dm-devel mailing list Fixes: `3da5d2de92` ("MAINTAINERS: update the dm-devel mailing list") Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 14:22:55 -05:00
Fan Wu	9356fcfe0a	dm verity: set DM_TARGET_SINGLETON feature flag The device-mapper has a flag to mark targets as singleton, which is a required flag for immutable targets. Without this flag, multiple dm-verity targets can be added to a mapped device, which has no practical use cases and will let dm_table_get_immutable_target return NULL. This patch adds the missing flag, restricting only one dm-verity target per mapped device. Signed-off-by: Fan Wu <wufan@linux.microsoft.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 14:22:55 -05:00
Hongyu Jin	5d8d408153	dm crypt: Fix IO priority lost when queuing write bios Since dm-crypt queues writes to a different kernel thread (workqueue), the bios will dispatch from tasks with different io_context->ioprio settings and blkcg than the submitting task, thus giving incorrect ioprio to the io scheduler. Get the original IO priority setting via struct dm_crypt_io::base_bio and set this priority in the bio for write. Link: https://lore.kernel.org/dm-devel/alpine.LRH.2.11.1612141049250.13402@mail.ewheeler.net Signed-off-by: Hongyu Jin <hongyu.jin@unisoc.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 14:22:55 -05:00
Hongyu Jin	d95e2c34a3	dm verity: Fix IO priority lost when reading FEC and hash After obtaining the data, verification or error correction process may trigger a new IO that loses the priority of the original IO, that is, the verification of the higher priority IO may be blocked by the lower priority IO. Make the IO used for verification and error correction follow the priority of the original IO. Co-developed-by: Yibin Ding <yibin.ding@unisoc.com> Signed-off-by: Yibin Ding <yibin.ding@unisoc.com> Signed-off-by: Hongyu Jin <hongyu.jin@unisoc.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 14:22:55 -05:00
Hongyu Jin	e9b2238e47	dm bufio: Support IO priority Some IO will dispatch from kworker with different io_context settings than the submitting task, we may need to specify a priority to avoid losing priority. Add dm_bufio_read_with_ioprio() and dm_bufio_prefetch_with_ioprio() for use by bufio users to pass an ioprio other than IOPRIO_DEFAULT. Co-developed-by: Yibin Ding <yibin.ding@unisoc.com> Signed-off-by: Yibin Ding <yibin.ding@unisoc.com> Signed-off-by: Hongyu Jin <hongyu.jin@unisoc.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> [snitzer: introduced _with_ioprio() wrappers to reduce churn] Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 14:22:55 -05:00
Hongyu Jin	6e5f0f6383	dm io: Support IO priority Some IO will dispatch from kworker with different io_context settings than the submitting task, we may need to specify a priority to avoid losing priority. Add IO priority parameter to dm_io() and update all callers. Co-developed-by: Yibin Ding <yibin.ding@unisoc.com> Signed-off-by: Yibin Ding <yibin.ding@unisoc.com> Signed-off-by: Hongyu Jin <hongyu.jin@unisoc.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 14:22:51 -05:00
Mike Snitzer	1e00d57694	dm vdo logger: update logging to start with "device-mapper: vdo" Stops short of actually using DM's various logging macros (e.g. DMERR, DMINFO, etc) because VDO's logger isn't quite compatible with them. Also switch emit_log_message_to_kernel() from open-coding printk with log-level to using corresponding pr_ macro. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:19 -05:00
Mike Snitzer	318a9ce59b	dm vdo logger: switch UDS_LOG_NOTICE to be alias for UDS_LOG_INFO Prepare to bring VDO's logging closer to DM's logging by eliminating support for KERN_NOTICE log level (DM hasn't ever had a need for it). Only one message in index-session.c used UDS_LOG_NOTICE, convert it to log with uds_log_info(). Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:19 -05:00
Mike Snitzer	cae3816d99	dm vdo: tweak wait_for_completion_interruptible callers Update uds_join_threads to delay in wait_for_completion_interruptible loop. And cleanup style nits in perform_admin_operation(). Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:19 -05:00
Mike Snitzer	5581a43d30	dm vdo delta-index: fix various small nits Fix some needless line wrapping (given surrounding context), missing braces and some stale or incorrect references to data structure or function name. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:19 -05:00
Mike Snitzer	dea93aab18	dm vdo chapter_index: fix a few small nits Add missing braces and raise one function arg up a line to eliminate line wrap. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:19 -05:00
Mike Snitzer	571eff3969	dm vdo: cleanup style for comments in structs Use /* ... */ rather than /** ... */ if for no other reason than syntax highlighting is improved (at least for me, in emacs: comments are now red, code is yellow. Previously comments were also yellow). Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:19 -05:00
Mike Snitzer	d008f6eeab	dm vdo dedupe: fix various small nits Add a __must_hold sparse annotation to launch_dedupe_state_change that reflects its ASSERTION code comments about locking requirements, add some extra braces and fix a couple typos. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:18 -05:00
Mike Snitzer	181547bbb8	dm vdo string-utils: remove unnecessary includes Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:18 -05:00
Ken Raeburn	5f770bd1f2	dm vdo message-stats: reformat to remove excessive newlines Signed-off-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:18 -05:00
Mike Snitzer	fbbd7a25e8	dm vdo: use #define for NO_CHAPTER and NO_CHAPTER_INDEX_ENTRY Avoids unconventional use of 'static const' and enum in headers. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:18 -05:00
Susan LeGendre-McGhee	b196d6bd30	dm vdo: move encoding constants to encodings.c Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:18 -05:00
Matthew Sakai	ea9ca07aff	dm vdo: add documentation details on zones and locking Add details describing the vdo zone and thread model to the documentation comments for major vdo components. Also added some high-level description of the block map structure. Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:18 -05:00
Mike Snitzer	b863d7f750	dm vdo recovery-journal: fix sparse 'mixed bitwiseness' warning Only one user of WRITE_FLAGS so no need to factor it out in an enum (which causes sparse's 'mixed bitwiseness' warning). Just use the flags in the only consumer. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:18 -05:00
Mike Snitzer	f46b1ab7e7	dm vdo dedupe: silence sparse warnings about locking context imbalances Annotate both open_index() and close_index() with __must_hold(&zones->lock) to silence these sparse warnings: warning: context imbalance in 'close_index' - unexpected unlock warning: context imbalance in 'open_index' - unexpected unlock Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:18 -05:00
Mike Snitzer	872564c501	dm vdo data-vio: silence sparse warnings about locking context imbalances Factor wait_permit() out from acquire_permit() so that the latter always holds the spinlock and the former always releases it. Otherwise sparse complains about locking context imbalances due to conditional spin_unlock in acquire_permit: warning: context imbalance in 'acquire_permit' - different lock contexts for basic block warning: context imbalance in 'vdo_launch_bio' - unexpected unlock Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:18 -05:00
Mike Snitzer	a6c05c981e	dm vdo: fix various blk_opf_t sparse warnings Use proper blk_opf_t type rather than unsigned int. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:18 -05:00
Mike Snitzer	ff91994648	dm vdo: fix sparse 'warning: Using plain integer as NULL pointer' Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:18 -05:00
Mike Snitzer	3fa8e6ec07	dm vdo: fix sparse warnings about missing statics Addresses various sparse warnings like: warning: symbol 'SYMBOL' was not declared. Should it be static? Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:18 -05:00
Mike Snitzer	952b57a58d	dm vdo: rename struct configuration to uds_configuration Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:18 -05:00
Mike Snitzer	7f67d0f1c8	dm vdo: rename struct geometry to index_geometry Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:18 -05:00
Mike Snitzer	5c45cd10c0	dm vdo index: fix various small nits Add braces around multi-line while loops and if statements. Also remove excess newlines. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Chung Chung <cchung@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:17 -05:00
Mike Snitzer	ac9ae5769d	dm vdo dedupe: fix various small nits Remove extra blank line, mark function inline, add missing braces, and fix a typo in a comment. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Chung Chung <cchung@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:17 -05:00
Mike Snitzer	1ccef45aa8	dm vdo slab-depot: fix various small nits Comment typo, whitespace issues, mark function inline. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Chung Chung <cchung@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:17 -05:00
Mike Snitzer	97b6f0e752	dm vdo data-vio: rename is_trim flag to is_discard Eliminate use of "trim" in favor of "discard" since it reflects the top-level Linux discard primitive rather than the ATA specific ditto. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:17 -05:00
Mike Snitzer	f7c1c2e085	dm vdo: rename vdo_map_to_system_error to vdo_status_to_errno Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:17 -05:00
Mike Snitzer	c10497b3b1	dm vdo: rename uds_map_to_system_error to uds_status_to_errno Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:17 -05:00
Mike Snitzer	86492a3f69	dm vdo: slight cleanup of UDS error codes No need to increment each UDS_ error code manually (relative to UDS_ERROR_CODE_BASE). Also, remove unused PRP_BLOCK_START and PRP_BLOCK_END. Lastly, UDS_SUCCESS and VDO_SUCCESS are used interchangeably; so best to explicitly set VDO_SUCCESS equal to UDS_SUCCESS. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:17 -05:00
Mike Snitzer	b06d5c37b8	dm vdo block-map: rename struct cursors member to 'completion' 'completion' is more informative name for a 'struct vdo_completion' than 'parent'. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:17 -05:00
Mike Snitzer	3ccf136a49	dm vdo block-map: avoid extra dereferences to access vdo object The vdo_page_cache's 'vdo' is the same as the block_map's vdo instance, so use that to save 2 extra dereferences. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:17 -05:00
Mike Snitzer	36778716a2	dm vdo block-map: remove extra vdo arg from initialize_block_map_zone The block_map is passed to initialize_block_map_zone, but the block_map's vdo member is already initialized with the same vdo instance, so just use it. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:17 -05:00
Mike Snitzer	8810d3d594	dm vdo block-map: use uds_log_ratelimit() rather than open code it Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:17 -05:00
Mike Snitzer	6bda10727d	dm vdo block-map: fix a few small nits Rename 'pages' to 'num_pages' in distribute_page_over_waitq(). Update assert message in validate_completed_page() to model others. Tweak line-wrapping on a comment that was needlessly long. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:17 -05:00
Mike Snitzer	f36b1d3ba5	dm vdo: use a proper Makefile for dm-vdo Requires moving dm-vdo-target.c into drivers/md/dm-vdo/ This change adds a proper drivers/md/dm-vdo/Makefile and eliminates the abnormal use of patsubst in drivers/md/Makefile -- which was the cause of at least one build failure that was reported by the upstream build bot. Also, split out VDO's drivers/md/dm-vdo/Kconfig and include it from drivers/md/Kconfig Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:17 -05:00
Mike Snitzer	4c79d55678	dm vdo: fix how dm_kcopyd_client_create() failure is checked dm_kcopyd_client_create() returns an ERR_PTR so its return must be checked with IS_ERR(). Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Chung Chung <cchung@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:17 -05:00
Bruce Johnston	9165dac822	dm vdo int-map: remove unused parameter from vdo_int_map_create Reviewed-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:16 -05:00
Bruce Johnston	ffb8d96541	dm vdo int-map: rename functions to use a common vdo_int_map preamble Reviewed-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:16 -05:00
Bruce Johnston	db6b0a7ffe	dm vdo dedupe: switch to using int-map instead of pointer-map Use get_unaligned_le64() on the hash lock's record name to serve as the key to use with the int hash-map. Switching to using int hash-map removes the only consumer of pointer hash-map, as such it is removed. Reviewed-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:16 -05:00
Mike Snitzer	a4bba246ec	dm vdo wait-queue: rename to vdo_waitq_dequeue_waiter Rename vdo_waitq_dequeue_next_waiter to vdo_waitq_dequeue_waiter. The "next" aspect of returned waiter is implied. "next" also isn't informative ("oldest" would be). Removing "next_" adds symmetry to vdo_waitq_enqueue_waiter(). Also fix whitespace and comments from previous waitq commit. Reviewed-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:16 -05:00
Mike Snitzer	29f0ef873c	dm vdo block-map: optimize enter_zone_read_only_mode Rather than incrementally dequeue from the zone->flush_waiters vdo_wait_queue, simply re-initialize it. Reviewed-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:16 -05:00
Mike Snitzer	e752e5c33b	dm vdo wait-queue: optimize vdo_waitq_dequeue_matching_waiters Remove temporary 'matched_waiters' waitq and just enqueue matched waiters directly to the caller provided 'matched_waitq'. Reviewed-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:16 -05:00
Mike Snitzer	cd1227dd83	dm vdo wait-queue: remove unused debug function vdo_waitq_get_next_waiter Reviewed-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:16 -05:00
Mike Snitzer	d6e260cc42	dm vdo wait-queue: add proper namespace to interface Rename various interfaces and structs associated with vdo's wait-queue, e.g.: s/wait_queue/vdo_wait_queue/, s/waiter/vdo_waiter/, etc. Now all function names start with "vdo_waitq_" or "vdo_waiter_". Reviewed-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:16 -05:00
Mike Snitzer	46a707cce0	dm vdo io-submitter: rename to vdo_submit_vio and submit_data_vio Rename process_vio_io() to vdo_submit_vio(), and process_data_vio_io() to submit_data_vio(). Reviewed-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:16 -05:00
Mike Snitzer	d58d3c86c3	dm vdo io-submitter: rename to vdo_submit_data_vio Rename submit_data_vio_io() to vdo_submit_data_vio(). Reviewed-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:16 -05:00
Mike Snitzer	ebe16015c3	dm vdo io-submitter: rename to vdo_submit_flush_vio Rename submit_flush_vio() to vdo_submit_flush_vio(). Reviewed-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:16 -05:00
Mike Snitzer	f7f46761cc	dm vdo io-submitter: rename to vdo_submit_metadata_vio Rename submit_metadata_vio() to vdo_submit_metadata_vio(). Reviewed-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:16 -05:00
Mike Snitzer	0dc2009d97	dm vdo io-submitter: remove get_bio_sector Just open-code access to bio's sector. Reviewed-by: Susan LeGendre-McGhee <slegendr@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Matthew Sakai <msakai@redhat.com>	2024-02-20 13:43:16 -05:00
Matthew Sakai	f11aca85b0	dm vdo: enable configuration and building of dm-vdo dm-vdo targets are not supported for 32-bit configurations. A vdo target typically requires 1 to 1.5 GB of memory at any given time, which is likely a large fraction of the addressable memory of a 32-bit system. At the same time, the amount of addressable storage attached to a 32-bit system may not be large enough for deduplication to provide much benefit. Because of these concerns, 32-bit platforms are deemed unlikely to benefit from using a vdo target, so dm-vdo is targeted only at 64-bit platforms. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: John Wiele <jwiele@redhat.com> Signed-off-by: John Wiele <jwiele@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:16 -05:00
Matthew Sakai	03d1e20fa1	dm vdo: add the top-level DM target This adds the dm-vdo target. The dm-vdo target provides inline deduplication, compression, and zero-block elimination, allowing applications to consume less actual storage than a normal target. By layering it with other device mapper targets, it can add these features to any storage stack. It can also provide a common deduplication pool for groups of targets. The vdo target does not protect against data corruption, relying instead on integrity protection of the storage below it. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Co-developed-by: Bruce Johnston <bjohnsto@redhat.com> Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Co-developed-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:15 -05:00
Matthew Sakai	29a811959c	dm vdo: add debugging support Add support for dumping detailed vdo state to the kernel log via a dmsetup message. The dump code is not thread-safe and is generally intended for use only when the vdo is hung. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Co-developed-by: Bruce Johnston <bjohnsto@redhat.com> Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Co-developed-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:15 -05:00
Matthew Sakai	92f8d7a94f	dm vdo: add sysfs support for setting parameters and fetching stats Add data and methods for setting run-time parameters via sysfs, and for making state and statistics information available through sysfs. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Co-developed-by: Bruce Johnston <bjohnsto@redhat.com> Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:15 -05:00
Matthew Sakai	a9457ab9d0	dm vdo: add statistics reporting Add data and methods to report statistics. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:15 -05:00
Matthew Sakai	827c6389c6	dm vdo: add the on-disk formats and marshalling of vdo structures Add data and methods for marshalling and unmarshalling the persistent metadata. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:15 -05:00
Matthew Sakai	06e932fea1	dm vdo: add the primary vdo structure Add the data and methods that manage the dm-vdo target itself. This includes the overall state of the target and its threads, the state of the logical volumes, startup, shutdown, and statistics. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:15 -05:00
Matthew Sakai	4fa98386be	dm vdo: add repair of damaged vdo volumes When a vdo is restarted after a crash, it will automatically attempt to recover from its journals. If a vdo encounters an unrecoverable error, it will enter read-only mode. This mode indicates that some previously acknowledged data may have been lost. The vdo may be instructed to rebuild as best it can in order to return to a writable state. Although some data may be lost, this process will ensure that the vdo's own metadata is self-consistent. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:15 -05:00
Matthew Sakai	95a7235768	dm vdo: add the recovery journal The recovery journal is used to amortize updates across the block map and slab depot. Each write request causes an entry to be made in the journal. Entries are either "data remappings" or "block map remappings." For a data remapping, the journal records the logical address affected and its old and new physical mappings. For a block map remapping, the journal records the block map page number and the physical block allocated for it (block map pages are never reclaimed, so the old mapping is always 0). Each journal entry and the data write it represents must be stable on disk before the other metadata structures may be updated to reflect the operation. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:15 -05:00
Matthew Sakai	14d531d7b7	dm vdo: implement the block map page cache The set of leaf pages of the block map tree is too large to fit in memory, so each block map zone maintains a cache of leaf pages. This patch adds the implementation of that cache. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:15 -05:00
Matthew Sakai	ddb12d6714	dm vdo: add the block map The block map contains the logical to physical mapping. It can be thought of as an array with one entry per logical address. Each entry is 5 bytes: 36 bits contain the physical block number which holds the data for the given logical address, and the remaining 4 bits are used to indicate the nature of the mapping. Of the 16 possible states, one represents a logical address which is unmapped (i.e. it has never been written, or has been discarded), one represents an uncompressed block, and the other 14 states are used to indicate that the mapped data is compressed, and which of the compression slots in the compressed block this logical address maps to. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:15 -05:00
Matthew Sakai	7ce49449ff	dm vdo: add the slab depot Add the data and methods that implement the slab_depot that manages the allocation of slabs of blocks added by the preceding patches. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:15 -05:00
Matthew Sakai	c9ba9fd33c	dm vdo: add the block allocators and physical zones Each slab is independent of every other. They are assigned to "physical zones" in round-robin fashion. If there are P physical zones, then slab n is assigned to zone n mod P. The set of slabs in each physical zone is managed by a block allocator. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:15 -05:00
Matthew Sakai	883069e30e	dm vdo: add the slab summary The slab depot maintains an additional small data structure, the "slab summary," which is used to reduce the amount of work needed to come back online after a crash. The slab summary maintains an entry for each slab indicating whether or not the slab has ever been used, whether it is clean (i.e. all of its reference count updates have been persisted to storage), and approximately how full it is. During recovery, each physical zone will attempt to recover at least one slab, stopping whenever it has recovered a slab which has some free blocks. Once each zone has some space (or has determined that none is available), the target can resume normal operation in a degraded mode. Read and write requests can be serviced, perhaps with degraded performance, while the remainder of the dirty slabs are recovered. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:15 -05:00
Matthew Sakai	09eff388df	dm vdo: add slab structure, slab journal and reference counters Most of the vdo volume belongs to the slab depot. The depot contains a collection of slabs. The slabs can be up to 32GB, and are divided into three sections. Most of a slab consists of a linear sequence of 4K blocks. These blocks are used either to store data, or to hold portions of the block map (see subsequent patches). In addition to the data blocks, each slab has a set of reference counters, using 1 byte for each data block. Finally, each slab has a journal. Reference updates are written to the slab journal, which is written out one block at a time as each block fills. A copy of the reference counters is kept in memory and is written out a block at a time, in oldest-dirtied order, whenever there is a need to reclaim slab journal space. The journal is used both to ensure that the main recovery journal (see subsequent patches) can regularly free up space, and also to amortize the cost of updating individual reference blocks. This patch adds the slab structure as well as the slab journal and reference counters. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:15 -05:00
Matthew Sakai	58a55a5916	dm vdo: add the compressed block bin packer When blocks do not deduplicate, vdo will attempt to compress them. Up to 14 compressed blocks may be packed into a single data block (this limitation is imposed by the block map). The packer implements a simple best-fit packing algorithm and also manages the formatting and writing of compressed blocks when bins fill. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:15 -05:00
Matthew Sakai	b053056133	dm vdo: add use of deduplication index in hash zones Add the data and methods that manage queries to the deduplication index and the responses from the index. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Co-developed-by: Bruce Johnston <bjohnsto@redhat.com> Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Co-developed-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:14 -05:00
Matthew Sakai	cfaf07fae7	dm vdo: add hash locks and hash zones In order to deduplicate concurrent writes of the same data (to different locations), data_vios which are writing the same data are grouped together in a "hash lock," named for and keyed by the hash of the data being written. Each hash lock is assigned to a hash zone based on a portion of its hash. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:14 -05:00
Matthew Sakai	c65bfacedc	dm vdo: add the vdo io_submitter The io_submitter handles bio submission from vdo data store to the storage below. It will merge bios when possible. Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net> Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net> Co-developed-by: Michael Sclafani <dm-devel@lists.linux.dev> Signed-off-by: Michael Sclafani <dm-devel@lists.linux.dev> Co-developed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Co-developed-by: Bruce Johnston <bjohnsto@redhat.com> Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Co-developed-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Ken Raeburn <raeburn@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>	2024-02-20 13:43:14 -05:00
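
Commit ddb12d6714 above describes each block map entry as 5 bytes: 36 bits of physical block number plus a 4-bit mapping state, with 16 states covering unmapped, uncompressed, and 14 compression slots. A minimal sketch of how such an entry could be packed and unpacked follows. The byte layout, the numeric state values, and every identifier here are assumptions made for illustration; they are not taken from the dm-vdo sources, which may arrange the bits differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names and values; the real dm-vdo encoding may differ. */
enum {
	ENTRY_SIZE = 5,            /* bytes per block map entry */
	STATE_UNMAPPED = 0,        /* never written, or discarded */
	STATE_UNCOMPRESSED = 1,    /* data occupies a whole physical block */
	STATE_COMPRESSED_BASE = 2  /* states 2..15 select slots 0..13 */
};

struct mapping {
	uint64_t pbn;   /* physical block number, must fit in 36 bits */
	uint8_t state;  /* 4-bit mapping state, 0..15 */
};

/* Pack a mapping into 5 bytes (assumed layout, see comments below). */
static void pack_entry(struct mapping m, uint8_t out[ENTRY_SIZE])
{
	assert(m.pbn < (1ULL << 36));
	assert(m.state < 16);
	/* Byte 0: top 4 bits of the PBN in the high nibble, state in the low. */
	out[0] = (uint8_t)((((m.pbn >> 32) & 0x0F) << 4) | m.state);
	/* Bytes 1..4: low 32 bits of the PBN, little-endian. */
	for (int i = 0; i < 4; i++)
		out[1 + i] = (uint8_t)(m.pbn >> (8 * i));
}

/* Reverse of pack_entry(). */
static struct mapping unpack_entry(const uint8_t in[ENTRY_SIZE])
{
	struct mapping m;

	m.state = in[0] & 0x0F;
	m.pbn = (uint64_t)(in[0] >> 4) << 32;
	for (int i = 0; i < 4; i++)
		m.pbn |= (uint64_t)in[1 + i] << (8 * i);
	return m;
}

/* Return the compression slot (0..13) for a state, or -1 if not compressed. */
static int compression_slot(uint8_t state)
{
	return state >= STATE_COMPRESSED_BASE ? state - STATE_COMPRESSED_BASE : -1;
}
```

Under these assumptions, the 14 compressed states line up with the limit of 14 compressed blocks per data block mentioned in the bin packer commit (58a55a5916).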

8114 Commits