IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmRCvdsQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpg4oD/457EJ21Fm36NuyT/S0Cr8ok9Tdk7t9BeBh
V/9CYThoXr5aqAox0Vq23FF+Rhzm81GzwYERN4493LBblliNeNOo2IaXF9/7qrUW
11v9Bkug2J3k3hRGtEa6Zl0EpMu+FRLsNpchjFS2KPuOq+iMDxrvwuy50kidWg7n
r25e4UwpExVO9fIoUSmzgWVfRHOTuj9yiG/UsaH2+2BRXerIX0Q1tyElwmcGh25M
Ad2hN+yDnuIbNA5gNUpnzY32Dp0zjAsquc//QOvq9mltcNTElokB8idGliismvyd
8qF0lkwQwewOBT/sSD5EY3K0Qd8IJu425bvT/yPUDScHz1chxHUoxo5eisIr2M9l
5AL5KHAf7Zzs8ZuV+IYPzZ5qM6a/vF3mHUisKRNKYVhF46Nmd4cBratfXwWb1MxV
clQM2qr0TLOYli9mOeTXph3hg/rBVqKqf90boAZoN8b2tWBKlMykpqRadbepjrgx
bmBSwwAF99NxIHEjU3U5DMdUloCSiMZIfMfDxQrPNDrfWAW4xJs5Ym0VeOjEotTt
oFEs1fr6c3Mn7KEuPPfOtnDxvs51IP/B8+gDgMt/edf+wHiCU1Zm31u2gxt2dsKh
g73Y92i5SHjIf36H5szBTeioyMy1E1VA9HF14xWz2eKdQ+wxQ9VNWoctcJ85k3F4
6AZDYRIrWA==
=EaE9
-----END PGP SIGNATURE-----
Merge tag 'iter-ubuf.2-2023-04-21' of git://git.kernel.dk/linux
Pull ITER_UBUF updates from Jens Axboe:
"This turns singe vector imports into ITER_UBUF, rather than
ITER_IOVEC.
The former is more trivial to iterate and advance, and hence a bit
more efficient. From some very unscientific testing, ~60% of all iovec
imports are single vector"
* tag 'iter-ubuf.2-2023-04-21' of git://git.kernel.dk/linux:
iov_iter: Mark copy_compat_iovec_from_user() noinline
iov_iter: import single vector iovecs as ITER_UBUF
iov_iter: convert import_single_range() to ITER_UBUF
iov_iter: overlay struct iovec and ubuf/len
iov_iter: set nr_segs = 1 for ITER_UBUF
iov_iter: remove iov_iter_iovec()
iov_iter: add iter_iov_addr() and iter_iov_len() helpers
ALSA: pcm: check for user backed iterator, not specific iterator type
IB/qib: check for user backed iterator, not specific iterator type
IB/hfi1: check for user backed iterator, not specific iterator type
iov_iter: add iter_iovec() helper
block: ensure bio_alloc_map_data() deals with ITER_UBUF correctly
The code for setting a block device capacity (bd_nr_sectors field of
struct block_device) is duplicated in set_capacity() and
bdev_set_nr_sectors(). Clean this up by making bdev_set_nr_sectors()
a block layer internal function defined in block/bdev.c instead of
having this function statically defined in block/partitions/core.c.
With this change, set_capacity() implementation can be simplified to
only calling bdev_set_nr_sectors().
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230424131318.79935-1-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This reverts commit 23f3e3272e7a4d9fb870485cd6df1e4f9539282c.
blk-mq sched bio merge still needs request to grab queue usage counter,
so we can't simply call blk_mq_attempt_bio_merge() when queue usage
counter isn't held.
Fixes: 23f3e3272e7a ("block: Merge bio before checking ->cached_rq")
Cc: Xiao Ni <xni@redhat.com>
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230420112018.1108058-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Locking range start and locking range length
attributes may be require to satisfy restrictions
exposed by OPAL2 geometry feature reporting.
Geometry reporting feature is described in TCG OPAL SSC,
section 3.1.1.4 (ALIGN, LogicalBlockSize, AlignmentGranularity
and LowestAlignedLBA).
4.3.5.2.1.1 RangeStart Behavior:
[ StartAlignment = (RangeStart modulo AlignmentGranularity) - LowestAlignedLBA ]
When processing a Set method or CreateRow method on the Locking
table for a non-Global Range row, if:
a) the AlignmentRequired (ALIGN above) column in the LockingInfo
table is TRUE;
b) RangeStart is non-zero; and
c) StartAlignment is non-zero, then the method SHALL fail and
return an error status code INVALID_PARAMETER.
4.3.5.2.1.2 RangeLength Behavior:
If RangeStart is zero, then
[ LengthAlignment = (RangeLength modulo AlignmentGranularity) - LowestAlignedLBA ]
If RangeStart is non-zero, then
[ LengthAlignment = (RangeLength modulo AlignmentGranularity) ]
When processing a Set method or CreateRow method on the Locking
table for a non-Global Range row, if:
a) the AlignmentRequired (ALIGN above) column in the LockingInfo
table is TRUE;
b) RangeLength is non-zero; and
c) LengthAlignment is non-zero, then the method SHALL fail and
return an error status code INVALID_PARAMETER
In userspace we stuck to logical block size reported by general
block device (via sysfs or ioctl), but we can not read
'AlignmentGranularity' or 'LowestAlignedLBA' anywhere else and
we need to get those values from sed-opal interface otherwise
we will not be able to report or avoid locking range setup
INVALID_PARAMETER errors above.
Signed-off-by: Ondrej Kozina <okozina@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Tested-by: Milan Broz <gmazyland@gmail.com>
Link: https://lore.kernel.org/r/20230411090931.9193-2-okozina@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Guard all the code to punt bios to a per-cgroup submission helper by a
new CONFIG_BLK_CGROUP_PUNT_BIO symbol that is selected by btrfs.
This way non-btrfs kernel builds don't need to have this code.
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
async_bio_lock is only taken from bio submission and workqueue context,
both are never in bottom halves.
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Sterba <dsterba@suse.com>
REQ_CGROUP_PUNT is a bit annoying as it is hard to follow and adds
a branch to the bio submission hot path. To fix this, export
blkcg_punt_bio_submit and let btrfs call it directly. Add a new
REQ_FS_PRIVATE flag for btrfs to indicate to it's own low-level
bio submission code that a punt to the cgroup submission helper
is required.
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit b12e5c6c755a accidentally changes blk_kick_flush to do a head
insert into the requeue list, fix this up.
Fixes: b12e5c6c755a ("blk-mq: pass a flags argument to blk_mq_add_to_requeue_list")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230416073553.966161-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We have a long chain of memory dereferencing just to whether or not
this disk has a special submit_bio helper. As that's not necessarily
the common case, add a bd_has_submit_bio state in the bdev to avoid
traversing this memory dependency chain if we don't need to.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
__blk_mq_run_hw_queue just contains a WARN_ON_ONCE for calls from
interrupt context and a blk_mq_run_dispatch_ops-protected call to
blk_mq_sched_dispatch_requests. Open code the call to
blk_mq_sched_dispatch_requests in both callers, and move the WARN_ON_ONCE
to blk_mq_run_hw_queue where it can be extended to all !async calls,
while the other call is from workqueue context and thus obviously does
not need the assert.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413060651.694656-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Only blk_mq_run_hw_queue can call __blk_mq_delay_run_hw_queue with
async=false, so move the handling there.
With this __blk_mq_delay_run_hw_queue can be merged into
blk_mq_delay_run_hw_queue.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413060651.694656-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
For the in-context dispatch, blk_mq_hctx_stopped is alredy checked in
blk_mq_sched_dispatch_requests under blk_mq_run_dispatch_ops() protection.
For the async dispatch case having a check before scheduling the work
still makes sense to avoid needless workqueue scheduling, so just keep it
for that case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413060651.694656-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk_mq_hctx_stopped is already checked in blk_mq_sched_dispatch_requests
under blk_mq_run_dispatch_ops() protection, so remove the duplicate check.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413060651.694656-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
__blk_mq_sched_dispatch_requests currently has duplicated logic
for the cases where requests are on the hctx dispatch list or not.
Merge the two with a new need_dispatch variable and remove a few
pointless local variables.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413060651.694656-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Replace the boolean at_head argument with the same flags that are already
passed to blk_mq_insert_request.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-21-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Instead of passing a bool at_head, pass down the full flags from the
blk_mq_insert_request interface.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-20-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Replace the boolean at_head argument with the same flags that are already
passed to blk_mq_insert_request.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-19-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Replace the at_head bool with a flags argument that so far only contains
a single BLK_MQ_INSERT_AT_HEAD value. This makes it much easier to grep
for head insertions into the blk-mq dispatch queues.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-18-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk_mq_add_to_requeue_list takes a bool parameter to control how to kick
the requeue list at the end of the function. Move the call to
blk_mq_kick_requeue_list to the callers that want it instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-17-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk_mq_request_bypass_insert takes a bool parameter to control how to run
the queue at the end of the function. Move the blk_mq_run_hw_queue call
to the callers that want it instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-16-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk_mq_insert_request takes two bool parameters to control how to run
the queue at the end of the function. Move the blk_mq_run_hw_queue call
to the callers that want it instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-15-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Due to the wildly different behavior based on the bypass_insert argument,
not a whole lot of code in __blk_mq_try_issue_directly is actually shared
between blk_mq_try_issue_directly and blk_mq_request_issue_directly.
Remove __blk_mq_try_issue_directly and fold the code into the two callers
instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-14-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Factor out a helper from __blk_mq_try_issue_directly in preparation
of folding that function into its two callers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-13-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Split the RQF_DONTPREP and RQF_SOFTBARRIER in separate branches to make
the code more readable.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-12-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
While both passthrough and flush requests call directly into
blk_mq_request_bypass_insert, the parameters aren't the same.
Split the handling into two separate conditionals and turn the whole
function into an if/elif/elif/else flow instead of the gotos.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-11-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Just call blk_mq_add_to_requeue_list directly from the two callers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-10-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Remove this very small helper and fold it into the only caller.
Note that this moves the trace_block_rq_insert out of ctx->lock, matching
the other calls to this tracepoint.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-9-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There is no good point in keeping the __blk_mq_insert_request around
for two function calls and a singler caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-8-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk_mq_sched_insert_request is the main request insert helper and not
directly I/O scheduler related. Move blk_mq_sched_insert_request to
blk-mq.c, rename it to blk_mq_insert_request and mark it static.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk_mq_dispatch_plug_list is the only caller of
blk_mq_sched_insert_requests, and it makes sense to just fold it there
as blk_mq_sched_insert_requests isn't specific to I/O schedulers despite
the name.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Move all logic related to the direct insert (including the call to
blk_mq_run_hw_queue) into blk_mq_insert_requests to streamline the code
flow up a bit, and to allow marking blk_mq_try_issue_list_directly
static.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
block/blk-mq.h needs various definitions from <linux/blk-mq.h>,
include it there instead of relying on the source files to include
both.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk-mq-tag.h is always included by blk-mq.h, and causes recursive
inclusion hell with further changes. Just merge it into blk-mq.h
instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Plugs never insert at head, so don't plug for head insertions.
Fixes: 1c2d2fff6dc0 ("block: wire-up support for passthrough plugging")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230413064057.707578-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk_throtl_register() will unconditionally enable blk-stat for gendisk
when register, even when we have no BLK_DEV_THROTTLING_LOW config.
Since the kernel always has only BLK_DEV_THROTTLING config and the
BLK_DEV_THROTTLING_LOW config is still in EXPERIMENTAL state, we can
just skip blk-stat when !BLK_DEV_THROTTLING_LOW.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230413062805.2081970-2-chengming.zhou@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We need to set QUEUE_FLAG_STATS for two cases:
1. blk_stat_enable_accounting()
2. blk_stat_add_callback()
So we should clear it only when ((q->stats->accounting == 0) &&
list_empty(&q->stats->callbacks)).
blk_stat_disable_accounting() only check if q->stats->accounting
is 0 before clear the flag, this patch fix it.
Also add list_empty(&q->stats->callbacks)) check when enable, or
the flag is already set.
The bug can be reproduced on kernel without BLK_DEV_THROTTLING
(since it unconditionally enable accounting, see the next patch).
# cat /sys/block/sr0/queue/scheduler
none mq-deadline [bfq]
# cat /sys/kernel/debug/block/sr0/state
SAME_COMP|IO_STAT|INIT_DONE|STATS|REGISTERED|NOWAIT|30
# echo none > /sys/block/sr0/queue/scheduler
# cat /sys/kernel/debug/block/sr0/state
SAME_COMP|IO_STAT|INIT_DONE|REGISTERED|NOWAIT
# cat /sys/block/sr0/queue/wbt_lat_usec
75000
We can see that after changing elevator from "bfq" to "none",
"STATS" flag is lost even though WBT callback still need it.
Fixes: 68497092bde9 ("block: make queue stat accounting a reference")
Cc: <stable@vger.kernel.org> # v5.17+
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230413062805.2081970-1-chengming.zhou@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Other rq_qos policies such as wbt and iocost are lazy-initialized when they
are configured for the first time for the device but iolatency is
initialized unconditionally from blkcg_init_disk() during gendisk init. Lazy
init is beneficial because rq_qos policies add runtime overhead when
initialized as every IO has to walk all registered rq_qos callbacks.
This patch switches iolatency to lazy initialization too so that it only
registered its rq_qos policy when it is first configured.
Note that there is a known race condition between blkcg config file writes
and del_gendisk() and this patch makes iolatency susceptible to it by
exposing the init path to race against the deletion path. However, that
problem already exists in iocost and is being worked on.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20230413000649.115785-5-tj@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The name was too generic given that there are multiple blkcg rq-qos
policies.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20230413000649.115785-4-tj@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We want to support lazy init of rq-qos policies so that iolatency is enabled
lazily on configuration instead of gendisk initialization. The way blkg
config helpers are structured now is a bit awkward for that. Let's
restructure:
* blkcg_conf_open_bdev() is renamed to blkg_conf_open_bdev(). The blkcg_
prefix was used because the bdev opening step is blkg-independent.
However, the distinction is too subtle and confuses more than helps. Let's
switch to blkg prefix so that it's consistent with the type and other
helper names.
* struct blkg_conf_ctx now remembers the original input string and is always
initialized by the new blkg_conf_init().
* blkg_conf_open_bdev() is updated to take a pointer to blkg_conf_ctx like
blkg_conf_prep() and can be called multiple times safely. Instead of
modifying the double pointer to input string directly,
blkg_conf_open_bdev() now sets blkg_conf_ctx->body.
* blkg_conf_finish() is renamed to blkg_conf_exit() for symmetry and now
must be called on all blkg_conf_ctx's which were initialized with
blkg_conf_init().
Combined, this allows the users to either open the bdev first or do it
altogether with blkg_conf_prep() which will help implementing lazy init of
rq-qos policies.
blkg_conf_init/exit() will also be used implement synchronization against
device removal. This is necessary because iolat / iocost are configured
through cgroupfs instead of one of the files under /sys/block/DEVICE. As
cgroupfs operations aren't synchronized with block layer, the lazy init and
other configuration operations may race against device removal. This patch
makes blkg_conf_init/exit() used consistently for all cgroup-orginating
configurations making them a good place to implement explicit
synchronization.
Users are updated accordingly. No behavior change is intended by this patch.
v2: bfq wasn't updated in v1 causing a build error. Fixed.
v3: Update the description to include future use of blkg_conf_init/exit() as
synchronization points.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Yu Kuai <yukuai1@huaweicloud.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230413000649.115785-3-tj@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Now that all RCU flavors have been combined either holding a spin lock,
disabling irq or disabling preemption implies RCU read lock, so there's no
need to use rcu_read_[un]lock() explicitly while holding queue_lock. This
shouldn't cause any behavior changes.
v2: Description updated. Leave __acquires/release on queue_lock alone.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230413000649.115785-2-tj@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
No rdma device exposes its irq vectors affinity today. So the only
mapping that we have left, is the default blk_mq_map_queues, which
we fallback to anyways. Also fixup the only consumer of this helper
(nvme-rdma).
Remove this now dead code.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
BLK_STS_NEXUS is used for NVMe/SCSI reservation conflicts and DASD's
locking feature which works similar to NVMe/SCSI reservations where a
host can get a lock on a device and when the lock is taken it will get
failures.
This patch renames BLK_STS_NEXUS so it better reflects this type of
use.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20230407200551.12660-3-michael.christie@oracle.com
Acked-by: Stefan Haberland <sth@linux.ibm.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently if disk_scan_partitions() failed, GD_NEED_PART_SCAN will still
set, and partition scan will be proceed again when blkdev_get_by_dev()
is called. However, this will cause a problem that re-assemble partitioned
raid device will creat partition for underlying disk.
Test procedure:
mdadm -CR /dev/md0 -l 1 -n 2 /dev/sda /dev/sdb -e 1.0
sgdisk -n 0:0:+100MiB /dev/md0
blockdev --rereadpt /dev/sda
blockdev --rereadpt /dev/sdb
mdadm -S /dev/md0
mdadm -A /dev/md0 /dev/sda /dev/sdb
Test result: underlying disk partition and raid partition can be
observed at the same time
Note that this can still happen in come corner cases that
GD_NEED_PART_SCAN can be set for underlying disk while re-assemble raid
device.
Fixes: e5cfefa97bcc ("block: fix scan partition for exclusively open device again")
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blkcg_policy cpd_init_fn() is used to just initialize some default
fields of policy data, which is enough to do in cpd_alloc_fn().
This patch delete the only user bfq_cpd_init(), and remove cpd_init_fn
from blkcg_policy.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230406145050.49914-4-zhouchengming@bytedance.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
cpd_bind_fn is just used for update default weight when block
subsys attached to a hierarchy. No any policy need it anymore.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230406145050.49914-3-zhouchengming@bytedance.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
BFQ_WEIGHT_LEGACY_DFL is the same as CGROUP_WEIGHT_DFL, which means
we don't need cpd_bind_fn() callback to update default weight when
attached to a hierarchy.
This patch remove BFQ_WEIGHT_LEGACY_DFL and cpd_bind_fn().
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20230406145050.49914-2-zhouchengming@bytedance.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Remove several calls to compound_head() and the last caller of
set_page_writeback_keepwrite(), so remove the wrapper too.
Also export bio_add_folio() as this is the first caller from a module.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20230324180129.1220691-4-willy@infradead.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
It returns following attributes:
locking range start
locking range length
read lock enabled
write lock enabled
lock state (RW, RO or LK)
It can be retrieved by user authority provided the authority
was added to locking range via prior IOC_OPAL_ADD_USR_TO_LR
ioctl command. The command was extended to add user in ACE that
allows to read attributes listed above.
Signed-off-by: Ondrej Kozina <okozina@redhat.com>
Tested-by: Luca Boccassi <bluca@debian.org>
Tested-by: Milan Broz <gmazyland@gmail.com>
Link: https://lore.kernel.org/r/20230405111223.272816-6-okozina@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>