queuedata is not referenced in ublk_drv and we can use driver_data
instead. Pass NULL to blk_mq_alloc_disk() as queuedata while allocating
ublk's gendisk.
Signed-off-by: Ziyang Zhang <ZiyangZhang@linux.alibaba.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230207070839.370817-4-ZiyangZhang@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
WRITE_ZEROES, just like FLUSH and DISCARD, does not return a byte count,
so we can end it directly. Add the missing comment for it in
ublk_complete_rq().
Signed-off-by: Ziyang Zhang <ZiyangZhang@linux.alibaba.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230207070839.370817-3-ZiyangZhang@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
bio_has_data() allows a NULL bio so the NULL check in
ublk_rq_has_data() is unnecessary.
Signed-off-by: Ziyang Zhang <ZiyangZhang@linux.alibaba.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230207070839.370817-2-ZiyangZhang@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Merge tag 'block-6.2-2023-02-03' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
"A bit bigger than I'd like at this point, but mostly a bunch of little
fixes. In detail:
- NVMe pull request via Christoph:
- Fix a missing queue put in nvmet_fc_ls_create_association
(Amit Engel)
- Clear queue pointers on tag_set initialization failure
(Maurizio Lombardi)
- Use workqueue dedicated to authentication (Shin'ichiro
Kawasaki)
- Fix for an overflow in ublk (Liu)
- Fix for leaking a queue reference in block cgroups (Ming)
- Fix for a use-after-free in BFQ (Yu)"
* tag 'block-6.2-2023-02-03' of git://git.kernel.dk/linux:
blk-cgroup: don't update io stat for root cgroup
nvme-auth: use workqueue dedicated to authentication
nvme: clear the request_queue pointers on failure in nvme_alloc_io_tag_set
nvme: clear the request_queue pointers on failure in nvme_alloc_admin_tag_set
nvme-fc: fix a missing queue put in nvmet_fc_ls_create_association
block: Fix the blk_mq_destroy_queue() documentation
block: ublk: extending queue_size to fix overflow
block, bfq: fix uaf for bfqq in bic_set_bfqq()
Use the bvec_set_page helper to initialize bvecs.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20230203150634.3199647-11-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Use the bvec_set_virt helper to initialize the special_vec.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20230203150634.3199647-10-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Use the bvec_set_page helper to initialize the copy up bvec.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Link: https://lore.kernel.org/r/20230203150634.3199647-9-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The ->rw_page method is a special-purpose bypass of the usual bio handling
path. It is limited to single-page, synchronous reads and writes, and it
causes a lot of extra code in the drivers, callers and the block layer.
The only remaining user is the MM swap code. Switch that swap code to
simply submit a single-vec on-stack bio and synchronously wait on it, based
on a newly added QUEUE_FLAG_SYNCHRONOUS flag that is set by the drivers
which currently implement ->rw_page. While this touches one extra cache
line and executes extra code, it simplifies the block layer and drivers
and ensures that all features are properly supported by all drivers, e.g.
right now ->rw_page bypasses cgroup writeback entirely.
[akpm@linux-foundation.org: fix comment typo, per Dan]
Link: https://lkml.kernel.org/r/20230125133436.447864-8-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Make the following minor changes which were reported by colleagues
while reviewing this code:
- Remove the parentheses from around the LOOP_DEFAULT_HW_Q_DEPTH
definition since these are superfluous.
- Accept other number formats than decimal, e.g. hexadecimal.
- Do not set hw_queue_depth to an out-of-range value, even if that value
won't be used.
- Use the LOOP_DEFAULT_HW_Q_DEPTH macro in the kernel module parameter
description to prevent that the description gets out of sync.
This patch has been tested as follows:
# modprobe -r loop
# modprobe loop hw_queue_depth=-1
modprobe: ERROR: could not insert 'loop': Invalid argument
# modprobe loop hw_queue_depth=0
modprobe: ERROR: could not insert 'loop': Invalid argument
# modprobe loop hw_queue_depth=1; cat /sys/module/loop/parameters/hw_queue_depth
1
# modprobe -r loop; modprobe loop hw_queue_depth=0x10; cat /sys/module/loop/parameters/hw_queue_depth
16
# modprobe -r loop; modprobe loop hw_queue_depth=128; cat /sys/module/loop/parameters/hw_queue_depth
128
# modprobe -r loop; modprobe loop hw_queue_depth=129; cat /sys/module/loop/parameters/hw_queue_depth
129
# modprobe -r loop; modprobe loop hw_queue_depth=$((1<<32))
modprobe: ERROR: could not insert 'loop': Numerical result out of range
See also commit ef44c50837ab ("loop: allow user to set the queue
depth").
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20230130211347.832110-1-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
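The parsing behaviour exercised by the test log above can be sketched in
userspace C. This is only an illustration and not the loop driver's code:
parse_queue_depth() and SKETCH_MAX_DEPTH are hypothetical names, and the upper
bound is an assumed placeholder rather than the kernel's real limit. Base 0 in
strtoul() is what lets a value such as 0x10 parse as 16, and the output
variable is written only on success, so it never holds an out-of-range value.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Assumed upper bound, for illustration only. */
#define SKETCH_MAX_DEPTH 10240u

/*
 * Parse a queue depth given as decimal, hexadecimal (0x...) or octal (0...).
 * Returns 0 on success, -EINVAL for malformed or out-of-range input and
 * -ERANGE if the number does not fit into an unsigned int.
 */
int parse_queue_depth(const char *s, unsigned int *depth)
{
    char *end;
    unsigned long v;

    /* strtoul() would silently wrap negative input, so reject it here. */
    if (*s == '\0' || *s == '-')
        return -EINVAL;

    errno = 0;
    v = strtoul(s, &end, 0);        /* base 0: auto-detect the number format */
    if (end == s || *end != '\0')
        return -EINVAL;
    if (errno == ERANGE || v > UINT_MAX)
        return -ERANGE;
    if (v < 1 || v > SKETCH_MAX_DEPTH)
        return -EINVAL;

    *depth = (unsigned int)v;       /* only touched once the value is valid */
    return 0;
}
```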
The owner of an unprivileged ublk device could be a malicious user who
deliberately grants access to this disk to other users, setting a trap
and waiting for other users to fall into it.
So only allow the owner to open an unprivileged disk, even if the owner
grants access to the disk to other users. This is reasonable, given that
anyone can create a ublk disk and does not need a grant from anyone else.
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Fixes: 4093cb5a0634 ("ublk_drv: add mechanism for supporting unprivileged ublk device")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230131040446.214583-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
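The owner-only rule described in the commit message above can be sketched as
a small userspace check. The struct and function names are hypothetical, the
boolean stands in for the UBLK_F_UNPRIVILEGED_DEV flag, and comparing both
uid and gid is an assumption of this sketch rather than a statement about the
driver's exact check.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical, simplified view of a ublk device for this sketch. */
struct ublk_dev_sketch {
    bool unprivileged;   /* stands in for UBLK_F_UNPRIVILEGED_DEV */
    uid_t owner_uid;
    gid_t owner_gid;
};

/*
 * Returns 0 if the caller may open the disk, -EPERM otherwise.
 * For a privileged device the ordinary file permissions decide, so the
 * sketch adds no restriction of its own in that case.
 */
int ublk_open_allowed(const struct ublk_dev_sketch *dev, uid_t uid, gid_t gid)
{
    if (!dev->unprivileged)
        return 0;
    if (uid != dev->owner_uid || gid != dev->owner_gid)
        return -EPERM;   /* not the owner, even if file modes would allow it */
    return 0;
}
```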
While validating a draft SPDK ublk target, assigning a large queue
depth to a multiqueue ublk device caused the ublk target to run into
a weird incorrect state. After several rounds of review and debugging,
an overflow bug was found in the ublk driver.
In ublk_cmd.h, UBLK_MAX_QUEUE_DEPTH is 4096 which means
each ublk queue depth can be set as large as 4096. But
when setting qd for a ublk device,
sizeof(struct ublk_queue) + depth * sizeof(struct ublk_io)
will be larger than 65535 if qd is larger than 2728.
Then queue_size is overflowed, and ublk_get_queue()
references a wrong pointer position. The wrong content of
ublk_queue elements will lead to out-of-bounds memory
access.
Extend queue_size in ublk_device as "unsigned int".
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver")
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230131070552.115067-1-xiaodong.liu@intel.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
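The arithmetic behind the overflow described above can be reproduced in
userspace. The two sizes below are assumed, illustrative values chosen so that
the threshold matches the 2728 quoted in the message; they are not taken from
the driver, and the function names are hypothetical. The 16-bit result type
follows from the 65535 limit mentioned in the message.

```c
#include <stdint.h>

/* Illustrative sizes only; the real struct sizes depend on the kernel build. */
#define SKETCH_QUEUE_HDR_SIZE 48u   /* assumed sizeof(struct ublk_queue) */
#define SKETCH_IO_SIZE        24u   /* assumed sizeof(struct ublk_io) */

/* Queue size computed in a type wide enough for the maximum depth of 4096. */
unsigned int queue_size_wide(unsigned int depth)
{
    return SKETCH_QUEUE_HDR_SIZE + depth * SKETCH_IO_SIZE;
}

/* The same value stored in a 16-bit field: silently truncated above 65535. */
uint16_t queue_size_u16(unsigned int depth)
{
    return (uint16_t)queue_size_wide(depth);
}
```

With these assumed sizes a depth of 2728 still fits (65520 bytes), while 2729
needs 65544 bytes and wraps around to 8 in the 16-bit field, so any pointer
arithmetic based on the truncated value lands in the wrong place.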
Move PARIDE protocol modules out of drivers/block into
drivers/ata/pata_parport and update the CONFIG_ symbol names to
PATA_PARPORT.
[Damien]
The pata_parport driver file itself is also moved, together with the
protocol modules, into drivers/ata/pata_parport.
Signed-off-by: Ondrej Zary <linux@zary.sk>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Remove PARIDE core and high level protocols, taking care not to break
low-level drivers (used by pata_parport). Also update documentation.
Signed-off-by: Ondrej Zary <linux@zary.sk>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
The pata_parport is a libata-based replacement of the old PARIDE
subsystem - driver for parallel port IDE devices.
It uses the original paride low-level protocol drivers but does not
need the high-level drivers (pd, pcd, pf, pt, pg). The IDE devices
behind parallel port adapters are handled by the ATA layer.
This will allow paride and its high-level drivers to be removed.
Unfortunately, libata drivers cannot sleep so pata_parport claims
parport before activating the ata host and keeps it claimed (and
protocol connected) until the ata host is removed. This means that
no devices can be chained (neither other pata_parport devices nor
a printer).
paride and pata_parport are mutually exclusive because the compiled
protocol drivers are incompatible.
Tested with:
- Imation SuperDisk LS-120 and HP C4381A (EPAT)
- Freecom Parallel CD (FRPW)
- Toshiba Mobile CD-RW 2793008 w/Freecom Parallel Cable rev.903 (FRIQ)
- Backpack CD-RW 222011 and CD-RW 19350 (BPCK6)
The following bugs in low-level protocol drivers were found and will
be fixed later:
Note: EPP-32 mode is buggy in EPAT - and also in all other protocol
drivers - they don't handle non-multiple-of-4 block transfers
correctly. This causes problems with LS-120 drive.
There is also another bug in EPAT: EPP modes don't work unless a 4-bit
or 8-bit mode is used first (probably some initialization missing?).
Once the device is initialized, EPP works until power cycle.
So after device power on, you have to:
echo "parport0 epat 0" >/sys/bus/pata_parport/new_device
echo pata_parport.0 >/sys/bus/pata_parport/delete_device
echo "parport0 epat 4" >/sys/bus/pata_parport/new_device
(autoprobe will initialize correctly as it tries the slowest modes
first but you'll get the broken EPP-32 mode)
Note: EPP modes are buggy in FRPW, only modes 0 and 1 work.
Signed-off-by: Ondrej Zary <linux@zary.sk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
ps3vram iterates over the bio one segment at a time, that is, in
page-aligned chunks of at most one page. Because of that there is no
point in calling bio_split_to_limits, or in explicitly setting the default
limits that are only used by bio_split_to_limits.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Geoff Levand <geoff@infradead.org>
Link: https://lore.kernel.org/r/20230123074718.57951-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
An unprivileged ublk device is helpful for container use cases, such
as: a ublk device created in one unprivileged container can be controlled
and accessed by this container only.
Implement this feature by adding the flag UBLK_F_UNPRIVILEGED_DEV. If
this flag isn't set, any control command has to be run by a privileged
user. Otherwise, control commands can be sent by any unprivileged
user, but the user has to be permitted to access the ublk char device
to be controlled.
In case of UBLK_F_UNPRIVILEGED_DEV:
1) for command UBLK_CMD_ADD_DEV, it is always allowed, and user needs
to provide owner's uid/gid in this command, so that udev can set correct
ownership for the created ublk device, since the device owner uid/gid
can be queried via command of UBLK_CMD_GET_DEV_INFO.
2) for other control commands, they can only be run successfully if the
current user is allowed to access the specified ublk char device, for
running the permission check, path of the ublk char device has to be
provided by these commands.
Also add one control command, UBLK_CMD_GET_DEV_INFO2, which always
includes the char dev path in its payload, since userspace may not
know whether this device was created in unprivileged mode.
To apply this mechanism, the system administrator needs to apply
the following policies:
1) chmod 0666 /dev/ublk-control
2) change ownership of ublkcN & ublkbN
- chown owner_uid:owner_gid /dev/ublkcN
- chown owner_uid:owner_gid /dev/ublkbN
Both can be done via one simple udev rule.
Userspace:
https://github.com/ming1/ubdsrv/tree/unprivileged-ublk
'ublk add -t $TYPE --un_privileged=1' is for creating one un-privileged
ublk device if the user is un-privileged.
Link: https://lore.kernel.org/linux-block/YoOr6jBfgVm8GvWg@stefanha-x1.localdomain/
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230106041711.914434-7-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Prepare for supporting unprivileged ublk devices by limiting the maximum
number of ublk devices that can be added. Otherwise too many ublk devices
could be added by an untrusted user, which can be considered a DoS.
Reviewed-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230106041711.914434-6-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Userspace side only knows device ID, but the associated path of ublkc* and
ublkb* could be changed by udev, and that depends on userspace's policy, so
add parameter of UBLK_PARAM_TYPE_DEVT for retrieving major/minor of the
ublkc* and ublkb*, then user may figure out major/minor of the ublk disks
he/she owns. With major/minor, it is easy to find the device node path.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230106041711.914434-5-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
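Once userspace has a major/minor pair, one way to get from it to the device
is the sysfs directory /sys/dev/block (or /sys/dev/char), which contains
entries named major:minor. The helper below only builds that path; it is a
hypothetical userspace sketch, not part of the ublk API. How the numbers are
obtained through UBLK_PARAM_TYPE_DEVT is outside its scope.

```c
#include <stdio.h>

/*
 * Build "/sys/dev/block/<major>:<minor>" or "/sys/dev/char/<major>:<minor>".
 * Returns 0 on success and -1 if the buffer is too small.
 */
int devt_sysfs_path(char *buf, size_t len, int is_block,
                    unsigned int major, unsigned int minor)
{
    int n = snprintf(buf, len, "/sys/dev/%s/%u:%u",
                     is_block ? "block" : "char", major, minor);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting entry is a symlink into the device's sysfs directory, whose
uevent file normally carries a DEVNAME line with the node name under /dev.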
It is annoying for each control command handler to get/put ublk
device and deal with failure.
Control command handler is simplified a lot by moving
ublk_get_device_from_id into ublk_ctrl_uring_cmd().
Reviewed-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230106041711.914434-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If any ubq daemon is unprivileged, the ublk char device is actually
accessible to unprivileged users, and we can't trust the current user,
so do not probe partitions.
Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver")
Reviewed-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230106041711.914434-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
No one uses 'nr_aborted_queues' any more, so remove it.
Reviewed-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230106041711.914434-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Trying to remove an "empty" (just initialized, or "cleared") interval
from the tree results in an endless loop.
As we typically protect the tree with a spinlock_irq,
the result is a hung system.
Be nice to error cleanup code paths and ignore removal of empty intervals.
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20230113123538.144276-8-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This require_context attribute originated in a proposed sparse patch by
Philipp Reisner back in 2008. Johannes Berg had a different solution to
a similar problem, and that patch "won" in the end; so the require_context
thing never got merged. The whole history can be read at [0].
DRBD kept using these annotations anyway for a while. Nowadays, on a
modern unmodified sparse, they obviously do nothing, and they are hardly
used anymore anyway.
So, just remove the definitions of these macros.
[0] https://www.spinics.net/lists/linux-sparse/msg01150.html
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Reviewed-by: Joel Colledge <joel.colledge@linbit.com>
Link: https://lore.kernel.org/r/20230113123538.144276-6-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
To be more similar to what we do in the out-of-tree module and ease the
upstreaming process.
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Reviewed-by: Joel Colledge <joel.colledge@linbit.com>
Link: https://lore.kernel.org/r/20230113123506.144082-4-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
To be more similar to what we do in the out-of-tree module and ease the
upstreaming process.
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Reviewed-by: Joel Colledge <joel.colledge@linbit.com>
Link: https://lore.kernel.org/r/20230113123506.144082-2-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This is used as an unsigned value, so define it that way to avoid
having to cast it.
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20230105205146.3610282-2-kbusch@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The 'ublk_chr_class' is needed when deleting ublk char devices in
ublk_exit(), so move it after devices(idle) are removed.
Fixes the following warning reported by Harris, James R:
[ 859.178950] sysfs group 'power' not found for kobject 'ublkc0'
[ 859.178962] WARNING: CPU: 3 PID: 1109 at fs/sysfs/group.c:278 sysfs_remove_group+0x9c/0xb0
Reported-by: "Harris, James R" <james.r.harris@intel.com>
Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver")
Link: https://lore.kernel.org/linux-block/Y9JlFmSgDl3+zy3N@T590/T/#t
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Jim Harris <james.r.harris@intel.com>
Link: https://lore.kernel.org/r/20230126115346.263344-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Merge tag 'block-6.2-2023-01-20' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
"Various little tweaks all over the place:
- NVMe pull request via Christoph:
- fix controller shutdown regression in nvme-apple (Janne Grunau)
- fix a polling on timeout regression in nvme-pci (Keith Busch)
- Fix a bug in the read request side request allocation caching
(Pavel)
- pktcdvd was brought back after we configured a NULL return on bio
splits, make it consistent with the others (me)
- BFQ refcount fix (Yu)
- Block cgroup policy activation fix (Yu)
- Fix for an md regression introduced in the 6.2 cycle (Adrian)"
* tag 'block-6.2-2023-01-20' of git://git.kernel.dk/linux:
nvme-pci: fix timeout request state check
nvme-apple: only reset the controller when RTKit is running
nvme-apple: reset controller during shutdown
block: fix hctx checks for batch allocation
block/rnbd-clt: fix wrong max ID in ida_alloc_max
blk-cgroup: fix missing pd_online_fn() while activating policy
pktcdvd: check for NULL return after calling bio_split_to_limits()
block, bfq: switch 'bfqg->ref' to use atomic refcount apis
md: fix incorrect declaration about claim_rdev in md_import_device
When the supplied buffer does not contain an assignment sign, next_arg()
sets the `val` pointer to NULL, so we cannot dereference it. Add a NULL
pointer test to handle the bare `param` case, in addition to the `*val`
test, which handles the case where the param has no value assigned to
it: `param=`.
Link: https://lkml.kernel.org/r/20230103030119.1496358-1-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
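The two cases from the commit message above (`param` versus `param=`) can be
told apart as in the following userspace sketch. split_param() is a
hypothetical stand-in that only mimics the behaviour described in the message
(a NULL `val` when there is no assignment sign); it is not the kernel's
next_arg(), and has_value() is likewise an illustrative name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for next_arg(): splits "param=value" in place. */
void split_param(char *arg, char **param, char **val)
{
    char *eq = strchr(arg, '=');

    *param = arg;
    if (!eq) {
        *val = NULL;        /* "param": no assignment sign at all */
        return;
    }
    *eq = '\0';
    *val = eq + 1;          /* "param=": points at an empty string */
}

/* A value is present only if the pointer is set and the string is non-empty. */
int has_value(const char *val)
{
    /* Test the pointer first; only then is it safe to look at *val. */
    return val != NULL && *val != '\0';
}
```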
- The word `range` is duplicated in a comment; remove one.
- Change `syfs` to `sysfs`.
Link: https://lkml.kernel.org/r/20221223040331.4194-1-jhs2.lee@samsung.com
Signed-off-by: JeongHyeon Lee <jhs2.lee@samsung.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We need to pass 'end - 1' to ida_alloc_max after the switch from
ida_simple_get to ida_alloc_max.
Otherwise smatch warns:
drivers/block/rnbd/rnbd-clt.c:1460 init_dev() error: Calling ida_alloc_max() with a 'max' argument which is a power of 2. -1 missing?
Fixes: 24afc15dbe21 ("block/rnbd: Remove a useless mutex")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Link: https://lore.kernel.org/r/20221230010926.32243-1-guoqing.jiang@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>
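The off-by-one comes from the two interfaces using different bound
conventions: an exclusive end versus an inclusive maximum. The toy allocator
below illustrates only that difference; the pool type and both functions are
hypothetical userspace stand-ins, not the kernel IDA implementation.

```c
#include <errno.h>
#include <stdbool.h>

#define SKETCH_IDS 64

/* Toy ID pool: used[i] is true once id i has been handed out. */
struct id_pool_sketch {
    bool used[SKETCH_IDS];
};

/* Inclusive upper bound: may return any free id in [0, max]. */
int id_alloc_max_sketch(struct id_pool_sketch *p, unsigned int max)
{
    for (unsigned int i = 0; i <= max && i < SKETCH_IDS; i++) {
        if (!p->used[i]) {
            p->used[i] = true;
            return (int)i;
        }
    }
    return -ENOSPC;
}

/* Exclusive upper bound: ids in [0, end), hence the 'end - 1' conversion. */
int id_alloc_range_sketch(struct id_pool_sketch *p, unsigned int end)
{
    if (end == 0)
        return -ENOSPC;     /* treated as an empty range in this toy */
    return id_alloc_max_sketch(p, end - 1);
}
```

Passing the old exclusive end unchanged as the inclusive maximum would allow
one extra ID, which is what the warning quoted in the message hints at for a
power-of-2 bound.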
The revert of the removal of this driver happened after we fixed up
the split limits for NOWAIT issue, hence it got missed. Ensure that
we check for a NULL bio after splitting, in case it should be retried.
Marking this as fixing both commits, so that stable backport will do
this correctly.
Cc: stable@vger.kernel.org
Fixes: 9cea62b2cbab ("block: don't allow splitting of a REQ_NOWAIT bio")
Fixes: 4b83e99ee709 ("Revert "pktcdvd: remove driver."")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Merge tag 'for-linus-6.2-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- two cleanup patches
- a fix of a memory leak in the Xen pvfront driver
- a fix of a locking issue in the Xen hypervisor console driver
* tag 'for-linus-6.2-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/pvcalls: free active map buffer on pvcalls_front_free_map
hvc/xen: lock console list traversal
x86/xen: Remove the unused function p2m_index()
xen: make remove callback of xen driver void returned
Merge tag 'block-2023-01-06' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
"The big change here is obviously the revert of the pktcdvd driver
removal. Outside of that, just minor tweaks. In detail:
- Re-instate the pktcdvd driver, which necessitates adding back
bio_copy_data_iter() and the fops->devnode() hook for now (me)
- Fix for splitting of a bio marked as NOWAIT, causing either nowait
reads or writes to error with EAGAIN even if parts of the IO
completed (me)
- Fix for ublk, punting management commands to io-wq as they can all
easily block for extended periods of time (Ming)
- Removal of SRCU dependency for the block layer (Paul)"
* tag 'block-2023-01-06' of git://git.kernel.dk/linux:
block: Remove "select SRCU"
Revert "pktcdvd: remove driver."
Revert "block: remove devnode callback from struct block_device_operations"
Revert "block: bio_copy_data_iter"
ublk: honor IO_URING_F_NONBLOCK for handling control command
block: don't allow splitting of a REQ_NOWAIT bio
block: handle bio_split_to_limits() NULL return
This reverts commit f40eb99897af665f11858dd7b56edcb62c3f3c67.
There are apparently still users out there of this driver. While we'd
love to remove it to ease the maintenance burden, let's reinstate it
for now until better (userspace) solutions can be developed.
Link: https://lore.kernel.org/lkml/20230104190115.ceglfefco475ev6c@pali/
Reported-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Most of control command handlers may sleep, so return -EAGAIN in case
of IO_URING_F_NONBLOCK to defer the handling into io wq context.
Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver")
Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230104133235.836536-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This can't happen right now, but in preparation for allowing
bio_split_to_limits() to return NULL if it ended the bio, check for it
in all the callers.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The virtblk_map_data() function returns negative error codes, however, the
'nents' field of vbr->sg_table is an unsigned int, which causes the error
handling not to work correctly.
Cc: stable@vger.kernel.org
Fixes: 0e9911fa768f ("virtio-blk: support mq_ops->queue_rqs()")
Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Message-Id: <20221021204126.927603-1-rafaelmendsr@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Suwan Kim <suwan.kim027@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
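The class of bug fixed above is easy to reproduce in userspace: once a
negative error code is stored in an unsigned field, the sign is gone and a
later error check cannot see it. The names below are hypothetical stand-ins
(map_data_sketch() plays the role of virtblk_map_data()), and the code is a
sketch of the pattern, not the driver's actual code.

```c
#include <errno.h>

/* Simplified stand-in for the scatterlist table with its unsigned count. */
struct sg_table_sketch {
    unsigned int nents;
};

/* Returns a positive entry count, or a negative errno on failure. */
int map_data_sketch(int fail)
{
    return fail ? -ENOMEM : 4;
}

/*
 * Buggy pattern: the result goes straight into the unsigned field. A check
 * such as "nents < 0" can never be true afterwards, so the error is treated
 * as a huge entry count and the function reports success.
 */
int prepare_buggy(struct sg_table_sketch *t, int fail)
{
    t->nents = (unsigned int)map_data_sketch(fail);
    return 0;
}

/* Fixed pattern: keep the result in a signed local until it is checked. */
int prepare_fixed(struct sg_table_sketch *t, int fail)
{
    int n = map_data_sketch(fail);

    if (n < 0)
        return n;           /* propagate the error code unchanged */
    t->nents = (unsigned int)n;
    return 0;
}
```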
We use UINT_MAX to limit max_discard_sectors in virtblk_probe, so
we can also use UINT_MAX to limit max_hw_sectors for consistency.
No functional change intended.
Signed-off-by: Angus Chen <angus.chen@jaguarmicro.com>
Message-Id: <20221110030124.1986-1-angus.chen@jaguarmicro.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Define a new helper function, virtblk_fail_to_queue(), to
clean up the error handling code in virtio_queue_rq().
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
Message-Id: <20221016034127.330942-2-dmitry.fomichev@wdc.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Due to several bugs caused by timers being re-armed after they are
shut down and just before they are freed, a new timer state called
"shutdown" was added. Once a timer is set to this state, it can no
longer be re-armed.
The following script was run to find all the trivial locations where
del_timer() or del_timer_sync() is called in the same function that the
object holding the timer is freed. It also ignores any locations where
the timer->function is modified between the del_timer*() and the free(),
as that is not considered a "trivial" case.
This was created by using a coccinelle script and the following
commands:
$ cat timer.cocci
@@
expression ptr, slab;
identifier timer, rfield;
@@
(
- del_timer(&ptr->timer);
+ timer_shutdown(&ptr->timer);
|
- del_timer_sync(&ptr->timer);
+ timer_shutdown_sync(&ptr->timer);
)
... when strict
when != ptr->timer
(
kfree_rcu(ptr, rfield);
|
kmem_cache_free(slab, ptr);
|
kfree(ptr);
)
$ spatch timer.cocci . > /tmp/t.patch
$ patch -p1 < /tmp/t.patch
Link: https://lore.kernel.org/lkml/20221123201306.823305113@linutronix.de/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Pavel Machek <pavel@ucw.cz> [ LED ]
Acked-by: Kalle Valo <kvalo@kernel.org> [ wireless ]
Acked-by: Paolo Abeni <pabeni@redhat.com> [ networking ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
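The difference between deleting a timer and shutting it down can be
illustrated with a toy model in userspace C. Everything here is hypothetical:
the struct, the function names and the choice to represent the shutdown state
by clearing the callback pointer are assumptions of this sketch, which only
models the "can no longer be re-armed" property described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy timer: armed when 'pending' is set, shut down when 'function' is NULL. */
struct timer_sketch {
    void (*function)(struct timer_sketch *);
    bool pending;
};

/* Placeholder callback so that a timer can be initialised as usable. */
void noop_cb(struct timer_sketch *t)
{
    (void)t;
}

/* Returns true if the timer was (re-)armed. */
bool timer_arm_sketch(struct timer_sketch *t)
{
    if (!t->function)       /* shut down: refuse to re-arm */
        return false;
    t->pending = true;
    return true;
}

/* Plain delete: cancels the timer, but it may be armed again later. */
void timer_del_sketch(struct timer_sketch *t)
{
    t->pending = false;
}

/* Shutdown: cancels the timer and makes any later arming a no-op. */
void timer_shutdown_sketch(struct timer_sketch *t)
{
    t->pending = false;
    t->function = NULL;
}
```

In this model, a late arm attempt between the shutdown call and the free of
the containing object has no effect, which is the situation the conversion
targets.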
Merge tag 'block-6.2-2022-12-19' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
- Various fixes for BFQ (Yu, Yuwei)
- Fix for loop command line parsing (Isaac)
- No need to specifically clear REQ_ALLOC_CACHE on IOPOLL downgrade
anymore (me)
- blk-iocost enum fix for newer gcc (Jiri)
- UAF fix for queue release (Ming)
- blk-iolatency error handling memory leak fix (Tejun)
* tag 'block-6.2-2022-12-19' of git://git.kernel.dk/linux:
block: don't clear REQ_ALLOC_CACHE for non-polled requests
block: fix use-after-free of q->q_usage_counter
block, bfq: only do counting of pending-request for BFQ_GROUP_IOSCHED
blk-iolatency: Fix memory leak on add_disk() failures
loop: Fix the max_loop commandline argument treatment when it is set to 0
block/blk-iocost (gcc13): keep large values in a new enum
block, bfq: replace 0/1 with false/true in bic apis
block, bfq: don't return bfqg from __bfq_bic_change_cgroup()
block, bfq: fix possible uaf for 'bfqq->bic'
Since moving to memalloc_nofs_save/restore, SUNRPC has stopped setting the
GFP_NOIO flag on sk_allocation, which the networking system uses to decide
when it is safe to use current->task_frag. The result is unexpected
corruption in task_frag when SUNRPC is involved in memory reclaim.
The corruption can be seen in crashes, but the root cause is often
difficult to ascertain as a crashing machine's stack trace will have no
evidence of being near NFS or SUNRPC code. I believe this problem to
be much more pervasive than reports to the community may indicate.
Fix this by having kernel users of sockets that may corrupt task_frag due
to reclaim set sk_use_task_frag = false. Preemptively correcting this
situation for users that still set sk_allocation allows them to convert to
memalloc_nofs_save/restore without the same unexpected corruptions that are
sure to follow, unlikely to show up in testing, and difficult to bisect.
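As a rough illustration of the fix described above, the fragment selection can be modeled in plain C. This is a minimal user-space sketch, not the kernel code: the struct names and the helper sk_page_frag_model() are hypothetical stand-ins, and only the sk_use_task_frag flag name comes from this commit message. The point is that a socket marked sk_use_task_frag = false never touches the per-task fragment, so a send issued from reclaim cannot clobber a fragment the interrupted task was in the middle of using.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins; these are NOT the kernel's real definitions. */
struct page_frag_model {
	int offset;
};

struct task_model {
	struct page_frag_model task_frag;	/* per-task fragment cache */
};

struct sock_model {
	bool sk_use_task_frag;			/* false for sockets used in reclaim */
	struct page_frag_model sk_frag;		/* per-socket fragment cache */
};

/*
 * Pick the fragment cache a send path should use. The caller's task is
 * passed explicitly here because this model has no notion of "current".
 */
static inline struct page_frag_model *
sk_page_frag_model(struct sock_model *sk, struct task_model *task)
{
	if (sk->sk_use_task_frag)
		return &task->task_frag;
	return &sk->sk_frag;
}
```

With this model, an ordinary socket keeps sharing the task's fragment, while a socket owned by a kernel user such as SUNRPC would be set up with sk_use_task_frag = false and falls back to its own sk_frag.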
CC: Philipp Reisner <philipp.reisner@linbit.com>
CC: Lars Ellenberg <lars.ellenberg@linbit.com>
CC: "Christoph Böhmwalder" <christoph.boehmwalder@linbit.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: Josef Bacik <josef@toxicpanda.com>
CC: Keith Busch <kbusch@kernel.org>
CC: Christoph Hellwig <hch@lst.de>
CC: Sagi Grimberg <sagi@grimberg.me>
CC: Lee Duncan <lduncan@suse.com>
CC: Chris Leech <cleech@redhat.com>
CC: Mike Christie <michael.christie@oracle.com>
CC: "James E.J. Bottomley" <jejb@linux.ibm.com>
CC: "Martin K. Petersen" <martin.petersen@oracle.com>
CC: Valentina Manea <valentina.manea.m@gmail.com>
CC: Shuah Khan <shuah@kernel.org>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: David Howells <dhowells@redhat.com>
CC: Marc Dionne <marc.dionne@auristor.com>
CC: Steve French <sfrench@samba.org>
CC: Christine Caulfield <ccaulfie@redhat.com>
CC: David Teigland <teigland@redhat.com>
CC: Mark Fasheh <mark@fasheh.com>
CC: Joel Becker <jlbec@evilplan.org>
CC: Joseph Qi <joseph.qi@linux.alibaba.com>
CC: Eric Van Hensbergen <ericvh@gmail.com>
CC: Latchesar Ionkov <lucho@ionkov.net>
CC: Dominique Martinet <asmadeus@codewreck.org>
CC: Ilya Dryomov <idryomov@gmail.com>
CC: Xiubo Li <xiubli@redhat.com>
CC: Chuck Lever <chuck.lever@oracle.com>
CC: Jeff Layton <jlayton@kernel.org>
CC: Trond Myklebust <trond.myklebust@hammerspace.com>
CC: Anna Schumaker <anna@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Here is the set of driver core and kernfs changes for 6.2-rc1.
The "big" change in here is the addition of a new macro,
container_of_const() that will preserve the "const-ness" of a pointer
passed into it.
The "problem" of the current container_of() macro is that if you pass in
a "const *", a non-const pointer can come out of it unless you
specifically ask for it. For many usages, we want to preserve the
"const" attribute by using the same call. For a specific example, this
series changes the kobj_to_dev() macro to use it, allowing it to be used
no matter what the const value is. This prevents every subsystem from
having to declare 2 different individual macros (i.e.
kobj_const_to_dev() and kobj_to_dev()) and lets the compiler enforce
the const value at build time, which having 2 macros would not do.
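To make the const-preserving idea concrete, here is a minimal user-space sketch of how such a macro can be built with C11 _Generic. It assumes GCC or Clang (for __typeof__); the struct names kobject_model and device_model, the IS_CONST_DEV() helper and dev_id_from_kobj() are hypothetical and exist only for illustration. The in-kernel definitions may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (illustrative only). */
struct kobject_model {
	int refcount;
};

struct device_model {
	int id;
	struct kobject_model kobj;
};

/* Classic form: always yields a non-const pointer, even from a const one. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Const-preserving form: const pointer in, const pointer out. */
#define container_of_const(ptr, type, member)				\
	_Generic((ptr),							\
		const __typeof__(*(ptr)) *:				\
			((const type *)container_of(ptr, type, member)),\
		default:						\
			((type *)container_of(ptr, type, member)))

/* One macro serves both const and non-const callers. */
#define kobj_to_dev(kptr) \
	container_of_const(kptr, struct device_model, kobj)

/* Evaluates to 1 if the expression is a pointer to const device_model. */
#define IS_CONST_DEV(p) \
	_Generic((p), const struct device_model *: 1, default: 0)

/* A read-only helper can now keep its argument const end to end. */
static inline int dev_id_from_kobj(const struct kobject_model *kobj)
{
	const struct device_model *dev = kobj_to_dev(kobj);

	return dev->id;
}
```

In this sketch, assigning kobj_to_dev() of a const pointer to a non-const struct device_model * makes the compiler complain about the discarded qualifier, which is the build-time enforcement the text refers to.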
The driver for all of this has been discussions with the Rust kernel
developers as to how to properly mark driver core, and kobject, objects
as being "non-mutable". The changes to the kobject and driver core in
this pull request are the result of that, as there are lots of paths
where kobjects and device pointers are not modified at all, so marking
them as "const" allows the compiler to enforce this.
So, a nice side effect of the Rust development effort has already been
to clean up the driver core code to be more obvious about object rules.
All of this has been bike-shedded in quite a lot of detail on lkml with
different names and implementations resulting in the tiny version we
have in here, much better than my original proposal. Lots of subsystem
maintainers have acked the changes as well.
Other than this change, included in here is smaller stuff like:
- kernfs fixes and updates to handle lock contention better
- vmlinux.lds.h fixes and updates
- sysfs and debugfs documentation updates
- device property updates
All of these have been in the linux-next tree for quite a while with no
problems, OTHER than some merge issues with other trees that should be
obvious when you hit them (block tree deletes a driver that this tree
modifies, iommufd tree modifies code that this tree also touches). If
there are merge problems with these trees, please let me know.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY5wz3A8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yks0ACeKYUlVgCsER8eYW+x18szFa2QTXgAn2h/VhZe
1Fp53boFaQkGBjl8mGF8
=v+FB
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of driver core and kernfs changes for 6.2-rc1.
The "big" change in here is the addition of a new macro,
container_of_const() that will preserve the "const-ness" of a pointer
passed into it.
The "problem" of the current container_of() macro is that if you pass
in a "const *", a non-const pointer can come out of it unless you
specifically ask for it. For many usages, we want to preserve the
"const" attribute by using the same call. For a specific example, this
series changes the kobj_to_dev() macro to use it, allowing it to be
used no matter what the const value is. This prevents every subsystem
from having to declare 2 different individual macros (i.e.
kobj_const_to_dev() and kobj_to_dev()) and lets the compiler enforce
the const value at build time, which having 2 macros would not do.
The driver for all of this has been discussions with the Rust kernel
developers as to how to properly mark driver core, and kobject,
objects as being "non-mutable". The changes to the kobject and driver
core in this pull request are the result of that, as there are lots of
paths where kobjects and device pointers are not modified at all, so
marking them as "const" allows the compiler to enforce this.
So, a nice side effect of the Rust development effort has already been
to clean up the driver core code to be more obvious about object
rules.
All of this has been bike-shedded in quite a lot of detail on lkml
with different names and implementations resulting in the tiny version
we have in here, much better than my original proposal. Lots of
subsystem maintainers have acked the changes as well.
Other than this change, included in here is smaller stuff like:
- kernfs fixes and updates to handle lock contention better
- vmlinux.lds.h fixes and updates
- sysfs and debugfs documentation updates
- device property updates
All of these have been in the linux-next tree for quite a while with
no problems"
* tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (58 commits)
device property: Fix documentation for fwnode_get_next_parent()
firmware_loader: fix up to_fw_sysfs() to preserve const
usb.h: take advantage of container_of_const()
device.h: move kobj_to_dev() to use container_of_const()
container_of: add container_of_const() that preserves const-ness of the pointer
driver core: fix up missed drivers/s390/char/hmcdrv_dev.c class.devnode() conversion.
driver core: fix up missed scsi/cxlflash class.devnode() conversion.
driver core: fix up some missing class.devnode() conversions.
driver core: make struct class.devnode() take a const *
driver core: make struct class.dev_uevent() take a const *
cacheinfo: Remove of_node_put() for fw_token
device property: Add a blank line in Kconfig of tests
device property: Rename goto label to be more precise
device property: Move PROPERTY_ENTRY_BOOL() a bit down
device property: Get rid of __PROPERTY_ENTRY_ARRAY_EL*SIZE*()
kernfs: fix all kernel-doc warnings and multiple typos
driver core: pass a const * into of_device_uevent()
kobject: kset_uevent_ops: make name() callback take a const *
kobject: kset_uevent_ops: make filter() callback take a const *
kobject: make kobject_namespace take a const *
...
Since commit fc7a6209d571 ("bus: Make remove callback return void")
forces bus_type::remove to return void, it doesn't make much sense for
any bus based driver implementing a remove callback to return non-void to
its caller.
This change is for xen bus based drivers.
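The shape of such a conversion can be sketched in a few lines of user-space C. Everything below is a hypothetical model (the *_model types, demo_remove() and bus_remove_model() are made up for illustration and are not the xenbus API): the driver callback returns void because the bus-level remove has no return value to propagate.

```c
#include <assert.h>

/* Illustrative stand-ins; NOT the real xenbus structures. */
struct xenbus_device_model {
	int torn_down;
};

struct xenbus_driver_model {
	/* After the change: void, since the bus cannot use a return value. */
	void (*remove)(struct xenbus_device_model *dev);
};

/* A driver's remove callback: does its teardown and simply returns. */
static void demo_remove(struct xenbus_device_model *dev)
{
	dev->torn_down = 1;
}

static const struct xenbus_driver_model demo_driver = {
	.remove = demo_remove,
};

/* Bus-level remove, modeled on a void-returning bus_type::remove. */
static inline void bus_remove_model(const struct xenbus_driver_model *drv,
				    struct xenbus_device_model *dev)
{
	if (drv->remove)
		drv->remove(dev);
}
```

Before such a change, a callback of this kind would typically end in "return 0;" that the bus code had no way to act on.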
Acked-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Dawei Li <set_pte_at@outlook.com>
Link: https://lore.kernel.org/r/TYCP286MB23238119AB4DF190997075C9CAE39@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM
Signed-off-by: Juergen Gross <jgross@suse.com>