[ Upstream commit a8749a35c39903120ec421ef2525acc8e0daa55c ]
Linux has dozens of occurrences of vmalloc(array_size()) and
vzalloc(array_size()). Allow the code to be simplified by providing
vmalloc_array and vcalloc, as well as the underscored variants that let
the caller specify the GFP flags.
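As a rough sketch of the simplification this enables (generic call shapes, not taken from a particular driver):
    /* before: open-coded element-count * element-size */
    buf = vzalloc(array_size(nents, sizeof(*buf)));
    /* after: dedicated zeroing array helper */
    buf = vcalloc(nents, sizeof(*buf));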
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 86cffecdeaa278444870c8745ab166a65865dbf0 ]
GCC and Clang can use the "alloc_size" attribute to better inform the
results of __builtin_object_size() (for compile-time constant values).
Clang can additionally use alloc_size to inform the results of
__builtin_dynamic_object_size() (for run-time values).
Because GCC sees the frequent use of struct_size() as an allocator size
argument, and notices it can return SIZE_MAX (the overflow indication),
it complains about these call sites overflowing (since SIZE_MAX is
greater than the default -Walloc-size-larger-than=PTRDIFF_MAX). This
isn't helpful since we already know a SIZE_MAX will be caught at
run-time (this was an intentional design). To deal with this, we must
disable this check as it is both a false positive and redundant. (Clang
does not have this warning option.)
Unfortunately, just checking the -Wno-alloc-size-larger-than is not
sufficient to make the __alloc_size attribute behave correctly under
older GCC versions. The attribute itself must be disabled in those
situations too, as there appears to be no way to reliably silence the
SIZE_MAX constant expression cases for GCC versions less than 9.1:
In file included from ./include/linux/resource_ext.h:11,
from ./include/linux/pci.h:40,
from drivers/net/ethernet/intel/ixgbe/ixgbe.h:9,
from drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c:4:
In function 'kmalloc_node',
inlined from 'ixgbe_alloc_q_vector' at ./include/linux/slab.h:743:9:
./include/linux/slab.h:618:9: error: argument 1 value '18446744073709551615' exceeds maximum object size 9223372036854775807 [-Werror=alloc-size-larger-than=]
return __kmalloc_node(size, flags, node);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/slab.h: In function 'ixgbe_alloc_q_vector':
./include/linux/slab.h:455:7: note: in a call to allocation function '__kmalloc_node' declared here
void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_slab_alignment __malloc;
^~~~~~~~~~~~~~
Specifically:
'-Wno-alloc-size-larger-than' is not correctly handled by GCC < 9.1
https://godbolt.org/z/hqsfG7q84 (doesn't disable)
https://godbolt.org/z/P9jdrPTYh (doesn't admit to not knowing about option)
https://godbolt.org/z/465TPMWKb (only warns when other warnings appear)
'-Walloc-size-larger-than=18446744073709551615' is not handled by GCC < 8.2
https://godbolt.org/z/73hh1EPxz (ignores numeric value)
Since anything marked with __alloc_size would also qualify for marking
with __malloc, just include __malloc along with it to avoid redundant
markings. (Suggested by Linus Torvalds.)
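A hedged sketch of the resulting annotation style; the prototype below is illustrative, not a declaration copied from the patch:
    /* argument 1 carries the allocated size; __malloc is included as well */
    void *example_alloc(size_t size, gfp_t flags)
            __attribute__((__alloc_size__(1))) __malloc;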
Finally, make sure checkpatch.pl doesn't get confused about finding the
__alloc_size attribute on functions. (Thanks to Joe Perches.)
Link: https://lkml.kernel.org/r/20210930222704.2631604-3-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jing Xiangfeng <jingxiangfeng@huawei.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 151c6b49d679872d6fc0b50e0ad96303091694a2 ]
Even if SPI_NOR_NO_ERASE was set, one could still send erase opcodes
to the flash. It is not recommended to send unsupported opcodes to
flashes. Fix the logic and do not set mtd->_erase when SPI_NOR_NO_ERASE
is specified. With this, users will not be able to issue erase opcodes to
flashes and instead they will receive an -ENOTSUPP error.
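A minimal sketch of the described change (spi_nor_erase as the callback name is an assumption):
    /* only advertise an erase op when the flash actually supports erase */
    if (!(info->flags & SPI_NOR_NO_ERASE))
            mtd->_erase = spi_nor_erase;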
Fixes: b199489d37b2 ("mtd: spi-nor: add the framework for SPI NOR")
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Reviewed-by: Michael Walle <michael@walle.cc>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220228163334.277730-1-tudor.ambarus@microchip.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 94da81e2fc4285db373fe9a1eb012c2ee205b110 ]
Since commit
baebdf48c3600 ("net: dev: Makes sure netif_rx() can be invoked in any context.")
the function netif_rx() can be used in preemptible/thread context as
well as in interrupt context.
Use netif_rx().
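A hedged sketch of the conversion; the surrounding batman-adv call site is not part of the commit text:
    skb->protocol = eth_type_trans(skb, soft_iface);
    /* now safe from process, softirq and hardirq context */
    netif_rx(skb);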
Cc: Antonio Quartulli <a@unstable.cc>
Cc: Marek Lindner <mareklindner@neomailbox.ch>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Cc: Sven Eckelmann <sven@narfation.org>
Cc: b.a.t.m.a.n@lists.open-mesh.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c87b7b12f48db86ac9909894f4dc0107d7df6375 ]
The original logic to get mma8452_data is wrong: *dev points to the
device belonging to the iio_dev, so we can't use this dev to find the
correct i2c_client. The original logic happened to work because it
ultimately used dev->driver_data to get the iio_dev. Using the API
to_i2c_client() here is wrong and confuses the reader. To correct the
logic, it should be like this:
struct mma8452_data *data = iio_priv(dev_get_drvdata(dev));
But after commit 8b7651f25962 ("iio: iio_device_alloc(): Remove
unnecessary self drvdata"), the above logic also no longer works.
When trying to show the available scales in userspace, we hit a kernel
dump from a NULL pointer dereference.
So use dev_to_iio_dev() to correct the logic.
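In code form, the corrected lookup reads roughly as follows (a sketch, not the literal diff):
    /* the sysfs callback's dev is the iio_dev's device, not the i2c client */
    struct iio_dev *indio_dev = dev_to_iio_dev(dev);
    struct mma8452_data *data = iio_priv(indio_dev);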
Dual fixes tags as the second reflects when the bug was exposed, whilst
the first reflects when the original bug was introduced.
Fixes: c3cdd6e48e35 ("iio: mma8452: refactor for seperating chip specific data")
Fixes: 8b7651f25962 ("iio: iio_device_alloc(): Remove unnecessary self drvdata")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Reviewed-by: Martin Kepplinger <martink@posteo.de>
Cc: <Stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1645497741-5402-1-git-send-email-haibo.chen@nxp.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3f965021c8bc38965ecb1924f570c4842b33d408 ]
Since, well, forever, the Linux NFS server's nfsd_commit() function
has returned nfserr_inval when the passed-in byte range arguments
were nonsensical.
However, according to RFC 1813 section 3.3.21, NFSv3 COMMIT requests
are permitted to return only the following non-zero status codes:
NFS3ERR_IO
NFS3ERR_STALE
NFS3ERR_BADHANDLE
NFS3ERR_SERVERFAULT
NFS3ERR_INVAL is not included in that list. Likewise, NFS4ERR_INVAL
is not listed in the COMMIT row of Table 6 in RFC 8881.
RFC 7530 does permit COMMIT to return NFS4ERR_INVAL, but does not
specify when it can or should be used.
Instead of dropping or failing a COMMIT request in a byte range that
is not supported, turn it into a valid request by treating one or
both arguments as zero. Offset zero means start-of-file, count zero
means until-end-of-file, so we only ever extend the commit range.
NFS servers are always allowed to commit more and sooner than
requested.
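An illustrative sketch of that clamping idea, not the literal nfsd code:
    /* a wrapping range is widened to "commit to end of file" (count == 0)
     * instead of being rejected with an error */
    if (offset + count < offset)
            count = 0;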
The range check is no longer bounded by NFS_OFFSET_MAX, but rather
by the value that is returned in the maxfilesize field of the NFSv3
FSINFO procedure or the NFSv4 maxfilesize file attribute.
Note that this change results in a new pynfs failure:
CMT4 st_commit.testCommitOverflow : RUNNING
CMT4 st_commit.testCommitOverflow : FAILURE
COMMIT with offset + count overflow should return
NFS4ERR_INVAL, instead got NFS4_OK
IMO the test is not correct as written: RFC 8881 does not allow the
COMMIT operation to return NFS4ERR_INVAL.
Reported-by: Dan Aloni <dan.aloni@vastdata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Bruce Fields <bfields@fieldses.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2c445a0e72cb1fbfbdb7f9473c53556ee27c1d90 ]
Since this pointer is used repeatedly, move it to a stack variable.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bc204778b4032b336cb3bde85bea852d79e7e389 ]
[WHY]
Clocks don't get recalculated in 0 stream/0 pipe configs,
blocking S0i3 if dcfclk gets high enough
[HOW]
Create DCN31 copy of DCN30 bandwidth validation func which
doesn't entirely skip validation in 0 pipe scenarios
Override dcfclk to vlevel 0/min value during validation if pipe
count is 0
Reviewed-by: Eric Yang <Eric.Yang2@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Tested-by: Daniel Wheeler <Daniel.Wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ae4d37b5df749926891583d42a6801b5da11e3c1 ]
The bug is here:
idr_remove(&connection->peer_devices, vnr);
If the previous for_each_connection() doesn't exit early (no goto hit
inside the loop), the iterator 'connection' after the loop will be a
bogus pointer to an invalid structure object containing the HEAD
(&resource->connections). As a result, the use of 'connection' above
will lead to an invalid memory access (including a possible invalid free
as idr_remove could call free_layer).
The original intention should have been to remove all peer_devices,
but the following lines have already done the work. So just remove
this line and the unneeded label, to fix this bug.
Cc: stable@vger.kernel.org
Fixes: c06ece6ba6f1b ("drbd: Turn connection->volumes into connection->peer_devices")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Reviewed-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 27548088ac628109f70eb0b1eb521d035844dba8 ]
In drbd_create_device(), the 'out_no_io_page' label already calls
blk_cleanup_disk() on the failure path.
So remove the 'out_cleanup_disk' label to avoid double-freeing the
disk pointer.
Fixes: e92ab4eda516 ("drbd: add error handling support for add_disk()")
Signed-off-by: Wu Bo <wubo40@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/1636013229-26309-1-git-send-email-wubo40@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e92ab4eda516a5bfd96c087282ebe9521deba4f4 ]
We never checked for errors on add_disk() as this function
returned void. Now that this is fixed, use the shiny new
error handling.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bbac58698a55cc0a6f0c0d69a6dcd3f9f3134c11 ]
[BUG]
There is a report of a btrfs filesystem with a bad super block num devices.
This makes btrfs reject the fs completely.
BTRFS error (device sdd3): super_num_devices 3 mismatch with num_devices 2 found here
BTRFS error (device sdd3): failed to read chunk tree: -22
BTRFS error (device sdd3): open_ctree failed
[CAUSE]
During btrfs device removal, chunk tree and super block num devs are
updated in two different transactions:
btrfs_rm_device()
|- btrfs_rm_dev_item(device)
| |- trans = btrfs_start_transaction()
| | Now we got transaction X
| |
| |- btrfs_del_item()
| | Now device item is removed from chunk tree
| |
| |- btrfs_commit_transaction()
| Transaction X got committed, super num devs untouched,
| but device item removed from chunk tree.
| (AKA, super num devs is already incorrect)
|
|- cur_devices->num_devices--;
|- cur_devices->total_devices--;
|- btrfs_set_super_num_devices()
All those operations are not in transaction X, thus it will
only be written back to disk in next transaction.
So if a power loss happens after transaction X in btrfs_rm_dev_item() is
committed, but before transaction X+1 (which can be minutes away), we get
the super num devices mismatch.
[FIX]
Instead of starting and committing a transaction inside
btrfs_rm_dev_item(), start a transaction inside btrfs_rm_device() and
pass it to btrfs_rm_dev_item().
And only commit the transaction after everything is done.
Reported-by: Luca Béla Palkovics <luca.bela.palkovics@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/CA+8xDSpvdm_U0QLBAnrH=zqDq_cWCOH5TiV46CKmp3igr44okQ@mail.gmail.com/
CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1a15eb724aaef8656f8cc01d9355797cfe7c618e ]
For device removal and replace we call btrfs_find_device_by_devspec,
which if we give it a device path and nothing else will call
btrfs_get_dev_args_from_path, which opens the block device and reads the
super block and then looks up our device based on that.
However at this point we're holding the sb write "lock", so reading the
block device pulls in the dependency of ->open_mutex, which produces the
following lockdep splat
======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc2+ #405 Not tainted
------------------------------------------------------
losetup/11576 is trying to acquire lock:
ffff9bbe8cded938 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x67/0x5e0
but task is already holding lock:
ffff9bbe88e4fc68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (&lo->lo_mutex){+.+.}-{3:3}:
__mutex_lock+0x7d/0x750
lo_open+0x28/0x60 [loop]
blkdev_get_whole+0x25/0xf0
blkdev_get_by_dev.part.0+0x168/0x3c0
blkdev_open+0xd2/0xe0
do_dentry_open+0x161/0x390
path_openat+0x3cc/0xa20
do_filp_open+0x96/0x120
do_sys_openat2+0x7b/0x130
__x64_sys_openat+0x46/0x70
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #3 (&disk->open_mutex){+.+.}-{3:3}:
__mutex_lock+0x7d/0x750
blkdev_get_by_dev.part.0+0x56/0x3c0
blkdev_get_by_path+0x98/0xa0
btrfs_get_bdev_and_sb+0x1b/0xb0
btrfs_find_device_by_devspec+0x12b/0x1c0
btrfs_rm_device+0x127/0x610
btrfs_ioctl+0x2a31/0x2e70
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #2 (sb_writers#12){.+.+}-{0:0}:
lo_write_bvec+0xc2/0x240 [loop]
loop_process_work+0x238/0xd00 [loop]
process_one_work+0x26b/0x560
worker_thread+0x55/0x3c0
kthread+0x140/0x160
ret_from_fork+0x1f/0x30
-> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}:
process_one_work+0x245/0x560
worker_thread+0x55/0x3c0
kthread+0x140/0x160
ret_from_fork+0x1f/0x30
-> #0 ((wq_completion)loop0){+.+.}-{0:0}:
__lock_acquire+0x10ea/0x1d90
lock_acquire+0xb5/0x2b0
flush_workqueue+0x91/0x5e0
drain_workqueue+0xa0/0x110
destroy_workqueue+0x36/0x250
__loop_clr_fd+0x9a/0x660 [loop]
block_ioctl+0x3f/0x50
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of:
(wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&lo->lo_mutex);
                               lock(&disk->open_mutex);
                               lock(&lo->lo_mutex);
  lock((wq_completion)loop0);
*** DEADLOCK ***
1 lock held by losetup/11576:
#0: ffff9bbe88e4fc68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
stack backtrace:
CPU: 0 PID: 11576 Comm: losetup Not tainted 5.14.0-rc2+ #405
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
Call Trace:
dump_stack_lvl+0x57/0x72
check_noncircular+0xcf/0xf0
? stack_trace_save+0x3b/0x50
__lock_acquire+0x10ea/0x1d90
lock_acquire+0xb5/0x2b0
? flush_workqueue+0x67/0x5e0
? lockdep_init_map_type+0x47/0x220
flush_workqueue+0x91/0x5e0
? flush_workqueue+0x67/0x5e0
? verify_cpu+0xf0/0x100
drain_workqueue+0xa0/0x110
destroy_workqueue+0x36/0x250
__loop_clr_fd+0x9a/0x660 [loop]
? blkdev_ioctl+0x8d/0x2a0
block_ioctl+0x3f/0x50
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f31b02404cb
Instead what we want to do is populate our device lookup args before we
grab any locks, and then pass these args into btrfs_rm_device(). From
there we can find the device and do the appropriate removal.
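A hedged sketch of that flow; the struct and helper names follow this series' description and may not match the final code exactly:
    struct btrfs_dev_lookup_args args = { 0 };
    int ret;
    /* resolve the user-supplied path to lookup args before taking locks */
    ret = btrfs_get_dev_args_from_path(fs_info, &args, path);
    if (ret)
            return ret;
    /* ...take the required locks, then hand the pre-populated args down... */
    ret = btrfs_rm_device(fs_info, &args, &bdev, &mode);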
Suggested-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit faa775c41d655a4786e9d53cb075a77bb5a75f66 ]
We are going to want to populate our device lookup args outside of any
locks and then do the actual device lookup later, so add a helper to do
this work and make btrfs_find_device_by_devspec() use this helper for
now.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 562d7b1512f7369a19bca2883e2e8672d78f0481 ]
We have a lot of device lookup functions that all do something slightly
different. Clean this up by adding a struct to hold the different
lookup criteria, and then pass this around to btrfs_find_device() so it
can do the proper matching based on the lookup criteria.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ad6dc1daaf29f97f23cc810d60ee01c0e83f4c6b ]
If mlx5_vdpa gets unloaded while a VM is running, the workqueue will be
destroyed. However, vhost might still have reference to the kick
function and might attempt to push new works. This could lead to null
pointer dereference.
To fix this, set mvdev->wq to NULL just before destroying and verify
that the workqueue is not NULL in mlx5_vdpa_kick_vq before attempting to
push a new work.
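A minimal sketch of both sides of the fix, using the field names from the commit text:
    /* teardown: hide the workqueue before destroying it */
    wq = mvdev->wq;
    mvdev->wq = NULL;
    destroy_workqueue(wq);
    /* mlx5_vdpa_kick_vq(): bail out if the workqueue is already gone */
    if (!mvdev->wq)
            return;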
Fixes: 5262912ef3cf ("vdpa/mlx5: Add support for control VQ and MAC setting")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20220321141303.9586-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0972252450f90db56dd5415a20e2aec21a08d036 ]
During purex packet handling the driver was incorrectly freeing a
pre-allocated structure. Fix this by skipping that entry.
System crashed with the following stack during a module unload test.
Call Trace:
sbitmap_init_node+0x7f/0x1e0
sbitmap_queue_init_node+0x24/0x150
blk_mq_init_bitmaps+0x3d/0xa0
blk_mq_init_tags+0x68/0x90
blk_mq_alloc_map_and_rqs+0x44/0x120
blk_mq_alloc_set_map_and_rqs+0x63/0x150
blk_mq_alloc_tag_set+0x11b/0x230
scsi_add_host_with_dma.cold+0x3f/0x245
qla2x00_probe_one+0xd5a/0x1b80 [qla2xxx]
Call Trace with slub_debug and debug kernel:
kasan_report_invalid_free+0x50/0x80
__kasan_slab_free+0x137/0x150
slab_free_freelist_hook+0xc6/0x190
kfree+0xe8/0x2e0
qla2x00_free_device+0x3bb/0x5d0 [qla2xxx]
qla2x00_remove_one+0x668/0xcf0 [qla2xxx]
Link: https://lore.kernel.org/r/20220310092604.22950-6-njavali@marvell.com
Fixes: 62e9dd177732 ("scsi: qla2xxx: Change in PUREX to handle FPIN ELS requests")
Cc: stable@vger.kernel.org
Reported-by: Marco Patalano <mpatalan@redhat.com>
Tested-by: Marco Patalano <mpatalan@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8062b742d3bd336ca10ab5a1db1629d33700f9c6 ]
Per a review comment from Hannes Reinecke on a previous submission,
replace list_for_each_safe with list_for_each_entry_safe.
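For reference, the entry-safe iterator removes the open-coded container_of() step; a generic sketch with placeholder names:
    struct item *e, *tmp;
    list_for_each_entry_safe(e, tmp, &head, list) {
            list_del(&e->list);
            kfree(e);
    }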
Link: https://lore.kernel.org/r/20211026115412.27691-8-njavali@marvell.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 713b415726f100f6644971e75ebfe1edbef1a390 ]
For session recovery, driver relies on the dpc thread to initiate certain
operations. The dpc thread runs exclusively without the Mailbox interface
being occupied. A recent code change for heartbeat check via mailbox cmd 0
is preventing the dpc thread from carrying out its operation. This patch
allows the higher priority error recovery to run first before running the
lower priority heartbeat check.
Link: https://lore.kernel.org/r/20220310092604.22950-9-njavali@marvell.com
Fixes: d94d8158e184 ("scsi: qla2xxx: Add heartbeat check")
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3a4e1f3b3a3c733de3b82b9b522e54803e1165ae ]
DPC thread gets restricted due to a no-op mailbox, which is a blocking call
and has a high execution frequency. To free up the DPC thread we move no-op
handling to the workqueue. Also, modify qla_do_heartbeat() to send a no-op
MBC if we don't have any active interrupts but there are still I/Os
outstanding with the firmware.
Link: https://lore.kernel.org/r/20210908164622.19240-9-njavali@marvell.com
Fixes: d94d8158e184 ("scsi: qla2xxx: Add heartbeat check")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 83b83a02073ec8d18c77a9bbe0881d710f7a9d32 ]
Use the common TDP MMU zap helper when handling an MMU notifier unmap
event, the two flows are semantically identical. Consolidate the code in
preparation for a future bug fix, as both kvm_tdp_mmu_unmap_gfn_range()
and __kvm_tdp_mmu_zap_gfn_range() are guilty of not zapping SPTEs in
invalid roots.
No functional change intended.
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211215011557.399940-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7533377215b6ee432c06c5855f6be5d66e694e46 ]
Use the yield-safe variant of the TDP MMU iterator when handling an
unmapping event from the MMU notifier, as most occurrences of the event
allow yielding.
Fixes: e1eed5847b09 ("KVM: x86/mmu: Allow yielding during MMU notifier unmap/zap, if possible")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211120015008.3780032-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b289cdecc7c3e25e001cde260c882e4d9a8b0772 ]
As per the HW manual (Rev.1.00 Sep, 2021) PLL2 and PLL3 should be
1600 MHz, but with the current multiplier and divider values this resulted
in 1596 MHz.
This patch updates the multiplier and divider values for PLL2 and PLL3
so that we get the exact (1600 MHz) values.
Fixes: 17f0ff3d49ff1 ("clk: renesas: Add support for R9A07G044 SoC")
Suggested-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20211223093223.4725-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 74be98774dfbc5b8b795db726bd772e735d2edd4 ]
KASAN + DEBUG_KOBJECT_RELEASE reports a potential use-after-free in
cxl_decoder_release() where it goes to reference its parent, a cxl_port,
to free its id back to port->decoder_ida.
BUG: KASAN: use-after-free in to_cxl_port+0x18/0x90 [cxl_core]
Read of size 8 at addr ffff888119270908 by task kworker/35:2/379
CPU: 35 PID: 379 Comm: kworker/35:2 Tainted: G OE 5.17.0-rc2+ #198
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Workqueue: events kobject_delayed_cleanup
Call Trace:
<TASK>
dump_stack_lvl+0x59/0x73
print_address_description.constprop.0+0x1f/0x150
? to_cxl_port+0x18/0x90 [cxl_core]
kasan_report.cold+0x83/0xdf
? to_cxl_port+0x18/0x90 [cxl_core]
to_cxl_port+0x18/0x90 [cxl_core]
cxl_decoder_release+0x2a/0x60 [cxl_core]
device_release+0x5f/0x100
kobject_cleanup+0x80/0x1c0
The device core only guarantees parent lifetime until all children are
unregistered. If a child needs a parent to complete its ->release()
callback that child needs to hold a reference to extend the lifetime of
the parent.
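The general pattern being described, sketched with core driver-model calls rather than the literal cxl diff:
    /* at decoder creation: pin the parent port */
    get_device(&port->dev);
    /* in the decoder's ->release(): the parent is still valid thanks to the
     * reference above; return the id, free the decoder, then drop it */
    put_device(&port->dev);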
Fixes: 40ba17afdfab ("cxl/acpi: Introduce cxl_decoder objects")
Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Tested-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164505751190.4175768.13324905271463416712.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b44eeb8cbdf2b88f2844f11e4f263b0abed5b5b0 ]
After commit d430dffbe9dd ("mt76: mt7921: fix a possible race
enabling/disabling runtime-pm"), runtime-pm is always disabled in the
fw even if the user requests to enable it by toggling the debugfs node,
since the mt7921_pm_interface_iter routine uses pm->enable to configure
the fw.
Fix the issue by moving the enable variable configuration before running
the mt7921_pm_interface_iter routine.
Fixes: d430dffbe9dd ("mt76: mt7921: fix a possible race enabling/disabling runtime-pm")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 43acb728bbc40169d2e2425e84a80068270974be ]
The driver allocates and registers two platform device structures during
probe, but the devices were never deregistered on driver unbind.
This results in a use-after-free on driver unbind as the device
structures were allocated using devres and would be freed by driver
core when remove() returns.
Fix this by adding the missing deregistration calls to the remove()
callback and failing probe on registration errors.
Note that the platform device structures must be freed using a proper
release callback to avoid leaking associated resources like device
names.
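A rough sketch of registration with a proper release path; the device name is a placeholder:
    pdev = platform_device_alloc("vpif_capture", -1);
    if (!pdev)
            return -ENOMEM;
    ret = platform_device_add(pdev);
    if (ret) {
            platform_device_put(pdev);      /* runs the release callback */
            return ret;
    }
    /* ... and in remove(): */
    platform_device_unregister(pdev);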
Fixes: 479f7a118105 ("[media] davinci: vpif: adaptions for DT support")
Cc: stable@vger.kernel.org # 4.12
Cc: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Lad Prabhakar <prabhakar.csengg@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d4568fc8525897e683983806f813be1ae9eedaed ]
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields. Wrap the target region
in struct_group(). This additionally fixes a theoretical misalignment
of the copy (since the size of "buf" changes between 64-bit and 32-bit,
but this is likely never built for 64-bit).
FWIW, I think this code is totally broken on 64-bit (which appears to
not be a "real" build configuration): it would either always fail (with
an uninitialized data->buf_size) or would cause corruption in userspace
due to the copy_to_user() in the call path against an uninitialized
data->buf value:
  omap3isp_stat_request_statistics_time32(...)
      struct omap3isp_stat_data data64;
      ...
      omap3isp_stat_request_statistics(stat, &data64);

  int omap3isp_stat_request_statistics(struct ispstat *stat,
                                       struct omap3isp_stat_data *data)
      ...
      buf = isp_stat_buf_get(stat, data);

  static struct ispstat_buffer *isp_stat_buf_get(struct ispstat *stat,
                                                 struct omap3isp_stat_data *data)
      ...
      if (buf->buf_size > data->buf_size) {
          ...
          return ERR_PTR(-EINVAL);
      }
      ...
      rval = copy_to_user(data->buf,
                          buf->virt_addr,
                          buf->buf_size);
Regardless, additionally initialize data64 to be zero-filled to avoid
undefined behavior.
Link: https://lore.kernel.org/lkml/20211215220505.GB21862@embeddedor
Cc: Arnd Bergmann <arnd@arndb.de>
Fixes: 378e3f81cb56 ("media: omap3isp: support 64-bit version of omap3isp_stat_data")
Cc: stable@vger.kernel.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 50d7bd38c3aafc4749e05e8d7fcb616979143602 ]
Kernel code has a regular need to describe groups of members within a
structure usually when they need to be copied or initialized separately
from the rest of the surrounding structure. The generally accepted design
pattern in C is to use a named sub-struct:
  struct foo {
          int one;
          struct {
                  int two;
                  int three, four;
          } thing;
          int five;
  };
This would allow for traditional references and sizing:
memcpy(&dst.thing, &src.thing, sizeof(dst.thing));
However, doing this would mean that referencing struct members enclosed
by such named structs would always require including the sub-struct name
in identifiers:
do_something(dst.thing.three);
This has tended to be quite inflexible, especially when such groupings
need to be added to established code which causes huge naming churn.
Three workarounds exist in the kernel for this problem, and each has
other negative properties.
To avoid the naming churn, there is a design pattern of adding macro
aliases for the named struct:
#define f_three thing.three
This ends up polluting the global namespace, and makes it difficult to
search for identifiers.
Another common work-around in kernel code avoids the pollution by avoiding
the named struct entirely, instead identifying the group's boundaries using
either a pair of empty anonymous structs or a pair of zero-element arrays:
  struct foo {
          int one;
          struct { } start;
          int two;
          int three, four;
          struct { } finish;
          int five;
  };

  struct foo {
          int one;
          int start[0];
          int two;
          int three, four;
          int finish[0];
          int five;
  };
This allows code to avoid needing to use a sub-struct named for member
references within the surrounding structure, but loses the benefits of
being able to actually use such a struct, making it rather fragile. Using
these requires open-coded calculation of sizes and offsets. The efforts
made to avoid common mistakes include lots of comments, or adding various
BUILD_BUG_ON()s. Such code is left with no way for the compiler to reason
about the boundaries (e.g. the "start" object looks like it's 0 bytes
in length), making bounds checking depend on open-coded calculations:
  if (length > offsetof(struct foo, finish) -
               offsetof(struct foo, start))
          return -EINVAL;
  memcpy(&dst.start, &src.start, offsetof(struct foo, finish) -
                                 offsetof(struct foo, start));
However, the vast majority of places in the kernel that operate on
groups of members do so without any identification of the grouping,
relying either on comments or implicit knowledge of the struct contents,
which is even harder for the compiler to reason about, and results in
even more fragile manual sizing, usually depending on member locations
outside of the region (e.g. to copy "two" and "three", use the start of
"four" to find the size):
  BUILD_BUG_ON((offsetof(struct foo, four) <
                offsetof(struct foo, two)) ||
               (offsetof(struct foo, four) <
                offsetof(struct foo, three)));
  if (length > offsetof(struct foo, four) -
               offsetof(struct foo, two))
          return -EINVAL;
  memcpy(&dst.two, &src.two, length);
In order to have a regular programmatic way to describe a struct
region that can be used for references and sizing, can be examined for
bounds checking, avoids forcing the use of intermediate identifiers,
and avoids polluting the global namespace, introduce the struct_group()
macro. This macro wraps the member declarations to create an anonymous
union of an anonymous struct (no intermediate name) and a named struct
(for references and sizing):
  struct foo {
          int one;
          struct_group(thing,
                  int two;
                  int three, four;
          );
          int five;
  };

  if (length > sizeof(src.thing))
          return -EINVAL;
  memcpy(&dst.thing, &src.thing, length);
  do_something(dst.three);
There are some rare cases where the resulting struct_group() needs
attributes added, so struct_group_attr() is also introduced to allow
for specifying struct attributes (e.g. __align(x) or __packed).
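A small hedged example of the attribute-taking variant:
  struct wire_msg {
          u8 preamble;
          struct_group_attr(payload, __packed,
                  u8 type;
                  u8 data[8];
          );
          u8 crc;
  };
  memcpy(&dst.payload, &src.payload, sizeof(dst.payload));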
Additionally, there are places where such declarations would like to
have the struct be tagged, so struct_group_tagged() is added.
Given there is a need for a handful of UAPI uses too, the underlying
__struct_group() macro has been defined in UAPI so it can be used there
too.
To avoid confusing scripts/kernel-doc, hide the macro from its struct
parsing.
Co-developed-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/lkml/20210728023217.GC35706@embeddedor
Enhanced-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Link: https://lore.kernel.org/lkml/41183a98-bdb9-4ad6-7eab-5a7292a6df84@rasmusvillemoes.dk
Enhanced-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/lkml/1d9a2e6df2a9a35b2cdd50a9a68cac5991e7e5f0.camel@intel.com
Enhanced-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://lore.kernel.org/lkml/YQKa76A6XuFqgM03@phenom.ffwll.local
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aa1b46dcdc7baaf5fec0be25782ef24b26aa209e ]
a647a524a467 ("block: don't call rq_qos_ops->done_bio if the bio isn't
tracked") made bio_endio() skip rq_qos_done_bio() if BIO_TRACKED is not set.
While this fixed a potential oops, it also broke blk-iocost by skipping the
done_bio callback for merged bios.
Before, whether a bio goes through rq_qos_throttle() or rq_qos_merge(),
rq_qos_done_bio() would be called on the bio on completion with BIO_TRACKED
distinguishing the former from the latter. Now, rq_qos_done_bio() is not
called for bios which went through rq_qos_merge(). This royally confuses
blk-iocost as the merged bios never finish and are considered perpetually
in-flight.
One reliably reproducible failure mode is an intermediate cgroup getting
stuck active preventing its children from being activated due to the
leaf-only rule, leading to loss of control. The following is from
resctl-bench protection scenario which emulates isolating a web server like
workload from a memory bomb run on an iocost configuration which should
yield a reasonable level of protection.
# cat /sys/block/nvme2n1/device/model
Samsung SSD 970 PRO 512GB
# cat /sys/fs/cgroup/io.cost.model
259:0 ctrl=user model=linear rbps=834913556 rseqiops=93622 rrandiops=102913 wbps=618985353 wseqiops=72325 wrandiops=71025
# cat /sys/fs/cgroup/io.cost.qos
259:0 enable=1 ctrl=user rpct=95.00 rlat=18776 wpct=95.00 wlat=8897 min=60.00 max=100.00
# resctl-bench -m 29.6G -r out.json run protection::scenario=mem-hog,loops=1
...
Memory Hog Summary
==================
IO Latency: R p50=242u:336u/2.5m p90=794u:1.4m/7.5m p99=2.7m:8.0m/62.5m max=8.0m:36.4m/350m
W p50=221u:323u/1.5m p90=709u:1.2m/5.5m p99=1.5m:2.5m/9.5m max=6.9m:35.9m/350m
Isolation and Request Latency Impact Distributions:
min p01 p05 p10 p25 p50 p75 p90 p95 p99 max mean stdev
isol% 15.90 15.90 15.90 40.05 57.24 59.07 60.01 74.63 74.63 90.35 90.35 58.12 15.82
lat-imp% 0 0 0 0 0 4.55 14.68 15.54 233.5 548.1 548.1 53.88 143.6
Result: isol=58.12:15.82% lat_imp=53.88%:143.6 work_csv=100.0% missing=3.96%
The isolation result of 58.12% is close to what this device would show
without any IO control.
Fix it by introducing a new flag BIO_QOS_MERGED to mark merged bios and
calling rq_qos_done_bio() on them too. For consistency and clarity, rename
BIO_TRACKED to BIO_QOS_THROTTLED. The flag checks are moved into
rq_qos_done_bio() so that it's next to the code paths that set the flags.
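A hedged sketch of the centralized check; helper names are close to, but may not exactly match, the tree:
    static inline void rq_qos_done_bio(struct bio *bio)
    {
            if (bio->bi_bdev &&
                (bio_flagged(bio, BIO_QOS_THROTTLED) ||
                 bio_flagged(bio, BIO_QOS_MERGED)))
                    __rq_qos_done_bio(bdev_get_queue(bio->bi_bdev)->rq_qos, bio);
    }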
With the patch applied, the above same benchmark shows:
# resctl-bench -m 29.6G -r out.json run protection::scenario=mem-hog,loops=1
...
Memory Hog Summary
==================
IO Latency: R p50=123u:84.4u/985u p90=322u:256u/2.5m p99=1.6m:1.4m/9.5m max=11.1m:36.0m/350m
W p50=429u:274u/995u p90=1.7m:1.3m/4.5m p99=3.4m:2.7m/11.5m max=7.9m:5.9m/26.5m
Isolation and Request Latency Impact Distributions:
min p01 p05 p10 p25 p50 p75 p90 p95 p99 max mean stdev
isol% 84.91 84.91 89.51 90.73 92.31 94.49 96.36 98.04 98.71 100.0 100.0 94.42 2.81
lat-imp% 0 0 0 0 0 2.81 5.73 11.11 13.92 17.53 22.61 4.10 4.68
Result: isol=94.42:2.81% lat_imp=4.10%:4.68 work_csv=58.34% missing=0%
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: a647a524a467 ("block: don't call rq_qos_ops->done_bio if the bio isn't tracked")
Cc: stable@vger.kernel.org # v5.15+
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/Yi7rdrzQEHjJLGKB@slm.duckdns.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 90b8faa0e8de1b02b619fb33f6c6e1e13e7d1d70 ]
We set BIO_TRACKED unconditionally when rq_qos_throttle() is called, even
though we may not even have an rq_qos handler. Only mark it as TRACKED if
it really is potentially tracked.
This saves considerable time for the case where the bio isn't tracked:
2.64% -1.65% [kernel.vmlinux] [k] bio_endio
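In sketch form, mirroring the description above rather than the exact diff:
    static inline void rq_qos_throttle(struct request_queue *q, struct bio *bio)
    {
            if (q->rq_qos) {
                    bio_set_flag(bio, BIO_TRACKED);
                    __rq_qos_throttle(q->rq_qos, bio);
            }
    }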
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f63cf5192fe3418ad5ae1a4412eba5694b145f79 ]
Ensure that we call fsnotify_modify() if we write a file, and that we
do fsnotify_access() if we read it. This enables anyone using inotify
on the file to get notified.
Ditto for fallocate, ensure that fsnotify_modify() is called.
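The pattern in sketch form (the req->file naming is illustrative):
    /* after a successful write or fallocate */
    fsnotify_modify(req->file);
    /* after a successful read */
    fsnotify_access(req->file);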
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 02746e26c39ee473b975e0f68d1295abc92672ed ]
No need to pre-allocate a big buffer for the IO SGL anymore. If a device
has lots of deep queues, preallocation for the sg list can consume
substantial amounts of memory. For a HW virtio-blk device, nr_hw_queues
can be 64 or 128 and each queue's depth might be 128. This means the
resulting preallocation for the data SGLs is big.
Switch to runtime allocation for SGL for lists longer than 2 entries.
This is the approach used by NVMe drivers so it should be reasonable for
virtio block as well. Runtime SGL allocation has always been the case
for the legacy I/O path so this is nothing new.
The preallocated small SGL depends on SG_CHAIN so if the ARCH doesn't
support SG_CHAIN, use only runtime allocation for the SGL.
Re-organize the setup of the IO request to fit the new sg chain
mechanism.
No performance degradation was seen (fio libaio engine with 16 jobs and
128 iodepth):
  IO size   IOPs Rand Read (before/after)   IOPs Rand Write (before/after)
  -------   -----------------------------   ------------------------------
  512B      318K/316K                       329K/325K
  4KB       323K/321K                       353K/349K
  16KB      199K/208K                       250K/275K
  128KB     36K/36.1K                       39.2K/41.7K
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Link: https://lore.kernel.org/r/20210901131434.31158-1-mgurtovoy@nvidia.com
Reviewed-by: Feng Li <lifeng1519@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de> # kconfig fixups
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fd98693cb0721317f27341951593712c580c36a1 ]
We currently don't allow queuing resets when the adapter is in VNIC_PROBING
state - instead we throw away the reset and return EBUSY. The reasoning
is probably that during ibmvnic_probe() the ibmvnic_adapter itself is
being initialized so performing a reset during this time can lead us to
accessing fields in the ibmvnic_adapter that are not fully initialized.
A review of the code shows that all the adapter state needed to process a
reset is initialized before registering the CRQ so that should no longer
be a concern.
Further the expectation is that if we do get a reset (transport event)
during probe, the do..while() loop in ibmvnic_probe() will handle this
by reinitializing the CRQ.
While that is true to some extent, it is possible that the reset might
occur _after_ the CRQ is registered and CRQ_INIT message was exchanged
but _before_ the adapter state is set to VNIC_PROBED. As mentioned above,
such a reset will be thrown away. While the client assumes that the
adapter is functional, the vnic server will wait for the client to reinit
the adapter. This disconnect between the two leaves the adapter down
needing manual intervention.
Because ibmvnic_probe() has other work to do after initializing the CRQ
(such as registering the netdev at a minimum) and because the reset event
can occur at any instant after the CRQ is initialized, there will always
be a window between initializing the CRQ and considering the adapter
ready for resets (ie state == PROBED).
So rather than discarding resets during this window, allow queueing them
- but only process them after the adapter is fully initialized.
To do this, introduce a new completion state ->probe_done and have the
reset worker thread wait on this before processing resets.
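A hedged sketch of that gating, with field names following the commit text:
    /* ibmvnic_probe(): resets may be queued, but not processed, until... */
    init_completion(&adapter->probe_done);
    /* ...the netdev is registered and probe signals completion */
    complete(&adapter->probe_done);
    /* __ibmvnic_reset() worker: */
    wait_for_completion(&adapter->probe_done);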
This change brings up two new situations in or just after ibmvnic_probe().
First after one or more resets were queued, we encounter an error and
decide to retry the initialization. At that point the queued resets are
no longer relevant since we could be talking to a new vnic server. So we
must purge/flush the queued resets before restarting the initialization.
As a side note, since we are still in the probing stage and we have not
registered the netdev, it will not be a CHANGE_PARAM reset.
Second this change opens up a potential race between the worker thread
in __ibmvnic_reset(), the tasklet and the ibmvnic_open() due to the
following sequence of events:
1. Register CRQ
2. Get transport event before CRQ_INIT completes.
3. Tasklet schedules reset:
a) add rwi to list
b) schedule_work() to start worker thread which runs
and waits for ->probe_done.
4. ibmvnic_probe() decides to retry, purges rwi_list
5. Re-register crq and this time rest of probe succeeds - register
netdev and complete(->probe_done).
6. Worker thread resumes in __ibmvnic_reset() from 3b.
7. Worker thread sets ->resetting bit
8. ibmvnic_open() comes in, notices ->resetting bit, sets state
to IBMVNIC_OPEN and returns early expecting worker thread to
finish the open.
9. Worker thread finds rwi_list empty and returns without
opening the interface.
If this happens, the ->ndo_open() call is effectively lost and the
interface remains down. To address this, ensure that ->rwi_list is
not empty before setting the ->resetting bit. See also comments in
__ibmvnic_reset().
Fixes: 6a2fb0e99f9c ("ibmvnic: driver initialization for kdump/kexec")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f628ad531b4f34fdba0984255b4a2850dd369513 ]
Clear ->failover_pending flag that may have been set in the previous
pass of registering CRQ. If we don't clear, a subsequent ibmvnic_open()
call would be misled into thinking a failover is pending and assuming
that the reset worker thread would open the adapter. If this pass of
registering the CRQ succeeds (i.e there is no transport event), there
wouldn't be a reset worker thread.
This would leave the adapter unconfigured and require manual intervention
to bring it up during boot.
Fixes: 5a18e1e0c193 ("ibmvnic: Fix failover case for non-redundant configuration")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ae16bf15374d8b055e040ac6f3f1147ab1c9bb7d ]
We currently initialize the ->init_done completion/return code fields
before issuing a CRQ_INIT command. But if we get a transport event soon
after registering the CRQ the tasklet may already have recorded the
completion and error code. If we initialize here, we might overwrite/
lose that and end up issuing the CRQ_INIT only to timeout later.
If that timeout happens during probe, we will leave the adapter in the
DOWN state rather than retrying to register/init the CRQ.
Initialize the completion before registering the CRQ so we don't lose
the notification.
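Sketched below; the ordering is the point, and the helper/field names are assumptions based on the commit text rather than the exact diff:
    reinit_completion(&adapter->init_done);   /* ready before the CRQ can fire */
    adapter->init_done_rc = 0;
    rc = init_crq_queue(adapter);             /* tasklet may complete init_done now */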
Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6b4b54c7ca347bcb4aa7a3cc01aa16e84ac7fbe4 ]
We need to preserve the values at OLDMEM_BASE and OLDMEM_SIZE which are
used by zgetdump in case kdump crashes. In that case zgetdump will
attempt to read OLDMEM_BASE and OLDMEM_SIZE in order to find out where
the memory range [0 - OLDMEM_SIZE] belonging to the production kernel is.
Fixes: f1a546947431 ("s390/setup: don't reserve memory that occupied decompressor's head")
Cc: stable@vger.kernel.org # 5.15+
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 04f11ed7d8e018e1f01ebda5814ddfeb3a1e6ae1 ]
The memblock_reserve() function accepts the physical address of a memory
block to be reserved, but was provided with virtual memory pointers.
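In one line, the kind of fix implied (a sketch):
    /* memblock works on physical addresses; convert the virtual pointer */
    memblock_reserve(__pa(ptr), size);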
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e3ec8e0f5711d73f7e5d5c3cffdf4fad4f1487b9 ]
The memory for amode31 section is allocated from the decompressed
kernel. Instead, allocate that memory from the decompressor. This
is a prerequisite to allow initialization of the virtual memory
before the decompressed kernel takes over.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a9e8503def0fd4ed89ade1f61c315f904581d439 ]
Loads relative to ->thoff naturally expect that this points to the
transport header, but this is only true if pkt->fragoff == 0.
This has little effect for rulesets with connection tracking/nat because
these enable ip defrag. For other rulesets this prevents false matches.
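A hedged sketch of the guard this implies for transport-header-relative loads (the error label is a placeholder):
    /* ->thoff is only meaningful on the first fragment */
    if (pkt->fragoff)
            goto err;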
Fixes: 96518518cc41 ("netfilter: add nftables")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c46b38dc8743535e686b911d253a844f0bd50ead ]
Allow matching and mangling on inner headers / payload data after the
transport header. There is a new field in the pktinfo structure that
stores the inner header offset which is calculated only when requested.
Only TCP and UDP supported at this stage.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b5bdc6f9c24db9a0adf8bd00c0e935b184654f00 ]
Generalize boolean field to store more flags on the pktinfo structure.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4045daf0fa87846a27f56329fddad2deeb5ca354 ]
On resume from suspend the following chain of events can happen:
A rt5682_resume() -> mod_delayed_work() for jack_detect_work
B DAPM sequence starts (DAPM is locked now)
A1. rt5682_jack_detect_handler() scheduled
- Takes both jdet_mutex and calibrate_mutex
- Calls in to rt5682_headset_detect() which tries to take DAPM lock, it
starts to wait for it as B path took it already.
B1. DAPM sequence reaches the "HP Amp", rt5682_hp_event() tries to take
the jdet_mutex, but it is locked in A1, so it waits.
Deadlock.
To solve the deadlock, drop the jdet_mutex, use the jack_detect_work to do
the jack removal handling, move the dapm lock up one level to protect
most of the rt5682_jack_detect_handler(), but not the jack reporting as it
might trigger a DAPM sequence.
The rt5682_headset_detect() can be changed to static as well.
Fixes: 8deb34a90f063 ("ASoC: rt5682: fix the wrong jack type detected")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://lore.kernel.org/r/20220126100325.16513-1-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2cd9b0ef82d936623d789bb3fbb6fcf52c500367 ]
Sometimes end users change the jack type while the system is suspended,
so the driver needs to re-detect the combo jack type after resuming to
avoid any unexpected behaviors.
Signed-off-by: Derek Fang <derek.fang@realtek.com>
Link: https://lore.kernel.org/r/20211109095450.12950-2-derek.fang@realtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a3774a2a6544a7a4a85186e768afc07044aa507f ]
When the system suspends, the codec driver will set SAR to
power saving mode if a headset is plugged in.
This can generate an unexpected IRQ, which leads to
issues after resuming, such as noise from OMTP-type headsets.
Signed-off-by: Derek Fang <derek.fang@realtek.com>
Link: https://lore.kernel.org/r/20211109095450.12950-1-derek.fang@realtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5623ef8a118838aae65363750dfafcba734dc8cb ]
Such rules are redundant but allowed and passed to the driver.
The driver does not support offloading such rules so return an error.
Fixes: 03a9d11e6eeb ("net/mlx5e: Add TC drop and mirred/redirect action parsing for SRIOV offloads")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>