IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
commit 37de5a80e932f828c34abeaae63170d73930dca3 upstream.
Each smb_rqst struct contains two things: an array of kvecs (rq_iov) that
contains the protocol data for an RPC op and an iterator (rq_iter) that
contains the data payload of an RPC op. When an smb_rqst is allocated
rq_iter is it always cleared, but we don't set it up unless we're going to
use it.
The functions that determines the size of the ciphertext buffer that will
be needed to encrypt a request, cifs_get_num_sgs(), assumes that rq_iter is
always initialised - and employs user_backed_iter() to check that the
iterator isn't user-backed. This used to incidentally work, because
->user_backed was set to false because the iterator has never been
initialised, but with commit f1b4cb650b9a0eeba206d8f069fcdc532bfbcd74[1]
which changes user_backed_iter() to determine this based on the iterator
type insted, a warning is now emitted:
WARNING: CPU: 7 PID: 4584 at fs/smb/client/cifsglob.h:2165 smb2_get_aead_req+0x3fc/0x420 [cifs]
...
RIP: 0010:smb2_get_aead_req+0x3fc/0x420 [cifs]
...
crypt_message+0x33e/0x550 [cifs]
smb3_init_transform_rq+0x27d/0x3f0 [cifs]
smb_send_rqst+0xc7/0x160 [cifs]
compound_send_recv+0x3ca/0x9f0 [cifs]
cifs_send_recv+0x25/0x30 [cifs]
SMB2_tcon+0x38a/0x820 [cifs]
cifs_get_smb_ses+0x69c/0xee0 [cifs]
cifs_mount_get_session+0x76/0x1d0 [cifs]
dfs_mount_share+0x74/0x9d0 [cifs]
cifs_mount+0x6e/0x2e0 [cifs]
cifs_smb3_do_mount+0x143/0x300 [cifs]
smb3_get_tree+0x15e/0x290 [cifs]
vfs_get_tree+0x2d/0xe0
do_new_mount+0x124/0x340
__se_sys_mount+0x143/0x1a0
The problem is that rq_iter was never set, so the type is 0 (ie. ITER_UBUF)
which causes user_backed_iter() to return true. The code doesn't
malfunction because it checks the size of the iterator - which is 0.
Fix cifs_get_num_sgs() to ignore rq_iter if its count is 0, thereby
bypassing the warnings.
It might be better to explicitly initialise rq_iter to a zero-length
ITER_BVEC, say, as it can always be reinitialised later.
Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list")
Reported-by: Damian Tometzki <damian@riscv-rocks.de>
Closes: https://lore.kernel.org/r/ZUfQo47uo0p2ZsYg@fedora.fritz.box/
Tested-by: Damian Tometzki <damian@riscv-rocks.de>
Cc: stable@vger.kernel.org
cc: Eric Biggers <ebiggers@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f1b4cb650b9a0eeba206d8f069fcdc532bfbcd74 [1]
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9599d59eb8fc0c0fd9480c4f22901533d08965ee upstream.
The only reason why cifs_sb gets passed today to cifs_try_adding_channels
is to pass the local_nls field for the new channels and binding session.
However, the ses struct already has local_nls field that is setup during
the first cifs_setup_session. So there is no need to pass cifs_sb.
This change removes cifs_sb from the arg list for this and the functions
that it calls and uses ses->local_nls instead.
Cc: stable@vger.kernel.org
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e5e64c9477d58e73cb1a0e83eacad1f8df247cf upstream.
If the mount command has specified multichannel as a mount option,
but multichannel is found to be unsupported by the server at the time
of mount, we set chan_max to 1. Which means that the user needs to
remount the share if the server starts supporting multichannel.
This change removes this reset. What it means is that if the user
specified multichannel or max_channels during mount, and at this
time, multichannel is not supported, but the server starts supporting
it at a later point, the client will be capable of scaling out the
number of channels.
Cc: stable@vger.kernel.org
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d9a6d78096056a3cb5c5f07a730ab92f2f9ac4e6 upstream.
During a session reconnect, it is possible that the
server moved to another physical server (happens in case
of Azure files). So at this time, force a query of server
interfaces again (in case of multichannel session), such
that the secondary channels connect to the right
IP addresses (possibly updated now).
Cc: stable@vger.kernel.org
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c3326a61cdbf3ce1273d9198b6cbf90965d7e029 upstream.
We introduced a helper function to be used by non-cifsd threads to
mark the connection for reconnect. For multichannel, when only
a particular channel needs to be reconnected, this had a bug.
This change fixes that by marking that particular channel
for reconnect.
Fixes: dca65818c80c ("cifs: use a different reconnect helper for non-cifsd threads")
Cc: stable@vger.kernel.org
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e2fd17f434d2fed78efb123e2fc6711e4f598f1 upstream.
There was a wrong assumption that with CONFIG_CIFS_DFS_UPCALL=y there
would always be a dns_resolver key set up so we could unconditionally
upcall to resolve UNC hostname rather than using the value provided by
mount(2).
Only require it when performing automount of junctions within a DFS
share so users that don't have dns_resolver key still can mount their
regular shares with server hostname resolved by mount.cifs(8).
Fixes: 348a04a8d113 ("smb: client: get rid of dfs code dep in namespace.c")
Cc: stable@vger.kernel.org
Tested-by: Eduard Bachmakov <e.bachmakov@gmail.com>
Reported-by: Eduard Bachmakov <e.bachmakov@gmail.com>
Closes: https://lore.kernel.org/all/CADCRUiNvZuiUZ0VGZZO9HRyPyw6x92kiA7o7Q4tsX5FkZqUkKg@mail.gmail.com/
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e6322fd177c6885a21dd4609dc5e5c973d1a2eb7 upstream.
All release_mid() callers seem to hold a reference of @mid so there is
no need to call kref_put(&mid->refcount, __release_mid) under
@server->mid_lock spinlock. If they don't, then an use-after-free bug
would have occurred anyways.
By getting rid of such spinlock also fixes a potential deadlock as
shown below
CPU 0 CPU 1
------------------------------------------------------------------
cifs_demultiplex_thread() cifs_debug_data_proc_show()
release_mid()
spin_lock(&server->mid_lock);
spin_lock(&cifs_tcp_ses_lock)
spin_lock(&server->mid_lock)
__release_mid()
smb2_find_smb_tcon()
spin_lock(&cifs_tcp_ses_lock) *deadlock*
Cc: stable@vger.kernel.org
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c86919455c1edec99ebd3338ad213b59271a71b upstream.
The following UAF was triggered when running fstests generic/072 with
KASAN enabled against Windows Server 2022 and mount options
'multichannel,max_channels=2,vers=3.1.1,mfsymlinks,noperm'
BUG: KASAN: slab-use-after-free in smb2_query_info_compound+0x423/0x6d0 [cifs]
Read of size 8 at addr ffff888014941048 by task xfs_io/27534
CPU: 0 PID: 27534 Comm: xfs_io Not tainted 6.6.0-rc7 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
Call Trace:
dump_stack_lvl+0x4a/0x80
print_report+0xcf/0x650
? srso_alias_return_thunk+0x5/0x7f
? srso_alias_return_thunk+0x5/0x7f
? __phys_addr+0x46/0x90
kasan_report+0xda/0x110
? smb2_query_info_compound+0x423/0x6d0 [cifs]
? smb2_query_info_compound+0x423/0x6d0 [cifs]
smb2_query_info_compound+0x423/0x6d0 [cifs]
? __pfx_smb2_query_info_compound+0x10/0x10 [cifs]
? srso_alias_return_thunk+0x5/0x7f
? __stack_depot_save+0x39/0x480
? kasan_save_stack+0x33/0x60
? kasan_set_track+0x25/0x30
? ____kasan_slab_free+0x126/0x170
smb2_queryfs+0xc2/0x2c0 [cifs]
? __pfx_smb2_queryfs+0x10/0x10 [cifs]
? __pfx___lock_acquire+0x10/0x10
smb311_queryfs+0x210/0x220 [cifs]
? __pfx_smb311_queryfs+0x10/0x10 [cifs]
? srso_alias_return_thunk+0x5/0x7f
? __lock_acquire+0x480/0x26c0
? lock_release+0x1ed/0x640
? srso_alias_return_thunk+0x5/0x7f
? do_raw_spin_unlock+0x9b/0x100
cifs_statfs+0x18c/0x4b0 [cifs]
statfs_by_dentry+0x9b/0xf0
fd_statfs+0x4e/0xb0
__do_sys_fstatfs+0x7f/0xe0
? __pfx___do_sys_fstatfs+0x10/0x10
? srso_alias_return_thunk+0x5/0x7f
? lockdep_hardirqs_on_prepare+0x136/0x200
? srso_alias_return_thunk+0x5/0x7f
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Allocated by task 27534:
kasan_save_stack+0x33/0x60
kasan_set_track+0x25/0x30
__kasan_kmalloc+0x8f/0xa0
open_cached_dir+0x71b/0x1240 [cifs]
smb2_query_info_compound+0x5c3/0x6d0 [cifs]
smb2_queryfs+0xc2/0x2c0 [cifs]
smb311_queryfs+0x210/0x220 [cifs]
cifs_statfs+0x18c/0x4b0 [cifs]
statfs_by_dentry+0x9b/0xf0
fd_statfs+0x4e/0xb0
__do_sys_fstatfs+0x7f/0xe0
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Freed by task 27534:
kasan_save_stack+0x33/0x60
kasan_set_track+0x25/0x30
kasan_save_free_info+0x2b/0x50
____kasan_slab_free+0x126/0x170
slab_free_freelist_hook+0xd0/0x1e0
__kmem_cache_free+0x9d/0x1b0
open_cached_dir+0xff5/0x1240 [cifs]
smb2_query_info_compound+0x5c3/0x6d0 [cifs]
smb2_queryfs+0xc2/0x2c0 [cifs]
This is a race between open_cached_dir() and cached_dir_lease_break()
where the cache entry for the open directory handle receives a lease
break while creating it. And before returning from open_cached_dir(),
we put the last reference of the new @cfid because of
!@cfid->has_lease.
Besides the UAF, while running xfstests a lot of missed lease breaks
have been noticed in tests that run several concurrent statfs(2) calls
on those cached fids
CIFS: VFS: \\w22-root1.gandalf.test No task to wake, unknown frame...
CIFS: VFS: \\w22-root1.gandalf.test Cmd: 18 Err: 0x0 Flags: 0x1...
CIFS: VFS: \\w22-root1.gandalf.test smb buf 00000000715bfe83 len 108
CIFS: VFS: Dump pending requests:
CIFS: VFS: \\w22-root1.gandalf.test No task to wake, unknown frame...
CIFS: VFS: \\w22-root1.gandalf.test Cmd: 18 Err: 0x0 Flags: 0x1...
CIFS: VFS: \\w22-root1.gandalf.test smb buf 000000005aa7316e len 108
...
To fix both, in open_cached_dir() ensure that @cfid->has_lease is set
right before sending out compounded request so that any potential
lease break will be get processed by demultiplex thread while we're
still caching @cfid. And, if open failed for some reason, re-check
@cfid->has_lease to decide whether or not put lease reference.
Cc: stable@vger.kernel.org
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5923d6686a100c2b4cabd4c2ca9d5a12579c7614 upstream.
Fixes xfstest generic/728 which had been failing due to incorrect
ctime after setxattr and removexattr
Update ctime on successful set of xattr
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit de4eceab578ead12a71e5b5588a57e142bbe8ceb upstream.
When multiple mounts are to the same share from the same client it was not
possible to determine which section of /proc/fs/cifs/Stats (and DebugData)
correspond to that mount. In some recent examples this turned out to be
a significant problem when trying to analyze performance data - since
there are many cases where unless we know the tree id and session id we
can't figure out which stats (e.g. number of SMB3.1.1 requests by type,
the total time they take, which is slowest, how many fail etc.) apply to
which mount. The only existing loosely related ioctl CIFS_IOC_GET_MNT_INFO
does not return the information needed to uniquely identify which tcon
is which mount although it does return various flags and device info.
Add a cifs.ko ioctl CIFS_IOC_GET_TCON_INFO (0x800ccf0c) to return tid,
session id, tree connect count.
Cc: stable@vger.kernel.org
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 475efd9808a3094944a56240b2711349e433fb66 upstream.
For example:
touch -h -t 02011200 testfile
where testfile is a symlink would not change the timestamp, but
touch -t 02011200 testfile
does work to change the timestamp of the target
Suggested-by: David Howells <dhowells@redhat.com>
Reported-by: Micah Veilleux <micah.veilleux@iba-group.com>
Closes: https://bugzilla.samba.org/show_bug.cgi?id=14476
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 72bc63f5e23a38b65ff2a201bdc11401d4223fa9 upstream.
Fixes some xfstests including generic/564 and generic/157
The "sfu" mount option can be useful for creating special files (character
and block devices in particular) but could not create FIFOs. It did
recognize existing empty files with the "system" attribute flag as FIFOs
but this is too general, so to support creating FIFOs more safely use a new
tag (but the same length as those for char and block devices ie "IntxLNK"
and "IntxBLK") "LnxFIFO" to indicate that the file should be treated as a
FIFO (when mounted with the "sfu"). For some additional context note that
"sfu" followed the way that "Services for Unix" on Windows handled these
special files (at least for character and block devices and symlinks),
which is different than newer Windows which can handle special files
as reparse points (which isn't an option to many servers).
Cc: stable@vger.kernel.org
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a5d6264b638efeca35eff72177fd28d149e0764b upstream.
Use the low-power states of the underlying platform to enable runtime PM.
If the platform doesn't support runtime D3, then enabling default RPM will
result in the controller malfunctioning, as in the case of hotplug devices
not being detected because of a failed interrupt generation.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20231019102924.2797346-16-mathias.nyman@linux.intel.com
Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f74f820f6fc844b95f9e5e406e0a07d97510420 upstream.
Matoro reported various userspace crashes on the parisc platform with kernel
6.6 and bisected it to commit 3033cd430768 ("parisc: Use generic mmap top-down
layout and brk randomization").
That commit switched parisc to use the common infrastructure to calculate
mmap_base, but missed that the mmap_base() function takes care for
architectures where the stack grows downwards only.
Fix the mmap_base() calculation to include the stack-grows-upwards case
and thus fix the userspace crashes on parisc.
Link: https://lkml.kernel.org/r/ZVH2qeS1bG7/1J/l@p100
Fixes: 3033cd430768 ("parisc: Use generic mmap top-down layout and brk randomization")
Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: matoro <matoro_mailinglist_kernel@matoro.tk>
Tested-by: matoro <matoro_mailinglist_kernel@matoro.tk>
Cc: <stable@vger.kernel.org> [6.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6ad6e15a9c46b8f0932cd99724f26f3db4db1cdf upstream.
Firmware returns the physical address of the power switch,
so need to use gsc_writel() instead of direct memory access.
Fixes: d0c219472980 ("parisc/power: Add power soft-off when running on qemu")
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v6.0+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 166b0110d1ee53290bd11618df6e3991c117495a upstream.
When calculating the pfn for the iitlbt/idtlbt instruction, do not
drop the upper 5 address bits. This doesn't seem to have an effect
on physical hardware which uses less physical address bits, but in
qemu the missing bits are visible.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a406b8b424fa01f244c1aab02ba186258448c36b upstream.
Bail out early with error message when trying to boot a 64-bit kernel on
32-bit machines. This fixes the previous commit to include the check for
true 64-bit kernels as well.
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: 591d2108f3abc ("parisc: Add runtime check to prevent PA2.0 kernels on PA1.x machines")
Cc: <stable@vger.kernel.org> # v6.0+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3aff5146445582454c35900f3c0c972987cdd595 ]
Unmounting resctrl FS has been moved into the per test functions in
resctrl_tests.c by commit caddc0fbe495 ("selftests/resctrl: Move
resctrl FS mount/umount to higher level"). In case a signal (SIGINT,
SIGTERM, or SIGHUP) is received, the running selftest is aborted by
ctrlc_handler() which then unmounts resctrl fs before exiting. The
current section between signal_handler_register() and
signal_handler_unregister(), however, does not cover the entire
duration when resctrl FS is mounted.
Move signal_handler_register() and signal_handler_unregister() calls
from per test files into resctrl_tests.c to properly unmount resctrl
fs. In order to not add signal_handler_register()/unregister() n times,
create helpers test_prepare() and test_cleanup().
Do not call ksft_exit_fail_msg() in test_prepare() but only in the per
test function to keep the control flow cleaner without adding calls to
exit() deep into the call chain.
Adjust child process kill() call in ctrlc_handler() to only be invoked
if the child was already forked.
Fixes: caddc0fbe495 ("selftests/resctrl: Move resctrl FS mount/umount to higher level")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e33cb5702a9f287d829b0e9e6abe57f6a4aba6d2 ]
Benchmark command is used in multiple tests so it should not be
mutated by the tests but CMT test alters span argument. Due to the
order of tests (CMT test runs last), mutating the span argument in CMT
test does not trigger any real problems currently.
Mark benchmark_cmd strings as const and setup the benchmark command
using pointers. Because the benchmark command becomes const, the input
arguments can be used directly. Besides being simpler, using the input
arguments directly also removes the internal size restriction.
CMT test has to create a copy of the benchmark command before altering
the benchmark command.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: "Wieczor-Retman, Maciej" <maciej.wieczor-retman@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Stable-dep-of: 3aff51464455 ("selftests/resctrl: Extend signal handler coverage to unmount on receiving signal")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b1a901e078c4ee4a6fe13021c4577ef5f3155251 ]
struct resctrl_val_param contains span member. resctrl_val(), however,
never uses it because the value of span is embedded into the default
benchmark command and parsed from it by run_benchmark().
Remove span from resctrl_val_param. Provide DEFAULT_SPAN for the code
that needs it. CMT and CAT tests communicate span that is different
from the DEFAULT_SPAN between their internal functions which is
converted into passing it directly as a parameter.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: "Wieczor-Retman, Maciej" <maciej.wieczor-retman@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Stable-dep-of: 3aff51464455 ("selftests/resctrl: Extend signal handler coverage to unmount on receiving signal")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 47e36f16c7846bf3627ff68525e02555c53dc99e ]
bw_report is always set to "reads" and bm_type is set to "fill_buf" but
is never used.
Set bw_report directly to "reads" in MBA/MBM test and remove bm_type.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: "Wieczor-Retman, Maciej" <maciej.wieczor-retman@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Stable-dep-of: 3aff51464455 ("selftests/resctrl: Extend signal handler coverage to unmount on receiving signal")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cca42bd8eb1b54a4c9bbf48c79d120e66619a3e4 ]
The stuttering code isn't functioning as expected. Ideally, it should
pause the torture threads for a designated period before resuming. Yet,
it fails to halt the test for the correct duration. Additionally, a race
condition exists, potentially causing the stuttering code to pause for
an extended period if the 'spt' variable is non-zero due to the stutter
orchestration thread's inadequate CPU time.
Moreover, over-stuttering can hinder RCU's progress on TREE07 kernels.
This happens as the stuttering code may run within a softirq due to RCU
callbacks. Consequently, ksoftirqd keeps a CPU busy for several seconds,
thus obstructing RCU's progress. This situation triggers a warning
message in the logs:
[ 2169.481783] rcu_torture_writer: rtort_pipe_count: 9
This warning suggests that an RCU torture object, although invisible to
RCU readers, couldn't make it past the pipe array and be freed -- a
strong indication that there weren't enough grace periods during the
stutter interval.
To address these issues, this patch sets the "stutter end" time to an
absolute point in the future set by the main stutter thread. This is
then used for waiting in stutter_wait(). While the stutter thread still
defines this absolute time, the waiters' waiting logic doesn't rely on
the stutter thread receiving sufficient CPU time to halt the stuttering
as the halting is now self-controlled.
Cc: stable@vger.kernel.org
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a741deac787f0d2d7068638c067db20af9e63752 ]
The current torture-test sleeps are waiting for a duration, but there
are situations where it is better to wait for an absolute time, for
example, when ending a stutter interval. This commit therefore adds
an hrtimer mode parameter to torture_hrtimeout_ns(). Why not also the
other torture_hrtimeout_*() functions? The theory is that most absolute
times will be in nanoseconds, especially not (say) jiffies.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Stable-dep-of: cca42bd8eb1b ("rcutorture: Fix stuttering races and other issues")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 157a3537d6bc28ceb9a11fc8cb67f2152d860146 ]
commit 2db154b3ea8e ("vfs: syscall: Add move_mount(2) to move mounts around")
introduced a new move_mount(2) system call and a corresponding new LSM
security_move_mount hook but did not implement this hook for any
existing LSM. This creates a regression for AppArmor mediation of
mount. This patch provides a base mapping of the move_mount syscall to
the existing mount mediation. In the future we may introduce
additional mediations around the new mount calls.
Fixes: 2db154b3ea8e ("vfs: syscall: Add move_mount(2) to move mounts around")
CC: stable@vger.kernel.org
Reported-by: Andreas Steinmetz <anstein99@googlemail.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 90c436a64a6e20482a9a613c47eb4af2e8a5328e ]
The cred is needed to properly audit some messages, and will be needed
in the future for uid conditional mediation. So pass it through to
where the apparmor_audit_data struct gets defined.
Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Stable-dep-of: 157a3537d6bc ("apparmor: Fix regression in mount mediation")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d20f5a1a6e792d22199c9989ec7ab9e95c48d60c ]
rename audit_data's label field to subj_label to better reflect its
use. Also at the same time drop unneeded assignments to ->subj_label
as the later call to aa_check_perms will do the assignment if needed.
Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Stable-dep-of: 157a3537d6bc ("apparmor: Fix regression in mount mediation")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bd7bd201ca46c211c3ab251ca9854787d1331a2f ]
Everywhere where common_audit_data is used apparmor audit_data is also
used. We can simplify the code and drop the use of the aad macro
everywhere by combining the two structures.
Reviewed-by: Georgia Garcia <georgia.garcia@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Stable-dep-of: 157a3537d6bc ("apparmor: Fix regression in mount mediation")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 25ff0ff2d6286928dc516c74b879809c691c2dd8 ]
Fix kernel-doc warnings:
security/apparmor/policy.c:294: warning: Function parameter or
member 'proxy' not described in 'aa_alloc_profile'
security/apparmor/policy.c:785: warning: Function parameter or
member 'label' not described in 'aa_policy_view_capable'
security/apparmor/policy.c:785: warning: Function parameter or
member 'ns' not described in 'aa_policy_view_capable'
security/apparmor/policy.c:847: warning: Function parameter or
member 'ns' not described in 'aa_may_manage_policy'
security/apparmor/policy.c:964: warning: Function parameter or
member 'hname' not described in '__lookup_replace'
security/apparmor/policy.c:964: warning: Function parameter or
member 'info' not described in '__lookup_replace'
security/apparmor/policy.c:964: warning: Function parameter or
member 'noreplace' not described in '__lookup_replace'
security/apparmor/policy.c:964: warning: Function parameter or
member 'ns' not described in '__lookup_replace'
security/apparmor/policy.c:964: warning: Function parameter or
member 'p' not described in '__lookup_replace'
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Stable-dep-of: 157a3537d6bc ("apparmor: Fix regression in mount mediation")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 13c1748e217078d437727eef333cb0387d13bc0e ]
Fix kernel-doc warnings:
security/apparmor/resource.c:111: warning: Function parameter or
member 'label' not described in 'aa_task_setrlimit'
security/apparmor/resource.c:111: warning: Function parameter or
member 'new_rlim' not described in 'aa_task_setrlimit'
security/apparmor/resource.c:111: warning: Function parameter or
member 'resource' not described in 'aa_task_setrlimit'
security/apparmor/resource.c:111: warning: Function parameter or
member 'task' not described in 'aa_task_setrlimit'
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Stable-dep-of: 157a3537d6bc ("apparmor: Fix regression in mount mediation")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8921482286116af193980f04f2f2755775a410a5 ]
Fix kernel-doc warnings:
security/apparmor/lib.c:33: warning: Excess function parameter
'str' description in 'aa_free_str_table'
security/apparmor/lib.c:33: warning: Function parameter or member
't' not described in 'aa_free_str_table'
security/apparmor/lib.c:94: warning: Function parameter or
member 'n' not described in 'skipn_spaces'
security/apparmor/lib.c:390: warning: Excess function parameter
'deny' description in 'aa_check_perms'
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Stable-dep-of: 157a3537d6bc ("apparmor: Fix regression in mount mediation")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 26c9ecb34f5f5fa43c041a220de01d7cbea97dd0 ]
Fix kernel-doc warnings:
security/apparmor/audit.c:150: warning: Function parameter or
member 'type' not described in 'aa_audit_msg'
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Stable-dep-of: 157a3537d6bc ("apparmor: Fix regression in mount mediation")
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 8d2ad999ca3c64cb08cf6a58d227b9d9e746d708 upstream.
The CXL subsystem, at cxl_mem ->probe() time, establishes a lineage of
ports (struct cxl_port objects) between an endpoint and the root of a
CXL topology. Each port including the endpoint port is attached to the
cxl_port driver.
Given that setup, it follows that when either any port in that lineage
goes through a cxl_port ->remove() event, or the memdev goes through a
cxl_mem ->remove() event. The hierarchy below the removed port, or the
entire hierarchy if the memdev is removed needs to come down.
The delete_endpoint() callback is careful to check whether it is being
called to tear down the hierarchy, or if it is only being called to
teardown the memdev because an ancestor port is going through
->remove().
That care needs to take the device_lock() of the endpoint's parent.
Which requires 2 bugs to be fixed:
1/ A reference on the parent is needed to prevent use-after-free
scenarios like this signature:
BUG: spinlock bad magic on CPU#0, kworker/u56:0/11
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20230524-3.fc38 05/24/2023
Workqueue: cxl_port detach_memdev [cxl_core]
RIP: 0010:spin_bug+0x65/0xa0
Call Trace:
do_raw_spin_lock+0x69/0xa0
__mutex_lock+0x695/0xb80
delete_endpoint+0xad/0x150 [cxl_core]
devres_release_all+0xb8/0x110
device_unbind_cleanup+0xe/0x70
device_release_driver_internal+0x1d2/0x210
detach_memdev+0x15/0x20 [cxl_core]
process_one_work+0x1e3/0x4c0
worker_thread+0x1dd/0x3d0
2/ In the case of RCH topologies, the parent device that needs to be
locked is not always @port->dev as returned by cxl_mem_find_port(), use
endpoint->dev.parent instead.
Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")
Cc: <stable@vger.kernel.org>
Reported-by: Robert Richter <rrichter@amd.com>
Closes: http://lore.kernel.org/r/20231018171713.1883517-2-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98a04c7aced2b43b3ac4befe216c4eecc7257d4b upstream.
Root decoder granularity must match value from CFWMS, which may not
be the region's granularity for non-interleaved root decoders.
So when calculating granularities for host bridge decoders, use the
region's granularity instead of the root decoder's granularity to ensure
the correct granularities are set for the host bridge decoders and any
downstream switch decoders.
Test configuration is 1 host bridge * 2 switches * 2 endpoints per switch.
Region created with 2048 granularity using following command line:
cxl create-region -m -d decoder0.0 -w 4 mem0 mem2 mem1 mem3 \
-g 2048 -s 2048M
Use "cxl list -PDE | grep granularity" to get a view of the granularity
set at each level of the topology.
Before this patch:
"interleave_granularity":2048,
"interleave_granularity":2048,
"interleave_granularity":512,
"interleave_granularity":2048,
"interleave_granularity":2048,
"interleave_granularity":512,
"interleave_granularity":256,
After:
"interleave_granularity":2048,
"interleave_granularity":2048,
"interleave_granularity":4096,
"interleave_granularity":2048,
"interleave_granularity":2048,
"interleave_granularity":4096,
"interleave_granularity":2048,
Fixes: 27b3f8d13830 ("cxl/region: Program target lists")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jim Harris <jim.harris@samsung.com>
Link: https://lore.kernel.org/r/169824893473.1403938.16110924262989774582.stgit@bgt-140510-bm03.eng.stellus.in
[djbw: fixup the prebuilt cxl_test region]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9aaeef113c55248ecf3ab941c2e4460aaa8b8b9a upstream.
master side report:
silvaco-i3c-master 44330000.i3c-master: Error condition: MSTATUS 0x020090c7, MERRWARN 0x00100000
BIT 20: TIMEOUT error
The module has stalled too long in a frame. This happens when:
- The TX FIFO or RX FIFO is not handled and the bus is stuck in the
middle of a message,
- No STOP was issued and between messages,
- IBI manual is used and no decision was made.
The maximum stall period is 100 μs.
This can be considered as being just a warning as the system IRQ latency
can easily be greater than 100us.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/r/20231023161658.3890811-7-Frank.Li@nxp.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dfd7cd6aafdb1f5ba93828e97e56b38304b23a05 upstream.
Upon IBIWON timeout, the SDA line will always be kept low if we don't emit
a stop. Calling svc_i3c_master_emit_stop() there will let the bus return to
idle state.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver")
Cc: <stable@vger.kernel.org>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20231023161658.3890811-6-Frank.Li@nxp.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c85e209b799f12d18a90ae6353b997b1bb1274a5 upstream.
MSTATUS[RXPEND] is only updated after the data transfer cycle started. This
creates an issue when the I3C clock is slow, and the CPU is running fast
enough that MSTATUS[RXPEND] may not be updated when the code reaches
checking point. As a result, mandatory data can be missed.
Add a wait for MSTATUS[COMPLETE] to ensure that all mandatory data is
already in FIFO. It also works without mandatory data.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver")
Cc: <stable@vger.kernel.org>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20231023161658.3890811-4-Frank.Li@nxp.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e5e3c92e748a6d859190e123b9193cf4911fcca upstream.
┌─────┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┌─────
SCL: ┘ └─────┛ └──┛ └──┛ └──┛ └──┛ └──┛ └──┛ └──┛ └──┘
───┐ ┌─────┐ ┌─────┐ ┌───────────┐
SDA: └───────────────────────┘ └─────┘ └─────┘ └─────
xxx╱ ╲╱ ╲╱ ╲╱ ╲╱ ╲
: xxx╲IBI ╱╲ Addr(0x0a) ╱╲ RW ╱╲NACK╱╲ S ╱
If an In-Band Interrupt (IBI) occurs and IBI work thread is not immediately
scheduled, when svc_i3c_master_priv_xfers() initiates the I3C transfer and
attempts to send address 0x7e, the target interprets it as an
IBI handler and returns the target address 0x0a.
However, svc_i3c_master_priv_xfers() does not handle this case and proceeds
with other transfers, resulting in incorrect data being returned.
Add IBIWON check in svc_i3c_master_xfer(). In case this situation occurs,
return a failure to the driver.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver")
Cc: <stable@vger.kernel.org>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20231023161658.3890811-3-Frank.Li@nxp.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6bf3fc268183816856c96b8794cd66146bc27b35 upstream.
The ibi work thread operates asynchronously with other transfers, such as
svc_i3c_master_priv_xfers(). Introduce mutex protection to ensure the
completion of the entire i3c/i2c transaction.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver")
Cc: <stable@vger.kernel.org>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20231023161658.3890811-2-Frank.Li@nxp.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0718588c7aaa7a1510b4de972370535b61dddd0d upstream.
Commit 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in
cxl_region_attach()") tried to avoid 'eiw' initialization errors when
->nr_targets exceeded 16, by just decrementing ->nr_targets when
cxl_region_setup_targets() failed.
Commit 86987c766276 ("cxl/region: Cleanup target list on attach error")
extended that cleanup to also clear cxled->pos and p->targets[pos]. The
initialization error was incidentally fixed separately by:
Commit 8d4285425714 ("cxl/region: Fix port setup uninitialized variable
warnings") which was merged a few days after 5e42bcbc3fef.
But now the original cleanup when cxl_region_setup_targets() fails
prevents endpoint and switch decoder resources from being reused:
1) the cleanup does not set the decoder's region to NULL, which results
in future dpa_size_store() calls returning -EBUSY
2) the decoder is not properly freed, which results in future commit
errors associated with the upstream switch
Now that the initialization errors were fixed separately, the proper
cleanup for this case is to just return immediately. Then the resources
associated with this target get cleanup up as normal when the failed
region is deleted.
The ->nr_targets decrement in the error case also helped prevent
a p->targets[] array overflow, so add a new check to prevent against
that overflow.
Tested by trying to create an invalid region for a 2 switch * 2 endpoint
topology, and then following up with creating a valid region.
Fixes: 5e42bcbc3fef ("cxl/region: decrement ->nr_targets on error in cxl_region_attach()")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jim Harris <jim.harris@samsung.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/169703589120.1202031.14696100866518083806.stgit@bgt-140510-bm03.eng.stellus.in
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 565fe150624ee77dc63a735cc1b3bff5101f38a3 upstream.
Currently the offset into the device when looking for OTP
bits can go outside of the address of the MTD NOR devices,
and if that memory isn't readable, bad things happen
on the IXP4xx (added prints that illustrate the problem before
the crash):
cfi_intelext_otp_walk walk OTP on chip 0 start at reg_prot_offset 0x00000100
ixp4xx_copy_from copy from 0x00000100 to 0xc880dd78
cfi_intelext_otp_walk walk OTP on chip 0 start at reg_prot_offset 0x12000000
ixp4xx_copy_from copy from 0x12000000 to 0xc880dd78
8<--- cut here ---
Unable to handle kernel paging request at virtual address db000000
[db000000] *pgd=00000000
(...)
This happens in this case because the IXP4xx is big endian and
the 32- and 16-bit fields in the struct cfi_intelext_otpinfo are not
properly byteswapped. Compare to how the code in read_pri_intelext()
byteswaps the fields in struct cfi_pri_intelext.
Adding a small byte swapping loop for the OTP in read_pri_intelext()
and the crash goes away.
The problem went unnoticed for many years until I enabled
CONFIG_MTD_OTP on the IXP4xx as well, triggering the bug.
Cc: stable@vger.kernel.org
Reviewed-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20231020-mtd-otp-byteswap-v4-1-0d132c06aa9d@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0da668333fb07805c2836d5d50e26eda915b24a1 upstream.
Defining a prctl flag as an int is a footgun because on a 64 bit machine
and with a variadic implementation of prctl (like in musl and glibc), when
used directly as a prctl argument, it can get casted to long with garbage
upper bits which would result in unexpected behaviors.
This patch changes the constant to an unsigned long to eliminate that
possibilities. This does not break UAPI.
I think that a stable backport would be "nice to have": to reduce the
chances that users build binaries that could end up with garbage bits in
their MDWE prctl arguments. We are not aware of anyone having yet
encountered this corner case with MDWE prctls but a backport would reduce
the likelihood it happens, since this sort of issues has happened with
other prctls. But If this is perceived as a backporting burden, I suppose
we could also live without a stable backport.
Link: https://lkml.kernel.org/r/20230828150858.393570-5-revest@chromium.org
Fixes: b507808ebce2 ("mm: implement memory-deny-write-execute as a prctl")
Signed-off-by: Florent Revest <revest@chromium.org>
Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ayush Jain <ayush.jain3@amd.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Szabolcs Nagy <Szabolcs.Nagy@arm.com>
Cc: Topi Miettinen <toiwoton@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1640a0ef80f6d572725f5b0330038c18e98ea168 upstream.
When dealing with hugetlb pages, manipulating struct page pointers
directly can get to wrong struct page, since struct page is not guaranteed
to be contiguous on SPARSEMEM without VMEMMAP. Use pfn calculation to
handle it properly.
Without the fix, a wrong number of page might be skipped. Since skip cannot be
negative, scan_movable_page() will end early and might miss a movable page with
-ENOENT. This might fail offline_pages(). No bug is reported. The fix comes
from code inspection.
Link: https://lkml.kernel.org/r/20230913201248.452081-4-zi.yan@sent.com
Fixes: eeb0efd071d8 ("mm,memory_hotplug: fix scan_movable_pages() for gigantic hugepages")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 426056efe835cf4864ccf4c328fe3af9146fc539 upstream.
When dealing with hugetlb pages, manipulating struct page pointers
directly can get to wrong struct page, since struct page is not guaranteed
to be contiguous on SPARSEMEM without VMEMMAP. Use nth_page() to handle
it properly.
A wrong or non-existing page might be tried to be grabbed, either
leading to a non freeable page or kernel memory access errors. No bug
is reported. It comes from code inspection.
Link: https://lkml.kernel.org/r/20230913201248.452081-3-zi.yan@sent.com
Fixes: 57a196a58421 ("hugetlb: simplify hugetlb handling in follow_page_mask")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e7cfe5cd5b6b0b98abf57a3074885979e187c1c upstream.
Patch series "Use nth_page() in place of direct struct page manipulation",
v3.
On SPARSEMEM without VMEMMAP, struct page is not guaranteed to be
contiguous, since each memory section's memmap might be allocated
independently. hugetlb pages can go beyond a memory section size, thus
direct struct page manipulation on hugetlb pages/subpages might give wrong
struct page. Kernel provides nth_page() to do the manipulation properly.
Use that whenever code can see hugetlb pages.
This patch (of 5):
When dealing with hugetlb pages, manipulating struct page pointers
directly can get to wrong struct page, since struct page is not guaranteed
to be contiguous on SPARSEMEM without VMEMMAP. Use nth_page() to handle
it properly.
Without the fix, page_kasan_tag_reset() could reset wrong page tags,
causing a wrong kasan result. No related bug is reported. The fix
comes from code inspection.
Link: https://lkml.kernel.org/r/20230913201248.452081-1-zi.yan@sent.com
Link: https://lkml.kernel.org/r/20230913201248.452081-2-zi.yan@sent.com
Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>