Commit Graph

4058 Commits

Author SHA1 Message Date
Mark Rutland
0e62ccb959 arm64: rename ARM64_HAS_SYSREG_GIC_CPUIF to ARM64_HAS_GIC_CPUIF_SYSREGS
Subsequent patches will add more GIC-related cpucaps. When we do so, it
would be nice to give them a consistent HAS_GIC_* prefix.

In preparation for doing so, this patch renames the existing
ARM64_HAS_SYSREG_GIC_CPUIF cap to ARM64_HAS_GIC_CPUIF_SYSREGS.

The 'CPUIF_SYSREGS' suffix is chosen so that this will be ordered ahead
of other ARM64_HAS_GIC_* definitions in subsequent patches.

The cpucaps file was hand-modified; all other changes were scripted
with:

  find . -type f -name '*.[chS]' -print0 | \
    xargs -0 sed -i
    's/ARM64_HAS_SYSREG_GIC_CPUIF/ARM64_HAS_GIC_CPUIF_SYSREGS/'

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230130145429.903791-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-31 16:06:17 +00:00
Pierre Gondois
921e672dee cacheinfo: Remove unused check in init_cache_level()
commit e75d18cecb ("arm64: cacheinfo: Fix incorrect assignment
of signed error value to unsigned fw_level")
checks the fw_level value in init_cache_level() in case the value is
negative.
Remove this check as the error code is not returned through
fw_level anymore, and reset fw_level if acpi_get_cache_info()
failed. This allows to try fetching the cache information from
clidr_el1.

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://lore.kernel.org/r/20230124154053.355376-4-pierre.gondois@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-31 16:02:02 +01:00
Pierre Gondois
d931b83e62 cacheinfo: Make default acpi_get_cache_info() return an error
commit bd500361a9 ("ACPI: PPTT: Update acpi_find_last_cache_level()
to acpi_get_cache_info()")
updates the prototype of acpi_get_cache_info(). The cache 'levels'
is update through a pointer and not the return value of the function.

If CONFIG_ACPI_PPTT is not defined, acpi_get_cache_info() doesn't
update its *levels and *split_levels parameters and returns 0.
This can lead to a faulty behaviour.

Make acpi_get_cache_info() return an error code if CONFIG_ACPI_PPTT
is not defined.
Also,

In init_cache_level(), if no PPTT is present or CONFIG_ACPI_PPTT is
not defined, instead of aborting if acpi_get_cache_info() returns an
error code, just continue. This allows to try fetching the cache
information from clidr_el1.

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://lore.kernel.org/r/20230124154053.355376-3-pierre.gondois@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-31 16:02:02 +01:00
Ingo Molnar
57a30218fa Linux 6.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmPW7E8eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGf7MIAI0JnHN9WvtEukSZ
 E6j6+cEGWxsvD6q0g3GPolaKOCw7hlv0pWcFJFcUAt0jebspMdxV2oUGJ8RYW7Lg
 nCcHvEVswGKLAQtQSWw52qotW6fUfMPsNYYB5l31sm1sKH4Cgss0W7l2HxO/1LvG
 TSeNHX53vNAZ8pVnFYEWCSXC9bzrmU/VALF2EV00cdICmfvjlgkELGXoLKJJWzUp
 s63fBHYGGURSgwIWOKStoO6HNo0j/F/wcSMx8leY8qDUtVKHj4v24EvSgxUSDBER
 ch3LiSQ6qf4sw/z7pqruKFthKOrlNmcc0phjiES0xwwGiNhLv0z3rAhc4OM2cgYh
 SDc/Y/c=
 =zpaD
 -----END PGP SIGNATURE-----

Merge tag 'v6.2-rc6' into sched/core, to pick up fixes

Pick up fixes before merging another batch of cpuidle updates.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-01-31 15:01:20 +01:00
Mark Rutland
a873bb493f arm64: traps: attempt to dump all instructions
Currently dump_kernel_instr() dumps a few instructions around the
pt_regs::pc value, dumping 4 instructions before the PC before dumping
the instruction at the PC. If an attempt to read an instruction fails,
it gives up and does not attempt to dump any subsequent instructions.

This is unfortunate when the pt_regs::pc value points to the start of a
page with a leading guard page, where the instruction at the PC can be
read, but prior instructions cannot.

This patch makes dump_kernel_instr() attempt to dump each instruction
regardless of whether reading a prior instruction could be read, which
gives a more useful code dump in such cases. When an instruction cannot
be read, it is reported as "????????", which cannot be confused with a
hex value,

For example, with a `UDF #0` (AKA 0x00000000) early in the kexec control
page, we'll now get the following code dump:

| Internal error: Oops - Undefined instruction: 0000000002000000 [#1] SMP
| Modules linked in:
| CPU: 0 PID: 261 Comm: kexec Not tainted 6.2.0-rc5+ #26
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : 0x48c00000
| lr : machine_kexec+0x190/0x200
| sp : ffff80000d36ba80
| x29: ffff80000d36ba80 x28: ffff000002dfc380 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: ffff80000a9f7858 x22: 000000004c460000 x21: 0000000000000010
| x20: 00000000ad821000 x19: ffff000000aa0000 x18: 0000000000000006
| x17: ffff8000758a2000 x16: ffff800008000000 x15: ffff80000d36b568
| x14: 0000000000000000 x13: ffff80000d36b707 x12: ffff80000a9bf6e0
| x11: 00000000ffffdfff x10: ffff80000aaaf8e0 x9 : ffff80000815eff8
| x8 : 000000000002ffe8 x7 : c0000000ffffdfff x6 : 00000000000affa8
| x5 : 0000000000001fff x4 : 0000000000000001 x3 : ffff80000a263008
| x2 : ffff80000a9e20f8 x1 : 0000000048c00000 x0 : ffff000000aa0000
| Call trace:
|  0x48c00000
|  kernel_kexec+0x88/0x138
|  __do_sys_reboot+0x108/0x288
|  __arm64_sys_reboot+0x2c/0x40
|  invoke_syscall+0x78/0x140
|  el0_svc_common.constprop.0+0x4c/0x100
|  do_el0_svc+0x34/0x80
|  el0_svc+0x34/0x140
|  el0t_64_sync_handler+0xf4/0x140
|  el0t_64_sync+0x194/0x1c0
| Code: ???????? ???????? ???????? ???????? (00000000)
| ---[ end trace 0000000000000000 ]---
| Kernel panic - not syncing: Oops - Undefined instruction: Fatal exception
| Kernel Offset: disabled
| CPU features: 0x002000,00050108,c8004203
| Memory Limit: none

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230127121256.2141368-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-27 17:48:34 +00:00
Mark Rutland
dc4824faa2 arm64: avoid executing padding bytes during kexec / hibernation
Currently we rely on the HIBERNATE_TEXT section starting with the entry
point to swsusp_arch_suspend_exit, and the KEXEC_TEXT section starting
with the entry point to arm64_relocate_new_kernel. In both cases we copy
the entire section into a dynamically-allocated page, and then later
branch to the start of this page.

SYM_FUNC_START() will align the function entry points to
CONFIG_FUNCTION_ALIGNMENT, and when the linker later processes the
assembled code it will place padding bytes before the function entry
point if the location counter was not already sufficiently aligned. The
linker happens to use the value zero for these padding bytes.

This padding may end up being applied whenever CONFIG_FUNCTION_ALIGNMENT
is greater than 4, which can be the case with
CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B=y or
CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS=y.

When such padding is applied, attempting to kexec or resume from
hibernate will result ina crash: the kernel will branch to the padding
bytes as the start of the dynamically-allocated page, and as those bytes
are zero they will decode as UDF #0, which reliably triggers an
UNDEFINED exception. For example:

| # ./kexec --reuse-cmdline -f Image
| [   46.965800] kexec_core: Starting new kernel
| [   47.143641] psci: CPU1 killed (polled 0 ms)
| [   47.233653] psci: CPU2 killed (polled 0 ms)
| [   47.323465] psci: CPU3 killed (polled 0 ms)
| [   47.324776] Bye!
| [   47.327072] Internal error: Oops - Undefined instruction: 0000000002000000 [#1] SMP
| [   47.328510] Modules linked in:
| [   47.329086] CPU: 0 PID: 259 Comm: kexec Not tainted 6.2.0-rc5+ #3
| [   47.330223] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| [   47.331497] pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| [   47.332782] pc : 0x43a95000
| [   47.333338] lr : machine_kexec+0x190/0x1e0
| [   47.334169] sp : ffff80000d293b70
| [   47.334845] x29: ffff80000d293b70 x28: ffff000002cc0000 x27: 0000000000000000
| [   47.336292] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| [   47.337744] x23: ffff80000a837858 x22: 0000000048ec9000 x21: 0000000000000010
| [   47.339192] x20: 00000000adc83000 x19: ffff000000827000 x18: 0000000000000006
| [   47.340638] x17: ffff800075a61000 x16: ffff800008000000 x15: ffff80000d293658
| [   47.342085] x14: 0000000000000000 x13: ffff80000d2937f7 x12: ffff80000a7ff6e0
| [   47.343530] x11: 00000000ffffdfff x10: ffff80000a8ef8e0 x9 : ffff80000813ef00
| [   47.344976] x8 : 000000000002ffe8 x7 : c0000000ffffdfff x6 : 00000000000affa8
| [   47.346431] x5 : 0000000000001fff x4 : 0000000000000001 x3 : ffff80000a0a3008
| [   47.347877] x2 : ffff80000a8220f8 x1 : 0000000043a95000 x0 : ffff000000827000
| [   47.349334] Call trace:
| [   47.349834]  0x43a95000
| [   47.350338]  kernel_kexec+0x88/0x100
| [   47.351070]  __do_sys_reboot+0x108/0x268
| [   47.351873]  __arm64_sys_reboot+0x2c/0x40
| [   47.352689]  invoke_syscall+0x78/0x108
| [   47.353458]  el0_svc_common.constprop.0+0x4c/0x100
| [   47.354426]  do_el0_svc+0x34/0x50
| [   47.355102]  el0_svc+0x34/0x108
| [   47.355747]  el0t_64_sync_handler+0xf4/0x120
| [   47.356617]  el0t_64_sync+0x194/0x198
| [   47.357374] Code: bad PC value
| [   47.357999] ---[ end trace 0000000000000000 ]---
| [   47.358937] Kernel panic - not syncing: Oops - Undefined instruction: Fatal exception
| [   47.360515] Kernel Offset: disabled
| [   47.361230] CPU features: 0x002000,00050108,c8004203
| [   47.362232] Memory Limit: none

Note: Unfortunately the code dump reports "bad PC value" as it attempts
to dump some instructions prior to the UDF (i.e. before the start of the
page), and terminates early upon a fault, obscuring the problem.

This patch fixes this issue by aligning the section starter markes to
CONFIG_FUNCTION_ALIGNMENT using the ALIGN_FUNCTION() helper, which
ensures that the linker never needs to place padding bytes within the
section. Assertions are added to verify each section begins with the
function we expect, making our implicit requirement explicit.

In future it might be nice to rework the kexec and hibernation code to
decouple the section start from the entry point, but that involves much
more significant changes that come with a higher risk of error, so I've
tried to keep this fix as simple as possible for now.

Fixes: 47a15aa544 ("arm64: Extend support for CONFIG_FUNCTION_ALIGNMENT")
Reported-by: CKI Project <cki-project@redhat.com>
Link: https://lore.kernel.org/linux-arm-kernel/29992.123012504212600261@us-mta-139.us.mimecast.lan/
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-27 17:45:44 +00:00
Ard Biesheuvel
54c968bec3 arm64: Apply dynamic shadow call stack patching in two passes
Code patching for the dynamically enabled shadow call stack comes down
to finding PACIASP and AUTIASP instructions -which behave as NOPs on
cores that do not implement pointer authentication- and converting them
into shadow call stack pushes and pops, respectively.

Due to past bad experiences with the highly complex and overengineered
DWARF standard that describes the unwind metadata that we are using to
locate these instructions, let's make this patching logic a little bit
more robust so that any issues with the unwind metadata detected at boot
time can de dealt with gracefully.

The DWARF annotations that are used for this are emitted at function
granularity, and due to the fact that the instructions we are patching
will simply behave as NOPs if left unpatched, we can abort on errors as
long as we don't leave any functions in a half-patched state.

So do a dry run of each FDE frame (covering a single function) before
performing the actual patching, and give up if the DWARF metadata cannot
be understood.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20221213142849.1629026-1-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-26 17:54:15 +00:00
Ard Biesheuvel
2ced0f30a4 arm64: head: Switch endianness before populating the ID map
Ensure that the endianness used for populating the ID map matches the
endianness that the running kernel will be using, as this is no longer
guaranteed now that create_idmap() is invoked before init_kernel_el().

Note that doing so is only safe if the MMU is off, as switching the
endianness with the MMU on results in the active ID map to become
invalid. So also clear the M bit when toggling the EE bit in SCTLR, and
mark the MMU as disabled at boot.

Note that the same issue has resulted in preserve_boot_args() recording
the contents of registers X0 ... X3 in the wrong byte order, although
this is arguably a very minor concern.

Fixes: 32b135a7fa ("arm64: head: avoid cache invalidation when entering with the MMU on")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20230125185910.962733-1-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-26 17:42:58 +00:00
Ard Biesheuvel
6178617038 efi: arm64: enter with MMU and caches enabled
Instead of cleaning the entire loaded kernel image to the PoC and
disabling the MMU and caches before branching to the kernel's bare metal
entry point, we can leave the MMU and caches enabled, and rely on EFI's
cacheable 1:1 mapping of all of system RAM (which is mandated by the
spec) to populate the initial page tables.

This removes the need for managing coherency in software, which is
tedious and error prone.

Note that we still need to clean the executable region of the image to
the PoU if this is required for I/D coherency, but only if we actually
decided to move the image in memory, as otherwise, this will have been
taken care of by the loader.

This change affects both the builtin EFI stub as well as the zboot
decompressor, which now carries the entire EFI stub along with the
decompression code and the compressed image.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230111102236.1430401-7-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-24 11:51:08 +00:00
Ard Biesheuvel
3dcf60bbfd arm64: head: Clean the ID map and the HYP text to the PoC if needed
If we enter with the MMU and caches enabled, the bootloader may not have
performed any cache maintenance to the PoC. So clean the ID mapped page
to the PoC, to ensure that instruction and data accesses with the MMU
off see the correct data. For similar reasons, clean all the HYP text to
the PoC as well when entering at EL2 with the MMU and caches enabled.

Note that this means primary_entry() itself needs to be moved into the
ID map as well, as we will return from init_kernel_el() with the MMU and
caches off.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230111102236.1430401-6-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-24 11:51:08 +00:00
Ard Biesheuvel
32b135a7fa arm64: head: avoid cache invalidation when entering with the MMU on
If we enter with the MMU on, there is no need for explicit cache
invalidation for stores to memory, as they will be coherent with the
caches.

Let's take advantage of this, and create the ID map with the MMU still
enabled if that is how we entered, and avoid any cache invalidation
calls in that case.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230111102236.1430401-5-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-24 11:51:07 +00:00
Ard Biesheuvel
9d7c13e5dd arm64: head: record the MMU state at primary entry
Prepare for being able to deal with primary entry with the MMU and
caches enabled, by recording whether or not we entered with the MMU on
in register x19 and in a global variable. (Note that setting this
variable to '1' does not require cache invalidation, nor is it required
for storing the bootargs in that case, so omit the cache maintenance).

Since boot with the MMU and caches enabled is not permitted by the bare
metal boot protocol, ensure that a diagnostic is emitted and a taint bit
set if the MMU was found to be enabled on a non-EFI boot, and panic()
once the console is likely to be up. We will make an exception for EFI
boot later, which has strict requirements for the mapping of system
memory, permitting us to relax the boot protocol and hand over from the
EFI stub to the core kernel with MMU and caches left enabled.

While at it, add 'pre_disable_mmu_workaround' macro invocations to
init_kernel_el, as its manipulation of SCTLR_ELx may amount to disabling
of the MMU after subsequent patches.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230111102236.1430401-4-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-24 11:51:07 +00:00
Ard Biesheuvel
af7249b317 arm64: kernel: move identity map out of .text mapping
Reorganize the ID map slightly so that only code that is executed with
the MMU off or via the 1:1 mapping remains. This allows us to move the
identity map out of the .text segment, as it will no longer need
executable permissions via the kernel mapping.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230111102236.1430401-3-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-24 11:51:07 +00:00
Ard Biesheuvel
82e4958800 arm64: head: Move all finalise_el2 calls to after __enable_mmu
In the primary boot path, finalise_el2() is called much later than on
the secondary boot or resume-from-suspend paths, and this does not
appear to be intentional.

Since we aim to do as little as possible before enabling the MMU and
caches, align secondary and resume with primary boot, and defer the call
to after the MMU is turned on. This also removes the need to clean
finalise_el2() to the PoC once we enable support for booting with the
MMU on.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230111102236.1430401-2-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-24 11:51:07 +00:00
Mark Rutland
baaf553d3b arm64: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on arm64.
This allows each ftrace callsite to provide an ftrace_ops to the common
ftrace trampoline, allowing each callsite to invoke distinct tracer
functions without the need to fall back to list processing or to
allocate custom trampolines for each callsite. This significantly speeds
up cases where multiple distinct trace functions are used and callsites
are mostly traced by a single tracer.

The main idea is to place a pointer to the ftrace_ops as a literal at a
fixed offset from the function entry point, which can be recovered by
the common ftrace trampoline. Using a 64-bit literal avoids branch range
limitations, and permits the ops to be swapped atomically without
special considerations that apply to code-patching. In future this will
also allow for the implementation of DYNAMIC_FTRACE_WITH_DIRECT_CALLS
without branch range limitations by using additional fields in struct
ftrace_ops.

As noted in the core patch adding support for
DYNAMIC_FTRACE_WITH_CALL_OPS, this approach allows for directly invoking
ftrace_ops::func even for ftrace_ops which are dynamically-allocated (or
part of a module), without going via ftrace_ops_list_func.

Currently, this approach is not compatible with CLANG_CFI, as the
presence/absence of pre-function NOPs changes the offset of the
pre-function type hash, and there's no existing mechanism to ensure a
consistent offset for instrumented and uninstrumented functions. When
CLANG_CFI is enabled, the existing scheme with a global ops->func
pointer is used, and there should be no functional change. I am
currently working with others to allow the two to work together in
future (though this will liekly require updated compiler support).

I've benchamrked this with the ftrace_ops sample module [1], which is
not currently upstream, but available at:

  https://lore.kernel.org/lkml/20230103124912.2948963-1-mark.rutland@arm.com
  git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git ftrace-ops-sample-20230109

Using that module I measured the total time taken for 100,000 calls to a
trivial instrumented function, with a number of tracers enabled with
relevant filters (which would apply to the instrumented function) and a
number of tracers enabled with irrelevant filters (which would not apply
to the instrumented function). I tested on an M1 MacBook Pro, running
under a HVF-accelerated QEMU VM (i.e. on real hardware).

Before this patch:

  Number of tracers     || Total time  | Per-call average time (ns)
  Relevant | Irrelevant || (ns)        | Total        | Overhead
  =========+============++=============+==============+============
         0 |          0 ||      94,583 |         0.95 |           -
         0 |          1 ||      93,709 |         0.94 |           -
         0 |          2 ||      93,666 |         0.94 |           -
         0 |         10 ||      93,709 |         0.94 |           -
         0 |        100 ||      93,792 |         0.94 |           -
  ---------+------------++-------------+--------------+------------
         1 |          1 ||   6,467,833 |        64.68 |       63.73
         1 |          2 ||   7,509,708 |        75.10 |       74.15
         1 |         10 ||  23,786,792 |       237.87 |      236.92
         1 |        100 || 106,432,500 |     1,064.43 |     1063.38
  ---------+------------++-------------+--------------+------------
         1 |          0 ||   1,431,875 |        14.32 |       13.37
         2 |          0 ||   6,456,334 |        64.56 |       63.62
        10 |          0 ||  22,717,000 |       227.17 |      226.22
       100 |          0 || 103,293,667 |      1032.94 |     1031.99
  ---------+------------++-------------+--------------+--------------

  Note: per-call overhead is estimated relative to the baseline case
  with 0 relevant tracers and 0 irrelevant tracers.

After this patch

  Number of tracers     || Total time  | Per-call average time (ns)
  Relevant | Irrelevant || (ns)        | Total        | Overhead
  =========+============++=============+==============+============
         0 |          0 ||      94,541 |         0.95 |           -
         0 |          1 ||      93,666 |         0.94 |           -
         0 |          2 ||      93,709 |         0.94 |           -
         0 |         10 ||      93,667 |         0.94 |           -
         0 |        100 ||      93,792 |         0.94 |           -
  ---------+------------++-------------+--------------+------------
         1 |          1 ||     281,000 |         2.81 |        1.86
         1 |          2 ||     281,042 |         2.81 |        1.87
         1 |         10 ||     280,958 |         2.81 |        1.86
         1 |        100 ||     281,250 |         2.81 |        1.87
  ---------+------------++-------------+--------------+------------
         1 |          0 ||     280,959 |         2.81 |        1.86
         2 |          0 ||   6,502,708 |        65.03 |       64.08
        10 |          0 ||  18,681,209 |       186.81 |      185.87
       100 |          0 || 103,550,458 |     1,035.50 |     1034.56
  ---------+------------++-------------+--------------+------------

  Note: per-call overhead is estimated relative to the baseline case
  with 0 relevant tracers and 0 irrelevant tracers.

As can be seen from the above:

a) Whenever there is a single relevant tracer function associated with a
   tracee, the overhead of invoking the tracer is constant, and does not
   scale with the number of tracers which are *not* associated with that
   tracee.

b) The overhead for a single relevant tracer has dropped to ~1/7 of the
   overhead prior to this series (from 13.37ns to 1.86ns). This is
   largely due to permitting calls to dynamically-allocated ftrace_ops
   without going through ftrace_ops_list_func.

I've run the ftrace selftests from v6.2-rc3, which reports:

| # of passed:  110
| # of failed:  0
| # of unresolved:  3
| # of untested:  0
| # of unsupported:  0
| # of xfailed:  1
| # of undefined(test bug):  0

... where the unresolved entries were the tests for DIRECT functions
(which are not supported), and the checkbashisms selftest (which is
irrelevant here):

| [8] Test ftrace direct functions against tracers        [UNRESOLVED]
| [9] Test ftrace direct functions against kprobes        [UNRESOLVED]
| [62] Meta-selftest: Checkbashisms       [UNRESOLVED]

... with all other tests passing (or failing as expected).

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230123134603.1064407-9-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-24 11:49:43 +00:00
Mark Rutland
90955d778a arm64: ftrace: Update stale comment
In commit:

  26299b3f6b ("ftrace: arm64: move from REGS to ARGS")

... we folded ftrace_regs_entry into ftrace_caller, and
ftrace_regs_entry no longer exists.

Update the comment accordingly.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230123134603.1064407-8-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-24 11:49:43 +00:00
Mark Rutland
e4ecbe83fd arm64: patching: Add aarch64_insn_write_literal_u64()
In subsequent patches we'll need to atomically write to a
naturally-aligned 64-bit literal embedded within the kernel text.

Add a helper for this. For consistency with other text patching code we
use copy_to_kernel_nofault(), which is atomic for naturally-aligned
accesses up to 64-bits.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230123134603.1064407-7-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-24 11:49:43 +00:00
Linus Torvalds
9946f0981f Some more EFI fixes for v6.2:
- ensure the EFI ResetSystem and ACPI PRM calls are recognized as users
   of the EFI runtime, and therefore protected against exceptions
 
 - account for the EFI runtime stack in the stacktrace code
 
 - remove Matt Garrett's MAINTAINERS entry for efivarfs
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmPOsW0ACgkQw08iOZLZ
 jySMWwv/RFYpNsvbG2QxrrgKvRAFzHiqOaWGWlBPXa3jvoZAEWvhtRNMEm+U+3JY
 jJA4F4COnhHy0xrCGb9VWP9ifrI61ZpMMpxFkkxpS/ciTUvilbkQGwLgDZP9g7Hf
 jb5+W36BwQKlSQH+bLPSeiIneBHgDY04Q3gTrslSWQs6/rVc4JvYd2+SGxAu9FGz
 wi5AX7hbF6zf3x1AJCyeihMClW5Pn+5PaO+ik+XZvyO2e/0YzMigStRbHL8ULRSp
 aldrwHuUlNOPa/09bTtHp1cZQwKShupim0DuXuSNvRwJg3+bR4s3673AOy/NUX0G
 P0Z0S/i28DOHDfj4Vf3QZiH8vqory+NGvKwr7bgFDwCdHWIS9FW8b1J3JxSLkckF
 wkzZ7xl8ppA3xYCzlLP2sr+al+kAHqNTEsBsjOK4PQY0lGhrqQpzjoXs9vZ3mUOw
 WAs6ZZI3Y+bhQAHSXqo7xFJr9kIbcE/3aIWc+fRT3EZXCCSkd9KnV8qm6oIeKZYr
 +FYcADzY
 =Qosm
 -----END PGP SIGNATURE-----

Merge tag 'efi-fixes-for-v6.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI fixes from Ard Biesheuvel:
 "Another couple of EFI fixes, of which the first two were already in
  -next when I sent out the previous PR, but they caused some issues on
  non-EFI boots so I let them simmer for a bit longer.

   - ensure the EFI ResetSystem and ACPI PRM calls are recognized as
     users of the EFI runtime, and therefore protected against
     exceptions

   - account for the EFI runtime stack in the stacktrace code

   - remove Matthew Garrett's MAINTAINERS entry for efivarfs"

* tag 'efi-fixes-for-v6.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi: Remove Matthew Garrett as efivarfs maintainer
  arm64: efi: Account for the EFI runtime stack in stack unwinder
  arm64: efi: Avoid workqueue to check whether EFI runtime is live
2023-01-23 11:46:19 -08:00
Greg Kroah-Hartman
f89fd04323 Merge 6.2-rc5 into driver-core-next
We need the driver core fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-22 12:56:55 +01:00
Christophe JAILLET
1a920c92cd arm64: cpufeature: Use kstrtobool() instead of strtobool()
strtobool() is the same as kstrtobool().
However, the latter is more used within the kernel.

In order to remove strtobool() and slightly simplify kstrtox.h, switch to
the other function name.

While at it, include the corresponding header file (<linux/kstrtox.h>)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/5a1b329cda34aec67615c0d2fd326eb0d6634bf7.1667336095.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 17:51:23 +00:00
Amit Daniel Kachhap
4f2c9bf16a arm64: Add compat hwcap SSBS
This hwcap was added for 32-bit native arm kernel by commit fea53546be
("ARM: 9274/1: Add hwcap for Speculative Store Bypassing Safe") and hence
the corresponding changes added in 32-bit compat arm64 for similar user
interfaces.

Speculative Store Bypass Safe is a feature(FEAT_SSBS) present in
AArch32/AArch64 state for Armv8 and can be identified by PFR2.SSBS
identification register. This hwcap is already advertised in native arm64
kernel.

Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111053706.13994-8-amit.kachhap@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 14:28:36 +00:00
Amit Daniel Kachhap
2d602aa99a arm64: Add compat hwcap SB
This hwcap was added for 32-bit native arm kernel by commit
3bda6d8848 ("ARM: 9273/1: Add hwcap for Speculation Barrier(SB)")
and hence the corresponding changes added in 32-bit compat arm64 kernel.

Speculation Barrier is a feature(FEAT_SB) present in both AArch32 and
AArch64 state. This hwcap is already advertised in native arm64 kernel.

Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111053706.13994-7-amit.kachhap@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 14:28:36 +00:00
Amit Daniel Kachhap
0864d1e429 arm64: Add compat hwcap I8MM
This hwcap was added earlier for 32-bit native arm kernel by commit
956ca3a4eb ("ARM: 9272/1: vfp: Add hwcap for FEAT_AA32I8MM") and hence
the corresponding changes added in 32-bit compat arm64 kernel for similar
user interfaces.

Int8 matrix multiplication is a feature (FEAT_AA32I8MM) present in AArch32
state of Armv8 and is identified by ISAR6.I8MM register. Similar
feature(FEAT_I8MM) exist for AArch64 state and is already advertised in
arm64 kernel.

Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111053706.13994-6-amit.kachhap@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 14:28:36 +00:00
Amit Daniel Kachhap
f64234fa45 arm64: Add compat hwcap ASIMDBF16
This hwcap was added earlier for 32-bit native arm kernel by commit
23b6d4ad6e ("ARM: 9271/1: vfp: Add hwcap for FEAT_AA32BF16") and hence
the corresponding changes added in 32-bit compat arm64 kernel.

Brain 16-bit floating-point storage format is a feature (FEAT_AA32BF16)
present in AArch32 state for Armv8 and is represented by ISAR6.BF16
identification register. Similar feature (FEAT_BF16) exist for AArch64
state and is already advertised in native arm64 kernel.

Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111053706.13994-5-amit.kachhap@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 14:28:36 +00:00
Amit Daniel Kachhap
4a87be25b0 arm64: Add compat hwcap ASIMDFHM
This hwcap was added earlier for 32-bit native arm kernel by commit
ce4835497c ("ARM: 9270/1: vfp: Add hwcap for FEAT_FHM") and hence the
corresponding changes added in 32-bit compat arm64 kernel for similar user
interfaces.

Floating-point half-precision multiplication (FHM) is a feature present
in AArch32/AArch64 state for Armv8. This hwcap is already advertised in
native arm64 kernel.

Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111053706.13994-4-amit.kachhap@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 14:28:36 +00:00
Amit Daniel Kachhap
27addd402a arm64: Add compat hwcap ASIMDDP
This hwcap was added earlier for 32-bit native arm kernel by commit
62ea0d873a ("ARM: 9269/1: vfp: Add hwcap for FEAT_DotProd") and hence the
corresponding changes added in 32-bit compat arm64 kernel for similar user
interfaces.

Advanced Dot product is a feature (FEAT_DotProd) present in both
AArch32/AArch64 state for Armv8 and is already advertised in native arm64
kernel.

Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111053706.13994-3-amit.kachhap@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 14:28:36 +00:00
Amit Daniel Kachhap
846b73a4a3 arm64: Add compat hwcap FPHP and ASIMDHP
These hwcaps were added earlier for 32-bit native arm kernel by commit
c00a19c8b1 ("ARM: 9268/1: vfp: Add hwcap FPHP and ASIMDHP for FEAT_FP16")
and hence the corresponding changes added in 32-bit compat arm64 kernel for
similar userspace interfaces.

Floating point half-precision (FPHP) and Advanced SIMD half-precision
(ASIMDHP) represents the Armv8 FP16 feature extension and is already
advertised in native arm64 kernel.

Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111053706.13994-2-amit.kachhap@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 14:28:36 +00:00
Ard Biesheuvel
59b37fe52f arm64: Stash shadow stack pointer in the task struct on interrupt
Instead of reloading the shadow call stack pointer from the ordinary
stack, which may be vulnerable to the kind of gadget based attacks
shadow call stacks were designed to prevent, let's store a task's shadow
call stack pointer in the task struct when switching to the shadow IRQ
stack.

Given that currently, the task_struct::scs_sp field is only used to
preserve the shadow call stack pointer while a task is scheduled out or
running in user space, reusing this field to preserve and restore it
while running off the IRQ stack must be safe, as those occurrences are
guaranteed to never overlap. (The stack switching logic only switches
stacks when running from the task stack, and so the value being saved
here always corresponds to the task mode shadow stack)

While at it, fold a mov/add/mov sequence into a single add.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20230109174800.3286265-3-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 14:26:18 +00:00
Ard Biesheuvel
2198d07c50 arm64: Always load shadow stack pointer directly from the task struct
All occurrences of the scs_load macro load the value of the shadow call
stack pointer from the task which is current at that point. So instead
of taking a task struct register argument in the scs_load macro to
specify the task struct to load from, let's always reference the current
task directly. This should make it much harder to exploit any
instruction sequences reloading the shadow call stack pointer register
from memory.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230109174800.3286265-2-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 14:26:18 +00:00
Mark Brown
39e5449928 arm64/signal: Include TPIDR2 in the signal context
Add a new signal frame record for TPIDR2 using the same format as we
already use for ESR with different magic, a header with the value from the
register appended as the only data. If SME is supported then this record is
always included.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Link: https://lore.kernel.org/r/20221208-arm64-tpidr2-sig-v3-2-c77c6c8775f4@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:42:31 +00:00
Mark Brown
7d5d8601e4 arm64/sme: Add hwcaps for SME 2 and 2.1 features
In order to allow userspace to discover the presence of the new SME features
add hwcaps for them.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-13-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:07 +00:00
Mark Brown
f90b529bcb arm64/sme: Implement ZT0 ptrace support
Implement support for a new note type NT_ARM64_ZT providing access to
ZT0 when implemented. Since ZT0 is a register with constant size this is
much simpler than for other SME state.

As ZT0 is only accessible when PSTATE.ZA is set writes to ZT0 cause
PSTATE.ZA to be set, the main alternative would be to return -EBUSY in
this case but this seemed more constructive. Practical users are also
going to be working with ZA anyway and have some understanding of the
state.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-12-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:07 +00:00
Mark Brown
ee072cf708 arm64/sme: Implement signal handling for ZT
Add a new signal context type for ZT which is present in the signal frame
when ZA is enabled and ZT is supported by the system. In order to account
for the possible addition of further ZT registers in the future we make the
number of registers variable in the ABI, though currently the only possible
number is 1. We could just use a bare list head for the context since the
number of registers can be inferred from the size of the context but for
usability and future extensibility we define a header with the number of
registers and some reserved fields in it.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-11-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:06 +00:00
Mark Brown
95fcec7132 arm64/sme: Implement context switching for ZT0
When the system supports SME2 the ZT0 register must be context switched as
part of the floating point state. This register is stored immediately
after ZA in memory and is only accessible when PSTATE.ZA is set so we
handle it in the same functions we use to save and restore ZA.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-10-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:06 +00:00
Mark Brown
d6138b4adc arm64/sme: Provide storage for ZT0
When the system supports SME2 there is an additional register ZT0 which
we must store when the task is using SME. Since ZT0 is accessible only
when PSTATE.ZA is set just like ZA we allocate storage for it along with
ZA, increasing the allocation size for the memory region where we store
ZA and storing the data for ZT after that for ZA.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-9-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:06 +00:00
Mark Brown
d4913eee15 arm64/sme: Add basic enumeration for SME2
Add basic feature detection for SME2, detecting that the feature is present
and disabling traps for ZT0.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-8-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:06 +00:00
Mark Brown
f122576f35 arm64/sme: Enable host kernel to access ZT0
The new register ZT0 introduced by SME2 comes with a new trap, disable it
for the host kernel so that we can implement support for it.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-7-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:06 +00:00
Mark Brown
ce514000da arm64/sme: Rename za_state to sme_state
In preparation for adding support for storage for ZT0 to the thread_struct
rename za_state to sme_state. Since ZT0 is accessible when PSTATE.ZA is
set just like ZA itself we will extend the allocation done for ZA to
cover it, avoiding the need to further expand task_struct for non-SME
tasks.

No functional changes.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-1-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20 12:23:05 +00:00
Mike Kravetz
e9adcfecf5 mm: remove zap_page_range and create zap_vma_pages
zap_page_range was originally designed to unmap pages within an address
range that could span multiple vmas.  While working on [1], it was
discovered that all callers of zap_page_range pass a range entirely within
a single vma.  In addition, the mmu notification call within zap_page
range does not correctly handle ranges that span multiple vmas.  When
crossing a vma boundary, a new mmu_notifier_range_init/end call pair with
the new vma should be made.

Instead of fixing zap_page_range, do the following:
- Create a new routine zap_vma_pages() that will remove all pages within
  the passed vma.  Most users of zap_page_range pass the entire vma and
  can use this new routine.
- For callers of zap_page_range not passing the entire vma, instead call
  zap_page_range_single().
- Remove zap_page_range.

[1] https://lore.kernel.org/linux-mm/20221114235507.294320-2-mike.kravetz@oracle.com/
Link: https://lkml.kernel.org/r/20230104002732.232573-1-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Suggested-by: Peter Xu <peterx@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Peter Xu <peterx@redhat.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>	[s390]
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-18 17:12:55 -08:00
Peter Zijlstra
19235e4727 cpuidle, arm64: Fix the ARM64 cpuidle logic
The recent cpuidle changes started triggering RCU splats on
Juno development boards:

  | =============================
  | WARNING: suspicious RCU usage
  | -----------------------------
  | include/trace/events/ipi.h:19 suspicious rcu_dereference_check() usage!

Fix cpuidle on ARM64:

 - ... by introducing a new 'is_rcu' flag to the cpuidle helpers & make
   ARM64 use it, as ARM64 wants to keep RCU active longer and wants to
   do the ct_cpuidle_enter()/exit() dance itself.

 - Also update the PSCI driver accordingly.

 - This also removes the last known RCU_NONIDLE() user as a bonus.

Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/Y8Z31UbzG3LJgAXE@hirez.programming.kicks-ass.net

--
2023-01-18 12:27:17 +01:00
Pierre Gondois
bd500361a9 ACPI: PPTT: Update acpi_find_last_cache_level() to acpi_get_cache_info()
acpi_find_last_cache_level() allows to find the last level of cache
for a given CPU. The function is only called on arm64 ACPI based
platforms to check for cache information that would be missing in
the CLIDR_EL1 register.
To allow populating (struct cpu_cacheinfo).num_leaves by only parsing
a PPTT, update acpi_find_last_cache_level() to get the 'split_levels',
i.e. the number of cache levels being split in data/instruction
caches.

It is assumed that there will not be data/instruction caches above a
unified cache.
If a split level consist of one data cache and no instruction cache
(or opposite), then the missing cache will still be populated
by default with minimal cache information, and maximal cpumask
(all non-existing caches have the same fw_token).

Suggested-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Jeremy Linton <jeremy.linton@arm.com>
Acked-by: Rafael J. Wysocki  <rafael.j.wysocki@intel.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20230104183033.755668-6-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2023-01-18 09:58:40 +00:00
Ard Biesheuvel
7ea55715c4 arm64: efi: Account for the EFI runtime stack in stack unwinder
The EFI runtime services run from a dedicated stack now, and so the
stack unwinder needs to be informed about this.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-16 15:27:31 +01:00
Ard Biesheuvel
8a9a1a1873 arm64: efi: Avoid workqueue to check whether EFI runtime is live
Comparing current_work() against efi_rts_work.work is sufficient to
decide whether current is currently running EFI runtime services code at
any level in its call stack.

However, there are other potential users of the EFI runtime stack, such
as the ACPI subsystem, which may invoke efi_call_virt_pointer()
directly, and so any sync exceptions occurring in firmware during those
calls are currently misidentified.

So instead, let's check whether the stashed value of the thread stack
pointer points into current's thread stack. This can only be the case if
current was interrupted while running EFI runtime code. Note that this
implies that we should clear the stashed value after switching back, to
avoid false positives.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-16 15:27:31 +01:00
Linus Torvalds
0bf913e07b First batch of EFI fixes for v6.2:
- avoid a potential crash on the efi_subsys_init() error path
 - use more appropriate error code for runtime services calls issued
   after a crash in the firmware occurred
 - avoid READ_ONCE() for accessing firmware tables that may appear
   misaligned in memory
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmPBg68ACgkQw08iOZLZ
 jyQs5Qv+PVg06BhEqN+vwNQy6vd4ezTxmDAy7yx751mo3HIw0qT0ohsCIpRydq0c
 +qlCXa+Uu/yr/IQplfDT9vY+MEwD9iuwJha8ltGRWM3++yEF4uQXowHDoEKsO84l
 5PaC37EfOvHmV6UdFdIF0OYDOcRvX2FsIbmUKRyvIav1e+QRLvUWWKKEmAh04c7G
 yNc0837kmoOpjKrYPc8j2n3dVUbhrFUW5eLIFmd8yrR+GRu6Ae5RH3J7iF7Nqtrq
 oReYYq3XpmYg8c00WV0NKVuB0DK7fhGY7jcbDfLmTrPwqVzLjxQGecxsQPYnqrJd
 mZywkm2fM8KIJy2LQDJOVOZaDAzaC2SkrpELHX/MnPK1UrP561AIv/sXK+3+UBEm
 b6m5dHbJgaifKP3kkbc9Cy4f9avLJOdjdXH5f5zPe7it54yHLsacEvjT6M2oiunx
 zIvTd/MXi24J+tzgxr08KM5wHXgLGh+fUM7BfZTvEVQmUjY8TnIPjsaAJhTS3jzV
 TN3/XAWi
 =4LbF
 -----END PGP SIGNATURE-----

Merge tag 'efi-fixes-for-v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI fixes from Ard Biesheuvel:

 - avoid a potential crash on the efi_subsys_init() error path

 - use more appropriate error code for runtime services calls issued
   after a crash in the firmware occurred

 - avoid READ_ONCE() for accessing firmware tables that may appear
   misaligned in memory

* tag 'efi-fixes-for-v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi: tpm: Avoid READ_ONCE() for accessing the event log
  efi: rt-wrapper: Add missing include
  efi: fix userspace infinite retry read efivars after EFI runtime services page fault
  efi: fix NULL-deref in init error path
2023-01-13 10:37:10 -06:00
Peter Zijlstra
69e26b4f43 cpuidle, arch: Mark all ct_cpuidle_enter() callers __cpuidle
For all cpuidle drivers that use CPUIDLE_FLAG_RCU_IDLE, ensure that
all functions that call ct_cpuidle_enter() are marked __cpuidle.

( due to lack of noinstr validation on these platforms it is entirely
  possible this isn't complete )

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230112195542.274096325@infradead.org
2023-01-13 11:48:17 +01:00
Peter Zijlstra
4a3182e6d6 arm64, smp: Remove trace_.*_rcuidle() usage
Ever since commit d3afc7f129 ("arm64: Allow IPIs to be handled as
normal interrupts") this function is called in regular IRQ context.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195540.804410487@infradead.org
2023-01-13 11:48:15 +01:00
Peter Zijlstra
89b3098703 arch/idle: Change arch_cpu_idle() behavior: always exit with IRQs disabled
Current arch_cpu_idle() is called with IRQs disabled, but will return
with IRQs enabled.

However, the very first thing the generic code does after calling
arch_cpu_idle() is raw_local_irq_disable(). This means that
architectures that can idle with IRQs disabled end up doing a
pointless 'enable-disable' dance.

Therefore, push this IRQ disabling into the idle function, meaning
that those architectures can avoid the pointless IRQ state flipping.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Guo Ren <guoren@kernel.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195540.618076436@infradead.org
2023-01-13 11:48:15 +01:00
Peter Zijlstra
2b5a0e425e objtool/idle: Validate __cpuidle code as noinstr
Idle code is very like entry code in that RCU isn't available. As
such, add a little validation.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195540.373461409@infradead.org
2023-01-13 11:48:15 +01:00
Akihiko Odaki
805e6ec1c5 arm64/cache: Move CLIDR macro definitions
The macros are useful for KVM which needs to manage how CLIDR is exposed
to vcpu so move them to include/asm/cache.h, which KVM can refer to.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Link: https://lore.kernel.org/r/20230112023852.42012-5-akihiko.odaki@daynix.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-01-12 21:07:43 +00:00
Mark Brown
b2482807fb arm64/sme: Optimise SME exit on syscall entry
Our ABI says that we exit streaming mode on syscall entry. Currently we
check if we are in streaming mode before doing this but since we have a
SMSTOP SM instruction which will clear SVCR.SM in a single atomic operation
we can save ourselves the read of the system register and check of the flag
and just unconditionally do the SMSTOP SM. If we are not in streaming mode
it results in a noop change to SVCR, if we are in streaming mode we will
exit as desired.

No functional change.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230110-arm64-sme-syscall-smstop-v1-1-ac94235fd810@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-12 17:09:21 +00:00
Mark Brown
fcd3d2c082 arm64/sme: Don't use streaming mode to probe the maximum SME VL
During development the architecture added the RDSVL instruction which means
we do not need to enter streaming mode to enumerate the SME VLs, use it
when we probe the maximum supported VL. Other users were already updated.

No functional change.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221223-arm64-sme-probe-max-v1-1-cbde68f67ad0@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-12 16:07:26 +00:00
Mark Brown
c3cdd54c61 arm64/ptrace: Use system_supports_tpidr2() to check for TPIDR2 support
We have a separate system_supports_tpidr2() to check for TPIDR2 support
but were using system_supports_sme() in tls_set(). While these are
currently identical let's use the specific check instead so we don't have
any surprises in future.

Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-tpidr2-ptrace-feat-v2-1-3760c895a574@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-12 16:06:45 +00:00
Mark Brown
50daf5b7c4 arm64/cpufeature: Fix field sign for DIT hwcap detection
Since it was added our hwcap for DIT has specified that DIT is a signed
field but this appears to be incorrect, the two values for the enumeration
are:

	0b0000	NI
	0b0001	IMP

which look like a normal unsigned enumeration and the in-kernel DIT usage
added by 01ab991fc0 ("arm64: Enable data independent timing (DIT) in the
kernel") detects the feature with an unsigned enum. Fix the hwcap to specify
the field as unsigned.

Fixes: 7206dc93a5 ("arm64: Expose Arm v8.4 features")
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221207-arm64-sysreg-helpers-v3-1-0d71a7b174a8@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-12 16:02:23 +00:00
Ard Biesheuvel
18bba1843f efi: rt-wrapper: Add missing include
Add the missing #include of asm/assembler.h, which is where the ldr_l
macro is defined.

Fixes: ff7a167961 ("arm64: efi: Execute runtime services from a dedicated stack")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-09 12:42:56 +01:00
Anshuman Khandual
5db568e748 arm64: errata: Workaround possible Cortex-A715 [ESR|FAR]_ELx corruption
If a Cortex-A715 cpu sees a page mapping permissions change from executable
to non-executable, it may corrupt the ESR_ELx and FAR_ELx registers, on the
next instruction abort caused by permission fault.

Only user-space does executable to non-executable permission transition via
mprotect() system call which calls ptep_modify_prot_start() and ptep_modify
_prot_commit() helpers, while changing the page mapping. The platform code
can override these helpers via __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION.

Work around the problem via doing a break-before-make TLB invalidation, for
all executable user space mappings, that go through mprotect() system call.
This overrides ptep_modify_prot_start() and ptep_modify_prot_commit(), via
defining HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION on the platform thus giving
an opportunity to intercept user space exec mappings, and do the necessary
TLB invalidation. Similar interceptions are also implemented for HugeTLB.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20230102061651.34745-1-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-01-06 17:14:55 +00:00
Mark Brown
f26cd73721 arm64/signal: Always allocate SVE signal frames on SME only systems
Currently we only allocate space for SVE signal frames on systems that
support SVE, meaning that SME only systems do not allocate a signal frame
for streaming mode SVE state. Change the check so space is allocated if
either feature is supported.

Fixes: 85ed24dad2 ("arm64/sme: Implement streaming SVE signal handling")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221223-arm64-fix-sme-only-v1-3-938d663f69e5@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2023-01-05 15:31:18 +00:00
Mark Brown
7dde62f068 arm64/signal: Always accept SVE signal frames on SME only systems
Currently we reject an attempt to restore a SVE signal frame on a system
with SME but not SVE supported. This means that it is not possible to
disable streaming mode via signal return as this is configured via the
flags in the SVE signal context. Instead accept the signal frame, we will
require it to have a vector length of 0 specified and no payload since the
task will have no SVE vector length configured.

Fixes: 85ed24dad2 ("arm64/sme: Implement streaming SVE signal handling")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221223-arm64-fix-sme-only-v1-2-938d663f69e5@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2023-01-05 15:31:18 +00:00
Mark Brown
0cab5b4964 arm64/sme: Fix context switch for SME only systems
When refactoring fpsimd_load() to support keeping SVE enabled over syscalls
support for systems with SME but not SVE was broken. The code that selects
between loading regular FPSIMD and SVE states was guarded by using
system_supports_sve() but is also needed to handle the streaming SVE state
in SME only systems where that check will be false. Fix this by also
checking for system_supports_sme().

Fixes: a0136be443 ("arm64/fpsimd: Load FP state based on recorded data type")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221223-arm64-fix-sme-only-v1-1-938d663f69e5@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2023-01-05 15:31:18 +00:00
Catalin Marinas
4f4c549feb arm64: mte: Avoid the racy walk of the vma list during core dump
The MTE coredump code in arch/arm64/kernel/elfcore.c iterates over the
vma list without the mmap_lock held. This can race with another process
or userfaultfd concurrently modifying the vma list. Change the
for_each_mte_vma macro and its callers to instead use the vma snapshot
taken by dump_vma_snapshot() and stored in the cprm object.

Fixes: 6dd8b1a0b6 ("arm64: mte: Dump the MTE tags in the core file")
Cc: <stable@vger.kernel.org> # 5.18.x
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Seth Jenkins <sethjenkins@google.com>
Suggested-by: Seth Jenkins <sethjenkins@google.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221222181251.1345752-4-catalin.marinas@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-01-05 15:12:12 +00:00
Catalin Marinas
19e183b545 elfcore: Add a cprm parameter to elf_core_extra_{phdrs,data_size}
A subsequent fix for arm64 will use this parameter to parse the vma
information from the snapshot created by dump_vma_snapshot() rather than
traversing the vma list without the mmap_lock.

Fixes: 6dd8b1a0b6 ("arm64: mte: Dump the MTE tags in the core file")
Cc: <stable@vger.kernel.org> # 5.18.x
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Seth Jenkins <sethjenkins@google.com>
Suggested-by: Seth Jenkins <sethjenkins@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221222181251.1345752-3-catalin.marinas@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-01-05 15:12:12 +00:00
Catalin Marinas
736eedc974 arm64: mte: Fix double-freeing of the temporary tag storage during coredump
Commit 16decce22e ("arm64: mte: Fix the stack frame size warning in
mte_dump_tag_range()") moved the temporary tag storage array from the
stack to slab but it also introduced an error in double freeing this
object. Remove the in-loop freeing.

Fixes: 16decce22e ("arm64: mte: Fix the stack frame size warning in mte_dump_tag_range()")
Cc: <stable@vger.kernel.org> # 5.18.x
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Seth Jenkins <sethjenkins@google.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221222181251.1345752-2-catalin.marinas@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-01-05 15:12:12 +00:00
Zenghui Yu
eb9a85261e arm64: ptrace: Use ARM64_SME to guard the SME register enumerations
We currently guard REGSET_{SSVE, ZA} using ARM64_SVE for no good reason.
Both enumerations would be pointless without ARM64_SME and create two empty
entries in aarch64_regsets[] which would then become part of a process's
native regset view (they should be ignored though).

Switch to use ARM64_SME instead.

Fixes: e12310a0d3 ("arm64/sme: Implement ptrace support for streaming mode SVE registers")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221214135943.379-1-yuzenghui@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-01-05 15:08:31 +00:00
Linus Torvalds
77856d911a arm64 fixes for -rc1
- Fix Kconfig dependencies to re-allow the enabling of function graph
   tracer and shadow call stacks at the same time.
 
 - Revert the workaround for CPU erratum #2645198 since the CONFIG_
   guards were incorrect and the code has therefore not seen any real
   exposure in -next.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmOcVWkQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNF/7B/wIGicobLxXMMkuao+ipm5V/eLnVRuVt6WD
 T//5ZG+3Td3xON+mdt/byIm/Npl1I2l+NDjK9jIFcS5A/Q7DmwbcxJV+6BhRdb7o
 XQxkHsKR2DTbmbeqd0+AkZGJc4jk5D+vuyLeo8jcc6bSpQhepCOV5R5JVOadg9mg
 WxuITwsodI9GfQGmupggF6C31yMCYibmlD9WWNW8tNx8TBojU97pJbQaf1h3bC9i
 JE9CxBVYmt3Qg5BAB46EdH60lxELyHpEjJNvgvZZpFz4a/bBB47mG7Cy+ECldc5p
 LukJnAImydedwgQqgBD0e0HyXoIQ8r8NZ+yNgig2asQxA5DYsI1L
 =IZRW
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:

 - Fix Kconfig dependencies to re-allow the enabling of function graph
   tracer and shadow call stacks at the same time.

 - Revert the workaround for CPU erratum #2645198 since the CONFIG_
   guards were incorrect and the code has therefore not seen any real
   exposure in -next.

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  Revert "arm64: errata: Workaround possible Cortex-A715 [ESR|FAR]_ELx corruption"
  ftrace: Allow WITH_ARGS flavour of graph tracer with shadow call stack
2022-12-16 13:46:41 -06:00
Linus Torvalds
8fa590bf34 ARM64:
* Enable the per-vcpu dirty-ring tracking mechanism, together with an
   option to keep the good old dirty log around for pages that are
   dirtied by something other than a vcpu.
 
 * Switch to the relaxed parallel fault handling, using RCU to delay
   page table reclaim and giving better performance under load.
 
 * Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option,
   which multi-process VMMs such as crosvm rely on (see merge commit 382b5b87a9:
   "Fix a number of issues with MTE, such as races on the tags being
   initialised vs the PG_mte_tagged flag as well as the lack of support
   for VM_SHARED when KVM is involved.  Patches from Catalin Marinas and
   Peter Collingbourne").
 
 * Merge the pKVM shadow vcpu state tracking that allows the hypervisor
   to have its own view of a vcpu, keeping that state private.
 
 * Add support for the PMUv3p5 architecture revision, bringing support
   for 64bit counters on systems that support it, and fix the
   no-quite-compliant CHAIN-ed counter support for the machines that
   actually exist out there.
 
 * Fix a handful of minor issues around 52bit VA/PA support (64kB pages
   only) as a prefix of the oncoming support for 4kB and 16kB pages.
 
 * Pick a small set of documentation and spelling fixes, because no
   good merge window would be complete without those.
 
 s390:
 
 * Second batch of the lazy destroy patches
 
 * First batch of KVM changes for kernel virtual != physical address support
 
 * Removal of a unused function
 
 x86:
 
 * Allow compiling out SMM support
 
 * Cleanup and documentation of SMM state save area format
 
 * Preserve interrupt shadow in SMM state save area
 
 * Respond to generic signals during slow page faults
 
 * Fixes and optimizations for the non-executable huge page errata fix.
 
 * Reprogram all performance counters on PMU filter change
 
 * Cleanups to Hyper-V emulation and tests
 
 * Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest
   running on top of a L1 Hyper-V hypervisor)
 
 * Advertise several new Intel features
 
 * x86 Xen-for-KVM:
 
 ** Allow the Xen runstate information to cross a page boundary
 
 ** Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured
 
 ** Add support for 32-bit guests in SCHEDOP_poll
 
 * Notable x86 fixes and cleanups:
 
 ** One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).
 
 ** Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few
    years back when eliminating unnecessary barriers when switching between
    vmcs01 and vmcs02.
 
 ** Clean up vmread_error_trampoline() to make it more obvious that params
    must be passed on the stack, even for x86-64.
 
 ** Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective
    of the current guest CPUID.
 
 ** Fudge around a race with TSC refinement that results in KVM incorrectly
    thinking a guest needs TSC scaling when running on a CPU with a
    constant TSC, but no hardware-enumerated TSC frequency.
 
 ** Advertise (on AMD) that the SMM_CTL MSR is not supported
 
 ** Remove unnecessary exports
 
 Generic:
 
 * Support for responding to signals during page faults; introduces
   new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks
 
 Selftests:
 
 * Fix an inverted check in the access tracking perf test, and restore
   support for asserting that there aren't too many idle pages when
   running on bare metal.
 
 * Fix build errors that occur in certain setups (unsure exactly what is
   unique about the problematic setup) due to glibc overriding
   static_assert() to a variant that requires a custom message.
 
 * Introduce actual atomics for clear/set_bit() in selftests
 
 * Add support for pinning vCPUs in dirty_log_perf_test.
 
 * Rename the so called "perf_util" framework to "memstress".
 
 * Add a lightweight psuedo RNG for guest use, and use it to randomize
   the access pattern and write vs. read percentage in the memstress tests.
 
 * Add a common ucall implementation; code dedup and pre-work for running
   SEV (and beyond) guests in selftests.
 
 * Provide a common constructor and arch hook, which will eventually be
   used by x86 to automatically select the right hypercall (AMD vs. Intel).
 
 * A bunch of added/enabled/fixed selftests for ARM64, covering memslots,
   breakpoints, stage-2 faults and access tracking.
 
 * x86-specific selftest changes:
 
 ** Clean up x86's page table management.
 
 ** Clean up and enhance the "smaller maxphyaddr" test, and add a related
    test to cover generic emulation failure.
 
 ** Clean up the nEPT support checks.
 
 ** Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.
 
 ** Fix an ordering issue in the AMX test introduced by recent conversions
    to use kvm_cpu_has(), and harden the code to guard against similar bugs
    in the future.  Anything that tiggers caching of KVM's supported CPUID,
    kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if
    the caching occurs before the test opts in via prctl().
 
 Documentation:
 
 * Remove deleted ioctls from documentation
 
 * Clean up the docs for the x86 MSR filter.
 
 * Various fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmOaFrcUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPemQgAq49excg2Cc+EsHnZw3vu/QWdA0Rt
 KhL3OgKxuHNjCbD2O9n2t5di7eJOTQ7F7T0eDm3xPTr4FS8LQ2327/mQePU/H2CF
 mWOpq9RBWLzFsSTeVA2Mz9TUTkYSnDHYuRsBvHyw/n9cL76BWVzjImldFtjYjjex
 yAwl8c5itKH6bc7KO+5ydswbvBzODkeYKUSBNdbn6m0JGQST7XppNwIAJvpiHsii
 Qgpk0e4Xx9q4PXG/r5DedI6BlufBsLhv0aE9SHPzyKH3JbbUFhJYI8ZD5OhBQuYW
 MwxK2KlM5Jm5ud2NZDDlsMmmvd1lnYCFDyqNozaKEWC1Y5rq1AbMa51fXA==
 =QAYX
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM64:

   - Enable the per-vcpu dirty-ring tracking mechanism, together with an
     option to keep the good old dirty log around for pages that are
     dirtied by something other than a vcpu.

   - Switch to the relaxed parallel fault handling, using RCU to delay
     page table reclaim and giving better performance under load.

   - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping
     option, which multi-process VMMs such as crosvm rely on (see merge
     commit 382b5b87a9: "Fix a number of issues with MTE, such as
     races on the tags being initialised vs the PG_mte_tagged flag as
     well as the lack of support for VM_SHARED when KVM is involved.
     Patches from Catalin Marinas and Peter Collingbourne").

   - Merge the pKVM shadow vcpu state tracking that allows the
     hypervisor to have its own view of a vcpu, keeping that state
     private.

   - Add support for the PMUv3p5 architecture revision, bringing support
     for 64bit counters on systems that support it, and fix the
     no-quite-compliant CHAIN-ed counter support for the machines that
     actually exist out there.

   - Fix a handful of minor issues around 52bit VA/PA support (64kB
     pages only) as a prefix of the oncoming support for 4kB and 16kB
     pages.

   - Pick a small set of documentation and spelling fixes, because no
     good merge window would be complete without those.

  s390:

   - Second batch of the lazy destroy patches

   - First batch of KVM changes for kernel virtual != physical address
     support

   - Removal of a unused function

  x86:

   - Allow compiling out SMM support

   - Cleanup and documentation of SMM state save area format

   - Preserve interrupt shadow in SMM state save area

   - Respond to generic signals during slow page faults

   - Fixes and optimizations for the non-executable huge page errata
     fix.

   - Reprogram all performance counters on PMU filter change

   - Cleanups to Hyper-V emulation and tests

   - Process Hyper-V TLB flushes from a nested guest (i.e. from a L2
     guest running on top of a L1 Hyper-V hypervisor)

   - Advertise several new Intel features

   - x86 Xen-for-KVM:

      - Allow the Xen runstate information to cross a page boundary

      - Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured

      - Add support for 32-bit guests in SCHEDOP_poll

   - Notable x86 fixes and cleanups:

      - One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).

      - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped
        a few years back when eliminating unnecessary barriers when
        switching between vmcs01 and vmcs02.

      - Clean up vmread_error_trampoline() to make it more obvious that
        params must be passed on the stack, even for x86-64.

      - Let userspace set all supported bits in MSR_IA32_FEAT_CTL
        irrespective of the current guest CPUID.

      - Fudge around a race with TSC refinement that results in KVM
        incorrectly thinking a guest needs TSC scaling when running on a
        CPU with a constant TSC, but no hardware-enumerated TSC
        frequency.

      - Advertise (on AMD) that the SMM_CTL MSR is not supported

      - Remove unnecessary exports

  Generic:

   - Support for responding to signals during page faults; introduces
     new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks

  Selftests:

   - Fix an inverted check in the access tracking perf test, and restore
     support for asserting that there aren't too many idle pages when
     running on bare metal.

   - Fix build errors that occur in certain setups (unsure exactly what
     is unique about the problematic setup) due to glibc overriding
     static_assert() to a variant that requires a custom message.

   - Introduce actual atomics for clear/set_bit() in selftests

   - Add support for pinning vCPUs in dirty_log_perf_test.

   - Rename the so called "perf_util" framework to "memstress".

   - Add a lightweight psuedo RNG for guest use, and use it to randomize
     the access pattern and write vs. read percentage in the memstress
     tests.

   - Add a common ucall implementation; code dedup and pre-work for
     running SEV (and beyond) guests in selftests.

   - Provide a common constructor and arch hook, which will eventually
     be used by x86 to automatically select the right hypercall (AMD vs.
     Intel).

   - A bunch of added/enabled/fixed selftests for ARM64, covering
     memslots, breakpoints, stage-2 faults and access tracking.

   - x86-specific selftest changes:

      - Clean up x86's page table management.

      - Clean up and enhance the "smaller maxphyaddr" test, and add a
        related test to cover generic emulation failure.

      - Clean up the nEPT support checks.

      - Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.

      - Fix an ordering issue in the AMX test introduced by recent
        conversions to use kvm_cpu_has(), and harden the code to guard
        against similar bugs in the future. Anything that tiggers
        caching of KVM's supported CPUID, kvm_cpu_has() in this case,
        effectively hides opt-in XSAVE features if the caching occurs
        before the test opts in via prctl().

  Documentation:

   - Remove deleted ioctls from documentation

   - Clean up the docs for the x86 MSR filter.

   - Various fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits)
  KVM: x86: Add proper ReST tables for userspace MSR exits/flags
  KVM: selftests: Allocate ucall pool from MEM_REGION_DATA
  KVM: arm64: selftests: Align VA space allocator with TTBR0
  KVM: arm64: Fix benign bug with incorrect use of VA_BITS
  KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow
  KVM: x86: Advertise that the SMM_CTL MSR is not supported
  KVM: x86: remove unnecessary exports
  KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic"
  tools: KVM: selftests: Convert clear/set_bit() to actual atomics
  tools: Drop "atomic_" prefix from atomic test_and_set_bit()
  tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers
  KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests
  perf tools: Use dedicated non-atomic clear/set bit helpers
  tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers
  KVM: arm64: selftests: Enable single-step without a "full" ucall()
  KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself
  KVM: Remove stale comment about KVM_REQ_UNHALT
  KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR
  KVM: Reference to kvm_userspace_memory_region in doc and comments
  KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl
  ...
2022-12-15 11:12:21 -08:00
Will Deacon
c0cd1d5417 Revert "arm64: errata: Workaround possible Cortex-A715 [ESR|FAR]_ELx corruption"
This reverts commit 44ecda71fd.

All versions of this patch on the mailing list, including the version
that ended up getting merged, have portions of code guarded by the
non-existent CONFIG_ARM64_WORKAROUND_2645198 option. Although Anshuman
says he tested the code with some additional debug changes [1], I'm
hesitant to fix the CONFIG option and light up a bunch of code right
before I (and others) disappear for the end of year holidays, during
which time we won't be around to deal with any fallout.

So revert the change for now. We can bring back a fixed, tested version
for a later -rc when folks are thinking about things other than trees
and turkeys.

[1] https://lore.kernel.org/r/b6f61241-e436-5db1-1053-3b441080b8d6@arm.com
Reported-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20221215094811.23188-1-lukas.bulwahn@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-12-15 17:59:12 +00:00
Linus Torvalds
48ea09cdda hardening updates for v6.2-rc1
- Convert flexible array members, fix -Wstringop-overflow warnings,
   and fix KCFI function type mismatches that went ignored by
   maintainers (Gustavo A. R. Silva, Nathan Chancellor, Kees Cook).
 
 - Remove the remaining side-effect users of ksize() by converting
   dma-buf, btrfs, and coredump to using kmalloc_size_roundup(),
   add more __alloc_size attributes, and introduce full testing
   of all allocator functions. Finally remove the ksize() side-effect
   so that each allocation-aware checker can finally behave without
   exceptions.
 
 - Introduce oops_limit (default 10,000) and warn_limit (default off)
   to provide greater granularity of control for panic_on_oops and
   panic_on_warn (Jann Horn, Kees Cook).
 
 - Introduce overflows_type() and castable_to_type() helpers for
   cleaner overflow checking.
 
 - Improve code generation for strscpy() and update str*() kern-doc.
 
 - Convert strscpy and sigphash tests to KUnit, and expand memcpy
   tests.
 
 - Always use a non-NULL argument for prepare_kernel_cred().
 
 - Disable structleak plugin in FORTIFY KUnit test (Anders Roxell).
 
 - Adjust orphan linker section checking to respect CONFIG_WERROR
   (Xin Li).
 
 - Make sure siginfo is cleared for forced SIGKILL (haifeng.xu).
 
 - Fix um vs FORTIFY warnings for always-NULL arguments.
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmOZSOoWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJjAAD/0YkvpU7f03f8hcQMJK6wv//24K
 AW41hEaBikq9RcmkuvkLLrJRibGgZ5O2xUkUkxRs/HxhkhrZ0kEw8sbwZe8MoWls
 F4Y9+TDjsrdHmjhfcBZdLnVxwcKK5wlaEcpjZXtbsfcdhx3TbgcDA23YELl5t0K+
 I11j4kYmf9SLl4CwIrSP5iACml8CBHARDh8oIMF7FT/LrjNbM8XkvBcVVT6hTbOV
 yjgA8WP2e9GXvj9GzKgqvd0uE/kwPkVAeXLNFWopPi4FQ8AWjlxbBZR0gamA6/EB
 d7TIs0ifpVU2JGQaTav4xO6SsFMj3ntoUI0qIrFaTxZAvV4KYGrPT/Kwz1O4SFaG
 rN5lcxseQbPQSBTFNG4zFjpywTkVCgD2tZqDwz5Rrmiraz0RyIokCN+i4CD9S0Ds
 oEd8JSyLBk1sRALczkuEKo0an5AyC9YWRcBXuRdIHpLo08PsbeUUSe//4pe303cw
 0ApQxYOXnrIk26MLElTzSMImlSvlzW6/5XXzL9ME16leSHOIfDeerPnc9FU9Eb3z
 ODv22z6tJZ9H/apSUIHZbMciMbbVTZ8zgpkfydr08o87b342N/ncYHZ5cSvQ6DWb
 jS5YOIuvl46/IhMPT16qWC8p0bP5YhxoPv5l6Xr0zq0ooEj0E7keiD/SzoLvW+Qs
 AHXcibguPRQBPAdiPQ==
 =yaaN
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull kernel hardening updates from Kees Cook:

 - Convert flexible array members, fix -Wstringop-overflow warnings, and
   fix KCFI function type mismatches that went ignored by maintainers
   (Gustavo A. R. Silva, Nathan Chancellor, Kees Cook)

 - Remove the remaining side-effect users of ksize() by converting
   dma-buf, btrfs, and coredump to using kmalloc_size_roundup(), add
   more __alloc_size attributes, and introduce full testing of all
   allocator functions. Finally remove the ksize() side-effect so that
   each allocation-aware checker can finally behave without exceptions

 - Introduce oops_limit (default 10,000) and warn_limit (default off) to
   provide greater granularity of control for panic_on_oops and
   panic_on_warn (Jann Horn, Kees Cook)

 - Introduce overflows_type() and castable_to_type() helpers for cleaner
   overflow checking

 - Improve code generation for strscpy() and update str*() kern-doc

 - Convert strscpy and sigphash tests to KUnit, and expand memcpy tests

 - Always use a non-NULL argument for prepare_kernel_cred()

 - Disable structleak plugin in FORTIFY KUnit test (Anders Roxell)

 - Adjust orphan linker section checking to respect CONFIG_WERROR (Xin
   Li)

 - Make sure siginfo is cleared for forced SIGKILL (haifeng.xu)

 - Fix um vs FORTIFY warnings for always-NULL arguments

* tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (31 commits)
  ksmbd: replace one-element arrays with flexible-array members
  hpet: Replace one-element array with flexible-array member
  um: virt-pci: Avoid GCC non-NULL warning
  signal: Initialize the info in ksignal
  lib: fortify_kunit: build without structleak plugin
  panic: Expose "warn_count" to sysfs
  panic: Introduce warn_limit
  panic: Consolidate open-coded panic_on_warn checks
  exit: Allow oops_limit to be disabled
  exit: Expose "oops_count" to sysfs
  exit: Put an upper limit on how often we can oops
  panic: Separate sysctl logic from CONFIG_SMP
  mm/pgtable: Fix multiple -Wstringop-overflow warnings
  mm: Make ksize() a reporting-only function
  kunit/fortify: Validate __alloc_size attribute results
  drm/sti: Fix return type of sti_{dvo,hda,hdmi}_connector_mode_valid()
  drm/fsl-dcu: Fix return type of fsl_dcu_drm_connector_mode_valid()
  driver core: Add __alloc_size hint to devm allocators
  overflow: Introduce overflows_type() and castable_to_type()
  coredump: Proactively round up to kmalloc bucket size
  ...
2022-12-14 12:20:00 -08:00
Linus Torvalds
fc4c9f4504 EFI updates for v6.2:
- Refactor the zboot code so that it incorporates all the EFI stub
   logic, rather than calling the decompressed kernel as a EFI app.
 - Add support for initrd= command line option to x86 mixed mode.
 - Allow initrd= to be used with arbitrary EFI accessible file systems
   instead of just the one the kernel itself was loaded from.
 - Move some x86-only handling and manipulation of the EFI memory map
   into arch/x86, as it is not used anywhere else.
 - More flexible handling of any random seeds provided by the boot
   environment (i.e., systemd-boot) so that it becomes available much
   earlier during the boot.
 - Allow improved arch-agnostic EFI support in loaders, by setting a
   uniform baseline of supported features, and adding a generic magic
   number to the DOS/PE header. This should allow loaders such as GRUB or
   systemd-boot to reduce the amount of arch-specific handling
   substantially.
 - (arm64) Run EFI runtime services from a dedicated stack, and use it to
   recover from synchronous exceptions that might occur in the firmware
   code.
 - (arm64) Ensure that we don't allocate memory outside of the 48-bit
   addressable physical range.
 - Make EFI pstore record size configurable
 - Add support for decoding CXL specific CPER records
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmOTQ1cACgkQw08iOZLZ
 jyQRkAv+LqaZFWeVwhAQHiw/N3RnRM0nZHea6++D2p1y/ZbCpwv3pdLl2YHQ1KmW
 wDG9Nr4C1ITLtfy1YZKeYpwloQtq9S1GZDWnFpVv/hdo7L924eRAwIlxowWn1OnP
 ruxv2PaYXyb0plh1YD1f6E1BqrfUOtajET55Kxs9ZsxmnMtDpIX3NiYy4LKMBIZC
 +Eywt41M3uBX+wgmSujFBMVVJjhOX60WhUYXqy0RXwDKOyrz/oW5td+eotSCreB6
 FVbjvwQvUdtzn4s1FayOMlTrkxxLw4vLhsaUGAdDOHd3rg3sZT9Xh1HqFFD6nss6
 ZAzAYQ6BzdiV/5WSB9meJe+BeG1hjTNKjJI6JPO2lctzYJqlnJJzI6JzBuH9vzQ0
 dffLB8NITeEW2rphIh+q+PAKFFNbXWkJtV4BMRpqmzZ/w7HwupZbUXAzbWE8/5km
 qlFpr0kmq8GlVcbXNOFjmnQVrJ8jPYn+O3AwmEiVAXKZJOsMH0sjlXHKsonme9oV
 Sk71c6Em
 =JEXz
 -----END PGP SIGNATURE-----

Merge tag 'efi-next-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI updates from Ard Biesheuvel:
 "Another fairly sizable pull request, by EFI subsystem standards.

  Most of the work was done by me, some of it in collaboration with the
  distro and bootloader folks (GRUB, systemd-boot), where the main focus
  has been on removing pointless per-arch differences in the way EFI
  boots a Linux kernel.

   - Refactor the zboot code so that it incorporates all the EFI stub
     logic, rather than calling the decompressed kernel as a EFI app.

   - Add support for initrd= command line option to x86 mixed mode.

   - Allow initrd= to be used with arbitrary EFI accessible file systems
     instead of just the one the kernel itself was loaded from.

   - Move some x86-only handling and manipulation of the EFI memory map
     into arch/x86, as it is not used anywhere else.

   - More flexible handling of any random seeds provided by the boot
     environment (i.e., systemd-boot) so that it becomes available much
     earlier during the boot.

   - Allow improved arch-agnostic EFI support in loaders, by setting a
     uniform baseline of supported features, and adding a generic magic
     number to the DOS/PE header. This should allow loaders such as GRUB
     or systemd-boot to reduce the amount of arch-specific handling
     substantially.

   - (arm64) Run EFI runtime services from a dedicated stack, and use it
     to recover from synchronous exceptions that might occur in the
     firmware code.

   - (arm64) Ensure that we don't allocate memory outside of the 48-bit
     addressable physical range.

   - Make EFI pstore record size configurable

   - Add support for decoding CXL specific CPER records"

* tag 'efi-next-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (43 commits)
  arm64: efi: Recover from synchronous exceptions occurring in firmware
  arm64: efi: Execute runtime services from a dedicated stack
  arm64: efi: Limit allocations to 48-bit addressable physical region
  efi: Put Linux specific magic number in the DOS header
  efi: libstub: Always enable initrd command line loader and bump version
  efi: stub: use random seed from EFI variable
  efi: vars: prohibit reading random seed variables
  efi: random: combine bootloader provided RNG seed with RNG protocol output
  efi/cper, cxl: Decode CXL Error Log
  efi/cper, cxl: Decode CXL Protocol Error Section
  efi: libstub: fix efi_load_initrd_dev_path() kernel-doc comment
  efi: x86: Move EFI runtime map sysfs code to arch/x86
  efi: runtime-maps: Clarify purpose and enable by default for kexec
  efi: pstore: Add module parameter for setting the record size
  efi: xen: Set EFI_PARAVIRT for Xen dom0 boot on all architectures
  efi: memmap: Move manipulation routines into x86 arch tree
  efi: memmap: Move EFI fake memmap support into x86 arch tree
  efi: libstub: Undeprecate the command line initrd loader
  efi: libstub: Add mixed mode support to command line initrd loader
  efi: libstub: Permit mixed mode return types other than efi_status_t
  ...
2022-12-13 14:31:47 -08:00
Linus Torvalds
8702f2c611 Non-MM patches for 6.2-rc1.
- A ptrace API cleanup series from Sergey Shtylyov
 
 - Fixes and cleanups for kexec from ye xingchen
 
 - nilfs2 updates from Ryusuke Konishi
 
 - squashfs feature work from Xiaoming Ni: permit configuration of the
   filesystem's compression concurrency from the mount command line.
 
 - A series from Akinobu Mita which addresses bound checking errors when
   writing to debugfs files.
 
 - A series from Yang Yingliang to address rapido memory leaks
 
 - A series from Zheng Yejian to address possible overflow errors in
   encode_comp_t().
 
 - And a whole shower of singleton patches all over the place.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY5efRgAKCRDdBJ7gKXxA
 jgvdAP0al6oFDtaSsshIdNhrzcMwfjt6PfVxxHdLmNhF1hX2dwD/SVluS1bPSP7y
 0sZp7Ustu3YTb8aFkMl96Y9m9mY1Nwg=
 =ga5B
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - A ptrace API cleanup series from Sergey Shtylyov

 - Fixes and cleanups for kexec from ye xingchen

 - nilfs2 updates from Ryusuke Konishi

 - squashfs feature work from Xiaoming Ni: permit configuration of the
   filesystem's compression concurrency from the mount command line

 - A series from Akinobu Mita which addresses bound checking errors when
   writing to debugfs files

 - A series from Yang Yingliang to address rapidio memory leaks

 - A series from Zheng Yejian to address possible overflow errors in
   encode_comp_t()

 - And a whole shower of singleton patches all over the place

* tag 'mm-nonmm-stable-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (79 commits)
  ipc: fix memory leak in init_mqueue_fs()
  hfsplus: fix bug causing custom uid and gid being unable to be assigned with mount
  rapidio: devices: fix missing put_device in mport_cdev_open
  kcov: fix spelling typos in comments
  hfs: Fix OOB Write in hfs_asc2mac
  hfs: fix OOB Read in __hfs_brec_find
  relay: fix type mismatch when allocating memory in relay_create_buf()
  ocfs2: always read both high and low parts of dinode link count
  io-mapping: move some code within the include guarded section
  kernel: kcsan: kcsan_test: build without structleak plugin
  mailmap: update email for Iskren Chernev
  eventfd: change int to __u64 in eventfd_signal() ifndef CONFIG_EVENTFD
  rapidio: fix possible UAF when kfifo_alloc() fails
  relay: use strscpy() is more robust and safer
  cpumask: limit visibility of FORCE_NR_CPUS
  acct: fix potential integer overflow in encode_comp_t()
  acct: fix accuracy loss for input value of encode_comp_t()
  linux/init.h: include <linux/build_bug.h> and <linux/stringify.h>
  rapidio: rio: fix possible name leak in rio_register_mport()
  rapidio: fix possible name leaks when rio_add_device() fails
  ...
2022-12-12 17:28:58 -08:00
Linus Torvalds
268325bda5 Random number generator updates for Linux 6.2-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmOU+U8ACgkQSfxwEqXe
 A67NnQ//Y5DltmvibyPd7r1TFT2gUYv+Rx3sUV9ZE1NYptd/SWhhcL8c5FZ70Fuw
 bSKCa1uiWjOxosjXT1kGrWq3de7q7oUpAPSOGxgxzoaNURIt58N/ajItCX/4Au8I
 RlGAScHy5e5t41/26a498kB6qJ441fBEqCYKQpPLINMBAhe8TQ+NVp0rlpUwNHFX
 WrUGg4oKWxdBIW3HkDirQjJWDkkAiklRTifQh/Al4b6QDbOnRUGGCeckNOhixsvS
 waHWTld+Td8jRrA4b82tUb2uVZ2/b8dEvj/A8CuTv4yC0lywoyMgBWmJAGOC+UmT
 ZVNdGW02Jc2T+Iap8ZdsEmeLHNqbli4+IcbY5xNlov+tHJ2oz41H9TZoYKbudlr6
 /ReAUPSn7i50PhbQlEruj3eg+M2gjOeh8OF8UKwwRK8PghvyWQ1ScW0l3kUhPIhI
 PdIG6j4+D2mJc1FIj2rTVB+Bg933x6S+qx4zDxGlNp62AARUFYf6EgyD6aXFQVuX
 RxcKb6cjRuFkzFiKc8zkqg5edZH+IJcPNuIBmABqTGBOxbZWURXzIQvK/iULqZa4
 CdGAFIs6FuOh8pFHLI3R4YoHBopbHup/xKDEeAO9KZGyeVIuOSERDxxo5f/ITzcq
 APvT77DFOEuyvanr8RMqqh0yUjzcddXqw9+ieufsAyDwjD9DTuE=
 =QRhK
 -----END PGP SIGNATURE-----

Merge tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:

 - Replace prandom_u32_max() and various open-coded variants of it,
   there is now a new family of functions that uses fast rejection
   sampling to choose properly uniformly random numbers within an
   interval:

       get_random_u32_below(ceil) - [0, ceil)
       get_random_u32_above(floor) - (floor, U32_MAX]
       get_random_u32_inclusive(floor, ceil) - [floor, ceil]

   Coccinelle was used to convert all current users of
   prandom_u32_max(), as well as many open-coded patterns, resulting in
   improvements throughout the tree.

   I'll have a "late" 6.1-rc1 pull for you that removes the now unused
   prandom_u32_max() function, just in case any other trees add a new
   use case of it that needs to converted. According to linux-next,
   there may be two trivial cases of prandom_u32_max() reintroductions
   that are fixable with a 's/.../.../'. So I'll have for you a final
   conversion patch doing that alongside the removal patch during the
   second week.

   This is a treewide change that touches many files throughout.

 - More consistent use of get_random_canary().

 - Updates to comments, documentation, tests, headers, and
   simplification in configuration.

 - The arch_get_random*_early() abstraction was only used by arm64 and
   wasn't entirely useful, so this has been replaced by code that works
   in all relevant contexts.

 - The kernel will use and manage random seeds in non-volatile EFI
   variables, refreshing a variable with a fresh seed when the RNG is
   initialized. The RNG GUID namespace is then hidden from efivarfs to
   prevent accidental leakage.

   These changes are split into random.c infrastructure code used in the
   EFI subsystem, in this pull request, and related support inside of
   EFISTUB, in Ard's EFI tree. These are co-dependent for full
   functionality, but the order of merging doesn't matter.

 - Part of the infrastructure added for the EFI support is also used for
   an improvement to the way vsprintf initializes its siphash key,
   replacing an sleep loop wart.

 - The hardware RNG framework now always calls its correct random.c
   input function, add_hwgenerator_randomness(), rather than sometimes
   going through helpers better suited for other cases.

 - The add_latent_entropy() function has long been called from the fork
   handler, but is a no-op when the latent entropy gcc plugin isn't
   used, which is fine for the purposes of latent entropy.

   But it was missing out on the cycle counter that was also being mixed
   in beside the latent entropy variable. So now, if the latent entropy
   gcc plugin isn't enabled, add_latent_entropy() will expand to a call
   to add_device_randomness(NULL, 0), which adds a cycle counter,
   without the absent latent entropy variable.

 - The RNG is now reseeded from a delayed worker, rather than on demand
   when used. Always running from a worker allows it to make use of the
   CPU RNG on platforms like S390x, whose instructions are too slow to
   do so from interrupts. It also has the effect of adding in new inputs
   more frequently with more regularity, amounting to a long term
   transcript of random values. Plus, it helps a bit with the upcoming
   vDSO implementation (which isn't yet ready for 6.2).

 - The jitter entropy algorithm now tries to execute on many different
   CPUs, round-robining, in hopes of hitting even more memory latencies
   and other unpredictable effects. It also will mix in a cycle counter
   when the entropy timer fires, in addition to being mixed in from the
   main loop, to account more explicitly for fluctuations in that timer
   firing. And the state it touches is now kept within the same cache
   line, so that it's assured that the different execution contexts will
   cause latencies.

* tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits)
  random: include <linux/once.h> in the right header
  random: align entropy_timer_state to cache line
  random: mix in cycle counter when jitter timer fires
  random: spread out jitter callback to different CPUs
  random: remove extraneous period and add a missing one in comments
  efi: random: refresh non-volatile random seed when RNG is initialized
  vsprintf: initialize siphash key using notifier
  random: add back async readiness notifier
  random: reseed in delayed work rather than on-demand
  random: always mix cycle counter in add_latent_entropy()
  hw_random: use add_hwgenerator_randomness() for early entropy
  random: modernize documentation comment on get_random_bytes()
  random: adjust comment to account for removed function
  random: remove early archrandom abstraction
  random: use random.trust_{bootloader,cpu} command line option only
  stackprotector: actually use get_random_canary()
  stackprotector: move get_random_canary() into stackprotector.h
  treewide: use get_random_u32_inclusive() when possible
  treewide: use get_random_u32_{above,below}() instead of manual loop
  treewide: use get_random_u32_below() instead of deprecated function
  ...
2022-12-12 16:22:22 -08:00
Linus Torvalds
add7695957 Perf events updates for v6.2:
- Thoroughly rewrite the data structures that implement perf task context handling,
    with the goal of fixing various quirks and unfeatures both in already merged,
    and in upcoming proposed code.
 
    The old data structure is the per task and per cpu perf_event_contexts:
 
          task_struct::perf_events_ctxp[] <-> perf_event_context <-> perf_cpu_context
               ^                                 |    ^     |           ^
               `---------------------------------'    |     `--> pmu ---'
                                                      v           ^
                                                 perf_event ------'
 
    In this new design this is replaced with a single task context and
    a single CPU context, plus intermediate data-structures:
 
          task_struct::perf_event_ctxp -> perf_event_context <- perf_cpu_context
               ^                           |   ^ ^
               `---------------------------'   | |
                                               | |    perf_cpu_pmu_context <--.
                                               | `----.    ^                  |
                                               |      |    |                  |
                                               |      v    v                  |
                                               | ,--> perf_event_pmu_context  |
                                               | |                            |
                                               | |                            |
                                               v v                            |
                                          perf_event ---> pmu ----------------'
 
    [ See commit bd27568117 for more details. ]
 
    This rewrite was developed by Peter Zijlstra and Ravi Bangoria.
 
  - Optimize perf_tp_event()
 
  - Update the Intel uncore PMU driver, extending it with UPI topology discovery
    on various hardware models.
 
  - Misc fixes & cleanups
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmOXjuURHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1j+VhAAknimsLwenTHCGQp7yqsWSKfBr9KI2UgD
 ZgtQuuwRwSzwqAEwC5Mt6zcIkxRNhU1ookFPqQbpY3XA0W4aNakUk8bDF8QIEKW0
 MFWxn7PtReWqKcUay2oEGGurqZ5OtfpljJGxigQh5oVeMGc+itIwHF2JefeyoRnu
 pq7R2qDgOBb7Np4lWTdqXGmKufzp04/nely2IZQBO8x80cGRZiKQIrGrch6vLUf7
 3iEz9rwmvPyz0aczYSpa/duEZDMLm4lWNK4oMUEXuUWC8gU7CUzBJsJ3AS5NgxAu
 yGBXe/s7GHqwtc/F30l5gK/J5WAyK83IF7sckxTj0dBUpyC6wQwwYPm8BaCAMoqN
 X6mU7Ve938Siih1TyOBZfZsrtDDILhV2N/nku2erb3iqes26u0RcT25rWtu9Yqvn
 hm4Gm6cmkHWq4EOHSBvAdC7l7lDZ3fyVI5+8nN9ly9Qv867HjG70dvIr9iEEolpX
 rhFAz8r/NwTXhDY0AmFZcOkrnNV3IuHtibJ/9wJlgJNqDPqN12Wxqdzy0Nj3HH6G
 EsukBO05cWaDS0gB8MpaO6Q6YtqAr87ZY+afHDBwcfkME50/CyBLr5rd47dTR+Ip
 B+zreYKcaNHdEMd1A9KULRTTDnEjlXYMwjVVJiPRV0jcmA3dHmM46HN5Ae9NdO6+
 R2BAWv9XR6M=
 =KNaI
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf events updates from Ingo Molnar:

 - Thoroughly rewrite the data structures that implement perf task
   context handling, with the goal of fixing various quirks and
   unfeatures both in already merged, and in upcoming proposed code.

   The old data structure is the per task and per cpu
   perf_event_contexts:

         task_struct::perf_events_ctxp[] <-> perf_event_context <-> perf_cpu_context
              ^                                 |    ^     |           ^
              `---------------------------------'    |     `--> pmu ---'
                                                     v           ^
                                                perf_event ------'

   In this new design this is replaced with a single task context and a
   single CPU context, plus intermediate data-structures:

         task_struct::perf_event_ctxp -> perf_event_context <- perf_cpu_context
              ^                           |   ^ ^
              `---------------------------'   | |
                                              | |    perf_cpu_pmu_context <--.
                                              | `----.    ^                  |
                                              |      |    |                  |
                                              |      v    v                  |
                                              | ,--> perf_event_pmu_context  |
                                              | |                            |
                                              | |                            |
                                              v v                            |
                                         perf_event ---> pmu ----------------'

   [ See commit bd27568117 for more details. ]

   This rewrite was developed by Peter Zijlstra and Ravi Bangoria.

 - Optimize perf_tp_event()

 - Update the Intel uncore PMU driver, extending it with UPI topology
   discovery on various hardware models.

 - Misc fixes & cleanups

* tag 'perf-core-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  perf/x86/intel/uncore: Fix reference count leak in __uncore_imc_init_box()
  perf/x86/intel/uncore: Fix reference count leak in snr_uncore_mmio_map()
  perf/x86/intel/uncore: Fix reference count leak in hswep_has_limit_sbox()
  perf/x86/intel/uncore: Fix reference count leak in sad_cfg_iio_topology()
  perf/x86/intel/uncore: Make set_mapping() procedure void
  perf/x86/intel/uncore: Update sysfs-devices-mapping file
  perf/x86/intel/uncore: Enable UPI topology discovery for Sapphire Rapids
  perf/x86/intel/uncore: Enable UPI topology discovery for Icelake Server
  perf/x86/intel/uncore: Get UPI NodeID and GroupID
  perf/x86/intel/uncore: Enable UPI topology discovery for Skylake Server
  perf/x86/intel/uncore: Generalize get_topology() for SKX PMUs
  perf/x86/intel/uncore: Disable I/O stacks to PMU mapping on ICX-D
  perf/x86/intel/uncore: Clear attr_update properly
  perf/x86/intel/uncore: Introduce UPI topology type
  perf/x86/intel/uncore: Generalize IIO topology support
  perf/core: Don't allow grouping events from different hw pmus
  perf/amd/ibs: Make IBS a core pmu
  perf: Fix function pointer case
  perf/x86/amd: Remove the repeated declaration
  perf: Fix possible memleak in pmu_dev_alloc()
  ...
2022-12-12 15:19:38 -08:00
Linus Torvalds
456ed864fd ACPI updates for 6.2-rc1
- Update the ACPICA code in the kernel to the 20221020 upstream
    version and fix a couple of issues in it:
 
    * Make acpi_ex_load_op() match upstream implementation (Rafael
      Wysocki).
    * Add support for loong_arch-specific APICs in MADT (Huacai Chen).
    * Add support for fixed PCIe wake event (Huacai Chen).
    * Add EBDA pointer sanity checks (Vit Kabele).
    * Avoid accessing VGA memory when EBDA < 1KiB (Vit Kabele).
    * Add CCEL table support to both compiler/disassembler (Kuppuswamy
      Sathyanarayanan).
    * Add a couple of new UUIDs to the known UUID list (Bob Moore).
    * Add support for FFH Opregion special context data (Sudeep Holla).
    * Improve warning message for "invalid ACPI name" (Bob Moore).
    * Add support for CXL 3.0 structures (CXIMS & RDPAS) in the CEDT
      table (Alison Schofield).
    * Prepare IORT support for revision E.e (Robin Murphy).
    * Finish support for the CDAT table (Bob Moore).
    * Fix error code path in acpi_ds_call_control_method() (Rafael
      Wysocki).
    * Fix use-after-free in acpi_ut_copy_ipackage_to_ipackage() (Li
      Zetao).
    * Update the version of the ACPICA code in the kernel (Bob Moore).
 
  - Use ZERO_PAGE(0) instead of empty_zero_page in the ACPI device
    enumeration code (Giulio Benetti).
 
  - Change the return type of the ACPI driver remove callback to void and
    update its users accordingly (Dawei Li).
 
  - Add general support for FFH address space type and implement the low-
    level part of it for ARM64 (Sudeep Holla).
 
  - Fix stale comments in the ACPI tables parsing code and make it print
    more messages related to MADT (Hanjun Guo, Huacai Chen).
 
  - Replace invocations of generic library functions with more kernel-
    specific counterparts in the ACPI sysfs interface (Christophe JAILLET,
    Xu Panda).
 
  - Print full name paths of ACPI power resource objects during
    enumeration (Kane Chen).
 
  - Eliminate a compiler warning regarding a missing function prototype
    in the ACPI power management code (Sudeep Holla).
 
  - Fix and clean up the ACPI processor driver (Rafael Wysocki, Li Zhong,
    Colin Ian King, Sudeep Holla).
 
  - Add quirk for the HP Pavilion Gaming 15-cx0041ur to the ACPI EC
    driver (Mia Kanashi).
 
  - Add some mew ACPI backlight handling quirks and update some existing
    ones (Hans de Goede).
 
  - Make the ACPI backlight driver prefer the native backlight control
    over vendor backlight control when possible (Hans de Goede).
 
  - Drop unsetting ACPI APEI driver data on remove (Uwe Kleine-König).
 
  - Use xchg_release() instead of cmpxchg() for updating new GHES cache
    slots (Ard Biesheuvel).
 
  - Clean up the ACPI APEI code (Sudeep Holla, Christophe JAILLET, Jay Lu).
 
  - Add new I2C device enumeration quirks for Medion Lifetab S10346 and
    Lenovo Yoga Tab 3 Pro (YT3-X90F) (Hans de Goede).
 
  - Make the ACPI battery driver notify user space about adding new
    battery hooks and removing the existing ones (Armin Wolf).
 
  - Modify the pfr_update and pfr_telemetry drivers to use ACPI_FREE()
    for freeing acpi_object structures to help diagnostics (Wang ShaoBo).
 
  - Make the ACPI fan driver use sysfs_emit_at() in its sysfs interface
    code (ye xingchen).
 
  - Fix the _FIF package extraction failure handling in the ACPI fan
    driver (Hanjun Guo).
 
  - Fix the PCC mailbox handling error code path (Huisong Li).
 
  - Avoid using PCC Opregions if there is no platform interrupt allocated
    for this purpose (Huisong Li).
 
  - Use sysfs_emit() instead of scnprintf() in the ACPI PAD driver and
    CPPC library (ye xingchen).
 
  - Fix some kernel-doc issues in the ACPI GSI processing code (Xiongfeng
    Wang).
 
  - Fix name memory leak in pnp_alloc_dev() (Yang Yingliang).
 
  - Do not disable PNP devices on suspend when they cannot be re-enabled
    on resume (Hans de Goede).
 
  - Clean up the ACPI thermal driver a bit (Rafael Wysocki).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmOXV10SHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxuOwP/2zew6val2Jf7I/Yxf1iQLlRyGmhFnaH
 wpltJvBjlHjAUKnPQ/kLYK9fjuUY5HVgjOE03WpwhFUpmhftYTrSkhoVkJ1Mw9Zl
 RNOAEgCG484ThHiTIVp/dMPxrtfuqpdbamhWX3Q51IfXjGW8Vc/lDxIa3k/JQxyq
 ko8GFPCoebJrSCfuwaAf2+xSQaf6dq4jpL/rlIk+nYMMB9mQmXhNEhc+l97NaCe8
 MyCIGynyNbhGsIlwdHRvTp04EIe8h0Z1+Dyns7g/TrzHj3Aezy7QVZbn8sKdZWa1
 W/Ck9QST5tfpDWyr+hUXxUJjEn4Yy+GXjM2xON0EMx5q+JD9XsOpwWOVwTR7CS5s
 FwEd6I89SC8OZM86AgMtnGxygjpK24R/kGzHjhG15IQCsypc8Rvzoxl0L0YVoon/
 UTkE57GzNWVzu0pY/oXJc2aT7lVqFXMFZ6ft/zHnBRnQmrcIi+xgDO5ni5KxctFN
 TVFwbAMCuwVx6IOcVQCZM2g4aJw426KpUn19fKnXvPwR5UIufBaCzSKWMiYrtdXr
 O5BM8ElYuyKCWGYEE0GSMjZygyDpyY6ENLH7s7P1IEmFyigBzaaGBbKm108JJq4V
 eCWJYTAx8pAptsU/vfuMvEQ1ErfhZ3TTokA5Lv0uPf53VcAnWDb7EAbW6ZGMwFSI
 IaV6cv6ILoqO
 =GVzp
 -----END PGP SIGNATURE-----

Merge tag 'acpi-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and PNP updates from Rafael Wysocki:
 "These include new code (for instance, support for the FFH address
  space type and support for new firmware data structures in ACPICA),
  some new quirks (mostly related to backlight handling and I2C
  enumeration), a number of fixes and a fair amount of cleanups all
  over.

  Specifics:

   - Update the ACPICA code in the kernel to the 20221020 upstream
     version and fix a couple of issues in it:
      - Make acpi_ex_load_op() match upstream implementation (Rafael
        Wysocki)
      - Add support for loong_arch-specific APICs in MADT (Huacai Chen)
      - Add support for fixed PCIe wake event (Huacai Chen)
      - Add EBDA pointer sanity checks (Vit Kabele)
      - Avoid accessing VGA memory when EBDA < 1KiB (Vit Kabele)
      - Add CCEL table support to both compiler/disassembler (Kuppuswamy
        Sathyanarayanan)
      - Add a couple of new UUIDs to the known UUID list (Bob Moore)
      - Add support for FFH Opregion special context data (Sudeep
        Holla)
      - Improve warning message for "invalid ACPI name" (Bob Moore)
      - Add support for CXL 3.0 structures (CXIMS & RDPAS) in the CEDT
        table (Alison Schofield)
      - Prepare IORT support for revision E.e (Robin Murphy)
      - Finish support for the CDAT table (Bob Moore)
      - Fix error code path in acpi_ds_call_control_method() (Rafael
        Wysocki)
      - Fix use-after-free in acpi_ut_copy_ipackage_to_ipackage() (Li
        Zetao)
      - Update the version of the ACPICA code in the kernel (Bob Moore)

   - Use ZERO_PAGE(0) instead of empty_zero_page in the ACPI device
     enumeration code (Giulio Benetti)

   - Change the return type of the ACPI driver remove callback to void
     and update its users accordingly (Dawei Li)

   - Add general support for FFH address space type and implement the
     low- level part of it for ARM64 (Sudeep Holla)

   - Fix stale comments in the ACPI tables parsing code and make it
     print more messages related to MADT (Hanjun Guo, Huacai Chen)

   - Replace invocations of generic library functions with more kernel-
     specific counterparts in the ACPI sysfs interface (Christophe
     JAILLET, Xu Panda)

   - Print full name paths of ACPI power resource objects during
     enumeration (Kane Chen)

   - Eliminate a compiler warning regarding a missing function prototype
     in the ACPI power management code (Sudeep Holla)

   - Fix and clean up the ACPI processor driver (Rafael Wysocki, Li
     Zhong, Colin Ian King, Sudeep Holla)

   - Add quirk for the HP Pavilion Gaming 15-cx0041ur to the ACPI EC
     driver (Mia Kanashi)

   - Add some mew ACPI backlight handling quirks and update some
     existing ones (Hans de Goede)

   - Make the ACPI backlight driver prefer the native backlight control
     over vendor backlight control when possible (Hans de Goede)

   - Drop unsetting ACPI APEI driver data on remove (Uwe Kleine-König)

   - Use xchg_release() instead of cmpxchg() for updating new GHES cache
     slots (Ard Biesheuvel)

   - Clean up the ACPI APEI code (Sudeep Holla, Christophe JAILLET, Jay
     Lu)

   - Add new I2C device enumeration quirks for Medion Lifetab S10346 and
     Lenovo Yoga Tab 3 Pro (YT3-X90F) (Hans de Goede)

   - Make the ACPI battery driver notify user space about adding new
     battery hooks and removing the existing ones (Armin Wolf)

   - Modify the pfr_update and pfr_telemetry drivers to use ACPI_FREE()
     for freeing acpi_object structures to help diagnostics (Wang
     ShaoBo)

   - Make the ACPI fan driver use sysfs_emit_at() in its sysfs interface
     code (ye xingchen)

   - Fix the _FIF package extraction failure handling in the ACPI fan
     driver (Hanjun Guo)

   - Fix the PCC mailbox handling error code path (Huisong Li)

   - Avoid using PCC Opregions if there is no platform interrupt
     allocated for this purpose (Huisong Li)

   - Use sysfs_emit() instead of scnprintf() in the ACPI PAD driver and
     CPPC library (ye xingchen)

   - Fix some kernel-doc issues in the ACPI GSI processing code
     (Xiongfeng Wang)

   - Fix name memory leak in pnp_alloc_dev() (Yang Yingliang)

   - Do not disable PNP devices on suspend when they cannot be
     re-enabled on resume (Hans de Goede)

   - Clean up the ACPI thermal driver a bit (Rafael Wysocki)"

* tag 'acpi-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (67 commits)
  ACPI: x86: Add skip i2c clients quirk for Medion Lifetab S10346
  ACPI: APEI: EINJ: Refactor available_error_type_show()
  ACPI: APEI: EINJ: Fix formatting errors
  ACPI: processor: perflib: Adjust acpi_processor_notify_smm() return value
  ACPI: processor: perflib: Rearrange acpi_processor_notify_smm()
  ACPI: processor: perflib: Rearrange unregistration routine
  ACPI: processor: perflib: Drop redundant parentheses
  ACPI: processor: perflib: Adjust white space
  ACPI: processor: idle: Drop unnecessary statements and parens
  ACPI: thermal: Adjust critical.flags.valid check
  ACPI: fan: Convert to use sysfs_emit_at() API
  ACPICA: Fix use-after-free in acpi_ut_copy_ipackage_to_ipackage()
  ACPI: battery: Call power_supply_changed() when adding hooks
  ACPI: use sysfs_emit() instead of scnprintf()
  ACPI: x86: Add skip i2c clients quirk for Lenovo Yoga Tab 3 Pro (YT3-X90F)
  ACPI: APEI: Remove a useless include
  PNP: Do not disable devices on suspend when they cannot be re-enabled on resume
  ACPI: processor: Silence missing prototype warnings
  ACPI: processor_idle: Silence missing prototype warnings
  ACPI: PM: Silence missing prototype warning
  ...
2022-12-12 13:38:17 -08:00
Linus Torvalds
0a1d4434db Updates for timers, timekeeping and drivers:
- Core:
 
    - The timer_shutdown[_sync]() infrastructure:
 
      Tearing down timers can be tedious when there are circular
      dependencies to other things which need to be torn down. A prime
      example is timer and workqueue where the timer schedules work and the
      work arms the timer.
 
      What needs to prevented is that pending work which is drained via
      destroy_workqueue() does not rearm the previously shutdown
      timer. Nothing in that shutdown sequence relies on the timer being
      functional.
 
      The conclusion was that the semantics of timer_shutdown_sync() should
      be:
 
 	- timer is not enqueued
     	- timer callback is not running
     	- timer cannot be rearmed
 
      Preventing the rearming of shutdown timers is done by discarding rearm
      attempts silently. A warning for the case that a rearm attempt of a
      shutdown timer is detected would not be really helpful because it's
      entirely unclear how it should be acted upon. The only way to address
      such a case is to add 'if (in_shutdown)' conditionals all over the
      place. This is error prone and in most cases of teardown not required
      all.
 
    - The real fix for the bluetooth HCI teardown based on
      timer_shutdown_sync().
 
      A larger scale conversion to timer_shutdown_sync() is work in
      progress.
 
    - Consolidation of VDSO time namespace helper functions
 
    - Small fixes for timer and timerqueue
 
  - Drivers:
 
    - Prevent integer overflow on the XGene-1 TVAL register which causes
      an never ending interrupt storm.
 
    - The usual set of new device tree bindings
 
    - Small fixes and improvements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmOUuC0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYodpZD/9kCDi009n65QFF1J4kE5aZuABbRMtO
 7sy66fJpDyB/MtcbPPH29uzQUEs1VMTQVB+ZM+7e1YGoxSWuSTzeoFH+yK1w4tEZ
 VPbOcvUEjG0esKUehwYFeOjSnIjy6M1Y41aOUaDnq00/azhfTrzLxQA1BbbFbkpw
 S7u2hllbyRJ8KdqQyV9cVpXmze6fcpdtNhdQeoA7qQCsSPnJ24MSpZ/PG9bAovq8
 75IRROT7CQRd6AMKAVpA9Ov8ak9nbY3EgQmoKcp5ZXfXz8kD3nHky9Lste7djgYB
 U085Vwcelt39V5iXevDFfzrBYRUqrMKOXIf2xnnoDNeF5Jlj5gChSNVZwTLO38wu
 RFEVCjCjuC41GQJWSck9LRSYdriW/htVbEE8JLc6uzUJGSyjshgJRn/PK4HjpiLY
 AvH2rd4rAap/rjDKvfWvBqClcfL7pyBvavgJeyJ8oXyQjHrHQwapPcsMFBm0Cky5
 soF0Lr3hIlQ9u+hwUuFdNZkY9mOg09g9ImEjW1AZTKY0DfJMc5JAGjjSCfuopVUN
 Uf/qqcUeQPSEaC+C9xiFs0T3svYFxBqpgPv4B6t8zAnozon9fyZs+lv5KdRg4X77
 qX395qc6PaOSQlA7gcxVw3vjCPd0+hljXX84BORP7z+uzcsomvIH1MxJepIHmgaJ
 JrYbSZ5qzY5TTA==
 =JlDe
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer updates from Thomas Gleixner:
 "Updates for timers, timekeeping and drivers:

  Core:

   - The timer_shutdown[_sync]() infrastructure:

     Tearing down timers can be tedious when there are circular
     dependencies to other things which need to be torn down. A prime
     example is timer and workqueue where the timer schedules work and
     the work arms the timer.

     What needs to prevented is that pending work which is drained via
     destroy_workqueue() does not rearm the previously shutdown timer.
     Nothing in that shutdown sequence relies on the timer being
     functional.

     The conclusion was that the semantics of timer_shutdown_sync()
     should be:
	- timer is not enqueued
    	- timer callback is not running
    	- timer cannot be rearmed

     Preventing the rearming of shutdown timers is done by discarding
     rearm attempts silently.

     A warning for the case that a rearm attempt of a shutdown timer is
     detected would not be really helpful because it's entirely unclear
     how it should be acted upon. The only way to address such a case is
     to add 'if (in_shutdown)' conditionals all over the place. This is
     error prone and in most cases of teardown not required all.

   - The real fix for the bluetooth HCI teardown based on
     timer_shutdown_sync().

     A larger scale conversion to timer_shutdown_sync() is work in
     progress.

   - Consolidation of VDSO time namespace helper functions

   - Small fixes for timer and timerqueue

  Drivers:

   - Prevent integer overflow on the XGene-1 TVAL register which causes
     an never ending interrupt storm.

   - The usual set of new device tree bindings

   - Small fixes and improvements all over the place"

* tag 'timers-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
  dt-bindings: timer: renesas,cmt: Add r8a779g0 CMT support
  dt-bindings: timer: renesas,tmu: Add r8a779g0 support
  clocksource/drivers/arm_arch_timer: Use kstrtobool() instead of strtobool()
  clocksource/drivers/timer-ti-dm: Fix missing clk_disable_unprepare in dmtimer_systimer_init_clock()
  clocksource/drivers/timer-ti-dm: Clear settings on probe and free
  clocksource/drivers/timer-ti-dm: Make timer_get_irq static
  clocksource/drivers/timer-ti-dm: Fix warning for omap_timer_match
  clocksource/drivers/arm_arch_timer: Fix XGene-1 TVAL register math error
  clocksource/drivers/timer-npcm7xx: Enable timer 1 clock before use
  dt-bindings: timer: nuvoton,npcm7xx-timer: Allow specifying all clocks
  dt-bindings: timer: rockchip: Add rockchip,rk3128-timer
  clockevents: Repair kernel-doc for clockevent_delta2ns()
  clocksource/drivers/ingenic-ost: Define pm functions properly in platform_driver struct
  clocksource/drivers/sh_cmt: Access registers according to spec
  vdso/timens: Refactor copy-pasted find_timens_vvar_page() helper into one copy
  Bluetooth: hci_qca: Fix the teardown problem for real
  timers: Update the documentation to reflect on the new timer_shutdown() API
  timers: Provide timer_shutdown[_sync]()
  timers: Add shutdown mechanism to the internal functions
  timers: Split [try_to_]del_timer[_sync]() to prepare for shutdown mode
  ...
2022-12-12 12:52:02 -08:00
Linus Torvalds
06cff4a58e arm64 updates for 6.2
ACPI:
 	* Enable FPDT support for boot-time profiling
 	* Fix CPU PMU probing to work better with PREEMPT_RT
 	* Update SMMUv3 MSI DeviceID parsing to latest IORT spec
 	* APMT support for probing Arm CoreSight PMU devices
 
 CPU features:
 	* Advertise new SVE instructions (v2.1)
 	* Advertise range prefetch instruction
 	* Advertise CSSC ("Common Short Sequence Compression") scalar
 	  instructions, adding things like min, max, abs, popcount
 	* Enable DIT (Data Independent Timing) when running in the kernel
 	* More conversion of system register fields over to the generated
 	  header
 
 CPU misfeatures:
 	* Workaround for Cortex-A715 erratum #2645198
 
 Dynamic SCS:
 	* Support for dynamic shadow call stacks to allow switching at
 	  runtime between Clang's SCS implementation and the CPU's
 	  pointer authentication feature when it is supported (complete
 	  with scary DWARF parser!)
 
 Tracing and debug:
 	* Remove static ftrace in favour of, err, dynamic ftrace!
 	* Seperate 'struct ftrace_regs' from 'struct pt_regs' in core
 	  ftrace and existing arch code
 	* Introduce and implement FTRACE_WITH_ARGS on arm64 to replace
 	  the old FTRACE_WITH_REGS
 	* Extend 'crashkernel=' parameter with default value and fallback
 	  to placement above 4G physical if initial (low) allocation
 	  fails
 
 SVE:
 	* Optimisation to avoid disabling SVE unconditionally on syscall
 	  entry and just zeroing the non-shared state on return instead
 
 Exceptions:
 	* Rework of undefined instruction handling to avoid serialisation
 	  on global lock (this includes emulation of user accesses to the
 	  ID registers)
 
 Perf and PMU:
 	* Support for TLP filters in Hisilicon's PCIe PMU device
 	* Support for the DDR PMU present in Amlogic Meson G12 SoCs
 	* Support for the terribly-named "CoreSight PMU" architecture
 	  from Arm (and Nvidia's implementation of said architecture)
 
 Misc:
 	* Tighten up our boot protocol for systems with memory above
           52 bits physical
 	* Const-ify static keys to satisty jump label asm constraints
 	* Trivial FFA driver cleanups in preparation for v1.1 support
 	* Export the kernel_neon_* APIs as GPL symbols
 	* Harden our instruction generation routines against
 	  instrumentation
 	* A bunch of robustness improvements to our arch-specific selftests
 	* Minor cleanups and fixes all over (kbuild, kprobes, kfence, PMU, ...)
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmOPLFAQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNPRcCACLyDTvkimiqfoPxzzgdkx/6QOvw9s3/mXg
 UcTORSZBR1VnYkiMYEKVz/tTfG99dnWtD8/0k/rz48NbhBfsF2sN4ukyBBXVf0zR
 fjnaVyVC11LUgBgZKPo6maV+jf/JWf9hJtpPl06KTiPb2Hw2JX4DXg+PeF8t2hGx
 NLH4ekQOrlDM8mlsN5mc0YsHbiuO7Xe/NRuet8TsgU4bEvLAwO6bzOLVUMqDQZNq
 bQe2ENcGVAzAf7iRJb38lj9qB/5hrQTHRXqLXMSnJyyVjQEwYca0PeJMa7x30bXF
 ZZ+xQ8Wq0mxiffZraf6SE34yD4gaYS4Fziw7rqvydC15vYhzJBH1
 =hV+2
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "The highlights this time are support for dynamically enabling and
  disabling Clang's Shadow Call Stack at boot and a long-awaited
  optimisation to the way in which we handle the SVE register state on
  system call entry to avoid taking unnecessary traps from userspace.

  Summary:

  ACPI:
   - Enable FPDT support for boot-time profiling
   - Fix CPU PMU probing to work better with PREEMPT_RT
   - Update SMMUv3 MSI DeviceID parsing to latest IORT spec
   - APMT support for probing Arm CoreSight PMU devices

  CPU features:
   - Advertise new SVE instructions (v2.1)
   - Advertise range prefetch instruction
   - Advertise CSSC ("Common Short Sequence Compression") scalar
     instructions, adding things like min, max, abs, popcount
   - Enable DIT (Data Independent Timing) when running in the kernel
   - More conversion of system register fields over to the generated
     header

  CPU misfeatures:
   - Workaround for Cortex-A715 erratum #2645198

  Dynamic SCS:
   - Support for dynamic shadow call stacks to allow switching at
     runtime between Clang's SCS implementation and the CPU's pointer
     authentication feature when it is supported (complete with scary
     DWARF parser!)

  Tracing and debug:
   - Remove static ftrace in favour of, err, dynamic ftrace!
   - Seperate 'struct ftrace_regs' from 'struct pt_regs' in core ftrace
     and existing arch code
   - Introduce and implement FTRACE_WITH_ARGS on arm64 to replace the
     old FTRACE_WITH_REGS
   - Extend 'crashkernel=' parameter with default value and fallback to
     placement above 4G physical if initial (low) allocation fails

  SVE:
   - Optimisation to avoid disabling SVE unconditionally on syscall
     entry and just zeroing the non-shared state on return instead

  Exceptions:
   - Rework of undefined instruction handling to avoid serialisation on
     global lock (this includes emulation of user accesses to the ID
     registers)

  Perf and PMU:
   - Support for TLP filters in Hisilicon's PCIe PMU device
   - Support for the DDR PMU present in Amlogic Meson G12 SoCs
   - Support for the terribly-named "CoreSight PMU" architecture from
     Arm (and Nvidia's implementation of said architecture)

  Misc:
   - Tighten up our boot protocol for systems with memory above 52 bits
     physical
   - Const-ify static keys to satisty jump label asm constraints
   - Trivial FFA driver cleanups in preparation for v1.1 support
   - Export the kernel_neon_* APIs as GPL symbols
   - Harden our instruction generation routines against instrumentation
   - A bunch of robustness improvements to our arch-specific selftests
   - Minor cleanups and fixes all over (kbuild, kprobes, kfence, PMU, ...)"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (151 commits)
  arm64: kprobes: Return DBG_HOOK_ERROR if kprobes can not handle a BRK
  arm64: kprobes: Let arch do_page_fault() fix up page fault in user handler
  arm64: Prohibit instrumentation on arch_stack_walk()
  arm64:uprobe fix the uprobe SWBP_INSN in big-endian
  arm64: alternatives: add __init/__initconst to some functions/variables
  arm_pmu: Drop redundant armpmu->map_event() in armpmu_event_init()
  kselftest/arm64: Allow epoll_wait() to return more than one result
  kselftest/arm64: Don't drain output while spawning children
  kselftest/arm64: Hold fp-stress children until they're all spawned
  arm64/sysreg: Remove duplicate definitions from asm/sysreg.h
  arm64/sysreg: Convert ID_DFR1_EL1 to automatic generation
  arm64/sysreg: Convert ID_DFR0_EL1 to automatic generation
  arm64/sysreg: Convert ID_AFR0_EL1 to automatic generation
  arm64/sysreg: Convert ID_MMFR5_EL1 to automatic generation
  arm64/sysreg: Convert MVFR2_EL1 to automatic generation
  arm64/sysreg: Convert MVFR1_EL1 to automatic generation
  arm64/sysreg: Convert MVFR0_EL1 to automatic generation
  arm64/sysreg: Convert ID_PFR2_EL1 to automatic generation
  arm64/sysreg: Convert ID_PFR1_EL1 to automatic generation
  arm64/sysreg: Convert ID_PFR0_EL1 to automatic generation
  ...
2022-12-12 09:50:05 -08:00
Rafael J. Wysocki
45494d77f2 Merge branches 'acpi-scan', 'acpi-bus', 'acpi-tables' and 'acpi-sysfs'
Merge ACPI changes related to device enumeration, device object
managenet, operation region handling, table parsing and sysfs
interface:

 - Use ZERO_PAGE(0) instead of empty_zero_page in the ACPI device
   enumeration code (Giulio Benetti).

 - Change the return type of the ACPI driver remove callback to void and
   update its users accordingly (Dawei Li).

 - Add general support for FFH address space type and implement the low-
   level part of it for ARM64 (Sudeep Holla).

 - Fix stale comments in the ACPI tables parsing code and make it print
   more messages related to MADT (Hanjun Guo, Huacai Chen).

 - Replace invocations of generic library functions with more kernel-
   specific counterparts in the ACPI sysfs interface (Christophe JAILLET,
   Xu Panda).

* acpi-scan:
  ACPI: scan: substitute empty_zero_page with helper ZERO_PAGE(0)

* acpi-bus:
  ACPI: FFH: Silence missing prototype warnings
  ACPI: make remove callback of ACPI driver void
  ACPI: bus: Fix the _OSC capability check for FFH OpRegion
  arm64: Add architecture specific ACPI FFH Opregion callbacks
  ACPI: Implement a generic FFH Opregion handler

* acpi-tables:
  ACPI: tables: Fix the stale comments for acpi_locate_initial_tables()
  ACPI: tables: Print CORE_PIC information when MADT is parsed

* acpi-sysfs:
  ACPI: sysfs: use sysfs_emit() to instead of scnprintf()
  ACPI: sysfs: Use kstrtobool() instead of strtobool()
2022-12-12 14:55:44 +01:00
Ard Biesheuvel
e8dfdf3162 arm64: efi: Recover from synchronous exceptions occurring in firmware
Unlike x86, which has machinery to deal with page faults that occur
during the execution of EFI runtime services, arm64 has nothing like
that, and a synchronous exception raised by firmware code brings down
the whole system.

With more EFI based systems appearing that were not built to run Linux
(such as the Windows-on-ARM laptops based on Qualcomm SOCs), as well as
the introduction of PRM (platform specific firmware routines that are
callable just like EFI runtime services), we are more likely to run into
issues of this sort, and it is much more likely that we can identify and
work around such issues if they don't bring down the system entirely.

Since we already use a EFI runtime services call wrapper in assembler,
we can quite easily add some code that captures the execution state at
the point where the call is made, allowing us to revert to this state
and proceed execution if the call triggered a synchronous exception.

Given that the kernel and the firmware don't share any data structures
that could end up in an indeterminate state, we can happily continue
running, as long as we mark the EFI runtime services as unavailable from
that point on.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2022-12-08 18:33:34 +01:00
Ard Biesheuvel
ff7a167961 arm64: efi: Execute runtime services from a dedicated stack
With the introduction of PRMT in the ACPI subsystem, the EFI rts
workqueue is no longer the only caller of efi_call_virt_pointer() in the
kernel. This means the EFI runtime services lock is no longer sufficient
to manage concurrent calls into firmware, but also that firmware calls
may occur that are not marshalled via the workqueue mechanism, but
originate directly from the caller context.

For added robustness, and to ensure that the runtime services have 8 KiB
of stack space available as per the EFI spec, introduce a spinlock
protected EFI runtime stack of 8 KiB, where the spinlock also ensures
serialization between the EFI rts workqueue (which itself serializes EFI
runtime calls) and other callers of efi_call_virt_pointer().

While at it, use the stack pivot to avoid reloading the shadow call
stack pointer from the ordinary stack, as doing so could produce a
gadget to defeat it.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-12-08 08:59:42 +01:00
Ard Biesheuvel
d9f26ae731 Linux 6.1-rc8
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmONI6weHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG9xgH/jqXGuMoO1ikfmGb
 7oY0W/f69G9V/e0DxFLvnIjhFgCUzdnNsmD4jQJA4x6QsxwLWuvpI282Ez+bHV5T
 U4RPsxJZIIMsXE2lKM9BRgeLzDdCt0aK4Pj+3x2x7NZC5cWFSQ8PyQJkCwg+0PQo
 u8Ly+GO8c4RUMf4/rrAZQq16qZUqGDaGm1EJhtSoa+KiR81LmUUmbDIK9Mr53rmQ
 wou+95XhibwMWr17WgXA28bTgYqn9UGr67V3qvTH2LC7GW8BCoKvn+3wh6TVhlWj
 dsWplXgcOP0/OHvSC5Sb1Uibk5Gx3DlIzYa6OfNZQuZ5xmQqm9kXjW8lmYpWFHy/
 38/5HWc=
 =EuoA
 -----END PGP SIGNATURE-----

Merge tag 'v6.1-rc8' into efi/next

Linux 6.1-rc8
2022-12-07 19:08:57 +01:00
Will Deacon
5f4c374760 Merge branch 'for-next/undef-traps' into for-next/core
* for-next/undef-traps:
  arm64: armv8_deprecated: fix unused-function error
  arm64: armv8_deprecated: rework deprected instruction handling
  arm64: armv8_deprecated: move aarch32 helper earlier
  arm64: armv8_deprecated move emulation functions
  arm64: armv8_deprecated: fold ops into insn_emulation
  arm64: rework EL0 MRS emulation
  arm64: factor insn read out of call_undef_hook()
  arm64: factor out EL1 SSBS emulation hook
  arm64: split EL0/EL1 UNDEF handlers
  arm64: allow kprobes on EL0 handlers
2022-12-06 11:34:25 +00:00
Will Deacon
9d84ad425d Merge branch 'for-next/trivial' into for-next/core
* for-next/trivial:
  arm64: alternatives: add __init/__initconst to some functions/variables
  arm64/asm: Remove unused assembler DAIF save/restore macros
  arm64/kpti: Move DAIF masking to C code
  Revert "arm64/mm: Drop redundant BUG_ON(!pgtable_alloc)"
  arm64/mm: Drop unused restore_ttbr1
  arm64: alternatives: make apply_alternatives_vdso() static
  arm64/mm: Drop idmap_pg_end[] declaration
  arm64/mm: Drop redundant BUG_ON(!pgtable_alloc)
  arm64: make is_ttbrX_addr() noinstr-safe
  arm64/signal: Document our convention for choosing magic numbers
  arm64: atomics: lse: remove stale dependency on JUMP_LABEL
  arm64: paravirt: remove conduit check in has_pv_steal_clock
  arm64: entry: Fix typo
  arm64/booting: Add missing colon to FA64 entry
  arm64/mm: Drop ARM64_KERNEL_USES_PMD_MAPS
  arm64/asm: Remove unused enable_da macro
2022-12-06 11:33:29 +00:00
Will Deacon
70b1c62a67 Merge branch 'for-next/sysregs' into for-next/core
* for-next/sysregs: (39 commits)
  arm64/sysreg: Remove duplicate definitions from asm/sysreg.h
  arm64/sysreg: Convert ID_DFR1_EL1 to automatic generation
  arm64/sysreg: Convert ID_DFR0_EL1 to automatic generation
  arm64/sysreg: Convert ID_AFR0_EL1 to automatic generation
  arm64/sysreg: Convert ID_MMFR5_EL1 to automatic generation
  arm64/sysreg: Convert MVFR2_EL1 to automatic generation
  arm64/sysreg: Convert MVFR1_EL1 to automatic generation
  arm64/sysreg: Convert MVFR0_EL1 to automatic generation
  arm64/sysreg: Convert ID_PFR2_EL1 to automatic generation
  arm64/sysreg: Convert ID_PFR1_EL1 to automatic generation
  arm64/sysreg: Convert ID_PFR0_EL1 to automatic generation
  arm64/sysreg: Convert ID_ISAR6_EL1 to automatic generation
  arm64/sysreg: Convert ID_ISAR5_EL1 to automatic generation
  arm64/sysreg: Convert ID_ISAR4_EL1 to automatic generation
  arm64/sysreg: Convert ID_ISAR3_EL1 to automatic generation
  arm64/sysreg: Convert ID_ISAR2_EL1 to automatic generation
  arm64/sysreg: Convert ID_ISAR1_EL1 to automatic generation
  arm64/sysreg: Convert ID_ISAR0_EL1 to automatic generation
  arm64/sysreg: Convert ID_MMFR4_EL1 to automatic generation
  arm64/sysreg: Convert ID_MMFR3_EL1 to automatic generation
  ...
2022-12-06 11:32:25 +00:00
Will Deacon
75bc81d08f Merge branch 'for-next/sve-state' into for-next/core
* for-next/sve-state:
  arm64/fp: Use a struct to pass data to fpsimd_bind_state_to_cpu()
  arm64/sve: Leave SVE enabled on syscall if we don't context switch
  arm64/fpsimd: SME no longer requires SVE register state
  arm64/fpsimd: Load FP state based on recorded data type
  arm64/fpsimd: Stop using TIF_SVE to manage register saving in KVM
  arm64/fpsimd: Have KVM explicitly say which FP registers to save
  arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE
  KVM: arm64: Discard any SVE state when entering KVM guests
2022-12-06 11:27:28 +00:00
Will Deacon
595a121e89 Merge branch 'for-next/stacks' into for-next/core
* for-next/stacks:
  arm64: move on_thread_stack() to <asm/stacktrace.h>
  arm64: remove current_top_of_stack()
2022-12-06 11:26:40 +00:00
Will Deacon
10162e78ea Merge branch 'for-next/perf' into for-next/core
* for-next/perf: (21 commits)
  arm_pmu: Drop redundant armpmu->map_event() in armpmu_event_init()
  drivers/perf: hisi: Add TLP filter support
  Documentation: perf: Indent filter options list of hisi-pcie-pmu
  docs: perf: Fix PMU instance name of hisi-pcie-pmu
  drivers/perf: hisi: Fix some event id for hisi-pcie-pmu
  arm64/perf: Replace PMU version number '0' with ID_AA64DFR0_EL1_PMUVer_NI
  perf/amlogic: Remove unused header inclusions of  <linux/version.h>
  perf/amlogic: Fix build error for x86_64 allmodconfig
  dt-binding: perf: Add Amlogic DDR PMU
  docs/perf: Add documentation for the Amlogic G12 DDR PMU
  perf/amlogic: Add support for Amlogic meson G12 SoC DDR PMU driver
  MAINTAINERS: Update HiSilicon PMU maintainers
  perf: arm_cspmu: Fix module cyclic dependency
  perf: arm_cspmu: Fix build failure on x86_64
  perf: arm_cspmu: Fix modular builds due to missing MODULE_LICENSE()s
  perf: arm_cspmu: Add support for NVIDIA SCF and MCF attribute
  perf: arm_cspmu: Add support for ARM CoreSight PMU driver
  perf/smmuv3: Fix hotplug callback leak in arm_smmu_pmu_init()
  perf/arm_dmc620: Fix hotplug callback leak in dmc620_pmu_init()
  drivers: perf: marvell_cn10k: Fix hotplug callback leak in tad_pmu_init()
  ...
2022-12-06 11:22:48 +00:00
Will Deacon
37f5d61a96 Merge branch 'for-next/kprobes' into for-next/core
* for-next/kprobes:
  arm64: kprobes: Return DBG_HOOK_ERROR if kprobes can not handle a BRK
  arm64: kprobes: Let arch do_page_fault() fix up page fault in user handler
  arm64: Prohibit instrumentation on arch_stack_walk()
2022-12-06 11:20:21 +00:00
Will Deacon
586e1ad9af Merge branch 'for-next/insn' into for-next/core
* for-next/insn:
  arm64:uprobe fix the uprobe SWBP_INSN in big-endian
  arm64: insn: always inline hint generation
  arm64: insn: simplify insn group identification
  arm64: insn: always inline predicates
  arm64: insn: remove aarch64_insn_gen_prefetch()
2022-12-06 11:14:25 +00:00
Will Deacon
a4aebff7ef Merge branch 'for-next/ftrace' into for-next/core
* for-next/ftrace:
  ftrace: arm64: remove static ftrace
  ftrace: arm64: move from REGS to ARGS
  ftrace: abstract DYNAMIC_FTRACE_WITH_ARGS accesses
  ftrace: rename ftrace_instruction_pointer_set() -> ftrace_regs_set_instruction_pointer()
  ftrace: pass fregs to arch_ftrace_set_direct_caller()
2022-12-06 11:07:39 +00:00
Will Deacon
1a916ed79b Merge branch 'for-next/fpsimd' into for-next/core
* for-next/fpsimd:
  arm64/fpsimd: Make kernel_neon_ API _GPL
2022-12-06 11:06:47 +00:00
Will Deacon
f455fb65b4 Merge branch 'for-next/errata' into for-next/core
* for-next/errata:
  arm64: errata: Workaround possible Cortex-A715 [ESR|FAR]_ELx corruption
  arm64: Add Cortex-715 CPU part definition
2022-12-06 11:04:47 +00:00
Will Deacon
f6ffa4c8c1 Merge branch 'for-next/dynamic-scs' into for-next/core
* for-next/dynamic-scs:
  arm64: implement dynamic shadow call stack for Clang
  scs: add support for dynamic shadow call stacks
  arm64: unwind: add asynchronous unwind tables to kernel and modules
2022-12-06 11:01:49 +00:00
Marc Zyngier
753d734f3f Merge remote-tracking branch 'arm64/for-next/sysregs' into kvmarm-master/next
Merge arm64's sysreg repainting branch to avoid too many
ugly conflicts...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05 14:39:53 +00:00
Marc Zyngier
382b5b87a9 Merge branch kvm-arm64/mte-map-shared into kvmarm-master/next
* kvm-arm64/mte-map-shared:
  : .
  : Update the MTE support to allow the VMM to use shared mappings
  : to back the memslots exposed to MTE-enabled guests.
  :
  : Patches courtesy of Catalin Marinas and Peter Collingbourne.
  : .
  : Fix a number of issues with MTE, such as races on the tags
  : being initialised vs the PG_mte_tagged flag as well as the
  : lack of support for VM_SHARED when KVM is involved.
  :
  : Patches from Catalin Marinas and Peter Collingbourne.
  : .
  Documentation: document the ABI changes for KVM_CAP_ARM_MTE
  KVM: arm64: permit all VM_MTE_ALLOWED mappings with MTE enabled
  KVM: arm64: unify the tests for VMAs in memslots when MTE is enabled
  arm64: mte: Lock a page for MTE tag initialisation
  mm: Add PG_arch_3 page flag
  KVM: arm64: Simplify the sanitise_mte_tags() logic
  arm64: mte: Fix/clarify the PG_mte_tagged semantics
  mm: Do not enable PG_arch_2 for all 64-bit architectures

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05 14:38:24 +00:00
Marc Zyngier
cfa72993d1 Merge branch kvm-arm64/pkvm-vcpu-state into kvmarm-master/next
* kvm-arm64/pkvm-vcpu-state: (25 commits)
  : .
  : Large drop of pKVM patches from Will Deacon and co, adding
  : a private vm/vcpu state at EL2, managed independently from
  : the EL1 state. From the cover letter:
  :
  : "This is version six of the pKVM EL2 state series, extending the pKVM
  : hypervisor code so that it can dynamically instantiate and manage VM
  : data structures without the host being able to access them directly.
  : These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2
  : page-table for the MMU. The pages used to hold the hypervisor structures
  : are returned to the host when the VM is destroyed."
  : .
  KVM: arm64: Use the pKVM hyp vCPU structure in handle___kvm_vcpu_run()
  KVM: arm64: Don't unnecessarily map host kernel sections at EL2
  KVM: arm64: Explicitly map 'kvm_vgic_global_state' at EL2
  KVM: arm64: Maintain a copy of 'kvm_arm_vmid_bits' at EL2
  KVM: arm64: Unmap 'kvm_arm_hyp_percpu_base' from the host
  KVM: arm64: Return guest memory from EL2 via dedicated teardown memcache
  KVM: arm64: Instantiate guest stage-2 page-tables at EL2
  KVM: arm64: Consolidate stage-2 initialisation into a single function
  KVM: arm64: Add generic hyp_memcache helpers
  KVM: arm64: Provide I-cache invalidation by virtual address at EL2
  KVM: arm64: Initialise hypervisor copies of host symbols unconditionally
  KVM: arm64: Add per-cpu fixmap infrastructure at EL2
  KVM: arm64: Instantiate pKVM hypervisor VM and vCPU structures from EL1
  KVM: arm64: Add infrastructure to create and track pKVM instances at EL2
  KVM: arm64: Rename 'host_kvm' to 'host_mmu'
  KVM: arm64: Add hyp_spinlock_t static initializer
  KVM: arm64: Include asm/kvm_mmu.h in nvhe/mem_protect.h
  KVM: arm64: Add helpers to pin memory shared with the hypervisor at EL2
  KVM: arm64: Prevent the donation of no-map pages
  KVM: arm64: Implement do_donate() helper for donating memory
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05 14:37:23 +00:00
Masami Hiramatsu (Google)
3b84efc066 arm64: kprobes: Return DBG_HOOK_ERROR if kprobes can not handle a BRK
Return DBG_HOOK_ERROR if kprobes can not handle a BRK because it
fails to find a kprobe corresponding to the address.

Since arm64 kprobes uses stop_machine based text patching for removing
BRK, it ensures all running kprobe_break_handler() is done at that point.
And after removing the BRK, it removes the kprobe from its hash list.
Thus, if the kprobe_break_handler() fails to find kprobe from hash list,
there is a bug.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/166994753273.439920.6629626290560350760.stgit@devnote3
Signed-off-by: Will Deacon <will@kernel.org>
2022-12-05 14:20:08 +00:00
Masami Hiramatsu (Google)
30a4215523 arm64: kprobes: Let arch do_page_fault() fix up page fault in user handler
Since arm64's do_page_fault() can handle the page fault correctly
than kprobe_fault_handler() according to the context, let it handle
the page fault instead of simply call fixup_exception() in the
kprobe_fault_handler().

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/166994752269.439920.4801339965959400456.stgit@devnote3
Signed-off-by: Will Deacon <will@kernel.org>
2022-12-05 14:20:08 +00:00
Masami Hiramatsu (Google)
0fbcd8abf3 arm64: Prohibit instrumentation on arch_stack_walk()
Mark arch_stack_walk() as noinstr instead of notrace and inline functions
called from arch_stack_walk() as __always_inline so that user does not
put any instrumentations on it, because this function can be used from
return_address() which is used by lockdep.

Without this, if the kernel built with CONFIG_LOCKDEP=y, just probing
arch_stack_walk() via <tracefs>/kprobe_events will crash the kernel on
arm64.

 # echo p arch_stack_walk >> ${TRACEFS}/kprobe_events
 # echo 1 > ${TRACEFS}/events/kprobes/enable
  kprobes: Failed to recover from reentered kprobes.
  kprobes: Dump kprobe:
  .symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
  ------------[ cut here ]------------
  kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
  kprobes: Failed to recover from reentered kprobes.
  kprobes: Dump kprobe:
  .symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
  ------------[ cut here ]------------
  kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
  PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 17 Comm: migration/0 Tainted: G                 N 6.1.0-rc5+ #6
  Hardware name: linux,dummy-virt (DT)
  Stopper: 0x0 <- 0x0
  pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : kprobe_breakpoint_handler+0x178/0x17c
  lr : kprobe_breakpoint_handler+0x178/0x17c
  sp : ffff8000080d3090
  x29: ffff8000080d3090 x28: ffff0df5845798c0 x27: ffffc4f59057a774
  x26: ffff0df5ffbba770 x25: ffff0df58f420f18 x24: ffff49006f641000
  x23: ffffc4f590579768 x22: ffff0df58f420f18 x21: ffff8000080d31c0
  x20: ffffc4f590579768 x19: ffffc4f590579770 x18: 0000000000000006
  x17: 5f6b636174735f68 x16: 637261203d207264 x15: 64612e202c30203d
  x14: 2074657366666f2e x13: 30633178302f3078 x12: 302b6b6c61775f6b
  x11: 636174735f686372 x10: ffffc4f590dc5bd8 x9 : ffffc4f58eb31958
  x8 : 00000000ffffefff x7 : ffffc4f590dc5bd8 x6 : 80000000fffff000
  x5 : 000000000000bff4 x4 : 0000000000000000 x3 : 0000000000000000
  x2 : 0000000000000000 x1 : ffff0df5845798c0 x0 : 0000000000000064
  Call trace:
  kprobes: Failed to recover from reentered kprobes.
  kprobes: Dump kprobe:
  .symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
  ------------[ cut here ]------------
  kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!

Fixes: 39ef362d2d ("arm64: Make return_address() use arch_stack_walk()")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/166994751368.439920.3236636557520824664.stgit@devnote3
Signed-off-by: Will Deacon <will@kernel.org>
2022-12-05 14:20:08 +00:00
Jisheng Zhang
67bc5b2d6d arm64: alternatives: add __init/__initconst to some functions/variables
apply_alternatives_vdso(), __apply_alternatives_multi_stop() and
kernel_alternatives are not needed after booting, so mark the two
functions as __init and the var as __initconst.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Link: https://lore.kernel.org/r/20221202161859.2228-1-jszhang@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-12-05 13:47:06 +00:00
Linus Torvalds
355479c70a Final EFI fix for v6.1
- Revert runtime service sync exception recovery on arm64
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmOIsLoACgkQw08iOZLZ
 jyQspgv7BFM/z+PXeKCPee6uGUhNCjUoKP3Pnoo9jwe3sbSVagYUZZ5/vx01Ei33
 FHDi4FeSxo+adg2tb8xf8X5tZmfHYiQM5gEqK3xZXaBSJRT7FFY9+eRL6753Qe5W
 yxn220/nI9xTErGeEAabAHIngIOpth5nGQhRDQbM8r7EZDobGUu0NrJSRVnS7IBF
 Ov14VMkfqryInnJ8tzHwP4hz40Rrkf54dgNTPyJvUxJ3LCR5TqmeRp+16CZg87By
 CnmySayLnr+YLbZjp8vmQy9eUGVf4acPoDcXfBTe/lkE1BA4Qi+Hd/TaYgE8Fj0C
 B5g+sZa3khhaQc4Oc2RMOqpxK80DK41n1Vvy1UOZGOBFgZVoxT2bmDbAPohLyU/4
 lw6LqioDR2N4ZxiANGHuOVa8so8iY+Et30fLqRkeXj1TXqC84tDoxXGhsTEjF1Ad
 Vi0GGmrj4No4l0RM+4yGZIhNMaMGd5OhGdbwAiKUaLS9UyjvfqkaLGJ9S2wYVgZj
 R1PoDRWX
 =r/C6
 -----END PGP SIGNATURE-----

Merge tag 'efi-fixes-for-v6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI fix from Ard Biesheuvel:
 "A single revert for some code that I added during this cycle. The code
  is not wrong, but it should be a bit more careful about how to handle
  the shadow call stack pointer, so it is better to revert it for now
  and bring it back later in improved form.

  Summary:

   - Revert runtime service sync exception recovery on arm64"

* tag 'efi-fixes-for-v6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  arm64: efi: Revert "Recover from synchronous exceptions ..."
2022-12-01 11:25:11 -08:00
James Morse
c6e155e8e5 arm64/sysreg: Standardise naming for MVFR2_EL1
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.

Ensure symbols for the MVFR2_EL1 register use lower-case for feature
names where the arm-arm does the same.

No functional change.

Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-16-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-12-01 15:53:14 +00:00
James Morse
d3e1aa85b1 arm64/sysreg: Standardise naming for MVFR1_EL1
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.

Ensure symbols for the MVFR1_EL1 register use lower-case for feature
names where the arm-arm does the same.

No functional change.

Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-15-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-12-01 15:53:14 +00:00
James Morse
a3aab94801 arm64/sysreg: Standardise naming for MVFR0_EL1
To convert the 32bit id registers to use the sysreg generation, they
must first have a regular pattern, to match the symbols the script
generates.

Ensure symbols for the MVFR0_EL1 register use lower-case for feature
names where the arm-arm does the same.

No functional change.

Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20221130171637.718182-14-james.morse@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-12-01 15:53:14 +00:00