3946 Commits

Author SHA1 Message Date
Ard Biesheuvel
7572ac3c97 arm64: efi: Revert "Recover from synchronous exceptions ..."
This reverts commit 23715a26c8d81291, which introduced some code in
assembler that manipulates both the ordinary and the shadow call stack
pointer in a way that could potentially be taken advantage of. So let's
revert it, and do a better job the next time around.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-12-01 14:48:26 +01:00
Jann Horn
d6c494e8ee vdso/timens: Refactor copy-pasted find_timens_vvar_page() helper into one copy
find_timens_vvar_page() is not architecture-specific, as can be seen from
how all five per-architecture versions of it are the same.

(arm64, powerpc and riscv are exactly the same; x86 and s390 have two
characters difference inside a comment, less blank lines, and mark the
!CONFIG_TIME_NS version as inline.)

Refactor the five copies into a central copy in kernel/time/namespace.c.
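
For reference, the consolidated helper is roughly the following (a
sketch of the shared shape described above; see the patch for the
exact version):

  struct page *find_timens_vvar_page(struct vm_area_struct *vma)
  {
          if (likely(vma->vm_mm == current->mm))
                  return current->nsproxy->time_ns->vvar_page;

          /*
           * VM_PFNMAP | VM_IO protect .fault() handlers from being
           * called through interfaces like /proc/$pid/mem or
           * process_vm_{readv,writev}(), so a remote fault here is
           * unexpected: warn and report no page.
           */
          WARN(1, "vvar_page accessed remotely");
          return NULL;
  }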

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20221130115320.2918447-1-jannh@google.com
2022-12-01 11:35:40 +01:00
Mark Brown
1192b93ba3 arm64/fp: Use a struct to pass data to fpsimd_bind_state_to_cpu()
For reasons that are unclear to this reader, fpsimd_bind_state_to_cpu()
populates the struct fpsimd_last_state_struct that it uses to store the
active floating point state for KVM guests by passing an argument for
each member of the structure. As the richness of the architecture
increases, this results in a function with a rather large number of
arguments, which isn't ideal.

Simplify the interface by using the struct directly as the single argument
for the function, renaming it as we lift the definition into the header.
This could be built on further to reduce the work we do adding storage for
new FP state in various places but for now it just simplifies this one
interface.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221115094640.112848-9-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-29 15:01:56 +00:00
Mark Brown
8c845e2731 arm64/sve: Leave SVE enabled on syscall if we don't context switch
The syscall ABI says that the SVE register state not shared with FPSIMD
may not be preserved on syscall, and this is the only mechanism we have
in the ABI to stop tracking the extra SVE state for a process. Currently
we do this unconditionally by means of disabling SVE for the process on
syscall, causing userspace to take a trap to EL1 if it uses SVE again.
These extra traps result in a noticeable overhead for using SVE instead
of FPSIMD in some workloads, especially for simple syscalls where we can
return directly to userspace and would not otherwise need to update the
floating point registers. Tests with fp-pidbench show an approximately
70% overhead on a range of implementations when SVE is in use - while
this is an extreme and entirely artificial benchmark it is clear that
there is some useful room for improvement here.

Now that we have the ability to track the decision about what to save
separately to TIF_SVE we can improve things by leaving TIF_SVE enabled on
syscall but only saving the FPSIMD registers if we are in a syscall.
This means that if we need to restore the register state from memory
(eg, after a context switch or kernel mode NEON) we will drop TIF_SVE
and reenable traps for userspace but if we can just return to userspace
then traps will remain disabled.

Since our current implementation, and hence the ABI, has the effect of
zeroing all the SVE register state not shared with FPSIMD on syscall, we
replace the disabling of TIF_SVE with a flush of the non-shared register
state. This means that there is still some overhead for syscalls when SVE
is in use, but it is very much reduced.
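
For illustration, the resulting syscall handling looks roughly like
this (a simplified sketch with the SME handling omitted; the real code
is in arch/arm64/kernel/syscall.c):

  static inline void fp_user_discard(void)
  {
          if (!system_supports_sve())
                  return;

          if (test_thread_flag(TIF_SVE)) {
                  unsigned int sve_vq_minus_one;

                  sve_vq_minus_one =
                          sve_vq_from_vl(task_get_sve_vl(current)) - 1;

                  /*
                   * Zero the SVE state not shared with FPSIMD rather
                   * than clearing TIF_SVE, so no trap is taken on the
                   * next SVE instruction.
                   */
                  sve_flush_live(true, sve_vq_minus_one);
          }
  }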

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221115094640.112848-8-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-29 15:01:56 +00:00
Mark Brown
bbc6172eef arm64/fpsimd: SME no longer requires SVE register state
Now that we track the type of the stored register state separately to
what is active in the task, it is valid to have the FPSIMD register
state stored while in streaming mode. Remove the special case handling
for SME when setting FPSIMD register state.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221115094640.112848-7-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-29 15:01:56 +00:00
Mark Brown
a0136be443 arm64/fpsimd: Load FP state based on recorded data type
Now that we record the type of floating point register state we are
saving when we write the register state out to memory, we can use that
information when we load from memory to decide which format to load,
bringing TIF_SVE into line with what we saved rather than relying on
TIF_SVE to determine what to load.

The SME state details are already recorded directly in the saved
SVCR and handled based on the information there.

Since we are not changing any of the save paths there should be no
functional change from this patch, further patches will make use of this
to optimise and clarify the code.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221115094640.112848-6-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-29 15:01:56 +00:00
Mark Brown
62021cc36a arm64/fpsimd: Stop using TIF_SVE to manage register saving in KVM
Now that we are explicitly telling the host FP code which register state
it needs to save we can remove the manipulation of TIF_SVE from the KVM
code, simplifying it and allowing us to optimise our handling of normal
tasks. Remove the manipulation of TIF_SVE from KVM and instead rely on
to_save to ensure we save the correct data for it.

There should be no functional or performance impact from this change.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221115094640.112848-5-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-29 15:01:56 +00:00
Mark Brown
deeb8f9a80 arm64/fpsimd: Have KVM explicitly say which FP registers to save
In order to avoid needlessly saving and restoring the guest registers, KVM
relies on the host FPSIMD code to save the guest registers when we context
switch away from the guest. This is done by binding the KVM guest state to
the CPU on top of the task state that was originally there, then carefully
managing the TIF_SVE flag for the task to cause the host to save the full
SVE state when needed regardless of the needs of the host task. This works
well enough but isn't terribly direct about what is going on and makes it
much more complicated to try to optimise what we're doing with the SVE
register state.

Let's instead have KVM pass in the register state it wants saving when it
binds to the CPU. We introduce a new FP_STATE_CURRENT for use
during normal task binding to indicate that we should base our
decisions on the current task. This should not be used when
actually saving. Ideally we might want to use a separate enum for
the type to save but this enum and the enum values would then
need to be named which has problems with clarity and ambiguity.

In order to ease any future debugging that might be required this patch
does not actually update any of the decision making about what to save,
it merely starts tracking the new information and warns if the requested
state is not what we would otherwise have decided to save.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221115094640.112848-4-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-29 15:01:56 +00:00
Mark Brown
baa8515281 arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE
When we save the state for the floating point registers this can be done
in the form visible through either the FPSIMD V registers or the SVE Z and
P registers. At present we track which format is currently used based on
TIF_SVE and the SME streaming mode state but particularly in the SVE case
this limits our options for optimising things, especially around syscalls.
Introduce a new enum which we place together with saved floating point
state in both thread_struct and the KVM guest state which explicitly
states which format is active and keep it up to date when we change it.

At present we do not use this state except to verify that it has the
expected value when loading the state, future patches will introduce
functional changes.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221115094640.112848-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-29 15:01:56 +00:00
Mark Brown
93ae6b01ba KVM: arm64: Discard any SVE state when entering KVM guests
Since 8383741ab2e773a99 (KVM: arm64: Get rid of host SVE tracking/saving)
KVM has not tracked the host SVE state, relying on the fact that we
currently disable SVE whenever we perform a syscall. This may not be true
in future since performance optimisation may result in us keeping SVE
enabled in order to avoid needing to take access traps to reenable it.
Handle this by clearing TIF_SVE and converting the stored task state to
FPSIMD format when preparing to run the guest.  This is done with a new
call fpsimd_kvm_prepare() to keep the direct state manipulation
functions internal to fpsimd.c.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221115094640.112848-2-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-29 15:01:56 +00:00
Anshuman Khandual
cc91b94816 arm64/perf: Replace PMU version number '0' with ID_AA64DFR0_EL1_PMUVer_NI
__armv8pmu_probe_pmu() returns early if the detected PMU is either not
implemented or implementation defined. The ID_AA64DFR0_EL1_PMUVer value
extracted when the PMU is not implemented is '0', which can be replaced
with ID_AA64DFR0_EL1_PMUVer_NI, defined as '0b0000'.

Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20221128025449.39085-1-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-29 14:11:44 +00:00
Catalin Marinas
d77e59a8fc arm64: mte: Lock a page for MTE tag initialisation
Initialising the tags and setting PG_mte_tagged flag for a page can race
between multiple set_pte_at() on shared pages or setting the stage 2 pte
via user_mem_abort(). Introduce a new PG_mte_lock flag as PG_arch_3 and
set it before attempting page initialisation. Given that PG_mte_tagged
is never cleared for a page, consider setting this flag to mean page
unlocked and wait on this bit with acquire semantics if the page is
locked:

- try_page_mte_tagging() - lock the page for tagging, return true if it
  can be tagged, false if already tagged. No acquire semantics if it
  returns true (PG_mte_tagged not set) as there is no serialisation with
  a previous set_page_mte_tagged().

- set_page_mte_tagged() - set PG_mte_tagged with release semantics.

The two-bit locking is based on Peter Collingbourne's idea.
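
In code, the pattern is roughly (a sketch following the description
above; the real definitions live in the arm64 MTE headers):

  static inline void set_page_mte_tagged(struct page *page)
  {
          /* Release: make the tag writes visible before the flag. */
          smp_wmb();
          set_bit(PG_mte_tagged, &page->flags);
  }

  static inline bool try_page_mte_tagging(struct page *page)
  {
          /* PG_mte_lock is PG_arch_3; the first locker tags the page. */
          if (!test_and_set_bit(PG_mte_lock, &page->flags))
                  return true;

          /*
           * Already locked: wait for PG_mte_tagged with acquire
           * semantics so the tag initialisation is observed.
           */
          smp_cond_load_acquire(&page->flags,
                                VAL & (1UL << PG_mte_tagged));
          return false;
  }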

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Peter Collingbourne <pcc@google.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221104011041.290951-6-pcc@google.com
2022-11-29 09:26:07 +00:00
Catalin Marinas
e059853d14 arm64: mte: Fix/clarify the PG_mte_tagged semantics
Currently the PG_mte_tagged page flag mostly means the page contains
valid tags and it should be set after the tags have been cleared or
restored. However, in mte_sync_tags() it is set before setting the tags
to avoid, in theory, a race with concurrent mprotect(PROT_MTE) for
shared pages. However, a concurrent mprotect(PROT_MTE) with a copy on
write in another thread can cause the new page to have stale tags.
Similarly, tag reading via ptrace() can read stale tags if the
PG_mte_tagged flag is set before actually clearing/restoring the tags.

Fix the PG_mte_tagged semantics so that it is only set after the tags
have been cleared or restored. This is safe for swap restoring into a
MAP_SHARED or CoW page since the core code takes the page lock. Add two
functions to test and set the PG_mte_tagged flag with acquire and
release semantics. The downside is that concurrent mprotect(PROT_MTE) on
a MAP_SHARED page may cause tag loss. This is already the case for KVM
guests if a VMM changes the page protection while the guest triggers a
user_mem_abort().

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[pcc@google.com: fix build with CONFIG_ARM64_MTE disabled]
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221104011041.290951-3-pcc@google.com
2022-11-29 09:26:07 +00:00
Ren Zhijie
223d3a0d30 arm64: armv8_deprecated: fix unused-function error
If CONFIG_SWP_EMULATION is not set and
CONFIG_CP15_BARRIER_EMULATION is not set,
the aarch64-linux-gnu compiler complains about an unused function:

arch/arm64/kernel/armv8_deprecated.c:67:21: error: ‘aarch32_check_condition’ defined but not used [-Werror=unused-function]
 static unsigned int aarch32_check_condition(u32 opcode, u32 psr)
                     ^~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors

To fix this warning, mark aarch32_check_condition() as __maybe_unused.

Fixes: 0c5f416219da ("arm64: armv8_deprecated: move aarch32 helper earlier")
Signed-off-by: Ren Zhijie <renzhijie2@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20221124022429.19024-1-renzhijie2@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-25 12:16:22 +00:00
Mark Rutland
cfce092dae ftrace: arm64: remove static ftrace
The build test robot pointed out that there's a build failure when:

  CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y
  CONFIG_DYNAMIC_FTRACE_WITH_ARGS=n

... due to some mismatched ifdeffery, some of which checks
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS, and some of which checks
CONFIG_DYNAMIC_FTRACE_WITH_ARGS, leading to some missing definitions expected
by the core code when CONFIG_DYNAMIC_FTRACE=n and consequently
CONFIG_DYNAMIC_FTRACE_WITH_ARGS=n.

There's really not much point in supporting CONFIG_DYNAMIC_FTRACE=n (AKA
static ftrace). All supported toolchains allow us to implement
DYNAMIC_FTRACE, distributions all prefer DYNAMIC_FTRACE, and both
powerpc and s390 removed support for static ftrace in commits:

  0c0c52306f4792a4 ("powerpc: Only support DYNAMIC_FTRACE not static")
  5d6a0163494c78ad ("s390/ftrace: enforce DYNAMIC_FTRACE if FUNCTION_TRACER is selected")

... and according to Steven, static ftrace is only supported on x86 to
allow testing that the core code still functions in this configuration.

Given that, let's simplify matters by removing arm64's support for
static ftrace. This avoids the problem originally reported, and leaves
us with less code to maintain.

Fixes: 26299b3f6ba2 ("ftrace: arm64: move from REGS to ARGS")
Link: https://lore.kernel.org/r/202211212249.livTPi3Y-lkp@intel.com
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20221122163624.1225912-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-25 12:11:50 +00:00
Linus Torvalds
23a60a03d9 arm64 fixes:
- Fix a build error with CONFIG_CFI_CLANG + CONFIG_FTRACE when
   CONFIG_FUNCTION_GRAPH_TRACER is not enabled.
 
 - Fix a BUG_ON triggered by the page table checker due to incorrect
   file_map_count for non-leaf pmd/pud (the arm64
   pmd_user_accessible_page() not checking whether it's a leaf entry).

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Fix a build error with CONFIG_CFI_CLANG + CONFIG_FTRACE when
   CONFIG_FUNCTION_GRAPH_TRACER is not enabled.

 - Fix a BUG_ON triggered by the page table checker due to incorrect
   file_map_count for non-leaf pmd/pud (the arm64
   pmd_user_accessible_page() not checking whether it's a leaf entry).

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64/mm: fix incorrect file_map_count for non-leaf pmd/pud
  arm64: ftrace: Define ftrace_stub_graph only with FUNCTION_GRAPH_TRACER
2022-11-18 14:31:03 -08:00
Anshuman Khandual
44ecda71fd arm64: errata: Workaround possible Cortex-A715 [ESR|FAR]_ELx corruption
If a Cortex-A715 CPU sees a page mapping's permissions change from
executable to non-executable, it may corrupt the ESR_ELx and FAR_ELx
registers on the next instruction abort caused by a permission fault.

Only user space performs the executable to non-executable permission
transition, via the mprotect() system call, which calls the
ptep_modify_prot_start() and ptep_modify_prot_commit() helpers while
changing the page mapping. The platform code can override these helpers
via __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION.

Work around the problem by doing a break-before-make TLB invalidation
for all executable user space mappings that go through the mprotect()
system call. This overrides ptep_modify_prot_start() and
ptep_modify_prot_commit() by defining
__HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION on the platform, giving an
opportunity to intercept user space exec mappings and do the necessary
TLB invalidation. Similar interceptions are also implemented for
HugeTLB.
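
Sketched, the override looks along these lines (simplified; the
capability name follows the erratum Kconfig option, and the exact
condition is in arch/arm64/mm/mmu.c):

  pte_t ptep_modify_prot_start(struct vm_area_struct *vma,
                               unsigned long addr, pte_t *ptep)
  {
          /*
           * Break-before-make: if the old mapping was user-executable,
           * clear it and invalidate the TLB entry before the new
           * non-executable pte is written by ptep_modify_prot_commit().
           */
          if (cpus_have_const_cap(ARM64_WORKAROUND_2645198) &&
              pte_user_exec(READ_ONCE(*ptep)))
                  return ptep_clear_flush(vma, addr, ptep);

          return ptep_get_and_clear(vma->vm_mm, addr, ptep);
  }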

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20221116140915.356601-3-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-18 16:52:40 +00:00
Mark Rutland
4585a93420 arm64: move on_thread_stack() to <asm/stacktrace.h>
Currently on_thread_stack() is defined in <asm/processor.h>, depending
upon definitions from <asm/stacktrace.h> despite this header not being
included. This ends up being fragile, and any user of on_thread_stack()
must include both <asm/processor.h> and <asm/stacktrace.h>.

We organised things this way due to header dependencies back in commit:

  0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")

... but now that we no longer use current_top_of_stack(), and given that
stackleak includes <asm/stacktrace.h> via <linux/stackleak.h>, we no
longer need the definition to live in <asm/processor.h>.

Move on_thread_stack() to <asm/stacktrace.h>, where all its dependencies
are guaranteed to be defined. This requires having arm64's irq.c
explicitly include <asm/stacktrace.h>, and I've taken the opportunity to
sort the includes, which were slightly out of order.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221117120902.3974163-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-18 14:36:47 +00:00
Mark Rutland
56eea7f87f arm64: alternatives: make apply_alternatives_vdso() static
We define and use apply_alternatives_vdso() within alternative.c, and
don't provide a prototype in a header. There's no need for it to be
visible outside of alternative.c, so mark it as static.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221117131650.4056636-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-18 14:17:37 +00:00
Mark Rutland
26299b3f6b ftrace: arm64: move from REGS to ARGS
This commit replaces arm64's support for FTRACE_WITH_REGS with support
for FTRACE_WITH_ARGS. This removes some overhead and complexity, and
removes some latent issues with inconsistent presentation of struct
pt_regs (which can only be reliably saved/restored at exception
boundaries).

FTRACE_WITH_REGS has been supported on arm64 since commit:

  3b23e4991fb66f6d ("arm64: implement ftrace with regs")

As noted in the commit message, the major reasons for implementing
FTRACE_WITH_REGS were:

(1) To make it possible to use the ftrace graph tracer with pointer
    authentication, where it's necessary to snapshot/manipulate the LR
    before it is signed by the instrumented function.

(2) To make it possible to implement LIVEPATCH in future, where we need
    to hook function entry before an instrumented function manipulates
    the stack or argument registers. Practically speaking, we need to
    preserve the argument/return registers, PC, LR, and SP.

Neither of these need a struct pt_regs, and only require the set of
registers which are live at function call/return boundaries. Our calling
convention is defined by "Procedure Call Standard for the Arm® 64-bit
Architecture (AArch64)" (AKA "AAPCS64"), which can currently be found
at:

  https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst

Per AAPCS64, all function call argument and return values are held in
the following GPRs:

* X0 - X7 : parameter / result registers
* X8      : indirect result location register
* SP      : stack pointer

Additionally, at function call boundaries, the following GPRs hold
context/return information:

* X29 : frame pointer (AKA FP)
* X30 : link register (AKA LR)

... and for ftrace we need to capture the instrumented address:

 * PC  : program counter

No other GPRs are relevant, as none of the other registers hold
parameters or return values:

* X9  - X17 : temporaries, may be clobbered
* X18       : shadow call stack pointer (or temporary)
* X19 - X28 : callee saved

This patch implements FTRACE_WITH_ARGS for arm64, only saving/restoring
the minimal set of registers necessary. This is always sufficient to
manipulate control flow (e.g. for live-patching) or to manipulate
function arguments and return values.

This reduces the necessary stack usage from 336 bytes for pt_regs down
to 112 bytes for ftrace_regs + 32 bytes for two frame records, freeing
up 188 bytes. This could be reduced further with changes to the
unwinder.
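
Concretely, the 112 bytes correspond to a compact structure along
these lines (a sketch; the actual definition is in
arch/arm64/include/asm/ftrace.h):

  struct ftrace_regs {
          /* x0 - x8 */
          unsigned long regs[9];
          unsigned long __unused;

          unsigned long fp;
          unsigned long lr;

          unsigned long sp;
          unsigned long pc;
  };

i.e. 14 64-bit slots: the parameter/result registers, FP, LR, SP and
the PC of the instrumented instruction.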

As there is no longer a need to save different sets of registers for
different features, we no longer need distinct `ftrace_caller` and
`ftrace_regs_caller` trampolines. This allows the trampoline assembly to
be simpler, and simplifies code which previously had to handle the two
trampolines.

I've tested this with the ftrace selftests, where there are no
unexpected failures.

Co-developed-by: Florent Revest <revest@chromium.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Florent Revest <revest@chromium.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20221103170520.931305-5-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-18 13:56:41 +00:00
Ard Biesheuvel
977122898e Merge tag 'efi-zboot-direct-for-v6.2' into efi/next 2022-11-18 09:13:57 +01:00
Jason A. Donenfeld
8032bf1233 treewide: use get_random_u32_below() instead of deprecated function
This is a simple mechanical transformation done by:

@@
expression E;
@@
- prandom_u32_max
+ get_random_u32_below
  (E)
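
For example, a caller that previously used the deprecated helper to
pick a uniform random index (nr_slots here is a stand-in name) now
reads:

  /* before: idx = prandom_u32_max(nr_slots); */
  u32 idx = get_random_u32_below(nr_slots); /* uniform in [0, nr_slots) */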

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Reviewed-by: SeongJae Park <sj@kernel.org> # for damon
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-11-18 02:15:15 +01:00
Sergey Shtylyov
687daeeeca arm64: ptrace: user_regset_copyin_ignore() always returns 0
user_regset_copyin_ignore() always returns 0, so checking its result seems
pointless -- don't do this anymore...

Found by Linux Verification Center (linuxtesting.org) with the SVACE static
analysis tool.

Link: https://lkml.kernel.org/r/20221014212235.10770-4-s.shtylyov@omp.ru
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Helge Deller <deller@gmx.de>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-15 14:30:39 -08:00
Mark Rutland
124c49b1b5 arm64: armv8_deprecated: rework deprecated instruction handling
Support for deprecated instructions can be enabled or disabled at
runtime. To handle this, the code in armv8_deprecated.c registers and
unregisters undef_hooks, and makes cross CPU calls to configure HW
support. This is rather complicated, and the synchronization required to
make this safe ends up serializing the handling of instructions which
have been trapped.

This patch simplifies the deprecated instruction handling by removing
the dynamic registration and unregistration, and changing the trap
handling code to determine whether a handler should be invoked. This
removes the need for dynamic list management, and simplifies the locking
requirements, making it possible to handle trapped instructions entirely
in parallel.

Where changing the emulation state requires a cross-call, this is
serialized by locally disabling interrupts, ensuring that the CPU is not
left in an inconsistent state.

To simplify sysctl management, each insn_emulation is given a separate
sysctl table, permitting these to be registered separately. The core
sysctl code will iterate over all of these when walking sysfs.

I've tested this with userspace programs which use each of the
deprecated instructions, and I've concurrently modified the support
level for each of the features back-and-forth between HW and emulated to
check that there are no spurious SIGILLs sent to userspace when the
support level is changed.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-10-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15 13:46:19 +00:00
Mark Rutland
0c5f416219 arm64: armv8_deprecated: move aarch32 helper earlier
Subsequent patches will rework the logic in armv8_deprecated.c.

In preparation for subsequent changes, this patch moves some shared logic
earlier in the file. This will make subsequent diffs simpler and easier to
read.

At the same time, drop the `__kprobes` annotation from
aarch32_check_condition(), as this is only used for traps from compat
userspace, and has no risk of recursion within kprobes. As this is the
last kprobes annotation in armv8_deprecated.c, we no longer need to
include <asm/kprobes.h>.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-9-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15 13:46:18 +00:00
Mark Rutland
25eeac0cfe arm64: armv8_deprecated: move emulation functions
Subsequent patches will rework the logic in armv8_deprecated.c.

In preparation for subsequent changes, this patch moves the emulation
logic earlier in the file, and moves the infrastructure later in the
file. This will make subsequent diffs simpler and easier to read.

This is purely a move. There should be no functional change as a result
of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-8-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15 13:46:18 +00:00
Mark Rutland
b4453cc8a7 arm64: armv8_deprecated: fold ops into insn_emulation
The code for emulating deprecated instructions has two related
structures: struct insn_emulation_ops and struct insn_emulation, where
each struct insn_emulation_ops is associated 1-1 with a struct
insn_emulation.

It would be simpler to combine the two into a single structure, removing
the need for (unconditional) dynamic allocation at boot time, and
simplifying some runtime pointer chasing.

This patch merges the two structures together.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-7-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15 13:46:18 +00:00
Mark Rutland
f5962add74 arm64: rework EL0 MRS emulation
On CPUs without FEAT_IDST, ID register emulation is slower than it needs
to be, as all threads contend for the same lock to perform the
emulation. This patch reworks the emulation to avoid this unnecessary
contention.

On CPUs with FEAT_IDST (which is mandatory from ARMv8.4 onwards), EL0
accesses to ID registers result in a SYS trap, and emulation of these is
handled with a sys64_hook. These hooks are statically allocated, and no
locking is required to iterate through the hooks and perform the
emulation, allowing emulation to occur in parallel with no contention.

On CPUs without FEAT_IDST, EL0 accesses to ID registers result in an
UNDEFINED exception, and emulation of these accesses is handled with an
undef_hook. When an EL0 MRS instruction is trapped to EL1, the kernel
finds the relevant handler by iterating through all of the undef_hooks,
requiring undef_lock to be held during this lookup.

This locking is only required to safely traverse the list of undef_hooks
(as it can be concurrently modified), and the actual emulation of the
MRS does not require any mutual exclusion. This locking is an
unfortunate bottleneck, especially given that MRS emulation is enabled
unconditionally and is never disabled.

This patch reworks the non-FEAT_IDST MRS emulation logic so that it can
be invoked directly from do_el0_undef(). This removes the bottleneck,
allowing MRS traps to be handled entirely in parallel, and is a stepping
stone to making all of the undef_hooks lock-free.

I've tested this in a 64-vCPU VM on a 64-CPU ThunderX2 host, with a
benchmark which spawns a number of threads which each try to read
ID_AA64ISAR0_EL1 1000000 times. This is vastly more contention than will
ever be seen in realistic usage, but clearly demonstrates the removal of
the bottleneck:

  | Threads || Time (seconds)                       |
  |         || Before           || After            |
  |         || Real   | System  || Real   | System  |
  |---------++--------+---------++--------+---------|
  |       1 ||   0.29 |    0.20 ||   0.24 |    0.12 |
  |       2 ||   0.35 |    0.51 ||   0.23 |    0.27 |
  |       4 ||   1.08 |    3.87 ||   0.24 |    0.56 |
  |       8 ||   4.31 |   33.60 ||   0.24 |    1.11 |
  |      16 ||   9.47 |  149.39 ||   0.23 |    2.15 |
  |      32 ||  19.07 |  605.27 ||   0.24 |    4.38 |
  |      64 ||  65.40 | 3609.09 ||   0.33 |   11.27 |

Aside from the speedup, there should be no functional change as a result
of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-6-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15 13:46:18 +00:00
Mark Rutland
dbfbd87efa arm64: factor insn read out of call_undef_hook()
Subsequent patches will rework EL0 UNDEF handling, removing the need for
struct undef_hook and call_undef_hook. In preparation for those changes,
this patch factors the logic for reading user instructions out of
call_undef_hook() and into a new user_insn_read() helper, matching the
style of the existing aarch64_insn_read() helper used for reading kernel
instructions.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-5-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15 13:46:18 +00:00
Mark Rutland
bff8f413c7 arm64: factor out EL1 SSBS emulation hook
Currently call_undef_hook() is used to handle UNDEFINED exceptions from
EL0 and EL1. As support for deprecated instructions may be enabled
independently, the handlers for individual instructions are organised as
a linked list of struct undef_hook which can be manipulated dynamically.
As this can be manipulated dynamically, the list is protected with a
raw_spinlock which must be acquired when handling UNDEFINED exceptions
or when manipulating the list of handlers.

This locking is unfortunate as it serialises handling of UNDEFINED
exceptions, and requires RCU to be enabled for lockdep, requiring the
use of RCU_NONIDLE() in the resume path of cpu_suspend() since commit:

  a2c42bbabbe260b7 ("arm64: spectre: Prevent lockdep splat on v4 mitigation enable path")

The list of UNDEFINED handlers largely consist of handlers for
exceptions taken from EL0, and the only handler for exceptions taken
from EL1 handles `MSR SSBS, #imm` on CPUs which feature PSTATE.SSBS but
lack the corresponding MSR (Immediate) instruction. Other than this we
never expect to take an UNDEFINED exception from EL1 in normal
operation.

This patch reworks do_el0_undef() to invoke the EL1 SSBS handler
directly, relegating call_undef_hook() to only handle EL0 UNDEFs. This
removes redundant work to iterate the list for EL1 UNDEFs, and removes
the need for locking, permitting EL1 UNDEFs to be handled in parallel
without contention.
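
The resulting dispatch is roughly (a simplified sketch; see
arch/arm64/kernel/traps.c for the real code):

  void do_el1_undef(struct pt_regs *regs, unsigned long esr)
  {
          u32 insn;

          if (aarch64_insn_read((void *)regs->pc, &insn))
                  goto out_err;

          /* The only UNDEF we expect from EL1: MSR SSBS, #imm. */
          if (try_emulate_el1_ssbs(regs, insn))
                  return;

  out_err:
          die("Oops - Undefined instruction", regs, esr);
  }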

The RCU_NONIDLE() call in cpu_suspend() will be removed in a subsequent
patch, as there are other potential issues with the use of
instrumentable code and RCU in the CPU suspend code.

I've tested this by forcing the detection of SSBS on a CPU that doesn't
have it, and verifying that the try_emulate_el1_ssbs() callback is
invoked.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15 13:46:18 +00:00
Mark Rutland
61d64a376e arm64: split EL0/EL1 UNDEF handlers
In general, exceptions taken from EL1 need to be handled separately from
exceptions taken from EL0, as the logic to handle the two cases can be
significantly divergent, and exceptions taken from EL1 typically have
more stringent requirements on locking and instrumentation.

Subsequent patches will rework the way EL1 UNDEFs are handled in order
to address longstanding soundness issues with instrumentation and RCU.
In preparation for that rework, this patch splits the existing
do_undefinstr() handler into separate do_el0_undef() and do_el1_undef()
handlers.

Prior to this patch, do_undefinstr() was marked with NOKPROBE_SYMBOL(),
preventing instrumentation via kprobes. However, do_undefinstr() invokes
other code which can be instrumented, and:

* For UNDEFINED exceptions taken from EL0, there is no risk of recursion
  within kprobes. Therefore it is safe for do_el0_undef to be
  instrumented with kprobes, and it does not need to be marked with
  NOKPROBE_SYMBOL().

* For UNDEFINED exceptions taken from EL1, either:

  (a) The exception has been taken when manipulating SSBS; these cases
      are limited and do not occur within code that can be invoked
      recursively via kprobes. Hence, in these cases instrumentation
      with kprobes is benign.

  (b) The exception has been taken for an unknown reason, as other than
      manipulating SSBS we do not expect to take UNDEFINED exceptions
      from EL1. Any handling of these exceptions is best-effort.

  ... and in either case, marking do_el1_undef() with NOKPROBE_SYMBOL()
  isn't sufficient to prevent recursion via kprobes as functions it
  calls (including die()) are instrumentable via kprobes.

  Hence, it's not worthwhile to mark do_el1_undef() with
  NOKPROBE_SYMBOL(). The same applies to do_el1_bti() and do_el1_fpac(),
  so their NOKPROBE_SYMBOL() annotations are also removed.

Aside from the new instrumentability, there should be no functional
change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15 13:46:17 +00:00
Mark Rutland
b3a0c010e9 arm64: allow kprobes on EL0 handlers
Currently do_sysinstr() and do_cp15instr() are marked with
NOKPROBE_SYMBOL(). However, these are only called for exceptions taken
from EL0, and there is no risk of recursion in kprobes, so this is not
necessary.

Remove the NOKPROBE_SYMBOL() annotation, and rename the two functions to
more clearly indicate that these are solely for exceptions taken from
EL0, better matching the names used by the lower level entry points in
entry-common.c.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-2-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15 13:46:17 +00:00
Mark Rutland
4488f90c86 arm64: insn: simplify insn group identification
The only code which needs to check for an entire instruction group is
the aarch64_insn_is_steppable() helper function used by kprobes, which
must not be instrumented, and only needs to check for the "Branch,
exception generation and system instructions" class.

Currently we have an out-of-line helper in insn.c which must be marked
as __kprobes, which indexes a table with some bits extracted from the
instruction. In aarch64_insn_is_steppable() we then need to compare the
result with an expected enum value.

It would be simpler to have a predicate for this, as with the other
aarch64_insn_is_*() helpers, which would be always inlined to prevent
inadvertent instrumentation, and would permit better code generation.

This patch adds a predicate function for this instruction group using
the existing __AARCH64_INSN_FUNCS() helpers, and removes the existing
out-of-line helper. As the only class we currently care about is the
branch+exception+sys class, I have only added helpers for this, and left
the other classes unimplemented for now.
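
The added predicate boils down to one table-free line (a sketch; the
mask/value pick out op0 bits[28:26] == 0b101, the encoding of the
branch/exception-generation/system class in the Arm ARM):

  __AARCH64_INSN_FUNCS(class_branch_sys, 0x1c000000, 0x14000000)

... which generates an always-inlined aarch64_insn_is_class_branch_sys()
for aarch64_insn_is_steppable() to call.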

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20221114135928.3000571-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15 13:07:44 +00:00
Sudeep Holla
1d280ce099 arm64: Add architecture specific ACPI FFH Opregion callbacks
FFH Operation Region space can be used to trigger SMC or HVC calls,
using the Arm SMC Calling Convention (SMCCC). The choice of conduit
(SMC or HVC) is based on what the kernel chooses based on PSCI, as with
any other users of SMCCC within the kernel.

Only function identifiers in the SMCCC SiP Service, OEM Service and FF-A
specific call ranges are allowed in FFH Opregions.

The offset can be either 0 (32-bit calling convention) or 1 (64-bit
calling convention). The length must be set to the range applicable
based on the value of the offset.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-14 19:09:07 +01:00
Sami Tolvanen
2598ac6ec4 arm64: ftrace: Define ftrace_stub_graph only with FUNCTION_GRAPH_TRACER
The 0-day bot reports that arm64 builds with CONFIG_CFI_CLANG +
CONFIG_FTRACE are broken when CONFIG_FUNCTION_GRAPH_TRACER is not
enabled:

 ld.lld: error: undefined symbol: __kcfi_typeid_ftrace_stub_graph
 >>> referenced by entry-ftrace.S:299 (arch/arm64/kernel/entry-ftrace.S:299)
 >>>               arch/arm64/kernel/entry-ftrace.o:(.text+0x48) in archive vmlinux.a

This is caused by ftrace_stub_graph using SYM_TYPED_FUNC_START when
the address of the function is not taken in any C translation unit.

Fix the build by only defining ftrace_stub_graph when it's actually
needed, i.e. with CONFIG_FUNCTION_GRAPH_TRACER.

Link: https://lore.kernel.org/lkml/202210251659.tRMs78RH-lkp@intel.com/
Fixes: 883bbbffa5a4 ("ftrace,kcfi: Separate ftrace_stub() and ftrace_stub_graph()")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20221109192831.3057131-1-samitolvanen@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-11-14 12:28:52 +00:00
Linus Torvalds
ab57bc6f02 Third batch of EFI fixes for v6.1
- Force the use of SetVirtualAddressMap() on Ampere Altra arm64
   machines, which crash in SetTime() if no virtual remapping is used
 - Drop a spurious warning on misaligned runtime regions when using 16k
   or 64k pages on arm64

Merge tag 'efi-fixes-for-v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI fixes from Ard Biesheuvel:

 - Force the use of SetVirtualAddressMap() on Ampere Altra arm64
   machines, which crash in SetTime() if no virtual remapping is used

   This is the first time we've added an SMBIOS based quirk on arm64,
   but fortunately, we can just call an EFI protocol to grab the type #1
   SMBIOS record when running in the stub, so we don't need all the
   machinery we have in the kernel proper to parse SMBIOS data.

 - Drop a spurious warning on misaligned runtime regions when using 16k
   or 64k pages on arm64

* tag 'efi-fixes-for-v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  arm64: efi: Fix handling of misaligned runtime regions and drop warning
  arm64: efi: Force the use of SetVirtualAddressMap() on Altra machines
2022-11-13 07:52:22 -08:00
Quentin Perret
169cd0f823 KVM: arm64: Don't unnecessarily map host kernel sections at EL2
We no longer need to map the host's '.rodata' and '.bss' sections in the
stage-1 page-table of the pKVM hypervisor at EL2, so remove those
mappings and avoid creating any future dependencies at EL2 on
host-controlled data structures.

Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-25-will@kernel.org
2022-11-11 17:19:35 +00:00
Will Deacon
73f38ef2ae KVM: arm64: Maintain a copy of 'kvm_arm_vmid_bits' at EL2
Sharing 'kvm_arm_vmid_bits' between EL1 and EL2 allows the host to
modify the variable arbitrarily, potentially leading to all sorts of
shenanigans as this is used to configure the VTTBR register for the
guest stage-2.

In preparation for unmapping host sections entirely from EL2, maintain
a copy of 'kvm_arm_vmid_bits' in the pKVM hypervisor and initialise it
from the host value while it is still trusted.

Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-23-will@kernel.org
2022-11-11 17:19:35 +00:00
Quentin Perret
fe41a7f8c0 KVM: arm64: Unmap 'kvm_arm_hyp_percpu_base' from the host
When pKVM is enabled, the hypervisor at EL2 does not trust the host at
EL1 and must therefore prevent it from having unrestricted access to
internal hypervisor state.

The 'kvm_arm_hyp_percpu_base' array holds the offsets for hypervisor
per-cpu allocations, so move this into the nVHE code where it
cannot be modified by the untrusted host at EL1.

Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-22-will@kernel.org
2022-11-11 17:19:35 +00:00
Will Deacon
13e248aab7 KVM: arm64: Provide I-cache invalidation by virtual address at EL2
In preparation for handling cache maintenance of guest pages from within
the pKVM hypervisor at EL2, introduce an EL2 copy of icache_inval_pou()
which will later be plumbed into the stage-2 page-table cache
maintenance callbacks, ensuring that the initial contents of pages
mapped as executable into the guest stage-2 page-table are visible to the
instruction fetcher.

Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-17-will@kernel.org
2022-11-11 17:16:25 +00:00
Ard Biesheuvel
9b9eaee982 arm64: efi: Fix handling of misaligned runtime regions and drop warning
Currently, when mapping the EFI runtime regions in the EFI page tables,
we complain about misaligned regions in a rather noisy way, using
WARN().

Not only does this produce a lot of irrelevant clutter in the log, it is
factually incorrect, as misaligned runtime regions are actually allowed
by the EFI spec as long as they don't require conflicting memory types
within the same 64k page.

So let's drop the warning, and tweak the code so that we
- take both the start and end of the region into account when checking
  for misalignment
- only revert to RWX mappings for non-code regions if misaligned code
  regions are also known to exist.

Cc: <stable@vger.kernel.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-10 23:14:15 +01:00
Usama Arif
1e55b44d9e arm64: paravirt: remove conduit check in has_pv_steal_clock
arm_smccc_1_1_invoke() which is called later on in the function
will return failure if there's no conduit (or pre-SMCCC 1.1),
hence the check is unnecessary.

Suggested-by: Steven Price <steven.price@arm.com>
Signed-off-by: Usama Arif <usama.arif@bytedance.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20221104061659.4116508-1-usama.arif@bytedance.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-09 18:11:56 +00:00
Ard Biesheuvel
3b619e22c4 arm64: implement dynamic shadow call stack for Clang
Implement dynamic shadow call stack support on Clang, by parsing the
unwind tables at init time to locate all occurrences of PACIASP/AUTIASP
instructions, and replacing them with the shadow call stack push and pop
instructions, respectively.

This is useful because the overhead of the shadow call stack is
difficult to justify on hardware that implements pointer authentication
(PAC), and given that the PAC instructions are executed as NOPs on
hardware that doesn't, we can just replace them without breaking
anything. As PACIASP/AUTIASP are guaranteed to be paired with respect to
manipulations of the return address, replacing them 1:1 with shadow call
stack pushes and pops is guaranteed to result in the desired behavior.
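
The 1:1 rewrite described above amounts to the following mapping
(instruction encodings per the Arm ARM; x18 holds the shadow call
stack pointer):

  paciasp (HINT #25, 0xd503233f)  ->  str x30, [x18], #8    // SCS push
  autiasp (HINT #29, 0xd50323bf)  ->  ldr x30, [x18, #-8]!  // SCS pop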

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20221027155908.1940624-4-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-09 18:06:35 +00:00
Ard Biesheuvel
68c76ad4a9 arm64: unwind: add asynchronous unwind tables to kernel and modules
Enable asynchronous unwind table generation for both the core kernel as
well as modules, and emit the resulting .eh_frame sections as init code
so we can use the unwind directives for code patching at boot or module
load time.

This will be used by dynamic shadow call stack support, which will rely
on code patching rather than compiler codegen to emit the shadow call
stack push and pop instructions.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20221027155908.1940624-2-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-09 18:06:35 +00:00
Mark Brown
d12aada8df arm64/hwcap: Add support for SVE 2.1
FEAT_SVE2p1 introduces a number of new SVE instructions. Since there is no
new architectural state added, kernel support is simply a new hwcap which
lets userspace know that the feature is supported.
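
Userspace can then probe for the feature in the usual way (a sketch;
the HWCAP2_SVE2P1 bit value below is assumed, verify it against
asm/hwcap.h):

  #include <stdio.h>
  #include <sys/auxv.h>

  #ifndef HWCAP2_SVE2P1
  #define HWCAP2_SVE2P1 (1UL << 36)  /* assumed; check asm/hwcap.h */
  #endif

  int main(void)
  {
          unsigned long hwcap2 = getauxval(AT_HWCAP2);

          printf("SVE2.1 %ssupported\n",
                 (hwcap2 & HWCAP2_SVE2P1) ? "" : "not ");
          return 0;
  }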

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20221017152520.1039165-6-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-09 17:54:53 +00:00
Mark Brown
939e4649d4 arm64/hwcap: Add support for FEAT_RPRFM
FEAT_RPRFM adds a new range prefetch hint within the existing PRFM space.
Add a new hwcap to allow userspace to discover support for the new
instruction.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20221017152520.1039165-4-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-09 17:54:53 +00:00
Mark Brown
95aa6860d6 arm64/hwcap: Add support for FEAT_CSSC
FEAT_CSSC adds a number of new instructions usable to optimise common short
sequences of instructions. Add a hwcap indicating that the feature is
available and can be used by userspace.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20221017152520.1039165-2-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-11-09 17:54:53 +00:00
Ard Biesheuvel
da8dd0c75b efi: libstub: Provide local implementations of strrchr() and memchr()
Clone the implementations of strrchr() and memchr() in lib/string.c so
we can use them in the standalone zboot decompressor app. These routines
are used by the FDT handling code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09 12:42:02 +01:00
Ard Biesheuvel
2e6fa86f2d efi: libstub: Enable efi_printk() in zboot decompressor
Split the efi_printk() routine into its own source file, and provide
local implementations of strlen() and strnlen() so that the standalone
zboot app can use efi_err() and efi_info() etc.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09 12:42:02 +01:00
Ard Biesheuvel
52dce39cd2 efi: libstub: Clone memcmp() into the stub
We will no longer be able to call into the kernel image once we merge
the decompressor with the EFI stub, so we need our own implementation of
memcmp(). Let's add the one from lib/string.c and simplify it.
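
The simplified copy is the textbook form, along these lines:

  int memcmp(const void *p, const void *q, size_t size)
  {
          const u8 *s1 = p, *s2 = q;
          int delta = 0;

          while (size-- > 0) {
                  delta = *s1++ - *s2++;
                  if (delta)
                          break;
          }
          return delta;
  }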

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-09 12:42:02 +01:00