IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
In system_supports_{sve,sme,sme2,fa64}() we use cpus_have_const_cap() to
check for the relevant cpucaps, but this is only necessary so that
sve_setup() and sme_setup() can run prior to alternatives being patched,
and otherwise alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
All of system_supports_{sve,sme,sme2,fa64}() will return false prior to
system cpucaps being detected. In the window between system cpucaps being
detected and patching alternatives, we need system_supports_sve() and
system_supports_sme() to run to initialize SVE and SME properties, but
all other users of system_supports_{sve,sme,sme2,fa64}() don't depend on
the relevant cpucap becoming true until alternatives are patched:
* No KVM code runs until after alternatives are patched, and so this can
safely use cpus_have_final_cap() or alternative_has_cap_*().
* The cpuid_cpu_online() callback in arch/arm64/kernel/cpuinfo.c is
registered later from cpuinfo_regs_init() as a device_initcall, and so
this can safely use cpus_have_final_cap() or alternative_has_cap_*().
* The entry, signal, and ptrace code isn't reachable until userspace has
run, and so this can safely use cpus_have_final_cap() or
alternative_has_cap_*().
* Currently perf_reg_validate() will un-reserve the PERF_REG_ARM64_VG
pseudo-register before alternatives are patched, and before
sve_setup() has run. If a sampling event is created early enough, this
would allow perf_ext_reg_value() to sample (the as-yet uninitialized)
thread_struct::vl[] prior to alternatives being patched.
It would be preferable to defer this until alternatives are patched,
and this can safely use alternative_has_cap_*().
* The context-switch code will run during this window as part of
stop_machine() used during alternatives_patch_all(), and potentially
for other work if other kernel threads are created early. No threads
require the use of SVE/SME/SME2/FA64 prior to alternatives being
patched, and it would be preferable for the related context-switch
logic to take effect after alternatives are patched so that ths is
guaranteed to see a consistent system-wide state (e.g. anything
initialized by sve_setup() and sme_setup().
This can safely ues alternative_has_cap_*().
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime. The sve_setup() and sme_setup() functions are modified to
use cpus_have_cap() directly so that they can observe the cpucaps being
set prior to alternatives being patched.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In arm64_apply_bp_hardening() we use cpus_have_const_cap() to check for
ARM64_SPECTRE_V2 , but this is not necessary and alternative_has_cap_*()
would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The cpus_have_const_cap() check in arm64_apply_bp_hardening() is
intended to avoid the overhead of looking up and invoking a per-cpu
function pointer when no branch predictor hardening is required. The
arm64_apply_bp_hardening() function itself is called in two distinct
flows:
1) When handling certain exceptions taken from EL0, where the PC could
be a TTBR1 address and hence might have trained a branch predictor.
As cpucaps are detected and alternatives are patched long before it
is possible to execute userspace, it is not necessary to use
cpus_have_const_cap() for these cases, and cpus_have_final_cap() or
alternative_has_cap() would be preferable.
2) When switching between tasks in check_and_switch_context().
This can be called before cpucaps are detected and alternatives are
patched, but this is long before the kernel mounts filesystems or
accepts any input. At this stage the kernel hasn't loaded any secrets
and there is no potential for hostile branch predictor training. Once
cpucaps have been finalized and alternatives have been patched,
switching tasks will invalidate any prior predictions. Hence it is
not necessary to use cpus_have_const_cap() for this case.
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In ssbs_thread_switch() we use cpus_have_const_cap() to check for
ARM64_SSBS, but this is not necessary and alternative_has_cap_*() would
be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The cpus_have_const_cap() check in ssbs_thread_switch() is an
optimization to avoid the overhead of
spectre_v4_enable_task_mitigation() where all CPUs implement SSBS and
naturally preserve the SSBS bit in SPSR_ELx. In the window between
detecting the ARM64_SSBS system-wide and patching alternative branches
it is benign to continue to call spectre_v4_enable_task_mitigation().
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In system_supports_mte() we use cpus_have_const_cap() to check for
ARM64_MTE, but this is not necessary and cpus_have_final_boot_cap()
would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The ARM64_MTE cpucap is a boot cpu feature which is detected and patched
early on the boot CPU under smp_prepare_boot_cpu(). In the window
between detecting the ARM64_MTE cpucap and patching alternatives,
nothing depends on the ARM64_MTE cpucap:
* The kasan_hw_tags_enabled() helper depends upon the kasan_flag_enabled
static key, which is initialized later in kasan_init_hw_tags() after
alternatives have been applied.
* No KVM code is called during this window, and KVM is not initialized
until after system cpucaps have been detected and patched. KVM code
can safely use cpus_have_final_cap() or alternative_has_cap_*().
* We don't context-switch prior to patching boot alternatives, and thus
mte_thread_switch() is not reachable during this window. Thus, we can
safely use cpus_have_final_boot_cap() or alternative_has_cap_*() in
the context-switch code.
* IRQ and FIQ are masked during this window, and we can only take SError
and Debug exceptions. SError exceptions are fatal at this point in
time, and we do not expect to take Debug exceptions, thus:
- It's fine to lave TCO set for exceptions taken during this window,
and mte_disable_tco_entry() doesn't need to do anything.
- We don't need to detect and report asynchronous tag cehck faults
during this window, and neither mte_check_tfsr_entry() nor
mte_check_tfsr_exit() need to do anything.
Since we want to report any SErrors taken during thiw window, these
cannot safely use cpus_have_final_boot_cap() or cpus_have_final_cap(),
but these can safely use alternative_has_cap_*().
* The __set_pte_at() function is not used during this window. It is
possible for this to be used on kernel mappings prior to boot cpucaps
being finalized, so this cannot safely use cpus_have_final_boot_cap()
or cpus_have_final_cap(), but this can safely use
alternative_has_cap_*().
* No userspace translation tables have been created yet, and swap has
not been initialized yet. Thus swapping is not possible and none of
the following are called:
- arch_thp_swp_supported()
- arch_prepare_to_swap()
- arch_swap_invalidate_page()
- arch_swap_invalidate_area()
- arch_swap_restore()
These can safely use system_has_final_cap() or
alternative_has_cap_*().
* The elfcore functions are only reachable after userspace is brought
up, which happens after system cpucaps have been detected and patched.
Thus the elfcore code can safely use cpus_have_final_cap() or
alternative_has_cap_*().
* Hibernation is only possible after userspace is brought up, which
happens after system cpucaps have been detected and patched. Thus the
hibernate code can safely use cpus_have_final_cap() or
alternative_has_cap_*().
* The set_tagged_addr_ctrl() function is only reachable after userspace
is brought up, which happens after system cpucaps have been detected
and patched. Thus this can safely use cpus_have_final_cap() or
alternative_has_cap_*().
* The copy_user_highpage() and copy_highpage() functions are not used
during this window, and can safely use alternative_has_cap_*().
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which avoid generating code to test the
system_cpucaps bitmap and should be better for all subsequent calls at
runtime.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We use cpus_have_const_cap() to check for ARM64_HAS_TLB_RANGE, but this
is not necessary and alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
In the window between detecting the ARM64_HAS_TLB_RANGE cpucap and
patching alternative branches, we do not perform any TLB invalidation,
and even if we were to perform TLB invalidation here it would not be
functionally necessary to optimize this by using range invalidation.
Hence there's no need to use cpus_have_const_cap(), and
alternative_has_cap_unlikely() is sufficient.
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In __delay() we use cpus_have_const_cap() to check for ARM64_HAS_WFXT,
but this is not necessary and alternative_has_cap() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The cpus_have_const_cap() check in __delay() is an optimization to use
WFIT and WFET in preference to busy-polling the counter and/or using
regular WFE and relying upon the architected timer event stream. It is
not necessary to apply this optimization in the window between detecting
the ARM64_HAS_WFXT cpucap and patching alternatives.
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In __cpu_has_rng() we use cpus_have_const_cap() to check for
ARM64_HAS_RNG, but this is not necessary and alternative_has_cap_*()
would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
In the window between detecting the ARM64_HAS_RNG cpucap and patching
alternative branches, nothing which calls __cpu_has_rng() can run, and
hence it's not necessary to use cpus_have_const_cap().
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We use cpus_have_const_cap() to check for ARM64_HAS_EPAN but this is not
necessary and alternative_has_cap() or cpus_have_cap() would be
preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The ARM64_HAS_EPAN cpucap is used to affect two things:
1) The permision bits used for userspace executable mappings, which are
chosen by adjust_protection_map(), which is an arch_initcall. This is
called after the ARM64_HAS_EPAN cpucap has been detected and
alternatives have been patched, and before any userspace translation
tables exist.
2) The handling of faults taken from (user or kernel) accesses to
userspace executable mappings in do_page_fault(). Userspace
translation tables are created after adjust_protection_map() is
called, and hence after the ARM64_HAS_EPAN cpucap has been detected
and alternatives have been patched.
Neither of these run until after ARM64_HAS_EPAN cpucap has been detected
and alternatives have been patched, and hence there's no need to use
cpus_have_const_cap(). Since adjust_protection_map() is only executed
once at boot time it would be best for it to use cpus_have_cap(), and
since do_page_fault() is executed frequently it would be best for it to
use alternatives_have_cap_unlikely().
This patch replaces the uses of cpus_have_const_cap() with
cpus_have_cap() and alternative_has_cap_unlikely(), which will avoid
generating redundant code, and should be better for all subsequent calls
at runtime. The ARM64_HAS_EPAN cpucap is added to cpucap_is_possible()
so that code can be elided entirely when this is not possible.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In system_uses_hw_pan() we use cpus_have_const_cap() to check for
ARM64_HAS_PAN, but this is only necessary so that the
system_uses_ttbr0_pan() check in setup_cpu_features() can run prior to
alternatives being patched, and otherwise this is not necessary and
alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The ARM64_HAS_PAN cpucap is used by system_uses_hw_pan() and
system_uses_ttbr0_pan() depending on whether CONFIG_ARM64_SW_TTBR0_PAN
is selected, and:
* We only use system_uses_hw_pan() directly in __sdei_handler(), which
isn't reachable until after alternatives have been patched, and for
this it is safe to use alternative_has_cap_*().
* We use system_uses_ttbr0_pan() in a few places:
- In check_and_switch_context() and cpu_uninstall_idmap(), which will
defer installing a translation table into TTBR0 when the
ARM64_HAS_PAN cpucap is not detected.
Prior to patching alternatives, all CPUs will be using init_mm with
the reserved ttbr0 translation tables install in TTBR0, so these can
safely use alternative_has_cap_*().
- In update_saved_ttbr0(), which will only save the active TTBR0 into
a per-thread variable when the ARM64_HAS_PAN cpucap is not detected.
Prior to patching alternatives, all CPUs will be using init_mm with
the reserved ttbr0 translation tables install in TTBR0, so these can
safely use alternative_has_cap_*().
- In efi_set_pgd(), which will handle check_and_switch_context()
deferring the installation of TTBR0 when TTBR0 PAN is detected.
The EFI runtime services are not initialized until after
alternatives have been patched, and so this can safely use
alternative_has_cap_*() or cpus_have_final_cap().
- In uaccess_ttbr0_disable() and uaccess_ttbr0_enable(), where we'll
avoid installing/uninstalling a translation table in TTBR0 when
ARM64_HAS_PAN is detected.
Prior to patching alternatives we will not perform any uaccess and
will not call uaccess_ttbr0_disable() or uaccess_ttbr0_enable(), and
so these can safely use alternative_has_cap_*() or
cpus_have_final_cap().
- In is_el1_permission_fault() where we will consider a translation
fault on a TTBR0 address to be a permission fault when ARM64_HAS_PAN
is not detected *and* we have set the PAN bit in the SPSR (which
tells us that in the interrupted context, TTBR0 pointed at the
reserved zero ttbr).
In the window between detecting system cpucaps and patching
alternatives we should not perform any accesses to TTBR0 addresses,
and no userspace translation tables exist until after patching
alternatives. Thus it is safe for this to use alternative_has_cap*().
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime.
So that the check for TTBR0 PAN in setup_cpu_features() can run prior to
alternatives being patched, the call to system_uses_ttbr0_pan() is
replaced with an explicit check of the ARM64_HAS_PAN bit in the
system_cpucaps bitmap.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In system_uses_irq_prio_masking() we use cpus_have_const_cap() to check
for ARM64_HAS_GIC_PRIO_MASKING, but this is not necessary and
alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
When CONFIG_ARM64_PSEUDO_NMI=y the ARM64_HAS_GIC_PRIO_MASKING cpucap is
a strict boot cpu feature which is detected and patched early on the
boot cpu, which both happen in smp_prepare_boot_cpu(). In the window
between the ARM64_HAS_GIC_PRIO_MASKING cpucap is detected and
alternatives are patched we don't run any code that depends upon the
ARM64_HAS_GIC_PRIO_MASKING cpucap:
* We leave DAIF.IF set until after boot alternatives are patched, and
interrupts are unmasked later in init_IRQ(), so we cannot reach
IRQ/FIQ entry code and will not use irqs_priority_unmasked().
* We don't call any code which uses arm_cpuidle_save_irq_context() and
arm_cpuidle_restore_irq_context() during this window.
* We don't call start_thread_common() during this window.
* The local_irq_*() code in <asm/irqflags.h> depends solely on an
alternative branch since commit:
a5f61cc636f48bdf ("arm64: irqflags: use alternative branches for pseudo-NMI logic")
... and hence will use the default (DAIF-only) masking behaviour until
alternatives are patched.
* Secondary CPUs are brought up later after alternatives are patched,
and alternatives are patched on the boot CPU immediately prior to
calling init_gic_priority_masking(), so we'll correctly initialize
interrupt masking regardless.
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which avoid generating code to test the
system_cpucaps bitmap and should be better for all subsequent calls at
runtime. As this makes system_uses_irq_prio_masking() equivalent to
__irqflags_uses_pmr(), the latter is removed and replaced with the
former for consistency.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In __cpu_suspend_exit() we use cpus_have_const_cap() to check for
ARM64_HAS_DIT but this is not necessary and cpus_have_final_cap() of
alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The ARM64_HAS_DIT cpucap is detected and patched (along with all other
cpucaps) before __cpu_suspend_exit() can run. We'll only use
__cpu_suspend_exit() as part of PSCI cpuidle or hibernation, and both of
these are intialized after system cpucaps are detected and patched: the
PSCI cpuidle driver is registered with a device_initcall, hibernation
restoration occurs in a late_initcall, and hibarnation saving is driven
by usrspace. Therefore it is not necessary to use cpus_have_const_cap(),
and using alternative_has_cap_*() or cpus_have_final_cap() is
sufficient.
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime. To clearly document the ordering relationship between
suspend/resume and alternatives patching, an explicit check for
system_capabilities_finalized() is added to cpu_suspend() along with a
comment block, which will make it easier to spot issues if code is
changed in future to allow these functions to be reached earlier.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In system_supports_cnp() we use cpus_have_const_cap() to check for
ARM64_HAS_CNP, but this is only necessary so that the cpu_enable_cnp()
callback can run prior to alternatives being patched, and otherwise this
is not necessary and alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The cpu_enable_cnp() callback is run immediately after the ARM64_HAS_CNP
cpucap is detected system-wide under setup_system_capabilities(), prior
to alternatives being patched. During this window cpu_enable_cnp() uses
cpu_replace_ttbr1() to set the CNP bit for the swapper_pg_dir in TTBR1.
No other users of the ARM64_HAS_CNP cpucap need the up-to-date value
during this window:
* As KVM isn't initialized yet, kvm_get_vttbr() isn't reachable.
* As cpuidle isn't initialized yet, __cpu_suspend_exit() isn't
reachable.
* At this point all CPUs are using the swapper_pg_dir with a reserved
ASID in TTBR1, and the idmap_pg_dir in TTBR0, so neither
check_and_switch_context() nor cpu_do_switch_mm() need to do anything
special.
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime. To allow cpu_enable_cnp() to function prior to alternatives
being patched, cpu_replace_ttbr1() is split into cpu_replace_ttbr1() and
cpu_enable_swapper_cnp(), with the former only used for early TTBR1
replacement, and the latter used by both cpu_enable_cnp() and
__cpu_suspend_exit().
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In icache_inval_all_pou() we use cpus_have_const_cap() to check for
ARM64_HAS_CACHE_DIC, but this is not necessary and
alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The cpus_have_const_cap() check in icache_inval_all_pou() is an
optimization to skip a redundant (but benign) IC IALLUIS + DSB ISH
sequence when all CPUs in the system have DIC. In the window between
detecting the ARM64_HAS_CACHE_DIC cpucap and patching alternative
branches there is only a single potential call to icache_inval_all_pou()
(in the alternatives patching itself), which there's no need to optimize
for at the expense of other callers.
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime. This also aligns better with the way we patch the assembly
cache maintenance sequences in arch/arm64/mm/cache.S.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In system_supports_bti() we use cpus_have_const_cap() to check for
ARM64_HAS_BTI, but this is not necessary and alternative_has_cap_*() or
cpus_have_final_*cap() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
When CONFIG_ARM64_BTI_KERNEL=y, the ARM64_HAS_BTI cpucap is a strict
boot cpu feature which is detected and patched early on the boot cpu.
All uses guarded by CONFIG_ARM64_BTI_KERNEL happen after the boot CPU
has detected ARM64_HAS_BTI and patched boot alternatives, and hence can
safely use alternative_has_cap_*() or cpus_have_final_boot_cap().
Regardless of CONFIG_ARM64_BTI_KERNEL, all other uses of ARM64_HAS_BTI
happen after system capabilities have been finalized and alternatives
have been patched. Hence these can safely use alternative_has_cap_*) or
cpus_have_final_cap().
This patch splits system_supports_bti() into system_supports_bti() and
system_supports_bti_kernel(), with the former handling where the cpucap
affects userspace functionality, and ther latter handling where the
cpucap affects kernel functionality. The use of cpus_have_const_cap() is
replaced by cpus_have_final_cap() in cpus_have_const_cap, and
cpus_have_final_boot_cap() in system_supports_bti_kernel(). This will
avoid generating code to test the system_cpucaps bitmap and should be
better for all subsequent calls at runtime. The use of
cpus_have_final_cap() and cpus_have_final_boot_cap() will make it easier
to spot if code is chaanged such that these run before the ARM64_HAS_BTI
cpucap is guaranteed to have been finalized.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In __tlbi_level() we use cpus_have_const_cap() to check for
ARM64_HAS_ARMv8_4_TTL, but this is not necessary and
alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
In the window between detecting the ARM64_HAS_ARMv8_4_TTL cpucap and
patching alternative branches, we do not perform any TLB invalidation,
and even if we were to perform TLB invalidation here it would not be
functionally necessary to optimize this by using the TTL hint. Hence
there's no need to use cpus_have_const_cap(), and
alternative_has_cap_unlikely() is sufficient.
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In system_supports_address_auth() and system_supports_generic_auth() we
use cpus_have_const_cap to check for ARM64_HAS_ADDRESS_AUTH and
ARM64_HAS_GENERIC_AUTH respectively, but this is not necessary and
alternative_has_cap_*() would bre preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The ARM64_HAS_ADDRESS_AUTH cpucap is a boot cpu feature which is
detected and patched early on the boot CPU before any pointer
authentication keys are enabled via their respective SCTLR_ELx.EN* bits.
Nothing which uses system_supports_address_auth() is called before the
boot alternatives are patched. Thus it is safe for
system_supports_address_auth() to use cpus_have_final_boot_cap() to
check for ARM64_HAS_ADDRESS_AUTH.
The ARM64_HAS_GENERIC_AUTH cpucap is a system feature which is detected
on all CPUs, then finalized and patched under
setup_system_capabilities(). We use system_supports_generic_auth() in a
few places:
* The pac_generic_keys_get() and pac_generic_keys_set() functions are
only reachable from system calls once userspace is up and running. As
cpucaps are finalzied long before userspace runs, these can safely use
alternative_has_cap_*() or cpus_have_final_cap().
* The ptrauth_prctl_reset_keys() function is only reachable from system
calls once userspace is up and running. As cpucaps are finalized long
before userspace runs, this can safely use alternative_has_cap_*() or
cpus_have_final_cap().
* The ptrauth_keys_install_user() function is used during
context-switch. This is called prior to alternatives being applied,
and so cannot use cpus_have_final_cap(), but as this only needs to
switch the APGA key for userspace tasks, it's safe to use
alternative_has_cap_*().
* The ptrauth_keys_init_user() function is used to initialize userspace
keys, and is only reachable after system cpucaps have been finalized
and patched. Thus this can safely use alternative_has_cap_*() or
cpus_have_final_cap().
* The system_has_full_ptr_auth() helper function is only used by KVM
code, which is only reachable after system cpucaps have been finalized
and patched. Thus this can safely use alternative_has_cap_*() or
cpus_have_final_cap().
This patch modifies system_supports_address_auth() to use
cpus_have_final_boot_cap() to check ARM64_HAS_ADDRESS_AUTH, and modifies
system_supports_generic_auth() to use alternative_has_cap_unlikely() to
check ARM64_HAS_GENERIC_AUTH. In either case this will avoid generating
code to test the system_cpucaps bitmap and should be better for all
subsequent calls at runtime. The use of cpus_have_final_boot_cap() will
make it easier to spot if code is chaanged such that these run before
the relevant cpucap is guaranteed to have been finalized.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently we have a negative cpucap which describes the *absence* of
FP/SIMD rather than *presence* of FP/SIMD. This largely works, but is
somewhat awkward relative to other cpucaps that describe the presence of
a feature, and it would be nicer to have a cpucap which describes the
presence of FP/SIMD:
* This will allow the cpucap to be treated as a standard
ARM64_CPUCAP_SYSTEM_FEATURE, which can be detected with the standard
has_cpuid_feature() function and ARM64_CPUID_FIELDS() description.
* This ensures that the cpucap will only transition from not-present to
present, reducing the risk of unintentional and/or unsafe usage of
FP/SIMD before cpucaps are finalized.
* This will allow using arm64_cpu_capabilities::cpu_enable() to enable
the use of FP/SIMD later, with FP/SIMD being disabled at boot time
otherwise. This will ensure that any unintentional and/or unsafe usage
of FP/SIMD prior to this is trapped, and will ensure that FP/SIMD is
never unintentionally enabled for userspace in mismatched big.LITTLE
systems.
This patch replaces the negative ARM64_HAS_NO_FPSIMD cpucap with a
positive ARM64_HAS_FPSIMD cpucap, making changes as described above.
Note that as FP/SIMD will now be trapped when not supported system-wide,
do_fpsimd_acc() must handle these traps in the same way as for SVE and
SME. The commentary in fpsimd_restore_current_state() is updated to
describe the new scheme.
No users of system_supports_fpsimd() need to know that FP/SIMD is
available prior to alternatives being patched, so this is updated to
use alternative_has_cap_likely() to check for the ARM64_HAS_FPSIMD
cpucap, without generating code to test the system_cpucaps bitmap.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The arm64_cpu_capabilities::cpu_enable() callbacks for SVE, SME, SME2,
and FA64 are named with an unusual "${feature}_kernel_enable" pattern
rather than the much more common "cpu_enable_${feature}". Now that we
only use these as cpu_enable() callbacks, it would be nice to have them
match the usual scheme.
This patch renames the cpu_enable() callbacks to match this scheme. At
the same time, the comment above cpu_enable_sve() is removed for
consistency with the other cpu_enable() callbacks.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Both sme2_kernel_enable() and fa64_kernel_enable() need to run after
sme_kernel_enable(). This happens to be true today as ARM64_SME has a
lower index than either ARM64_SME2 or ARM64_SME_FA64, and both functions
have a comment to this effect.
It would be nicer to have a build-time assertion like we for for
can_use_gic_priorities() and has_gic_prio_relaxed_sync(), as that way
it will be harder to miss any potential breakage.
This patch replaces the comments with build-time assertions.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When a CPUs onlined we first probe for supported features and
propetites, and then we subsequently enable features that have been
detected. This is a little problematic for SVE and SME, as some
properties (e.g. vector lengths) cannot be probed while they are
disabled. Due to this, the code probing for SVE properties has to enable
SVE for EL1 prior to proving, and the code probing for SME properties
has to enable SME for EL1 prior to probing. We never disable SVE or SME
for EL1 after probing.
It would be a little nicer to transiently enable SVE and SME during
probing, leaving them both disabled unless explicitly enabled, as this
would make it much easier to catch unintentional usage (e.g. when they
are not present system-wide).
This patch reworks the SVE and SME feature probing code to only
transiently enable support at EL1, disabling after probing is complete.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Much of the arm64 KVM code uses cpus_have_const_cap() to check for
cpucaps, but this is unnecessary and it would be preferable to use
cpus_have_final_cap().
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on when their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
KVM is initialized after cpucaps have been finalized and alternatives
have been patched. Since commit:
d86de40decaa14e6 ("arm64: cpufeature: upgrade hyp caps to final")
... use of cpus_have_const_cap() in hyp code is automatically converted
to use cpus_have_final_cap():
| static __always_inline bool cpus_have_const_cap(int num)
| {
| if (is_hyp_code())
| return cpus_have_final_cap(num);
| else if (system_capabilities_finalized())
| return __cpus_have_const_cap(num);
| else
| return cpus_have_cap(num);
| }
Thus, converting hyp code to use cpus_have_final_cap() directly will not
result in any functional change.
Non-hyp KVM code is also not executed until cpucaps have been finalized,
and it would be preferable to extent the same treatment to this code and
use cpus_have_final_cap() directly.
This patch converts instances of cpus_have_const_cap() in KVM-only code
over to cpus_have_final_cap(). As all of this code runs after cpucaps
have been finalized, there should be no functional change as a result of
this patch, but the redundant instructions generated by
cpus_have_const_cap() will be removed from the non-hyp KVM code.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The arm64_cpu_capabilities::cpu_enable callbacks are intended for
cpu-local feature enablement (e.g. poking system registers). These get
called for each online CPU when boot/system cpucaps get finalized and
enabled, and get called whenever a CPU is subsequently onlined.
For KPTI with the ARM64_UNMAP_KERNEL_AT_EL0 cpucap, we use the
kpti_install_ng_mappings() function as the cpu_enable callback. This
does a mixture of cpu-local configuration (setting VBAR_EL1 to the
appropriate trampoline vectors) and some global configuration (rewriting
the swapper page tables to sue non-glboal mappings) that must happen at
most once.
This patch splits kpti_install_ng_mappings() into a cpu-local
cpu_enable_kpti() initialization function and a system-wide
kpti_install_ng_mappings() function. The cpu_enable_kpti() function is
responsible for selecting the necessary cpu-local vectors each time a
CPU is onlined, and the kpti_install_ng_mappings() function performs the
one-time rewrite of the translation tables too use non-global mappings.
Splitting the two makes the code a bit easier to follow and also allows
the page table rewriting code to be marked as __init such that it can be
freed after use.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
For ARM64_WORKAROUND_2658417, we use a cpu_enable() callback to hide the
ID_AA64ISAR1_EL1.BF16 ID register field. This is a little awkward as
CPUs may attempt to apply the workaround concurrently, requiring that we
protect the bulk of the callback with a raw_spinlock, and requiring some
pointless work every time a CPU is subsequently hotplugged in.
This patch makes this a little simpler by handling the masking once at
boot time. A new user_feature_fixup() function is called at the start of
setup_user_features() to mask the feature, matching the style of
elf_hwcap_fixup(). The ARM64_WORKAROUND_2658417 cpucap is added to
cpucap_is_possible() so that code can be elided entirely when this is
not possible.
Note that the ARM64_WORKAROUND_2658417 capability is matched with
ERRATA_MIDR_RANGE(), which implicitly gives the capability a
ARM64_CPUCAP_LOCAL_CPU_ERRATUM type, which forbids the late onlining of
a CPU with the erratum if the erratum was not present at boot time.
Therefore this patch doesn't change the behaviour for late onlining.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently setup_cpu_features() handles a mixture of one-time kernel
feature setup (e.g. cpucaps) and one-time user feature setup (e.g. ELF
hwcaps). Subsequent patches will rework other one-time setup and expand
the logic currently in setup_cpu_features(), and in preparation for this
it would be helpful to split the kernel and user setup into separate
functions.
This patch splits setup_user_features() out of setup_cpu_features(),
with a few additional cleanups of note:
* setup_cpu_features() is renamed to setup_system_features() to make it
clear that it handles system-wide feature setup rather than cpu-local
feature setup.
* setup_system_capabilities() is folded into setup_system_features().
* Presence of TTBR0 pan is logged immediately after
update_cpu_capabilities(), so that this is guaranteed to appear
alongside all the other detected system cpucaps.
* The 'cwg' variable is removed as its value is only consumed once and
it's simpler to use cache_type_cwg() directly without assigning its
return value to a variable.
* The call to setup_user_features() is moved after alternatives are
patched, which will allow user feature setup code to depend on
alternative branches and allow for simplifications in subsequent
patches.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The cpus_have_final_cap() function can be used to test a cpucap while
also verifying that we do not consume the cpucap until system
capabilities have been finalized. It would be helpful if we could do
likewise for boot cpucaps.
This patch adds a new cpus_have_final_boot_cap() helper which can be
used to test a cpucap while also verifying that boot capabilities have
been finalized. Users will be added in subsequent patches.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Many cpucaps can only be set when certain CONFIG_* options are selected,
and we need to check the CONFIG_* option before the cap in order to
avoid generating redundant code. Due to this, we have a growing number
of helpers in <asm/cpufeature.h> of the form:
| static __always_inline bool system_supports_foo(void)
| {
| return IS_ENABLED(CONFIG_ARM64_FOO) &&
| cpus_have_const_cap(ARM64_HAS_FOO);
| }
This is unfortunate as it forces us to use cpus_have_const_cap()
unnecessarily, resulting in redundant code being generated by the
compiler. In the vast majority of cases, we only require that feature
checks indicate the presence of a feature after cpucaps have been
finalized, and so it would be sufficient to use alternative_has_cap_*().
However some code needs to handle a feature before alternatives have
been patched, and must test the system_cpucaps bitmap via
cpus_have_const_cap(). In other cases we'd like to check for
unintentional usage of a cpucap before alternatives are patched, and so
it would be preferable to use cpus_have_final_cap().
Placing the IS_ENABLED() checks in each callsite is tedious and
error-prone, and the same applies for writing wrappers for each
comination of cpucap and alternative_has_cap_*() / cpus_have_cap() /
cpus_have_final_cap(). It would be nicer if we could centralize the
knowledge of which cpucaps are possible, and have
alternative_has_cap_*(), cpus_have_cap(), and cpus_have_final_cap()
handle this automatically.
This patch adds a new cpucap_is_possible() function which will be
responsible for checking the CONFIG_* option, and updates the low-level
cpucap checks to use this. The existing CONFIG_* checks in
<asm/cpufeature.h> are moved over to cpucap_is_possible(), but the (now
trival) wrapper functions are retained for now.
There should be no functional change as a result of this patch alone.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
For clarity it would be nice to factor cpucap manipulation out of
<asm/cpufeature.h>, and the obvious place would be <asm/cpucap.h>, but
this will clash somewhat with <generated/asm/cpucaps.h>.
Rename <generated/asm/cpucaps.h> to <generated/asm/cpucap-defs.h>,
matching what we do for <generated/asm/sysreg-defs.h>, and introduce a
new <asm/cpucaps.h> which includes the generated header.
Subsequent patches will fill out <asm/cpucaps.h>.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When KPTI is in use, we cannot register a runstate region as XEN
requires that this is always a valid VA, which we cannot guarantee. Due
to this, xen_starting_cpu() must avoid registering each CPU's runstate
region, and xen_guest_init() must avoid setting up features that depend
upon it.
We tried to ensure that in commit:
f88af7229f6f22ce (" xen/arm: do not setup the runstate info page if kpti is enabled")
... where we added checks for xen_kernel_unmapped_at_usr(), which wraps
arm64_kernel_unmapped_at_el0() on arm64 and is always false on 32-bit
arm.
Unfortunately, as xen_guest_init() is an early_initcall, this happens
before secondary CPUs are booted and arm64 has finalized the
ARM64_UNMAP_KERNEL_AT_EL0 cpucap which backs
arm64_kernel_unmapped_at_el0(), and so this can subsequently be set as
secondary CPUs are onlined. On a big.LITTLE system where the boot CPU
does not require KPTI but some secondary CPUs do, this will result in
xen_guest_init() intializing features that depend on the runstate
region, and xen_starting_cpu() registering the runstate region on some
CPUs before KPTI is subsequent enabled, resulting the the problems the
aforementioned commit tried to avoid.
Handle this more robsutly by deferring the initialization of the
runstate region until secondary CPUs have been initialized and the
ARM64_UNMAP_KERNEL_AT_EL0 cpucap has been finalized. The per-cpu work is
moved into a new hotplug starting function which is registered later
when we're certain that KPTI will not be used.
Fixes: f88af7229f6f ("xen/arm: do not setup the runstate info page if kpti is enabled")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Bertrand Marquis <bertrand.marquis@arm.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We attempt to initialize each CPU's arch_timer event stream in
arch_timer_evtstrm_enable(), which we call from the
arch_timer_starting_cpu() cpu hotplug callback which is registered early
in boot. As this is registered before we initialize the system cpucaps,
the test for ARM64_HAS_ECV will always be false for CPUs present at boot
time, and will only be taken into account for CPUs onlined late
(including those which are hotplugged out and in again).
Due to this, CPUs present and boot time may not use the intended divider
and scale factor to generate the event stream, and may differ from other
CPUs.
Correct this by only initializing the event stream after cpucaps have been
finalized, registering a separate CPU hotplug callback for the event stream
configuration. Since the caps must be finalized by this point, use
cpus_have_final_cap() to verify this.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* Fix EL2 Stage-1 MMIO mappings where a random address was used
* Fix SMCCC function number comparison when the SVE hint is set
RISC-V:
* Fix KVM_GET_REG_LIST API for ISA_EXT registers
* Fix reading ISA_EXT register of a missing extension
* Fix ISA_EXT register handling in get-reg-list test
* Fix filtering of AIA registers in get-reg-list test
x86:
* Fixes for TSC_AUX virtualization
* Stop zapping page tables asynchronously, since we don't
zap them as often as before
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmUQU5YUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroNcdwf/X8eHQ5yfAE0J70xs4VZ1z7B8i77q
P54401z/q0FyQ4yyTHwbUv/FgVYscZ0efYogrkd3uuoPNtLmN2xKj1tM95A2ncP/
v318ljevZ0FWZ6J471Xu9MM3u15QmjC3Wai9z6IP4tz0S2rUhOYTJdFqlNf6gQSu
P8n9l2j3ZLAiUNizXa8M7350gCUFCBi37dvLLVTYOxbu17hZtmNjhNpz5G7YNc9y
zmJIJh30ZnMGUgMylLfcW0ZoqQFNIkNg3yyr9YjY68bTW5aspXdhp9u0zI+01xYF
sT+tOXBPPLi9MBuckX+oLMsvNXEZWxos2oMow3qziMo83neG+jU+WhjLHg==
=+sqe
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Fix EL2 Stage-1 MMIO mappings where a random address was used
- Fix SMCCC function number comparison when the SVE hint is set
RISC-V:
- Fix KVM_GET_REG_LIST API for ISA_EXT registers
- Fix reading ISA_EXT register of a missing extension
- Fix ISA_EXT register handling in get-reg-list test
- Fix filtering of AIA registers in get-reg-list test
x86:
- Fixes for TSC_AUX virtualization
- Stop zapping page tables asynchronously, since we don't zap them as
often as before"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: SVM: Do not use user return MSR support for virtualized TSC_AUX
KVM: SVM: Fix TSC_AUX virtualization setup
KVM: SVM: INTERCEPT_RDTSCP is never intercepted anyway
KVM: x86/mmu: Stop zapping invalidated TDP MMU roots asynchronously
KVM: x86/mmu: Do not filter address spaces in for_each_tdp_mmu_root_yield_safe()
KVM: x86/mmu: Open code leaf invalidation from mmu_notifier
KVM: riscv: selftests: Selectively filter-out AIA registers
KVM: riscv: selftests: Fix ISA_EXT register handling in get-reg-list
RISC-V: KVM: Fix riscv_vcpu_get_isa_ext_single() for missing extensions
RISC-V: KVM: Fix KVM_GET_REG_LIST API for ISA_EXT registers
KVM: selftests: Assert that vasprintf() is successful
KVM: arm64: nvhe: Ignore SVE hint in SMCCC function ID
KVM: arm64: Properly return allocated EL2 VA from hyp_alloc_private_va_range()
- Fix the "bytes" output of the per_cpu stat file
The tracefs/per_cpu/cpu*/stats "bytes" was giving bogus values as the
accounting was not accurate. It is suppose to show how many used bytes are
still in the ring buffer, but even when the ring buffer was empty it would
still show there were bytes used.
- Fix a bug in eventfs where reading a dynamic event directory (open) and then
creating a dynamic event that goes into that diretory screws up the accounting.
On close, the newly created event dentry will get a "dput" without ever having
a "dget" done for it. The fix is to allocate an array on dir open to save what
dentries were actually "dget" on, and what ones to "dput" on close.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZQ9wihQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6quz4AP4vSFohvmAcTzC+sKP7gMLUvEmqL76+
1pixXrQOIP5BrQEApUW3VnjqYgjZJR2ne0N4MvvmYElm/ylBhDd4JRrD3g8=
=X9wd
-----END PGP SIGNATURE-----
Merge tag 'trace-v6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix the "bytes" output of the per_cpu stat file
The tracefs/per_cpu/cpu*/stats "bytes" was giving bogus values as the
accounting was not accurate. It is suppose to show how many used
bytes are still in the ring buffer, but even when the ring buffer was
empty it would still show there were bytes used.
- Fix a bug in eventfs where reading a dynamic event directory (open)
and then creating a dynamic event that goes into that diretory screws
up the accounting.
On close, the newly created event dentry will get a "dput" without
ever having a "dget" done for it. The fix is to allocate an array on
dir open to save what dentries were actually "dget" on, and what ones
to "dput" on close.
* tag 'trace-v6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
eventfs: Remember what dentries were created on dir open
ring-buffer: Fix bytes info in per_cpu buffer stats
- Fix multiple scenarios where platform firmware defined regions fail to
be assembled by the CXL core.
- Fix a spurious driver-load failure on platforms that enable OS native
AER, but not OS native CXL error handling.
- Fix a regression detecting "poison" commands when "security" commands
are also defined.
- Fix a cxl_test regression with the move to centralize CXL port
register enumeration in the CXL core.
- Miscellaneous small fixes and cleanups
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCZQ9ZeAAKCRDfioYZHlFs
Z2b6AQDCNGMZdvJXwXW8LY/GHzJvuIWzvzSf0/Zy050Q1s4qrQEAqmmKCXtzjtMV
PQLm9o3a96Wb/zSzRZJwMVCTyXClVwg=
=sSix
-----END PGP SIGNATURE-----
Merge tag 'cxl-fixes-6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl fixes from Dan Williams:
"A collection of regression fixes, bug fixes, and some small cleanups
to the Compute Express Link code.
The regressions arrived in the v6.5 dev cycle and missed the v6.6
merge window due to my personal absences this cycle. The most
important fixes are for scenarios where the CXL subsystem fails to
parse valid region configurations established by platform firmware.
This is important because agreement between OS and BIOS on the CXL
configuration is fundamental to implementing "OS native" error
handling, i.e. address translation and component failure
identification.
Other important fixes are a driver load error when the BIOS lets the
Linux PCI core handle AER events, but not CXL memory errors.
The other fixex might have end user impact, but for now are only known
to trigger in our test/emulation environment.
Summary:
- Fix multiple scenarios where platform firmware defined regions fail
to be assembled by the CXL core.
- Fix a spurious driver-load failure on platforms that enable OS
native AER, but not OS native CXL error handling.
- Fix a regression detecting "poison" commands when "security"
commands are also defined.
- Fix a cxl_test regression with the move to centralize CXL port
register enumeration in the CXL core.
- Miscellaneous small fixes and cleanups"
* tag 'cxl-fixes-6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/acpi: Annotate struct cxl_cxims_data with __counted_by
cxl/port: Fix cxl_test register enumeration regression
cxl/region: Refactor granularity select in cxl_port_setup_targets()
cxl/region: Match auto-discovered region decoders by HPA range
cxl/mbox: Fix CEL logic for poison and security commands
cxl/pci: Replace host_bridge->native_aer with pcie_aer_is_native()
PCI/AER: Export pcie_aer_is_native()
cxl/pci: Fix appropriate checking for _OSC while handling CXL RAS registers
- fix an invalid usage of __free(kfree) leading to kfreeing an ERR_PTR()
- fix an irq domain leak in gpio-tb10x
- MAINTAINERS update
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmUPMiMACgkQEacuoBRx
13KXsQ//UB5KdrlnHSBens2XuHBhPg121di5TDjC8Iae+88vIskfdp4kahq2w20g
VYUS0DsmIkbH74d21uryzxdP/HRSOHx3G9h1fNtLMmF/YEeOhEhk2c5VlWYnV/Oe
V0qiAsqBHdsK3fsX90IeW0qoglSJXGRl89kgm/Wpgs8yt/CHWvJ/tE/bzud58gvU
lP8FipryfTjFRjn+K+tdqCVRq8qg0wlV9vQpLIjiEVyuRlYgPZdwG5srtvaP8hbE
YjN82DKW1Cr6qD72JgD8ND1vHTb2uFznYLaqhF3x+7YvuY7c0JCoa1OBBcjnPFy5
EJTJN/UCV2eQAcDoPZzssvS+xTB//lIzo76VqLHog8D20Zm4JJh/jSY7NKNyHeH8
gkPnNddVP5GKbDsimPjdztE+ciO67uCkvAqROIWpCDY0ZAwMTc4p8xsng59a838k
f2iQeO8g8GiX6z8VYEVyt4JkqLRgd7cAlqP9MdW+eXm7MTXh1h51YbBQGbIfhfde
5lPESJGV9BjQoITa7ur/bcS9zT56pTJztlYpmyqn5tF8S5jwffBsmdROad3AgBEI
XriouPnYlPk9daSa5SQdY+uvoy8hFmH2b17/xZblJ6LBCiDew7kBmkiGMLW0DRPh
TaY7YmXsXxRS5cX2cPdGAgT+u9yNPJtiY/BRJ/JzfmcBl/gURug=
=KqzF
-----END PGP SIGNATURE-----
Merge tag 'gpio-fixes-for-v6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix an invalid usage of __free(kfree) leading to kfreeing an
ERR_PTR()
- fix an irq domain leak in gpio-tb10x
- MAINTAINERS update
* tag 'gpio-fixes-for-v6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: sim: fix an invalid __free() usage
gpio: tb10x: Fix an error handling path in tb10x_gpio_probe()
MAINTAINERS: gpio-regmap: make myself a maintainer of it
cc:stable.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZQ8hRwAKCRDdBJ7gKXxA
jlK9AQDzT/FUQV3kIshsV1IwAKFcg7gtcFSN0vs+pV+e1+4tbQD/Z2OgfGFFsCSP
X6uc2cYHc9DG5/o44iFgadW8byMssQs=
=w+St
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2023-09-23-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"13 hotfixes, 10 of which pertain to post-6.5 issues. The other three
are cc:stable"
* tag 'mm-hotfixes-stable-2023-09-23-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
proc: nommu: fix empty /proc/<pid>/maps
filemap: add filemap_map_order0_folio() to handle order0 folio
proc: nommu: /proc/<pid>/maps: release mmap read lock
mm: memcontrol: fix GFP_NOFS recursion in memory.high enforcement
pidfd: prevent a kernel-doc warning
argv_split: fix kernel-doc warnings
scatterlist: add missing function params to kernel-doc
selftests/proc: fixup proc-empty-vm test after KSM changes
revert "scripts/gdb/symbols: add specific ko module load command"
selftests: link libasan statically for tests with -fsanitize=address
task_work: add kerneldoc annotation for 'data' argument
mm: page_alloc: fix CMA and HIGHATOMIC landing on the wrong buddy list
sh: mm: re-add lost __ref to ioremap_prot() to fix modpost warning
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmUOSHkACgkQiiy9cAdy
T1HAswwAmCPPUWgIiR4XqWiXOmWh60+4Q7xsmSVvRAixvTf79Tkif/1AmVYhnYls
QvnbT5TIFPtFbnWHOIYSPxkjBlVRNTgSviSdOebZAv4aP3DJ+HhWqnv39ti7DFMt
qbNn9j53czq5eLChkIiCfqbeg4I8ra+vqZeeDs5TaRFqMLZ5KlJGlVJjnf88/o8m
cCjifqq53Bq1pZeaR1Q9aiwuzsSKazs4odEeFvwFSw0tsdjEj4Rk5vJqDcMnt460
yIXzoUv3iGJ2oz7qi5uysGPOpLAbDqwbJ5+127qja3cbgHg3MJqwotheRuULYX8G
Dz9hhgu5oGlvSO+fZXnsT2bQL+oqYUyql9EU5NBTF2gPf5M3E1vQaMNPj1cueNAg
dfa3x3Ez/M1XuH2aptK3ePuPIQIZgMjpMC7BPBKaIvxtGcNxEIuL0s+TEQbjJ8R1
/ybJv2i+LYfCrnjTDvH0Y4lTS40AHIcOwywQd6rbMuStK71+B7/bNYJkdKKK9PEF
L39ZwXDj
=Q4Rt
-----END PGP SIGNATURE-----
Merge tag '6.6-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
"Six smb3 client fixes, including three for stable, from the SMB
plugfest (testing event) this week:
- Reparse point handling fix (found when investigating dir
enumeration when fifo in dir)
- Fix excessive thread creation for dir lease cleanup
- UAF fix in negotiate path
- remove duplicate error message mapping and fix confusing warning
message
- add dynamic trace point to improve debugging RDMA connection
attempts"
* tag '6.6-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb3: fix confusing debug message
smb: client: handle STATUS_IO_REPARSE_TAG_NOT_HANDLED
smb3: remove duplicate error mapping
cifs: Fix UAF in cifs_demultiplex_thread()
smb3: do not start laundromat thread when dir leases disabled
smb3: Add dynamic trace points for RDMA (smbdirect) reconnect
The code was accidentally mixing new and old style macros, update the
macros used to remove an unused function warning whilst building with
no PM enabled in the config.
Fixes: ace6d1448138 ("mfd: cs42l43: Add support for cs42l43 core driver")
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/all/20230822114914.340359-1-ckeepax@opensource.cirrus.com/
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Lee Jones <lee@kernel.org>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmUKki4WHGNoZW5odWFj
YWlAa2VybmVsLm9yZwAKCRAChivD8uImepsVEACcMAw3/Gg3ldIDlV6mWSYGn6kA
eF2Cc89q4C53CYYlYHalBqVdOObonR0g4roz385UjlGXeVtOuYzKB2DMy8GE3V7s
63Q82jpkGtgpJ9/md+2FnOoaT6CiN+kbcwdbSmEsz+9yht9IzRlO5R0urH92jwsU
wpnFzGtn1kHgGv+yC8XQDvk5ZvYiiA9bWrXiaLl+aEF0qeQBhgI+f7+Jew/VWBNR
ykH0TcOp0cjt7AqYlOHb3YXqwIO6U5sVLIfrHzCxKkrfeV/DE8J0FU3/YQ/okMr7
tjBJxS4o1UsNyT+9ItXjqYClOAy1IaW+2UmC8r2k79hZKEyicHu3/o7xpBCvoQoa
9OAKFAtO1UyX3h3uUynouaSXCuQ48GAetnkGMFuhuUVlF9Aq9OdA6lAWeuolkace
VYs3djjkAvsWq6HH2tm5lpcq8jXsbc2QRbHl+f4BGgyoXtEk7NXsqfPcvJeFDMFF
/PKYFQnPWebv4LoqxSNjN7S7S23N0k9tH+lITX8WvMJzRQaUTt4S19e7YHotNMty
UXDBIW6mjVIOT11zzNcsEzkMXA/8Q4VvZbQy67nfweg8KMLMChBdkRphK+4pOLN/
0Pvge3SAAVI/cdNWOxwqzvHvQbqVsjb4p4GmghPSLOojFPKW47ueWm8xeq/Hd05r
ssZZGOC8/H1AqDvOIw==
=EKBZ
-----END PGP SIGNATURE-----
Merge tag 'loongarch-fixes-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Fix lockdep, fix a boot failure, fix some build warnings, fix document
links, and some cleanups"
* tag 'loongarch-fixes-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
docs/zh_CN/LoongArch: Update the links of ABI
docs/LoongArch: Update the links of ABI
LoongArch: Don't inline kasan_mem_to_shadow()/kasan_shadow_to_mem()
kasan: Cleanup the __HAVE_ARCH_SHADOW_MAP usage
LoongArch: Set all reserved memblocks on Node#0 at initialization
LoongArch: Remove dead code in relocate_new_kernel
LoongArch: Use _UL() and _ULL()
LoongArch: Fix some build warnings with W=1
LoongArch: Fix lockdep static memory detection
* Return EIO on bad inputs to iomap_to_bh instead of BUGging, to deal
less poorly with block device io racing with block device resizing.
* Fix a stale page data exposure bug introduced in 6.6-rc1 when
unsharing a file range that is not in the page cache.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQQ2qTKExjcn+O1o2YRKO3ySh0YRpgUCZQsCFAAKCRBKO3ySh0YR
pkhZAP9VHpjBn95MEai0dxAVjAi8IDcfwdzBuifBWlkQwnt6MAEAiHfHEDfN23o9
4Xg9EDqa8IOSYwxphJYnYG73Luvi5QQ=
=Xtnv
-----END PGP SIGNATURE-----
Merge tag 'iomap-6.6-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull iomap fixes from Darrick Wong:
- Return EIO on bad inputs to iomap_to_bh instead of BUGging, to deal
less poorly with block device io racing with block device resizing
- Fix a stale page data exposure bug introduced in 6.6-rc1 when
unsharing a file range that is not in the page cache
* tag 'iomap-6.6-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: convert iomap_unshare_iter to use large folios
iomap: don't skip reading in !uptodate folios when unsharing a range
iomap: handle error conditions more gracefully in iomap_to_bh
- Fix KVM_GET_REG_LIST API for ISA_EXT registers
- Fix reading ISA_EXT register of a missing extension
- Fix ISA_EXT register handling in get-reg-list test
- Fix filtering of AIA registers in get-reg-list test
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmUMDssACgkQrUjsVaLH
LAckSg/+IZ5DPvPs81rUpL3i1Z5SrK4jXWL2zyvMIksBEYmFD2NPNvinVZ4Sxv6u
IzWNKJcAp4nA/+qPGPLXCURDe1W6PCDvO4SShjYm2UkPtNIfiskmFr3MunXZysgm
I7USJgj9ev+46yfOnwlYrwpZ8sQk7Z6nLTI/6Osk4Q7Sm0Vjoobh6krub7LNjeKQ
y6p+vxrXj+Owc5H9bgl0wAi6GOmOJKAM+cZU5DygQxjOgiUgNbOzsVgbLDTvExNq
gISUU4PoAO7/U1NKEaaopbe7C0KNQHTnegedtXsDzg7WTBah77/MNBt4snbfiR27
6rODklZlG/kAGIHdVtYC+zf8AfPqvGTIT8SLGmzQlyVlHujFBGn0L41NmMzW+EeA
7UpfUk8vPiiGhefBE+Ml3yqiReogo+aRhL1mZoI39rPusd7DMnwx97KpBlAcYuTI
PTgqycIMRmq2dSCHya+nrOVpwwV3Qx4G8Alpq1jOa7XDMeGMj4h521NQHjWckIK2
IJ2a0RtzB10+Z91nLV+amdAno326AnxJC7dR26O6uqVSPJy/nHE2GAc49gFKeWq6
QmzgzY1sU2Y02/TM8miyKSl3i+bpZSIPzXCKlOm1TowBKO+sfJzn/yMon9sVaVhb
4Wjgg3vgE74y9FVsL4JXR/PfrZH5Aq77J1R+/pMtsNTtVYrt1Sk=
=ytFs
-----END PGP SIGNATURE-----
Merge tag 'kvm-riscv-fixes-6.6-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv fixes for 6.6, take #1
- Fix KVM_GET_REG_LIST API for ISA_EXT registers
- Fix reading ISA_EXT register of a missing extension
- Fix ISA_EXT register handling in get-reg-list test
- Fix filtering of AIA registers in get-reg-list test
When the TSC_AUX MSR is virtualized, the TSC_AUX value is swap type "B"
within the VMSA. This means that the guest value is loaded on VMRUN and
the host value is restored from the host save area on #VMEXIT.
Since the value is restored on #VMEXIT, the KVM user return MSR support
for TSC_AUX can be replaced by populating the host save area with the
current host value of TSC_AUX. And, since TSC_AUX is not changed by Linux
post-boot, the host save area can be set once in svm_hardware_enable().
This eliminates the two WRMSR instructions associated with the user return
MSR support.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <d381de38eb0ab6c9c93dda8503b72b72546053d7.1694811272.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The checks for virtualizing TSC_AUX occur during the vCPU reset processing
path. However, at the time of initial vCPU reset processing, when the vCPU
is first created, not all of the guest CPUID information has been set. In
this case the RDTSCP and RDPID feature support for the guest is not in
place and so TSC_AUX virtualization is not established.
This continues for each vCPU created for the guest. On the first boot of
an AP, vCPU reset processing is executed as a result of an APIC INIT
event, this time with all of the guest CPUID information set, resulting
in TSC_AUX virtualization being enabled, but only for the APs. The BSP
always sees a TSC_AUX value of 0 which probably went unnoticed because,
at least for Linux, the BSP TSC_AUX value is 0.
Move the TSC_AUX virtualization enablement out of the init_vmcb() path and
into the vcpu_after_set_cpuid() path to allow for proper initialization of
the support after the guest CPUID information has been set.
With the TSC_AUX virtualization support now in the vcpu_set_after_cpuid()
path, the intercepts must be either cleared or set based on the guest
CPUID input.
Fixes: 296d5a17e793 ("KVM: SEV-ES: Use V_TSC_AUX if available instead of RDTSC/MSR_TSC_AUX intercepts")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <4137fbcb9008951ab5f0befa74a0399d2cce809a.1694811272.git.thomas.lendacky@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
svm_recalc_instruction_intercepts() is always called at least once
before the vCPU is started, so the setting or clearing of the RDTSCP
intercept can be dropped from the TSC_AUX virtualization support.
Extracted from a patch by Tom Lendacky.
Cc: stable@vger.kernel.org
Fixes: 296d5a17e793 ("KVM: SEV-ES: Use V_TSC_AUX if available instead of RDTSC/MSR_TSC_AUX intercepts")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Stop zapping invalidate TDP MMU roots via work queue now that KVM
preserves TDP MMU roots until they are explicitly invalidated. Zapping
roots asynchronously was effectively a workaround to avoid stalling a vCPU
for an extended during if a vCPU unloaded a root, which at the time
happened whenever the guest toggled CR0.WP (a frequent operation for some
guest kernels).
While a clever hack, zapping roots via an unbound worker had subtle,
unintended consequences on host scheduling, especially when zapping
multiple roots, e.g. as part of a memslot. Because the work of zapping a
root is no longer bound to the task that initiated the zap, things like
the CPU affinity and priority of the original task get lost. Losing the
affinity and priority can be especially problematic if unbound workqueues
aren't affined to a small number of CPUs, as zapping multiple roots can
cause KVM to heavily utilize the majority of CPUs in the system, *beyond*
the CPUs KVM is already using to run vCPUs.
When deleting a memslot via KVM_SET_USER_MEMORY_REGION, the async root
zap can result in KVM occupying all logical CPUs for ~8ms, and result in
high priority tasks not being scheduled in in a timely manner. In v5.15,
which doesn't preserve unloaded roots, the issues were even more noticeable
as KVM would zap roots more frequently and could occupy all CPUs for 50ms+.
Consuming all CPUs for an extended duration can lead to significant jitter
throughout the system, e.g. on ChromeOS with virtio-gpu, deleting memslots
is a semi-frequent operation as memslots are deleted and recreated with
different host virtual addresses to react to host GPU drivers allocating
and freeing GPU blobs. On ChromeOS, the jitter manifests as audio blips
during games due to the audio server's tasks not getting scheduled in
promptly, despite the tasks having a high realtime priority.
Deleting memslots isn't exactly a fast path and should be avoided when
possible, and ChromeOS is working towards utilizing MAP_FIXED to avoid the
memslot shenanigans, but KVM is squarely in the wrong. Not to mention
that removing the async zapping eliminates a non-trivial amount of
complexity.
Note, one of the subtle behaviors hidden behind the async zapping is that
KVM would zap invalidated roots only once (ignoring partial zaps from
things like mmu_notifier events). Preserve this behavior by adding a flag
to identify roots that are scheduled to be zapped versus roots that have
already been zapped but not yet freed.
Add a comment calling out why kvm_tdp_mmu_invalidate_all_roots() can
encounter invalid roots, as it's not at all obvious why zapping
invalidated roots shouldn't simply zap all invalid roots.
Reported-by: Pattara Teerapong <pteerapong@google.com>
Cc: David Stevens <stevensd@google.com>
Cc: Yiwei Zhang<zzyiwei@google.com>
Cc: Paul Hsia <paulhsia@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230916003916.2545000-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All callers except the MMU notifier want to process all address spaces.
Remove the address space ID argument of for_each_tdp_mmu_root_yield_safe()
and switch the MMU notifier to use __for_each_tdp_mmu_root_yield_safe().
Extracted out of a patch by Sean Christopherson <seanjc@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Fix an integer overflow bug when processing an fsmap call.
* Fix crash due to CPU hot remove event racing with filesystem mount
operation.
* During read-only mount, XFS does not allow the contents of the log to be
recovered when there are one or more unrecognized rcompat features in the
primary superblock, since the log might have intent items which the kernel
does not know how to process.
* During recovery of log intent items, XFS now reserves log space sufficient
for one cycle of a permanent transaction to execute. Otherwise, this could
lead to livelocks due to non-availability of log space.
* On an fs which has an ondisk unlinked inode list, trying to delete a file
or allocating an O_TMPFILE file can cause the fs to the shutdown if the
first inode in the ondisk inode list is not present in the inode cache.
The bug is solved by explicitly loading the first inode in the ondisk
unlinked inode list into the inode cache if it is not already cached.
A similar problem arises when the uncached inode is present in the middle
of the ondisk unlinked inode list. This second bug is triggered when
executing operations like quotacheck and bulkstat. In this case, XFS now
reads in the entire ondisk unlinked inode list.
* Enable LARP mode only on recent v5 filesystems.
* Fix a out of bounds memory access in scrub.
* Fix a performance bug when locating the tail of the log during mounting a
filesystem.
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQQjMC4mbgVeU7MxEIYH7y4RirJu9AUCZQkx4QAKCRAH7y4RirJu
9HrTAQD6QhvHkS43vueGOb4WISZPG/jMKJ/FjvwLZrIZ0erbJwEAtRWhClwFv3NZ
exJFtsmxrKC6Vifuo0pvfoCiK5mUvQ8=
=SrJR
-----END PGP SIGNATURE-----
Merge tag 'xfs-6.6-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Chandan Babu:
- Fix an integer overflow bug when processing an fsmap call
- Fix crash due to CPU hot remove event racing with filesystem mount
operation
- During read-only mount, XFS does not allow the contents of the log to
be recovered when there are one or more unrecognized rcompat features
in the primary superblock, since the log might have intent items
which the kernel does not know how to process
- During recovery of log intent items, XFS now reserves log space
sufficient for one cycle of a permanent transaction to execute.
Otherwise, this could lead to livelocks due to non-availability of
log space
- On an fs which has an ondisk unlinked inode list, trying to delete a
file or allocating an O_TMPFILE file can cause the fs to the shutdown
if the first inode in the ondisk inode list is not present in the
inode cache. The bug is solved by explicitly loading the first inode
in the ondisk unlinked inode list into the inode cache if it is not
already cached
A similar problem arises when the uncached inode is present in the
middle of the ondisk unlinked inode list. This second bug is
triggered when executing operations like quotacheck and bulkstat. In
this case, XFS now reads in the entire ondisk unlinked inode list
- Enable LARP mode only on recent v5 filesystems
- Fix a out of bounds memory access in scrub
- Fix a performance bug when locating the tail of the log during
mounting a filesystem
* tag 'xfs-6.6-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: use roundup_pow_of_two instead of ffs during xlog_find_tail
xfs: only call xchk_stats_merge after validating scrub inputs
xfs: require a relatively recent V5 filesystem for LARP mode
xfs: make inode unlinked bucket recovery work with quotacheck
xfs: load uncached unlinked inodes into memory on demand
xfs: reserve less log space when recovering log intent items
xfs: fix log recovery when unknown rocompat bits are set
xfs: reload entire unlinked bucket lists
xfs: allow inode inactivation during a ro mount log recovery
xfs: use i_prev_unlinked to distinguish inodes that are not on the unlinked list
xfs: remove CPU hotplug infrastructure
xfs: remove the all-mounts list
xfs: use per-mount cpumask to track nonempty percpu inodegc lists
xfs: fix an agbno overflow in __xfs_getfsmap_datadev
xfs: fix per-cpu CIL structure aggregation racing with dying cpus
xfs: fix select in config XFS_ONLINE_SCRUB_STATS
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct cxl_cxims_data.
Additionally, since the element count member must be set before accessing
the annotated flexible array member, move its initialization earlier.
[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: linux-cxl@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20230922175319.work.096-kees@kernel.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>