871752 Commits

Author SHA1 Message Date
Catalin Marinas
6be22809e5 Merge branches 'for-next/elf-hwcap-docs', 'for-next/smccc-conduit-cleanup', 'for-next/zone-dma', 'for-next/relax-icc_pmr_el1-sync', 'for-next/double-page-fault', 'for-next/misc', 'for-next/kselftest-arm64-signal' and 'for-next/kaslr-diagnostics' into for-next/core
* for-next/elf-hwcap-docs:
  : Update the arm64 ELF HWCAP documentation
  docs/arm64: cpu-feature-registers: Rewrite bitfields that don't follow [e, s]
  docs/arm64: cpu-feature-registers: Documents missing visible fields
  docs/arm64: elf_hwcaps: Document HWCAP_SB
  docs/arm64: elf_hwcaps: sort the HWCAP{, 2} documentation by ascending value

* for-next/smccc-conduit-cleanup:
  : SMC calling convention conduit clean-up
  firmware: arm_sdei: use common SMCCC_CONDUIT_*
  firmware/psci: use common SMCCC_CONDUIT_*
  arm: spectre-v2: use arm_smccc_1_1_get_conduit()
  arm64: errata: use arm_smccc_1_1_get_conduit()
  arm/arm64: smccc/psci: add arm_smccc_1_1_get_conduit()

* for-next/zone-dma:
  : Reintroduction of ZONE_DMA for Raspberry Pi 4 support
  arm64: mm: reserve CMA and crashkernel in ZONE_DMA32
  dma/direct: turn ARCH_ZONE_DMA_BITS into a variable
  arm64: Make arm64_dma32_phys_limit static
  arm64: mm: Fix unused variable warning in zone_sizes_init
  mm: refresh ZONE_DMA and ZONE_DMA32 comments in 'enum zone_type'
  arm64: use both ZONE_DMA and ZONE_DMA32
  arm64: rename variables used to calculate ZONE_DMA32's size
  arm64: mm: use arm64_dma_phys_limit instead of calling max_zone_dma_phys()

* for-next/relax-icc_pmr_el1-sync:
  : Relax ICC_PMR_EL1 (GICv3) accesses when ICC_CTLR_EL1.PMHE is clear
  arm64: Document ICC_CTLR_EL3.PMHE setting requirements
  arm64: Relax ICC_PMR_EL1 accesses when ICC_CTLR_EL1.PMHE is clear

* for-next/double-page-fault:
  : Avoid a double page fault in __copy_from_user_inatomic() if hw does not support auto Access Flag
  mm: fix double page fault on arm64 if PTE_AF is cleared
  x86/mm: implement arch_faults_on_old_pte() stub on x86
  arm64: mm: implement arch_faults_on_old_pte() on arm64
  arm64: cpufeature: introduce helper cpu_has_hw_af()

* for-next/misc:
  : Various fixes and clean-ups
  arm64: kpti: Add NVIDIA's Carmel core to the KPTI whitelist
  arm64: mm: Remove MAX_USER_VA_BITS definition
  arm64: mm: simplify the page end calculation in __create_pgd_mapping()
  arm64: print additional fault message when executing non-exec memory
  arm64: psci: Reduce the waiting time for cpu_psci_cpu_kill()
  arm64: pgtable: Correct typo in comment
  arm64: docs: cpu-feature-registers: Document ID_AA64PFR1_EL1
  arm64: cpufeature: Fix typos in comment
  arm64/mm: Poison initmem while freeing with free_reserved_area()
  arm64: use generic free_initrd_mem()
  arm64: simplify syscall wrapper ifdeffery

* for-next/kselftest-arm64-signal:
  : arm64-specific kselftest support with signal-related test-cases
  kselftest: arm64: fake_sigreturn_misaligned_sp
  kselftest: arm64: fake_sigreturn_bad_size
  kselftest: arm64: fake_sigreturn_duplicated_fpsimd
  kselftest: arm64: fake_sigreturn_missing_fpsimd
  kselftest: arm64: fake_sigreturn_bad_size_for_magic0
  kselftest: arm64: fake_sigreturn_bad_magic
  kselftest: arm64: add helper get_current_context
  kselftest: arm64: extend test_init functionalities
  kselftest: arm64: mangle_pstate_invalid_mode_el[123][ht]
  kselftest: arm64: mangle_pstate_invalid_daif_bits
  kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils
  kselftest: arm64: extend toplevel skeleton Makefile

* for-next/kaslr-diagnostics:
  : Provide diagnostics on boot for KASLR
  arm64: kaslr: Check command line before looking for a seed
  arm64: kaslr: Announce KASLR status on boot
2019-11-08 17:46:11 +00:00
Mark Brown
2203e1adb9 arm64: kaslr: Check command line before looking for a seed
Now that we print diagnostics at boot the reason why we do not initialise
KASLR matters. Currently we check for a seed before we check if the user
has explicitly disabled KASLR on the command line which will result in
misleading diagnostics so reverse the order of those checks. We still
parse the seed from the DT early so that if the user has both provided a
seed and disabled KASLR on the command line we still mask the seed on
the command line.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-08 17:36:51 +00:00
Mark Brown
294a9ddde6 arm64: kaslr: Announce KASLR status on boot
Currently the KASLR code is silent at boot unless it forces on KPTI in
which case a message will be printed for that. This can lead to users
incorrectly believing their system has the feature enabled when it in
fact does not, and if they notice the problem the lack of any
diagnostics makes it harder to understand the problem. Add an initcall
which prints a message showing the status of KASLR during boot to make
the status clear.

This is particularly useful in cases where we don't have a seed. It
seems to be a relatively common error for system integrators and
administrators to enable KASLR in their configuration but not provide
the seed at runtime, often due to seed provisioning breaking at some
later point after it is initially enabled and verified.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-08 17:36:48 +00:00
Cristian Marussi
3f484ce375 kselftest: arm64: fake_sigreturn_misaligned_sp
Add a simple fake_sigreturn testcase which places a valid sigframe on a
non-16 bytes aligned SP. Expects a SIGSEGV on test PASS.

Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-08 11:10:52 +00:00
Cristian Marussi
49978aa8f0 kselftest: arm64: fake_sigreturn_bad_size
Add a simple fake_sigreturn testcase which builds a ucontext_t with a
badly sized header that causes a overrun in the __reserved area and
place it onto the stack. Expects a SIGSEGV on test PASS.

Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-08 11:10:50 +00:00
Cristian Marussi
46185cd124 kselftest: arm64: fake_sigreturn_duplicated_fpsimd
Add a simple fake_sigreturn testcase which builds a ucontext_t with
an anomalous additional fpsimd_context and place it onto the stack.
Expects a SIGSEGV on test PASS.

Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-08 11:10:48 +00:00
Cristian Marussi
8aa9d08fcb kselftest: arm64: fake_sigreturn_missing_fpsimd
Add a simple fake_sigreturn testcase which builds a ucontext_t without
the required fpsimd_context and place it onto the stack.
Expects a SIGSEGV on test PASS.

Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-08 11:10:46 +00:00
Cristian Marussi
4c94a0ba02 kselftest: arm64: fake_sigreturn_bad_size_for_magic0
Add a simple fake_sigreturn testcase which builds a ucontext_t with a
badly sized terminator record and place it onto the stack.
Expects a SIGSEGV on test PASS.

Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-08 11:10:44 +00:00
Cristian Marussi
6c2aa42845 kselftest: arm64: fake_sigreturn_bad_magic
Add a simple fake_sigreturn testcase which builds a ucontext_t with a bad
magic header and place it onto the stack. Expects a SIGSEGV on test PASS.

Introduce a common utility assembly trampoline function to invoke a
sigreturn while placing the provided sigframe at wanted alignment and
also an helper to make space when needed inside the sigframe reserved
area.

Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-08 11:10:42 +00:00
Cristian Marussi
34306b05d3 kselftest: arm64: add helper get_current_context
Introduce a new common utility function get_current_context() which can be
used to grab a ucontext without the help of libc, and also to detect if
such ucontext has been successfully used by placing it on the stack as a
fake sigframe.

Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-08 11:10:41 +00:00
Cristian Marussi
837387a2cb kselftest: arm64: extend test_init functionalities
Extend signal testing framework to allow the definition of a custom per
test initialization function to be run at the end of the common test_init
after test setup phase has completed and before test-run routine.

This custom per-test initialization function also enables the test writer
to decide on its own when forcibly skip the test itself using standard KSFT
mechanism.

Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-08 11:10:39 +00:00
Cristian Marussi
c282098704 kselftest: arm64: mangle_pstate_invalid_mode_el[123][ht]
Add 6 simple mangle testcases that mess with the ucontext_t from within
the signal handler, trying to toggle PSTATE mode bits to trick the system
into switching to EL1/EL2/EL3 using both SP_EL0(t) and SP_ELx(h).
Expects SIGSEGV on test PASS.

Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-08 11:10:37 +00:00
Cristian Marussi
0fc89f08df kselftest: arm64: mangle_pstate_invalid_daif_bits
Add a simple mangle testcase which messes with the ucontext_t from within
the signal handler, trying to set PSTATE DAIF bits to an invalid value
(masking everything). Expects SIGSEGV on test PASS.

Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-08 11:10:35 +00:00
Cristian Marussi
f96bf43403 kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils
Add some arm64/signal specific boilerplate and utility code to help
further testcases' development.

Introduce also one simple testcase mangle_pstate_invalid_compat_toggle
and some related helpers: it is a simple mangle testcase which messes
with the ucontext_t from within the signal handler, trying to toggle
PSTATE state bits to switch the system between 32bit/64bit execution
state. Expects SIGSEGV on test PASS.

Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-08 11:10:33 +00:00
Cristian Marussi
313a4db7f3 kselftest: arm64: extend toplevel skeleton Makefile
Modify KSFT arm64 toplevel Makefile to maintain arm64 kselftests organized
by subsystem, keeping them into distinct subdirectories under arm64 custom
KSFT directory: tools/testing/selftests/arm64/

Add to such toplevel Makefile a mechanism to guess the effective location
of Kernel headers as installed by KSFT framework.

Fit existing arm64 tags kselftest into this new schema moving them into
their own subdirectory (arm64/tags).

Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-08 11:10:30 +00:00
Catalin Marinas
51effa6d11 Merge branch 'for-next/perf' into for-next/core
- Support for additional PMU topologies on HiSilicon platforms
- Support for CCN-512 interconnect PMU
- Support for AXI ID filtering in the IMX8 DDR PMU
- Support for the CCPI2 uncore PMU in ThunderX2
- Driver cleanup to use devm_platform_ioremap_resource()

* for-next/perf:
  drivers/perf: hisi: update the sccl_id/ccl_id for certain HiSilicon platform
  perf/imx_ddr: Dump AXI ID filter info to userspace
  docs/perf: Add AXI ID filter capabilities information
  perf/imx_ddr: Add driver for DDR PMU in i.MX8MPlus
  perf/imx_ddr: Add enhanced AXI ID filter support
  bindings: perf: imx-ddr: Add new compatible string
  docs/perf: Add explanation for DDR_CAP_AXI_ID_FILTER_ENHANCED quirk
  arm64: perf: Simplify the ARMv8 PMUv3 event attributes
  drivers/perf: Add CCPI2 PMU support in ThunderX2 UNCORE driver.
  Documentation: perf: Update documentation for ThunderX2 PMU uncore driver
  Documentation: Add documentation for CCN-512 DTS binding
  perf: arm-ccn: Enable stats for CCN-512 interconnect
  perf/smmuv3: use devm_platform_ioremap_resource() to simplify code
  perf/arm-cci: use devm_platform_ioremap_resource() to simplify code
  perf/arm-ccn: use devm_platform_ioremap_resource() to simplify code
  perf: xgene: use devm_platform_ioremap_resource() to simplify code
  perf: hisi: use devm_platform_ioremap_resource() to simplify code
2019-11-08 10:57:14 +00:00
Shaokun Zhang
8703317ae5 drivers/perf: hisi: update the sccl_id/ccl_id for certain HiSilicon platform
For some HiSilicon platform, the originally designed SCCL_ID and CCL_ID
are not satisfied with much rich topology when the MT is set, so we
extend the SCCL_ID to MPIDR[aff3] and CCL_ID to MPIDR[aff2]. Let's
update this for HiSilicon uncore PMU driver.

Cc: John Garry <john.garry@huawei.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-11-07 13:07:55 +00:00
Catalin Marinas
c1c9ea6371 Merge branch 'arm64/ftrace-with-regs' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux into for-next/core
FTRACE_WITH_REGS support for arm64.

* 'arm64/ftrace-with-regs' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux:
  arm64: ftrace: minimize ifdeffery
  arm64: implement ftrace with regs
  arm64: asm-offsets: add S_FP
  arm64: insn: add encoder for MOV (register)
  arm64: module/ftrace: intialize PLT at load time
  arm64: module: rework special section handling
  module/ftrace: handle patchable-function-entry
  ftrace: add ftrace_init_nop()
2019-11-07 11:26:54 +00:00
Nicolas Saenz Julienne
bff3b04460 arm64: mm: reserve CMA and crashkernel in ZONE_DMA32
With the introduction of ZONE_DMA in arm64 we moved the default CMA and
crashkernel reservation into that area. This caused a regression on big
machines that need big CMA and crashkernel reservations. Note that
ZONE_DMA is only 1GB big.

Restore the previous behavior as the wide majority of devices are OK
with reserving these in ZONE_DMA32. The ones that need them in ZONE_DMA
will configure it explicitly.

Fixes: 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32")
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-07 11:22:20 +00:00
Mark Rutland
7f08ae53a7 arm64: ftrace: minimize ifdeffery
Now that we no longer refer to mod->arch.ftrace_trampolines in the body
of ftrace_make_call(), we can use IS_ENABLED() rather than ifdeffery,
and make the code easier to follow. Likewise in ftrace_make_nop().

Let's do so.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Tested-by: Torsten Duwe <duwe@suse.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
2019-11-06 14:17:36 +00:00
Torsten Duwe
3b23e4991f arm64: implement ftrace with regs
This patch implements FTRACE_WITH_REGS for arm64, which allows a traced
function's arguments (and some other registers) to be captured into a
struct pt_regs, allowing these to be inspected and/or modified. This is
a building block for live-patching, where a function's arguments may be
forwarded to another function. This is also necessary to enable ftrace
and in-kernel pointer authentication at the same time, as it allows the
LR value to be captured and adjusted prior to signing.

Using GCC's -fpatchable-function-entry=N option, we can have the
compiler insert a configurable number of NOPs between the function entry
point and the usual prologue. This also ensures functions are AAPCS
compliant (e.g. disabling inter-procedural register allocation).

For example, with -fpatchable-function-entry=2, GCC 8.1.0 compiles the
following:

| unsigned long bar(void);
|
| unsigned long foo(void)
| {
|         return bar() + 1;
| }

... to:

| <foo>:
|         nop
|         nop
|         stp     x29, x30, [sp, #-16]!
|         mov     x29, sp
|         bl      0 <bar>
|         add     x0, x0, #0x1
|         ldp     x29, x30, [sp], #16
|         ret

This patch builds the kernel with -fpatchable-function-entry=2,
prefixing each function with two NOPs. To trace a function, we replace
these NOPs with a sequence that saves the LR into a GPR, then calls an
ftrace entry assembly function which saves this and other relevant
registers:

| mov	x9, x30
| bl	<ftrace-entry>

Since patchable functions are AAPCS compliant (and the kernel does not
use x18 as a platform register), x9-x18 can be safely clobbered in the
patched sequence and the ftrace entry code.

There are now two ftrace entry functions, ftrace_regs_entry (which saves
all GPRs), and ftrace_entry (which saves the bare minimum). A PLT is
allocated for each within modules.

Signed-off-by: Torsten Duwe <duwe@suse.de>
[Mark: rework asm, comments, PLTs, initialization, commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Tested-by: Torsten Duwe <duwe@suse.de>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Julien Thierry <jthierry@redhat.com>
Cc: Will Deacon <will@kernel.org>
2019-11-06 14:17:35 +00:00
Mark Rutland
1f377e043b arm64: asm-offsets: add S_FP
So that assembly code can more easily manipulate the FP (x29) within a
pt_regs, add an S_FP asm-offsets definition.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Tested-by: Torsten Duwe <duwe@suse.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
2019-11-06 14:17:34 +00:00
Mark Rutland
e3bf8a67f7 arm64: insn: add encoder for MOV (register)
For FTRACE_WITH_REGS, we're going to want to generate a MOV (register)
instruction as part of the callsite intialization. As MOV (register) is
an alias for ORR (shifted register), we can generate this with
aarch64_insn_gen_logical_shifted_reg(), but it's somewhat verbose and
difficult to read in-context.

Add a aarch64_insn_gen_move_reg() wrapper for this case so that we can
write callers in a more straightforward way.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Tested-by: Torsten Duwe <duwe@suse.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
2019-11-06 14:17:33 +00:00
Mark Rutland
f1a54ae9af arm64: module/ftrace: intialize PLT at load time
Currently we lazily-initialize a module's ftrace PLT at runtime when we
install the first ftrace call. To do so we have to apply a number of
sanity checks, transiently mark the module text as RW, and perform an
IPI as part of handling Neoverse-N1 erratum #1542419.

We only expect the ftrace trampoline to point at ftrace_caller() (AKA
FTRACE_ADDR), so let's simplify all of this by intializing the PLT at
module load time, before the module loader marks the module RO and
performs the intial I-cache maintenance for the module.

Thus we can rely on the module having been correctly intialized, and can
simplify the runtime work necessary to install an ftrace call in a
module. This will also allow for the removal of module_disable_ro().

Tested by forcing ftrace_make_call() to use the module PLT, and then
loading up a module after setting up ftrace with:

| echo ":mod:<module-name>" > set_ftrace_filter;
| echo function > current_tracer;
| modprobe <module-name>

Since FTRACE_ADDR is only defined when CONFIG_DYNAMIC_FTRACE is
selected, we wrap its use along with most of module_init_ftrace_plt()
with ifdeffery rather than using IS_ENABLED().

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Tested-by: Torsten Duwe <duwe@suse.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
2019-11-06 14:17:32 +00:00
Mark Rutland
bd8b21d3dd arm64: module: rework special section handling
When we load a module, we have to perform some special work for a couple
of named sections. To do this, we iterate over all of the module's
sections, and perform work for each section we recognize.

To make it easier to handle the unexpected absence of a section, and to
make the section-specific logic easer to read, let's factor the section
search into a helper. Similar is already done in the core module loader,
and other architectures (and ideally we'd unify these in future).

If we expect a module to have an ftrace trampoline section, but it
doesn't have one, we'll now reject loading the module. When
ARM64_MODULE_PLTS is selected, any correctly built module should have
one (and this is assumed by arm64's ftrace PLT code) and the absence of
such a section implies something has gone wrong at build time.

Subsequent patches will make use of the new helper.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Tested-by: Torsten Duwe <duwe@suse.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
2019-11-06 14:17:31 +00:00
Mark Rutland
a1326b17ac module/ftrace: handle patchable-function-entry
When using patchable-function-entry, the compiler will record the
callsites into a section named "__patchable_function_entries" rather
than "__mcount_loc". Let's abstract this difference behind a new
FTRACE_CALLSITE_SECTION, so that architectures don't have to handle this
explicitly (e.g. with custom module linker scripts).

As parisc currently handles this explicitly, it is fixed up accordingly,
with its custom linker script removed. Since FTRACE_CALLSITE_SECTION is
only defined when DYNAMIC_FTRACE is selected, the parisc module loading
code is updated to only use the definition in that case. When
DYNAMIC_FTRACE is not selected, modules shouldn't have this section, so
this removes some redundant work in that case.

To make sure that this is keep up-to-date for modules and the main
kernel, a comment is added to vmlinux.lds.h, with the existing ifdeffery
simplified for legibility.

I built parisc generic-{32,64}bit_defconfig with DYNAMIC_FTRACE enabled,
and verified that the section made it into the .ko files for modules.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Helge Deller <deller@gmx.de>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Tested-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Torsten Duwe <duwe@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: linux-parisc@vger.kernel.org
2019-11-06 14:17:30 +00:00
Mark Rutland
fbf6c73c5b ftrace: add ftrace_init_nop()
Architectures may need to perform special initialization of ftrace
callsites, and today they do so by special-casing ftrace_make_nop() when
the expected branch address is MCOUNT_ADDR. In some cases (e.g. for
patchable-function-entry), we don't have an mcount-like symbol and don't
want a synthetic MCOUNT_ADDR, but we may need to perform some
initialization of callsites.

To make it possible to separate initialization from runtime
modification, and to handle cases without an mcount-like symbol, this
patch adds an optional ftrace_init_nop() function that architectures can
implement, which does not pass a branch address.

Where an architecture does not provide ftrace_init_nop(), we will fall
back to the existing behaviour of calling ftrace_make_nop() with
MCOUNT_ADDR.

At the same time, ftrace_code_disable() is renamed to
ftrace_nop_initialize() to make it clearer that it is intended to
intialize a callsite into a disabled state, and is not for disabling a
callsite that has been runtime enabled. The kerneldoc description of rec
arguments is updated to cover non-mcount callsites.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Tested-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Torsten Duwe <duwe@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>
2019-11-06 14:17:13 +00:00
Rich Wiley
918e1946c8 arm64: kpti: Add NVIDIA's Carmel core to the KPTI whitelist
NVIDIA Carmel CPUs don't implement ID_AA64PFR0_EL1.CSV3 but
aren't susceptible to Meltdown, so add Carmel to kpti_safe_list[].

Signed-off-by: Rich Wiley <rwiley@nvidia.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-06 11:31:03 +00:00
Bhupesh Sharma
218564b164 arm64: mm: Remove MAX_USER_VA_BITS definition
commit 9b31cf493ffa ("arm64: mm: Introduce MAX_USER_VA_BITS definition")
introduced the MAX_USER_VA_BITS definition, which was used to support
the arm64 mm use-cases where the user-space could use 52-bit virtual
addresses whereas the kernel-space would still could a maximum of 48-bit
virtual addressing.

But, now with commit b6d00d47e81a ("arm64: mm: Introduce 52-bit Kernel
VAs"), we removed the 52-bit user/48-bit kernel kconfig option and hence
there is no longer any scenario where user VA != kernel VA size
(even with CONFIG_ARM64_FORCE_52BIT enabled, the same is true).

Hence we can do away with the MAX_USER_VA_BITS macro as it is equal to
VA_BITS (maximum VA space size) in all possible use-cases. Note that
even though the 'vabits_actual' value would be 48 for arm64 hardware
which don't support LVA-8.2 extension (even when CONFIG_ARM64_VA_BITS_52
is enabled), VA_BITS would still be set to a value 52. Hence this change
would be safe in all possible VA address space combinations.

Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: linux-kernel@vger.kernel.org
Cc: kexec@lists.infradead.org
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-06 11:19:25 +00:00
Masahiro Yamada
32d1870877 arm64: mm: simplify the page end calculation in __create_pgd_mapping()
Calculate the page-aligned end address more simply.

The local variable, "length" is unneeded.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-06 11:17:09 +00:00
Joakim Zhang
f1d303a1b5 perf/imx_ddr: Dump AXI ID filter info to userspace
caps/filter indicates whether HW supports AXI ID filter or not.
caps/enhanced_filter indicates whether HW supports enhanced AXI ID filter
or not.

Users can check filter features from userspace with these attributions.

Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
[will: reworked cap switch to be less error-prone]
Signed-off-by: Will Deacon <will@kernel.org>
2019-11-04 17:25:16 +00:00
Joakim Zhang
ed0207a33a docs/perf: Add AXI ID filter capabilities information
Add capabilities information for AXI ID filter.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-11-04 16:27:34 +00:00
Joakim Zhang
d3eeece9a8 perf/imx_ddr: Add driver for DDR PMU in i.MX8MPlus
Add driver for DDR PMU in i.MX8MPlus.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-11-04 16:27:34 +00:00
Joakim Zhang
44f8bd014a perf/imx_ddr: Add enhanced AXI ID filter support
With DDR_CAP_AXI_ID_FILTER quirk, indicating HW supports AXI ID filter
which only can get bursts from DDR transaction, i.e. DDR read/write
requests.

This patch add DDR_CAP_AXI_ID_ENHANCED_FILTER quirk, indicating HW
supports AXI ID filter which can get bursts and bytes from DDR
transaction at the same time. We hope PMU always return bytes in the
driver due to it is more meaningful for users.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-11-04 16:27:34 +00:00
Joakim Zhang
1178addaca bindings: perf: imx-ddr: Add new compatible string
Add new compatible string for i.MX8MPlus DDR PMU core.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-11-04 16:27:34 +00:00
Joakim Zhang
76d835fcd4 docs/perf: Add explanation for DDR_CAP_AXI_ID_FILTER_ENHANCED quirk
Add explanation for DDR_CAP_AXI_ID_FILTER_ENHANCED quirk.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
[will: Simplified wording]
Signed-off-by: Will Deacon <will@kernel.org>
2019-11-04 16:27:13 +00:00
Julien Grall
478016c383 docs/arm64: cpu-feature-registers: Rewrite bitfields that don't follow [e, s]
Commit "docs/arm64: cpu-feature-registers: Documents missing visible
fields" added bitfields following the convention [s, e]. However, the
documentation is following [s, e] and so does the Arm ARM.

Rewrite the bitfields to match the format [s, e].

Fixes: a8613e7070e7 ("docs/arm64: cpu-feature-registers: Documents missing visible fields")
Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-01 15:53:55 +00:00
Shaokun Zhang
9ef8567ccf arm64: perf: Simplify the ARMv8 PMUv3 event attributes
For each PMU event, there is a ARMV8_EVENT_ATTR(xx, XX) and
&armv8_event_attr_xx.attr.attr. Let's redefine the ARMV8_EVENT_ATTR
to simplify the armv8_pmuv3_event_attrs.

Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
[will: Dropped unnecessary array syntax]
Signed-off-by: Will Deacon <will@kernel.org>
2019-11-01 14:51:19 +00:00
Nicolas Saenz Julienne
8b5369ea58 dma/direct: turn ARCH_ZONE_DMA_BITS into a variable
Some architectures, notably ARM, are interested in tweaking this
depending on their runtime DMA addressing limitations.

Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-01 09:41:18 +00:00
Xiang Zheng
e44ec4a35d arm64: print additional fault message when executing non-exec memory
When attempting to executing non-executable memory, the fault message
shows:

  Unable to handle kernel read from unreadable memory at virtual address
  ffff802dac469000

This may confuse someone, so add a new fault message for instruction
abort.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-29 15:41:14 +00:00
Ganapatrao Prabhakerrao Kulkarni
5e2c27e833 drivers/perf: Add CCPI2 PMU support in ThunderX2 UNCORE driver.
CCPI2 is a low-latency high-bandwidth serial interface for inter socket
connectivity of ThunderX2 processors.

CCPI2 PMU supports up to 8 counters per socket. Counters are
independently programmable to different events and can be started and
stopped individually. The CCPI2 counters are 64-bit and do not overflow
in normal operation.

Signed-off-by: Ganapatrao Prabhakerrao Kulkarni <gkulkarni@marvell.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-10-29 10:08:46 +00:00
Ganapatrao Prabhakerrao Kulkarni
030f6f84e5 Documentation: perf: Update documentation for ThunderX2 PMU uncore driver
Add documentation for Cavium Coherent Processor Interconnect (CCPI2) PMU.

Signed-off-by: Ganapatrao Prabhakerrao Kulkarni <gkulkarni@marvell.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-10-29 10:08:46 +00:00
Catalin Marinas
8301ae822d Merge branch 'for-next/entry-s-to-c' into for-next/core
Move the synchronous exception paths from entry.S into a C file to
improve the code readability.

* for-next/entry-s-to-c:
  arm64: entry-common: don't touch daif before bp-hardening
  arm64: Remove asmlinkage from updated functions
  arm64: entry: convert el0_sync to C
  arm64: entry: convert el1_sync to C
  arm64: add local_daif_inherit()
  arm64: Add prototypes for functions called by entry.S
  arm64: remove __exception annotations
2019-10-28 17:02:56 +00:00
Catalin Marinas
4686da5140 arm64: Make arm64_dma32_phys_limit static
This variable is only used in the arch/arm64/mm/init.c file for
ZONE_DMA32 initialisation, no need to expose it.

Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-28 16:46:43 +00:00
Catalin Marinas
346f6a4636 Merge branch 'kvm-arm64/erratum-1319367' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into for-next/core
Similarly to erratum 1165522 that affects Cortex-A76, A57 and A72
respectively suffer from errata 1319537 and 1319367, potentially
resulting in TLB corruption if the CPU speculates an AT instruction
while switching guests.

The fix is slightly more involved since we don't have VHE to help us
here, but the idea is the same: when switching a guest in, we must
prevent any speculated AT from being able to parse the page tables
until S2 is up and running. Only at this stage can we allow AT to take
place.

For this, we always restore the guest sysregs first, except for its
SCTLR and TCR registers, which must be set with SCTLR.M=1 and
TCR.EPD{0,1} = {1, 1}, effectively disabling the PTW and TLB
allocation. Once S2 is setup, we restore the guest's SCTLR and
TCR. Similar things must be done on TLB invalidation...

* 'kvm-arm64/erratum-1319367' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms:
  arm64: Enable and document ARM errata 1319367 and 1319537
  arm64: KVM: Prevent speculative S1 PTW when restoring vcpu context
  arm64: KVM: Disable EL1 PTW when invalidating S2 TLBs
  arm64: KVM: Reorder system register restoration and stage-2 activation
  arm64: Add ARM64_WORKAROUND_1319367 for all A57 and A72 versions
2019-10-28 16:22:49 +00:00
Catalin Marinas
6a036afb55 Merge branch 'for-next/neoverse-n1-stale-instr' into for-next/core
Neoverse-N1 cores with the 'COHERENT_ICACHE' feature may fetch stale
instructions when software depends on prefetch-speculation-protection
instead of explicit synchronization. [0]

The workaround is to trap I-Cache maintenance and issue an
inner-shareable TLBI. The affected cores have a Coherent I-Cache, so the
I-Cache maintenance isn't necessary. The core tells user-space it can
skip it with CTR_EL0.DIC. We also have to trap this register to hide the
bit forcing DIC-aware user-space to perform the maintenance.

To avoid trapping all cache-maintenance, this workaround depends on
a firmware component that only traps I-cache maintenance from EL0 and
performs the workaround.

For user-space, the kernel's work is to trap CTR_EL0 to hide DIC, and
produce a fake IminLine. EL3 traps the now-necessary I-Cache maintenance
and performs the inner-shareable-TLBI that makes everything better.

[0] https://developer.arm.com/docs/sden885747/latest/arm-neoverse-n1-mp050-software-developer-errata-notice

* for-next/neoverse-n1-stale-instr:
  arm64: Silence clang warning on mismatched value/register sizes
  arm64: compat: Workaround Neoverse-N1 #1542419 for compat user-space
  arm64: Fake the IminLine size on systems affected by Neoverse-N1 #1542419
  arm64: errata: Hide CTR_EL0.DIC on systems affected by Neoverse-N1 #1542419
2019-10-28 16:12:40 +00:00
Marek Bykowski
05daff069f Documentation: Add documentation for CCN-512 DTS binding
Indicate the arm-ccn perf back-end supports now ccn-512.

Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Marek Bykowski <marek.bykowski@gmail.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-10-28 15:41:23 +00:00
Marek Bykowski
126b0a1700 perf: arm-ccn: Enable stats for CCN-512 interconnect
Add compatible string for the ARM CCN-512 interconnect

Acked-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Marek Bykowski <marek.bykowski@gmail.com>
Signed-off-by: Boleslaw Malecki <boleslaw.malecki@tieto.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-10-28 15:40:42 +00:00
Catalin Marinas
ba95e9bd96 Merge remote-tracking branch 'arm64/for-next/fixes' into for-next/core
This is required to solve the conflicts with subsequent merges of two
more errata workaround branches.

* arm64/for-next/fixes:
  arm64: tags: Preserve tags for addresses translated via TTBR1
  arm64: mm: fix inverted PAR_EL1.F check
  arm64: sysreg: fix incorrect definition of SYS_PAR_EL1_F
  arm64: entry.S: Do not preempt from IRQ before all cpufeatures are enabled
  arm64: hibernate: check pgd table allocation
  arm64: cpufeature: Treat ID_AA64ZFR0_EL1 as RAZ when SVE is not enabled
  arm64: Fix kcore macros after 52-bit virtual addressing fallout
  arm64: Allow CAVIUM_TX2_ERRATUM_219 to be selected
  arm64: Avoid Cavium TX2 erratum 219 when switching TTBR
  arm64: Enable workaround for Cavium TX2 erratum 219 when running SMT
  arm64: KVM: Trap VM ops when ARM64_WORKAROUND_CAVIUM_TX2_219_TVM is set
2019-10-28 15:30:52 +00:00
James Morse
bfe298745a arm64: entry-common: don't touch daif before bp-hardening
The previous patches mechanically transformed the assembly version of
entry.S to entry-common.c for synchronous exceptions.

The C version of local_daif_restore() doesn't quite do the same thing
as the assembly versions if pseudo-NMI is in use. In particular,
| local_daif_restore(DAIF_PROCCTX_NOIRQ)
will still allow pNMI to be delivered. This is not the behaviour
do_el0_ia_bp_hardening() and do_sp_pc_abort() want as it should not
be possible for the PMU handler to run as an NMI until the bp-hardening
sequence has run.

The bp-hardening calls were placed where they are because this was the
first C code to run after the relevant exceptions. As we've now moved
that point earlier, move the checks and calls earlier too.

This makes it clearer that this stuff runs before any kind of exception,
and saves modifying PSTATE twice.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-28 11:22:54 +00:00