linux

iv/linux

Author	SHA1	Message	Date
Will Deacon	3e8626b4ed	Merge branch 'for-next/sysregs' into for-next/core * for-next/sysregs: arm64/sysreg: Add missing system instruction definitions for FGT arm64/sysreg: Add missing system register definitions for FGT arm64/sysreg: Add missing ExtTrcBuff field definition to ID_AA64DFR0_EL1 arm64/sysreg: Add missing Pauth_LR field definitions to ID_AA64ISAR1_EL1 arm64/sysreg: Add new system registers for GCS arm64/sysreg: Add definition for FPMR arm64/sysreg: Update HCRX_EL2 definition for DDI0601 2023-09 arm64/sysreg: Update SCTLR_EL1 for DDI0601 2023-09 arm64/sysreg: Update ID_AA64SMFR0_EL1 definition for DDI0601 2023-09 arm64/sysreg: Add definition for ID_AA64FPFR0_EL1 arm64/sysreg: Add definition for ID_AA64ISAR3_EL1 arm64/sysreg: Update ID_AA64ISAR2_EL1 defintion for DDI0601 2023-09 arm64/sysreg: Add definition for ID_AA64PFR2_EL1 arm64/sysreg: update CPACR_EL1 register arm64/sysreg: add system register POR_EL{0,1} arm64/sysreg: Add definition for HAFGRTR_EL2 arm64/sysreg: Update HFGITR_EL2 definiton to DDI0601 2023-09	2024-01-04 12:28:38 +00:00
Will Deacon	41cff14b03	Merge branch 'for-next/stacktrace' into for-next/core * for-next/stacktrace: arm64: stacktrace: factor out kunwind_stack_walk() arm64: stacktrace: factor out kernel unwind state	2024-01-04 12:28:31 +00:00
Will Deacon	ef4896b598	Merge branch 'for-next/selftests' into for-next/core * for-next/selftests: kselftest/arm64: Don't probe the current VL for unsupported vector types kselftest/arm64: Log SVCR when the SME tests barf kselftest/arm64: Improve output for skipped TPIDR2 ABI test	2024-01-04 12:28:22 +00:00
Will Deacon	30431774fe	Merge branch 'for-next/rip-vpipt' into for-next/core * for-next/rip-vpipt: arm64: Rename reserved values for CTR_EL0.L1Ip arm64: Kill detection of VPIPT i-cache policy KVM: arm64: Remove VPIPT I-cache handling	2024-01-04 12:28:14 +00:00
Will Deacon	dd9168ab08	Merge branch 'for-next/perf' into for-next/core * for-next/perf: (30 commits) arm: perf: Fix ARCH=arm build with GCC MAINTAINERS: add maintainers for DesignWare PCIe PMU driver drivers/perf: add DesignWare PCIe PMU driver PCI: Move pci_clear_and_set_dword() helper to PCI header PCI: Add Alibaba Vendor ID to linux/pci_ids.h docs: perf: Add description for Synopsys DesignWare PCIe PMU driver Revert "perf/arm_dmc620: Remove duplicate format attribute #defines" Documentation: arm64: Document the PMU event counting threshold feature arm64: perf: Add support for event counting threshold arm: pmu: Move error message and -EOPNOTSUPP to individual PMUs KVM: selftests: aarch64: Update tools copy of arm_pmuv3.h perf/arm_dmc620: Remove duplicate format attribute #defines arm: pmu: Share user ABI format mechanism with SPE arm64: perf: Include threshold control fields in PMEVTYPER mask arm: perf: Convert remaining fields to use GENMASK arm: perf: Use GENMASK for PMMIR fields arm: perf/kvm: Use GENMASK for ARMV8_PMU_PMCR_N arm: perf: Remove inlines from arm_pmuv3.c drivers/perf: arm_dsu_pmu: Remove kerneldoc-style comment syntax drivers/perf: Remove usage of the deprecated ida_simple_xx() API ...	2024-01-04 12:28:00 +00:00
Will Deacon	3b47bd8fed	Merge branch 'for-next/mm' into for-next/core * for-next/mm: arm64: irq: set the correct node for shadow call stack arm64: irq: set the correct node for VMAP stack	2024-01-04 12:27:55 +00:00
Will Deacon	65180649fa	Merge branch 'for-next/misc' into for-next/core * for-next/misc: arm64: memory: remove duplicated include arm64: Delete the zero_za macro Documentation/arch/arm64: Fix typo	2024-01-04 12:27:49 +00:00
Will Deacon	ccaeeec529	Merge branch 'for-next/lpa2-prep' into for-next/core * for-next/lpa2-prep: arm64: mm: get rid of kimage_vaddr global variable arm64: mm: Take potential load offset into account when KASLR is off arm64: kernel: Disable latent_entropy GCC plugin in early C runtime arm64: Add ARM64_HAS_LPA2 CPU capability arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2] arm64/mm: Update tlb invalidation routines for FEAT_LPA2 arm64/mm: Add lpa2_is_enabled() kvm_lpa2_is_enabled() stubs arm64/mm: Modify range-based tlbi to decrement scale	2024-01-04 12:27:42 +00:00
Will Deacon	88619527b4	Merge branch 'for-next/kbuild' into for-next/core * for-next/kbuild: efi/libstub: zboot: do not use $(shell ...) in cmd_copy_and_pad arm64: properly install vmlinuz.efi arm64: replace <asm-generic/export.h> with <linux/export.h> arm64: vdso32: rename 32-bit debug vdso to vdso32.so.dbg	2024-01-04 12:27:35 +00:00
Will Deacon	79eb42b269	Merge branch 'for-next/fpsimd' into for-next/core * for-next/fpsimd: arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD arm64: fpsimd: Preserve/restore kernel mode NEON at context switch arm64: fpsimd: Drop unneeded 'busy' flag	2024-01-04 12:27:29 +00:00
Will Deacon	e90a8a210f	Merge branch 'for-next/early-idreg-overrides' into for-next/core * for-next/early-idreg-overrides: arm64/kernel: Move 'nokaslr' parsing out of early idreg code arm64: idreg-override: Avoid kstrtou64() to parse a single hex digit arm64: idreg-override: Avoid sprintf() for simple string concatenation arm64: idreg-override: avoid strlen() to check for empty strings arm64: idreg-override: Avoid parameq() and parameqn() arm64: idreg-override: Prepare for place relative reloc patching arm64: idreg-override: Omit non-NULL checks for override pointer	2024-01-04 12:27:13 +00:00
Will Deacon	3f35db4e68	Merge branch 'for-next/cpufeature' into for-next/core * for-next/cpufeature: arm64: Align boot cpucap handling with system cpucap handling arm64: Cleanup system cpucap handling arm64: Kconfig: drop KAISER reference from KPTI option description arm64: mm: Only map KPTI trampoline if it is going to be used arm64: Get rid of ARM64_HAS_NO_HW_PREFETCH	2024-01-04 12:26:56 +00:00
Mark Brown	9a802ddb21	kselftest/arm64: Don't probe the current VL for unsupported vector types The vec-syscfg selftest verifies that setting the VL of the currently tested vector type does not disrupt the VL of the other vector type. To do this it records the current vector length for each type but neglects to guard this with a check for that vector type actually being supported. Add one, using a helper function which we also update all the other instances of this pattern. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20231218-kselftest-arm64-vec-syscfg-rdvl-v1-1-0ac22d47e81f@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2023-12-19 10:03:17 +00:00
Masahiro Yamada	97ba4416d6	efi/libstub: zboot: do not use $(shell ...) in cmd_copy_and_pad You do not need to use $(shell ...) in recipe lines, as they are already executed in a shell. An alternative solution is $$(...), which is an escaped sequence of the shell's command substituion, $(...). For this case, there is a reason to avoid $(shell ...). Kbuild detects command changes by using the if_changed macro, which compares the previous command recorded in ..cmd with the current command from Makefile. If they differ, Kbuild re-runs the build rule. To diff the commands, Make must expand $(shell ...) first. It means that hexdump is executed every time, even when nothing needs rebuilding. If Kbuild determines that vmlinux.bin needs rebuilding, hexdump will be executed again to evaluate the 'cmd' macro, one more time to really build vmlinux.bin, and finally yet again to record the expanded command into ..cmd. Replace $(shell ...) with $$(...) to avoid multiple, unnecessay shell evaluations. Since Make is agnostic about the shell code, $(...), the if_changed macro compares the string "$(hexdump -s16 -n4 ...)" verbatim, so hexdump is run only for building vmlinux.bin. For the same reason, $(shell ...) in EFI_ZBOOT_OBJCOPY_FLAGS should be eliminated. While I was here, I replaced '&&' with ';' because a command for if_changed is executed with 'set -e'. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20231218080127.907460-1-masahiroy@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2023-12-19 10:02:40 +00:00
Josef Bacik	7b21ed7d11	arm64: properly install vmlinuz.efi If you select CONFIG_EFI_ZBOOT, we will generate vmlinuz.efi, and then when we go to install the kernel we'll install the vmlinux instead because install.sh only recognizes Image.gz as wanting the compressed install image. With CONFIG_EFI_ZBOOT we don't get the proper kernel installed, which means it doesn't boot, which makes for a very confused and subsequently angry kernel developer. Fix this by properly installing our compressed kernel if we've enabled CONFIG_EFI_ZBOOT. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Cc: <stable@vger.kernel.org> # 6.1.x Fixes: c37b830fef13 ("arm64: efi: enable generic EFI compressed boot") Reviewed-by: Simon Glass <sjg@chromium.org> Link: https://lore.kernel.org/r/6edb1402769c2c14c4fbef8f7eaedb3167558789.1702570674.git.josef@toxicpanda.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-17 12:31:53 +00:00
Fuad Tabba	4ebee8cebd	arm64/sysreg: Add missing system instruction definitions for FGT Add the definitions of missing system instructions that are trappable by fine grain traps. The definitions are based on DDI0602 2023-09. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20231214100158.2305400-5-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-17 12:11:23 +00:00
Fuad Tabba	885c6d8e28	arm64/sysreg: Add missing system register definitions for FGT Add the definitions of missing system registers that are trappable by fine grain traps. The definitions are based on DDI0601 2023-09. Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20231214100158.2305400-4-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-17 12:11:23 +00:00
Fuad Tabba	4f101cdcb5	arm64/sysreg: Add missing ExtTrcBuff field definition to ID_AA64DFR0_EL1 Add the ExtTrcBuff field definitions to ID_AA64DFR0_EL1 from DDI0601 2023-09. This field isn't used yet. Adding it for completeness and because it will be used in future patches. Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20231214100158.2305400-3-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-17 12:11:23 +00:00
Fuad Tabba	3b077ad8cb	arm64/sysreg: Add missing Pauth_LR field definitions to ID_AA64ISAR1_EL1 Add the Pauth_LR field definitions to ID_AA64ISAR1_EL1, based on DDI0601 2023-09. These fields aren't used yet. Adding them for completeness and consistency (definition already exists for ID_AA64ISAR2_EL1). Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20231214100158.2305400-2-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-17 12:11:23 +00:00
Wang Jinchao	5cc5ed7a66	arm64: memory: remove duplicated include remove duplicated include Signed-off-by: Wang Jinchao <wangjinchao@xfusion.com> Link: https://lore.kernel.org/r/202312151439+0800-wangjinchao@xfusion.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-17 12:07:53 +00:00
James Clark	bb339db4d3	arm: perf: Fix ARCH=arm build with GCC LLVM ignores everything inside the if statement and doesn't generate errors, but GCC doesn't ignore it, resulting in the following error: drivers/perf/arm_pmuv3.c: In function ‘armv8pmu_write_evtype’: include/linux/bits.h:34:29: error: left shift count >= width of type [-Werror=shift-count-overflow] 34 \| (((~UL(0)) - (UL(1) << (l)) + 1) & \ Fix it by using GENMASK_ULL which doesn't overflow on arm32 (even though the value is never used there). Fixes: 3115ee021bfb ("arm64: perf: Include threshold control fields in PMEVTYPER mask") Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Closes: https://lore.kernel.org/linux-arm-kernel/20231215120817.h2f3akgv72zhrtqo@pengutronix.de/ Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20231215175648.3397170-2-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-17 12:00:00 +00:00
Mark Rutland	eb15d707c2	arm64: Align boot cpucap handling with system cpucap handling Currently the detection+enablement of boot cpucaps is separate from the patching of boot cpucap alternatives, which means there's a period where cpus_have_cap($CAP) and alternative_has_cap($CAP) may be mismatched. It would be preferable to manage the boot cpucaps in the same way as the system cpucaps, both for clarity and to minimize the risk of accidental usage of code relying upon an alternative which has not yet been patched. This patch aligns the handling of boot cpucaps with the handling of system cpucaps: * The existing setup_boot_cpu_capabilities() function is moved to be closer to the setup_system_capabilities() and setup_system_features() functions so that they're more clearly related and more likely to be updated together in future. * The patching of boot cpucap alternatives is moved into setup_boot_cpu_capabilities(), immediately after boot cpucaps are detected and enabled. * A new setup_boot_cpu_features() function is added to mirror setup_system_features(); this handles initialization of cpucap data structures and calls setup_boot_cpu_capabilities(). This makes init_cpu_features() a closer mirror to update_cpu_features(), and makes smp_prepare_boot_cpu() a closer mirror to smp_cpus_done(). Importantly, while these changes alter the structure of the code, they retain the existing order of calls to: init_cpu_features(); // prefix initializing feature regs init_cpucap_indirect_list(); detect_system_supports_pseudo_nmi(); update_cpu_capabilities(SCOPE_BOOT_CPU \| SCOPE_LOCAL_CPU); enable_cpu_capabilities(SCOPE_BOOT_CPU); apply_boot_alternatives(); ... and hence there should be no functional change as a result of this patch; this is purely a structural cleanup. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20231212170910.3745497-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-13 16:02:01 +00:00
Mark Rutland	63a2d92e14	arm64: Cleanup system cpucap handling Recent changes to remove cpus_have_const_cap() introduced new users of cpus_have_cap() in the period between detecting system cpucaps and patching alternatives. It would be preferable to defer these until after the relevant cpucaps have been patched so that these can use the usual feature check helper functions, which is clearer and has less risk of accidental usage of code relying upon an alternative which has not yet been patched. This patch reworks the system-wide cpucap detection and patching to minimize this transient period: * The detection, enablement, and patching of system cpucaps is moved into a new setup_system_capabilities() function so that these can be grouped together more clearly, with no other functions called in the period between detection and patching. This is called from setup_system_features() before the subsequent checks that depend on the cpucaps. The logging of TTBR0 PAN and cpucaps with a mask is also moved here to keep these as close as possible to update_cpu_capabilities(). At the same time, comments are corrected and improved to make the intent clearer. * As hyp_mode_check() only tests system register values (not hwcaps) and must be called prior to patching, the call to hyp_mode_check() is moved before the call to setup_system_features(). * In setup_system_features(), the use of system_uses_ttbr0_pan() is restored, now that this occurs after alternatives are patched. This is a partial revert of commit: 53d62e995d9eaed1 ("arm64: Avoid cpus_have_const_cap() for ARM64_HAS_PAN") * In sve_setup() and sme_setup(), the use of system_supports_sve() and system_supports_sme() respectively are restored, now that these occur after alternatives are patched. This is a partial revert of commit: a76521d160284a1e ("arm64: Avoid cpus_have_const_cap() for ARM64_{SVE,SME,SME2,FA64}") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20231212170910.3745497-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-13 16:02:01 +00:00
Shuai Xue	f56bb3de66	MAINTAINERS: add maintainers for DesignWare PCIe PMU driver Add maintainers for Synopsys DesignWare PCIe PMU driver and driver document. Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Link: https://lore.kernel.org/r/20231208025652.87192-6-xueshuai@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-13 15:36:30 +00:00
Shuai Xue	af9597adc2	drivers/perf: add DesignWare PCIe PMU driver This commit adds the PCIe Performance Monitoring Unit (PMU) driver support for T-Head Yitian SoC chip. Yitian is based on the Synopsys PCI Express Core controller IP which provides statistics feature. The PMU is a PCIe configuration space register block provided by each PCIe Root Port in a Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error injection, and Statistics). To facilitate collection of statistics the controller provides the following two features for each Root Port: - one 64-bit counter for Time Based Analysis (RX/TX data throughput and time spent in each low-power LTSSM state) and - one 32-bit counter for Event Counting (error and non-error events for a specified lane) Note: There is no interrupt for counter overflow. This driver adds PMU devices for each PCIe Root Port. And the PMU device is named based the BDF of Root Port. For example, 30:03.0 PCI bridge: Device 1ded:8000 (rev 01) the PMU device name for this Root Port is dwc_rootport_3018. Example usage of counting PCIe RX TLP data payload (Units of bytes):: $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ average RX bandwidth can be calculated like this: PCIe TX Bandwidth = Rx_PCIe_TLP_Data_Payload / Measure_Time_Window Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-and-tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20231208025652.87192-5-xueshuai@linux.alibaba.com [will: Fix sparse error due to use of uninitialised 'vsec' symbol in dwc_pcie_match_des_cap()] Signed-off-by: Will Deacon <will@kernel.org>	2023-12-13 15:35:28 +00:00
Shuai Xue	ac16087134	PCI: Move pci_clear_and_set_dword() helper to PCI header The clear and set pattern is commonly used for accessing PCI config, move the helper pci_clear_and_set_dword() from aspm.c into PCI header. In addition, rename to pci_clear_and_set_config_dword() to retain the "config" information and match the other accessors. No functional change intended. Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20231208025652.87192-4-xueshuai@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-13 13:35:41 +00:00
Shuai Xue	ad6534c626	PCI: Add Alibaba Vendor ID to linux/pci_ids.h The Alibaba Vendor ID (0x1ded) is now used by Alibaba elasticRDMA ("erdma") and will be shared with the upcoming PCIe PMU ("dwc_pcie_pmu"). Move the Vendor ID to linux/pci_ids.h so that it can shared by several drivers later. Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci_ids.h Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20231208025652.87192-3-xueshuai@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-13 13:35:41 +00:00
Shuai Xue	cae40614cd	docs: perf: Add description for Synopsys DesignWare PCIe PMU driver Alibaba's T-Head Yitan 710 SoC includes Synopsys' DesignWare Core PCIe controller which implements PMU for performance and functional debugging to facilitate system maintenance. Document it to provide guidance on how to use it. Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20231208025652.87192-2-xueshuai@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-13 13:35:41 +00:00
Huang Shijie	7b1a09e44d	arm64: irq: set the correct node for shadow call stack The init_irq_stacks() has been changed to use the correct node: https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/commit/?id=75b5e0bf90bf The init_irq_scs() has the same issue with init_irq_stacks(): cpu_to_node() is not initialized yet, it does not work. This patch uses early_cpu_to_node() to set the init_irq_scs() with the correct node. Signed-off-by: Huang Shijie <shijie@os.amperecomputing.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20231213012046.12014-1-shijie@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-13 12:09:00 +00:00
Will Deacon	eb183b2cd0	Revert "perf/arm_dmc620: Remove duplicate format attribute #defines" This reverts commit a5f4ca68f348ac059efd6a3d7ad4040aed1c0818. Pulling in the Arm-specific 'linux/perf/arm_pmu.h' header breaks the allmodconfig build for x86: > In file included from drivers/perf/arm_dmc620_pmu.c:26: > include/linux/perf/arm_pmu.h:15:10: fatal error: asm/cputype.h: No such file or directory > 15 \| #include <asm/cputype.h> > \| ^~~~~~~~~~~~~~~ Just put things back like they were so that the driver can continue to be compile-tested on a variety of architectures. Link: https://lore.kernel.org/r/20231213100931.12d9d85e@canb.auug.org.au Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Will Deacon <will@kernel.org>	2023-12-13 09:47:52 +00:00
Ard Biesheuvel	2632e25217	arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD Now that kernel mode FPSIMD state is context switched along with other task state, we can enable the existing logic that keeps track of which task's FPSIMD state the CPU is holding in its registers. If it is the context of the task that we are switching to, we can elide the reload of the FPSIMD state from memory. Note that we also need to check whether the FPSIMD state on this CPU is the most recent: if a task gets migrated away and back again, the state in memory may be more recent than the state in the CPU. So add another CPU id field to task_struct to keep track of this. (We could reuse the existing CPU id field used for user mode context, but that might result in user state to be discarded unnecessarily, given that two distinct CPUs could be holding the most recent user mode state and the most recent kernel mode state) Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20231208113218.3001940-9-ardb@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 14:31:55 +00:00
Ard Biesheuvel	aefbab8e77	arm64: fpsimd: Preserve/restore kernel mode NEON at context switch Currently, the FPSIMD register file is not preserved and restored along with the general registers on exception entry/exit or context switch. For this reason, we disable preemption when enabling FPSIMD for kernel mode use in task context, and suspend the processing of softirqs so that there are no concurrent uses in the kernel. (Kernel mode FPSIMD may not be used at all in other contexts). Disabling preemption while doing CPU intensive work on inputs of potentially unbounded size is bad for real-time performance, which is why we try and ensure that SIMD crypto code does not operate on more than ~4k at a time, which is an arbitrary limit and requires assembler code to implement efficiently. We can avoid the need for disabling preemption if we can ensure that any in-kernel users of the NEON will not lose the FPSIMD register state across a context switch. And given that disabling softirqs implicitly disables preemption as well, we will also have to ensure that a softirq that runs code using FPSIMD can safely interrupt an in-kernel user. So introduce a thread_info flag TIF_KERNEL_FPSTATE, and modify the context switch hook for FPSIMD to preserve and restore the kernel mode FPSIMD to/from struct thread_struct when it is set. This avoids any scheduling blackouts due to prolonged use of FPSIMD in kernel mode, without the need for manual yielding. In order to support softirq processing while FPSIMD is being used in kernel task context, use the same flag to decide whether the kernel mode FPSIMD state needs to be preserved and restored before allowing FPSIMD to be used in softirq context. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20231208113218.3001940-8-ardb@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 14:31:54 +00:00
Ard Biesheuvel	9b19700e62	arm64: fpsimd: Drop unneeded 'busy' flag Kernel mode NEON will preserve the user mode FPSIMD state by saving it into the task struct before clobbering the registers. In order to avoid the need for preserving kernel mode state too, we disallow nested use of kernel mode NEON, i..e, use in softirq context while the interrupted task context was using kernel mode NEON too. Originally, this policy was implemented using a per-CPU flag which was exposed via may_use_simd(), requiring the users of the kernel mode NEON to deal with the possibility that it might return false, and having NEON and non-NEON code paths. This policy was changed by commit 13150149aa6ded1 ("arm64: fpsimd: run kernel mode NEON with softirqs disabled"), and now, softirq processing is disabled entirely instead, and so may_use_simd() can never fail when called from task or softirq context. This means we can drop the fpsimd_context_busy flag entirely, and instead, ensure that we disable softirq processing in places where we formerly relied on the flag for preventing races in the FPSIMD preserve routines. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20231208113218.3001940-7-ardb@google.com [will: Folded in fix from CAMj1kXFhzbJRyWHELCivQW1yJaF=p07LLtbuyXYX3G1WtsdyQg@mail.gmail.com] Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 14:29:16 +00:00
Ard Biesheuvel	50f176175e	arm64/kernel: Move 'nokaslr' parsing out of early idreg code Parsing and ignoring 'nokaslr' can be done from anywhere, except from the code that runs very early and is therefore built with limitations on the kind of relocations it is permitted to use. So move it to a source file that is part of the ordinary kernel build. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20231129111555.3594833-63-ardb@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 11:13:53 +00:00
Ard Biesheuvel	ea48626f8f	arm64: idreg-override: Avoid kstrtou64() to parse a single hex digit All ID register value overrides are =0 with the exception of the nokaslr pseudo feature which uses =1. In order to remove the dependency on kstrtou64(), which is part of the core kernel and no longer usable once we move idreg-override into the early mini C runtime, let's just parse a single hex digit (with optional leading 0x) and set the output value accordingly. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20231129111555.3594833-62-ardb@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 11:13:53 +00:00
Ard Biesheuvel	060260a6be	arm64: idreg-override: Avoid sprintf() for simple string concatenation Instead of using sprintf() with the "%s.%s=" format, where the first string argument is always the same in the inner loop of match_options(), use simple memcpy() for string concatenation, and move the first copy to the outer loop. This removes the dependency on sprintf(), which will be difficult to fulfil when we move this code into the early mini C runtime. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20231129111555.3594833-61-ardb@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 11:13:53 +00:00
Ard Biesheuvel	bcf1eed3f8	arm64: idreg-override: avoid strlen() to check for empty strings strlen() is a costly way to decide whether a string is empty, as in that case, the first character will be NUL so we can check for that directly. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20231129111555.3594833-60-ardb@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 11:13:52 +00:00
Ard Biesheuvel	dc3f5aae06	arm64: idreg-override: Avoid parameq() and parameqn() The only way parameq() and parameqn() deviate from the ordinary string and memory routines is that they ignore the difference between dashes and underscores. Since we copy each command line argument into a buffer before passing it to parameq() and parameqn() numerous times, let's just convert all dashes to underscores just once, and update the alias array accordingly. This also helps reduce the dependency on kernel APIs that are no longer available once we move this code into the early mini C runtime. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20231129111555.3594833-59-ardb@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 11:13:52 +00:00
Ard Biesheuvel	01fd29092a	arm64: idreg-override: Prepare for place relative reloc patching The ID reg override handling code uses a rather elaborate data structure that relies on statically initialized absolute address values in pointer fields. This means that this code cannot run until relocation fixups have been applied, and this is unfortunate, because it means we cannot discover overrides for KASLR or LVA/LPA without creating the kernel mapping and performing the relocations first. This can be solved by switching to place-relative relocations, which can be applied by the linker at build time. This means some additional arithmetic is required when dereferencing these pointers, as we can no longer dereference the pointer members directly. So let's implement this for idreg-override.c in a preliminary way, i.e., convert all the references in code to use a special accessor that produces the correct absolute value at runtime. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20231129111555.3594833-58-ardb@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 11:13:52 +00:00
Ard Biesheuvel	cbc59c9a4e	arm64: idreg-override: Omit non-NULL checks for override pointer Now that override pointers are always set, we can drop the various non-NULL checks that we have in the code. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20231129111555.3594833-57-ardb@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 11:13:52 +00:00
Ard Biesheuvel	376f5a3bd7	arm64: mm: get rid of kimage_vaddr global variable We store the address of _text in kimage_vaddr, but since commit 09e3c22a86f6889d ("arm64: Use a variable to store non-global mappings decision"), we no longer reference this variable from modules so we no longer need to export it. In fact, we don't need it at all so let's just get rid of it. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20231129111555.3594833-46-ardb@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 11:06:28 +00:00
Ard Biesheuvel	a22fc8e102	arm64: mm: Take potential load offset into account when KASLR is off We enable CONFIG_RELOCATABLE even when CONFIG_RANDOMIZE_BASE is disabled, and this permits the loader (i.e., EFI) to place the kernel anywhere in physical memory as long as the base address is 64k aligned. This means that the 'KASLR' case described in the header that defines the size of the statically allocated page tables could take effect even when CONFIG_RANDMIZE_BASE=n. So check for CONFIG_RELOCATABLE instead. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20231129111555.3594833-45-ardb@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 11:06:27 +00:00
Ard Biesheuvel	3dfdc2750c	arm64: kernel: Disable latent_entropy GCC plugin in early C runtime In subsequent patches, mark portions of the early C code will be marked as __init. Unfortunarely, __init implies __latent_entropy, and this would result in the early C code being instrumented in an unsafe manner. Disable the latent entropy plugin for the early C code. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20231129111555.3594833-44-ardb@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 11:06:27 +00:00
James Clark	bd690638e2	Documentation: arm64: Document the PMU event counting threshold feature Add documentation for the new Perf event open parameters and the threshold_max capability file. Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20231211161331.1277825-12-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 09:46:23 +00:00
James Clark	816c267544	arm64: perf: Add support for event counting threshold FEAT_PMUv3_TH (Armv8.8) permits a PMU counter to increment only on events whose count meets a specified threshold condition. For example if PMEVTYPERn.TC (Threshold Control) is set to 0b101 (Greater than or equal, count), and the threshold is set to 2, then the PMU counter will now only increment by 1 when an event would have previously incremented the PMU counter by 2 or more on a single processor cycle. Three new Perf event config fields, 'threshold', 'threshold_compare' and 'threshold_count' have been added to control the feature. threshold_compare maps to the upper two bits of PMEVTYPERn.TC and threshold_count maps to the first bit of TC. These separate attributes have been picked rather than enumerating all the possible combinations of the TC field as in the Arm ARM. The attributes would be used on a Perf command line like this: $ perf stat -e stall_slot/threshold=2,threshold_compare=2/ A new capability for reading out the maximum supported threshold value has also been added: $ cat /sys/bus/event_source/devices/armv8_pmuv3/caps/threshold_max 0x000000ff If a threshold higher than threshold_max is provided, then an error is generated. If FEAT_PMUv3_TH isn't implemented or a 32 bit kernel is running, then threshold_max reads zero, and attempting to set a threshold value will also result in an error. The threshold is per PMU counter, and there are potentially different threshold_max values per PMU type on heterogeneous systems. Bits higher than 32 now need to be written into PMEVTYPER, so armv8pmu_write_evtype() has to be updated to take an unsigned long value rather than u32 which gives the correct behavior on both aarch32 and 64. Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20231211161331.1277825-11-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 09:46:22 +00:00
James Clark	186c91aaf5	arm: pmu: Move error message and -EOPNOTSUPP to individual PMUs -EPERM or -EINVAL always get converted to -EOPNOTSUPP, so replace them. This will allow __hw_perf_event_init() to return a different code or not print that particular message for a different error in the next commit. Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20231211161331.1277825-10-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 09:46:22 +00:00
James Clark	c7b98bf0fc	KVM: selftests: aarch64: Update tools copy of arm_pmuv3.h Now that ARMV8_PMU_PMCR_N is made with GENMASK, update usages to treat it as a pre-shifted mask. Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20231211161331.1277825-9-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 09:46:22 +00:00
James Clark	a5f4ca68f3	perf/arm_dmc620: Remove duplicate format attribute #defines These were copied from the SPE driver, but now they're in the arm_pmu.h header so delete them and use the header instead. Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20231211161331.1277825-8-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 09:46:22 +00:00
James Clark	f6da86969a	arm: pmu: Share user ABI format mechanism with SPE This mechanism makes it much easier to define and read new attributes so move it to the arm_pmu.h header so that it can be shared. At the same time update the existing format attributes to use it. GENMASK has to be changed to GENMASK_ULL because the config fields are 64 bits even on arm32 where this will also be used now. Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20231211161331.1277825-7-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 09:46:22 +00:00
James Clark	3115ee021b	arm64: perf: Include threshold control fields in PMEVTYPER mask FEAT_PMUv3_TH (Armv8.8) adds two new fields to PMEVTYPER, so include them in the mask. These aren't writable on 32 bit kernels as they are in the high part of the register, so only include them for arm64. It would be difficult to do this statically in the asm header files for each platform without resulting in circular includes or #ifdefs inline in the code. For that reason the ARMV8_PMU_EVTYPE_MASK definition has been removed and the mask is constructed programmatically. Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: James Clark <james.clark@arm.com> Link: https://lore.kernel.org/r/20231211161331.1277825-6-james.clark@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-12-12 09:46:22 +00:00

1 2 3 4 5 ...

1234188 Commits