linux

iv/linux

Author	SHA1	Message	Date
Dan Li	af4a142ab7	arm64: Mark __stack_chk_guard as __ro_after_init [ Upstream commit 9fcb2e93f41c07a400885325e7dbdfceba6efaec ] __stack_chk_guard is setup once while init stage and never changed after that. Although the modification of this variable at runtime will usually cause the kernel to crash (so does the attacker), it should be marked as __ro_after_init, and it should not affect performance if it is placed in the ro_after_init section. Signed-off-by: Dan Li <ashimida@linux.alibaba.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/1631612642-102881-1-git-send-email-ashimida@linux.alibaba.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-09-30 10:09:25 +02:00
Thomas Gleixner	2f7bfc07e3	drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION() [ Upstream commit 4b92d4add5f6dcf21275185c997d6ecb800054cd ] DEFINE_SMP_CALL_CACHE_FUNCTION() was usefel before the CPU hotplug rework to ensure that the cache related functions are called on the upcoming CPU because the notifier itself could run on any online CPU. The hotplug state machine guarantees that the callbacks are invoked on the upcoming CPU. So there is no need to have this SMP function call obfuscation. That indirection was missed when the hotplug notifiers were converted. This also solves the problem of ARM64 init_cache_level() invoking ACPI functions which take a semaphore in that context. That's invalid as SMP function calls run with interrupts disabled. Running it just from the callback in context of the CPU hotplug thread solves this. Fixes: 8571890e1513 ("arm64: Add support for ACPI based firmware tables") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/871r69ersb.ffs@tglx Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-09-26 14:07:10 +02:00
Mark Brown	484fbe9cc0	arm64/sve: Use correct size when reinitialising SVE state commit e35ac9d0b56e9efefaeeb84b635ea26c2839ea86 upstream. When we need a buffer for SVE register state we call sve_alloc() to make sure that one is there. In order to avoid repeated allocations and frees we keep the buffer around unless we change vector length and just memset() it to ensure a clean register state. The function that deals with this takes the task to operate on as an argument, however in the case where we do a memset() we initialise using the SVE state size for the current task rather than the task passed as an argument. This is only an issue in the case where we are setting the register state for a task via ptrace and the task being configured has a different vector length to the task tracing it. In the case where the buffer is larger in the traced process we will leak old state from the traced process to itself, in the case where the buffer is smaller in the traced process we will overflow the buffer and corrupt memory. Fixes: bc0ee4760364 ("arm64/sve: Core task context handling") Cc: <stable@vger.kernel.org> # 4.15.x Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20210909165356.10675-1-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-09-22 12:26:39 +02:00
Mark Rutland	3411b481ed	arm64: head: avoid over-mapping in map_memory commit 90268574a3e8a6b883bd802d702a2738577e1006 upstream. The `compute_indices` and `populate_entries` macros operate on inclusive bounds, and thus the `map_memory` macro which uses them also operates on inclusive bounds. We pass `_end` and `_idmap_text_end` to `map_memory`, but these are exclusive bounds, and if one of these is sufficiently aligned (as a result of kernel configuration, physical placement, and KASLR), then: * In `compute_indices`, the computed `iend` will be in the page/block after the final byte of the intended mapping. * In `populate_entries`, an unnecessary entry will be created at the end of each level of table. At the leaf level, this entry will map up to SWAPPER_BLOCK_SIZE bytes of physical addresses that we did not intend to map. As we may map up to SWAPPER_BLOCK_SIZE bytes more than intended, we may violate the boot protocol and map physical address past the 2MiB-aligned end address we are permitted to map. As we map these with Normal memory attributes, this may result in further problems depending on what these physical addresses correspond to. The final entry at each level may require an additional table at that level. As EARLY_ENTRIES() calculates an inclusive bound, we allocate enough memory for this. Avoid the extraneous mapping by having map_memory convert the exclusive end address to an inclusive end address by subtracting one, and do likewise in EARLY_ENTRIES() when calculating the number of required tables. For clarity, comments are updated to more clearly document which boundaries the macros operate on. For consistency with the other macros, the comments in map_memory are also updated to describe `vstart` and `vend` as virtual addresses. Fixes: 0370b31e4845 ("arm64: Extend early page table code to allow for larger kernels") Cc: <stable@vger.kernel.org> # 4.16.x Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Steve Capper <steve.capper@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210823101253.55567-1-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-09-22 12:26:20 +02:00
Mark Rutland	3c197fdd07	arm64: fix compat syscall return truncation commit e30e8d46cf605d216a799a28c77b8a41c328613a upstream. Due to inconsistencies in the way we manipulate compat GPRs, we have a few issues today: * For audit and tracing, where error codes are handled as a (native) long, negative error codes are expected to be sign-extended to the native 64-bits, or they may fail to be matched correctly. Thus a syscall which fails with an error may erroneously be identified as failing. * For ptrace, all compat return values should be sign-extended for consistency with 32-bit arm, but we currently only do this for negative return codes. * As we may transiently set the upper 32 bits of some compat GPRs while in the kernel, these can be sampled by perf, which is somewhat confusing. This means that where a syscall returns a pointer above 2G, this will be sign-extended, but will not be mistaken for an error as error codes are constrained to the inclusive range [-4096, -1] where no user pointer can exist. To fix all of these, we must consistently use helpers to get/set the compat GPRs, ensuring that we never write the upper 32 bits of the return code, and always sign-extend when reading the return code. This patch does so, with the following changes: * We re-organise syscall_get_return_value() to always sign-extend for compat tasks, and reimplement syscall_get_error() atop. We update syscall_trace_exit() to use syscall_get_return_value(). * We consistently use syscall_set_return_value() to set the return value, ensureing the upper 32 bits are never set unexpectedly. * As the core audit code currently uses regs_return_value() rather than syscall_get_return_value(), we special-case this for compat_user_mode(regs) such that this will do the right thing. Going forward, we should try to move the core audit code over to syscall_get_return_value(). Cc: <stable@vger.kernel.org> Reported-by: He Zhe <zhe.he@windriver.com> Reported-by: weiyuchen <weiyuchen3@huawei.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210802104200.21390-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> [Mark: trivial conflict resolution for v5.4.y] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-08-12 13:21:05 +02:00
Anshuman Khandual	3bf0509d25	arm64/mm: Fix ttbr0 values stored in struct thread_info for software-pan [ Upstream commit 9163f01130304fab1f74683d7d44632da7bda637 ] When using CONFIG_ARM64_SW_TTBR0_PAN, a task's thread_info::ttbr0 must be the TTBR0_EL1 value used to run userspace. With 52-bit PAs, the PA must be packed into the TTBR using phys_to_ttbr(), but we forget to do this in some of the SW PAN code. Thus, if the value is installed into TTBR0_EL1 (as may happen in the uaccess routines), this could result in UNPREDICTABLE behaviour. Since hardware with 52-bit PA support almost certainly has HW PAN, which will be used in preference, this shouldn't be a practical issue, but let's fix this for consistency. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Fixes: 529c4b05a3cb ("arm64: handle 52-bit addresses in TTBR") Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/1623749578-11231-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-07-14 16:53:23 +02:00
Mark Rutland	8e6bcc5664	arm64: consistently use reserved_pg_dir [ Upstream commit 833be850f1cabd0e3b5337c0fcab20a6e936dd48 ] Depending on configuration options and specific code paths, we either use the empty_zero_page or the configuration-dependent reserved_ttbr0 as a reserved value for TTBR{0,1}_EL1. To simplify this code, let's always allocate and use the same reserved_pg_dir, replacing reserved_ttbr0. Note that this is allocated (and hence pre-zeroed), and is also marked as read-only in the kernel Image mapping. Keeping this separate from the empty_zero_page potentially helps with robustness as the empty_zero_page is used in a number of cases where a failure to map it read-only could allow it to become corrupted. The (presently unused) swapper_pg_end symbol is also removed, and comments are added wherever we rely on the offsets between the pre-allocated pg_dirs to keep these cases easily identifiable. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201103102229.8542-1-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-07-14 16:53:23 +02:00
Bill Wendling	13a474c013	arm64/vdso: Discard .note.gnu.property sections in vDSO [ Upstream commit 388708028e6937f3fc5fc19aeeb847f8970f489c ] The arm64 assembler in binutils 2.32 and above generates a program property note in a note section, .note.gnu.property, to encode used x86 ISAs and features. But the kernel linker script only contains a single NOTE segment: PHDRS { text PT_LOAD FLAGS(5) FILEHDR PHDRS; /* PF_R\|PF_X / dynamic PT_DYNAMIC FLAGS(4); / PF_R / note PT_NOTE FLAGS(4); / PF_R */ } The NOTE segment generated by the vDSO linker script is aligned to 4 bytes. But the .note.gnu.property section must be aligned to 8 bytes on arm64. $ readelf -n vdso64.so Displaying notes found in: .note Owner Data size Description Linux 0x00000004 Unknown note type: (0x00000000) description data: 06 00 00 00 readelf: Warning: note with invalid namesz and/or descsz found at offset 0x20 readelf: Warning: type: 0x78, namesize: 0x00000100, descsize: 0x756e694c, alignment: 8 Since the note.gnu.property section in the vDSO is not checked by the dynamic linker, discard the .note.gnu.property sections in the vDSO. Similar to commit 4caffe6a28d31 ("x86/vdso: Discard .note.gnu.property sections in vDSO"), but for arm64. Signed-off-by: Bill Wendling <morbo@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210423205159.830854-1-morbo@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-05-11 14:04:14 +02:00
Suzuki K Poulose	2012f9f754	KVM: arm64: Hide system instruction access to Trace registers [ Upstream commit 1d676673d665fd2162e7e466dcfbe5373bfdb73e ] Currently we advertise the ID_AA6DFR0_EL1.TRACEVER for the guest, when the trace register accesses are trapped (CPTR_EL2.TTA == 1). So, the guest will get an undefined instruction, if trusts the ID registers and access one of the trace registers. Lets be nice to the guest and hide the feature to avoid unexpected behavior. Even though this can be done at KVM sysreg emulation layer, we do this by removing the TRACEVER from the sanitised feature register field. This is fine as long as the ETM drivers can handle the individual trace units separately, even when there are differences among the CPUs. Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210323120647.454211-2-suzuki.poulose@arm.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-04-16 11:46:37 +02:00
Pavel Tatashin	c82d289fe9	arm64: kdump: update ppos when reading elfcorehdr [ Upstream commit 141f8202cfa4192c3af79b6cbd68e7760bb01b5a ] The ppos points to a position in the old kernel memory (and in case of arm64 in the crash kernel since elfcorehdr is passed as a segment). The function should update the ppos by the amount that was read. This bug is not exposed by accident, but other platforms update this value properly. So, fix it in ARM64 version of elfcorehdr_read() as well. Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com> Fixes: e62aaeac426a ("arm64: kdump: provide /proc/vmcore file") Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Link: https://lore.kernel.org/r/20210319205054.743368-1-pasha.tatashin@soleen.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-30 14:35:27 +02:00
Ard Biesheuvel	88c79851b8	arm64: mm: use a 48-bit ID map when possible on 52-bit VA builds [ Upstream commit 7ba8f2b2d652cd8d8a2ab61f4be66973e70f9f88 ] 52-bit VA kernels can run on hardware that is only 48-bit capable, but configure the ID map as 52-bit by default. This was not a problem until recently, because the special T0SZ value for a 52-bit VA space was never programmed into the TCR register anwyay, and because a 52-bit ID map happens to use the same number of translation levels as a 48-bit one. This behavior was changed by commit 1401bef703a4 ("arm64: mm: Always update TCR_EL1 from __cpu_set_tcr_t0sz()"), which causes the unsupported T0SZ value for a 52-bit VA to be programmed into TCR_EL1. While some hardware simply ignores this, Mark reports that Amberwing systems choke on this, resulting in a broken boot. But even before that commit, the unsupported idmap_t0sz value was exposed to KVM and used to program TCR_EL2 incorrectly as well. Given that we already have to deal with address spaces being either 48-bit or 52-bit in size, the cleanest approach seems to be to simply default to a 48-bit VA ID map, and only switch to a 52-bit one if the placement of the kernel in DRAM requires it. This is guaranteed not to happen unless the system is actually 52-bit VA capable. Fixes: 90ec95cda91a ("arm64: mm: Introduce VA_BITS_MIN") Reported-by: Mark Salter <msalter@redhat.com> Link: http://lore.kernel.org/r/20210310003216.410037-1-msalter@redhat.com Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210310171515.416643-2-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-17 17:03:56 +01:00
Timothy E Baldwin	1f8884d044	arm64: ptrace: Fix seccomp of traced syscall -1 (NO_SYSCALL) commit df84fe94708985cdfb78a83148322bcd0a699472 upstream. Since commit f086f67485c5 ("arm64: ptrace: add support for syscall emulation"), if system call number -1 is called and the process is being traced with PTRACE_SYSCALL, for example by strace, the seccomp check is skipped and -ENOSYS is returned unconditionally (unless altered by the tracer) rather than carrying out action specified in the seccomp filter. The consequence of this is that it is not possible to reliably strace a seccomp based implementation of a foreign system call interface in which r7/x8 is permitted to be -1 on entry to a system call. Also trace_sys_enter and audit_syscall_entry are skipped if a system call is skipped. Fix by removing the in_syscall(regs) check restoring the previous behaviour which is like AArch32, x86 (which uses generic code) and everything else. Cc: Oleg Nesterov <oleg@redhat.com> Cc: Catalin Marinas<catalin.marinas@arm.com> Cc: <stable@vger.kernel.org> Fixes: f086f67485c5 ("arm64: ptrace: add support for syscall emulation") Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Timothy E Baldwin <T.E.Baldwin99@members.leeds.ac.uk> Link: https://lore.kernel.org/r/90edd33b-6353-1228-791f-0336d94d5f8c@majoroak.me.uk Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-09 11:09:38 +01:00
Shaoying Xu	9757d5c4fc	arm64 module: set plt* section addresses to 0x0 commit f5c6d0fcf90ce07ee0d686d465b19b247ebd5ed7 upstream. These plt* and .text.ftrace_trampoline sections specified for arm64 have non-zero addressses. Non-zero section addresses in a relocatable ELF would confuse GDB when it tries to compute the section offsets and it ends up printing wrong symbol addresses. Therefore, set them to zero, which mirrors the change in commit 5d8591bc0fba ("module: set ksymtab/kcrctab* section addresses to 0x0"). Reported-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Shaoying Xu <shaoyi@amazon.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210216183234.GA23876@amazon.com Signed-off-by: Will Deacon <will@kernel.org> [shaoyi@amazon.com: made same changes in arch/arm64/kernel/module.lds for 5.4] Signed-off-by: Shaoying Xu <shaoyi@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-07 12:20:41 +01:00
Suzuki K Poulose	2e1df9bfe6	arm64: Extend workaround for erratum 1024718 to all versions of Cortex-A55 commit c0b15c25d25171db4b70cc0b7dbc1130ee94017d upstream. The erratum 1024718 affects Cortex-A55 r0p0 to r2p0. However we apply the work around for r0p0 - r1p0. Unfortunately this won't be fixed for the future revisions for the CPU. Thus extend the work around for all versions of A55, to cover for r2p0 and any future revisions. Cc: stable@vger.kernel.org Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20210203230057.3961239-1-suzuki.poulose@arm.com [will: Update Kconfig help text] Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-04 10:26:49 +01:00
He Zhe	a82ebd5dde	arm64: uprobe: Return EOPNOTSUPP for AARCH32 instruction probing commit d47422d953e258ad587b5edf2274eb95d08bdc7d upstream. As stated in linux/errno.h, ENOTSUPP should never be seen by user programs. When we set up uprobe with 32-bit perf and arm64 kernel, we would see the following vague error without useful hint. The sys_perf_event_open() syscall returned with 524 (INTERNAL ERROR: strerror_r(524, [buf], 128)=22) Use EOPNOTSUPP instead to indicate such cases. Signed-off-by: He Zhe <zhe.he@windriver.com> Link: https://lore.kernel.org/r/20210223082535.48730-1-zhe.he@windriver.com Cc: <stable@vger.kernel.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-04 10:26:47 +01:00
qiuguorui1	efca4c991e	arm64: kexec_file: fix memory leakage in create_dtb() when fdt_open_into() fails commit 656d1d58d8e0958d372db86c24f0b2ea36f50888 upstream. in function create_dtb(), if fdt_open_into() fails, we need to vfree buf before return. Fixes: 52b2a8af7436 ("arm64: kexec_file: load initrd and device-tree") Cc: stable@vger.kernel.org # v5.0 Signed-off-by: qiuguorui1 <qiuguorui1@huawei.com> Link: https://lore.kernel.org/r/20210218125900.6810-1-qiuguorui1@huawei.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-04 10:26:47 +01:00
Marc Zyngier	1a47856548	arm64: Add missing ISB after invalidating TLB in __primary_switch [ Upstream commit 9d41053e8dc115c92b8002c3db5f545d7602498b ] Although there has been a bit of back and forth on the subject, it appears that invalidating TLBs requires an ISB instruction when FEAT_ETS is not implemented by the CPU. From the bible: \| In an implementation that does not implement FEAT_ETS, a TLB \| maintenance instruction executed by a PE, PEx, can complete at any \| time after it is issued, but is only guaranteed to be finished for a \| PE, PEx, after the execution of DSB by the PEx followed by a Context \| synchronization event Add the missing ISB in __primary_switch, just in case. Fixes: 3c5e9f238bc4 ("arm64: head.S: move KASLR processing out of __enable_mmu()") Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210224093738.3629662-3-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-04 10:26:39 +01:00
Mark Rutland	230290dca2	arm64: syscall: exit userspace before unmasking exceptions [ Upstream commit ca1314d73eed493c49bb1932c60a8605530db2e4 ] In el0_svc_common() we unmask exceptions before we call user_exit(), and so there's a window where an IRQ or debug exception can be taken while RCU is not watching. In do_debug_exception() we account for this in via debug_exception_{enter,exit}(), but in the el1_irq asm we do not and we call trace functions which rely on RCU before we have a guarantee that RCU is watching. Let's avoid this by having el0_svc_common() exit userspace before unmasking exceptions, matching what we do for all other EL0 entry paths. We can use user_exit_irqoff() to avoid the pointless save/restore of IRQ flags while we're sure exceptions are masked in DAIF. The workaround for Cortex-A76 erratum 1463225 may trigger a debug exception before this point, but the debug code invoked in this case is safe even when RCU is not watching. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-12-30 11:50:57 +01:00
Will Deacon	54d11983c2	arm64: smp: Tell RCU about CPUs that fail to come online [ Upstream commit 04e613ded8c26489b3e0f9101b44462f780d1a35 ] Commit ce3d31ad3cac ("arm64/smp: Move rcu_cpu_starting() earlier") ensured that RCU is informed early about incoming CPUs that might end up calling into printk() before they are online. However, if such a CPU fails the early CPU feature compatibility checks in check_local_cpu_capabilities(), then it will be powered off or parked without informing RCU, leading to an endless stream of stalls: \| rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: \| rcu: 2-O...: (0 ticks this GP) idle=002/1/0x4000000000000000 softirq=0/0 fqs=2593 \| (detected by 0, t=5252 jiffies, g=9317, q=136) \| Task dump for CPU 2: \| task:swapper/2 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x00000028 \| Call trace: \| ret_from_fork+0x0/0x30 Ensure that the dying CPU invokes rcu_report_dead() prior to being powered off or parked. Cc: Qian Cai <cai@redhat.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Suggested-by: Qian Cai <cai@redhat.com> Link: https://lore.kernel.org/r/20201105222242.GA8842@willie-the-truck Link: https://lore.kernel.org/r/20201106103602.9849-3-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-11-24 13:29:01 +01:00
Will Deacon	e8df8c25aa	arm64: psci: Avoid printing in cpu_psci_cpu_die() [ Upstream commit 891deb87585017d526b67b59c15d38755b900fea ] cpu_psci_cpu_die() is called in the context of the dying CPU, which will no longer be online or tracked by RCU. It is therefore not generally safe to call printk() if the PSCI "cpu off" request fails, so remove the pr_crit() invocation. Cc: Qian Cai <cai@redhat.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20201106103602.9849-2-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-11-24 13:29:01 +01:00
Will Deacon	71eea3d3df	arm64: errata: Fix handling of 1418040 with late CPU onlining [ Upstream commit f969f03888b9438fdb227b6460d99ede5737326d ] In a surprising turn of events, it transpires that CPU capabilities configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the result of late-onlining. Therefore our handling of erratum 1418040 does not get activated if it is not required by any of the boot CPUs, even though we allow late-onlining of an affected CPU. In order to get things working again, replace the cpus_have_const_cap() invocation with an explicit check for the current CPU using this_cpu_has_cap(). Cc: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Cc: Stephen Boyd <swboyd@chromium.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201106114952.10032-1-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-11-24 13:29:00 +01:00
Qian Cai	8ee6a0f254	arm64/smp: Move rcu_cpu_starting() earlier [ Upstream commit ce3d31ad3cac765484463b4f5a0b6b1f8f1a963e ] The call to rcu_cpu_starting() in secondary_start_kernel() is not early enough in the CPU-hotplug onlining process, which results in lockdep splats as follows: WARNING: suspicious RCU usage ----------------------------- kernel/locking/lockdep.c:3497 RCU-list traversed in non-reader section!! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 no locks held by swapper/1/0. Call trace: dump_backtrace+0x0/0x3c8 show_stack+0x14/0x60 dump_stack+0x14c/0x1c4 lockdep_rcu_suspicious+0x134/0x14c __lock_acquire+0x1c30/0x2600 lock_acquire+0x274/0xc48 _raw_spin_lock+0xc8/0x140 vprintk_emit+0x90/0x3d0 vprintk_default+0x34/0x40 vprintk_func+0x378/0x590 printk+0xa8/0xd4 __cpuinfo_store_cpu+0x71c/0x868 cpuinfo_store_cpu+0x2c/0xc8 secondary_start_kernel+0x244/0x318 This is avoided by moving the call to rcu_cpu_starting up near the beginning of the secondary_start_kernel() function. Signed-off-by: Qian Cai <cai@redhat.com> Acked-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/lkml/160223032121.7002.1269740091547117869.tip-bot2@tip-bot2/ Link: https://lore.kernel.org/r/20201028182614.13655-1-cai@redhat.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-11-10 12:37:31 +01:00
Valentin Schneider	80685a94f7	arm64: topology: Stop using MPIDR for topology information [ Upstream commit 3102bc0e6ac752cc5df896acb557d779af4d82a1 ] In the absence of ACPI or DT topology data, we fallback to haphazardly decoding something out of MPIDR. Sadly, the contents of that register are mostly unusable due to the implementation leniancy and things like Aff0 having to be capped to 15 (despite being encoded on 8 bits). Consider a simple system with a single package of 32 cores, all under the same LLC. We ought to be shoving them in the same core_sibling mask, but MPIDR is going to look like: \| CPU \| 0 \| ... \| 15 \| 16 \| ... \| 31 \| \|------+---+-----+----+----+-----+----+ \| Aff0 \| 0 \| ... \| 15 \| 0 \| ... \| 15 \| \| Aff1 \| 0 \| ... \| 0 \| 1 \| ... \| 1 \| \| Aff2 \| 0 \| ... \| 0 \| 0 \| ... \| 0 \| Which will eventually yield core_sibling(0-15) == 0-15 core_sibling(16-31) == 16-31 NUMA woes ========= If we try to play games with this and set up NUMA boundaries within those groups of 16 cores via e.g. QEMU: # Node0: 0-9; Node1: 10-19 $ qemu-system-aarch64 <blah> \ -smp 20 -numa node,cpus=0-9,nodeid=0 -numa node,cpus=10-19,nodeid=1 The scheduler's MC domain (all CPUs with same LLC) is going to be built via arch_topology.c::cpu_coregroup_mask() In there we try to figure out a sensible mask out of the topology information we have. In short, here we'll pick the smallest of NUMA or core sibling mask. node_mask(CPU9) == 0-9 core_sibling(CPU9) == 0-15 MC mask for CPU9 will thus be 0-9, not a problem. node_mask(CPU10) == 10-19 core_sibling(CPU10) == 0-15 MC mask for CPU10 will thus be 10-19, not a problem. node_mask(CPU16) == 10-19 core_sibling(CPU16) == 16-19 MC mask for CPU16 will thus be 16-19... Uh oh. CPUs 16-19 are in two different unique MC spans, and the scheduler has no idea what to make of that. That triggers the WARN_ON() added by commit ccf74128d66c ("sched/topology: Assert non-NUMA topology masks don't (partially) overlap") Fixing MPIDR-derived topology ============================= We could try to come up with some cleverer scheme to figure out which of the available masks to pick, but really if one of those masks resulted from MPIDR then it should be discarded because it's bound to be bogus. I was hoping to give MPIDR a chance for SMT, to figure out which threads are in the same core using Aff1-3 as core ID, but Sudeep and Robin pointed out to me that there are systems out there where all cores have non-zero values in their higher affinity fields (e.g. RK3288 has "5" in all of its cores' MPIDR.Aff1), which would expose a bogus core ID to userspace. Stop using MPIDR for topology information. When no other source of topology information is available, mark each CPU as its own core and its NUMA node as its LLC domain. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20200829130016.26106-1-valentin.schneider@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-11-05 11:43:16 +01:00
Marc Zyngier	7736c61080	arm64: Run ARCH_WORKAROUND_2 enabling code on all CPUs commit 39533e12063be7f55e3d6ae21ffe067799d542a4 upstream. Commit 606f8e7b27bf ("arm64: capabilities: Use linear array for detection and verification") changed the way we deal with per-CPU errata by only calling the .matches() callback until one CPU is found to be affected. At this point, .matches() stop being called, and .cpu_enable() will be called on all CPUs. This breaks the ARCH_WORKAROUND_2 handling, as only a single CPU will be mitigated. In order to address this, forcefully call the .matches() callback from a .cpu_enable() callback, which brings us back to the original behaviour. Fixes: 606f8e7b27bf ("arm64: capabilities: Use linear array for detection and verification") Cc: <stable@vger.kernel.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-11-01 12:01:02 +01:00
Marc Zyngier	114c6930b3	arm64: Run ARCH_WORKAROUND_1 enabling code on all CPUs commit 18fce56134c987e5b4eceddafdbe4b00c07e2ae1 upstream. Commit 73f381660959 ("arm64: Advertise mitigation of Spectre-v2, or lack thereof") changed the way we deal with ARCH_WORKAROUND_1, by moving most of the enabling code to the .matches() callback. This has the unfortunate effect that the workaround gets only enabled on the first affected CPU, and no other. In order to address this, forcefully call the .matches() callback from a .cpu_enable() callback, which brings us back to the original behaviour. Fixes: 73f381660959 ("arm64: Advertise mitigation of Spectre-v2, or lack thereof") Cc: <stable@vger.kernel.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-11-01 12:01:01 +01:00
Anshuman Khandual	180e60f154	arm64/cpufeature: Drop TraceFilt feature exposure from ID_DFR0 register [ Upstream commit 1ed1b90a0594c8c9d31e8bb8be25a2b37717dc9e ] ID_DFR0 based TraceFilt feature should not be exposed to guests. Hence lets drop it. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/1589881254-10082-3-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-01 13:18:03 +02:00
James Morse	af02933d59	arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work [ Upstream commit 8fcc4ae6faf8b455eeef00bc9ae70744e3b0f462 ] APEI is unable to do all of its error handling work in nmi-context, so it defers non-fatal work onto the irq_work queue. arch_irq_work_raise() sends an IPI to the calling cpu, but this is not guaranteed to be taken before returning to user-space. Unless the exception interrupted a context with irqs-masked, irq_work_run() can run immediately. Otherwise return -EINPROGRESS to indicate ghes_notify_sea() found some work to do, but it hasn't finished yet. With this apei_claim_sea() returning '0' means this external-abort was also notification of a firmware-first RAS error, and that APEI has processed the CPER records. Signed-off-by: James Morse <james.morse@arm.com> Tested-by: Tyler Baicar <baicar@os.amperecomputing.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-01 13:18:02 +02:00
Will Deacon	8e32fed034	arm64: cpufeature: Relax checks for AArch32 support at EL[0-2] [ Upstream commit 98448cdfe7060dd5491bfbd3f7214ffe1395d58e ] We don't need to be quite as strict about mismatched AArch32 support, which is good because the friendly hardware folks have been busy mismatching this to their hearts' content. * We don't care about EL2 or EL3 (there are silly comments concerning the latter, so remove those) * EL1 support is gated by the ARM64_HAS_32BIT_EL1 capability and handled gracefully when a mismatch occurs * EL0 support is gated by the ARM64_HAS_32BIT_EL0 capability and handled gracefully when a mismatch occurs Relax the AArch32 checks to FTR_NONSTRICT. Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20200421142922.18950-8-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-01 13:17:57 +02:00
Mark Rutland	3283bd6d19	arm64: insn: consistently handle exit text [ Upstream commit ca2ef4ffabbef25644e02a98b0f48869f8be0375 ] A kernel built with KASAN && FTRACE_WITH_REGS && !MODULES, produces a boot-time splat in the bowels of ftrace: \| [ 0.000000] ftrace: allocating 32281 entries in 127 pages \| [ 0.000000] ------------[ cut here ]------------ \| [ 0.000000] WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:2019 ftrace_bug+0x27c/0x328 \| [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.4.0-rc3-00008-g7f08ae53a7e3 #13 \| [ 0.000000] Hardware name: linux,dummy-virt (DT) \| [ 0.000000] pstate: 60000085 (nZCv daIf -PAN -UAO) \| [ 0.000000] pc : ftrace_bug+0x27c/0x328 \| [ 0.000000] lr : ftrace_init+0x640/0x6cc \| [ 0.000000] sp : ffffa000120e7e00 \| [ 0.000000] x29: ffffa000120e7e00 x28: ffff00006ac01b10 \| [ 0.000000] x27: ffff00006ac898c0 x26: dfffa00000000000 \| [ 0.000000] x25: ffffa000120ef290 x24: ffffa0001216df40 \| [ 0.000000] x23: 000000000000018d x22: ffffa0001244c700 \| [ 0.000000] x21: ffffa00011bf393c x20: ffff00006ac898c0 \| [ 0.000000] x19: 00000000ffffffff x18: 0000000000001584 \| [ 0.000000] x17: 0000000000001540 x16: 0000000000000007 \| [ 0.000000] x15: 0000000000000000 x14: ffffa00010432770 \| [ 0.000000] x13: ffff940002483519 x12: 1ffff40002483518 \| [ 0.000000] x11: 1ffff40002483518 x10: ffff940002483518 \| [ 0.000000] x9 : dfffa00000000000 x8 : 0000000000000001 \| [ 0.000000] x7 : ffff940002483519 x6 : ffffa0001241a8c0 \| [ 0.000000] x5 : ffff940002483519 x4 : ffff940002483519 \| [ 0.000000] x3 : ffffa00011780870 x2 : 0000000000000001 \| [ 0.000000] x1 : 1fffe0000d591318 x0 : 0000000000000000 \| [ 0.000000] Call trace: \| [ 0.000000] ftrace_bug+0x27c/0x328 \| [ 0.000000] ftrace_init+0x640/0x6cc \| [ 0.000000] start_kernel+0x27c/0x654 \| [ 0.000000] random: get_random_bytes called from print_oops_end_marker+0x30/0x60 with crng_init=0 \| [ 0.000000] ---[ end trace 0000000000000000 ]--- \| [ 0.000000] ftrace faulted on writing \| [ 0.000000] [<ffffa00011bf393c>] _GLOBAL__sub_D_65535_0___tracepoint_initcall_level+0x4/0x28 \| [ 0.000000] Initializing ftrace call sites \| [ 0.000000] ftrace record flags: 0 \| [ 0.000000] (0) \| [ 0.000000] expected tramp: ffffa000100b3344 This is due to an unfortunate combination of several factors. Building with KASAN results in the compiler generating anonymous functions to register/unregister global variables against the shadow memory. These functions are placed in .text.startup/.text.exit, and given mangled names like _GLOBAL__sub_{I,D}_65535_0_$OTHER_SYMBOL. The kernel linker script places these in .init.text and .exit.text respectively, which are both discarded at runtime as part of initmem. Building with FTRACE_WITH_REGS uses -fpatchable-function-entry=2, which also instruments KASAN's anonymous functions. When these are discarded with the rest of initmem, ftrace removes dangling references to these call sites. Building without MODULES implicitly disables STRICT_MODULE_RWX, and causes arm64's patch_map() function to treat any !core_kernel_text() symbol as something that can be modified in-place. As core_kernel_text() is only true for .text and .init.text, with the latter depending on system_state < SYSTEM_RUNNING, we'll treat .exit.text as something that can be patched in-place. However, .exit.text is mapped read-only. Hence in this configuration the ftrace init code blows up while trying to patch one of the functions generated by KASAN. We could try to filter out the call sites in .exit.text rather than initializing them, but this would be inconsistent with how we handle .init.text, and requires hooking into core bits of ftrace. The behaviour of patch_map() is also inconsistent today, so instead let's clean that up and have it consistently handle .exit.text. This patch teaches patch_map() to handle .exit.text at init time, preventing the boot-time splat above. The flow of patch_map() is reworked to make the logic clearer and minimize redundant conditionality. Fixes: 3b23e4991fb66f6d ("arm64: implement ftrace with regs") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Torsten Duwe <duwe@suse.de> Cc: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-01 13:17:23 +02:00
Marc Zyngier	cdf990e2b2	arm64: Allow CPUs unffected by ARM erratum 1418040 to come in late [ Upstream commit ed888cb0d1ebce69f12794e89fbd5e2c86d40b8d ] Now that we allow CPUs affected by erratum 1418040 to come in late, this prevents their unaffected sibblings from coming in late (or coming back after a suspend or hotplug-off, which amounts to the same thing). To allow this, we need to add ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU, which amounts to set .type to ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE. Fixes: bf87bb0881d0 ("arm64: Allow booting of late CPUs affected by erratum 1418040") Reported-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Tested-by: Matthias Kaehlcke <mka@chromium.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200911181611.2073183-1-maz@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-09-23 12:40:40 +02:00
Jessica Yu	ff37a26364	arm64/module: set trampoline section flags regardless of CONFIG_DYNAMIC_FTRACE [ Upstream commit e0328feda79d9681b3e3245e6e180295550c8ee9 ] In the arm64 module linker script, the section .text.ftrace_trampoline is specified unconditionally regardless of whether CONFIG_DYNAMIC_FTRACE is enabled (this is simply due to the limitation that module linker scripts are not preprocessed like the vmlinux one). Normally, for .plt and .text.ftrace_trampoline, the section flags present in the module binary wouldn't matter since module_frob_arch_sections() would assign them manually anyway. However, the arm64 module loader only sets the section flags for .text.ftrace_trampoline when CONFIG_DYNAMIC_FTRACE=y. That's only become problematic recently due to a recent change in binutils-2.35, where the .text.ftrace_trampoline section (along with the .plt section) is now marked writable and executable (WAX). We no longer allow writable and executable sections to be loaded due to commit 5c3a7db0c7ec ("module: Harden STRICT_MODULE_RWX"), so this is causing all modules linked with binutils-2.35 to be rejected under arm64. Drop the IS_ENABLED(CONFIG_DYNAMIC_FTRACE) check in module_frob_arch_sections() so that the section flags for .text.ftrace_trampoline get properly set to SHF_EXECINSTR\|SHF_ALLOC, without SHF_WRITE. Signed-off-by: Jessica Yu <jeyu@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: http://lore.kernel.org/r/20200831094651.GA16385@linux-8ccs Link: https://lore.kernel.org/r/20200901160016.3646-1-jeyu@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-09-17 13:47:49 +02:00
James Morse	1744237ca0	KVM: arm64: Add kvm_extable for vaxorcism code commit e9ee186bb735bfc17fa81dbc9aebf268aee5b41e upstream. KVM has a one instruction window where it will allow an SError exception to be consumed by the hypervisor without treating it as a hypervisor bug. This is used to consume asynchronous external abort that were caused by the guest. As we are about to add another location that survives unexpected exceptions, generalise this code to make it behave like the host's extable. KVM's version has to be mapped to EL2 to be accessible on nVHE systems. The SError vaxorcism code is a one instruction window, so has two entries in the extable. Because the KVM code is copied for VHE and nVHE, we end up with four entries, half of which correspond with code that isn't mapped. Cc: <stable@vger.kernel.org> # 5.4.x Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-09-05 11:22:50 +02:00
Marc Zyngier	2f4b202eb1	arm64: Allow booting of late CPUs affected by erratum 1418040 [ Upstream commit bf87bb0881d0f59181fe3bbcf29c609f36483ff8 ] As we can now switch from a system that isn't affected by 1418040 to a system that globally is affected, let's allow affected CPUs to come in at a later time. Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200731173824.107480-3-maz@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-09-03 11:27:01 +02:00
Marc Zyngier	82b05f0838	arm64: Move handling of erratum 1418040 into C code [ Upstream commit d49f7d7376d0c0daf8680984a37bd07581ac7d38 ] Instead of dealing with erratum 1418040 on each entry and exit, let's move the handling to __switch_to() instead, which has several advantages: - It can be applied when it matters (switching between 32 and 64 bit tasks). - It is written in C (yay!) - It can rely on static keys rather than alternatives Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200731173824.107480-2-maz@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-09-03 11:27:01 +02:00
Kefeng Wang	da56eb03ea	arm64: Fix __cpu_logical_map undefined issue [ Upstream commit eaecca9e7710281be7c31d892c9f447eafd7ddd9 ] The __cpu_logical_map undefined issue occued when the new tegra194-cpufreq drvier building as a module. ERROR: modpost: "__cpu_logical_map" [drivers/cpufreq/tegra194-cpufreq.ko] undefined! The driver using cpu_logical_map() macro which will expand to __cpu_logical_map, we can't access it in a drvier. Let's turn cpu_logical_map() into a C wrapper and export it to fix the build issue. Also create a function set_cpu_logical_map(cpu, hwid) when assign a value to cpu_logical_map(cpu). Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-09-03 11:26:49 +02:00
Stephen Boyd	542a493c8c	ARM64: vdso32: Install vdso32 from vdso_install [ Upstream commit 8d75785a814241587802655cc33e384230744f0c ] Add the 32-bit vdso Makefile to the vdso_install rule so that 'make vdso_install' installs the 32-bit compat vdso when it is compiled. Fixes: a7f71a2c8903 ("arm64: compat: Add vDSO") Signed-off-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Will Deacon <will@kernel.org> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lore.kernel.org/r/20200818014950.42492-1-swboyd@chromium.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-08-26 10:41:05 +02:00
Shaokun Zhang	ab58cc0331	arm64: perf: Correct the event index in sysfs commit 539707caa1a89ee4efc57b4e4231c20c46575ccc upstream. When PMU event ID is equal or greater than 0x4000, it will be reduced by 0x4000 and it is not the raw number in the sysfs. Let's correct it and obtain the raw event ID. Before this patch: cat /sys/bus/event_source/devices/armv8_pmuv3_0/events/sample_feed event=0x001 After this patch: cat /sys/bus/event_source/devices/armv8_pmuv3_0/events/sample_feed event=0x4001 Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/1592487344-30555-3-git-send-email-zhangshaokun@hisilicon.com [will: fixed formatting of 'if' condition] Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-08-21 13:05:23 +02:00
Will Deacon	679fe09188	arm64: Use test_tsk_thread_flag() for checking TIF_SINGLESTEP [ Upstream commit 5afc78551bf5d53279036e0bf63314e35631d79f ] Rather than open-code test_tsk_thread_flag() at each callsite, simply replace the couple of offenders with calls to test_tsk_thread_flag() directly. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-07-29 10:18:40 +02:00
Will Deacon	5c2450ac7c	arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return commit 15956689a0e60aa0c795174f3c310b60d8794235 upstream. Although we zero the upper bits of x0 on entry to the kernel from an AArch32 task, we do not clear them on the exception return path and can therefore expose 64-bit sign extended syscall return values to userspace via interfaces such as the 'perf_regs' ABI, which deal exclusively with 64-bit registers. Explicitly clear the upper 32 bits of x0 on return from a compat system call. Cc: <stable@vger.kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Keno Fischer <keno@juliacomputing.com> Cc: Luis Machado <luis.machado@linaro.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-22 09:33:16 +02:00
Will Deacon	ed766e740c	arm64: ptrace: Consistently use pseudo-singlestep exceptions commit ac2081cdc4d99c57f219c1a6171526e0fa0a6fff upstream. Although the arm64 single-step state machine can be fast-forwarded in cases where we wish to generate a SIGTRAP without actually executing an instruction, this has two major limitations outside of simply skipping an instruction due to emulation. 1. Stepping out of a ptrace signal stop into a signal handler where SIGTRAP is blocked. Fast-forwarding the stepping state machine in this case will result in a forced SIGTRAP, with the handler reset to SIG_DFL. 2. The hardware implicitly fast-forwards the state machine when executing an SVC instruction for issuing a system call. This can interact badly with subsequent ptrace stops signalled during the execution of the system call (e.g. SYSCALL_EXIT or seccomp traps), as they may corrupt the stepping state by updating the PSTATE for the tracee. Resolve both of these issues by injecting a pseudo-singlestep exception on entry to a signal handler and also on return to userspace following a system call. Cc: <stable@vger.kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Tested-by: Luis Machado <luis.machado@linaro.org> Reported-by: Keno Fischer <keno@juliacomputing.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-22 09:33:16 +02:00
Will Deacon	bdb7113299	arm64: ptrace: Override SPSR.SS when single-stepping is enabled commit 3a5a4366cecc25daa300b9a9174f7fdd352b9068 upstream. Luis reports that, when reverse debugging with GDB, single-step does not function as expected on arm64: \| I've noticed, under very specific conditions, that a PTRACE_SINGLESTEP \| request by GDB won't execute the underlying instruction. As a consequence, \| the PC doesn't move, but we return a SIGTRAP just like we would for a \| regular successful PTRACE_SINGLESTEP request. The underlying problem is that when the CPU register state is restored as part of a reverse step, the SPSR.SS bit is cleared and so the hardware single-step state can transition to the "active-pending" state, causing an unexpected step exception to be taken immediately if a step operation is attempted. In hindsight, we probably shouldn't have exposed SPSR.SS in the pstate accessible by the GPR regset, but it's a bit late for that now. Instead, simply prevent userspace from configuring the bit to a value which is inconsistent with the TIF_SINGLESTEP state for the task being traced. Cc: <stable@vger.kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Keno Fischer <keno@juliacomputing.com> Link: https://lore.kernel.org/r/1eed6d69-d53d-9657-1fc9-c089be07f98c@linaro.org Reported-by: Luis Machado <luis.machado@linaro.org> Tested-by: Luis Machado <luis.machado@linaro.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-07-22 09:33:16 +02:00
Ard Biesheuvel	4c7924060f	arm64/alternatives: don't patch up internal branches [ Upstream commit 5679b28142193a62f6af93249c0477be9f0c669b ] Commit f7b93d42945c ("arm64/alternatives: use subsections for replacement sequences") moved the alternatives replacement sequences into subsections, in order to keep the as close as possible to the code that they replace. Unfortunately, this broke the logic in branch_insn_requires_update, which assumed that any branch into kernel executable code was a branch that required updating, which is no longer the case now that the code sequences that are patched in are in the same section as the patch site itself. So the only way to discriminate branches that require updating and ones that don't is to check whether the branch targets the replacement sequence itself, and so we can drop the call to kernel_text_address() entirely. Fixes: f7b93d42945c ("arm64/alternatives: use subsections for replacement sequences") Reported-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20200709125953.30918-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-07-22 09:32:53 +02:00
Ard Biesheuvel	a8f13826f9	arm64/alternatives: use subsections for replacement sequences [ Upstream commit f7b93d42945cc71e1346dd5ae07c59061d56745e ] When building very large kernels, the logic that emits replacement sequences for alternatives fails when relative branches are present in the code that is emitted into the .altinstr_replacement section and patched in at the original site and fixed up. The reason is that the linker will insert veneers if relative branches go out of range, and due to the relative distance of the .altinstr_replacement from the .text section where its branch targets usually live, veneers may be emitted at the end of the .altinstr_replacement section, with the relative branches in the sequence pointed at the veneers instead of the actual target. The alternatives patching logic will attempt to fix up the branch to point to its original target, which will be the veneer in this case, but given that the patch site is likely to be far away as well, it will be out of range and so patching will fail. There are other cases where these veneers are problematic, e.g., when the target of the branch is in .text while the patch site is in .init.text, in which case putting the replacement sequence inside .text may not help either. So let's use subsections to emit the replacement code as closely as possible to the patch site, to ensure that veneers are only likely to be emitted if they are required at the patch site as well, in which case they will be in range for the replacement sequence both before and after it is transported to the patch site. This will prevent alternative sequences in non-init code from being released from memory after boot, but this is tolerable given that the entire section is only 512 KB on an allyesconfig build (which weighs in at 500+ MB for the entire Image). Also, note that modules today carry the replacement sequences in non-init sections as well, and any of those that target init code will be emitted into init sections after this change. This fixes an early crash when booting an allyesconfig kernel on a system where any of the alternatives sequences containing relative branches are activated at boot (e.g., ARM64_HAS_PAN on TX2) Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: Dave P Martin <dave.martin@arm.com> Link: https://lore.kernel.org/r/20200630081921.13443-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-07-22 09:32:53 +02:00
Wei Li	06cee3572e	arm64: kgdb: Fix single-step exception handling oops [ Upstream commit 8523c006264df65aac7d77284cc69aac46a6f842 ] After entering kdb due to breakpoint, when we execute 'ss' or 'go' (will delay installing breakpoints, do single-step first), it won't work correctly, and it will enter kdb due to oops. It's because the reason gotten in kdb_stub() is not as expected, and it seems that the ex_vector for single-step should be 0, like what arch powerpc/sh/parisc has implemented. Before the patch: Entering kdb (current=0xffff8000119e2dc0, pid 0) on processor 0 due to Keyboard Entry [0]kdb> bp printk Instruction(i) BP #0 at 0xffff8000101486cc (printk) is enabled addr at ffff8000101486cc, hardtype=0 installed=0 [0]kdb> g / # echo h > /proc/sysrq-trigger Entering kdb (current=0xffff0000fa878040, pid 266) on processor 3 due to Breakpoint @ 0xffff8000101486cc [3]kdb> ss Entering kdb (current=0xffff0000fa878040, pid 266) on processor 3 Oops: (null) due to oops @ 0xffff800010082ab8 CPU: 3 PID: 266 Comm: sh Not tainted 5.7.0-rc4-13839-gf0e5ad491718 #6 Hardware name: linux,dummy-virt (DT) pstate: 00000085 (nzcv daIf -PAN -UAO) pc : el1_irq+0x78/0x180 lr : __handle_sysrq+0x80/0x190 sp : ffff800015003bf0 x29: ffff800015003d20 x28: ffff0000fa878040 x27: 0000000000000000 x26: ffff80001126b1f0 x25: ffff800011b6a0d8 x24: 0000000000000000 x23: 0000000080200005 x22: ffff8000101486cc x21: ffff800015003d30 x20: 0000ffffffffffff x19: ffff8000119f2000 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 x8 : ffff800015003e50 x7 : 0000000000000002 x6 : 00000000380b9990 x5 : ffff8000106e99e8 x4 : ffff0000fadd83c0 x3 : 0000ffffffffffff x2 : ffff800011b6a0d8 x1 : ffff800011b6a000 x0 : ffff80001130c9d8 Call trace: el1_irq+0x78/0x180 printk+0x0/0x84 write_sysrq_trigger+0xb0/0x118 proc_reg_write+0xb4/0xe0 __vfs_write+0x18/0x40 vfs_write+0xb0/0x1b8 ksys_write+0x64/0xf0 __arm64_sys_write+0x14/0x20 el0_svc_common.constprop.2+0xb0/0x168 do_el0_svc+0x20/0x98 el0_sync_handler+0xec/0x1a8 el0_sync+0x140/0x180 [3]kdb> After the patch: Entering kdb (current=0xffff8000119e2dc0, pid 0) on processor 0 due to Keyboard Entry [0]kdb> bp printk Instruction(i) BP #0 at 0xffff8000101486cc (printk) is enabled addr at ffff8000101486cc, hardtype=0 installed=0 [0]kdb> g / # echo h > /proc/sysrq-trigger Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to Breakpoint @ 0xffff8000101486cc [0]kdb> g Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to Breakpoint @ 0xffff8000101486cc [0]kdb> ss Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to SS trap @ 0xffff800010082ab8 [0]kdb> Fixes: 44679a4f142b ("arm64: KGDB: Add step debugging support") Signed-off-by: Wei Li <liwei391@huawei.com> Tested-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20200509214159.19680-2-liwei391@huawei.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-07-16 08:16:40 +02:00
Jiping Ma	73f79b420b	arm64: perf: Report the PC value in REGS_ABI_32 mode commit 8dfe804a4031ca6ba3a3efb2048534249b64f3a5 upstream. A 32-bit perf querying the registers of a compat task using REGS_ABI_32 will receive zeroes from w15, when it expects to find the PC. Return the PC value for register dwarf register 15 when returning register values for a compat task to perf. Cc: <stable@vger.kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Jiping Ma <jiping.ma2@windriver.com> Link: https://lore.kernel.org/r/1589165527-188401-1-git-send-email-jiping.ma2@windriver.com [will: Shuffled code and added a comment] Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-06-30 15:37:10 -04:00
Will Deacon	cbed4eb0a9	arm64: sve: Fix build failure when ARM64_SVE=y and SYSCTL=n [ Upstream commit e575fb9e76c8e33440fb859572a8b7d430f053d6 ] When I squashed the 'allnoconfig' compiler warning about the set_sve_default_vl() function being defined but not used in commit 1e570f512cbd ("arm64/sve: Eliminate data races on sve_default_vl"), I accidentally broke the build for configs where ARM64_SVE is enabled, but SYSCTL is not. Fix this by only compiling the SVE sysctl support if both CONFIG_SVE=y and CONFIG_SYSCTL=y. Cc: Dave Martin <Dave.Martin@arm.com> Reported-by: Qian Cai <cai@lca.pw> Link: https://lore.kernel.org/r/20200616131808.GA1040@lca.pw Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-06-30 15:37:05 -04:00
Will Deacon	81344ae52c	arm64: hw_breakpoint: Don't invoke overflow handler on uaccess watchpoints [ Upstream commit 24ebec25fb270100e252b19c288e21bd7d8cc7f7 ] Unprivileged memory accesses generated by the so-called "translated" instructions (e.g. STTR) at EL1 can cause EL0 watchpoints to fire unexpectedly if kernel debugging is enabled. In such cases, the hw_breakpoint logic will invoke the user overflow handler which will typically raise a SIGTRAP back to the current task. This is futile when returning back to the kernel because (a) the signal won't have been delivered and (b) userspace can't handle the thing anyway. Avoid invoking the user overflow handler for watchpoints triggered by kernel uaccess routines, and instead single-step over the faulting instruction as we would if no overflow handler had been installed. (Fixes tag identifies the introduction of unprivileged memory accesses, which exposed this latent bug in the hw_breakpoint code) Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Fixes: 57f4959bad0a ("arm64: kernel: Add support for User Access Override") Reported-by: Luis Machado <luis.machado@linaro.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-06-24 17:50:47 +02:00
Luke Nelson	b432540123	arm64: insn: Fix two bugs in encoding 32-bit logical immediates [ Upstream commit 579d1b3faa3735e781ff74aac0afd598515dbc63 ] This patch fixes two issues present in the current function for encoding arm64 logical immediates when using the 32-bit variants of instructions. First, the code does not correctly reject an all-ones 32-bit immediate, and returns an undefined instruction encoding. Second, the code incorrectly rejects some 32-bit immediates that are actually encodable as logical immediates. The root cause is that the code uses a default mask of 64-bit all-ones, even for 32-bit immediates. This causes an issue later on when the default mask is used to fill the top bits of the immediate with ones, shown here: /* * Pattern: 0..01..10..01..1 * * Fill the unused top bits with ones, and check if * the result is a valid immediate (all ones with a * contiguous ranges of zeroes). */ imm \|= ~mask; if (!range_of_ones(~imm)) return AARCH64_BREAK_FAULT; To see the problem, consider an immediate of the form 0..01..10..01..1, where the upper 32 bits are zero, such as 0x80000001. The code checks if ~(imm \| ~mask) contains a range of ones: the incorrect mask yields 1..10..01..10..0, which fails the check; the correct mask yields 0..01..10..0, which succeeds. The fix for both issues is to generate a correct mask based on the instruction immediate size, and use the mask to check for all-ones, all-zeroes, and values wider than the mask. Currently, arch/arm64/kvm/va_layout.c is the only user of this function, which uses 64-bit immediates and therefore won't trigger these bugs. We tested the new code against llvm-mc with all 1,302 encodable 32-bit logical immediates and all 5,334 encodable 64-bit logical immediates. Fixes: ef3935eeebff ("arm64: insn: Add encoder for bitwise operations using literals") Suggested-by: Will Deacon <will@kernel.org> Co-developed-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Luke Nelson <luke.r.nels@gmail.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200508181547.24783-2-luke.r.nels@gmail.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-06-22 09:30:53 +02:00
Gavin Shan	072d23eef6	arm64/kernel: Fix range on invalidating dcache for boot page tables [ Upstream commit 9d2d75ede59bc1edd8561f2ee9d4702a5ea0ae30 ] Prior to commit 8eb7e28d4c642c31 ("arm64/mm: move runtime pgds to rodata"), idmap_pgd_dir, tramp_pg_dir, reserved_ttbr0, swapper_pg_dir, and init_pg_dir were contiguous at the end of the kernel image. The maintenance at the end of __create_page_tables assumed these were contiguous, and affected everything from the start of idmap_pg_dir to the end of init_pg_dir. That commit moved all but init_pg_dir into the .rodata section, with other data placed between idmap_pg_dir and init_pg_dir, but did not update the maintenance. Hence the maintenance is performed on much more data than necessary (but as the bootloader previously made this clean to the PoC there is no functional problem). As we only alter idmap_pg_dir, and init_pg_dir, we only need to perform maintenance for these. As the other dirs are in .rodata, the bootloader will have initialised them as expected and cleaned them to the PoC. The kernel will initialize them as necessary after enabling the MMU. This patch reworks the maintenance to only cover the idmap_pg_dir and init_pg_dir to avoid this unnecessary work. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200427235700.112220-1-gshan@redhat.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-06-22 09:30:50 +02:00
Keno Fischer	53683907ef	arm64: Fix PTRACE_SYSEMU semantics commit 1cf6022bd9161081215028203919c33fcfa6debb upstream. Quoth the man page: ``` If the tracee was restarted by PTRACE_SYSCALL or PTRACE_SYSEMU, the tracee enters syscall-enter-stop just prior to entering any system call (which will not be executed if the restart was using PTRACE_SYSEMU, regardless of any change made to registers at this point or how the tracee is restarted after this stop). ``` The parenthetical comment is currently true on x86 and powerpc, but not currently true on arm64. arm64 re-checks the _TIF_SYSCALL_EMU flag after the syscall entry ptrace stop. However, at this point, it reflects which method was used to re-start the syscall at the entry stop, rather than the method that was used to reach it. Fix that by recording the original flag before performing the ptrace stop, bringing the behavior in line with documentation and x86/powerpc. Fixes: f086f67485c5 ("arm64: ptrace: add support for syscall emulation") Cc: <stable@vger.kernel.org> # 5.3.x- Signed-off-by: Keno Fischer <keno@juliacomputing.com> Acked-by: Will Deacon <will@kernel.org> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Bin Lu <Bin.Lu@arm.com> [catalin.marinas@arm.com: moved 'flags' bit masking] [catalin.marinas@arm.com: changed 'flags' type to unsigned long] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-05-27 17:46:40 +02:00

1 2 3 4 5 ...

2367 Commits