linux

Author	SHA1	Message	Date
Mark Rutland	bc28fde909	arm64: ftrace: consistently handle PLTs. [ Upstream commit a6253579977e4c6f7818eeb05bf2bc65678a7187 ] Sometimes it is necessary to use a PLT entry to call an ftrace trampoline. This is handled by ftrace_make_call() and ftrace_make_nop(), with each having almost identical logic, but this is not handled by ftrace_modify_call() since its introduction in commit: 3b23e4991fb66f6d ("arm64: implement ftrace with regs") Due to this, if we ever were to call ftrace_modify_call() for a callsite which requires a PLT entry for a trampoline, then either: a) If the old addr requires a trampoline, ftrace_modify_call() will use an out-of-range address to generate the 'old' branch instruction. This will result in warnings from aarch64_insn_gen_branch_imm() and ftrace_modify_code(), and no instructions will be modified. As ftrace_modify_call() will return an error, this will result in subsequent internal ftrace errors. b) If the old addr does not require a trampoline, but the new addr does, ftrace_modify_call() will use an out-of-range address to generate the 'new' branch instruction. This will result in warnings from aarch64_insn_gen_branch_imm(), and ftrace_modify_code() will replace the 'old' branch with a BRK. This will result in a kernel panic when this BRK is later executed. Practically speaking, case (a) is vastly more likely than case (b), and typically this will result in internal ftrace errors that don't necessarily affect the rest of the system. This can be demonstrated with an out-of-tree test module which triggers ftrace_modify_call(), e.g. 
\| # insmod test_ftrace.ko \| test_ftrace: Function test_function raw=0xffffb3749399201c, callsite=0xffffb37493992024 \| branch_imm_common: offset out of range \| branch_imm_common: offset out of range \| ------------[ ftrace bug ]------------ \| ftrace failed to modify \| [<ffffb37493992024>] test_function+0x8/0x38 [test_ftrace] \| actual: 1d:00:00:94 \| Updating ftrace call site to call a different ftrace function \| ftrace record flags: e0000002 \| (2) R \| expected tramp: ffffb374ae42ed54 \| ------------[ cut here ]------------ \| WARNING: CPU: 0 PID: 165 at kernel/trace/ftrace.c:2085 ftrace_bug+0x280/0x2b0 \| Modules linked in: test_ftrace(+) \| CPU: 0 PID: 165 Comm: insmod Not tainted 5.19.0-rc2-00002-g4d9ead8b45ce #13 \| Hardware name: linux,dummy-virt (DT) \| pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) \| pc : ftrace_bug+0x280/0x2b0 \| lr : ftrace_bug+0x280/0x2b0 \| sp : ffff80000839ba00 \| x29: ffff80000839ba00 x28: 0000000000000000 x27: ffff80000839bcf0 \| x26: ffffb37493994180 x25: ffffb374b0991c28 x24: ffffb374b0d70000 \| x23: 00000000ffffffea x22: ffffb374afcc33b0 x21: ffffb374b08f9cc8 \| x20: ffff572b8462c000 x19: ffffb374b08f9000 x18: ffffffffffffffff \| x17: 6c6c6163202c6331 x16: ffffb374ae5ad110 x15: ffffb374b0d51ee4 \| x14: 0000000000000000 x13: 3435646532346561 x12: 3437336266666666 \| x11: 203a706d61727420 x10: 6465746365707865 x9 : ffffb374ae5149e8 \| x8 : 336266666666203a x7 : 706d617274206465 x6 : 00000000fffff167 \| x5 : ffff572bffbc4a08 x4 : 00000000fffff167 x3 : 0000000000000000 \| x2 : 0000000000000000 x1 : ffff572b84461e00 x0 : 0000000000000022 \| Call trace: \| ftrace_bug+0x280/0x2b0 \| ftrace_replace_code+0x98/0xa0 \| ftrace_modify_all_code+0xe0/0x144 \| arch_ftrace_update_code+0x14/0x20 \| ftrace_startup+0xf8/0x1b0 \| register_ftrace_function+0x38/0x90 \| test_ftrace_init+0xd0/0x1000 [test_ftrace] \| do_one_initcall+0x50/0x2b0 \| do_init_module+0x50/0x1f0 \| load_module+0x17c8/0x1d64 \| 
__do_sys_finit_module+0xa8/0x100 \| __arm64_sys_finit_module+0x2c/0x3c \| invoke_syscall+0x50/0x120 \| el0_svc_common.constprop.0+0xdc/0x100 \| do_el0_svc+0x3c/0xd0 \| el0_svc+0x34/0xb0 \| el0t_64_sync_handler+0xbc/0x140 \| el0t_64_sync+0x18c/0x190 \| ---[ end trace 0000000000000000 ]--- We can solve this by consistently determining whether to use a PLT entry for an address. Note that since (the earlier) commit: f1a54ae9af0da4d7 ("arm64: module/ftrace: intialize PLT at load time") ... we can consistently determine the PLT address that a given callsite will use, and therefore ftrace_make_nop() does not need to skip validation when a PLT is in use. This patch factors the existing logic out of ftrace_make_call() and ftrace_make_nop() into a common ftrace_find_callable_addr() helper function, which is used by ftrace_make_call(), ftrace_make_nop(), and ftrace_modify_call(). In ftrace_make_nop() the patching is consistently validated by ftrace_modify_code() as we can always determine what the old instruction should have been. Fixes: 3b23e4991fb6 ("arm64: implement ftrace with regs") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Will Deacon <will@kernel.org> Tested-by: "Ivan T. Ivanov" <iivanov@suse.de> Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220614080944.1349146-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-06-22 14:13:17 +02:00
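The shared-helper idea in the entry above can be sketched in plain C. This is a minimal illustration under assumptions, not the kernel's ftrace_find_callable_addr(): the names `branch_reachable` and `find_callable_addr`, the `have_plt` flag, and the way the PLT address is passed in are all hypothetical, and the only fact taken from the log is that a direct branch reaches 128M in either direction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_128M (128LL * 1024 * 1024)

/* A direct branch reaches targets in [-128M, +128M) relative to the branch. */
static bool branch_reachable(uint64_t pc, uint64_t target)
{
	int64_t offset = (int64_t)(target - pc);

	return offset >= -SZ_128M && offset < SZ_128M;
}

/*
 * Hypothetical stand-in for a common helper: decide once, for any caller,
 * whether the callsite at pc can branch to *addr directly or must go
 * through a PLT entry. On success *addr holds the address to branch to.
 */
static bool find_callable_addr(uint64_t pc, bool have_plt, uint64_t plt,
			       uint64_t *addr)
{
	assert(addr != NULL);

	if (branch_reachable(pc, *addr))
		return true;	/* direct branch works, keep the address */
	if (!have_plt || !branch_reachable(pc, plt))
		return false;	/* nothing reachable, report failure */
	*addr = plt;		/* fall back to the PLT entry */
	return true;
}
```

Because the 'old' and 'new' addresses would both pass through the same function, a modify operation sees the same PLT decision that the make-call and make-nop paths see, which is the consistency the commit message asks for.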
Mark Rutland	e177f17fe4	arm64: ftrace: fix branch range checks [ Upstream commit 3eefdf9d1e406f3da47470b2854347009ffcb6fa ] The branch range checks in ftrace_make_call() and ftrace_make_nop() are incorrect, erroneously permitting a forwards branch of 128M and erroneously rejecting a backwards branch of 128M. This is because both functions calculate the offset backwards, calculating the offset from the target to the branch, rather than the other way around as the later comparisons expect. If an out-of-range branch were erroneously permitted, this would later be rejected by aarch64_insn_gen_branch_imm() as branch_imm_common() checks the bounds correctly, resulting in warnings and the placement of a BRK instruction. Note that this can only happen for a forwards branch of exactly 128M, and so the caller would need to be exactly 128M bytes below the relevant ftrace trampoline. If an in-range branch were erroneously rejected, then: * For modules when CONFIG_ARM64_MODULE_PLTS=y, this would result in the use of a PLT entry, which is benign. Note that this is the common case, as this is selected by CONFIG_RANDOMIZE_BASE (and therefore RANDOMIZE_MODULE_REGION_FULL), which distributions typically select. This is also selected by CONFIG_ARM64_ERRATUM_843419. * For modules when CONFIG_ARM64_MODULE_PLTS=n, this would result in internal ftrace failures. * For core kernel text, this would result in internal ftrace failures. Note that for this to happen, the kernel text would need to be at least 128M bytes in size, and typical configurations are smaller than this. Fix this by calculating the offset from the branch to the target in both functions. Fixes: f8af0b364e24 ("arm64: ftrace: don't validate branch via PLT in ftrace_make_nop()") Fixes: e71a4e1bebaf ("arm64: ftrace: add support for far branches to dynamic ftrace") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Will Deacon <will@kernel.org> Tested-by: "Ivan T. Ivanov" <iivanov@suse.de> Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220614080944.1349146-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-06-22 14:13:17 +02:00
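The sign error described in the entry above is easy to reproduce outside the kernel. The sketch below is an illustration only: the function names are hypothetical, and the range test assumes the [-128M, +128M) window that the commit message implies. It compares the backwards calculation (target to branch) with the corrected one (branch to target) at the two boundary cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_128M (128LL * 1024 * 1024)

/* Range test as the later comparisons expect it: offset in [-128M, +128M). */
static bool offset_in_range(int64_t offset)
{
	return offset >= -SZ_128M && offset < SZ_128M;
}

/* Backwards calculation: offset measured from the target to the branch. */
static bool in_range_buggy(uint64_t pc, uint64_t target)
{
	return offset_in_range((int64_t)(pc - target));
}

/* Corrected calculation: offset measured from the branch to the target. */
static bool in_range_fixed(uint64_t pc, uint64_t target)
{
	return offset_in_range((int64_t)(target - pc));
}

/* Boundary cases from the description, with the branch located at pc. */
static void check_boundaries(uint64_t pc)
{
	assert(in_range_buggy(pc, pc + SZ_128M));	/* forwards 128M wrongly permitted */
	assert(!in_range_buggy(pc, pc - SZ_128M));	/* backwards 128M wrongly rejected */
	assert(!in_range_fixed(pc, pc + SZ_128M));	/* forwards 128M rejected */
	assert(in_range_fixed(pc, pc - SZ_128M));	/* backwards 128M permitted */
}
```

Every other offset gives the same answer in both versions, which matches the note that the erroneous acceptance needs a forwards branch of exactly 128M.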
Alexandru Elisei	ad97425d23	arm64: compat: Do not treat syscall number as ESR_ELx for a bad syscall [ Upstream commit 3fed9e551417b84038b15117732ea4505eee386b ] If a compat process tries to execute an unknown system call above the __ARM_NR_COMPAT_END number, the kernel sends a SIGILL signal to the offending process. Information about the error is printed to dmesg in compat_arm_syscall() -> arm64_notify_die() -> arm64_force_sig_fault() -> arm64_show_signal(). arm64_show_signal() interprets a non-zero value for current->thread.fault_code as an exception syndrome and displays the message associated with the ESR_ELx.EC field (bits 31:26). current->thread.fault_code is set in compat_arm_syscall() -> arm64_notify_die() with the bad syscall number instead of a valid ESR_ELx value. This means that the ESR_ELx.EC field has the value that the user set for the syscall number and the kernel can end up printing bogus exception messages. For example, for the syscall number 0x68000000, which evaluates to ESR_ELx.EC value of 0x1A (ESR_ELx_EC_FPAC) the kernel prints this error: [ 18.349161] syscall[300]: unhandled exception: ERET/ERETAA/ERETAB, ESR 0x68000000, Oops - bad compat syscall(2) in syscall[10000+50000] [ 18.350639] CPU: 2 PID: 300 Comm: syscall Not tainted 5.18.0-rc1 #79 [ 18.351249] Hardware name: Pine64 RockPro64 v2.0 (DT) [..] which is misleading, as the bad compat syscall has nothing to do with pointer authentication. Stop arm64_show_signal() from printing exception syndrome information by having compat_arm_syscall() set the ESR_ELx value to 0, as it has no meaning for an invalid system call number. The example above now becomes: [ 19.935275] syscall[301]: unhandled exception: Oops - bad compat syscall(2) in syscall[10000+50000] [ 19.936124] CPU: 1 PID: 301 Comm: syscall Not tainted 5.18.0-rc1-00005-g7e08006d4102 #80 [ 19.936894] Hardware name: Pine64 RockPro64 v2.0 (DT) [..] 
which, although it shows less information because the syscall number (wrongly advertised as the ESR value) is missing, is better than showing plainly wrong information. The syscall number can be easily obtained with strace. A 32-bit value above or equal to 0x8000_0000 is interpreted as a negative integer in compat_arm_syscall() and the condition scno < __ARM_NR_COMPAT_END evaluates to true; the syscall will exit to userspace in this case with the ENOSYS error code instead of arm64_notify_die() being called. Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220425114444.368693-3-alexandru.elisei@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-06-09 10:20:53 +02:00
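Two pieces of arithmetic in the entry above can be checked with a few lines of C. This is a sketch, not kernel code: `esr_ec` and `below_compat_end` are hypothetical names, and the upper bound passed to `below_compat_end` is a placeholder for __ARM_NR_COMPAT_END, whose real value is not given in the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ESR_ELx.EC occupies bits 31:26, as stated in the commit message. */
static unsigned int esr_ec(uint32_t esr)
{
	return (esr >> 26) & 0x3f;
}

/*
 * The syscall number is compared as a signed integer, so a raw value at or
 * above 0x80000000 looks negative and falls below any positive bound.
 * compat_end is a hypothetical stand-in for __ARM_NR_COMPAT_END.
 */
static bool below_compat_end(uint32_t raw_scno, int32_t compat_end)
{
	/* Two's complement reinterpretation, as GCC and Clang implement it. */
	int32_t scno = (int32_t)raw_scno;

	assert(compat_end > 0);
	return scno < compat_end;
}
```

With these helpers, `esr_ec(0x68000000)` yields 0x1A, the exception class that the misleading message was derived from, while `below_compat_end(0x80000000, bound)` is true for any positive bound, which is why such values take the ENOSYS path instead.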
Shreyas K K	a698bf1f72	arm64: Enable repeat tlbi workaround on KRYO4XX gold CPUs [ Upstream commit 51f559d66527e238f9a5f82027bff499784d4eac ] Add KRYO4XX gold/big cores to the list of CPUs that need the repeat TLBI workaround. Apply this to the affected KRYO4XX cores (rcpe to rfpe). The variant and revision bits are implementation defined and differ from those of the Cortex CPUs on which these cores are based, i.e., (r0p0 to r3p0) is equivalent to (rcpe to rfpe). Signed-off-by: Shreyas K K <quic_shrekk@quicinc.com> Reviewed-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com> Link: https://lore.kernel.org/r/20220512110134.12179-1-quic_shrekk@quicinc.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-05-25 09:18:01 +02:00
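The "rcpe to rfpe" notation in the entry above reads as variant 0xc to 0xf with revision 0xe. A small sketch of how such a range check can be expressed follows. It relies on the architectural MIDR_EL1 layout (Variant in bits 23:20, Revision in bits 3:0); the helper names are hypothetical and this is not the kernel's own range-matching code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MIDR_EL1 fields: Variant is bits 23:20, Revision is bits 3:0. */
static unsigned int midr_variant(uint32_t midr)
{
	return (midr >> 20) & 0xf;
}

static unsigned int midr_revision(uint32_t midr)
{
	return midr & 0xf;
}

/*
 * Pack variant and revision into one byte so that an "rXpY to rZpW" range
 * becomes a plain integer comparison. Hypothetical helper for illustration.
 */
static bool midr_in_rev_range(uint32_t midr,
			      unsigned int var_min, unsigned int rev_min,
			      unsigned int var_max, unsigned int rev_max)
{
	unsigned int vr = (midr_variant(midr) << 4) | midr_revision(midr);

	assert(var_min <= 0xf && rev_min <= 0xf);
	assert(var_max <= 0xf && rev_max <= 0xf);
	return vr >= ((var_min << 4) | rev_min) &&
	       vr <= ((var_max << 4) | rev_max);
}
```

A call such as `midr_in_rev_range(midr, 0xc, 0xe, 0xf, 0xe)` then covers rcpe to rfpe, while an r0p0 part falls outside the range.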
Catalin Marinas	6013ef5f51	arm64: mte: Ensure the cleared tags are visible before setting the PTE commit 1d0cb4c8864addc362bae98e8ffa5500c87e1227 upstream. As an optimisation, only pages mapped with PROT_MTE in user space have the MTE tags zeroed. This is done lazily at the set_pte_at() time via mte_sync_tags(). However, this function is missing a barrier and another CPU may see the PTE updated before the zeroed tags are visible. Add an smp_wmb() barrier if the mapping is Normal Tagged. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Fixes: 34bfeea4a9e9 ("arm64: mte: Clear the tags when a page is mapped in user-space with PROT_MTE") Cc: <stable@vger.kernel.org> # 5.10.x Reported-by: Vladimir Murzin <vladimir.murzin@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Steven Price <steven.price@arm.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Link: https://lore.kernel.org/r/20220517093532.127095-1-catalin.marinas@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-05-25 09:17:56 +02:00
Prakruthi Deepak Heragu	a817f78ed6	arm64: paravirt: Use RCU read locks to guard stolen_time commit 19bef63f951e47dd4ba54810e6f7c7ff9344a3ef upstream. During hotplug, the stolen time data structure is unmapped and memset. There is a possibility of the timer IRQ being triggered before memset and stolen time is getting updated as part of this timer IRQ handler. This causes the below crash in timer handler - [ 3457.473139][ C5] Unable to handle kernel paging request at virtual address ffffffc03df05148 ... [ 3458.154398][ C5] Call trace: [ 3458.157648][ C5] para_steal_clock+0x30/0x50 [ 3458.162319][ C5] irqtime_account_process_tick+0x30/0x194 [ 3458.168148][ C5] account_process_tick+0x3c/0x280 [ 3458.173274][ C5] update_process_times+0x5c/0xf4 [ 3458.178311][ C5] tick_sched_timer+0x180/0x384 [ 3458.183164][ C5] __run_hrtimer+0x160/0x57c [ 3458.187744][ C5] hrtimer_interrupt+0x258/0x684 [ 3458.192698][ C5] arch_timer_handler_virt+0x5c/0xa0 [ 3458.198002][ C5] handle_percpu_devid_irq+0xdc/0x414 [ 3458.203385][ C5] handle_domain_irq+0xa8/0x168 [ 3458.208241][ C5] gic_handle_irq.34493+0x54/0x244 [ 3458.213359][ C5] call_on_irq_stack+0x40/0x70 [ 3458.218125][ C5] do_interrupt_handler+0x60/0x9c [ 3458.223156][ C5] el1_interrupt+0x34/0x64 [ 3458.227560][ C5] el1h_64_irq_handler+0x1c/0x2c [ 3458.232503][ C5] el1h_64_irq+0x7c/0x80 [ 3458.236736][ C5] free_vmap_area_noflush+0x108/0x39c [ 3458.242126][ C5] remove_vm_area+0xbc/0x118 [ 3458.246714][ C5] vm_remove_mappings+0x48/0x2a4 [ 3458.251656][ C5] __vunmap+0x154/0x278 [ 3458.255796][ C5] stolen_time_cpu_down_prepare+0xc0/0xd8 [ 3458.261542][ C5] cpuhp_invoke_callback+0x248/0xc34 [ 3458.266842][ C5] cpuhp_thread_fun+0x1c4/0x248 [ 3458.271696][ C5] smpboot_thread_fn+0x1b0/0x400 [ 3458.276638][ C5] kthread+0x17c/0x1e0 [ 3458.280691][ C5] ret_from_fork+0x10/0x20 As a fix, introduce rcu lock to update stolen time structure. 
Fixes: 75df529bec91 ("arm64: paravirt: Initialize steal time when cpu is online") Cc: stable@vger.kernel.org Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com> Reviewed-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu> Link: https://lore.kernel.org/r/20220513174654.362169-1-quic_eberman@quicinc.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-05-25 09:17:56 +02:00
Joey Gouly	98973d2bdd	arm64: alternatives: mark patch_alternative() as `noinstr` [ Upstream commit a2c0b0fbe01419f8f5d1c0b9c581631f34ffce8b ] The alternatives code must be `noinstr` such that it does not patch itself, as the cache invalidation is only performed after all the alternatives have been applied. Mark patch_alternative() as `noinstr`. Mark branch_insn_requires_update() and get_alt_insn() with `__always_inline` since they are both only called through patch_alternative(). Booting a kernel in QEMU TCG with KCSAN=y and ARM64_USE_LSE_ATOMICS=y caused a boot hang: [ 0.241121] CPU: All CPU(s) started at EL2 The alternatives code was patching the atomics in __tsan_read4() from LL/SC atomics to LSE atomics. The following fragment is using LL/SC atomics in the .text section: \| <__tsan_unaligned_read4+304>: ldxr x6, [x2] \| <__tsan_unaligned_read4+308>: add x6, x6, x5 \| <__tsan_unaligned_read4+312>: stxr w7, x6, [x2] \| <__tsan_unaligned_read4+316>: cbnz w7, <__tsan_unaligned_read4+304> This LL/SC atomic sequence was to be replaced with LSE atomics. However since the alternatives code was instrumentable, __tsan_read4() was being called after only the first instruction was replaced, which led to the following code in memory: \| <__tsan_unaligned_read4+304>: ldadd x5, x6, [x2] \| <__tsan_unaligned_read4+308>: add x6, x6, x5 \| <__tsan_unaligned_read4+312>: stxr w7, x6, [x2] \| <__tsan_unaligned_read4+316>: cbnz w7, <__tsan_unaligned_read4+304> This caused an infinite loop as the `stxr` instruction never completed successfully, so `w7` was always 0. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220405104733.11476-1-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-04-20 09:23:22 +02:00
Mario Limonciello	503934df31	cpuidle: PSCI: Move the `has_lpi` check to the beginning of the function commit 01f6c7338ce267959975da65d86ba34f44d54220 upstream. Currently the first thing checked is whether the PSCI cpu_suspend function has been initialized. Another change will be overloading `acpi_processor_ffh_lpi_probe` and calling it sooner. So make the `has_lpi` check the first thing checked to prepare for that change. Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-20 09:23:09 +02:00
Guo Ren	8bb4168291	arm64: patch_text: Fixup last cpu should be master commit 31a099dbd91e69fcab55eef4be15ed7a8c984918 upstream. These patch_text implementations use the stop_machine_cpuslocked infrastructure with an atomic cpu_count. The original idea: while the master CPU patches the text, the others should wait for it. But the current implementation uses the first CPU as master, which cannot guarantee that the remaining CPUs are waiting. This patch makes the last CPU the master to remove that risk. Fixes: ae16480785de ("arm64: introduce interfaces to hotpatch kernel and module code") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220407073323.743224-2-guoren@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-13 21:01:08 +02:00
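The "last CPU is master" rule from the entry above can be illustrated with C11 atomics. This is a user-space sketch under assumptions: `arrive_is_master` is a hypothetical name, and the real code runs inside a stop-machine callback rather than as an ordinary function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Each CPU calls this once on arrival. The caller that brings the counter
 * up to num_cpus is the last one in, so every other CPU has already arrived
 * and can be assumed to be waiting. Choosing the first arrival instead
 * (counter == 1) gives no such guarantee.
 */
static bool arrive_is_master(atomic_int *cpu_count, int num_cpus)
{
	assert(cpu_count != 0 && num_cpus > 0);
	/* atomic_fetch_add returns the previous value, so add 1 back. */
	return atomic_fetch_add(cpu_count, 1) + 1 == num_cpus;
}
```

With four CPUs the first three arrivals return false and go on to wait, and only the fourth returns true and performs the patching.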
Chanho Park	9de98470db	arm64: Add part number for Arm Cortex-A78AE commit 83bea32ac7ed37bbda58733de61fc9369513f9f9 upstream. Add the MIDR part number info for the Arm Cortex-A78AE[1] and add it to spectre-BHB affected list[2]. [1]: https://developer.arm.com/Processors/Cortex-A78AE [2]: https://developer.arm.com/Arm%20Security%20Center/Spectre-BHB Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: James Morse <james.morse@arm.com> Signed-off-by: Chanho Park <chanho61.park@samsung.com> Link: https://lore.kernel.org/r/20220407091128.8700-1-chanho61.park@samsung.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-13 21:01:07 +02:00
David Engraf	7ce550a01b	arm64: signal: nofpsimd: Do not allocate fp/simd context when not available commit 0a32c88ddb9af30e8a16d41d7b9b824c27d29459 upstream. Commit 6d502b6ba1b2 ("arm64: signal: nofpsimd: Handle fp/simd context for signal frames") introduced saving the fp/simd context for signal handling only when support is available. But setup_sigframe_layout() always reserves memory for fp/simd context. The additional memory is not touched because preserve_fpsimd_context() is not called and thus the magic is invalid. This may lead to an error when parse_user_sigframe() checks the fp/simd area and does not find a valid magic number. Signed-off-by: David Engraf <david.engraf@sysgo.com> Reviwed-by: Mark Brown <broonie@kernel.org> Fixes: 6d502b6ba1b267b3 ("arm64: signal: nofpsimd: Handle fp/simd context for signal frames") Cc: <stable@vger.kernel.org> # 5.6.x Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220225104008.820289-1-david.engraf@sysgo.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-08 14:39:54 +02:00
James Morse	b65b87e718	arm64: proton-pack: Include unprivileged eBPF status in Spectre v2 mitigation reporting commit 58c9a5060cb7cd529d49c93954cdafe81c1d642a upstream. The mitigations for Spectre-BHB are only applied when an exception is taken from user-space. The mitigation status is reported via the spectre_v2 sysfs vulnerabilities file. When unprivileged eBPF is enabled the mitigation in the exception vectors can be avoided by an eBPF program. When unprivileged eBPF is enabled, print a warning and report vulnerable via the sysfs vulnerabilities file. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:53 +01:00
James Morse	551717cf3b	arm64: Use the clearbhb instruction in mitigations commit 228a26b912287934789023b4132ba76065d9491c upstream. Future CPUs may implement a clearbhb instruction that is sufficient to mitigate Spectre-BHB. CPUs that implement this instruction, but not CSV2.3, must be affected by Spectre-BHB. Add support to use this instruction as the BHB mitigation on CPUs that support it. The instruction is in the hint space, so it will be treated as a NOP by older CPUs. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> [ modified for stable: Use a KVM vector template instead of alternatives, removed bitmap of mitigations ] Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:53 +01:00
James Morse	e192c8baa6	arm64: Mitigate spectre style branch history side channels commit 558c303c9734af5a813739cd284879227f7297d2 upstream. Speculation attacks against some high-performance processors can make use of branch history to influence future speculation. When taking an exception from user-space, a sequence of branches or a firmware call overwrites or invalidates the branch history. The sequence of branches is added to the vectors, and should appear before the first indirect branch. For systems using KPTI the sequence is added to the kpti trampoline where it has a free register as the exit from the trampoline is via a 'ret'. For systems not using KPTI, the same register tricks are used to free up a register in the vectors. For the firmware call, arch-workaround-3 clobbers 4 registers, so there is no choice but to save them to the EL1 stack. This only happens for entry from EL0, so if we take an exception due to the stack access, it will not become re-entrant. For KVM, the existing branch-predictor-hardening vectors are used. When a spectre version of these vectors is in use, the firmware call is sufficient to mitigate against Spectre-BHB. For the non-spectre versions, the sequence of branches is added to the indirect vector. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> [ modified for stable, removed bitmap of mitigations, use kvm template infrastructure ] Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:53 +01:00
James Morse	192023e6ba	KVM: arm64: Allow indirect vectors to be used without SPECTRE_V3A commit 5bdf3437603d4af87f9c7f424b0c8aeed2420745 upstream. CPUs vulnerable to Spectre-BHB either need to make an SMC-CC firmware call from the vectors, or run a sequence of branches. This gets added to the hyp vectors. If there is no support for arch-workaround-1 in firmware, the indirect vector will be used. kvm_init_vector_slots() only initialises the two indirect slots if the platform is vulnerable to Spectre-v3a. pKVM's hyp_map_vectors() only initialises __hyp_bp_vect_base if the platform is vulnerable to Spectre-v3a. As there are about to be more users of the indirect vectors, ensure their entries in hyp_spectre_vector_selector[] are always initialised, and __hyp_bp_vect_base defaults to the regular VA mapping. The Spectre-v3a check is moved to a helper kvm_system_needs_idmapped_vectors(), and merged with the code that creates the hyp mappings. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:53 +01:00
James Morse	13a807a0a0	arm64: proton-pack: Report Spectre-BHB vulnerabilities as part of Spectre-v2 commit dee435be76f4117410bbd90573a881fd33488f37 upstream. Speculation attacks against some high-performance processors can make use of branch history to influence future speculation as part of a spectre-v2 attack. This is not mitigated by CSV2, meaning CPUs that previously reported 'Not affected' are now moderately mitigated by CSV2. Update the value in /sys/devices/system/cpu/vulnerabilities/spectre_v2 to also show the state of the BHB mitigation. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:53 +01:00
James Morse	1f63326a52	arm64: Add percpu vectors for EL1 commit bd09128d16fac3c34b80bd6a29088ac632e8ce09 upstream. The Spectre-BHB workaround adds a firmware call to the vectors. This is needed on some CPUs, but not others. To avoid the unaffected CPU in a big/little pair from making the firmware call, create per cpu vectors. The per-cpu vectors only apply when returning from EL0. Systems using KPTI can use the canonical 'full-fat' vectors directly at EL1, the trampoline exit code will switch to this_cpu_vector on exit to EL0. Systems not using KPTI should always use this_cpu_vector. this_cpu_vector will point at a vector in tramp_vecs or __bp_harden_el1_vectors, depending on whether KPTI is in use. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:52 +01:00
James Morse	56cf5326bd	arm64: entry: Add macro for reading symbol addresses from the trampoline commit b28a8eebe81c186fdb1a0078263b30576c8e1f42 upstream. The trampoline code needs to use the address of symbols in the wider kernel, e.g. vectors. PC-relative addressing wouldn't work as the trampoline code doesn't run at the address the linker expected. tramp_ventry uses a literal pool, unless CONFIG_RANDOMIZE_BASE is set, in which case it uses the data page as a literal pool because the data page can be unmapped when running in user-space, which is required for CPUs vulnerable to meltdown. Pull this logic out as a macro, instead of adding a third copy of it. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:52 +01:00
James Morse	3f21b7e355	arm64: entry: Add vectors that have the bhb mitigation sequences commit ba2689234be92024e5635d30fe744f4853ad97db upstream. Some CPUs affected by Spectre-BHB need a sequence of branches, or a firmware call to be run before any indirect branch. This needs to go in the vectors. No CPU needs both. While this can be patched in, it would run on all CPUs as there is a single set of vectors. If only one part of a big/little combination is affected, the unaffected CPUs have to run the mitigation too. Create extra vectors that include the sequence. Subsequent patches will allow affected CPUs to select this set of vectors. Later patches will modify the loop count to match what the CPU requires. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:52 +01:00
James Morse	4937955296	arm64: entry: Add non-kpti __bp_harden_el1_vectors for mitigations commit aff65393fa1401e034656e349abd655cfe272de0 upstream. kpti is an optional feature, for systems not using kpti a set of vectors for the spectre-bhb mitigations is needed. Add another set of vectors, __bp_harden_el1_vectors, that will be used if a mitigation is needed and kpti is not in use. The EL1 ventries are repeated verbatim as there is no additional work needed for entry from EL1. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:52 +01:00
James Morse	26211252c1	arm64: entry: Allow the trampoline text to occupy multiple pages commit a9c406e6462ff14956d690de7bbe5131a5677dc9 upstream. Adding a second set of vectors to .entry.tramp.text will make it larger than a single 4K page. Allow the trampoline text to occupy up to three pages by adding two more fixmap slots. Previous changes to tramp_valias allowed it to reach beyond a single page. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:52 +01:00
James Morse	73ee716a1f	arm64: entry: Make the kpti trampoline's kpti sequence optional commit c47e4d04ba0f1ea17353d85d45f611277507e07a upstream. Spectre-BHB needs to add sequences to the vectors. Having one global set of vectors is a problem for big/little systems where the sequence is costly on cpus that are not vulnerable. Making the vectors per-cpu in the style of KVM's bh_harden_hyp_vecs requires the vectors to be generated by macros. Make the kpti re-mapping of the kernel optional, so the macros can be used without kpti. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:52 +01:00
James Morse	8c691e5308	arm64: entry: Move trampoline macros out of ifdef'd section commit 13d7a08352a83ef2252aeb464a5e08dfc06b5dfd upstream. The macros for building the kpti trampoline are all behind CONFIG_UNMAP_KERNEL_AT_EL0, and in a region that outputs to the .entry.tramp.text section. Move the macros out so they can be used to generate other kinds of trampoline. Only the symbols need to be guarded by CONFIG_UNMAP_KERNEL_AT_EL0 and appear in the .entry.tramp.text section. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:52 +01:00
James Morse	e550250632	arm64: entry: Don't assume tramp_vectors is the start of the vectors commit ed50da7764535f1e24432ded289974f2bf2b0c5a upstream. The tramp_ventry macro uses tramp_vectors as the address of the vectors when calculating which ventry in the 'full fat' vectors to branch to. While there is one set of tramp_vectors, this will be true. Adding multiple sets of vectors will break this assumption. Move the generation of the vectors to a macro, and pass the start of the vectors as an argument to tramp_ventry. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:52 +01:00
James Morse	5275fb5ea5	arm64: entry: Allow tramp_alias to access symbols after the 4K boundary commit 6c5bf79b69f911560fbf82214c0971af6e58e682 upstream. Systems using kpti enter and exit the kernel through a trampoline mapping that is always mapped, even when the kernel is not. tramp_valias is a macro to find the address of a symbol in the trampoline mapping. Adding extra sets of vectors will expand the size of the entry.tramp.text section to beyond 4K. tramp_valias will be unable to generate addresses for symbols beyond 4K as it uses the 12 bit immediate of the add instruction. As there are now two registers available when tramp_alias is called, use the extra register to avoid the 4K limit of the 12 bit immediate. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:52 +01:00
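The 4K limit mentioned in the entry above follows directly from the width of the immediate: a 12-bit unsigned field holds values 0 to 4095. The sketch below only illustrates that arithmetic. `fits_add_imm12` is a hypothetical helper, and it considers the unshifted immediate form only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IMM12_MAX ((1u << 12) - 1)	/* 4095, largest unshifted 12-bit immediate */

/*
 * True when a symbol's byte offset from the start of the trampoline text
 * can be encoded directly in a single unshifted 12-bit add immediate.
 */
static bool fits_add_imm12(uint64_t offset)
{
	return offset <= IMM12_MAX;
}
```

An offset of 0xfff still fits, while 0x1000 (the first byte of a second 4K page) does not, so reaching symbols in later pages needs the offset materialised in a second register, as the commit describes.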
James Morse	bda8960281	arm64: entry: Move the trampoline data page before the text page commit c091fb6ae059cda563b2a4d93fdbc548ef34e1d6 upstream. The trampoline code has a data page that holds the address of the vectors, which is unmapped when running in user-space. This ensures that with CONFIG_RANDOMIZE_BASE, the randomised address of the kernel can't be discovered until after the kernel has been mapped. If the trampoline text page is extended to include multiple sets of vectors, it will be larger than a single page, making it tricky to find the data page without knowing the size of the trampoline text pages, which will vary with PAGE_SIZE. Move the data page to appear before the text page. This allows the data page to be found without knowing the size of the trampoline text pages. 'tramp_vectors' is used to refer to the beginning of the .entry.tramp.text section, do that explicitly. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:52 +01:00
James Morse	d93b25a665	arm64: entry: Free up another register on kpti's tramp_exit path commit 03aff3a77a58b5b52a77e00537a42090ad57b80b upstream. Kpti stashes x30 in far_el1 while it uses x30 for all its work. Making the vectors a per-cpu data structure will require a second register. Allow tramp_exit two registers before it unmaps the kernel, by leaving x30 on the stack, and stashing x29 in far_el1. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:52 +01:00
James Morse	5242d6971e	arm64: entry: Make the trampoline cleanup optional commit d739da1694a0eaef0358a42b76904b611539b77b upstream. Subsequent patches will add additional sets of vectors that use the same tricks as the kpti vectors to reach the full-fat vectors. The full-fat vectors contain some cleanup for kpti that is patched in by alternatives when kpti is in use. Once there are additional vectors, the cleanup will be needed in more cases. But on big/little systems, the cleanup would be harmful if no trampoline vector were in use. Instead of forcing CPUs that don't need a trampoline vector to use one, make the trampoline cleanup optional. Entry at the top of the vectors will skip the cleanup. The trampoline vectors can then skip the first instruction, triggering the cleanup to run. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:51 +01:00
James Morse	7048a21086	arm64: spectre: Rename spectre_v4_patch_fw_mitigation_conduit commit 1b33d4860deaecf1d8eec3061b7e7ed7ab0bae8d upstream. The spectre-v4 sequence includes an SMC from the assembly entry code. spectre_v4_patch_fw_mitigation_conduit is the patching callback that generates an HVC or SMC depending on the SMCCC conduit type. As this isn't specific to spectre-v4, rename it smccc_patch_fw_mitigation_conduit so it can be re-used. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:51 +01:00
James Morse	dc5b630c0d	arm64: entry.S: Add ventry overflow sanity checks commit 4330e2c5c04c27bebf89d34e0bc14e6943413067 upstream. Subsequent patches add even more code to the ventry slots. Ensure kernels that overflow a ventry slot don't get built. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:51 +01:00
Joey Gouly	97d8bdf331	arm64: cpufeature: add HWCAP for FEAT_RPRES commit 1175011a7d0030d49dc9c10bde36f08f26d0a8ee upstream. Add a new HWCAP to detect the Increased precision of Reciprocal Estimate and Reciprocal Square Root Estimate feature (FEAT_RPRES), introduced in Armv8.7. Also expose this to userspace in the ID_AA64ISAR2_EL1 feature register. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211210165432.8106-4-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:51 +01:00
Joey Gouly	162aa002ec	arm64: cpufeature: add HWCAP for FEAT_AFP commit 5c13f042e73200b50573ace63e1a6b94e2917616 upstream. Add a new HWCAP to detect the Alternate Floating-point Behaviour feature (FEAT_AFP), introduced in Armv8.7. Also expose this to userspace in the ID_AA64MMFR1_EL1 feature register. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211210165432.8106-2-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:51 +01:00
Joey Gouly	dbcfa98539	arm64: add ID_AA64ISAR2_EL1 sys register commit 9e45365f1469ef2b934f9d035975dbc9ad352116 upstream. This is a new ID register, introduced in 8.7. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Reiji Watanabe <reijiw@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211210165432.8106-3-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:51 +01:00
Marc Zyngier	7ae8127e41	arm64: Add HWCAP for self-synchronising virtual counter commit fee29f008aa3f2aff01117f28b57b1145d92cb9b upstream. Since userspace can make use of the CNTVCTSS_EL0 register, expose it via a HWCAP. Suggested-by: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211017124225.3018098-18-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-03-11 12:11:51 +01:00
D Scott Phillips	bf0d4ae5c6	arm64: errata: Fix exec handling in erratum 1418040 workaround commit 38e0257e0e6f4fef2aa2966b089b56a8b1cfb75c upstream. The erratum 1418040 workaround enables CNTVCT_EL1 access trapping in EL0 when executing compat threads. The workaround is applied when switching between tasks, but the need for the workaround could also change at an exec(), when a non-compat task execs a compat binary or vice versa. Apply the workaround in arch_setup_new_exec(). This leaves a small window of time between SET_PERSONALITY and arch_setup_new_exec where preemption could occur and confuse the old workaround logic that compares TIF_32BIT between prev and next. Instead, we can just read cntkctl to make sure it's in the state that the next task needs. I measured cntkctl read time to be about the same as a mov from a general-purpose register on N1. Update the workaround logic to examine the current value of cntkctl instead of the previous task's compat state. Fixes: d49f7d7376d0 ("arm64: Move handling of erratum 1418040 into C code") Cc: <stable@vger.kernel.org> # 5.9.x Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211220234114.3926-1-scott@os.amperecomputing.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-01 17:25:40 +01:00
Sean Christopherson	723acd75a0	perf: Protect perf_guest_cbs with RCU commit ff083a2d972f56bebfd82409ca62e5dfce950961 upstream. Protect perf_guest_cbs with RCU to fix multiple possible errors. Luckily, all paths that read perf_guest_cbs already require RCU protection, e.g. to protect the callback chains, so only the direct perf_guest_cbs touchpoints need to be modified. Bug #1 is a simple lack of WRITE_ONCE/READ_ONCE behavior to ensure perf_guest_cbs isn't reloaded between a !NULL check and a dereference. Fixed via the READ_ONCE() in rcu_dereference(). Bug #2 is that on weakly-ordered architectures, updates to the callbacks themselves are not guaranteed to be visible before the pointer is made visible to readers. Fixed by the smp_store_release() in rcu_assign_pointer() when the new pointer is non-NULL. Bug #3 is that, because the callbacks are global, it's possible for readers to run in parallel with an unregister, and thus a module implementing the callbacks can be unloaded while readers are in flight, resulting in a use-after-free. Fixed by a synchronize_rcu() call when unregistering callbacks. Bug #1 escaped notice because it's extremely unlikely a compiler will reload perf_guest_cbs in this sequence. perf_guest_cbs does get reloaded for future derefs, e.g. for ->is_user_mode(), but the ->is_in_guest() guard all but guarantees the consumer will win the race, e.g. to nullify perf_guest_cbs, KVM has to completely exit the guest and tear down all VMs before KVM starts its module unload / unregister sequence. This also makes it all but impossible to encounter bug #3. Bug #2 has not been a problem because all architectures that register callbacks are strongly ordered and/or have a static set of callbacks. But with help, unloading kvm_intel can trigger bug #1 e.g. 
wrapping perf_guest_cbs with READ_ONCE in perf_misc_flags() while spamming kvm_intel module load/unload leads to: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP CPU: 6 PID: 1825 Comm: stress Not tainted 5.14.0-rc2+ #459 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:perf_misc_flags+0x1c/0x70 Call Trace: perf_prepare_sample+0x53/0x6b0 perf_event_output_forward+0x67/0x160 __perf_event_overflow+0x52/0xf0 handle_pmi_common+0x207/0x300 intel_pmu_handle_irq+0xcf/0x410 perf_event_nmi_handler+0x28/0x50 nmi_handle+0xc7/0x260 default_do_nmi+0x6b/0x170 exc_nmi+0x103/0x130 asm_exc_nmi+0x76/0xbf Fixes: 39447b386c84 ("perf: Enhance perf to allow for guest statistic collection from host") Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211111020738.2512932-2-seanjc@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-20 09:17:50 +01:00
Nick Desaulniers	8c0059a25c	arm64: vdso32: require CROSS_COMPILE_COMPAT for gcc+bfd commit 3e6f8d1fa18457d54b20917bd9174d27daf09ab9 upstream. Similar to commit 231ad7f409f1 ("Makefile: infer --target from ARCH for CC=clang") There really is no point in setting --target based on $CROSS_COMPILE_COMPAT for clang when the integrated assembler is being used, since commit ef94340583ee ("arm64: vdso32: drop -no-integrated-as flag"). Allows COMPAT_VDSO to be selected without setting $CROSS_COMPILE_COMPAT when using clang and lld together. Before: $ ARCH=arm64 CROSS_COMPILE_COMPAT=arm-linux-gnueabi- make -j72 LLVM=1 defconfig $ grep CONFIG_COMPAT_VDSO .config CONFIG_COMPAT_VDSO=y $ ARCH=arm64 make -j72 LLVM=1 defconfig $ grep CONFIG_COMPAT_VDSO .config $ After: $ ARCH=arm64 CROSS_COMPILE_COMPAT=arm-linux-gnueabi- make -j72 LLVM=1 defconfig $ grep CONFIG_COMPAT_VDSO .config CONFIG_COMPAT_VDSO=y $ ARCH=arm64 make -j72 LLVM=1 defconfig $ grep CONFIG_COMPAT_VDSO .config CONFIG_COMPAT_VDSO=y Reviewed-by: Nathan Chancellor <nathan@kernel.org> Suggested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lore.kernel.org/r/20211019223646.1146945-5-ndesaulniers@google.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-29 12:25:53 +01:00
Nick Desaulniers	b16b124a42	arm64: vdso32: drop -no-integrated-as flag commit ef94340583eec5cb1544dc41a87baa4f684b3fe1 upstream. Clang can assemble these files just fine; this is a relic from the top level Makefile conditionally adding this. We no longer need --prefix, --gcc-toolchain, or -Qunused-arguments flags either with this change, so remove those too. To test building: $ ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- \ CROSS_COMPILE_COMPAT=arm-linux-gnueabi- make LLVM=1 LLVM_IAS=1 \ defconfig arch/arm64/kernel/vdso32/ Suggested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Tested-by: Stephen Boyd <swboyd@chromium.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210420174427.230228-1-ndesaulniers@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-29 12:25:53 +01:00
Mark Rutland	59d2dc7710	arm64: ftrace: add missing BTIs commit 35b6b28e69985eafb20b3b2c7bd6eca452b56b53 upstream. When branch target identifiers are in use, code reachable via an indirect branch requires a BTI landing pad at the branch target site. When building FTRACE_WITH_REGS atop patchable-function-entry, we miss BTIs at the start of the `ftrace_caller` and `ftrace_regs_caller` trampolines, and when these are called from a module via a PLT (which will use a `BR X16`), we will encounter a BTI failure, e.g. \| # insmod lkdtm.ko \| lkdtm: No crash points registered, enable through debugfs \| # echo function_graph > /sys/kernel/debug/tracing/current_tracer \| # cat /sys/kernel/debug/provoke-crash/DIRECT \| Unhandled 64-bit el1h sync exception on CPU0, ESR 0x34000001 -- BTI \| CPU: 0 PID: 174 Comm: cat Not tainted 5.16.0-rc2-dirty #3 \| Hardware name: linux,dummy-virt (DT) \| pstate: 60400405 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=jc) \| pc : ftrace_caller+0x0/0x3c \| lr : lkdtm_debugfs_open+0xc/0x20 [lkdtm] \| sp : ffff800012e43b00 \| x29: ffff800012e43b00 x28: 0000000000000000 x27: ffff800012e43c88 \| x26: 0000000000000000 x25: 0000000000000000 x24: ffff0000c171f200 \| x23: ffff0000c27b1e00 x22: ffff0000c2265240 x21: ffff0000c23c8c30 \| x20: ffff8000090ba380 x19: 0000000000000000 x18: 0000000000000000 \| x17: 0000000000000000 x16: ffff80001002bb4c x15: 0000000000000000 \| x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000900ff0 \| x11: ffff0000c4166310 x10: ffff800012e43b00 x9 : ffff8000104f2384 \| x8 : 0000000000000001 x7 : 0000000000000000 x6 : 000000000000003f \| x5 : 0000000000000040 x4 : ffff800012e43af0 x3 : 0000000000000001 \| x2 : ffff8000090b0000 x1 : ffff0000c171f200 x0 : ffff0000c23c8c30 \| Kernel panic - not syncing: Unhandled exception \| CPU: 0 PID: 174 Comm: cat Not tainted 5.16.0-rc2-dirty #3 \| Hardware name: linux,dummy-virt (DT) \| Call trace: \| dump_backtrace+0x0/0x1a4 \| show_stack+0x24/0x30 \| dump_stack_lvl+0x68/0x84 \| 
dump_stack+0x1c/0x38 \| panic+0x168/0x360 \| arm64_exit_nmi.isra.0+0x0/0x80 \| el1h_64_sync_handler+0x68/0xd4 \| el1h_64_sync+0x78/0x7c \| ftrace_caller+0x0/0x3c \| do_dentry_open+0x134/0x3b0 \| vfs_open+0x38/0x44 \| path_openat+0x89c/0xe40 \| do_filp_open+0x8c/0x13c \| do_sys_openat2+0xbc/0x174 \| __arm64_sys_openat+0x6c/0xbc \| invoke_syscall+0x50/0x120 \| el0_svc_common.constprop.0+0xdc/0x100 \| do_el0_svc+0x84/0xa0 \| el0_svc+0x28/0x80 \| el0t_64_sync_handler+0xa8/0x130 \| el0t_64_sync+0x1a0/0x1a4 \| SMP: stopping secondary CPUs \| Kernel Offset: disabled \| CPU features: 0x0,00000f42,da660c5f \| Memory Limit: none \| ---[ end Kernel panic - not syncing: Unhandled exception ]--- Fix this by adding the required `BTI C`, as we only require these to be reachable via BL for direct calls or BR X16/X17 for PLTs. For now, these are open-coded in the function prologue, matching the style of the `__hwasan_tag_mismatch` trampoline. In future we may wish to consider adding a new SYM_CODE_START_*() variant which has an implicit BTI. When ftrace is built atop mcount, the trampolines are marked with SYM_FUNC_START(), and so get an implicit BTI. We may need to change these over to SYM_CODE_START() in future for RELIABLE_STACKTRACE, in case we need to apply special care around the return address being rewritten. Fixes: 97fed779f2a6 ("arm64: bti: Provide Kconfig for kernel mode BTI") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211129135709.2274019-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-12-08 09:03:24 +01:00
Nick Desaulniers	af1d3c437e	arm64: vdso32: suppress error message for 'make mrproper' commit 14831fad73f5ac30ac61760487d95a538e6ab3cb upstream. When running the following command without arm-linux-gnueabi-gcc in one's $PATH, the following warning is observed: $ ARCH=arm64 CROSS_COMPILE_COMPAT=arm-linux-gnueabi- make -j72 LLVM=1 mrproper make[1]: arm-linux-gnueabi-gcc: No such file or directory This is because KCONFIG is not run for mrproper, so CONFIG_CC_IS_CLANG is not set, and we end up eagerly evaluating various variables that try to invoke CC_COMPAT. This is a similar problem to what was observed in commit dc960bfeedb0 ("h8300: suppress error messages for 'make clean'") Reported-by: Lucas Henneman <henneman@google.com> Suggested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20211019223646.1146945-4-ndesaulniers@google.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-11-26 10:39:18 +01:00
Dan Li	0a511ba6d2	arm64: Mark __stack_chk_guard as __ro_after_init [ Upstream commit 9fcb2e93f41c07a400885325e7dbdfceba6efaec ] __stack_chk_guard is set up once during the init stage and never changed after that. Although modifying this variable at runtime will usually cause the kernel to crash (so does the attacker), it should be marked as __ro_after_init, and placing it in the ro_after_init section should not affect performance. Signed-off-by: Dan Li <ashimida@linux.alibaba.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/1631612642-102881-1-git-send-email-ashimida@linux.alibaba.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-09-30 10:11:07 +02:00
Thomas Gleixner	b9a1526d51	drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION() [ Upstream commit 4b92d4add5f6dcf21275185c997d6ecb800054cd ] DEFINE_SMP_CALL_CACHE_FUNCTION() was useful before the CPU hotplug rework to ensure that the cache related functions are called on the upcoming CPU because the notifier itself could run on any online CPU. The hotplug state machine guarantees that the callbacks are invoked on the upcoming CPU. So there is no need to have this SMP function call obfuscation. That indirection was missed when the hotplug notifiers were converted. This also solves the problem of ARM64 init_cache_level() invoking ACPI functions which take a semaphore in that context. That's invalid as SMP function calls run with interrupts disabled. Running it just from the callback in context of the CPU hotplug thread solves this. Fixes: 8571890e1513 ("arm64: Add support for ACPI based firmware tables") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/871r69ersb.ffs@tglx Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-09-26 14:08:59 +02:00
Mark Brown	a67e7cdbc6	arm64/sve: Use correct size when reinitialising SVE state commit e35ac9d0b56e9efefaeeb84b635ea26c2839ea86 upstream. When we need a buffer for SVE register state we call sve_alloc() to make sure that one is there. In order to avoid repeated allocations and frees we keep the buffer around unless we change vector length and just memset() it to ensure a clean register state. The function that deals with this takes the task to operate on as an argument, however in the case where we do a memset() we initialise using the SVE state size for the current task rather than the task passed as an argument. This is only an issue in the case where we are setting the register state for a task via ptrace and the task being configured has a different vector length to the task tracing it. In the case where the buffer is larger in the traced process we will leak old state from the traced process to itself, in the case where the buffer is smaller in the traced process we will overflow the buffer and corrupt memory. Fixes: bc0ee4760364 ("arm64/sve: Core task context handling") Cc: <stable@vger.kernel.org> # 4.15.x Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20210909165356.10675-1-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-09-22 12:27:54 +02:00
Mark Rutland	de32e15180	arm64: head: avoid over-mapping in map_memory commit 90268574a3e8a6b883bd802d702a2738577e1006 upstream. The `compute_indices` and `populate_entries` macros operate on inclusive bounds, and thus the `map_memory` macro which uses them also operates on inclusive bounds. We pass `_end` and `_idmap_text_end` to `map_memory`, but these are exclusive bounds, and if one of these is sufficiently aligned (as a result of kernel configuration, physical placement, and KASLR), then: * In `compute_indices`, the computed `iend` will be in the page/block after the final byte of the intended mapping. * In `populate_entries`, an unnecessary entry will be created at the end of each level of table. At the leaf level, this entry will map up to SWAPPER_BLOCK_SIZE bytes of physical addresses that we did not intend to map. As we may map up to SWAPPER_BLOCK_SIZE bytes more than intended, we may violate the boot protocol and map physical address past the 2MiB-aligned end address we are permitted to map. As we map these with Normal memory attributes, this may result in further problems depending on what these physical addresses correspond to. The final entry at each level may require an additional table at that level. As EARLY_ENTRIES() calculates an inclusive bound, we allocate enough memory for this. Avoid the extraneous mapping by having map_memory convert the exclusive end address to an inclusive end address by subtracting one, and do likewise in EARLY_ENTRIES() when calculating the number of required tables. For clarity, comments are updated to more clearly document which boundaries the macros operate on. For consistency with the other macros, the comments in map_memory are also updated to describe `vstart` and `vend` as virtual addresses. 
Fixes: 0370b31e4845 ("arm64: Extend early page table code to allow for larger kernels") Cc: <stable@vger.kernel.org> # 4.16.x Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Steve Capper <steve.capper@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210823101253.55567-1-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-09-18 13:40:08 +02:00
Mark Rutland	3d7d1b0f5f	arm64: fix compat syscall return truncation commit e30e8d46cf605d216a799a28c77b8a41c328613a upstream. Due to inconsistencies in the way we manipulate compat GPRs, we have a few issues today: * For audit and tracing, where error codes are handled as a (native) long, negative error codes are expected to be sign-extended to the native 64-bits, or they may fail to be matched correctly. Thus a syscall which fails with an error may erroneously be identified as succeeding. * For ptrace, all compat return values should be sign-extended for consistency with 32-bit arm, but we currently only do this for negative return codes. * As we may transiently set the upper 32 bits of some compat GPRs while in the kernel, these can be sampled by perf, which is somewhat confusing. This means that where a syscall returns a pointer above 2G, this will be sign-extended, but will not be mistaken for an error as error codes are constrained to the inclusive range [-4096, -1] where no user pointer can exist. To fix all of these, we must consistently use helpers to get/set the compat GPRs, ensuring that we never write the upper 32 bits of the return code, and always sign-extend when reading the return code. This patch does so, with the following changes: * We re-organise syscall_get_return_value() to always sign-extend for compat tasks, and reimplement syscall_get_error() atop. We update syscall_trace_exit() to use syscall_get_return_value(). * We consistently use syscall_set_return_value() to set the return value, ensuring the upper 32 bits are never set unexpectedly. * As the core audit code currently uses regs_return_value() rather than syscall_get_return_value(), we special-case this for compat_user_mode(regs) such that this will do the right thing. Going forward, we should try to move the core audit code over to syscall_get_return_value(). 
Cc: <stable@vger.kernel.org> Reported-by: He Zhe <zhe.he@windriver.com> Reported-by: weiyuchen <weiyuchen3@huawei.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210802104200.21390-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> [Mark: trivial conflict resolution for v5.10.y] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-08-12 13:22:20 +02:00
Mark Rutland	e5d8fd8709	arm64: stacktrace: avoid tracing arch_stack_walk() commit 0c32706dac1b0a72713184246952ab0f54327c21 upstream. When the function_graph tracer is in use, arch_stack_walk() may unwind the stack incorrectly, erroneously reporting itself, missing the final entry which is being traced, and reporting all traced entries between these off-by-one from where they should be. When ftrace hooks a function return, the original return address is saved to the fgraph ret_stack, and the return address in the LR (or the function's frame record) is replaced with `return_to_handler`. When arm64's unwinder encounters frames returning to `return_to_handler`, it finds the associated original return address from the fgraph ret stack, assuming the most recent `return_to_handler` entry on the stack corresponds to the most recent entry in the fgraph ret stack, and so on. When arch_stack_walk() is used to dump the current task's stack, it starts from the caller of arch_stack_walk(). However, arch_stack_walk() can be traced, and so may push an entry on to the fgraph ret stack, leaving the fgraph ret stack offset by one from the expected position. This can be seen when dumping the stack via /proc/self/stack, where enabling the graph tracer results in an unexpected `stack_trace_save_tsk` entry at the start of the trace, and `el0_svc` missing from the end of the trace. This patch fixes this by marking arch_stack_walk() as notrace, as we do for all other functions on the path to ftrace_graph_get_ret_stack(). While a few helper functions are not marked notrace, their calls/returns are balanced, and will have no observable effect when examining the fgraph ret stack. It is possible for an exception boundary to cause a similar offset if the return address of the interrupted context was in the LR. Fixing those cases will require some more substantial rework, and is left for subsequent patches. 
Before: \| # cat /proc/self/stack \| [<0>] proc_pid_stack+0xc4/0x140 \| [<0>] proc_single_show+0x6c/0x120 \| [<0>] seq_read_iter+0x240/0x4e0 \| [<0>] seq_read+0xe8/0x140 \| [<0>] vfs_read+0xb8/0x1e4 \| [<0>] ksys_read+0x74/0x100 \| [<0>] __arm64_sys_read+0x28/0x3c \| [<0>] invoke_syscall+0x50/0x120 \| [<0>] el0_svc_common.constprop.0+0xc4/0xd4 \| [<0>] do_el0_svc+0x30/0x9c \| [<0>] el0_svc+0x2c/0x54 \| [<0>] el0t_64_sync_handler+0x1a8/0x1b0 \| [<0>] el0t_64_sync+0x198/0x19c \| # echo function_graph > /sys/kernel/tracing/current_tracer \| # cat /proc/self/stack \| [<0>] stack_trace_save_tsk+0xa4/0x110 \| [<0>] proc_pid_stack+0xc4/0x140 \| [<0>] proc_single_show+0x6c/0x120 \| [<0>] seq_read_iter+0x240/0x4e0 \| [<0>] seq_read+0xe8/0x140 \| [<0>] vfs_read+0xb8/0x1e4 \| [<0>] ksys_read+0x74/0x100 \| [<0>] __arm64_sys_read+0x28/0x3c \| [<0>] invoke_syscall+0x50/0x120 \| [<0>] el0_svc_common.constprop.0+0xc4/0xd4 \| [<0>] do_el0_svc+0x30/0x9c \| [<0>] el0t_64_sync_handler+0x1a8/0x1b0 \| [<0>] el0t_64_sync+0x198/0x19c After: \| # cat /proc/self/stack \| [<0>] proc_pid_stack+0xc4/0x140 \| [<0>] proc_single_show+0x6c/0x120 \| [<0>] seq_read_iter+0x240/0x4e0 \| [<0>] seq_read+0xe8/0x140 \| [<0>] vfs_read+0xb8/0x1e4 \| [<0>] ksys_read+0x74/0x100 \| [<0>] __arm64_sys_read+0x28/0x3c \| [<0>] invoke_syscall+0x50/0x120 \| [<0>] el0_svc_common.constprop.0+0xc4/0xd4 \| [<0>] do_el0_svc+0x30/0x9c \| [<0>] el0_svc+0x2c/0x54 \| [<0>] el0t_64_sync_handler+0x1a8/0x1b0 \| [<0>] el0t_64_sync+0x198/0x19c \| # echo function_graph > /sys/kernel/tracing/current_tracer \| # cat /proc/self/stack \| [<0>] proc_pid_stack+0xc4/0x140 \| [<0>] proc_single_show+0x6c/0x120 \| [<0>] seq_read_iter+0x240/0x4e0 \| [<0>] seq_read+0xe8/0x140 \| [<0>] vfs_read+0xb8/0x1e4 \| [<0>] ksys_read+0x74/0x100 \| [<0>] __arm64_sys_read+0x28/0x3c \| [<0>] invoke_syscall+0x50/0x120 \| [<0>] el0_svc_common.constprop.0+0xc4/0xd4 \| [<0>] do_el0_svc+0x30/0x9c \| [<0>] el0_svc+0x2c/0x54 \| [<0>] 
el0t_64_sync_handler+0x1a8/0x1b0 \| [<0>] el0t_64_sync+0x198/0x19c Cc: <stable@vger.kernel.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20210802164845.45506-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-08-12 13:22:12 +02:00
Anshuman Khandual	bb5e089df7	arm64/mm: Fix ttbr0 values stored in struct thread_info for software-pan [ Upstream commit 9163f01130304fab1f74683d7d44632da7bda637 ] When using CONFIG_ARM64_SW_TTBR0_PAN, a task's thread_info::ttbr0 must be the TTBR0_EL1 value used to run userspace. With 52-bit PAs, the PA must be packed into the TTBR using phys_to_ttbr(), but we forget to do this in some of the SW PAN code. Thus, if the value is installed into TTBR0_EL1 (as may happen in the uaccess routines), this could result in UNPREDICTABLE behaviour. Since hardware with 52-bit PA support almost certainly has HW PAN, which will be used in preference, this shouldn't be a practical issue, but let's fix this for consistency. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Fixes: 529c4b05a3cb ("arm64: handle 52-bit addresses in TTBR") Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/1623749578-11231-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-07-14 16:56:07 +02:00
Mark Rutland	8d6acfe80d	arm64: consistently use reserved_pg_dir [ Upstream commit 833be850f1cabd0e3b5337c0fcab20a6e936dd48 ] Depending on configuration options and specific code paths, we either use the empty_zero_page or the configuration-dependent reserved_ttbr0 as a reserved value for TTBR{0,1}_EL1. To simplify this code, let's always allocate and use the same reserved_pg_dir, replacing reserved_ttbr0. Note that this is allocated (and hence pre-zeroed), and is also marked as read-only in the kernel Image mapping. Keeping this separate from the empty_zero_page potentially helps with robustness as the empty_zero_page is used in a number of cases where a failure to map it read-only could allow it to become corrupted. The (presently unused) swapper_pg_end symbol is also removed, and comments are added wherever we rely on the offsets between the pre-allocated pg_dirs to keep these cases easily identifiable. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201103102229.8542-1-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-07-14 16:56:06 +02:00
Tian Tao	d0214b841c	arm64: perf: Convert snprintf to sysfs_emit [ Upstream commit a5740e955540181f4ab8f076cc9795c6bbe4d730 ] Use sysfs_emit instead of snprintf to avoid buffer overruns, because sysfs_emit strictly checks that buf is non-NULL and page-aligned, and returns an error otherwise. Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Link: https://lore.kernel.org/r/1621497585-30887-1-git-send-email-tiantao6@hisilicon.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-07-14 16:55:52 +02:00
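The buffer checks described in the commit above can be modelled in user space. The sketch below is not the kernel helper: sketch_sysfs_emit and SKETCH_PAGE_SIZE are hypothetical names, a 4 KiB page size is assumed, and writing nothing and returning 0 for a rejected buffer is an assumption of this sketch.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/*
 * User-space model of a sysfs_emit-style helper: refuse a NULL or
 * non-page-aligned buffer, and never write more than one page.
 */
static int sketch_sysfs_emit(char *buf, const char *fmt, ...)
{
    va_list args;
    int len;

    /* Reject buffers that cannot be the start of a page-sized sysfs buffer. */
    if (!buf || ((uintptr_t)buf % SKETCH_PAGE_SIZE) != 0)
        return 0;

    va_start(args, fmt);
    len = vsnprintf(buf, SKETCH_PAGE_SIZE, fmt, args);
    va_end(args);

    if (len < 0)
        return 0;
    /* vsnprintf reports the untruncated length; clamp to what was written. */
    if ((size_t)len >= SKETCH_PAGE_SIZE)
        len = (int)(SKETCH_PAGE_SIZE - 1);
    return len;
}
```

Unlike a bare snprintf call with a caller-supplied size, the bound here is fixed at one page and the pointer is validated first, so a show() callback cannot pass a wrong length or an offset pointer unnoticed.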
Valentin Schneider	3c51d82d0b	sched/core: Initialize the idle task with preemption disabled [ Upstream commit f1a0a376ca0c4ef1fc3d24e3e502acbb5b795674 ] As pointed out by commit de9b8f5dcbd9 ("sched: Fix crash trying to dequeue/enqueue the idle thread") init_idle() can and will be invoked more than once on the same idle task. At boot time, it is invoked for the boot CPU thread by sched_init(). Then smp_init() creates the threads for all the secondary CPUs and invokes init_idle() on them. As the hotplug machinery brings the secondaries to life, it will issue calls to idle_thread_get(), which itself invokes init_idle() yet again. In this case it's invoked twice more per secondary: at _cpu_up(), and at bringup_cpu(). Given smp_init() already initializes the idle tasks for all possible CPUs, no further initialization should be required. Now, removing init_idle() from idle_thread_get() exposes some interesting expectations with regards to the idle task's preempt_count: the secondary startup always issues a preempt_disable(), requiring some reset of the preempt count to 0 between hot-unplug and hotplug, which is currently served by idle_thread_get() -> init_idle(). Given the idle task is supposed to have preemption disabled once and never see it re-enabled, it seems that what we actually want is to initialize its preempt_count to PREEMPT_DISABLED and leave it there. Do that, and remove init_idle() from idle_thread_get(). Secondary startups were patched via coccinelle: @begone@ @@ -preempt_disable(); ... cpu_startup_entry(CPUHP_AP_ONLINE_IDLE); Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210512094636.2958515-1-valentin.schneider@arm.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-07-14 16:55:50 +02:00


2946 Commits