linux

iv/linux

Author	SHA1	Message	Date
Mark Rutland	7cf283c7bd	arm64: uaccess: remove redundant PAN toggling Some code (e.g. futex) needs to make privileged accesses to userspace memory, and uses uaccess_{enable,disable}_privileged() in order to permit this. All other uaccess primitives use LDTR/STTR, and never need to toggle PAN. Remove the redundant PAN toggling. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201202131558.39270-12-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-12-02 19:49:11 +00:00
Mark Rutland	b5a5a01d8e	arm64: uaccess: remove addr_limit_user_check() Now that set_fs() is gone, addr_limit_user_check() is redundant. Remove the checks and associated thread flag. To ensure that _TIF_WORK_MASK can be used as an immediate value in an AND instruction (as it is in `ret_to_user`), TIF_MTE_ASYNC_FAULT is renumbered to keep the constituent bits of _TIF_WORK_MASK contiguous. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201202131558.39270-11-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-12-02 19:49:11 +00:00
Mark Rutland	3d2403fd10	arm64: uaccess: remove set_fs() Now that the uaccess primitives dont take addr_limit into account, we have no need to manipulate this via set_fs() and get_fs(). Remove support for these, along with some infrastructure this renders redundant. We no longer need to flip UAO to access kernel memory under KERNEL_DS, and head.S unconditionally clears UAO for all kernel configurations via an ERET in init_kernel_el. Thus, we don't need to dynamically flip UAO, nor do we need to context-switch it. However, we still need to adjust PAN during SDEI entry. Masking of __user pointers no longer needs to use the dynamic value of addr_limit, and can use a constant derived from the maximum possible userspace task size. A new TASK_SIZE_MAX constant is introduced for this, which is also used by core code. In configurations supporting 52-bit VAs, this may include a region of unusable VA space above a 48-bit TTBR0 limit, but never includes any portion of TTBR1. Note that TASK_SIZE_MAX is an exclusive limit, while USER_DS and KERNEL_DS were inclusive limits, and is converted to a mask by subtracting one. As the SDEI entry code repurposes the otherwise unnecessary pt_regs::orig_addr_limit field to store the TTBR1 of the interrupted context, for now we rename that to pt_regs::sdei_ttbr1. In future we can consider factoring that out. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: James Morse <james.morse@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201202131558.39270-10-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-12-02 19:49:11 +00:00
Mark Rutland	7b90dc40e3	arm64: uaccess cleanup macro naming Now the uaccess primitives use LDTR/STTR unconditionally, the uao_{ldp,stp,user_alternative} asm macros are misnamed, and have a redundant argument. Let's remove the redundant argument and rename these to user_{ldp,stp,ldst} respectively to clean this up. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Robin Murohy <robin.murphy@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201202131558.39270-9-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-12-02 19:49:11 +00:00
Mark Rutland	fc703d8013	arm64: uaccess: split user/kernel routines This patch separates arm64's user and kernel memory access primitives into distinct routines, adding new __{get,put}_kernel_nofault() helpers to access kernel memory, upon which core code builds larger copy routines. The kernel access routines (using LDR/STR) are not affected by PAN (when legitimately accessing kernel memory), nor are they affected by UAO. Switching to KERNEL_DS may set UAO, but this does not adversely affect the kernel access routines. The user access routines (using LDTR/STTR) are not affected by PAN (when legitimately accessing user memory), but are affected by UAO. As these are only legitimate to use under USER_DS with UAO clear, this should not be problematic. Routines performing atomics to user memory (futex and deprecated instruction emulation) still need to transiently clear PAN, and these are left as-is. These are never used on kernel memory. Subsequent patches will refactor the uaccess helpers to remove redundant code, and will also remove the redundant PAN/UAO manipulation. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201202131558.39270-8-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-12-02 19:49:11 +00:00
Mark Rutland	f253d827f3	arm64: uaccess: refactor __{get,put}_user As a step towards implementing __{get,put}_kernel_nofault(), this patch splits most user-memory specific logic out of __{get,put}_user(), with the memory access and fault handling in new __{raw_get,put}_mem() helpers. For now the LDR/LDTR patching is left within the *get_mem() helpers, and will be removed in a subsequent patch. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201202131558.39270-7-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-12-02 19:49:11 +00:00
Mark Rutland	923e1e7d82	arm64: uaccess: rename privileged uaccess routines We currently have many uaccess_{enable,disable}() variants, which subsequent patches will cut down as part of removing set_fs() and friends. Once this simplification is made, most uaccess routines will only need to ensure that the user page tables are mapped in TTBR0, as is currently dealt with by uaccess_ttbr0_{enable,disable}(). The existing uaccess_{enable,disable}() routines ensure that user page tables are mapped in TTBR0, and also disable PAN protections, which is necessary to be able to use atomics on user memory, but also permit unrelated privileged accesses to access user memory. As preparatory step, let's rename uaccess_{enable,disable}() to uaccess_{enable,disable}_privileged(), highlighting this caveat and discouraging wider misuse. Subsequent patches can reuse the uaccess_{enable,disable}() naming for the common case of ensuring the user page tables are mapped in TTBR0. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201202131558.39270-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-12-02 19:49:10 +00:00
Mark Rutland	2376e75cc7	arm64: sdei: explicitly simulate PAN/UAO entry In preparation for removing addr_limit and set_fs() we must decouple the SDEI PAN/UAO manipulation from the uaccess code, and explicitly reinitialize these as required. SDEI enters the kernel with a non-architectural exception, and prior to the most recent revision of the specification (ARM DEN 0054B), PSTATE bits (e.g. PAN, UAO) are not manipulated in the same way as for architectural exceptions. Notably, older versions of the spec can be read ambiguously as to whether PSTATE bits are inherited unchanged from the interrupted context or whether they are generated from scratch, with TF-A doing the latter. We have three cases to consider: 1) The existing TF-A implementation of SDEI will clear PAN and clear UAO (along with other bits in PSTATE) when delivering an SDEI exception. 2) In theory, implementations of SDEI prior to revision B could inherit PAN and UAO (along with other bits in PSTATE) unchanged from the interrupted context. However, in practice such implementations do not exist. 3) Going forward, new implementations of SDEI must clear UAO, and depending on SCTLR_ELx.SPAN must either inherit or set PAN. As we can ignore (2) we can assume that upon SDEI entry, UAO is always clear, though PAN may be clear, inherited, or set per SCTLR_ELx.SPAN. Therefore, we must explicitly initialize PAN, but do not need to do anything for UAO. Considering what we need to do: * When set_fs() is removed, force_uaccess_begin() will have no HW side-effects. As this only clears UAO, which we can assume has already been cleared upon entry, this is not a problem. We do not need to add code to manipulate UAO explicitly. * PAN may be cleared upon entry (in case 1 above), so where a kernel is built to use PAN and this is supported by all CPUs, the kernel must set PAN upon entry to ensure expected behaviour. * PAN may be inherited from the interrupted context (in case 3 above), and so where a kernel is not built to use PAN or where PAN support is not uniform across CPUs, the kernel must clear PAN to ensure expected behaviour. This patch reworks the SDEI code accordingly, explicitly setting PAN to the expected state in all cases. To cater for the cases where the kernel does not use PAN or this is not uniformly supported by hardware we add a new cpu_has_pan() helper which can be used regardless of whether the kernel is built to use PAN. The existing system_uses_ttbr0_pan() is redefined in terms of system_uses_hw_pan() both for clarity and as a minor optimization when HW PAN is not selected. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201202131558.39270-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-12-02 19:48:57 +00:00
Mark Rutland	d87a8e65b5	arm64: head.S: always initialize PSTATE As with SCTLR_ELx and other control registers, some PSTATE bits are UNKNOWN out-of-reset, and we may not be able to rely on hardware or firmware to initialize them to our liking prior to entry to the kernel, e.g. in the primary/secondary boot paths and return from idle/suspend. It would be more robust (and easier to reason about) if we consistently initialized PSTATE to a default value, as we do with control registers. This will ensure that the kernel is not adversely affected by bits it is not aware of, e.g. when support for a feature such as PAN/UAO is disabled. This patch ensures that PSTATE is consistently initialized at boot time via an ERET. This is not intended to relax the existing requirements (e.g. DAIF bits must still be set prior to entering the kernel). For features detected dynamically (which may require system-wide support), it is still necessary to subsequently modify PSTATE. As ERET is not always a Context Synchronization Event, an ISB is placed before each exception return to ensure updates to control registers have taken effect. This handles the kernel being entered with SCTLR_ELx.EOS clear (or any future control bits being in an UNKNOWN state). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201113124937.20574-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-12-02 19:44:03 +00:00
Mark Rutland	2ffac9e3fd	arm64: head.S: cleanup SCTLR_ELx initialization Let's make SCTLR_ELx initialization a bit clearer by using meaningful names for the initialization values, following the same scheme for SCTLR_EL1 and SCTLR_EL2. These definitions will be used more widely in subsequent patches. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201113124937.20574-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-12-02 19:44:03 +00:00
Mark Rutland	515d5c8a13	arm64: add C wrappers for SET_PSTATE_() To make callsites easier to read, add trivial C wrappers for the SET_PSTATE_() helpers, and convert trivial uses over to these. The new wrappers will be used further in subsequent patches. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201113124937.20574-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-12-02 19:44:02 +00:00
Yanan Wang	7d894834a3	KVM: arm64: Add usage of stage 2 fault lookup level in user_mem_abort() If we get a FSC_PERM fault, just using (logging_active && writable) to determine calling kvm_pgtable_stage2_map(). There will be two more cases we should consider. (1) After logging_active is configged back to false from true. When we get a FSC_PERM fault with write_fault and adjustment of hugepage is needed, we should merge tables back to a block entry. This case is ignored by still calling kvm_pgtable_stage2_relax_perms(), which will lead to an endless loop and guest panic due to soft lockup. (2) We use (FSC_PERM && logging_active && writable) to determine collapsing a block entry into a table by calling kvm_pgtable_stage2_map(). But sometimes we may only need to relax permissions when trying to write to a page other than a block. In this condition,using kvm_pgtable_stage2_relax_perms() will be fine. The ISS filed bit[1:0] in ESR_EL2 regesiter indicates the stage2 lookup level at which a D-abort or I-abort occurred. By comparing granule of the fault lookup level with vma_pagesize, we can strictly distinguish conditions of calling kvm_pgtable_stage2_relax_perms() or kvm_pgtable_stage2_map(), and the above two cases will be well considered. Suggested-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201201201034.116760-4-wangyanan55@huawei.com	2020-12-02 09:53:29 +00:00
Vincenzo Frascino	9e5344e0ff	arm64: mte: Fix typo in macro definition UL in the definition of SYS_TFSR_EL1_TF1 was misspelled causing compilation issues when trying to implement in kernel MTE async mode. Fix the macro correcting the typo. Note: MTE async mode will be introduced with a future series. Fixes: c058b1c4a5ea ("arm64: mte: system register definitions") Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20201130170709.22309-1-vincenzo.frascino@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2020-11-30 17:36:52 +00:00
Marc Zyngier	4f1df628d4	KVM: arm64: Advertise ID_AA64PFR0_EL1.CSV3=1 if the CPUs are Meltdown-safe Cores that predate the introduction of ID_AA64PFR0_EL1.CSV3 to the ARMv8 architecture have this field set to 0, even of some of them are not affected by the vulnerability. The kernel maintains a list of unaffected cores (A53, A55 and a few others) so that it doesn't impose an expensive mitigation uncessarily. As we do for CSV2, let's expose the CSV3 property to guests that run on HW that is effectively not vulnerable. This can be reset to zero by writing to the ID register from userspace, ensuring that VMs can be migrated despite the new property being set. Reported-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-30 16:02:53 +00:00
Mark Rutland	f0cd5ac1e4	arm64: entry: fix NMI {user, kernel}->kernel transitions Exceptions which can be taken at (almost) any time are consdiered to be NMIs. On arm64 that includes: * SDEI events * GICv3 Pseudo-NMIs * Kernel stack overflows * Unexpected/unhandled exceptions ... but currently debug exceptions (BRKs, breakpoints, watchpoints, single-step) are not considered NMIs. As these can be taken at any time, kernel features (lockdep, RCU, ftrace) may not be in a consistent kernel state. For example, we may take an NMI from the idle code or partway through an entry/exit path. While nmi_enter() and nmi_exit() handle most of this state, notably they don't save/restore the lockdep state across an NMI being taken and handled. When interrupts are enabled and an NMI is taken, lockdep may see interrupts become disabled within the NMI code, but not see interrupts become enabled when returning from the NMI, leaving lockdep believing interrupts are disabled when they are actually disabled. The x86 code handles this in idtentry_{enter,exit}_nmi(), which will shortly be moved to the generic entry code. As we can't use either yet, we copy the x86 approach in arm64-specific helpers. All the NMI entrypoints are marked as noinstr to prevent any instrumentation handling code being invoked before the state has been corrected. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-11-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2020-11-30 12:11:38 +00:00
Mark Rutland	7cd1ea1010	arm64: entry: fix non-NMI kernel<->kernel transitions There are periods in kernel mode when RCU is not watching and/or the scheduler tick is disabled, but we can still take exceptions such as interrupts. The arm64 exception handlers do not account for this, and it's possible that RCU is not watching while an exception handler runs. The x86/generic entry code handles this by ensuring that all (non-NMI) kernel exception handlers call irqentry_enter() and irqentry_exit(), which handle RCU, lockdep, and IRQ flag tracing. We can't yet move to the generic entry code, and already hadnle the user<->kernel transitions elsewhere, so we add new kernel<->kernel transition helpers alog the lines of the generic entry code. Since we now track interrupts becoming masked when an exception is taken, local_daif_inherit() is modified to track interrupts becoming re-enabled when the original context is inherited. To balance the entry/exit paths, each handler masks all DAIF exceptions before exit_to_kernel_mode(). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-10-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2020-11-30 12:11:38 +00:00
Mark Rutland	1ec2f2c05b	arm64: ptrace: prepare for EL1 irq/rcu tracking Exceptions from EL1 may be taken when RCU isn't watching (e.g. in idle sequences), or when the lockdep hardirqs transiently out-of-sync with the hardware state (e.g. in the middle of local_irq_enable()). To correctly handle these cases, we'll need to save/restore this state across some exceptions taken from EL1. A series of subsequent patches will update EL1 exception handlers to handle this. In preparation for this, and to avoid dependencies between those patches, this patch adds two new fields to struct pt_regs so that exception handlers can track this state. Note that this is placed in pt_regs as some entry/exit sequences such as el1_irq are invoked from assembly, which makes it very difficult to add a separate structure as with the irqentry_state used by x86. We can separate this once more of the exception logic is moved to C. While the fields only need to be bool, they are both made u64 to keep pt_regs 16-byte aligned. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-9-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2020-11-30 12:11:38 +00:00
Mark Rutland	23529049c6	arm64: entry: fix non-NMI user<->kernel transitions When built with PROVE_LOCKING, NO_HZ_FULL, and CONTEXT_TRACKING_FORCE will WARN() at boot time that interrupts are enabled when we call context_tracking_user_enter(), despite the DAIF flags indicating that IRQs are masked. The problem is that we're not tracking IRQ flag changes accurately, and so lockdep believes interrupts are enabled when they are not (and vice-versa). We can shuffle things so to make this more accurate. For kernel->user transitions there are a number of constraints we need to consider: 1) When we call __context_tracking_user_enter() HW IRQs must be disabled and lockdep must be up-to-date with this. 2) Userspace should be treated as having IRQs enabled from the PoV of both lockdep and tracing. 3) As context_tracking_user_enter() stops RCU from watching, we cannot use RCU after calling it. 4) IRQ flag tracing and lockdep have state that must be manipulated before RCU is disabled. ... with similar constraints applying for user->kernel transitions, with the ordering reversed. The generic entry code has enter_from_user_mode() and exit_to_user_mode() helpers to handle this. We can't use those directly, so we add arm64 copies for now (without the instrumentation markers which aren't used on arm64). These replace the existing user_exit() and user_exit_irqoff() calls spread throughout handlers, and the exception unmasking is left as-is. Note that: * The accounting for debug exceptions from userspace now happens in el0_dbg() and ret_to_user(), so this is removed from debug_exception_enter() and debug_exception_exit(). As user_exit_irqoff() wakes RCU, the userspace-specific check is removed. * The accounting for syscalls now happens in el0_svc(), el0_svc_compat(), and ret_to_user(), so this is removed from el0_svc_common(). This does not adversely affect the workaround for erratum 1463225, as this does not depend on any of the state tracking. * In ret_to_user() we mask interrupts with local_daif_mask(), and so we need to inform lockdep and tracing. Here a trace_hardirqs_off() is sufficient and safe as we have not yet exited kernel context and RCU is usable. * As PROVE_LOCKING selects TRACE_IRQFLAGS, the ifdeferry in entry.S only needs to check for the latter. * EL0 SError handling will be dealt with in a subsequent patch, as this needs to be treated as an NMI. Prior to this patch, booting an appropriately-configured kernel would result in spats as below: \| DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled()) \| WARNING: CPU: 2 PID: 1 at kernel/locking/lockdep.c:5280 check_flags.part.54+0x1dc/0x1f0 \| Modules linked in: \| CPU: 2 PID: 1 Comm: init Not tainted 5.10.0-rc3 #3 \| Hardware name: linux,dummy-virt (DT) \| pstate: 804003c5 (Nzcv DAIF +PAN -UAO -TCO BTYPE=--) \| pc : check_flags.part.54+0x1dc/0x1f0 \| lr : check_flags.part.54+0x1dc/0x1f0 \| sp : ffff80001003bd80 \| x29: ffff80001003bd80 x28: ffff66ce801e0000 \| x27: 00000000ffffffff x26: 00000000000003c0 \| x25: 0000000000000000 x24: ffffc31842527258 \| x23: ffffc31842491368 x22: ffffc3184282d000 \| x21: 0000000000000000 x20: 0000000000000001 \| x19: ffffc318432ce000 x18: 0080000000000000 \| x17: 0000000000000000 x16: ffffc31840f18a78 \| x15: 0000000000000001 x14: ffffc3184285c810 \| x13: 0000000000000001 x12: 0000000000000000 \| x11: ffffc318415857a0 x10: ffffc318406614c0 \| x9 : ffffc318415857a0 x8 : ffffc31841f1d000 \| x7 : 647261685f706564 x6 : ffffc3183ff7c66c \| x5 : ffff66ce801e0000 x4 : 0000000000000000 \| x3 : ffffc3183fe00000 x2 : ffffc31841500000 \| x1 : e956dc24146b3500 x0 : 0000000000000000 \| Call trace: \| check_flags.part.54+0x1dc/0x1f0 \| lock_is_held_type+0x10c/0x188 \| rcu_read_lock_sched_held+0x70/0x98 \| __context_tracking_enter+0x310/0x350 \| context_tracking_enter.part.3+0x5c/0xc8 \| context_tracking_user_enter+0x6c/0x80 \| finish_ret_to_user+0x2c/0x13cr Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-8-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2020-11-30 12:11:38 +00:00
Mark Rutland	105fc33520	arm64: entry: move el1 irq/nmi logic to C In preparation for reworking the EL1 irq/nmi entry code, move the existing logic to C. We no longer need the asm_nmi_enter() and asm_nmi_exit() wrappers, so these are removed. The new C functions are marked noinstr, which prevents compiler instrumentation and runtime probing. In subsequent patches we'll want the new C helpers to be called in all cases, so we don't bother wrapping the calls with ifdeferry. Even when the new C functions are stubs the trivial calls are unlikely to have a measurable impact on the IRQ or NMI paths anyway. Prototypes are added to <asm/exception.h> as otherwise (in some configurations) GCC will complain about the lack of a forward declaration. We already do this for existing function, e.g. enter_from_user_mode(). The new helpers are marked as noinstr (which prevents all instrumentation, tracing, and kprobes). Otherwise, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-7-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2020-11-30 12:11:38 +00:00
Marc Zyngier	7f43c2014f	arm64: Make the Meltdown mitigation state available Our Meltdown mitigation state isn't exposed outside of cpufeature.c, contrary to the rest of the Spectre mitigation state. As we are going to use it in KVM, expose a arm64_get_meltdown_state() helper which returns the same possible values as arm64_get_spectre_v?_state(). Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-28 12:02:44 +00:00
Marc Zyngier	90f0e16c64	Merge branch 'kvm-arm64/misc-5.11' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-27 19:48:24 +00:00
Will Deacon	bf118a5cb7	KVM: arm64: Remove unused __extended_idmap_trampoline() prototype __extended_idmap_trampoline() was removed a long time ago by 3421e9d88d7a ("arm64: KVM: Simplify HYP init/teardown") so remove the unused function prototype. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201118194402.2892-4-will@kernel.org	2020-11-27 18:59:05 +00:00
Will Deacon	36fb4cd55f	KVM: arm64: Remove kvm_arch_vm_ioctl_check_extension() kvm_arch_vm_ioctl_check_extension() is only called from kvm_vm_ioctl_check_extension(), so we can inline it and remove the extra function. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201118194402.2892-3-will@kernel.org	2020-11-27 18:59:05 +00:00
Will Deacon	8d14797b53	KVM: arm64: Move 'struct kvm_arch_memory_slot' out of uapi/ 'struct kvm_arch_memory_slot' isn't part of the user ABI, so move it out of the uapi/ headers in case we start using it in future and accidentally back ourselves into a corner. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201118194402.2892-2-will@kernel.org	2020-11-27 18:59:05 +00:00
Linus Torvalds	e4e9458073	arm64 fixes for -rc6 - Fix kerneldoc warnings generated by ACPI IORT code - Fix pte_accessible() so that access flag is ignored - Fix missing header #include - Fix loss of software dirty bit across pte_wrprotect() when HW DBM is enabled -----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl/A4moQHHdpbGxAa2Vy bmVsLm9yZwAKCRC3rHDchMFjNKhtB/4kdY/Do6h8ivkVNy1oId9mWSHrk9HwoEmS wFvyTIuyhaS2NlNW2/9V662e+wWhzv0ULkDu6aSVsnuw88b1yz1X2ytuh/t8beMJ PZGWO9UUuVuf+YjcaI7ygppXHILQWdau8eyOVMTciMOYYZRdIqex8WPApxpATRO4 7TphGUuo0IH0GGwq4u6JhPrT0Ihr9txWnvqnM8+fP/y3IbFncH+GKjxZq0k0BwcK 1fkIINwRFHmvkPouPyRPzE6DBudIlHPMGS68CXJV5/T4+JQ5bjc/bmAm8AYVVgip xSUeKg/x7AMF/8vho0pnvxAqh0FF6c370EjiKBk6AIGNJhjdROxt =e2/i -----END PGP SIGNATURE----- Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "The main changes are relating to our handling of access/dirty bits, where our low-level page-table helpers could lead to stale young mappings and loss of the dirty bit in some cases (the latter has not been observed in practice, but could happen when clearing "soft-dirty" if we enabled that). These were posted as part of a larger series, but the rest of that is less urgent and needs a v2 which I'll get to shortly. In other news, we've now got a set of fixes to resolve the lockdep/tracing problems that have been plaguing us for a while, but they're still a bit "fresh" and I plan to send them to you next week after we've got some more confidence in them (although initial CI results look good). Summary: - Fix kerneldoc warnings generated by ACPI IORT code - Fix pte_accessible() so that access flag is ignored - Fix missing header #include - Fix loss of software dirty bit across pte_wrprotect() when HW DBM is enabled" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: pgtable: Ensure dirty bit is preserved across pte_wrprotect() arm64: pgtable: Fix pte_accessible() ACPI/IORT: Fix doc warnings in iort.c arm64/fpsimd: add <asm/insn.h> to <asm/kprobes.h> to fix fpsimd build	2020-11-27 10:44:59 -08:00
Marc Zyngier	dc2286f397	Merge branch 'kvm-arm64/vector-rework' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-27 11:47:08 +00:00
Marc Zyngier	6e5d8c713d	Merge branch 'kvm-arm64/pmu-undef' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-27 11:46:47 +00:00
Marc Zyngier	14bda7a927	KVM: arm64: Add kvm_vcpu_has_pmu() helper There are a number of places where we check for the KVM_ARM_VCPU_PMU_V3 feature. Wrap this check into a new kvm_vcpu_has_pmu(), and use it at the existing locations. No functional change. Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-27 11:39:14 +00:00
Marc Zyngier	8c38602fb3	Merge branch 'kvm-arm64/host-hvc-table' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-27 11:33:27 +00:00
Marc Zyngier	149f120edb	Merge branch 'kvm-arm64/copro-no-more' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-27 11:33:16 +00:00
Marc Zyngier	37da329ed6	Merge branch 'kvm-arm64/el2-pc' into kvmarm-master/next Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-27 11:33:10 +00:00
Marc Zyngier	29052f1b92	KVM: arm64: Simplify __kvm_enable_ssbs() Move the setting of SSBS directly into the HVC handler, using the C helpers rather than the inline asssembly code. Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-27 11:32:44 +00:00
Marc Zyngier	68b824e428	KVM: arm64: Patch kimage_voffset instead of loading the EL1 value Directly using the kimage_voffset variable is fine for now, but will become more problematic as we start distrusting EL1. Instead, patch the kimage_voffset into the HYP text, ensuring we don't have to load an untrusted value later on. Signed-off-by: Marc Zyngier <maz@kernel.org>	2020-11-27 11:32:43 +00:00
Peter Collingbourne	49b3cf035e	kasan: arm64: set TCR_EL1.TBID1 when enabled On hardware supporting pointer authentication, we previously ended up enabling TBI on instruction accesses when tag-based ASAN was enabled, but this was costing us 8 bits of PAC entropy, which was unnecessary since tag-based ASAN does not require TBI on instruction accesses. Get them back by setting TCR_EL1.TBID1. Signed-off-by: Peter Collingbourne <pcc@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Link: https://linux-review.googlesource.com/id/I3dded7824be2e70ea64df0aabab9598d5aebfcc4 Link: https://lore.kernel.org/r/20f64e26fc8a1309caa446fffcb1b4e2fe9e229f.1605952129.git.pcc@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-11-25 16:53:19 +00:00
Peter Collingbourne	dceec3ff78	arm64: expose FAR_EL1 tag bits in siginfo The kernel currently clears the tag bits (i.e. bits 56-63) in the fault address exposed via siginfo.si_addr and sigcontext.fault_address. However, the tag bits may be needed by tools in order to accurately diagnose memory errors, such as HWASan [1] or future tools based on the Memory Tagging Extension (MTE). Expose these bits via the arch_untagged_si_addr mechanism, so that they are only exposed to signal handlers with the SA_EXPOSE_TAGBITS flag set. [1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html Signed-off-by: Peter Collingbourne <pcc@google.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://linux-review.googlesource.com/id/Ia8876bad8c798e0a32df7c2ce1256c4771c81446 Link: https://lore.kernel.org/r/0010296597784267472fa13b39f8238d87a72cf8.1605904350.git.pcc@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-11-23 18:17:39 +00:00
Will Deacon	ff1712f953	arm64: pgtable: Ensure dirty bit is preserved across pte_wrprotect() With hardware dirty bit management, calling pte_wrprotect() on a writable, dirty PTE will lose the dirty state and return a read-only, clean entry. Move the logic from ptep_set_wrprotect() into pte_wrprotect() to ensure that the dirty bit is preserved for writable entries, as this is required for soft-dirty bit management if we enable it in the future. Cc: <stable@vger.kernel.org> Fixes: 2f4b829c625e ("arm64: Add support for hardware updates of the access and dirty pte bits") Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20201120143557.6715-3-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2020-11-23 16:13:18 +00:00
Will Deacon	07509e10dc	arm64: pgtable: Fix pte_accessible() pte_accessible() is used by ptep_clear_flush() to figure out whether TLB invalidation is necessary when unmapping pages for reclaim. Although our implementation is correct according to the architecture, returning true only for valid, young ptes in the absence of racing page-table modifications, this is in fact flawed due to lazy invalidation of old ptes in ptep_clear_flush_young() where we elide the expensive DSB instruction for completing the TLB invalidation. Rather than penalise the aging path, adjust pte_accessible() to return true for any valid pte, even if the access flag is cleared. Cc: <stable@vger.kernel.org> Fixes: 76c714be0e5e ("arm64: pgtable: implement pte_accessible()") Reported-by: Yu Zhao <yuzhao@google.com> Acked-by: Yu Zhao <yuzhao@google.com> Reviewed-by: Minchan Kim <minchan@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20201120143557.6715-2-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2020-11-23 16:13:12 +00:00
Randy Dunlap	03659efe42	arm64/fpsimd: add <asm/insn.h> to <asm/kprobes.h> to fix fpsimd build Adding <asm/exception.h> brought in <asm/kprobes.h> which uses <asm/probes.h>, which uses 'pstate_check_t' so the latter needs to #include <asm/insn.h> for this typedef. Fixes this build error: In file included from arch/arm64/include/asm/kprobes.h:24, from arch/arm64/include/asm/exception.h:11, from arch/arm64/kernel/fpsimd.c:35: arch/arm64/include/asm/probes.h:16:2: error: unknown type name 'pstate_check_t' 16 \| pstate_check_t *pstate_cc; Fixes: c6b90d5cf637 ("arm64/fpsimd: Fix missing-prototypes in fpsimd.c") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Tian Tao <tiantao6@hisilicon.com> Link: https://lore.kernel.org/r/20201123044510.9942-1-rdunlap@infradead.org Signed-off-by: Will Deacon <will@kernel.org>	2020-11-23 10:59:00 +00:00
Thomas Gleixner	2cb0837e56	arm64: irqstat: Get rid of duplicated declaration irq_cpustat_t is exactly the same as the asm-generic one. Define ack_bad_irq so the generic header does not emit the generic version of it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201113141733.392015387@linutronix.de	2020-11-23 10:31:05 +01:00
Kees Cook	ffde703470	arm64: Enable seccomp architecture tracking To enable seccomp constant action bitmaps, we need to have a static mapping to the audit architecture and system call table size. Add these for arm64. Signed-off-by: Kees Cook <keescook@chromium.org>	2020-11-20 11:16:34 -08:00
Will Deacon	4f6a36fed7	KVM: arm64: Remove redundant hyp vectors entry The hyp vectors entry corresponding to HYP_VECTOR_DIRECT (i.e. when neither Spectre-v2 nor Spectre-v3a are present) is unused, as we can simply dispatch straight to __kvm_hyp_vector in this case. Remove the redundant vector, and massage the logic for resolving a slot to a vectors entry. Reported-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201113113847.21619-11-will@kernel.org	2020-11-16 10:43:06 +00:00
Will Deacon	cd1f56b930	arm64: spectre: Consolidate spectre-v3a detection The spectre-v3a mitigation is split between cpu_errata.c and spectre.c, with the former handling detection of the problem and the latter handling enabling of the workaround. Move the detection logic alongside the enabling logic, like we do for the other spectre mitigations. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20201113113847.21619-10-will@kernel.org	2020-11-16 10:43:06 +00:00
Will Deacon	c4792b6dbc	arm64: spectre: Rename ARM64_HARDEN_EL2_VECTORS to ARM64_SPECTRE_V3A Since ARM64_HARDEN_EL2_VECTORS is really a mitigation for Spectre-v3a, rename it accordingly for consistency with the v2 and v4 mitigation. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20201113113847.21619-9-will@kernel.org	2020-11-16 10:43:06 +00:00
Will Deacon	b881cdce77	KVM: arm64: Allocate hyp vectors statically The EL2 vectors installed when a guest is running point at one of the following configurations for a given CPU: - Straight at __kvm_hyp_vector - A trampoline containing an SMC sequence to mitigate Spectre-v2 and then a direct branch to __kvm_hyp_vector - A dynamically-allocated trampoline which has an indirect branch to __kvm_hyp_vector - A dynamically-allocated trampoline containing an SMC sequence to mitigate Spectre-v2 and then an indirect branch to __kvm_hyp_vector The indirect branches mean that VA randomization at EL2 isn't trivially bypassable using Spectre-v3a (where the vector base is readable by the guest). Rather than populate these vectors dynamically, configure everything statically and use an enumerated type to identify the vector "slot" corresponding to one of the configurations above. This both simplifies the code, but also makes it much easier to implement at EL2 later on. Signed-off-by: Will Deacon <will@kernel.org> [maz: fixed double call to kvm_init_vector_slots() on nVHE] Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20201113113847.21619-8-will@kernel.org	2020-11-16 10:43:05 +00:00
Will Deacon	6279017e80	KVM: arm64: Move BP hardening helpers into spectre.h The BP hardening helpers are an integral part of the Spectre-v2 mitigation, so move them into asm/spectre.h and inline the arm64_get_bp_hardening_data() function at the same time. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20201113113847.21619-6-will@kernel.org	2020-11-16 10:40:18 +00:00
Will Deacon	07cf8aa922	KVM: arm64: Make BP hardening globals static instead Branch predictor hardening of the hyp vectors is partially driven by a couple of global variables ('__kvm_bp_vect_base' and '__kvm_harden_el2_vector_slot'). However, these are only used within a single compilation unit, so internalise them there instead. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20201113113847.21619-5-will@kernel.org	2020-11-16 10:40:18 +00:00
Will Deacon	042c76a950	KVM: arm64: Move kvm_get_hyp_vector() out of header file kvm_get_hyp_vector() has only one caller, so move it out of kvm_mmu.h and inline it into a new function, cpu_set_hyp_vector(), for setting the vector. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20201113113847.21619-4-will@kernel.org	2020-11-16 10:40:17 +00:00
Linus Torvalds	0062442ecf	Fixes for ARM and x86, the latter especially for old processors without two-dimensional paging (EPT/NPT). -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl+xQ54UHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroMzeQf+JP9NpXgeB7dhiODhmO5SyLdw0u9j kVOM6+kHcEvG6o0yU1uUZr2ZPh9vIAwIjXi8Luiodcazdp6jvxvJ32CeMYJz2lel y+3Gjp3WS2+FExOjBephBztaMHLihlWQt3E0EKuCc7StyfMhaZooiTRMpvrmiLWe HQ/epM9oLMyrCqG9MKkvTwH0lDyB5CprV1BNt6YyKjt7d5swEqC75A6lOXnmdAah utgx1agSIVQPv6vDF9HLaQaoelHT7ucudx+zIkvOAmoQ56AJMPfCr0+Af3ZVW+f/ I5tXVfBhoOV3BVSIsJS7Px0HcZt7siVtl6ISZZos8ox85S4ysjWm2vXFcQ== =MiOr -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "Fixes for ARM and x86, the latter especially for old processors without two-dimensional paging (EPT/NPT)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm: mmu: fix is_tdp_mmu_check when the TDP MMU is not in use KVM: SVM: Update cr3_lm_rsvd_bits for AMD SEV guests KVM: x86: Introduce cr3_lm_rsvd_bits in kvm_vcpu_arch KVM: x86: clflushopt should be treated as a no-op by emulation KVM: arm64: Handle SCXTNUM_ELx traps KVM: arm64: Unify trap handlers injecting an UNDEF KVM: arm64: Allow setting of ID_AA64PFR0_EL1.CSV2 from userspace	2020-11-15 09:57:58 -08:00
Ionela Voinescu	68c5debcc0	arm64: implement CPPC FFH support using AMUs If Activity Monitors (AMUs) are present, two of the counters can be used to implement support for CPPC's (Collaborative Processor Performance Control) delivered and reference performance monitoring functionality using FFH (Functional Fixed Hardware). Given that counters for a certain CPU can only be read from that CPU, while FFH operations can be called from any CPU for any of the CPUs, use smp_call_function_single() to provide the requested values. Therefore, depending on the register addresses, the following values are returned: - 0x0 (DeliveredPerformanceCounterRegister): AMU core counter - 0x1 (ReferencePerformanceCounterRegister): AMU constant counter The use of Activity Monitors is hidden behind the generic cpu_read_{corecnt,constcnt}() functions. Read functionality for these two registers represents the only current FFH support for CPPC. Read operations for other register values or write operation for all registers are unsupported. Therefore, keep CPPC's FFH unsupported if no CPUs have valid AMU frequency counters. For this purpose, the get_cpu_with_amu_feat() is introduced. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201106125334.21570-4-ionela.voinescu@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-11-13 20:05:10 +00:00
Ionela Voinescu	4b9cf23c17	arm64: wrap and generalise counter read functions In preparation for other uses of Activity Monitors (AMU) cycle counters, place counter read functionality in generic functions that can reused: read_corecnt() and read_constcnt(). As a result, implement update_freq_counters_refs() to replace init_cpu_freq_invariance_counters() and both initialise and update the per-cpu reference variables. Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201106125334.21570-2-ionela.voinescu@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-11-13 20:05:10 +00:00

... 2 3 4 5 6 ...

3536 Commits