2019-06-03 07:44:50 +02:00
// SPDX-License-Identifier: GPL-2.0-only
2012-03-05 11:49:27 +00:00
/*
* Based on arch / arm / kernel / traps . c
*
* Copyright ( C ) 1995 - 2009 Russell King
* Copyright ( C ) 2012 ARM Ltd .
*/
2015-07-24 16:37:48 +01:00
# include <linux/bug.h>
2019-08-20 18:45:57 +01:00
# include <linux/context_tracking.h>
2012-03-05 11:49:27 +00:00
# include <linux/signal.h>
# include <linux/kallsyms.h>
2019-08-20 18:45:57 +01:00
# include <linux/kprobes.h>
2012-03-05 11:49:27 +00:00
# include <linux/spinlock.h>
# include <linux/uaccess.h>
# include <linux/hardirq.h>
# include <linux/kdebug.h>
# include <linux/module.h>
# include <linux/kexec.h>
# include <linux/delay.h>
2023-02-01 09:50:07 +01:00
# include <linux/efi.h>
2012-03-05 11:49:27 +00:00
# include <linux/init.h>
2017-02-08 18:51:30 +01:00
# include <linux/sched/signal.h>
2017-02-08 18:51:35 +01:00
# include <linux/sched/debug.h>
2017-02-08 18:51:37 +01:00
# include <linux/sched/task_stack.h>
arm64: add VMAP_STACK overflow detection
This patch adds stack overflow detection to arm64, usable when vmap'd stacks
are in use.
Overflow is detected in a small preamble executed for each exception entry,
which checks whether there is enough space on the current stack for the general
purpose registers to be saved. If there is not enough space, the overflow
handler is invoked on a per-cpu overflow stack. This approach preserves the
original exception information in ESR_EL1 (and where appropriate, FAR_EL1).
Task and IRQ stacks are aligned to double their size, enabling overflow to be
detected with a single bit test. For example, a 16K stack is aligned to 32K,
ensuring that bit 14 of the SP must be zero. On an overflow (or underflow),
this bit is flipped. Thus, overflow (of less than the size of the stack) can be
detected by testing whether this bit is set.
The overflow check is performed before any attempt is made to access the
stack, avoiding recursive faults (and the loss of exception information
these would entail). As logical operations cannot be performed on the SP
directly, the SP is temporarily swapped with a general purpose register
using arithmetic operations to enable the test to be performed.
This gives us a useful error message on stack overflow, as can be trigger with
the LKDTM overflow test:
[ 305.388749] lkdtm: Performing direct entry OVERFLOW
[ 305.395444] Insufficient stack space to handle exception!
[ 305.395482] ESR: 0x96000047 -- DABT (current EL)
[ 305.399890] FAR: 0xffff00000a5e7f30
[ 305.401315] Task stack: [0xffff00000a5e8000..0xffff00000a5ec000]
[ 305.403815] IRQ stack: [0xffff000008000000..0xffff000008004000]
[ 305.407035] Overflow stack: [0xffff80003efce4e0..0xffff80003efcf4e0]
[ 305.409622] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[ 305.412785] Hardware name: linux,dummy-virt (DT)
[ 305.415756] task: ffff80003d051c00 task.stack: ffff00000a5e8000
[ 305.419221] PC is at recursive_loop+0x10/0x48
[ 305.421637] LR is at recursive_loop+0x38/0x48
[ 305.423768] pc : [<ffff00000859f330>] lr : [<ffff00000859f358>] pstate: 40000145
[ 305.428020] sp : ffff00000a5e7f50
[ 305.430469] x29: ffff00000a5e8350 x28: ffff80003d051c00
[ 305.433191] x27: ffff000008981000 x26: ffff000008f80400
[ 305.439012] x25: ffff00000a5ebeb8 x24: ffff00000a5ebeb8
[ 305.440369] x23: ffff000008f80138 x22: 0000000000000009
[ 305.442241] x21: ffff80003ce65000 x20: ffff000008f80188
[ 305.444552] x19: 0000000000000013 x18: 0000000000000006
[ 305.446032] x17: 0000ffffa2601280 x16: ffff0000081fe0b8
[ 305.448252] x15: ffff000008ff546d x14: 000000000047a4c8
[ 305.450246] x13: ffff000008ff7872 x12: 0000000005f5e0ff
[ 305.452953] x11: ffff000008ed2548 x10: 000000000005ee8d
[ 305.454824] x9 : ffff000008545380 x8 : ffff00000a5e8770
[ 305.457105] x7 : 1313131313131313 x6 : 00000000000000e1
[ 305.459285] x5 : 0000000000000000 x4 : 0000000000000000
[ 305.461781] x3 : 0000000000000000 x2 : 0000000000000400
[ 305.465119] x1 : 0000000000000013 x0 : 0000000000000012
[ 305.467724] Kernel panic - not syncing: kernel stack overflow
[ 305.470561] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[ 305.473325] Hardware name: linux,dummy-virt (DT)
[ 305.475070] Call trace:
[ 305.476116] [<ffff000008088ad8>] dump_backtrace+0x0/0x378
[ 305.478991] [<ffff000008088e64>] show_stack+0x14/0x20
[ 305.481237] [<ffff00000895a178>] dump_stack+0x98/0xb8
[ 305.483294] [<ffff0000080c3288>] panic+0x118/0x280
[ 305.485673] [<ffff0000080c2e9c>] nmi_panic+0x6c/0x70
[ 305.486216] [<ffff000008089710>] handle_bad_stack+0x118/0x128
[ 305.486612] Exception stack(0xffff80003efcf3a0 to 0xffff80003efcf4e0)
[ 305.487334] f3a0: 0000000000000012 0000000000000013 0000000000000400 0000000000000000
[ 305.488025] f3c0: 0000000000000000 0000000000000000 00000000000000e1 1313131313131313
[ 305.488908] f3e0: ffff00000a5e8770 ffff000008545380 000000000005ee8d ffff000008ed2548
[ 305.489403] f400: 0000000005f5e0ff ffff000008ff7872 000000000047a4c8 ffff000008ff546d
[ 305.489759] f420: ffff0000081fe0b8 0000ffffa2601280 0000000000000006 0000000000000013
[ 305.490256] f440: ffff000008f80188 ffff80003ce65000 0000000000000009 ffff000008f80138
[ 305.490683] f460: ffff00000a5ebeb8 ffff00000a5ebeb8 ffff000008f80400 ffff000008981000
[ 305.491051] f480: ffff80003d051c00 ffff00000a5e8350 ffff00000859f358 ffff00000a5e7f50
[ 305.491444] f4a0: ffff00000859f330 0000000040000145 0000000000000000 0000000000000000
[ 305.492008] f4c0: 0001000000000000 0000000000000000 ffff00000a5e8350 ffff00000859f330
[ 305.493063] [<ffff00000808205c>] __bad_stack+0x88/0x8c
[ 305.493396] [<ffff00000859f330>] recursive_loop+0x10/0x48
[ 305.493731] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494088] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494425] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494649] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494898] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495205] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495453] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495708] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496000] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496302] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496644] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496894] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.497138] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.497325] [<ffff00000859f3dc>] lkdtm_OVERFLOW+0x14/0x20
[ 305.497506] [<ffff00000859f314>] lkdtm_do_action+0x1c/0x28
[ 305.497786] [<ffff00000859f178>] direct_entry+0xe0/0x170
[ 305.498095] [<ffff000008345568>] full_proxy_write+0x60/0xa8
[ 305.498387] [<ffff0000081fb7f4>] __vfs_write+0x1c/0x128
[ 305.498679] [<ffff0000081fcc68>] vfs_write+0xa0/0x1b0
[ 305.498926] [<ffff0000081fe0fc>] SyS_write+0x44/0xa0
[ 305.499182] Exception stack(0xffff00000a5ebec0 to 0xffff00000a5ec000)
[ 305.499429] bec0: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[ 305.499674] bee0: 574f4c465245564f 0000000000000000 0000000000000000 8000000080808080
[ 305.499904] bf00: 0000000000000040 0000000000000038 fefefeff1b4bc2ff 7f7f7f7f7f7fff7f
[ 305.500189] bf20: 0101010101010101 0000000000000000 000000000047a4c8 0000000000000038
[ 305.500712] bf40: 0000000000000000 0000ffffa2601280 0000ffffc63f6068 00000000004b5000
[ 305.501241] bf60: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[ 305.501791] bf80: 0000000000000020 0000000000000000 00000000004b5000 000000001c4cc458
[ 305.502314] bfa0: 0000000000000000 0000ffffc63f7950 000000000040a3c4 0000ffffc63f70e0
[ 305.502762] bfc0: 0000ffffa2601268 0000000080000000 0000000000000001 0000000000000040
[ 305.503207] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 305.503680] [<ffff000008082fb0>] el0_svc_naked+0x24/0x28
[ 305.504720] Kernel Offset: disabled
[ 305.505189] CPU features: 0x002082
[ 305.505473] Memory Limit: none
[ 305.506181] ---[ end Kernel panic - not syncing: kernel stack overflow
This patch was co-authored by Ard Biesheuvel and Mark Rutland.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-07-14 20:30:35 +01:00
# include <linux/sizes.h>
2012-03-05 11:49:27 +00:00
# include <linux/syscalls.h>
2017-02-04 00:16:44 +01:00
# include <linux/mm_types.h>
2018-12-28 00:30:54 -08:00
# include <linux/kasan.h>
2023-02-02 22:07:49 +00:00
# include <linux/ubsan.h>
2022-09-08 14:54:52 -07:00
# include <linux/cfi.h>
2012-03-05 11:49:27 +00:00
# include <asm/atomic.h>
2015-07-24 16:37:48 +01:00
# include <asm/bug.h>
2018-03-26 15:12:28 +01:00
# include <asm/cpufeature.h>
2017-11-02 12:12:34 +00:00
# include <asm/daifflags.h>
2013-03-16 08:48:13 +00:00
# include <asm/debug-monitors.h>
2023-02-01 09:50:07 +01:00
# include <asm/efi.h>
2014-11-18 12:16:30 +00:00
# include <asm/esr.h>
arm64: entry: fix NMI {user, kernel}->kernel transitions
Exceptions which can be taken at (almost) any time are consdiered to be
NMIs. On arm64 that includes:
* SDEI events
* GICv3 Pseudo-NMIs
* Kernel stack overflows
* Unexpected/unhandled exceptions
... but currently debug exceptions (BRKs, breakpoints, watchpoints,
single-step) are not considered NMIs.
As these can be taken at any time, kernel features (lockdep, RCU,
ftrace) may not be in a consistent kernel state. For example, we may
take an NMI from the idle code or partway through an entry/exit path.
While nmi_enter() and nmi_exit() handle most of this state, notably they
don't save/restore the lockdep state across an NMI being taken and
handled. When interrupts are enabled and an NMI is taken, lockdep may
see interrupts become disabled within the NMI code, but not see
interrupts become enabled when returning from the NMI, leaving lockdep
believing interrupts are disabled when they are actually disabled.
The x86 code handles this in idtentry_{enter,exit}_nmi(), which will
shortly be moved to the generic entry code. As we can't use either yet,
we copy the x86 approach in arm64-specific helpers. All the NMI
entrypoints are marked as noinstr to prevent any instrumentation
handling code being invoked before the state has been corrected.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201130115950.22492-11-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30 11:59:49 +00:00
# include <asm/exception.h>
2020-09-15 15:48:09 +01:00
# include <asm/extable.h>
2015-07-24 16:37:48 +01:00
# include <asm/insn.h>
2019-10-25 17:42:10 +01:00
# include <asm/kprobes.h>
2021-06-09 11:23:00 +01:00
# include <asm/patching.h>
2012-03-05 11:49:27 +00:00
# include <asm/traps.h>
arm64: add VMAP_STACK overflow detection
This patch adds stack overflow detection to arm64, usable when vmap'd stacks
are in use.
Overflow is detected in a small preamble executed for each exception entry,
which checks whether there is enough space on the current stack for the general
purpose registers to be saved. If there is not enough space, the overflow
handler is invoked on a per-cpu overflow stack. This approach preserves the
original exception information in ESR_EL1 (and where appropriate, FAR_EL1).
Task and IRQ stacks are aligned to double their size, enabling overflow to be
detected with a single bit test. For example, a 16K stack is aligned to 32K,
ensuring that bit 14 of the SP must be zero. On an overflow (or underflow),
this bit is flipped. Thus, overflow (of less than the size of the stack) can be
detected by testing whether this bit is set.
The overflow check is performed before any attempt is made to access the
stack, avoiding recursive faults (and the loss of exception information
these would entail). As logical operations cannot be performed on the SP
directly, the SP is temporarily swapped with a general purpose register
using arithmetic operations to enable the test to be performed.
This gives us a useful error message on stack overflow, as can be trigger with
the LKDTM overflow test:
[ 305.388749] lkdtm: Performing direct entry OVERFLOW
[ 305.395444] Insufficient stack space to handle exception!
[ 305.395482] ESR: 0x96000047 -- DABT (current EL)
[ 305.399890] FAR: 0xffff00000a5e7f30
[ 305.401315] Task stack: [0xffff00000a5e8000..0xffff00000a5ec000]
[ 305.403815] IRQ stack: [0xffff000008000000..0xffff000008004000]
[ 305.407035] Overflow stack: [0xffff80003efce4e0..0xffff80003efcf4e0]
[ 305.409622] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[ 305.412785] Hardware name: linux,dummy-virt (DT)
[ 305.415756] task: ffff80003d051c00 task.stack: ffff00000a5e8000
[ 305.419221] PC is at recursive_loop+0x10/0x48
[ 305.421637] LR is at recursive_loop+0x38/0x48
[ 305.423768] pc : [<ffff00000859f330>] lr : [<ffff00000859f358>] pstate: 40000145
[ 305.428020] sp : ffff00000a5e7f50
[ 305.430469] x29: ffff00000a5e8350 x28: ffff80003d051c00
[ 305.433191] x27: ffff000008981000 x26: ffff000008f80400
[ 305.439012] x25: ffff00000a5ebeb8 x24: ffff00000a5ebeb8
[ 305.440369] x23: ffff000008f80138 x22: 0000000000000009
[ 305.442241] x21: ffff80003ce65000 x20: ffff000008f80188
[ 305.444552] x19: 0000000000000013 x18: 0000000000000006
[ 305.446032] x17: 0000ffffa2601280 x16: ffff0000081fe0b8
[ 305.448252] x15: ffff000008ff546d x14: 000000000047a4c8
[ 305.450246] x13: ffff000008ff7872 x12: 0000000005f5e0ff
[ 305.452953] x11: ffff000008ed2548 x10: 000000000005ee8d
[ 305.454824] x9 : ffff000008545380 x8 : ffff00000a5e8770
[ 305.457105] x7 : 1313131313131313 x6 : 00000000000000e1
[ 305.459285] x5 : 0000000000000000 x4 : 0000000000000000
[ 305.461781] x3 : 0000000000000000 x2 : 0000000000000400
[ 305.465119] x1 : 0000000000000013 x0 : 0000000000000012
[ 305.467724] Kernel panic - not syncing: kernel stack overflow
[ 305.470561] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[ 305.473325] Hardware name: linux,dummy-virt (DT)
[ 305.475070] Call trace:
[ 305.476116] [<ffff000008088ad8>] dump_backtrace+0x0/0x378
[ 305.478991] [<ffff000008088e64>] show_stack+0x14/0x20
[ 305.481237] [<ffff00000895a178>] dump_stack+0x98/0xb8
[ 305.483294] [<ffff0000080c3288>] panic+0x118/0x280
[ 305.485673] [<ffff0000080c2e9c>] nmi_panic+0x6c/0x70
[ 305.486216] [<ffff000008089710>] handle_bad_stack+0x118/0x128
[ 305.486612] Exception stack(0xffff80003efcf3a0 to 0xffff80003efcf4e0)
[ 305.487334] f3a0: 0000000000000012 0000000000000013 0000000000000400 0000000000000000
[ 305.488025] f3c0: 0000000000000000 0000000000000000 00000000000000e1 1313131313131313
[ 305.488908] f3e0: ffff00000a5e8770 ffff000008545380 000000000005ee8d ffff000008ed2548
[ 305.489403] f400: 0000000005f5e0ff ffff000008ff7872 000000000047a4c8 ffff000008ff546d
[ 305.489759] f420: ffff0000081fe0b8 0000ffffa2601280 0000000000000006 0000000000000013
[ 305.490256] f440: ffff000008f80188 ffff80003ce65000 0000000000000009 ffff000008f80138
[ 305.490683] f460: ffff00000a5ebeb8 ffff00000a5ebeb8 ffff000008f80400 ffff000008981000
[ 305.491051] f480: ffff80003d051c00 ffff00000a5e8350 ffff00000859f358 ffff00000a5e7f50
[ 305.491444] f4a0: ffff00000859f330 0000000040000145 0000000000000000 0000000000000000
[ 305.492008] f4c0: 0001000000000000 0000000000000000 ffff00000a5e8350 ffff00000859f330
[ 305.493063] [<ffff00000808205c>] __bad_stack+0x88/0x8c
[ 305.493396] [<ffff00000859f330>] recursive_loop+0x10/0x48
[ 305.493731] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494088] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494425] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494649] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494898] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495205] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495453] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495708] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496000] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496302] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496644] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496894] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.497138] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.497325] [<ffff00000859f3dc>] lkdtm_OVERFLOW+0x14/0x20
[ 305.497506] [<ffff00000859f314>] lkdtm_do_action+0x1c/0x28
[ 305.497786] [<ffff00000859f178>] direct_entry+0xe0/0x170
[ 305.498095] [<ffff000008345568>] full_proxy_write+0x60/0xa8
[ 305.498387] [<ffff0000081fb7f4>] __vfs_write+0x1c/0x128
[ 305.498679] [<ffff0000081fcc68>] vfs_write+0xa0/0x1b0
[ 305.498926] [<ffff0000081fe0fc>] SyS_write+0x44/0xa0
[ 305.499182] Exception stack(0xffff00000a5ebec0 to 0xffff00000a5ec000)
[ 305.499429] bec0: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[ 305.499674] bee0: 574f4c465245564f 0000000000000000 0000000000000000 8000000080808080
[ 305.499904] bf00: 0000000000000040 0000000000000038 fefefeff1b4bc2ff 7f7f7f7f7f7fff7f
[ 305.500189] bf20: 0101010101010101 0000000000000000 000000000047a4c8 0000000000000038
[ 305.500712] bf40: 0000000000000000 0000ffffa2601280 0000ffffc63f6068 00000000004b5000
[ 305.501241] bf60: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[ 305.501791] bf80: 0000000000000020 0000000000000000 00000000004b5000 000000001c4cc458
[ 305.502314] bfa0: 0000000000000000 0000ffffc63f7950 000000000040a3c4 0000ffffc63f70e0
[ 305.502762] bfc0: 0000ffffa2601268 0000000080000000 0000000000000001 0000000000000040
[ 305.503207] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 305.503680] [<ffff000008082fb0>] el0_svc_naked+0x24/0x28
[ 305.504720] Kernel Offset: disabled
[ 305.505189] CPU features: 0x002082
[ 305.505473] Memory Limit: none
[ 305.506181] ---[ end Kernel panic - not syncing: kernel stack overflow
This patch was co-authored by Ard Biesheuvel and Mark Rutland.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-07-14 20:30:35 +01:00
# include <asm/smp.h>
2016-11-03 20:23:05 +00:00
# include <asm/stack_pointer.h>
2012-03-05 11:49:27 +00:00
# include <asm/stacktrace.h>
# include <asm/system_misc.h>
2016-06-28 18:07:32 +01:00
# include <asm/sysreg.h>
2012-03-05 11:49:27 +00:00
2021-03-03 18:05:30 +01:00
static bool __kprobes __check_eq ( unsigned long pstate )
{
return ( pstate & PSR_Z_BIT ) ! = 0 ;
}
static bool __kprobes __check_ne ( unsigned long pstate )
{
return ( pstate & PSR_Z_BIT ) = = 0 ;
}
static bool __kprobes __check_cs ( unsigned long pstate )
{
return ( pstate & PSR_C_BIT ) ! = 0 ;
}
static bool __kprobes __check_cc ( unsigned long pstate )
{
return ( pstate & PSR_C_BIT ) = = 0 ;
}
static bool __kprobes __check_mi ( unsigned long pstate )
{
return ( pstate & PSR_N_BIT ) ! = 0 ;
}
static bool __kprobes __check_pl ( unsigned long pstate )
{
return ( pstate & PSR_N_BIT ) = = 0 ;
}
static bool __kprobes __check_vs ( unsigned long pstate )
{
return ( pstate & PSR_V_BIT ) ! = 0 ;
}
static bool __kprobes __check_vc ( unsigned long pstate )
{
return ( pstate & PSR_V_BIT ) = = 0 ;
}
static bool __kprobes __check_hi ( unsigned long pstate )
{
pstate & = ~ ( pstate > > 1 ) ; /* PSR_C_BIT &= ~PSR_Z_BIT */
return ( pstate & PSR_C_BIT ) ! = 0 ;
}
static bool __kprobes __check_ls ( unsigned long pstate )
{
pstate & = ~ ( pstate > > 1 ) ; /* PSR_C_BIT &= ~PSR_Z_BIT */
return ( pstate & PSR_C_BIT ) = = 0 ;
}
static bool __kprobes __check_ge ( unsigned long pstate )
{
pstate ^ = ( pstate < < 3 ) ; /* PSR_N_BIT ^= PSR_V_BIT */
return ( pstate & PSR_N_BIT ) = = 0 ;
}
static bool __kprobes __check_lt ( unsigned long pstate )
{
pstate ^ = ( pstate < < 3 ) ; /* PSR_N_BIT ^= PSR_V_BIT */
return ( pstate & PSR_N_BIT ) ! = 0 ;
}
static bool __kprobes __check_gt ( unsigned long pstate )
{
/*PSR_N_BIT ^= PSR_V_BIT */
unsigned long temp = pstate ^ ( pstate < < 3 ) ;
temp | = ( pstate < < 1 ) ; /*PSR_N_BIT |= PSR_Z_BIT */
return ( temp & PSR_N_BIT ) = = 0 ;
}
static bool __kprobes __check_le ( unsigned long pstate )
{
/*PSR_N_BIT ^= PSR_V_BIT */
unsigned long temp = pstate ^ ( pstate < < 3 ) ;
temp | = ( pstate < < 1 ) ; /*PSR_N_BIT |= PSR_Z_BIT */
return ( temp & PSR_N_BIT ) ! = 0 ;
}
static bool __kprobes __check_al ( unsigned long pstate )
{
return true ;
}
/*
* Note that the ARMv8 ARM calls condition code 0 b1111 " nv " , but states that
* it behaves identically to 0 b1110 ( " al " ) .
*/
pstate_check_t * const aarch32_opcode_cond_checks [ 16 ] = {
__check_eq , __check_ne , __check_cs , __check_cc ,
__check_mi , __check_pl , __check_vs , __check_vc ,
__check_hi , __check_ls , __check_ge , __check_lt ,
__check_gt , __check_le , __check_al , __check_al
} ;
2018-02-01 23:13:38 +01:00
int show_unhandled_signals = 0 ;
2012-03-05 11:49:27 +00:00
2019-06-26 20:50:13 +09:00
static void dump_kernel_instr ( const char * lvl , struct pt_regs * regs )
2012-03-05 11:49:27 +00:00
{
unsigned long addr = instruction_pointer ( regs ) ;
char str [ sizeof ( " 00000000 " ) * 5 + 2 + 1 ] , * p = str ;
int i ;
2019-06-26 20:50:13 +09:00
if ( user_mode ( regs ) )
return ;
2012-03-05 11:49:27 +00:00
for ( i = - 4 ; i < 1 ; i + + ) {
unsigned int val , bad ;
2019-06-26 20:50:13 +09:00
bad = aarch64_insn_read ( & ( ( u32 * ) addr ) [ i ] , & val ) ;
2012-03-05 11:49:27 +00:00
if ( ! bad )
p + = sprintf ( p , i = = 0 ? " (%08x) " : " %08x " , val ) ;
arm64: traps: attempt to dump all instructions
Currently dump_kernel_instr() dumps a few instructions around the
pt_regs::pc value, dumping 4 instructions before the PC before dumping
the instruction at the PC. If an attempt to read an instruction fails,
it gives up and does not attempt to dump any subsequent instructions.
This is unfortunate when the pt_regs::pc value points to the start of a
page with a leading guard page, where the instruction at the PC can be
read, but prior instructions cannot.
This patch makes dump_kernel_instr() attempt to dump each instruction
regardless of whether reading a prior instruction could be read, which
gives a more useful code dump in such cases. When an instruction cannot
be read, it is reported as "????????", which cannot be confused with a
hex value,
For example, with a `UDF #0` (AKA 0x00000000) early in the kexec control
page, we'll now get the following code dump:
| Internal error: Oops - Undefined instruction: 0000000002000000 [#1] SMP
| Modules linked in:
| CPU: 0 PID: 261 Comm: kexec Not tainted 6.2.0-rc5+ #26
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : 0x48c00000
| lr : machine_kexec+0x190/0x200
| sp : ffff80000d36ba80
| x29: ffff80000d36ba80 x28: ffff000002dfc380 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: ffff80000a9f7858 x22: 000000004c460000 x21: 0000000000000010
| x20: 00000000ad821000 x19: ffff000000aa0000 x18: 0000000000000006
| x17: ffff8000758a2000 x16: ffff800008000000 x15: ffff80000d36b568
| x14: 0000000000000000 x13: ffff80000d36b707 x12: ffff80000a9bf6e0
| x11: 00000000ffffdfff x10: ffff80000aaaf8e0 x9 : ffff80000815eff8
| x8 : 000000000002ffe8 x7 : c0000000ffffdfff x6 : 00000000000affa8
| x5 : 0000000000001fff x4 : 0000000000000001 x3 : ffff80000a263008
| x2 : ffff80000a9e20f8 x1 : 0000000048c00000 x0 : ffff000000aa0000
| Call trace:
| 0x48c00000
| kernel_kexec+0x88/0x138
| __do_sys_reboot+0x108/0x288
| __arm64_sys_reboot+0x2c/0x40
| invoke_syscall+0x78/0x140
| el0_svc_common.constprop.0+0x4c/0x100
| do_el0_svc+0x34/0x80
| el0_svc+0x34/0x140
| el0t_64_sync_handler+0xf4/0x140
| el0t_64_sync+0x194/0x1c0
| Code: ???????? ???????? ???????? ???????? (00000000)
| ---[ end trace 0000000000000000 ]---
| Kernel panic - not syncing: Oops - Undefined instruction: Fatal exception
| Kernel Offset: disabled
| CPU features: 0x002000,00050108,c8004203
| Memory Limit: none
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230127121256.2141368-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-27 12:12:56 +00:00
else
p + = sprintf ( p , i = = 0 ? " (????????) " : " ???????? " ) ;
2012-03-05 11:49:27 +00:00
}
2019-06-26 20:50:13 +09:00
printk ( " %sCode: %s \n " , lvl , str ) ;
2012-03-05 11:49:27 +00:00
}
# ifdef CONFIG_PREEMPT
# define S_PREEMPT " PREEMPT"
2019-10-15 21:17:49 +02:00
# elif defined(CONFIG_PREEMPT_RT)
# define S_PREEMPT " PREEMPT_RT"
2012-03-05 11:49:27 +00:00
# else
# define S_PREEMPT ""
# endif
2019-10-15 21:17:49 +02:00
2012-03-05 11:49:27 +00:00
# define S_SMP " SMP"
2022-09-13 11:17:29 +01:00
static int __die ( const char * str , long err , struct pt_regs * regs )
2012-03-05 11:49:27 +00:00
{
static int die_counter ;
int ret ;
2022-09-13 11:17:29 +01:00
pr_emerg ( " Internal error: %s: %016lx [#%d] " S_PREEMPT S_SMP " \n " ,
2012-03-05 11:49:27 +00:00
str , err , + + die_counter ) ;
/* trap and error numbers are mostly meaningless on ARM */
ret = notify_die ( DIE_OOPS , str , regs , err , 0 , SIGSEGV ) ;
if ( ret = = NOTIFY_STOP )
return ret ;
print_modules ( ) ;
2019-04-08 17:56:34 +01:00
show_regs ( regs ) ;
2012-03-05 11:49:27 +00:00
2019-06-26 20:50:13 +09:00
dump_kernel_instr ( KERN_EMERG , regs ) ;
2012-03-05 11:49:27 +00:00
return ret ;
}
static DEFINE_RAW_SPINLOCK ( die_lock ) ;
/*
* This function is protected against re - entrancy .
*/
2022-09-13 11:17:29 +01:00
void die ( const char * str , struct pt_regs * regs , long err )
2012-03-05 11:49:27 +00:00
{
int ret ;
2017-07-07 17:29:34 +08:00
unsigned long flags ;
raw_spin_lock_irqsave ( & die_lock , flags ) ;
2012-03-05 11:49:27 +00:00
oops_enter ( ) ;
console_verbose ( ) ;
bust_spinlocks ( 1 ) ;
2016-11-03 20:23:06 +00:00
ret = __die ( str , err , regs ) ;
2012-03-05 11:49:27 +00:00
2016-11-03 20:23:06 +00:00
if ( regs & & kexec_should_crash ( current ) )
2012-03-05 11:49:27 +00:00
crash_kexec ( regs ) ;
bust_spinlocks ( 0 ) ;
2013-01-21 17:17:39 +10:30
add_taint ( TAINT_DIE , LOCKDEP_NOW_UNRELIABLE ) ;
2012-03-05 11:49:27 +00:00
oops_exit ( ) ;
if ( in_interrupt ( ) )
2020-08-04 16:53:47 +08:00
panic ( " %s: Fatal exception in interrupt " , str ) ;
2012-03-05 11:49:27 +00:00
if ( panic_on_oops )
2020-08-04 16:53:47 +08:00
panic ( " %s: Fatal exception " , str ) ;
2017-07-07 17:29:34 +08:00
raw_spin_unlock_irqrestore ( & die_lock , flags ) ;
2012-03-05 11:49:27 +00:00
if ( ret ! = NOTIFY_STOP )
2021-06-28 14:52:01 -05:00
make_task_dead ( SIGSEGV ) ;
2012-03-05 11:49:27 +00:00
}
2018-09-22 00:52:21 +02:00
static void arm64_show_signal ( int signo , const char * str )
2018-02-20 15:08:51 +00:00
{
static DEFINE_RATELIMIT_STATE ( rs , DEFAULT_RATELIMIT_INTERVAL ,
DEFAULT_RATELIMIT_BURST ) ;
2018-09-22 00:38:41 +02:00
struct task_struct * tsk = current ;
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
unsigned long esr = tsk - > thread . fault_code ;
2018-02-20 13:46:05 +00:00
struct pt_regs * regs = task_pt_regs ( tsk ) ;
2018-09-22 00:52:21 +02:00
/* Leave if the signal won't be shown */
if ( ! show_unhandled_signals | |
! unhandled_signal ( tsk , signo ) | |
! __ratelimit ( & rs ) )
return ;
2018-02-20 13:46:05 +00:00
pr_info ( " %s[%d]: unhandled exception: " , tsk - > comm , task_pid_nr ( tsk ) ) ;
if ( esr )
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
pr_cont ( " %s, ESR 0x%016lx, " , esr_get_class_string ( esr ) , esr ) ;
2018-02-20 13:46:05 +00:00
pr_cont ( " %s " , str ) ;
print_vma_addr ( KERN_CONT " in " , regs - > pc ) ;
pr_cont ( " \n " ) ;
__show_regs ( regs ) ;
2018-09-22 00:52:21 +02:00
}
2018-02-20 13:46:05 +00:00
2020-11-20 12:33:46 -08:00
void arm64_force_sig_fault ( int signo , int code , unsigned long far ,
2018-09-22 10:26:57 +02:00
const char * str )
{
arm64_show_signal ( signo , str ) ;
2019-05-23 11:11:19 -05:00
if ( signo = = SIGKILL )
2019-05-23 10:17:27 -05:00
force_sig ( SIGKILL ) ;
2019-05-23 11:11:19 -05:00
else
2020-11-20 12:33:46 -08:00
force_sig_fault ( signo , code , ( void __user * ) far ) ;
2018-09-22 10:26:57 +02:00
}
2018-02-20 13:46:05 +00:00
2020-11-20 12:33:46 -08:00
void arm64_force_sig_mceerr ( int code , unsigned long far , short lsb ,
2018-09-22 10:37:15 +02:00
const char * str )
{
arm64_show_signal ( SIGBUS , str ) ;
2020-11-20 12:33:46 -08:00
force_sig_mceerr ( code , ( void __user * ) far , lsb ) ;
2018-09-22 10:37:15 +02:00
}
2020-11-20 12:33:46 -08:00
void arm64_force_sig_ptrace_errno_trap ( int errno , unsigned long far ,
2018-09-22 10:52:41 +02:00
const char * str )
{
arm64_show_signal ( SIGTRAP , str ) ;
2020-11-20 12:33:46 -08:00
force_sig_ptrace_errno_trap ( errno , ( void __user * ) far ) ;
2018-02-20 13:46:05 +00:00
}
2012-03-05 11:49:27 +00:00
void arm64_notify_die ( const char * str , struct pt_regs * regs ,
2020-11-20 12:33:46 -08:00
int signo , int sicode , unsigned long far ,
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
unsigned long err )
2012-03-05 11:49:27 +00:00
{
2014-04-06 23:04:12 +01:00
if ( user_mode ( regs ) ) {
2018-02-20 13:46:05 +00:00
WARN_ON ( regs ! = current_pt_regs ( ) ) ;
2014-04-06 23:04:12 +01:00
current - > thread . fault_address = 0 ;
current - > thread . fault_code = err ;
2018-09-21 17:24:40 +02:00
2020-11-20 12:33:46 -08:00
arm64_force_sig_fault ( signo , sicode , far , str ) ;
2014-04-06 23:04:12 +01:00
} else {
2012-03-05 11:49:27 +00:00
die ( str , regs , err ) ;
2014-04-06 23:04:12 +01:00
}
2012-03-05 11:49:27 +00:00
}
2020-03-16 16:50:50 +00:00
# ifdef CONFIG_COMPAT
# define PSTATE_IT_1_0_SHIFT 25
# define PSTATE_IT_1_0_MASK (0x3 << PSTATE_IT_1_0_SHIFT)
# define PSTATE_IT_7_2_SHIFT 10
# define PSTATE_IT_7_2_MASK (0x3f << PSTATE_IT_7_2_SHIFT)
static u32 compat_get_it_state ( struct pt_regs * regs )
{
u32 it , pstate = regs - > pstate ;
it = ( pstate & PSTATE_IT_1_0_MASK ) > > PSTATE_IT_1_0_SHIFT ;
it | = ( ( pstate & PSTATE_IT_7_2_MASK ) > > PSTATE_IT_7_2_SHIFT ) < < 2 ;
return it ;
}
static void compat_set_it_state ( struct pt_regs * regs , u32 it )
{
u32 pstate_it ;
pstate_it = ( it < < PSTATE_IT_1_0_SHIFT ) & PSTATE_IT_1_0_MASK ;
pstate_it | = ( ( it > > 2 ) < < PSTATE_IT_7_2_SHIFT ) & PSTATE_IT_7_2_MASK ;
regs - > pstate & = ~ PSR_AA32_IT_MASK ;
regs - > pstate | = pstate_it ;
}
static void advance_itstate ( struct pt_regs * regs )
{
u32 it ;
/* ARM mode */
if ( ! ( regs - > pstate & PSR_AA32_T_BIT ) | |
! ( regs - > pstate & PSR_AA32_IT_MASK ) )
return ;
it = compat_get_it_state ( regs ) ;
/*
* If this is the last instruction of the block , wipe the IT
* state . Otherwise advance it .
*/
if ( ! ( it & 7 ) )
it = 0 ;
else
it = ( it & 0xe0 ) | ( ( it < < 1 ) & 0x1f ) ;
compat_set_it_state ( regs , it ) ;
}
# else
static void advance_itstate ( struct pt_regs * regs )
{
}
# endif
2020-03-16 16:50:49 +00:00
2017-10-25 10:04:33 +01:00
void arm64_skip_faulting_instruction ( struct pt_regs * regs , unsigned long size )
{
regs - > pc + = size ;
/*
* If we were single stepping , we want to get the step exception after
* we return from the trap .
*/
2018-04-03 11:22:51 +01:00
if ( user_mode ( regs ) )
user_fastforward_single_step ( current ) ;
2020-03-16 16:50:49 +00:00
2020-03-16 16:50:50 +00:00
if ( compat_user_mode ( regs ) )
2020-03-16 16:50:49 +00:00
advance_itstate ( regs ) ;
2020-03-16 16:50:51 +00:00
else
regs - > pstate & = ~ PSR_BTYPE_MASK ;
2017-10-25 10:04:33 +01:00
}
2022-10-19 15:41:18 +01:00
static int user_insn_read ( struct pt_regs * regs , u32 * insnp )
2014-11-18 11:41:22 +00:00
{
u32 instr ;
2021-09-17 11:28:11 +05:30
unsigned long pc = instruction_pointer ( regs ) ;
2014-11-18 11:41:22 +00:00
arm64: factor out EL1 SSBS emulation hook
Currently call_undef_hook() is used to handle UNDEFINED exceptions from
EL0 and EL1. As support for deprecated instructions may be enabled
independently, the handlers for individual instructions are organised as
a linked list of struct undef_hook which can be manipulated dynamically.
As this can be manipulated dynamically, the list is protected with a
raw_spinlock which must be acquired when handling UNDEFINED exceptions
or when manipulating the list of handlers.
This locking is unfortunate as it serialises handling of UNDEFINED
exceptions, and requires RCU to be enabled for lockdep, requiring the
use of RCU_NONIDLE() in resume path of cpu_suspend() since commit:
a2c42bbabbe260b7 ("arm64: spectre: Prevent lockdep splat on v4 mitigation enable path")
The list of UNDEFINED handlers largely consist of handlers for
exceptions taken from EL0, and the only handler for exceptions taken
from EL1 handles `MSR SSBS, #imm` on CPUs which feature PSTATE.SSBS but
lack the corresponding MSR (Immediate) instruction. Other than this we
never expect to take an UNDEFINED exception from EL1 in normal
operation.
This patch reworks do_el0_undef() to invoke the EL1 SSBS handler
directly, relegating call_undef_hook() to only handle EL0 UNDEFs. This
removes redundant work to iterate the list for EL1 UNDEFs, and removes
the need for locking, permitting EL1 UNDEFs to be handled in parallel
without contention.
The RCU_NONIDLE() call in cpu_suspend() will be removed in a subsequent
patch, as there are other potential issues with the use of
instrumentable code and RCU in the CPU suspend code.
I've tested this by forcing the detection of SSBS on a CPU that doesn't
have it, and verifying that the try_emulate_el1_ssbs() callback is
invoked.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-10-19 15:41:17 +01:00
if ( compat_thumb_mode ( regs ) ) {
2014-11-18 11:41:22 +00:00
/* 16-bit Thumb instruction */
2017-06-28 16:55:55 +02:00
__le16 instr_le ;
if ( get_user ( instr_le , ( __le16 __user * ) pc ) )
2022-10-19 15:41:18 +01:00
return - EFAULT ;
2017-06-28 16:55:55 +02:00
instr = le16_to_cpu ( instr_le ) ;
2014-11-18 11:41:22 +00:00
if ( aarch32_insn_is_wide ( instr ) ) {
u32 instr2 ;
2017-06-28 16:55:55 +02:00
if ( get_user ( instr_le , ( __le16 __user * ) ( pc + 2 ) ) )
2022-10-19 15:41:18 +01:00
return - EFAULT ;
2017-06-28 16:55:55 +02:00
instr2 = le16_to_cpu ( instr_le ) ;
2014-11-18 11:41:22 +00:00
instr = ( instr < < 16 ) | instr2 ;
}
} else {
/* 32-bit ARM instruction */
2017-06-28 16:55:55 +02:00
__le32 instr_le ;
if ( get_user ( instr_le , ( __le32 __user * ) pc ) )
2022-10-19 15:41:18 +01:00
return - EFAULT ;
2017-06-28 16:55:55 +02:00
instr = le32_to_cpu ( instr_le ) ;
2014-11-18 11:41:22 +00:00
}
2022-10-19 15:41:18 +01:00
* insnp = instr ;
return 0 ;
}
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
void force_signal_inject ( int signal , int code , unsigned long address , unsigned long err )
2012-03-05 11:49:27 +00:00
{
2016-06-28 18:07:31 +01:00
const char * desc ;
2018-02-20 14:16:29 +00:00
struct pt_regs * regs = current_pt_regs ( ) ;
2018-08-14 16:24:54 +01:00
if ( WARN_ON ( ! user_mode ( regs ) ) )
return ;
2016-06-28 18:07:31 +01:00
switch ( signal ) {
case SIGILL :
desc = " undefined instruction " ;
break ;
case SIGSEGV :
desc = " illegal memory access " ;
break ;
default :
arm64/sve: Core task context handling
This patch adds the core support for switching and managing the SVE
architectural state of user tasks.
Calls to the existing FPSIMD low-level save/restore functions are
factored out as new functions task_fpsimd_{save,load}(), since SVE
now dynamically may or may not need to be handled at these points
depending on the kernel configuration, hardware features discovered
at boot, and the runtime state of the task. To make these
decisions as fast as possible, const cpucaps are used where
feasible, via the system_supports_sve() helper.
The SVE registers are only tracked for threads that have explicitly
used SVE, indicated by the new thread flag TIF_SVE. Otherwise, the
FPSIMD view of the architectural state is stored in
thread.fpsimd_state as usual.
When in use, the SVE registers are not stored directly in
thread_struct due to their potentially large and variable size.
Because the task_struct slab allocator must be configured very
early during kernel boot, it is also tricky to configure it
correctly to match the maximum vector length provided by the
hardware, since this depends on examining secondary CPUs as well as
the primary. Instead, a pointer sve_state in thread_struct points
to a dynamically allocated buffer containing the SVE register data,
and code is added to allocate and free this buffer at appropriate
times.
TIF_SVE is set when taking an SVE access trap from userspace, if
suitable hardware support has been detected. This enables SVE for
the thread: a subsequent return to userspace will disable the trap
accordingly. If such a trap is taken without sufficient system-
wide hardware support, SIGILL is sent to the thread instead as if
an undefined instruction had been executed: this may happen if
userspace tries to use SVE in a system where not all CPUs support
it for example.
The kernel will clear TIF_SVE and disable SVE for the thread
whenever an explicit syscall is made by userspace. For backwards
compatibility reasons and conformance with the spirit of the base
AArch64 procedure call standard, the subset of the SVE register
state that aliases the FPSIMD registers is still preserved across a
syscall even if this happens. The remainder of the SVE register
state logically becomes zero at syscall entry, though the actual
zeroing work is currently deferred until the thread next tries to
use SVE, causing another trap to the kernel. This implementation
is suboptimal: in the future, the fastpath case may be optimised
to zero the registers in-place and leave SVE enabled for the task,
where beneficial.
TIF_SVE is also cleared in the following slowpath cases, which are
taken as reasonable hints that the task may no longer use SVE:
* exec
* fork and clone
Code is added to sync data between thread.fpsimd_state and
thread.sve_state whenever enabling/disabling SVE, in a manner
consistent with the SVE architectural programmer's model.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Alex Bennée <alex.bennee@linaro.org>
[will: added #include to fix allnoconfig build]
[will: use enable_daif in do_sve_acc]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-31 15:51:05 +00:00
desc = " unknown or unrecoverable error " ;
2016-06-28 18:07:31 +01:00
break ;
}
2018-02-20 18:08:40 +00:00
/* Force signals we don't understand to SIGKILL */
2018-04-16 16:45:01 +01:00
if ( WARN_ON ( signal ! = SIGKILL & &
2018-02-20 18:08:40 +00:00
siginfo_layout ( signal , code ) ! = SIL_FAULT ) ) {
signal = SIGKILL ;
}
2020-11-20 12:33:46 -08:00
arm64_notify_die ( desc , regs , signal , code , address , err ) ;
2016-06-28 18:07:31 +01:00
}
/*
* Set up process info to signal segmentation fault - called on access error .
*/
2018-02-20 14:16:29 +00:00
void arm64_notify_segfault ( unsigned long addr )
2016-06-28 18:07:31 +01:00
{
int code ;
2020-06-08 21:33:25 -07:00
mmap_read_lock ( current - > mm ) ;
2020-11-20 12:33:46 -08:00
if ( find_vma ( current - > mm , untagged_addr ( addr ) ) = = NULL )
2016-06-28 18:07:31 +01:00
code = SEGV_MAPERR ;
else
code = SEGV_ACCERR ;
2020-06-08 21:33:25 -07:00
mmap_read_unlock ( current - > mm ) ;
2012-03-05 11:49:27 +00:00
2020-09-14 14:06:52 +05:30
force_signal_inject ( SIGSEGV , code , addr , 0 ) ;
2016-06-28 18:07:31 +01:00
}
arm64: split EL0/EL1 UNDEF handlers
In general, exceptions taken from EL1 need to be handled separately from
exceptions taken from EL0, as the logic to handle the two cases can be
significantly divergent, and exceptions taken from EL1 typically have
more stringent requirements on locking and instrumentation.
Subsequent patches will rework the way EL1 UNDEFs are handled in order
to address longstanding soundness issues with instrumentation and RCU.
In preparation for that rework, this patch splits the existing
do_undefinstr() handler into separate do_el0_undef() and do_el1_undef()
handlers.
Prior to this patch, do_undefinstr() was marked with NOKPROBE_SYMBOL(),
preventing instrumentation via kprobes. However, do_undefinstr() invokes
other code which can be instrumented, and:
* For UNDEFINED exceptions taken from EL0, there is no risk of recursion
within kprobes. Therefore it is safe for do_el0_undef to be
instrumented with kprobes, and it does not need to be marked with
NOKPROBE_SYMBOL().
* For UNDEFINED exceptions taken from EL1, either:
(a) The exception is has been taken when manipulating SSBS; these cases
are limited and do not occur within code that can be invoked
recursively via kprobes. Hence, in these cases instrumentation
with kprobes is benign.
(b) The exception has been taken for an unknown reason, as other than
manipulating SSBS we do not expect to take UNDEFINED exceptions
from EL1. Any handling of these exception is best-effort.
... and in either case, marking do_el1_undef() with NOKPROBE_SYMBOL()
isn't sufficient to prevent recursion via kprobes as functions it
calls (including die()) are instrumentable via kprobes.
Hence, it's not worthwhile to mark do_el1_undef() with
NOKPROBE_SYMBOL(). The same applies to do_el1_bti() and do_el1_fpac(),
so their NOKPROBE_SYMBOL() annotations are also removed.
Aside from the new instrumentability, there should be no functional
change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-10-19 15:41:16 +01:00
void do_el0_undef ( struct pt_regs * regs , unsigned long esr )
2016-06-28 18:07:31 +01:00
{
2022-10-19 15:41:18 +01:00
u32 insn ;
2012-03-05 11:49:27 +00:00
/* check for AArch32 breakpoint instructions */
2013-03-16 08:48:13 +00:00
if ( ! aarch32_break_handler ( regs ) )
2012-03-05 11:49:27 +00:00
return ;
2022-10-19 15:41:18 +01:00
if ( user_insn_read ( regs , & insn ) )
goto out_err ;
arm64: rework EL0 MRS emulation
On CPUs without FEAT_IDST, ID register emulation is slower than it needs
to be, as all threads contend for the same lock to perform the
emulation. This patch reworks the emulation to avoid this unnecessary
contention.
On CPUs with FEAT_IDST (which is mandatory from ARMv8.4 onwards), EL0
accesses to ID registers result in a SYS trap, and emulation of these is
handled with a sys64_hook. These hooks are statically allocated, and no
locking is required to iterate through the hooks and perform the
emulation, allowing emulation to occur in parallel with no contention.
On CPUs without FEAT_IDST, EL0 accesses to ID registers result in an
UNDEFINED exception, and emulation of these accesses is handled with an
undef_hook. When an EL0 MRS instruction is trapped to EL1, the kernel
finds the relevant handler by iterating through all of the undef_hooks,
requiring undef_lock to be held during this lookup.
This locking is only required to safely traverse the list of undef_hooks
(as it can be concurrently modified), and the actual emulation of the
MRS does not require any mutual exclusion. This locking is an
unfortunate bottleneck, especially given that MRS emulation is enabled
unconditionally and is never disabled.
This patch reworks the non-FEAT_IDST MRS emulation logic so that it can
be invoked directly from do_el0_undef(). This removes the bottleneck,
allowing MRS traps to be handled entirely in parallel, and is a stepping
stone to making all of the undef_hooks lock-free.
I've tested this in a 64-vCPU VM on a 64-CPU ThunderX2 host, with a
benchmark which spawns a number of threads which each try to read
ID_AA64ISAR0_EL1 1000000 times. This is vastly more contention than will
ever be seen in realistic usage, but clearly demonstrates the removal of
the bottleneck:
| Threads || Time (seconds) |
| || Before || After |
| || Real | System || Real | System |
|---------++--------+---------++--------+---------|
| 1 || 0.29 | 0.20 || 0.24 | 0.12 |
| 2 || 0.35 | 0.51 || 0.23 | 0.27 |
| 4 || 1.08 | 3.87 || 0.24 | 0.56 |
| 8 || 4.31 | 33.60 || 0.24 | 1.11 |
| 16 || 9.47 | 149.39 || 0.23 | 2.15 |
| 32 || 19.07 | 605.27 || 0.24 | 4.38 |
| 64 || 65.40 | 3609.09 || 0.33 | 11.27 |
Aside from the speedup, there should be no functional change as a result
of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-6-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-10-19 15:41:19 +01:00
if ( try_emulate_mrs ( regs , insn ) )
return ;
arm64: armv8_deprecated: rework deprected instruction handling
Support for deprecated instructions can be enabled or disabled at
runtime. To handle this, the code in armv8_deprecated.c registers and
unregisters undef_hooks, and makes cross CPU calls to configure HW
support. This is rather complicated, and the synchronization required to
make this safe ends up serializing the handling of instructions which
have been trapped.
This patch simplifies the deprecated instruction handling by removing
the dynamic registration and unregistration, and changing the trap
handling code to determine whether a handler should be invoked. This
removes the need for dynamic list management, and simplifies the locking
requirements, making it possible to handle trapped instructions entirely
in parallel.
Where changing the emulation state requires a cross-call, this is
serialized by locally disabling interrupts, ensuring that the CPU is not
left in an inconsistent state.
To simplify sysctl management, each insn_emulation is given a separate
sysctl table, permitting these to be registered separately. The core
sysctl code will iterate over all of these when walking sysfs.
I've tested this with userspace programs which use each of the
deprecated instructions, and I've concurrently modified the support
level for each of the features back-and-forth between HW and emulated to
check that there are no spurious SIGILLs sent to userspace when the
support level is changed.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-10-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-10-19 15:41:23 +01:00
if ( try_emulate_armv8_deprecated ( regs , insn ) )
2014-11-18 11:41:22 +00:00
return ;
2022-10-19 15:41:18 +01:00
out_err :
2020-09-14 14:06:52 +05:30
force_signal_inject ( SIGILL , ILL_ILLOPC , regs - > pc , 0 ) ;
2012-03-05 11:49:27 +00:00
}
arm64: split EL0/EL1 UNDEF handlers
In general, exceptions taken from EL1 need to be handled separately from
exceptions taken from EL0, as the logic to handle the two cases can be
significantly divergent, and exceptions taken from EL1 typically have
more stringent requirements on locking and instrumentation.
Subsequent patches will rework the way EL1 UNDEFs are handled in order
to address longstanding soundness issues with instrumentation and RCU.
In preparation for that rework, this patch splits the existing
do_undefinstr() handler into separate do_el0_undef() and do_el1_undef()
handlers.
Prior to this patch, do_undefinstr() was marked with NOKPROBE_SYMBOL(),
preventing instrumentation via kprobes. However, do_undefinstr() invokes
other code which can be instrumented, and:
* For UNDEFINED exceptions taken from EL0, there is no risk of recursion
within kprobes. Therefore it is safe for do_el0_undef to be
instrumented with kprobes, and it does not need to be marked with
NOKPROBE_SYMBOL().
* For UNDEFINED exceptions taken from EL1, either:
(a) The exception is has been taken when manipulating SSBS; these cases
are limited and do not occur within code that can be invoked
recursively via kprobes. Hence, in these cases instrumentation
with kprobes is benign.
(b) The exception has been taken for an unknown reason, as other than
manipulating SSBS we do not expect to take UNDEFINED exceptions
from EL1. Any handling of these exception is best-effort.
... and in either case, marking do_el1_undef() with NOKPROBE_SYMBOL()
isn't sufficient to prevent recursion via kprobes as functions it
calls (including die()) are instrumentable via kprobes.
Hence, it's not worthwhile to mark do_el1_undef() with
NOKPROBE_SYMBOL(). The same applies to do_el1_bti() and do_el1_fpac(),
so their NOKPROBE_SYMBOL() annotations are also removed.
Aside from the new instrumentability, there should be no functional
change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-10-19 15:41:16 +01:00
void do_el1_undef ( struct pt_regs * regs , unsigned long esr )
{
arm64: factor out EL1 SSBS emulation hook
Currently call_undef_hook() is used to handle UNDEFINED exceptions from
EL0 and EL1. As support for deprecated instructions may be enabled
independently, the handlers for individual instructions are organised as
a linked list of struct undef_hook which can be manipulated dynamically.
As this can be manipulated dynamically, the list is protected with a
raw_spinlock which must be acquired when handling UNDEFINED exceptions
or when manipulating the list of handlers.
This locking is unfortunate as it serialises handling of UNDEFINED
exceptions, and requires RCU to be enabled for lockdep, requiring the
use of RCU_NONIDLE() in resume path of cpu_suspend() since commit:
a2c42bbabbe260b7 ("arm64: spectre: Prevent lockdep splat on v4 mitigation enable path")
The list of UNDEFINED handlers largely consist of handlers for
exceptions taken from EL0, and the only handler for exceptions taken
from EL1 handles `MSR SSBS, #imm` on CPUs which feature PSTATE.SSBS but
lack the corresponding MSR (Immediate) instruction. Other than this we
never expect to take an UNDEFINED exception from EL1 in normal
operation.
This patch reworks do_el0_undef() to invoke the EL1 SSBS handler
directly, relegating call_undef_hook() to only handle EL0 UNDEFs. This
removes redundant work to iterate the list for EL1 UNDEFs, and removes
the need for locking, permitting EL1 UNDEFs to be handled in parallel
without contention.
The RCU_NONIDLE() call in cpu_suspend() will be removed in a subsequent
patch, as there are other potential issues with the use of
instrumentable code and RCU in the CPU suspend code.
I've tested this by forcing the detection of SSBS on a CPU that doesn't
have it, and verifying that the try_emulate_el1_ssbs() callback is
invoked.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-10-19 15:41:17 +01:00
u32 insn ;
if ( aarch64_insn_read ( ( void * ) regs - > pc , & insn ) )
goto out_err ;
if ( try_emulate_el1_ssbs ( regs , insn ) )
arm64: split EL0/EL1 UNDEF handlers
In general, exceptions taken from EL1 need to be handled separately from
exceptions taken from EL0, as the logic to handle the two cases can be
significantly divergent, and exceptions taken from EL1 typically have
more stringent requirements on locking and instrumentation.
Subsequent patches will rework the way EL1 UNDEFs are handled in order
to address longstanding soundness issues with instrumentation and RCU.
In preparation for that rework, this patch splits the existing
do_undefinstr() handler into separate do_el0_undef() and do_el1_undef()
handlers.
Prior to this patch, do_undefinstr() was marked with NOKPROBE_SYMBOL(),
preventing instrumentation via kprobes. However, do_undefinstr() invokes
other code which can be instrumented, and:
* For UNDEFINED exceptions taken from EL0, there is no risk of recursion
within kprobes. Therefore it is safe for do_el0_undef to be
instrumented with kprobes, and it does not need to be marked with
NOKPROBE_SYMBOL().
* For UNDEFINED exceptions taken from EL1, either:
(a) The exception is has been taken when manipulating SSBS; these cases
are limited and do not occur within code that can be invoked
recursively via kprobes. Hence, in these cases instrumentation
with kprobes is benign.
(b) The exception has been taken for an unknown reason, as other than
manipulating SSBS we do not expect to take UNDEFINED exceptions
from EL1. Any handling of these exception is best-effort.
... and in either case, marking do_el1_undef() with NOKPROBE_SYMBOL()
isn't sufficient to prevent recursion via kprobes as functions it
calls (including die()) are instrumentable via kprobes.
Hence, it's not worthwhile to mark do_el1_undef() with
NOKPROBE_SYMBOL(). The same applies to do_el1_bti() and do_el1_fpac(),
so their NOKPROBE_SYMBOL() annotations are also removed.
Aside from the new instrumentability, there should be no functional
change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-10-19 15:41:16 +01:00
return ;
arm64: factor out EL1 SSBS emulation hook
Currently call_undef_hook() is used to handle UNDEFINED exceptions from
EL0 and EL1. As support for deprecated instructions may be enabled
independently, the handlers for individual instructions are organised as
a linked list of struct undef_hook which can be manipulated dynamically.
As this can be manipulated dynamically, the list is protected with a
raw_spinlock which must be acquired when handling UNDEFINED exceptions
or when manipulating the list of handlers.
This locking is unfortunate as it serialises handling of UNDEFINED
exceptions, and requires RCU to be enabled for lockdep, requiring the
use of RCU_NONIDLE() in resume path of cpu_suspend() since commit:
a2c42bbabbe260b7 ("arm64: spectre: Prevent lockdep splat on v4 mitigation enable path")
The list of UNDEFINED handlers largely consist of handlers for
exceptions taken from EL0, and the only handler for exceptions taken
from EL1 handles `MSR SSBS, #imm` on CPUs which feature PSTATE.SSBS but
lack the corresponding MSR (Immediate) instruction. Other than this we
never expect to take an UNDEFINED exception from EL1 in normal
operation.
This patch reworks do_el0_undef() to invoke the EL1 SSBS handler
directly, relegating call_undef_hook() to only handle EL0 UNDEFs. This
removes redundant work to iterate the list for EL1 UNDEFs, and removes
the need for locking, permitting EL1 UNDEFs to be handled in parallel
without contention.
The RCU_NONIDLE() call in cpu_suspend() will be removed in a subsequent
patch, as there are other potential issues with the use of
instrumentable code and RCU in the CPU suspend code.
I've tested this by forcing the detection of SSBS on a CPU that doesn't
have it, and verifying that the try_emulate_el1_ssbs() callback is
invoked.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-10-19 15:41:17 +01:00
out_err :
arm64: split EL0/EL1 UNDEF handlers
In general, exceptions taken from EL1 need to be handled separately from
exceptions taken from EL0, as the logic to handle the two cases can be
significantly divergent, and exceptions taken from EL1 typically have
more stringent requirements on locking and instrumentation.
Subsequent patches will rework the way EL1 UNDEFs are handled in order
to address longstanding soundness issues with instrumentation and RCU.
In preparation for that rework, this patch splits the existing
do_undefinstr() handler into separate do_el0_undef() and do_el1_undef()
handlers.
Prior to this patch, do_undefinstr() was marked with NOKPROBE_SYMBOL(),
preventing instrumentation via kprobes. However, do_undefinstr() invokes
other code which can be instrumented, and:
* For UNDEFINED exceptions taken from EL0, there is no risk of recursion
within kprobes. Therefore it is safe for do_el0_undef to be
instrumented with kprobes, and it does not need to be marked with
NOKPROBE_SYMBOL().
* For UNDEFINED exceptions taken from EL1, either:
(a) The exception is has been taken when manipulating SSBS; these cases
are limited and do not occur within code that can be invoked
recursively via kprobes. Hence, in these cases instrumentation
with kprobes is benign.
(b) The exception has been taken for an unknown reason, as other than
manipulating SSBS we do not expect to take UNDEFINED exceptions
from EL1. Any handling of these exception is best-effort.
... and in either case, marking do_el1_undef() with NOKPROBE_SYMBOL()
isn't sufficient to prevent recursion via kprobes as functions it
calls (including die()) are instrumentable via kprobes.
Hence, it's not worthwhile to mark do_el1_undef() with
NOKPROBE_SYMBOL(). The same applies to do_el1_bti() and do_el1_fpac(),
so their NOKPROBE_SYMBOL() annotations are also removed.
Aside from the new instrumentability, there should be no functional
change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-10-19 15:41:16 +01:00
die ( " Oops - Undefined instruction " , regs , esr ) ;
}
2012-03-05 11:49:27 +00:00
arm64: rework BTI exception handling
If a BTI exception is taken from EL1, the entry code will treat this as
an unhandled exception and will panic() the kernel. This is inconsistent
with the way we handle FPAC exceptions, which have a dedicated handler
and only necessarily kill the thread from which the exception was taken
from, and we don't log all the information that could be relevant to
debug the issue.
The code in do_bti() has:
BUG_ON(!user_mode(regs));
... and it seems like the intent was to call this for EL1 BTI
exceptions, as with FPAC, but this was omitted due to an oversight.
This patch adds separate EL0 and EL1 BTI exception handlers, with the
latter calling die() directly to report the original context the BTI
exception was taken from. This matches our handling of FPAC exceptions.
Prior to this patch, a BTI failure is reported as:
| Unhandled 64-bit el1h sync exception on CPU0, ESR 0x0000000034000002 -- BTI
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00131-g7d937ff0221d-dirty #9
| Hardware name: linux,dummy-virt (DT)
| pstate: 20400809 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=-c)
| pc : test_bti_callee+0x4/0x10
| lr : test_bti_caller+0x1c/0x28
| sp : ffff80000800bdf0
| x29: ffff80000800bdf0 x28: 0000000000000000 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: ffff80000a2b8000 x22: 0000000000000000 x21: 0000000000000000
| x20: ffff8000099fa5b0 x19: ffff800009ff7000 x18: fffffbfffda37000
| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000041a90000
| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000
| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000040000000
| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000f83
| x5 : ffff80000a2b6000 x4 : ffff0000028d0000 x3 : ffff800009f78378
| x2 : 0000000000000000 x1 : 0000000040210000 x0 : ffff8000080257e4
| Kernel panic - not syncing: Unhandled exception
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00131-g7d937ff0221d-dirty #9
| Hardware name: linux,dummy-virt (DT)
| Call trace:
| dump_backtrace.part.0+0xcc/0xe0
| show_stack+0x18/0x5c
| dump_stack_lvl+0x64/0x80
| dump_stack+0x18/0x34
| panic+0x170/0x360
| arm64_exit_nmi.isra.0+0x0/0x80
| el1h_64_sync_handler+0x64/0xd0
| el1h_64_sync+0x64/0x68
| test_bti_callee+0x4/0x10
| smp_cpus_done+0xb0/0xbc
| smp_init+0x7c/0x8c
| kernel_init_freeable+0x128/0x28c
| kernel_init+0x28/0x13c
| ret_from_fork+0x10/0x20
With this patch applied, a BTI failure is reported as:
| Internal error: Oops - BTI: 0000000034000002 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00132-g0ad98265d582-dirty #8
| Hardware name: linux,dummy-virt (DT)
| pstate: 20400809 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=-c)
| pc : test_bti_callee+0x4/0x10
| lr : test_bti_caller+0x1c/0x28
| sp : ffff80000800bdf0
| x29: ffff80000800bdf0 x28: 0000000000000000 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: ffff80000a2b8000 x22: 0000000000000000 x21: 0000000000000000
| x20: ffff8000099fa5b0 x19: ffff800009ff7000 x18: fffffbfffda37000
| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000041a90000
| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000
| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000040000000
| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000f83
| x5 : ffff80000a2b6000 x4 : ffff0000028d0000 x3 : ffff800009f78378
| x2 : 0000000000000000 x1 : 0000000040210000 x0 : ffff800008025804
| Call trace:
| test_bti_callee+0x4/0x10
| smp_cpus_done+0xb0/0xbc
| smp_init+0x7c/0x8c
| kernel_init_freeable+0x128/0x28c
| kernel_init+0x28/0x13c
| ret_from_fork+0x10/0x20
| Code: d50323bf d53cd040 d65f03c0 d503233f (d50323bf)
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220913101732.3925290-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-13 11:17:32 +01:00
void do_el0_bti ( struct pt_regs * regs )
2020-03-16 16:50:45 +00:00
{
2020-09-14 14:06:52 +05:30
force_signal_inject ( SIGILL , ILL_ILLOPC , regs - > pc , 0 ) ;
2020-03-16 16:50:45 +00:00
}
arm64: rework BTI exception handling
If a BTI exception is taken from EL1, the entry code will treat this as
an unhandled exception and will panic() the kernel. This is inconsistent
with the way we handle FPAC exceptions, which have a dedicated handler
and only necessarily kill the thread from which the exception was taken
from, and we don't log all the information that could be relevant to
debug the issue.
The code in do_bti() has:
BUG_ON(!user_mode(regs));
... and it seems like the intent was to call this for EL1 BTI
exceptions, as with FPAC, but this was omitted due to an oversight.
This patch adds separate EL0 and EL1 BTI exception handlers, with the
latter calling die() directly to report the original context the BTI
exception was taken from. This matches our handling of FPAC exceptions.
Prior to this patch, a BTI failure is reported as:
| Unhandled 64-bit el1h sync exception on CPU0, ESR 0x0000000034000002 -- BTI
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00131-g7d937ff0221d-dirty #9
| Hardware name: linux,dummy-virt (DT)
| pstate: 20400809 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=-c)
| pc : test_bti_callee+0x4/0x10
| lr : test_bti_caller+0x1c/0x28
| sp : ffff80000800bdf0
| x29: ffff80000800bdf0 x28: 0000000000000000 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: ffff80000a2b8000 x22: 0000000000000000 x21: 0000000000000000
| x20: ffff8000099fa5b0 x19: ffff800009ff7000 x18: fffffbfffda37000
| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000041a90000
| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000
| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000040000000
| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000f83
| x5 : ffff80000a2b6000 x4 : ffff0000028d0000 x3 : ffff800009f78378
| x2 : 0000000000000000 x1 : 0000000040210000 x0 : ffff8000080257e4
| Kernel panic - not syncing: Unhandled exception
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00131-g7d937ff0221d-dirty #9
| Hardware name: linux,dummy-virt (DT)
| Call trace:
| dump_backtrace.part.0+0xcc/0xe0
| show_stack+0x18/0x5c
| dump_stack_lvl+0x64/0x80
| dump_stack+0x18/0x34
| panic+0x170/0x360
| arm64_exit_nmi.isra.0+0x0/0x80
| el1h_64_sync_handler+0x64/0xd0
| el1h_64_sync+0x64/0x68
| test_bti_callee+0x4/0x10
| smp_cpus_done+0xb0/0xbc
| smp_init+0x7c/0x8c
| kernel_init_freeable+0x128/0x28c
| kernel_init+0x28/0x13c
| ret_from_fork+0x10/0x20
With this patch applied, a BTI failure is reported as:
| Internal error: Oops - BTI: 0000000034000002 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00132-g0ad98265d582-dirty #8
| Hardware name: linux,dummy-virt (DT)
| pstate: 20400809 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=-c)
| pc : test_bti_callee+0x4/0x10
| lr : test_bti_caller+0x1c/0x28
| sp : ffff80000800bdf0
| x29: ffff80000800bdf0 x28: 0000000000000000 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: ffff80000a2b8000 x22: 0000000000000000 x21: 0000000000000000
| x20: ffff8000099fa5b0 x19: ffff800009ff7000 x18: fffffbfffda37000
| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000041a90000
| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000
| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000040000000
| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000f83
| x5 : ffff80000a2b6000 x4 : ffff0000028d0000 x3 : ffff800009f78378
| x2 : 0000000000000000 x1 : 0000000040210000 x0 : ffff800008025804
| Call trace:
| test_bti_callee+0x4/0x10
| smp_cpus_done+0xb0/0xbc
| smp_init+0x7c/0x8c
| kernel_init_freeable+0x128/0x28c
| kernel_init+0x28/0x13c
| ret_from_fork+0x10/0x20
| Code: d50323bf d53cd040 d65f03c0 d503233f (d50323bf)
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220913101732.3925290-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-13 11:17:32 +01:00
void do_el1_bti ( struct pt_regs * regs , unsigned long esr )
{
2023-02-01 09:50:07 +01:00
if ( efi_runtime_fixup_exception ( regs , " BTI violation " ) ) {
regs - > pstate & = ~ PSR_BTYPE_MASK ;
return ;
}
arm64: rework BTI exception handling
If a BTI exception is taken from EL1, the entry code will treat this as
an unhandled exception and will panic() the kernel. This is inconsistent
with the way we handle FPAC exceptions, which have a dedicated handler
and only necessarily kill the thread from which the exception was taken
from, and we don't log all the information that could be relevant to
debug the issue.
The code in do_bti() has:
BUG_ON(!user_mode(regs));
... and it seems like the intent was to call this for EL1 BTI
exceptions, as with FPAC, but this was omitted due to an oversight.
This patch adds separate EL0 and EL1 BTI exception handlers, with the
latter calling die() directly to report the original context the BTI
exception was taken from. This matches our handling of FPAC exceptions.
Prior to this patch, a BTI failure is reported as:
| Unhandled 64-bit el1h sync exception on CPU0, ESR 0x0000000034000002 -- BTI
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00131-g7d937ff0221d-dirty #9
| Hardware name: linux,dummy-virt (DT)
| pstate: 20400809 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=-c)
| pc : test_bti_callee+0x4/0x10
| lr : test_bti_caller+0x1c/0x28
| sp : ffff80000800bdf0
| x29: ffff80000800bdf0 x28: 0000000000000000 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: ffff80000a2b8000 x22: 0000000000000000 x21: 0000000000000000
| x20: ffff8000099fa5b0 x19: ffff800009ff7000 x18: fffffbfffda37000
| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000041a90000
| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000
| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000040000000
| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000f83
| x5 : ffff80000a2b6000 x4 : ffff0000028d0000 x3 : ffff800009f78378
| x2 : 0000000000000000 x1 : 0000000040210000 x0 : ffff8000080257e4
| Kernel panic - not syncing: Unhandled exception
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00131-g7d937ff0221d-dirty #9
| Hardware name: linux,dummy-virt (DT)
| Call trace:
| dump_backtrace.part.0+0xcc/0xe0
| show_stack+0x18/0x5c
| dump_stack_lvl+0x64/0x80
| dump_stack+0x18/0x34
| panic+0x170/0x360
| arm64_exit_nmi.isra.0+0x0/0x80
| el1h_64_sync_handler+0x64/0xd0
| el1h_64_sync+0x64/0x68
| test_bti_callee+0x4/0x10
| smp_cpus_done+0xb0/0xbc
| smp_init+0x7c/0x8c
| kernel_init_freeable+0x128/0x28c
| kernel_init+0x28/0x13c
| ret_from_fork+0x10/0x20
With this patch applied, a BTI failure is reported as:
| Internal error: Oops - BTI: 0000000034000002 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00132-g0ad98265d582-dirty #8
| Hardware name: linux,dummy-virt (DT)
| pstate: 20400809 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=-c)
| pc : test_bti_callee+0x4/0x10
| lr : test_bti_caller+0x1c/0x28
| sp : ffff80000800bdf0
| x29: ffff80000800bdf0 x28: 0000000000000000 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: ffff80000a2b8000 x22: 0000000000000000 x21: 0000000000000000
| x20: ffff8000099fa5b0 x19: ffff800009ff7000 x18: fffffbfffda37000
| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000041a90000
| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000
| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000040000000
| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000f83
| x5 : ffff80000a2b6000 x4 : ffff0000028d0000 x3 : ffff800009f78378
| x2 : 0000000000000000 x1 : 0000000040210000 x0 : ffff800008025804
| Call trace:
| test_bti_callee+0x4/0x10
| smp_cpus_done+0xb0/0xbc
| smp_init+0x7c/0x8c
| kernel_init_freeable+0x128/0x28c
| kernel_init+0x28/0x13c
| ret_from_fork+0x10/0x20
| Code: d50323bf d53cd040 d65f03c0 d503233f (d50323bf)
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220913101732.3925290-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-13 11:17:32 +01:00
die ( " Oops - BTI " , regs , esr ) ;
}
2020-03-16 16:50:45 +00:00
2022-09-13 11:17:31 +01:00
void do_el0_fpac ( struct pt_regs * regs , unsigned long esr )
{
force_signal_inject ( SIGILL , ILL_ILLOPN , regs - > pc , esr ) ;
}
void do_el1_fpac ( struct pt_regs * regs , unsigned long esr )
2020-09-14 14:06:53 +05:30
{
/*
2022-09-13 11:17:31 +01:00
* Unexpected FPAC exception in the kernel : kill the task before it
* does any more harm .
2020-09-14 14:06:53 +05:30
*/
2022-09-13 11:17:31 +01:00
die ( " Oops - FPAC " , regs , esr ) ;
2020-09-14 14:06:53 +05:30
}
2023-05-09 15:22:31 +01:00
void do_el0_mops ( struct pt_regs * regs , unsigned long esr )
{
bool wrong_option = esr & ESR_ELx_MOPS_ISS_WRONG_OPTION ;
bool option_a = esr & ESR_ELx_MOPS_ISS_OPTION_A ;
int dstreg = ESR_ELx_MOPS_ISS_DESTREG ( esr ) ;
int srcreg = ESR_ELx_MOPS_ISS_SRCREG ( esr ) ;
int sizereg = ESR_ELx_MOPS_ISS_SIZEREG ( esr ) ;
unsigned long dst , src , size ;
dst = pt_regs_read_reg ( regs , dstreg ) ;
src = pt_regs_read_reg ( regs , srcreg ) ;
size = pt_regs_read_reg ( regs , sizereg ) ;
/*
* Put the registers back in the original format suitable for a
* prologue instruction , using the generic return routine from the
* Arm ARM ( DDI 04 87 I . a ) rules CNTMJ and MWFQH .
*/
if ( esr & ESR_ELx_MOPS_ISS_MEM_INST ) {
/* SET* instruction */
if ( option_a ^ wrong_option ) {
/* Format is from Option A; forward set */
pt_regs_write_reg ( regs , dstreg , dst + size ) ;
pt_regs_write_reg ( regs , sizereg , - size ) ;
}
} else {
/* CPY* instruction */
if ( ! ( option_a ^ wrong_option ) ) {
/* Format is from Option B */
if ( regs - > pstate & PSR_N_BIT ) {
/* Backward copy */
pt_regs_write_reg ( regs , dstreg , dst - size ) ;
pt_regs_write_reg ( regs , srcreg , src - size ) ;
}
} else {
/* Format is from Option A */
if ( size & BIT ( 63 ) ) {
/* Forward copy */
pt_regs_write_reg ( regs , dstreg , dst + size ) ;
pt_regs_write_reg ( regs , srcreg , src + size ) ;
pt_regs_write_reg ( regs , sizereg , - size ) ;
}
}
}
if ( esr & ESR_ELx_MOPS_ISS_FROM_EPILOGUE )
regs - > pc - = 8 ;
else
regs - > pc - = 4 ;
2023-05-09 15:22:32 +01:00
/*
* If single stepping then finish the step before executing the
* prologue instruction .
*/
user_fastforward_single_step ( current ) ;
2023-05-09 15:22:31 +01:00
}
2016-06-28 18:07:32 +01:00
# define __user_cache_maint(insn, address, res) \
2022-02-11 21:42:45 +01:00
if ( address > = TASK_SIZE_MAX ) { \
2016-10-19 14:40:54 +01:00
res = - EFAULT ; \
2016-09-02 14:54:03 +01:00
} else { \
uaccess_ttbr0_enable ( ) ; \
2016-10-19 14:40:54 +01:00
asm volatile ( \
" 1: " insn " , %1 \n " \
" mov %w0, #0 \n " \
" 2: \n " \
arm64: extable: add a dedicated uaccess handler
For inline assembly, we place exception fixups out-of-line in the
`.fixup` section such that these are out of the way of the fast path.
This has a few drawbacks:
* Since the fixup code is anonymous, backtraces will symbolize fixups as
offsets from the nearest prior symbol, currently
`__entry_tramp_text_end`. This is confusing, and painful to debug
without access to the relevant vmlinux.
* Since the exception handler adjusts the PC to execute the fixup, and
the fixup uses a direct branch back into the function it fixes,
backtraces of fixups miss the original function. This is confusing,
and violates requirements for RELIABLE_STACKTRACE (and therefore
LIVEPATCH).
* Inline assembly and associated fixups are generated from templates,
and we have many copies of logically identical fixups which only
differ in which specific registers are written to and which address is
branched to at the end of the fixup. This is potentially wasteful of
I-cache resources, and makes it hard to add additional logic to fixups
without significant bloat.
This patch address all three concerns for inline uaccess fixups by
adding a dedicated exception handler which updates registers in
exception context and subsequent returns back into the function which
faulted, removing the need for fixups specialized to each faulting
instruction.
Other than backtracing, there should be no functional change as a result
of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-12-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-19 17:02:17 +01:00
_ASM_EXTABLE_UACCESS_ERR ( 1 b , 2 b , % w0 ) \
2016-10-19 14:40:54 +01:00
: " =r " ( res ) \
arm64: extable: add a dedicated uaccess handler
For inline assembly, we place exception fixups out-of-line in the
`.fixup` section such that these are out of the way of the fast path.
This has a few drawbacks:
* Since the fixup code is anonymous, backtraces will symbolize fixups as
offsets from the nearest prior symbol, currently
`__entry_tramp_text_end`. This is confusing, and painful to debug
without access to the relevant vmlinux.
* Since the exception handler adjusts the PC to execute the fixup, and
the fixup uses a direct branch back into the function it fixes,
backtraces of fixups miss the original function. This is confusing,
and violates requirements for RELIABLE_STACKTRACE (and therefore
LIVEPATCH).
* Inline assembly and associated fixups are generated from templates,
and we have many copies of logically identical fixups which only
differ in which specific registers are written to and which address is
branched to at the end of the fixup. This is potentially wasteful of
I-cache resources, and makes it hard to add additional logic to fixups
without significant bloat.
This patch address all three concerns for inline uaccess fixups by
adding a dedicated exception handler which updates registers in
exception context and subsequent returns back into the function which
faulted, removing the need for fixups specialized to each faulting
instruction.
Other than backtracing, there should be no functional change as a result
of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211019160219.5202-12-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-19 17:02:17 +01:00
: " r " ( address ) ) ; \
2016-09-02 14:54:03 +01:00
uaccess_ttbr0_disable ( ) ; \
}
2016-06-28 18:07:32 +01:00
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
static void user_cache_maint_handler ( unsigned long esr , struct pt_regs * regs )
2016-06-28 18:07:32 +01:00
{
2020-11-20 12:33:46 -08:00
unsigned long tagged_address , address ;
2018-09-20 09:36:19 +05:30
int rt = ESR_ELx_SYS64_ISS_RT ( esr ) ;
2016-09-09 14:07:15 +01:00
int crm = ( esr & ESR_ELx_SYS64_ISS_CRM_MASK ) > > ESR_ELx_SYS64_ISS_CRM_SHIFT ;
int ret = 0 ;
2016-06-28 18:07:32 +01:00
2020-11-20 12:33:46 -08:00
tagged_address = pt_regs_read_reg ( regs , rt ) ;
address = untagged_addr ( tagged_address ) ;
2016-06-28 18:07:32 +01:00
2016-09-09 14:07:15 +01:00
switch ( crm ) {
case ESR_ELx_SYS64_ISS_CRM_DC_CVAU : /* DC CVAU, gets promoted */
__user_cache_maint ( " dc civac " , address , ret ) ;
break ;
case ESR_ELx_SYS64_ISS_CRM_DC_CVAC : /* DC CVAC, gets promoted */
__user_cache_maint ( " dc civac " , address , ret ) ;
break ;
2019-04-09 10:52:42 +01:00
case ESR_ELx_SYS64_ISS_CRM_DC_CVADP : /* DC CVADP */
__user_cache_maint ( " sys 3, c7, c13, 1 " , address , ret ) ;
break ;
2017-07-25 11:55:41 +01:00
case ESR_ELx_SYS64_ISS_CRM_DC_CVAP : /* DC CVAP */
__user_cache_maint ( " sys 3, c7, c12, 1 " , address , ret ) ;
break ;
2016-09-09 14:07:15 +01:00
case ESR_ELx_SYS64_ISS_CRM_DC_CIVAC : /* DC CIVAC */
__user_cache_maint ( " dc civac " , address , ret ) ;
break ;
case ESR_ELx_SYS64_ISS_CRM_IC_IVAU : /* IC IVAU */
__user_cache_maint ( " ic ivau " , address , ret ) ;
break ;
default :
2020-09-14 14:06:52 +05:30
force_signal_inject ( SIGILL , ILL_ILLOPC , regs - > pc , 0 ) ;
2016-06-28 18:07:32 +01:00
return ;
}
if ( ret )
2020-11-20 12:33:46 -08:00
arm64_notify_segfault ( tagged_address ) ;
2016-06-28 18:07:32 +01:00
else
2017-10-25 10:04:33 +01:00
arm64_skip_faulting_instruction ( regs , AARCH64_INSN_SIZE ) ;
2016-06-28 18:07:32 +01:00
}
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
static void ctr_read_handler ( unsigned long esr , struct pt_regs * regs )
2016-09-09 14:07:16 +01:00
{
2018-09-20 09:36:19 +05:30
int rt = ESR_ELx_SYS64_ISS_RT ( esr ) ;
2017-02-09 15:19:19 +00:00
unsigned long val = arm64_ftr_reg_user_value ( & arm64_ftr_reg_ctrel0 ) ;
2019-10-17 18:42:59 +01:00
if ( cpus_have_const_cap ( ARM64_WORKAROUND_1542419 ) ) {
/* Hide DIC so that we can trap the unnecessary maintenance...*/
2022-07-04 18:02:40 +01:00
val & = ~ BIT ( CTR_EL0_DIC_SHIFT ) ;
2019-10-17 18:42:58 +01:00
2019-10-17 18:42:59 +01:00
/* ... and fake IminLine to reduce the number of traps. */
2022-07-04 18:02:40 +01:00
val & = ~ CTR_EL0_IminLine_MASK ;
val | = ( PAGE_SHIFT - 2 ) & CTR_EL0_IminLine_MASK ;
2019-10-17 18:42:59 +01:00
}
2017-02-09 15:19:19 +00:00
pt_regs_write_reg ( regs , rt , val ) ;
2016-09-09 14:07:16 +01:00
2017-10-25 10:04:33 +01:00
arm64_skip_faulting_instruction ( regs , AARCH64_INSN_SIZE ) ;
2016-09-09 14:07:16 +01:00
}
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
static void cntvct_read_handler ( unsigned long esr , struct pt_regs * regs )
2017-02-01 11:48:58 +00:00
{
2018-09-20 09:36:19 +05:30
int rt = ESR_ELx_SYS64_ISS_RT ( esr ) ;
2017-02-01 11:48:58 +00:00
2019-04-08 16:49:03 +01:00
pt_regs_write_reg ( regs , rt , arch_timer_read_counter ( ) ) ;
2017-10-25 10:04:33 +01:00
arm64_skip_faulting_instruction ( regs , AARCH64_INSN_SIZE ) ;
2017-02-01 11:48:58 +00:00
}
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
static void cntfrq_read_handler ( unsigned long esr , struct pt_regs * regs )
2017-04-24 09:04:03 +01:00
{
2018-09-20 09:36:19 +05:30
int rt = ESR_ELx_SYS64_ISS_RT ( esr ) ;
2017-04-24 09:04:03 +01:00
2017-07-21 18:15:27 +01:00
pt_regs_write_reg ( regs , rt , arch_timer_get_rate ( ) ) ;
2017-10-25 10:04:33 +01:00
arm64_skip_faulting_instruction ( regs , AARCH64_INSN_SIZE ) ;
2017-04-24 09:04:03 +01:00
}
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
static void mrs_handler ( unsigned long esr , struct pt_regs * regs )
2018-09-20 09:36:21 +05:30
{
u32 sysreg , rt ;
rt = ESR_ELx_SYS64_ISS_RT ( esr ) ;
sysreg = esr_sys64_to_sysreg ( esr ) ;
if ( do_emulate_mrs ( regs , sysreg , rt ) ! = 0 )
2020-09-14 14:06:52 +05:30
force_signal_inject ( SIGILL , ILL_ILLOPC , regs - > pc , 0 ) ;
2018-09-20 09:36:21 +05:30
}
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
static void wfi_handler ( unsigned long esr , struct pt_regs * regs )
2018-10-01 12:19:43 +01:00
{
arm64_skip_faulting_instruction ( regs , AARCH64_INSN_SIZE ) ;
}
2016-09-09 14:07:15 +01:00
struct sys64_hook {
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
unsigned long esr_mask ;
unsigned long esr_val ;
void ( * handler ) ( unsigned long esr , struct pt_regs * regs ) ;
2016-09-09 14:07:15 +01:00
} ;
2019-08-13 15:16:39 +01:00
static const struct sys64_hook sys64_hooks [ ] = {
2016-09-09 14:07:15 +01:00
{
. esr_mask = ESR_ELx_SYS64_ISS_EL0_CACHE_OP_MASK ,
. esr_val = ESR_ELx_SYS64_ISS_EL0_CACHE_OP_VAL ,
. handler = user_cache_maint_handler ,
} ,
2016-09-09 14:07:16 +01:00
{
/* Trap read access to CTR_EL0 */
. esr_mask = ESR_ELx_SYS64_ISS_SYS_OP_MASK ,
. esr_val = ESR_ELx_SYS64_ISS_SYS_CTR_READ ,
. handler = ctr_read_handler ,
} ,
2017-02-01 11:48:58 +00:00
{
/* Trap read access to CNTVCT_EL0 */
. esr_mask = ESR_ELx_SYS64_ISS_SYS_OP_MASK ,
. esr_val = ESR_ELx_SYS64_ISS_SYS_CNTVCT ,
. handler = cntvct_read_handler ,
} ,
2021-10-17 13:42:24 +01:00
{
/* Trap read access to CNTVCTSS_EL0 */
. esr_mask = ESR_ELx_SYS64_ISS_SYS_OP_MASK ,
. esr_val = ESR_ELx_SYS64_ISS_SYS_CNTVCTSS ,
. handler = cntvct_read_handler ,
} ,
2017-04-24 09:04:03 +01:00
{
/* Trap read access to CNTFRQ_EL0 */
. esr_mask = ESR_ELx_SYS64_ISS_SYS_OP_MASK ,
. esr_val = ESR_ELx_SYS64_ISS_SYS_CNTFRQ ,
. handler = cntfrq_read_handler ,
} ,
2018-09-20 09:36:21 +05:30
{
/* Trap read access to CPUID registers */
. esr_mask = ESR_ELx_SYS64_ISS_SYS_MRS_OP_MASK ,
. esr_val = ESR_ELx_SYS64_ISS_SYS_MRS_OP_VAL ,
. handler = mrs_handler ,
} ,
2018-10-01 12:19:43 +01:00
{
/* Trap WFI instructions executed in userspace */
. esr_mask = ESR_ELx_WFx_MASK ,
. esr_val = ESR_ELx_WFx_WFI_VAL ,
. handler = wfi_handler ,
} ,
2016-09-09 14:07:15 +01:00
{ } ,
} ;
2018-09-27 17:15:29 +01:00
# ifdef CONFIG_COMPAT
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
static bool cp15_cond_valid ( unsigned long esr , struct pt_regs * regs )
2018-09-27 17:15:30 +01:00
{
int cond ;
/* Only a T32 instruction can trap without CV being set */
if ( ! ( esr & ESR_ELx_CV ) ) {
u32 it ;
it = compat_get_it_state ( regs ) ;
if ( ! it )
return true ;
cond = it > > 4 ;
} else {
cond = ( esr & ESR_ELx_COND_MASK ) > > ESR_ELx_COND_SHIFT ;
}
return aarch32_opcode_cond_checks [ cond ] ( regs - > pstate ) ;
}
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
static void compat_cntfrq_read_handler ( unsigned long esr , struct pt_regs * regs )
2018-09-27 17:15:33 +01:00
{
int reg = ( esr & ESR_ELx_CP15_32_ISS_RT_MASK ) > > ESR_ELx_CP15_32_ISS_RT_SHIFT ;
pt_regs_write_reg ( regs , reg , arch_timer_get_rate ( ) ) ;
2020-03-16 16:50:49 +00:00
arm64_skip_faulting_instruction ( regs , 4 ) ;
2018-09-27 17:15:33 +01:00
}
2019-08-13 15:16:39 +01:00
static const struct sys64_hook cp15_32_hooks [ ] = {
2018-09-27 17:15:33 +01:00
{
. esr_mask = ESR_ELx_CP15_32_ISS_SYS_MASK ,
. esr_val = ESR_ELx_CP15_32_ISS_SYS_CNTFRQ ,
. handler = compat_cntfrq_read_handler ,
} ,
2018-09-27 17:15:31 +01:00
{ } ,
} ;
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
static void compat_cntvct_read_handler ( unsigned long esr , struct pt_regs * regs )
2018-09-27 17:15:32 +01:00
{
int rt = ( esr & ESR_ELx_CP15_64_ISS_RT_MASK ) > > ESR_ELx_CP15_64_ISS_RT_SHIFT ;
int rt2 = ( esr & ESR_ELx_CP15_64_ISS_RT2_MASK ) > > ESR_ELx_CP15_64_ISS_RT2_SHIFT ;
2019-04-08 16:49:03 +01:00
u64 val = arch_timer_read_counter ( ) ;
2018-09-27 17:15:32 +01:00
pt_regs_write_reg ( regs , rt , lower_32_bits ( val ) ) ;
pt_regs_write_reg ( regs , rt2 , upper_32_bits ( val ) ) ;
2020-03-16 16:50:49 +00:00
arm64_skip_faulting_instruction ( regs , 4 ) ;
2018-09-27 17:15:32 +01:00
}
2019-08-13 15:16:39 +01:00
static const struct sys64_hook cp15_64_hooks [ ] = {
2018-09-27 17:15:32 +01:00
{
. esr_mask = ESR_ELx_CP15_64_ISS_SYS_MASK ,
. esr_val = ESR_ELx_CP15_64_ISS_SYS_CNTVCT ,
. handler = compat_cntvct_read_handler ,
} ,
2021-10-17 13:42:24 +01:00
{
. esr_mask = ESR_ELx_CP15_64_ISS_SYS_MASK ,
. esr_val = ESR_ELx_CP15_64_ISS_SYS_CNTVCTSS ,
. handler = compat_cntvct_read_handler ,
} ,
2018-09-27 17:15:31 +01:00
{ } ,
} ;
2022-10-19 15:41:15 +01:00
void do_el0_cp15 ( unsigned long esr , struct pt_regs * regs )
2018-09-27 17:15:29 +01:00
{
2019-08-13 15:16:39 +01:00
const struct sys64_hook * hook , * hook_base ;
2018-09-27 17:15:31 +01:00
2018-09-27 17:15:30 +01:00
if ( ! cp15_cond_valid ( esr , regs ) ) {
/*
* There is no T16 variant of a CP access , so we
* always advance PC by 4 bytes .
*/
2020-03-16 16:50:49 +00:00
arm64_skip_faulting_instruction ( regs , 4 ) ;
2018-09-27 17:15:30 +01:00
return ;
}
2018-09-27 17:15:31 +01:00
switch ( ESR_ELx_EC ( esr ) ) {
case ESR_ELx_EC_CP15_32 :
hook_base = cp15_32_hooks ;
break ;
case ESR_ELx_EC_CP15_64 :
hook_base = cp15_64_hooks ;
break ;
default :
arm64: split EL0/EL1 UNDEF handlers
In general, exceptions taken from EL1 need to be handled separately from
exceptions taken from EL0, as the logic to handle the two cases can be
significantly divergent, and exceptions taken from EL1 typically have
more stringent requirements on locking and instrumentation.
Subsequent patches will rework the way EL1 UNDEFs are handled in order
to address longstanding soundness issues with instrumentation and RCU.
In preparation for that rework, this patch splits the existing
do_undefinstr() handler into separate do_el0_undef() and do_el1_undef()
handlers.
Prior to this patch, do_undefinstr() was marked with NOKPROBE_SYMBOL(),
preventing instrumentation via kprobes. However, do_undefinstr() invokes
other code which can be instrumented, and:
* For UNDEFINED exceptions taken from EL0, there is no risk of recursion
within kprobes. Therefore it is safe for do_el0_undef to be
instrumented with kprobes, and it does not need to be marked with
NOKPROBE_SYMBOL().
* For UNDEFINED exceptions taken from EL1, either:
(a) The exception is has been taken when manipulating SSBS; these cases
are limited and do not occur within code that can be invoked
recursively via kprobes. Hence, in these cases instrumentation
with kprobes is benign.
(b) The exception has been taken for an unknown reason, as other than
manipulating SSBS we do not expect to take UNDEFINED exceptions
from EL1. Any handling of these exception is best-effort.
... and in either case, marking do_el1_undef() with NOKPROBE_SYMBOL()
isn't sufficient to prevent recursion via kprobes as functions it
calls (including die()) are instrumentable via kprobes.
Hence, it's not worthwhile to mark do_el1_undef() with
NOKPROBE_SYMBOL(). The same applies to do_el1_bti() and do_el1_fpac(),
so their NOKPROBE_SYMBOL() annotations are also removed.
Aside from the new instrumentability, there should be no functional
change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-10-19 15:41:16 +01:00
do_el0_undef ( regs , esr ) ;
2018-09-27 17:15:31 +01:00
return ;
}
for ( hook = hook_base ; hook - > handler ; hook + + )
if ( ( hook - > esr_mask & esr ) = = hook - > esr_val ) {
hook - > handler ( esr , regs ) ;
return ;
}
2018-09-27 17:15:29 +01:00
/*
* New cp15 instructions may previously have been undefined at
* EL0 . Fall back to our usual undefined instruction handler
* so that we handle these consistently .
*/
arm64: split EL0/EL1 UNDEF handlers
In general, exceptions taken from EL1 need to be handled separately from
exceptions taken from EL0, as the logic to handle the two cases can be
significantly divergent, and exceptions taken from EL1 typically have
more stringent requirements on locking and instrumentation.
Subsequent patches will rework the way EL1 UNDEFs are handled in order
to address longstanding soundness issues with instrumentation and RCU.
In preparation for that rework, this patch splits the existing
do_undefinstr() handler into separate do_el0_undef() and do_el1_undef()
handlers.
Prior to this patch, do_undefinstr() was marked with NOKPROBE_SYMBOL(),
preventing instrumentation via kprobes. However, do_undefinstr() invokes
other code which can be instrumented, and:
* For UNDEFINED exceptions taken from EL0, there is no risk of recursion
within kprobes. Therefore it is safe for do_el0_undef to be
instrumented with kprobes, and it does not need to be marked with
NOKPROBE_SYMBOL().
* For UNDEFINED exceptions taken from EL1, either:
(a) The exception is has been taken when manipulating SSBS; these cases
are limited and do not occur within code that can be invoked
recursively via kprobes. Hence, in these cases instrumentation
with kprobes is benign.
(b) The exception has been taken for an unknown reason, as other than
manipulating SSBS we do not expect to take UNDEFINED exceptions
from EL1. Any handling of these exception is best-effort.
... and in either case, marking do_el1_undef() with NOKPROBE_SYMBOL()
isn't sufficient to prevent recursion via kprobes as functions it
calls (including die()) are instrumentable via kprobes.
Hence, it's not worthwhile to mark do_el1_undef() with
NOKPROBE_SYMBOL(). The same applies to do_el1_bti() and do_el1_fpac(),
so their NOKPROBE_SYMBOL() annotations are also removed.
Aside from the new instrumentability, there should be no functional
change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-10-19 15:41:16 +01:00
do_el0_undef ( regs , esr ) ;
2018-09-27 17:15:29 +01:00
}
# endif
2022-10-19 15:41:15 +01:00
void do_el0_sys ( unsigned long esr , struct pt_regs * regs )
2016-09-09 14:07:15 +01:00
{
2019-08-13 15:16:39 +01:00
const struct sys64_hook * hook ;
2016-09-09 14:07:15 +01:00
for ( hook = sys64_hooks ; hook - > handler ; hook + + )
if ( ( hook - > esr_mask & esr ) = = hook - > esr_val ) {
hook - > handler ( esr , regs ) ;
return ;
}
arm64: handle sys and undef traps consistently
If an EL0 instruction in the SYS class triggers an exception, do_sysintr
looks for a sys64_hook matching the instruction, and if none is found,
injects a SIGILL. This mirrors what we do for undefined instruction
encodings in do_undefinstr, where we look for an undef_hook matching the
instruction, and if none is found, inject a SIGILL.
Over time, new SYS instruction encodings may be allocated. Prior to
allocation, exceptions resulting from these would be handled by
do_undefinstr, whereas after allocation these may be handled by
do_sysintr.
To ensure that we have consistent behaviour if and when this happens, it
would be beneficial to have do_sysinstr fall back to do_undefinstr.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-01-27 16:15:38 +00:00
/*
* New SYS instructions may previously have been undefined at EL0 . Fall
* back to our usual undefined instruction handler so that we handle
* these consistently .
*/
arm64: split EL0/EL1 UNDEF handlers
In general, exceptions taken from EL1 need to be handled separately from
exceptions taken from EL0, as the logic to handle the two cases can be
significantly divergent, and exceptions taken from EL1 typically have
more stringent requirements on locking and instrumentation.
Subsequent patches will rework the way EL1 UNDEFs are handled in order
to address longstanding soundness issues with instrumentation and RCU.
In preparation for that rework, this patch splits the existing
do_undefinstr() handler into separate do_el0_undef() and do_el1_undef()
handlers.
Prior to this patch, do_undefinstr() was marked with NOKPROBE_SYMBOL(),
preventing instrumentation via kprobes. However, do_undefinstr() invokes
other code which can be instrumented, and:
* For UNDEFINED exceptions taken from EL0, there is no risk of recursion
within kprobes. Therefore it is safe for do_el0_undef to be
instrumented with kprobes, and it does not need to be marked with
NOKPROBE_SYMBOL().
* For UNDEFINED exceptions taken from EL1, either:
(a) The exception is has been taken when manipulating SSBS; these cases
are limited and do not occur within code that can be invoked
recursively via kprobes. Hence, in these cases instrumentation
with kprobes is benign.
(b) The exception has been taken for an unknown reason, as other than
manipulating SSBS we do not expect to take UNDEFINED exceptions
from EL1. Any handling of these exception is best-effort.
... and in either case, marking do_el1_undef() with NOKPROBE_SYMBOL()
isn't sufficient to prevent recursion via kprobes as functions it
calls (including die()) are instrumentable via kprobes.
Hence, it's not worthwhile to mark do_el1_undef() with
NOKPROBE_SYMBOL(). The same applies to do_el1_bti() and do_el1_fpac(),
so their NOKPROBE_SYMBOL() annotations are also removed.
Aside from the new instrumentability, there should be no functional
change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-10-19 15:41:16 +01:00
do_el0_undef ( regs , esr ) ;
2016-09-09 14:07:15 +01:00
}
2014-11-18 12:16:30 +00:00
static const char * esr_class_str [ ] = {
[ 0 . . . ESR_ELx_EC_MAX ] = " UNRECOGNIZED EC " ,
[ ESR_ELx_EC_UNKNOWN ] = " Unknown/Uncategorized " ,
[ ESR_ELx_EC_WFx ] = " WFI/WFE " ,
[ ESR_ELx_EC_CP15_32 ] = " CP15 MCR/MRC " ,
[ ESR_ELx_EC_CP15_64 ] = " CP15 MCRR/MRRC " ,
[ ESR_ELx_EC_CP14_MR ] = " CP14 MCR/MRC " ,
[ ESR_ELx_EC_CP14_LS ] = " CP14 LDC/STC " ,
[ ESR_ELx_EC_FP_ASIMD ] = " ASIMD " ,
[ ESR_ELx_EC_CP10_ID ] = " CP10 MRC/VMRS " ,
2019-07-13 04:40:54 +00:00
[ ESR_ELx_EC_PAC ] = " PAC " ,
2014-11-18 12:16:30 +00:00
[ ESR_ELx_EC_CP14_64 ] = " CP14 MCRR/MRRC " ,
2020-03-16 16:50:45 +00:00
[ ESR_ELx_EC_BTI ] = " BTI " ,
2014-11-18 12:16:30 +00:00
[ ESR_ELx_EC_ILL ] = " PSTATE.IL " ,
[ ESR_ELx_EC_SVC32 ] = " SVC (AArch32) " ,
[ ESR_ELx_EC_HVC32 ] = " HVC (AArch32) " ,
[ ESR_ELx_EC_SMC32 ] = " SMC (AArch32) " ,
[ ESR_ELx_EC_SVC64 ] = " SVC (AArch64) " ,
[ ESR_ELx_EC_HVC64 ] = " HVC (AArch64) " ,
[ ESR_ELx_EC_SMC64 ] = " SMC (AArch64) " ,
[ ESR_ELx_EC_SYS64 ] = " MSR/MRS (AArch64) " ,
2017-10-31 15:51:00 +00:00
[ ESR_ELx_EC_SVE ] = " SVE " ,
2019-07-16 08:14:19 +01:00
[ ESR_ELx_EC_ERET ] = " ERET/ERETAA/ERETAB " ,
2020-09-14 14:06:53 +05:30
[ ESR_ELx_EC_FPAC ] = " FPAC " ,
2022-04-19 12:22:13 +01:00
[ ESR_ELx_EC_SME ] = " SME " ,
2014-11-18 12:16:30 +00:00
[ ESR_ELx_EC_IMP_DEF ] = " EL3 IMP DEF " ,
[ ESR_ELx_EC_IABT_LOW ] = " IABT (lower EL) " ,
[ ESR_ELx_EC_IABT_CUR ] = " IABT (current EL) " ,
[ ESR_ELx_EC_PC_ALIGN ] = " PC Alignment " ,
[ ESR_ELx_EC_DABT_LOW ] = " DABT (lower EL) " ,
[ ESR_ELx_EC_DABT_CUR ] = " DABT (current EL) " ,
[ ESR_ELx_EC_SP_ALIGN ] = " SP Alignment " ,
2023-05-09 15:22:31 +01:00
[ ESR_ELx_EC_MOPS ] = " MOPS " ,
2014-11-18 12:16:30 +00:00
[ ESR_ELx_EC_FP_EXC32 ] = " FP (AArch32) " ,
[ ESR_ELx_EC_FP_EXC64 ] = " FP (AArch64) " ,
[ ESR_ELx_EC_SERROR ] = " SError " ,
[ ESR_ELx_EC_BREAKPT_LOW ] = " Breakpoint (lower EL) " ,
[ ESR_ELx_EC_BREAKPT_CUR ] = " Breakpoint (current EL) " ,
[ ESR_ELx_EC_SOFTSTP_LOW ] = " Software Step (lower EL) " ,
[ ESR_ELx_EC_SOFTSTP_CUR ] = " Software Step (current EL) " ,
[ ESR_ELx_EC_WATCHPT_LOW ] = " Watchpoint (lower EL) " ,
[ ESR_ELx_EC_WATCHPT_CUR ] = " Watchpoint (current EL) " ,
[ ESR_ELx_EC_BKPT32 ] = " BKPT (AArch32) " ,
[ ESR_ELx_EC_VECTOR32 ] = " Vector catch (AArch32) " ,
[ ESR_ELx_EC_BRK64 ] = " BRK (AArch64) " ,
} ;
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
const char * esr_get_class_string ( unsigned long esr )
2014-11-18 12:16:30 +00:00
{
2016-05-31 12:33:01 +01:00
return esr_class_str [ ESR_ELx_EC ( esr ) ] ;
2014-11-18 12:16:30 +00:00
}
arm64: avoid returning from bad_mode
Generally, taking an unexpected exception should be a fatal event, and
bad_mode is intended to cater for this. However, it should be possible
to contain unexpected synchronous exceptions from EL0 without bringing
the kernel down, by sending a SIGILL to the task.
We tried to apply this approach in commit 9955ac47f4ba1c95 ("arm64:
don't kill the kernel on a bad esr from el0"), by sending a signal for
any bad_mode call resulting from an EL0 exception.
However, this also applies to other unexpected exceptions, such as
SError and FIQ. The entry paths for these exceptions branch to bad_mode
without configuring the link register, and have no kernel_exit. Thus, if
we take one of these exceptions from EL0, bad_mode will eventually
return to the original user link register value.
This patch fixes this by introducing a new bad_el0_sync handler to cater
for the recoverable case, and restoring bad_mode to its original state,
whereby it calls panic() and never returns. The recoverable case
branches to bad_el0_sync with a bl, and returns to userspace via the
usual ret_to_user mechanism.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 9955ac47f4ba1c95 ("arm64: don't kill the kernel on a bad esr from el0")
Reported-by: Mark Salter <msalter@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-01-18 17:23:41 +00:00
/*
* bad_el0_sync handles unexpected , but potentially recoverable synchronous
arm64: entry: handle all vectors with C
We have 16 architectural exception vectors, and depending on kernel
configuration we handle 8 or 12 of these with C code, with the remaining
8 or 4 of these handled as special cases in the entry assembly.
It would be nicer if the entry assembly were uniform for all exceptions,
and we deferred any specific handling of the exceptions to C code. This
way the entry assembly can be more easily templated without ifdeffery or
special cases, and it's easier to modify the handling of these cases in
future (e.g. to dump additional registers other context).
This patch reworks the entry code so that we always have a C handler for
every architectural exception vector, with the entry assembly being
completely uniform. We now have to handle exceptions from EL1t and EL1h,
and also have to handle exceptions from AArch32 even when the kernel is
built without CONFIG_COMPAT. To make this clear and to simplify
templating, we rename the top-level exception handlers with a consistent
naming scheme:
asm: <el+sp>_<regsize>_<type>
c: <el+sp>_<regsize>_<type>_handler
.. where:
<el+sp> is `el1t`, `el1h`, or `el0t`
<regsize> is `64` or `32`
<type> is `sync`, `irq`, `fiq`, or `error`
... e.g.
asm: el1h_64_sync
c: el1h_64_sync_handler
... with lower-level handlers simply using "el1" and "compat" as today.
For unexpected exceptions, this information is passed to
__panic_unhandled(), so it can report the specific vector an unexpected
exception was taken from, e.g.
| Unhandled 64-bit el1t sync exception
For vectors we never expect to enter legitimately, the C code is
generated using a macro to avoid code duplication. The exceptions are
handled via __panic_unhandled(), replacing bad_mode() (which is
removed).
The `kernel_ventry` and `entry_handler` assembly macros are updated to
handle the new naming scheme. In theory it should be possible to
generate the entry functions at the same time as the vectors using a
single table, but this will require reworking the linker script to split
the two into separate sections, so for now we have separate tables.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-15-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07 10:46:18 +01:00
* exceptions taken from EL0 .
arm64: avoid returning from bad_mode
Generally, taking an unexpected exception should be a fatal event, and
bad_mode is intended to cater for this. However, it should be possible
to contain unexpected synchronous exceptions from EL0 without bringing
the kernel down, by sending a SIGILL to the task.
We tried to apply this approach in commit 9955ac47f4ba1c95 ("arm64:
don't kill the kernel on a bad esr from el0"), by sending a signal for
any bad_mode call resulting from an EL0 exception.
However, this also applies to other unexpected exceptions, such as
SError and FIQ. The entry paths for these exceptions branch to bad_mode
without configuring the link register, and have no kernel_exit. Thus, if
we take one of these exceptions from EL0, bad_mode will eventually
return to the original user link register value.
This patch fixes this by introducing a new bad_el0_sync handler to cater
for the recoverable case, and restoring bad_mode to its original state,
whereby it calls panic() and never returns. The recoverable case
branches to bad_el0_sync with a bl, and returns to userspace via the
usual ret_to_user mechanism.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 9955ac47f4ba1c95 ("arm64: don't kill the kernel on a bad esr from el0")
Reported-by: Mark Salter <msalter@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-01-18 17:23:41 +00:00
*/
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
void bad_el0_sync ( struct pt_regs * regs , int reason , unsigned long esr )
arm64: avoid returning from bad_mode
Generally, taking an unexpected exception should be a fatal event, and
bad_mode is intended to cater for this. However, it should be possible
to contain unexpected synchronous exceptions from EL0 without bringing
the kernel down, by sending a SIGILL to the task.
We tried to apply this approach in commit 9955ac47f4ba1c95 ("arm64:
don't kill the kernel on a bad esr from el0"), by sending a signal for
any bad_mode call resulting from an EL0 exception.
However, this also applies to other unexpected exceptions, such as
SError and FIQ. The entry paths for these exceptions branch to bad_mode
without configuring the link register, and have no kernel_exit. Thus, if
we take one of these exceptions from EL0, bad_mode will eventually
return to the original user link register value.
This patch fixes this by introducing a new bad_el0_sync handler to cater
for the recoverable case, and restoring bad_mode to its original state,
whereby it calls panic() and never returns. The recoverable case
branches to bad_el0_sync with a bl, and returns to userspace via the
usual ret_to_user mechanism.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 9955ac47f4ba1c95 ("arm64: don't kill the kernel on a bad esr from el0")
Reported-by: Mark Salter <msalter@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-01-18 17:23:41 +00:00
{
2020-11-20 12:33:46 -08:00
unsigned long pc = instruction_pointer ( regs ) ;
2013-05-28 15:54:15 +01:00
arm64: avoid returning from bad_mode
Generally, taking an unexpected exception should be a fatal event, and
bad_mode is intended to cater for this. However, it should be possible
to contain unexpected synchronous exceptions from EL0 without bringing
the kernel down, by sending a SIGILL to the task.
We tried to apply this approach in commit 9955ac47f4ba1c95 ("arm64:
don't kill the kernel on a bad esr from el0"), by sending a signal for
any bad_mode call resulting from an EL0 exception.
However, this also applies to other unexpected exceptions, such as
SError and FIQ. The entry paths for these exceptions branch to bad_mode
without configuring the link register, and have no kernel_exit. Thus, if
we take one of these exceptions from EL0, bad_mode will eventually
return to the original user link register value.
This patch fixes this by introducing a new bad_el0_sync handler to cater
for the recoverable case, and restoring bad_mode to its original state,
whereby it calls panic() and never returns. The recoverable case
branches to bad_el0_sync with a bl, and returns to userspace via the
usual ret_to_user mechanism.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 9955ac47f4ba1c95 ("arm64: don't kill the kernel on a bad esr from el0")
Reported-by: Mark Salter <msalter@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-01-18 17:23:41 +00:00
current - > thread . fault_address = 0 ;
2018-02-20 15:18:13 +00:00
current - > thread . fault_code = esr ;
arm64: avoid returning from bad_mode
Generally, taking an unexpected exception should be a fatal event, and
bad_mode is intended to cater for this. However, it should be possible
to contain unexpected synchronous exceptions from EL0 without bringing
the kernel down, by sending a SIGILL to the task.
We tried to apply this approach in commit 9955ac47f4ba1c95 ("arm64:
don't kill the kernel on a bad esr from el0"), by sending a signal for
any bad_mode call resulting from an EL0 exception.
However, this also applies to other unexpected exceptions, such as
SError and FIQ. The entry paths for these exceptions branch to bad_mode
without configuring the link register, and have no kernel_exit. Thus, if
we take one of these exceptions from EL0, bad_mode will eventually
return to the original user link register value.
This patch fixes this by introducing a new bad_el0_sync handler to cater
for the recoverable case, and restoring bad_mode to its original state,
whereby it calls panic() and never returns. The recoverable case
branches to bad_el0_sync with a bl, and returns to userspace via the
usual ret_to_user mechanism.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 9955ac47f4ba1c95 ("arm64: don't kill the kernel on a bad esr from el0")
Reported-by: Mark Salter <msalter@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-01-18 17:23:41 +00:00
2018-09-22 10:26:57 +02:00
arm64_force_sig_fault ( SIGILL , ILL_ILLOPC , pc ,
" Bad EL0 synchronous exception " ) ;
2012-03-05 11:49:27 +00:00
}
arm64: add VMAP_STACK overflow detection
This patch adds stack overflow detection to arm64, usable when vmap'd stacks
are in use.
Overflow is detected in a small preamble executed for each exception entry,
which checks whether there is enough space on the current stack for the general
purpose registers to be saved. If there is not enough space, the overflow
handler is invoked on a per-cpu overflow stack. This approach preserves the
original exception information in ESR_EL1 (and where appropriate, FAR_EL1).
Task and IRQ stacks are aligned to double their size, enabling overflow to be
detected with a single bit test. For example, a 16K stack is aligned to 32K,
ensuring that bit 14 of the SP must be zero. On an overflow (or underflow),
this bit is flipped. Thus, overflow (of less than the size of the stack) can be
detected by testing whether this bit is set.
The overflow check is performed before any attempt is made to access the
stack, avoiding recursive faults (and the loss of exception information
these would entail). As logical operations cannot be performed on the SP
directly, the SP is temporarily swapped with a general purpose register
using arithmetic operations to enable the test to be performed.
This gives us a useful error message on stack overflow, as can be trigger with
the LKDTM overflow test:
[ 305.388749] lkdtm: Performing direct entry OVERFLOW
[ 305.395444] Insufficient stack space to handle exception!
[ 305.395482] ESR: 0x96000047 -- DABT (current EL)
[ 305.399890] FAR: 0xffff00000a5e7f30
[ 305.401315] Task stack: [0xffff00000a5e8000..0xffff00000a5ec000]
[ 305.403815] IRQ stack: [0xffff000008000000..0xffff000008004000]
[ 305.407035] Overflow stack: [0xffff80003efce4e0..0xffff80003efcf4e0]
[ 305.409622] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[ 305.412785] Hardware name: linux,dummy-virt (DT)
[ 305.415756] task: ffff80003d051c00 task.stack: ffff00000a5e8000
[ 305.419221] PC is at recursive_loop+0x10/0x48
[ 305.421637] LR is at recursive_loop+0x38/0x48
[ 305.423768] pc : [<ffff00000859f330>] lr : [<ffff00000859f358>] pstate: 40000145
[ 305.428020] sp : ffff00000a5e7f50
[ 305.430469] x29: ffff00000a5e8350 x28: ffff80003d051c00
[ 305.433191] x27: ffff000008981000 x26: ffff000008f80400
[ 305.439012] x25: ffff00000a5ebeb8 x24: ffff00000a5ebeb8
[ 305.440369] x23: ffff000008f80138 x22: 0000000000000009
[ 305.442241] x21: ffff80003ce65000 x20: ffff000008f80188
[ 305.444552] x19: 0000000000000013 x18: 0000000000000006
[ 305.446032] x17: 0000ffffa2601280 x16: ffff0000081fe0b8
[ 305.448252] x15: ffff000008ff546d x14: 000000000047a4c8
[ 305.450246] x13: ffff000008ff7872 x12: 0000000005f5e0ff
[ 305.452953] x11: ffff000008ed2548 x10: 000000000005ee8d
[ 305.454824] x9 : ffff000008545380 x8 : ffff00000a5e8770
[ 305.457105] x7 : 1313131313131313 x6 : 00000000000000e1
[ 305.459285] x5 : 0000000000000000 x4 : 0000000000000000
[ 305.461781] x3 : 0000000000000000 x2 : 0000000000000400
[ 305.465119] x1 : 0000000000000013 x0 : 0000000000000012
[ 305.467724] Kernel panic - not syncing: kernel stack overflow
[ 305.470561] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[ 305.473325] Hardware name: linux,dummy-virt (DT)
[ 305.475070] Call trace:
[ 305.476116] [<ffff000008088ad8>] dump_backtrace+0x0/0x378
[ 305.478991] [<ffff000008088e64>] show_stack+0x14/0x20
[ 305.481237] [<ffff00000895a178>] dump_stack+0x98/0xb8
[ 305.483294] [<ffff0000080c3288>] panic+0x118/0x280
[ 305.485673] [<ffff0000080c2e9c>] nmi_panic+0x6c/0x70
[ 305.486216] [<ffff000008089710>] handle_bad_stack+0x118/0x128
[ 305.486612] Exception stack(0xffff80003efcf3a0 to 0xffff80003efcf4e0)
[ 305.487334] f3a0: 0000000000000012 0000000000000013 0000000000000400 0000000000000000
[ 305.488025] f3c0: 0000000000000000 0000000000000000 00000000000000e1 1313131313131313
[ 305.488908] f3e0: ffff00000a5e8770 ffff000008545380 000000000005ee8d ffff000008ed2548
[ 305.489403] f400: 0000000005f5e0ff ffff000008ff7872 000000000047a4c8 ffff000008ff546d
[ 305.489759] f420: ffff0000081fe0b8 0000ffffa2601280 0000000000000006 0000000000000013
[ 305.490256] f440: ffff000008f80188 ffff80003ce65000 0000000000000009 ffff000008f80138
[ 305.490683] f460: ffff00000a5ebeb8 ffff00000a5ebeb8 ffff000008f80400 ffff000008981000
[ 305.491051] f480: ffff80003d051c00 ffff00000a5e8350 ffff00000859f358 ffff00000a5e7f50
[ 305.491444] f4a0: ffff00000859f330 0000000040000145 0000000000000000 0000000000000000
[ 305.492008] f4c0: 0001000000000000 0000000000000000 ffff00000a5e8350 ffff00000859f330
[ 305.493063] [<ffff00000808205c>] __bad_stack+0x88/0x8c
[ 305.493396] [<ffff00000859f330>] recursive_loop+0x10/0x48
[ 305.493731] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494088] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494425] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494649] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494898] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495205] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495453] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495708] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496000] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496302] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496644] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496894] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.497138] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.497325] [<ffff00000859f3dc>] lkdtm_OVERFLOW+0x14/0x20
[ 305.497506] [<ffff00000859f314>] lkdtm_do_action+0x1c/0x28
[ 305.497786] [<ffff00000859f178>] direct_entry+0xe0/0x170
[ 305.498095] [<ffff000008345568>] full_proxy_write+0x60/0xa8
[ 305.498387] [<ffff0000081fb7f4>] __vfs_write+0x1c/0x128
[ 305.498679] [<ffff0000081fcc68>] vfs_write+0xa0/0x1b0
[ 305.498926] [<ffff0000081fe0fc>] SyS_write+0x44/0xa0
[ 305.499182] Exception stack(0xffff00000a5ebec0 to 0xffff00000a5ec000)
[ 305.499429] bec0: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[ 305.499674] bee0: 574f4c465245564f 0000000000000000 0000000000000000 8000000080808080
[ 305.499904] bf00: 0000000000000040 0000000000000038 fefefeff1b4bc2ff 7f7f7f7f7f7fff7f
[ 305.500189] bf20: 0101010101010101 0000000000000000 000000000047a4c8 0000000000000038
[ 305.500712] bf40: 0000000000000000 0000ffffa2601280 0000ffffc63f6068 00000000004b5000
[ 305.501241] bf60: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[ 305.501791] bf80: 0000000000000020 0000000000000000 00000000004b5000 000000001c4cc458
[ 305.502314] bfa0: 0000000000000000 0000ffffc63f7950 000000000040a3c4 0000ffffc63f70e0
[ 305.502762] bfc0: 0000ffffa2601268 0000000080000000 0000000000000001 0000000000000040
[ 305.503207] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 305.503680] [<ffff000008082fb0>] el0_svc_naked+0x24/0x28
[ 305.504720] Kernel Offset: disabled
[ 305.505189] CPU features: 0x002082
[ 305.505473] Memory Limit: none
[ 305.506181] ---[ end Kernel panic - not syncing: kernel stack overflow
This patch was co-authored by Ard Biesheuvel and Mark Rutland.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-07-14 20:30:35 +01:00
# ifdef CONFIG_VMAP_STACK
DEFINE_PER_CPU ( unsigned long [ OVERFLOW_STACK_SIZE / sizeof ( long ) ] , overflow_stack )
__aligned ( 16 ) ;
2023-04-12 16:49:34 -07:00
void __noreturn panic_bad_stack ( struct pt_regs * regs , unsigned long esr , unsigned long far )
arm64: add VMAP_STACK overflow detection
This patch adds stack overflow detection to arm64, usable when vmap'd stacks
are in use.
Overflow is detected in a small preamble executed for each exception entry,
which checks whether there is enough space on the current stack for the general
purpose registers to be saved. If there is not enough space, the overflow
handler is invoked on a per-cpu overflow stack. This approach preserves the
original exception information in ESR_EL1 (and where appropriate, FAR_EL1).
Task and IRQ stacks are aligned to double their size, enabling overflow to be
detected with a single bit test. For example, a 16K stack is aligned to 32K,
ensuring that bit 14 of the SP must be zero. On an overflow (or underflow),
this bit is flipped. Thus, overflow (of less than the size of the stack) can be
detected by testing whether this bit is set.
The overflow check is performed before any attempt is made to access the
stack, avoiding recursive faults (and the loss of exception information
these would entail). As logical operations cannot be performed on the SP
directly, the SP is temporarily swapped with a general purpose register
using arithmetic operations to enable the test to be performed.
This gives us a useful error message on stack overflow, as can be trigger with
the LKDTM overflow test:
[ 305.388749] lkdtm: Performing direct entry OVERFLOW
[ 305.395444] Insufficient stack space to handle exception!
[ 305.395482] ESR: 0x96000047 -- DABT (current EL)
[ 305.399890] FAR: 0xffff00000a5e7f30
[ 305.401315] Task stack: [0xffff00000a5e8000..0xffff00000a5ec000]
[ 305.403815] IRQ stack: [0xffff000008000000..0xffff000008004000]
[ 305.407035] Overflow stack: [0xffff80003efce4e0..0xffff80003efcf4e0]
[ 305.409622] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[ 305.412785] Hardware name: linux,dummy-virt (DT)
[ 305.415756] task: ffff80003d051c00 task.stack: ffff00000a5e8000
[ 305.419221] PC is at recursive_loop+0x10/0x48
[ 305.421637] LR is at recursive_loop+0x38/0x48
[ 305.423768] pc : [<ffff00000859f330>] lr : [<ffff00000859f358>] pstate: 40000145
[ 305.428020] sp : ffff00000a5e7f50
[ 305.430469] x29: ffff00000a5e8350 x28: ffff80003d051c00
[ 305.433191] x27: ffff000008981000 x26: ffff000008f80400
[ 305.439012] x25: ffff00000a5ebeb8 x24: ffff00000a5ebeb8
[ 305.440369] x23: ffff000008f80138 x22: 0000000000000009
[ 305.442241] x21: ffff80003ce65000 x20: ffff000008f80188
[ 305.444552] x19: 0000000000000013 x18: 0000000000000006
[ 305.446032] x17: 0000ffffa2601280 x16: ffff0000081fe0b8
[ 305.448252] x15: ffff000008ff546d x14: 000000000047a4c8
[ 305.450246] x13: ffff000008ff7872 x12: 0000000005f5e0ff
[ 305.452953] x11: ffff000008ed2548 x10: 000000000005ee8d
[ 305.454824] x9 : ffff000008545380 x8 : ffff00000a5e8770
[ 305.457105] x7 : 1313131313131313 x6 : 00000000000000e1
[ 305.459285] x5 : 0000000000000000 x4 : 0000000000000000
[ 305.461781] x3 : 0000000000000000 x2 : 0000000000000400
[ 305.465119] x1 : 0000000000000013 x0 : 0000000000000012
[ 305.467724] Kernel panic - not syncing: kernel stack overflow
[ 305.470561] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[ 305.473325] Hardware name: linux,dummy-virt (DT)
[ 305.475070] Call trace:
[ 305.476116] [<ffff000008088ad8>] dump_backtrace+0x0/0x378
[ 305.478991] [<ffff000008088e64>] show_stack+0x14/0x20
[ 305.481237] [<ffff00000895a178>] dump_stack+0x98/0xb8
[ 305.483294] [<ffff0000080c3288>] panic+0x118/0x280
[ 305.485673] [<ffff0000080c2e9c>] nmi_panic+0x6c/0x70
[ 305.486216] [<ffff000008089710>] handle_bad_stack+0x118/0x128
[ 305.486612] Exception stack(0xffff80003efcf3a0 to 0xffff80003efcf4e0)
[ 305.487334] f3a0: 0000000000000012 0000000000000013 0000000000000400 0000000000000000
[ 305.488025] f3c0: 0000000000000000 0000000000000000 00000000000000e1 1313131313131313
[ 305.488908] f3e0: ffff00000a5e8770 ffff000008545380 000000000005ee8d ffff000008ed2548
[ 305.489403] f400: 0000000005f5e0ff ffff000008ff7872 000000000047a4c8 ffff000008ff546d
[ 305.489759] f420: ffff0000081fe0b8 0000ffffa2601280 0000000000000006 0000000000000013
[ 305.490256] f440: ffff000008f80188 ffff80003ce65000 0000000000000009 ffff000008f80138
[ 305.490683] f460: ffff00000a5ebeb8 ffff00000a5ebeb8 ffff000008f80400 ffff000008981000
[ 305.491051] f480: ffff80003d051c00 ffff00000a5e8350 ffff00000859f358 ffff00000a5e7f50
[ 305.491444] f4a0: ffff00000859f330 0000000040000145 0000000000000000 0000000000000000
[ 305.492008] f4c0: 0001000000000000 0000000000000000 ffff00000a5e8350 ffff00000859f330
[ 305.493063] [<ffff00000808205c>] __bad_stack+0x88/0x8c
[ 305.493396] [<ffff00000859f330>] recursive_loop+0x10/0x48
[ 305.493731] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494088] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494425] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494649] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494898] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495205] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495453] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495708] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496000] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496302] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496644] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496894] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.497138] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.497325] [<ffff00000859f3dc>] lkdtm_OVERFLOW+0x14/0x20
[ 305.497506] [<ffff00000859f314>] lkdtm_do_action+0x1c/0x28
[ 305.497786] [<ffff00000859f178>] direct_entry+0xe0/0x170
[ 305.498095] [<ffff000008345568>] full_proxy_write+0x60/0xa8
[ 305.498387] [<ffff0000081fb7f4>] __vfs_write+0x1c/0x128
[ 305.498679] [<ffff0000081fcc68>] vfs_write+0xa0/0x1b0
[ 305.498926] [<ffff0000081fe0fc>] SyS_write+0x44/0xa0
[ 305.499182] Exception stack(0xffff00000a5ebec0 to 0xffff00000a5ec000)
[ 305.499429] bec0: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[ 305.499674] bee0: 574f4c465245564f 0000000000000000 0000000000000000 8000000080808080
[ 305.499904] bf00: 0000000000000040 0000000000000038 fefefeff1b4bc2ff 7f7f7f7f7f7fff7f
[ 305.500189] bf20: 0101010101010101 0000000000000000 000000000047a4c8 0000000000000038
[ 305.500712] bf40: 0000000000000000 0000ffffa2601280 0000ffffc63f6068 00000000004b5000
[ 305.501241] bf60: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[ 305.501791] bf80: 0000000000000020 0000000000000000 00000000004b5000 000000001c4cc458
[ 305.502314] bfa0: 0000000000000000 0000ffffc63f7950 000000000040a3c4 0000ffffc63f70e0
[ 305.502762] bfc0: 0000ffffa2601268 0000000080000000 0000000000000001 0000000000000040
[ 305.503207] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 305.503680] [<ffff000008082fb0>] el0_svc_naked+0x24/0x28
[ 305.504720] Kernel Offset: disabled
[ 305.505189] CPU features: 0x002082
[ 305.505473] Memory Limit: none
[ 305.506181] ---[ end Kernel panic - not syncing: kernel stack overflow
This patch was co-authored by Ard Biesheuvel and Mark Rutland.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-07-14 20:30:35 +01:00
{
unsigned long tsk_stk = ( unsigned long ) current - > stack ;
unsigned long irq_stk = ( unsigned long ) this_cpu_read ( irq_stack_ptr ) ;
unsigned long ovf_stk = ( unsigned long ) this_cpu_ptr ( overflow_stack ) ;
arm64: entry: fix NMI {user, kernel}->kernel transitions
Exceptions which can be taken at (almost) any time are consdiered to be
NMIs. On arm64 that includes:
* SDEI events
* GICv3 Pseudo-NMIs
* Kernel stack overflows
* Unexpected/unhandled exceptions
... but currently debug exceptions (BRKs, breakpoints, watchpoints,
single-step) are not considered NMIs.
As these can be taken at any time, kernel features (lockdep, RCU,
ftrace) may not be in a consistent kernel state. For example, we may
take an NMI from the idle code or partway through an entry/exit path.
While nmi_enter() and nmi_exit() handle most of this state, notably they
don't save/restore the lockdep state across an NMI being taken and
handled. When interrupts are enabled and an NMI is taken, lockdep may
see interrupts become disabled within the NMI code, but not see
interrupts become enabled when returning from the NMI, leaving lockdep
believing interrupts are disabled when they are actually disabled.
The x86 code handles this in idtentry_{enter,exit}_nmi(), which will
shortly be moved to the generic entry code. As we can't use either yet,
we copy the x86 approach in arm64-specific helpers. All the NMI
entrypoints are marked as noinstr to prevent any instrumentation
handling code being invoked before the state has been corrected.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201130115950.22492-11-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30 11:59:49 +00:00
arm64: add VMAP_STACK overflow detection
This patch adds stack overflow detection to arm64, usable when vmap'd stacks
are in use.
Overflow is detected in a small preamble executed for each exception entry,
which checks whether there is enough space on the current stack for the general
purpose registers to be saved. If there is not enough space, the overflow
handler is invoked on a per-cpu overflow stack. This approach preserves the
original exception information in ESR_EL1 (and where appropriate, FAR_EL1).
Task and IRQ stacks are aligned to double their size, enabling overflow to be
detected with a single bit test. For example, a 16K stack is aligned to 32K,
ensuring that bit 14 of the SP must be zero. On an overflow (or underflow),
this bit is flipped. Thus, overflow (of less than the size of the stack) can be
detected by testing whether this bit is set.
The overflow check is performed before any attempt is made to access the
stack, avoiding recursive faults (and the loss of exception information
these would entail). As logical operations cannot be performed on the SP
directly, the SP is temporarily swapped with a general purpose register
using arithmetic operations to enable the test to be performed.
This gives us a useful error message on stack overflow, as can be trigger with
the LKDTM overflow test:
[ 305.388749] lkdtm: Performing direct entry OVERFLOW
[ 305.395444] Insufficient stack space to handle exception!
[ 305.395482] ESR: 0x96000047 -- DABT (current EL)
[ 305.399890] FAR: 0xffff00000a5e7f30
[ 305.401315] Task stack: [0xffff00000a5e8000..0xffff00000a5ec000]
[ 305.403815] IRQ stack: [0xffff000008000000..0xffff000008004000]
[ 305.407035] Overflow stack: [0xffff80003efce4e0..0xffff80003efcf4e0]
[ 305.409622] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[ 305.412785] Hardware name: linux,dummy-virt (DT)
[ 305.415756] task: ffff80003d051c00 task.stack: ffff00000a5e8000
[ 305.419221] PC is at recursive_loop+0x10/0x48
[ 305.421637] LR is at recursive_loop+0x38/0x48
[ 305.423768] pc : [<ffff00000859f330>] lr : [<ffff00000859f358>] pstate: 40000145
[ 305.428020] sp : ffff00000a5e7f50
[ 305.430469] x29: ffff00000a5e8350 x28: ffff80003d051c00
[ 305.433191] x27: ffff000008981000 x26: ffff000008f80400
[ 305.439012] x25: ffff00000a5ebeb8 x24: ffff00000a5ebeb8
[ 305.440369] x23: ffff000008f80138 x22: 0000000000000009
[ 305.442241] x21: ffff80003ce65000 x20: ffff000008f80188
[ 305.444552] x19: 0000000000000013 x18: 0000000000000006
[ 305.446032] x17: 0000ffffa2601280 x16: ffff0000081fe0b8
[ 305.448252] x15: ffff000008ff546d x14: 000000000047a4c8
[ 305.450246] x13: ffff000008ff7872 x12: 0000000005f5e0ff
[ 305.452953] x11: ffff000008ed2548 x10: 000000000005ee8d
[ 305.454824] x9 : ffff000008545380 x8 : ffff00000a5e8770
[ 305.457105] x7 : 1313131313131313 x6 : 00000000000000e1
[ 305.459285] x5 : 0000000000000000 x4 : 0000000000000000
[ 305.461781] x3 : 0000000000000000 x2 : 0000000000000400
[ 305.465119] x1 : 0000000000000013 x0 : 0000000000000012
[ 305.467724] Kernel panic - not syncing: kernel stack overflow
[ 305.470561] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[ 305.473325] Hardware name: linux,dummy-virt (DT)
[ 305.475070] Call trace:
[ 305.476116] [<ffff000008088ad8>] dump_backtrace+0x0/0x378
[ 305.478991] [<ffff000008088e64>] show_stack+0x14/0x20
[ 305.481237] [<ffff00000895a178>] dump_stack+0x98/0xb8
[ 305.483294] [<ffff0000080c3288>] panic+0x118/0x280
[ 305.485673] [<ffff0000080c2e9c>] nmi_panic+0x6c/0x70
[ 305.486216] [<ffff000008089710>] handle_bad_stack+0x118/0x128
[ 305.486612] Exception stack(0xffff80003efcf3a0 to 0xffff80003efcf4e0)
[ 305.487334] f3a0: 0000000000000012 0000000000000013 0000000000000400 0000000000000000
[ 305.488025] f3c0: 0000000000000000 0000000000000000 00000000000000e1 1313131313131313
[ 305.488908] f3e0: ffff00000a5e8770 ffff000008545380 000000000005ee8d ffff000008ed2548
[ 305.489403] f400: 0000000005f5e0ff ffff000008ff7872 000000000047a4c8 ffff000008ff546d
[ 305.489759] f420: ffff0000081fe0b8 0000ffffa2601280 0000000000000006 0000000000000013
[ 305.490256] f440: ffff000008f80188 ffff80003ce65000 0000000000000009 ffff000008f80138
[ 305.490683] f460: ffff00000a5ebeb8 ffff00000a5ebeb8 ffff000008f80400 ffff000008981000
[ 305.491051] f480: ffff80003d051c00 ffff00000a5e8350 ffff00000859f358 ffff00000a5e7f50
[ 305.491444] f4a0: ffff00000859f330 0000000040000145 0000000000000000 0000000000000000
[ 305.492008] f4c0: 0001000000000000 0000000000000000 ffff00000a5e8350 ffff00000859f330
[ 305.493063] [<ffff00000808205c>] __bad_stack+0x88/0x8c
[ 305.493396] [<ffff00000859f330>] recursive_loop+0x10/0x48
[ 305.493731] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494088] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494425] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494649] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494898] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495205] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495453] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495708] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496000] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496302] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496644] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496894] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.497138] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.497325] [<ffff00000859f3dc>] lkdtm_OVERFLOW+0x14/0x20
[ 305.497506] [<ffff00000859f314>] lkdtm_do_action+0x1c/0x28
[ 305.497786] [<ffff00000859f178>] direct_entry+0xe0/0x170
[ 305.498095] [<ffff000008345568>] full_proxy_write+0x60/0xa8
[ 305.498387] [<ffff0000081fb7f4>] __vfs_write+0x1c/0x128
[ 305.498679] [<ffff0000081fcc68>] vfs_write+0xa0/0x1b0
[ 305.498926] [<ffff0000081fe0fc>] SyS_write+0x44/0xa0
[ 305.499182] Exception stack(0xffff00000a5ebec0 to 0xffff00000a5ec000)
[ 305.499429] bec0: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[ 305.499674] bee0: 574f4c465245564f 0000000000000000 0000000000000000 8000000080808080
[ 305.499904] bf00: 0000000000000040 0000000000000038 fefefeff1b4bc2ff 7f7f7f7f7f7fff7f
[ 305.500189] bf20: 0101010101010101 0000000000000000 000000000047a4c8 0000000000000038
[ 305.500712] bf40: 0000000000000000 0000ffffa2601280 0000ffffc63f6068 00000000004b5000
[ 305.501241] bf60: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[ 305.501791] bf80: 0000000000000020 0000000000000000 00000000004b5000 000000001c4cc458
[ 305.502314] bfa0: 0000000000000000 0000ffffc63f7950 000000000040a3c4 0000ffffc63f70e0
[ 305.502762] bfc0: 0000ffffa2601268 0000000080000000 0000000000000001 0000000000000040
[ 305.503207] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 305.503680] [<ffff000008082fb0>] el0_svc_naked+0x24/0x28
[ 305.504720] Kernel Offset: disabled
[ 305.505189] CPU features: 0x002082
[ 305.505473] Memory Limit: none
[ 305.506181] ---[ end Kernel panic - not syncing: kernel stack overflow
This patch was co-authored by Ard Biesheuvel and Mark Rutland.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-07-14 20:30:35 +01:00
console_verbose ( ) ;
pr_emerg ( " Insufficient stack space to handle exception! " ) ;
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
pr_emerg ( " ESR: 0x%016lx -- %s \n " , esr , esr_get_class_string ( esr ) ) ;
arm64: add VMAP_STACK overflow detection
This patch adds stack overflow detection to arm64, usable when vmap'd stacks
are in use.
Overflow is detected in a small preamble executed for each exception entry,
which checks whether there is enough space on the current stack for the general
purpose registers to be saved. If there is not enough space, the overflow
handler is invoked on a per-cpu overflow stack. This approach preserves the
original exception information in ESR_EL1 (and where appropriate, FAR_EL1).
Task and IRQ stacks are aligned to double their size, enabling overflow to be
detected with a single bit test. For example, a 16K stack is aligned to 32K,
ensuring that bit 14 of the SP must be zero. On an overflow (or underflow),
this bit is flipped. Thus, overflow (of less than the size of the stack) can be
detected by testing whether this bit is set.
The overflow check is performed before any attempt is made to access the
stack, avoiding recursive faults (and the loss of exception information
these would entail). As logical operations cannot be performed on the SP
directly, the SP is temporarily swapped with a general purpose register
using arithmetic operations to enable the test to be performed.
This gives us a useful error message on stack overflow, as can be trigger with
the LKDTM overflow test:
[ 305.388749] lkdtm: Performing direct entry OVERFLOW
[ 305.395444] Insufficient stack space to handle exception!
[ 305.395482] ESR: 0x96000047 -- DABT (current EL)
[ 305.399890] FAR: 0xffff00000a5e7f30
[ 305.401315] Task stack: [0xffff00000a5e8000..0xffff00000a5ec000]
[ 305.403815] IRQ stack: [0xffff000008000000..0xffff000008004000]
[ 305.407035] Overflow stack: [0xffff80003efce4e0..0xffff80003efcf4e0]
[ 305.409622] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[ 305.412785] Hardware name: linux,dummy-virt (DT)
[ 305.415756] task: ffff80003d051c00 task.stack: ffff00000a5e8000
[ 305.419221] PC is at recursive_loop+0x10/0x48
[ 305.421637] LR is at recursive_loop+0x38/0x48
[ 305.423768] pc : [<ffff00000859f330>] lr : [<ffff00000859f358>] pstate: 40000145
[ 305.428020] sp : ffff00000a5e7f50
[ 305.430469] x29: ffff00000a5e8350 x28: ffff80003d051c00
[ 305.433191] x27: ffff000008981000 x26: ffff000008f80400
[ 305.439012] x25: ffff00000a5ebeb8 x24: ffff00000a5ebeb8
[ 305.440369] x23: ffff000008f80138 x22: 0000000000000009
[ 305.442241] x21: ffff80003ce65000 x20: ffff000008f80188
[ 305.444552] x19: 0000000000000013 x18: 0000000000000006
[ 305.446032] x17: 0000ffffa2601280 x16: ffff0000081fe0b8
[ 305.448252] x15: ffff000008ff546d x14: 000000000047a4c8
[ 305.450246] x13: ffff000008ff7872 x12: 0000000005f5e0ff
[ 305.452953] x11: ffff000008ed2548 x10: 000000000005ee8d
[ 305.454824] x9 : ffff000008545380 x8 : ffff00000a5e8770
[ 305.457105] x7 : 1313131313131313 x6 : 00000000000000e1
[ 305.459285] x5 : 0000000000000000 x4 : 0000000000000000
[ 305.461781] x3 : 0000000000000000 x2 : 0000000000000400
[ 305.465119] x1 : 0000000000000013 x0 : 0000000000000012
[ 305.467724] Kernel panic - not syncing: kernel stack overflow
[ 305.470561] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[ 305.473325] Hardware name: linux,dummy-virt (DT)
[ 305.475070] Call trace:
[ 305.476116] [<ffff000008088ad8>] dump_backtrace+0x0/0x378
[ 305.478991] [<ffff000008088e64>] show_stack+0x14/0x20
[ 305.481237] [<ffff00000895a178>] dump_stack+0x98/0xb8
[ 305.483294] [<ffff0000080c3288>] panic+0x118/0x280
[ 305.485673] [<ffff0000080c2e9c>] nmi_panic+0x6c/0x70
[ 305.486216] [<ffff000008089710>] handle_bad_stack+0x118/0x128
[ 305.486612] Exception stack(0xffff80003efcf3a0 to 0xffff80003efcf4e0)
[ 305.487334] f3a0: 0000000000000012 0000000000000013 0000000000000400 0000000000000000
[ 305.488025] f3c0: 0000000000000000 0000000000000000 00000000000000e1 1313131313131313
[ 305.488908] f3e0: ffff00000a5e8770 ffff000008545380 000000000005ee8d ffff000008ed2548
[ 305.489403] f400: 0000000005f5e0ff ffff000008ff7872 000000000047a4c8 ffff000008ff546d
[ 305.489759] f420: ffff0000081fe0b8 0000ffffa2601280 0000000000000006 0000000000000013
[ 305.490256] f440: ffff000008f80188 ffff80003ce65000 0000000000000009 ffff000008f80138
[ 305.490683] f460: ffff00000a5ebeb8 ffff00000a5ebeb8 ffff000008f80400 ffff000008981000
[ 305.491051] f480: ffff80003d051c00 ffff00000a5e8350 ffff00000859f358 ffff00000a5e7f50
[ 305.491444] f4a0: ffff00000859f330 0000000040000145 0000000000000000 0000000000000000
[ 305.492008] f4c0: 0001000000000000 0000000000000000 ffff00000a5e8350 ffff00000859f330
[ 305.493063] [<ffff00000808205c>] __bad_stack+0x88/0x8c
[ 305.493396] [<ffff00000859f330>] recursive_loop+0x10/0x48
[ 305.493731] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494088] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494425] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494649] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494898] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495205] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495453] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495708] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496000] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496302] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496644] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496894] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.497138] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.497325] [<ffff00000859f3dc>] lkdtm_OVERFLOW+0x14/0x20
[ 305.497506] [<ffff00000859f314>] lkdtm_do_action+0x1c/0x28
[ 305.497786] [<ffff00000859f178>] direct_entry+0xe0/0x170
[ 305.498095] [<ffff000008345568>] full_proxy_write+0x60/0xa8
[ 305.498387] [<ffff0000081fb7f4>] __vfs_write+0x1c/0x128
[ 305.498679] [<ffff0000081fcc68>] vfs_write+0xa0/0x1b0
[ 305.498926] [<ffff0000081fe0fc>] SyS_write+0x44/0xa0
[ 305.499182] Exception stack(0xffff00000a5ebec0 to 0xffff00000a5ec000)
[ 305.499429] bec0: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[ 305.499674] bee0: 574f4c465245564f 0000000000000000 0000000000000000 8000000080808080
[ 305.499904] bf00: 0000000000000040 0000000000000038 fefefeff1b4bc2ff 7f7f7f7f7f7fff7f
[ 305.500189] bf20: 0101010101010101 0000000000000000 000000000047a4c8 0000000000000038
[ 305.500712] bf40: 0000000000000000 0000ffffa2601280 0000ffffc63f6068 00000000004b5000
[ 305.501241] bf60: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[ 305.501791] bf80: 0000000000000020 0000000000000000 00000000004b5000 000000001c4cc458
[ 305.502314] bfa0: 0000000000000000 0000ffffc63f7950 000000000040a3c4 0000ffffc63f70e0
[ 305.502762] bfc0: 0000ffffa2601268 0000000080000000 0000000000000001 0000000000000040
[ 305.503207] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 305.503680] [<ffff000008082fb0>] el0_svc_naked+0x24/0x28
[ 305.504720] Kernel Offset: disabled
[ 305.505189] CPU features: 0x002082
[ 305.505473] Memory Limit: none
[ 305.506181] ---[ end Kernel panic - not syncing: kernel stack overflow
This patch was co-authored by Ard Biesheuvel and Mark Rutland.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-07-14 20:30:35 +01:00
pr_emerg ( " FAR: 0x%016lx \n " , far ) ;
pr_emerg ( " Task stack: [0x%016lx..0x%016lx] \n " ,
tsk_stk , tsk_stk + THREAD_SIZE ) ;
pr_emerg ( " IRQ stack: [0x%016lx..0x%016lx] \n " ,
2020-07-31 17:19:50 +05:30
irq_stk , irq_stk + IRQ_STACK_SIZE ) ;
arm64: add VMAP_STACK overflow detection
This patch adds stack overflow detection to arm64, usable when vmap'd stacks
are in use.
Overflow is detected in a small preamble executed for each exception entry,
which checks whether there is enough space on the current stack for the general
purpose registers to be saved. If there is not enough space, the overflow
handler is invoked on a per-cpu overflow stack. This approach preserves the
original exception information in ESR_EL1 (and where appropriate, FAR_EL1).
Task and IRQ stacks are aligned to double their size, enabling overflow to be
detected with a single bit test. For example, a 16K stack is aligned to 32K,
ensuring that bit 14 of the SP must be zero. On an overflow (or underflow),
this bit is flipped. Thus, overflow (of less than the size of the stack) can be
detected by testing whether this bit is set.
The overflow check is performed before any attempt is made to access the
stack, avoiding recursive faults (and the loss of exception information
these would entail). As logical operations cannot be performed on the SP
directly, the SP is temporarily swapped with a general purpose register
using arithmetic operations to enable the test to be performed.
This gives us a useful error message on stack overflow, as can be trigger with
the LKDTM overflow test:
[ 305.388749] lkdtm: Performing direct entry OVERFLOW
[ 305.395444] Insufficient stack space to handle exception!
[ 305.395482] ESR: 0x96000047 -- DABT (current EL)
[ 305.399890] FAR: 0xffff00000a5e7f30
[ 305.401315] Task stack: [0xffff00000a5e8000..0xffff00000a5ec000]
[ 305.403815] IRQ stack: [0xffff000008000000..0xffff000008004000]
[ 305.407035] Overflow stack: [0xffff80003efce4e0..0xffff80003efcf4e0]
[ 305.409622] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[ 305.412785] Hardware name: linux,dummy-virt (DT)
[ 305.415756] task: ffff80003d051c00 task.stack: ffff00000a5e8000
[ 305.419221] PC is at recursive_loop+0x10/0x48
[ 305.421637] LR is at recursive_loop+0x38/0x48
[ 305.423768] pc : [<ffff00000859f330>] lr : [<ffff00000859f358>] pstate: 40000145
[ 305.428020] sp : ffff00000a5e7f50
[ 305.430469] x29: ffff00000a5e8350 x28: ffff80003d051c00
[ 305.433191] x27: ffff000008981000 x26: ffff000008f80400
[ 305.439012] x25: ffff00000a5ebeb8 x24: ffff00000a5ebeb8
[ 305.440369] x23: ffff000008f80138 x22: 0000000000000009
[ 305.442241] x21: ffff80003ce65000 x20: ffff000008f80188
[ 305.444552] x19: 0000000000000013 x18: 0000000000000006
[ 305.446032] x17: 0000ffffa2601280 x16: ffff0000081fe0b8
[ 305.448252] x15: ffff000008ff546d x14: 000000000047a4c8
[ 305.450246] x13: ffff000008ff7872 x12: 0000000005f5e0ff
[ 305.452953] x11: ffff000008ed2548 x10: 000000000005ee8d
[ 305.454824] x9 : ffff000008545380 x8 : ffff00000a5e8770
[ 305.457105] x7 : 1313131313131313 x6 : 00000000000000e1
[ 305.459285] x5 : 0000000000000000 x4 : 0000000000000000
[ 305.461781] x3 : 0000000000000000 x2 : 0000000000000400
[ 305.465119] x1 : 0000000000000013 x0 : 0000000000000012
[ 305.467724] Kernel panic - not syncing: kernel stack overflow
[ 305.470561] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[ 305.473325] Hardware name: linux,dummy-virt (DT)
[ 305.475070] Call trace:
[ 305.476116] [<ffff000008088ad8>] dump_backtrace+0x0/0x378
[ 305.478991] [<ffff000008088e64>] show_stack+0x14/0x20
[ 305.481237] [<ffff00000895a178>] dump_stack+0x98/0xb8
[ 305.483294] [<ffff0000080c3288>] panic+0x118/0x280
[ 305.485673] [<ffff0000080c2e9c>] nmi_panic+0x6c/0x70
[ 305.486216] [<ffff000008089710>] handle_bad_stack+0x118/0x128
[ 305.486612] Exception stack(0xffff80003efcf3a0 to 0xffff80003efcf4e0)
[ 305.487334] f3a0: 0000000000000012 0000000000000013 0000000000000400 0000000000000000
[ 305.488025] f3c0: 0000000000000000 0000000000000000 00000000000000e1 1313131313131313
[ 305.488908] f3e0: ffff00000a5e8770 ffff000008545380 000000000005ee8d ffff000008ed2548
[ 305.489403] f400: 0000000005f5e0ff ffff000008ff7872 000000000047a4c8 ffff000008ff546d
[ 305.489759] f420: ffff0000081fe0b8 0000ffffa2601280 0000000000000006 0000000000000013
[ 305.490256] f440: ffff000008f80188 ffff80003ce65000 0000000000000009 ffff000008f80138
[ 305.490683] f460: ffff00000a5ebeb8 ffff00000a5ebeb8 ffff000008f80400 ffff000008981000
[ 305.491051] f480: ffff80003d051c00 ffff00000a5e8350 ffff00000859f358 ffff00000a5e7f50
[ 305.491444] f4a0: ffff00000859f330 0000000040000145 0000000000000000 0000000000000000
[ 305.492008] f4c0: 0001000000000000 0000000000000000 ffff00000a5e8350 ffff00000859f330
[ 305.493063] [<ffff00000808205c>] __bad_stack+0x88/0x8c
[ 305.493396] [<ffff00000859f330>] recursive_loop+0x10/0x48
[ 305.493731] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494088] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494425] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494649] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494898] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495205] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495453] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495708] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496000] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496302] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496644] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496894] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.497138] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.497325] [<ffff00000859f3dc>] lkdtm_OVERFLOW+0x14/0x20
[ 305.497506] [<ffff00000859f314>] lkdtm_do_action+0x1c/0x28
[ 305.497786] [<ffff00000859f178>] direct_entry+0xe0/0x170
[ 305.498095] [<ffff000008345568>] full_proxy_write+0x60/0xa8
[ 305.498387] [<ffff0000081fb7f4>] __vfs_write+0x1c/0x128
[ 305.498679] [<ffff0000081fcc68>] vfs_write+0xa0/0x1b0
[ 305.498926] [<ffff0000081fe0fc>] SyS_write+0x44/0xa0
[ 305.499182] Exception stack(0xffff00000a5ebec0 to 0xffff00000a5ec000)
[ 305.499429] bec0: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[ 305.499674] bee0: 574f4c465245564f 0000000000000000 0000000000000000 8000000080808080
[ 305.499904] bf00: 0000000000000040 0000000000000038 fefefeff1b4bc2ff 7f7f7f7f7f7fff7f
[ 305.500189] bf20: 0101010101010101 0000000000000000 000000000047a4c8 0000000000000038
[ 305.500712] bf40: 0000000000000000 0000ffffa2601280 0000ffffc63f6068 00000000004b5000
[ 305.501241] bf60: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[ 305.501791] bf80: 0000000000000020 0000000000000000 00000000004b5000 000000001c4cc458
[ 305.502314] bfa0: 0000000000000000 0000ffffc63f7950 000000000040a3c4 0000ffffc63f70e0
[ 305.502762] bfc0: 0000ffffa2601268 0000000080000000 0000000000000001 0000000000000040
[ 305.503207] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 305.503680] [<ffff000008082fb0>] el0_svc_naked+0x24/0x28
[ 305.504720] Kernel Offset: disabled
[ 305.505189] CPU features: 0x002082
[ 305.505473] Memory Limit: none
[ 305.506181] ---[ end Kernel panic - not syncing: kernel stack overflow
This patch was co-authored by Ard Biesheuvel and Mark Rutland.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-07-14 20:30:35 +01:00
pr_emerg ( " Overflow stack: [0x%016lx..0x%016lx] \n " ,
ovf_stk , ovf_stk + OVERFLOW_STACK_SIZE ) ;
__show_regs ( regs ) ;
/*
* We use nmi_panic to limit the potential for recusive overflows , and
* to get a better stack trace .
*/
nmi_panic ( NULL , " kernel stack overflow " ) ;
cpu_park_loop ( ) ;
}
# endif
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
void __noreturn arm64_serror_panic ( struct pt_regs * regs , unsigned long esr )
2017-11-02 12:12:42 +00:00
{
console_verbose ( ) ;
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
pr_crit ( " SError Interrupt on CPU%d, code 0x%016lx -- %s \n " ,
2017-11-02 12:12:42 +00:00
smp_processor_id ( ) , esr , esr_get_class_string ( esr ) ) ;
2018-01-15 19:38:57 +00:00
if ( regs )
__show_regs ( regs ) ;
nmi_panic ( regs , " Asynchronous SError Interrupt " ) ;
cpu_park_loop ( ) ;
}
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
bool arm64_is_fatal_ras_serror ( struct pt_regs * regs , unsigned long esr )
2018-01-15 19:38:57 +00:00
{
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
unsigned long aet = arm64_ras_serror_get_severity ( esr ) ;
2018-01-15 19:38:57 +00:00
switch ( aet ) {
case ESR_ELx_AET_CE : /* corrected error */
case ESR_ELx_AET_UEO : /* restartable, not yet consumed */
/*
* The CPU can make progress . We may take UEO again as
* a more severe error .
*/
return false ;
case ESR_ELx_AET_UEU : /* Uncorrected Unrecoverable */
case ESR_ELx_AET_UER : /* Uncorrected Recoverable */
/*
* The CPU can ' t make progress . The exception may have
* been imprecise .
2019-06-18 16:17:38 +01:00
*
* Neoverse - N1 # 1349291 means a non - KVM SError reported as
* Unrecoverable should be treated as Uncontainable . We
* call arm64_serror_panic ( ) in both cases .
2018-01-15 19:38:57 +00:00
*/
return true ;
case ESR_ELx_AET_UC : /* Uncontainable or Uncategorized error */
default :
/* Error has been silently propagated */
arm64_serror_panic ( regs , esr ) ;
}
}
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
void do_serror ( struct pt_regs * regs , unsigned long esr )
2018-01-15 19:38:57 +00:00
{
/* non-RAS errors are not containable */
if ( ! arm64_is_ras_serror ( esr ) | | arm64_is_fatal_ras_serror ( regs , esr ) )
arm64_serror_panic ( regs , esr ) ;
2019-08-20 18:45:57 +01:00
}
2015-07-24 16:37:48 +01:00
/* GENERIC_BUG traps */
2023-05-16 18:06:36 +02:00
# ifdef CONFIG_GENERIC_BUG
2015-07-24 16:37:48 +01:00
int is_valid_bugaddr ( unsigned long addr )
{
/*
* bug_handler ( ) only called for BRK # BUG_BRK_IMM .
* So the answer is trivial - - any spurious instances with no
* bug table entry will be rejected by report_bug ( ) and passed
* back to the debug - monitors code and handled as a fatal
* unexpected debug exception .
*/
return 1 ;
}
2023-05-16 18:06:36 +02:00
# endif
2015-07-24 16:37:48 +01:00
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
static int bug_handler ( struct pt_regs * regs , unsigned long esr )
2015-07-24 16:37:48 +01:00
{
switch ( report_bug ( regs - > pc , regs ) ) {
case BUG_TRAP_TYPE_BUG :
2022-09-13 11:17:30 +01:00
die ( " Oops - BUG " , regs , esr ) ;
2015-07-24 16:37:48 +01:00
break ;
case BUG_TRAP_TYPE_WARN :
break ;
default :
/* unknown/unrecognised bug trap type */
return DBG_HOOK_ERROR ;
}
/* If thread survives, skip over the BUG instruction and continue: */
2017-10-25 10:04:33 +01:00
arm64_skip_faulting_instruction ( regs , AARCH64_INSN_SIZE ) ;
2015-07-24 16:37:48 +01:00
return DBG_HOOK_HANDLED ;
}
static struct break_hook bug_break_hook = {
. fn = bug_handler ,
2019-02-26 12:52:47 +00:00
. imm = BUG_BRK_IMM ,
2015-07-24 16:37:48 +01:00
} ;
2022-09-08 14:54:52 -07:00
# ifdef CONFIG_CFI_CLANG
static int cfi_handler ( struct pt_regs * regs , unsigned long esr )
{
unsigned long target ;
u32 type ;
target = pt_regs_read_reg ( regs , FIELD_GET ( CFI_BRK_IMM_TARGET , esr ) ) ;
type = ( u32 ) pt_regs_read_reg ( regs , FIELD_GET ( CFI_BRK_IMM_TYPE , esr ) ) ;
switch ( report_cfi_failure ( regs , regs - > pc , & target , type ) ) {
case BUG_TRAP_TYPE_BUG :
2023-02-20 16:34:41 +09:00
die ( " Oops - CFI " , regs , esr ) ;
2022-09-08 14:54:52 -07:00
break ;
case BUG_TRAP_TYPE_WARN :
break ;
default :
return DBG_HOOK_ERROR ;
}
arm64_skip_faulting_instruction ( regs , AARCH64_INSN_SIZE ) ;
return DBG_HOOK_HANDLED ;
}
static struct break_hook cfi_break_hook = {
. fn = cfi_handler ,
. imm = CFI_BRK_IMM_BASE ,
. mask = CFI_BRK_IMM_MASK ,
} ;
# endif /* CONFIG_CFI_CLANG */
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
static int reserved_fault_handler ( struct pt_regs * regs , unsigned long esr )
2020-09-15 15:48:09 +01:00
{
pr_err ( " %s generated an invalid instruction at %pS! \n " ,
2021-11-05 16:50:45 +00:00
" Kernel text patching " ,
2020-09-15 15:48:09 +01:00
( void * ) instruction_pointer ( regs ) ) ;
/* We cannot handle this */
return DBG_HOOK_ERROR ;
}
static struct break_hook fault_break_hook = {
. fn = reserved_fault_handler ,
. imm = FAULT_BRK_IMM ,
} ;
2018-12-28 00:30:54 -08:00
# ifdef CONFIG_KASAN_SW_TAGS
# define KASAN_ESR_RECOVER 0x20
# define KASAN_ESR_WRITE 0x10
# define KASAN_ESR_SIZE_MASK 0x0f
# define KASAN_ESR_SIZE(esr) (1 << ((esr) & KASAN_ESR_SIZE_MASK))
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
static int kasan_handler ( struct pt_regs * regs , unsigned long esr )
2018-12-28 00:30:54 -08:00
{
bool recover = esr & KASAN_ESR_RECOVER ;
bool write = esr & KASAN_ESR_WRITE ;
size_t size = KASAN_ESR_SIZE ( esr ) ;
kasan: use internal prototypes matching gcc-13 builtins
gcc-13 warns about function definitions for builtin interfaces that have a
different prototype, e.g.:
In file included from kasan_test.c:31:
kasan.h:574:6: error: conflicting types for built-in function '__asan_register_globals'; expected 'void(void *, long int)' [-Werror=builtin-declaration-mismatch]
574 | void __asan_register_globals(struct kasan_global *globals, size_t size);
kasan.h:577:6: error: conflicting types for built-in function '__asan_alloca_poison'; expected 'void(void *, long int)' [-Werror=builtin-declaration-mismatch]
577 | void __asan_alloca_poison(unsigned long addr, size_t size);
kasan.h:580:6: error: conflicting types for built-in function '__asan_load1'; expected 'void(void *)' [-Werror=builtin-declaration-mismatch]
580 | void __asan_load1(unsigned long addr);
kasan.h:581:6: error: conflicting types for built-in function '__asan_store1'; expected 'void(void *)' [-Werror=builtin-declaration-mismatch]
581 | void __asan_store1(unsigned long addr);
kasan.h:643:6: error: conflicting types for built-in function '__hwasan_tag_memory'; expected 'void(void *, unsigned char, long int)' [-Werror=builtin-declaration-mismatch]
643 | void __hwasan_tag_memory(unsigned long addr, u8 tag, unsigned long size);
The two problems are:
- Addresses are passes as 'unsigned long' in the kernel, but gcc-13
expects a 'void *'.
- sizes meant to use a signed ssize_t rather than size_t.
Change all the prototypes to match these. Using 'void *' consistently for
addresses gets rid of a couple of type casts, so push that down to the
leaf functions where possible.
This now passes all randconfig builds on arm, arm64 and x86, but I have
not tested it on the other architectures that support kasan, since they
tend to fail randconfig builds in other ways. This might fail if any of
the 32-bit architectures expect a 'long' instead of 'int' for the size
argument.
The __asan_allocas_unpoison() function prototype is somewhat weird, since
it uses a pointer for 'stack_top' and an size_t for 'stack_bottom'. This
looks like it is meant to be 'addr' and 'size' like the others, but the
implementation clearly treats them as 'top' and 'bottom'.
Link: https://lkml.kernel.org/r/20230509145735.9263-2-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-09 16:57:21 +02:00
void * addr = ( void * ) regs - > regs [ 0 ] ;
2018-12-28 00:30:54 -08:00
u64 pc = regs - > pc ;
kasan_report ( addr , size , write , pc ) ;
/*
* The instrumentation allows to control whether we can proceed after
* a crash was detected . This is done by passing the - recover flag to
* the compiler . Disabling recovery allows to generate more compact
* code .
*
* Unfortunately disabling recovery doesn ' t work for the kernel right
* now . KASAN reporting is disabled in some contexts ( for example when
* the allocator accesses slab object metadata ; this is controlled by
* current - > kasan_depth ) . All these accesses are detected by the tool ,
* even though the reports for them are not printed .
*
* This is something that might be fixed at some point in the future .
*/
if ( ! recover )
2022-09-13 11:17:30 +01:00
die ( " Oops - KASAN " , regs , esr ) ;
2018-12-28 00:30:54 -08:00
/* If thread survives, skip over the brk instruction and continue: */
arm64_skip_faulting_instruction ( regs , AARCH64_INSN_SIZE ) ;
return DBG_HOOK_HANDLED ;
}
static struct break_hook kasan_break_hook = {
2019-02-26 12:52:47 +00:00
. fn = kasan_handler ,
. imm = KASAN_BRK_IMM ,
. mask = KASAN_BRK_MASK ,
2018-12-28 00:30:54 -08:00
} ;
# endif
2023-02-02 22:07:49 +00:00
# ifdef CONFIG_UBSAN_TRAP
static int ubsan_handler ( struct pt_regs * regs , unsigned long esr )
{
die ( report_ubsan_failure ( regs , esr & UBSAN_BRK_MASK ) , regs , esr ) ;
return DBG_HOOK_HANDLED ;
}
static struct break_hook ubsan_break_hook = {
. fn = ubsan_handler ,
. imm = UBSAN_BRK_IMM ,
. mask = UBSAN_BRK_MASK ,
} ;
# endif
2022-09-08 14:54:52 -07:00
# define esr_comment(esr) ((esr) & ESR_ELx_BRK64_ISS_COMMENT_MASK)
2015-07-24 16:37:48 +01:00
/*
* Initial handler for AArch64 BRK exceptions
* This handler only used until debug_traps_init ( ) .
*/
arm64: Treat ESR_ELx as a 64-bit register
In the initial release of the ARM Architecture Reference Manual for
ARMv8-A, the ESR_ELx registers were defined as 32-bit registers. This
changed in 2018 with version D.a (ARM DDI 0487D.a) of the architecture,
when they became 64-bit registers, with bits [63:32] defined as RES0. In
version G.a, a new field was added to ESR_ELx, ISS2, which covers bits
[36:32]. This field is used when the Armv8.7 extension FEAT_LS64 is
implemented.
As a result of the evolution of the register width, Linux stores it as
both a 64-bit value and a 32-bit value, which hasn't affected correctness
so far as Linux only uses the lower 32 bits of the register.
Make the register type consistent and always treat it as 64-bit wide. The
register is redefined as an "unsigned long", which is an unsigned
double-word (64-bit quantity) for the LP64 machine (aapcs64 [1], Table 1,
page 14). The type was chosen because "unsigned int" is the most frequent
type for ESR_ELx and because FAR_ELx, which is used together with ESR_ELx
in exception handling, is also declared as "unsigned long". The 64-bit type
also makes adding support for architectural features that use fields above
bit 31 easier in the future.
The KVM hypervisor will receive a similar update in a subsequent patch.
[1] https://github.com/ARM-software/abi-aa/releases/download/2021Q3/aapcs64.pdf
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425114444.368693-4-alexandru.elisei@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-25 12:44:42 +01:00
int __init early_brk64 ( unsigned long addr , unsigned long esr ,
2015-07-24 16:37:48 +01:00
struct pt_regs * regs )
{
2022-09-08 14:54:52 -07:00
# ifdef CONFIG_CFI_CLANG
if ( ( esr_comment ( esr ) & ~ CFI_BRK_IMM_MASK ) = = CFI_BRK_IMM_BASE )
return cfi_handler ( regs , esr ) ! = DBG_HOOK_HANDLED ;
# endif
2018-12-28 00:30:54 -08:00
# ifdef CONFIG_KASAN_SW_TAGS
2022-09-08 14:54:52 -07:00
if ( ( esr_comment ( esr ) & ~ KASAN_BRK_MASK ) = = KASAN_BRK_IMM )
2018-12-28 00:30:54 -08:00
return kasan_handler ( regs , esr ) ! = DBG_HOOK_HANDLED ;
2023-02-02 22:07:49 +00:00
# endif
# ifdef CONFIG_UBSAN_TRAP
if ( ( esr_comment ( esr ) & ~ UBSAN_BRK_MASK ) = = UBSAN_BRK_IMM )
return ubsan_handler ( regs , esr ) ! = DBG_HOOK_HANDLED ;
2018-12-28 00:30:54 -08:00
# endif
2015-07-24 16:37:48 +01:00
return bug_handler ( regs , esr ) ! = DBG_HOOK_HANDLED ;
}
2012-03-05 11:49:27 +00:00
void __init trap_init ( void )
{
2019-02-26 12:52:47 +00:00
register_kernel_break_hook ( & bug_break_hook ) ;
2022-09-08 14:54:52 -07:00
# ifdef CONFIG_CFI_CLANG
register_kernel_break_hook ( & cfi_break_hook ) ;
# endif
2020-09-15 15:48:09 +01:00
register_kernel_break_hook ( & fault_break_hook ) ;
2018-12-28 00:30:54 -08:00
# ifdef CONFIG_KASAN_SW_TAGS
2019-02-26 12:52:47 +00:00
register_kernel_break_hook ( & kasan_break_hook ) ;
2023-02-02 22:07:49 +00:00
# endif
# ifdef CONFIG_UBSAN_TRAP
register_kernel_break_hook ( & ubsan_break_hook ) ;
2018-12-28 00:30:54 -08:00
# endif
2020-05-13 16:06:37 -07:00
debug_traps_init ( ) ;
2012-03-05 11:49:27 +00:00
}