/* SPDX-License-Identifier: GPL-2.0-only */
/*
 * Low-level exception handling code
 *
 * Copyright (C) 2012 ARM Ltd.
 * Authors:	Catalin Marinas <catalin.marinas@arm.com>
 *		Will Deacon <will.deacon@arm.com>
 */

#include <linux/arm-smccc.h>
#include <linux/init.h>
#include <linux/linkage.h>

#include <asm/alternative.h>
#include <asm/assembler.h>
#include <asm/asm-offsets.h>
#include <asm/asm_pointer_auth.h>
#include <asm/bug.h>
#include <asm/cpufeature.h>
#include <asm/errno.h>
#include <asm/esr.h>
#include <asm/irq.h>
#include <asm/memory.h>
#include <asm/mmu.h>
#include <asm/processor.h>
#include <asm/ptrace.h>
#include <asm/scs.h>
#include <asm/thread_info.h>
#include <asm/asm-uaccess.h>
#include <asm/unistd.h>
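
/*
 * clear_gp_regs: zero the general purpose registers x0-x29. Used on entry
 * from EL0 so that no stale user values remain live in the GPRs; x30 is
 * stashed or zeroed separately by kernel_ventry.
 */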
	.macro	clear_gp_regs
	.irp	n,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29
	mov	x\n, xzr
.endr
.endm
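
/*
 * kernel_ventry: one 128-byte vector slot. Sets up the stack (including the
 * VMAP_STACK overflow check when enabled) and branches to the matching
 * el\el\ht\()_\regsize\()_\label handler, e.g. el1h_64_sync or el0t_64_irq.
 */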
	.macro kernel_ventry, el:req, ht:req, regsize:req, label:req
	.align 7
.Lventry_start\@:
	.if	\el == 0
	/*
	 * This must be the first instruction of the EL0 vector entries. It is
	 * skipped by the trampoline vectors, to trigger the cleanup.
	 */
	b	.Lskip_tramp_vectors_cleanup\@
	.if	\regsize == 64
	mrs	x30, tpidrro_el0
	msr	tpidrro_el0, xzr
	.else
	mov	x30, xzr
	.endif
.Lskip_tramp_vectors_cleanup\@:
	.endif

	sub	sp, sp, #PT_REGS_SIZE
#ifdef CONFIG_VMAP_STACK
	/*
	 * Test whether the SP has overflowed, without corrupting a GPR.
	 * Task and IRQ stacks are aligned so that SP & (1 << THREAD_SHIFT)
	 * should always be zero.
	 */
	add	sp, sp, x0			// sp' = sp + x0
	sub	x0, sp, x0			// x0' = sp' - x0 = (sp + x0) - x0 = sp
	tbnz	x0, #THREAD_SHIFT, 0f
	sub	x0, sp, x0			// x0'' = sp' - x0' = (sp + x0) - sp = x0
	sub	sp, sp, x0			// sp'' = sp' - x0 = (sp + x0) - x0 = sp
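	/*
	 * For example, with 16KiB stacks THREAD_SHIFT is 14 and stacks are
	 * aligned to 32KiB, so a valid SP always has bit 14 clear. The add/sub
	 * pair above recovers the decremented SP into x0 without needing a
	 * spare register; if bit THREAD_SHIFT is set we have overflowed (or
	 * underflowed) the stack and branch to the overflow path at 0: below,
	 * otherwise the two subs restore x0 and SP and we fall through to the
	 * branch to the real handler.
	 */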
	b	el\el\ht\()_\regsize\()_\label
0:
	/*
	 * Either we've just detected an overflow, or we've taken an exception
	 * while on the overflow stack. Either way, we won't return to
	 * userspace, and can clobber EL0 registers to free up GPRs.
	 */

	/* Stash the original SP (minus PT_REGS_SIZE) in tpidr_el0. */
	msr	tpidr_el0, x0

	/* Recover the original x0 value and stash it in tpidrro_el0 */
	sub	x0, sp, x0
	msr	tpidrro_el0, x0

	/* Switch to the overflow stack */
	adr_this_cpu sp, overflow_stack + OVERFLOW_STACK_SIZE, x0

	/*
	 * Check whether we were already on the overflow stack. This may happen
	 * after panic() re-enables interrupts.
	 */
	mrs	x0, tpidr_el0			// sp of interrupted context
	sub	x0, sp, x0			// delta with top of overflow stack
	tst	x0, #~(OVERFLOW_STACK_SIZE - 1)	// within range?
	b.ne	__bad_stack			// no? -> bad stack pointer

	/* We were already on the overflow stack. Restore sp/x0 and carry on. */
	sub	sp, sp, x0
	mrs	x0, tpidrro_el0
#endif
	b	el\el\ht\()_\regsize\()_\label
.org .Lventry_start\@ + 128	// Did we overflow the ventry slot?
.endm
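
/*
 * tramp_alias: materialize in \dst the address of \sym as seen via the
 * trampoline vector alias mapping at TRAMP_VALIAS, using movz/movk so that
 * no memory load is needed to form the absolute address.
 */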
	.macro tramp_alias, dst, sym
	.set	.Lalias\@, TRAMP_VALIAS + \sym - .entry.tramp.text
	movz	\dst, :abs_g2_s:.Lalias\@
	movk	\dst, :abs_g1_nc:.Lalias\@
	movk	\dst, :abs_g0_nc:.Lalias\@
.endm

	/*
	 * This macro corrupts x0-x3. It is the caller's duty to save/restore
	 * them if required.
	 */
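	/*
	 * apply_ssbd: if the Spectre-v4 mitigation is required on this CPU and
	 * not short-circuited by the task's TIF_SSBD flag, make the
	 * ARM_SMCCC_ARCH_WORKAROUND_2 firmware call (the nop below is patched
	 * to SMC/HVC #0 at runtime) with \state as the requested mitigation
	 * state.
	 */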
	.macro	apply_ssbd, state, tmp1, tmp2
alternative_cb	ARM64_ALWAYS_SYSTEM, spectre_v4_patch_fw_mitigation_enable
	b	.L__asm_ssbd_skip\@		// Patched to NOP
alternative_cb_end
	ldr_this_cpu	\tmp2, arm64_ssbd_callback_required, \tmp1
	cbz	\tmp2, .L__asm_ssbd_skip\@
	ldr	\tmp2, [tsk, #TSK_TI_FLAGS]
	tbnz	\tmp2, #TIF_SSBD, .L__asm_ssbd_skip\@
	mov	w0, #ARM_SMCCC_ARCH_WORKAROUND_2
	mov	w1, #\state
alternative_cb	ARM64_ALWAYS_SYSTEM, smccc_patch_fw_mitigation_conduit
	nop					// Patched to SMC/HVC #0
alternative_cb_end
.L__asm_ssbd_skip\@:
.endm
	/* Check for MTE asynchronous tag check faults */
	.macro check_mte_async_tcf, tmp, ti_flags, thread_sctlr
#ifdef CONFIG_ARM64_MTE
	.arch_extension lse
alternative_if_not ARM64_MTE
	b	1f
alternative_else_nop_endif
	/*
	 * Asynchronous tag check faults are only possible in ASYNC (2) or
	 * ASYM (3) modes. In each of these modes bit 1 of SCTLR_EL1.TCF0 is
	 * set, so skip the check if it is unset.
	 */
	tbz	\thread_sctlr, #(SCTLR_EL1_TCF0_SHIFT + 1), 1f
	mrs_s	\tmp, SYS_TFSRE0_EL1
	tbz	\tmp, #SYS_TFSR_EL1_TF0_SHIFT, 1f
	/* Asynchronous TCF occurred for TTBR0 access, set the TI flag */
	mov	\tmp, #_TIF_MTE_ASYNC_FAULT
	add	\ti_flags, tsk, #TSK_TI_FLAGS
	stset	\tmp, [\ti_flags]
1:
#endif
	.endm

	/* Clear the MTE asynchronous tag check faults */
	.macro clear_mte_async_tcf thread_sctlr
#ifdef CONFIG_ARM64_MTE
alternative_if ARM64_MTE
	/* See comment in check_mte_async_tcf above. */
	tbz	\thread_sctlr, #(SCTLR_EL1_TCF0_SHIFT + 1), 1f
	dsb	ish
	msr_s	SYS_TFSRE0_EL1, xzr
1:
alternative_else_nop_endif
#endif
	.endm
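
/*
 * mte_set_gcr: program GCR_EL1 with the user tag exclusion mask held in
 * \mte_ctrl and with random tag generation (RRND) enabled.
 */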
	.macro mte_set_gcr, mte_ctrl, tmp
#ifdef CONFIG_ARM64_MTE
	ubfx	\tmp, \mte_ctrl, #MTE_CTRL_GCR_USER_EXCL_SHIFT, #16
	orr	\tmp, \tmp, #SYS_GCR_EL1_RRND
	msr_s	SYS_GCR_EL1, \tmp
#endif
.endm
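
/*
 * mte_set_kernel_gcr: when in-kernel MTE (KASAN_HW_TAGS) is enabled at
 * runtime, switch GCR_EL1 to the compile-time KERNEL_GCR_EL1 value used for
 * kernel tag generation.
 */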
	.macro mte_set_kernel_gcr, tmp, tmp2
#ifdef CONFIG_KASAN_HW_TAGS
alternative_cb	ARM64_ALWAYS_SYSTEM, kasan_hw_tags_enable
	b	1f
alternative_cb_end
	mov	\tmp, KERNEL_GCR_EL1
	msr_s	SYS_GCR_EL1, \tmp
1:
#endif
.endm

	.macro mte_set_user_gcr, tsk, tmp, tmp2
#ifdef CONFIG_KASAN_HW_TAGS
alternative_cb	ARM64_ALWAYS_SYSTEM, kasan_hw_tags_enable
	b	1f
alternative_cb_end
	ldr	\tmp, [\tsk, #THREAD_MTE_CTRL]
	mte_set_gcr \tmp, \tmp2
1:
#endif
	.endm
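
/*
 * kernel_entry: save the interrupted context (x0-x29, LR, SP, ELR_EL1 and
 * SPSR_EL1) into the pt_regs frame on the kernel stack and set up per-task
 * state (tsk, single-step masking, MTE tag check faults, pointer auth keys,
 * SSBD and the shadow call stack) before calling into C code.
 */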
	.macro	kernel_entry, el, regsize = 64
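	/*
	 * Enable PSTATE.DIT (data independent timing) for kernel execution
	 * when taking an exception from EL0 on CPUs that implement DIT.
	 */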
	.if	\el == 0
	alternative_insn nop, SET_PSTATE_DIT(1), ARM64_HAS_DIT
	.endif
	.if	\regsize == 32
	mov	w0, w0				// zero upper 32 bits of x0
	.endif
	stp	x0, x1, [sp, #16 * 0]
	stp	x2, x3, [sp, #16 * 1]
	stp	x4, x5, [sp, #16 * 2]
	stp	x6, x7, [sp, #16 * 3]
	stp	x8, x9, [sp, #16 * 4]
	stp	x10, x11, [sp, #16 * 5]
	stp	x12, x13, [sp, #16 * 6]
	stp	x14, x15, [sp, #16 * 7]
	stp	x16, x17, [sp, #16 * 8]
	stp	x18, x19, [sp, #16 * 9]
	stp	x20, x21, [sp, #16 * 10]
	stp	x22, x23, [sp, #16 * 11]
	stp	x24, x25, [sp, #16 * 12]
	stp	x26, x27, [sp, #16 * 13]
	stp	x28, x29, [sp, #16 * 14]

	.if	\el == 0
	clear_gp_regs
	mrs	x21, sp_el0
	ldr_this_cpu	tsk, __entry_task, x20
	msr	sp_el0, tsk

	/*
	 * Ensure MDSCR_EL1.SS is clear, since we can unmask debug exceptions
	 * when scheduling.
	 */
	ldr	x19, [tsk, #TSK_TI_FLAGS]
	disable_step_tsk x19, x20
	/* Check for asynchronous tag check faults in user space */
	ldr	x0, [tsk, THREAD_SCTLR_USER]
	check_mte_async_tcf x22, x23, x0

#ifdef CONFIG_ARM64_PTR_AUTH
alternative_if ARM64_HAS_ADDRESS_AUTH
	/*
	 * Enable IA for in-kernel PAC if the task had it disabled. Although
	 * this could be implemented with an unconditional MRS which would avoid
	 * a load, this was measured to be slower on Cortex-A75 and Cortex-A76.
	 *
	 * Install the kernel IA key only if IA was enabled in the task. If IA
	 * was disabled on kernel exit then we would have left the kernel IA
	 * installed so there is no need to install it again.
	 */
	tbz	x0, SCTLR_ELx_ENIA_SHIFT, 1f
	__ptrauth_keys_install_kernel_nosync tsk, x20, x22, x23
	b	2f
1:
	mrs	x0, sctlr_el1
	orr	x0, x0, SCTLR_ELx_ENIA
	msr	sctlr_el1, x0
2:
alternative_else_nop_endif
#endif
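
	/* Ask firmware to enable the Spectre-v4 (SSBD) mitigation while running in the kernel */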
	apply_ssbd 1, x22, x23

	mte_set_kernel_gcr x22, x23

	/*
	 * Any non-self-synchronizing system register updates required for
	 * kernel entry should be placed before this point.
	 */
alternative_if ARM64_MTE
	isb
	b	1f
alternative_else_nop_endif
alternative_if ARM64_HAS_ADDRESS_AUTH
	isb
alternative_else_nop_endif
1:
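
	/* Load the current task's shadow call stack pointer (no-op unless CONFIG_SHADOW_CALL_STACK) */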
	scs_load_current
	.else
	add	x21, sp, #PT_REGS_SIZE
	get_current_task tsk
	.endif /* \el == 0 */
	mrs	x22, elr_el1
	mrs	x23, spsr_el1
	stp	lr, x21, [sp, #S_LR]

arm64: unwind: reference pt_regs via embedded stack frame
As it turns out, the unwind code is slightly broken, and probably has
been for a while. The problem is in the dumping of the exception stack,
which is intended to dump the contents of the pt_regs struct at each
level in the call stack where an exception was taken and routed to a
routine marked as __exception (which means its stack frame is right
below the pt_regs struct on the stack).
'Right below the pt_regs struct' is ill defined, though: the unwind
code assigns 'frame pointer + 0x10' to the .sp member of the stackframe
struct at each level, and dump_backtrace() happily dereferences that as
the pt_regs pointer when encountering an __exception routine. However,
the actual size of the stack frame created by this routine (which could
be one of many __exception routines we have in the kernel) is not known,
and so frame.sp is pretty useless to figure out where struct pt_regs
really is.
So it seems the only way to ensure that we can find our struct pt_regs
when walking the stack frames is to put it at a known fixed offset of
the stack frame pointer that is passed to such __exception routines.
The simplest way to do that is to put it inside pt_regs itself, which is
the main change implemented by this patch. As a bonus, doing this allows
us to get rid of a fair amount of cruft related to walking from one stack
to the other, which is especially nice since we intend to introduce yet
another stack for overflow handling once we add support for vmapped
stacks. It also fixes an inconsistency where we only add a stack frame
pointing to ELR_EL1 if we are executing from the IRQ stack but not when
we are executing from the task stack.
To consistly identify exceptions regs even in the presence of exceptions
taken from entry code, we must check whether the next frame was created
by entry text, rather than whether the current frame was crated by
exception text.
To avoid backtracing using PCs that fall in the idmap, or are controlled
by userspace, we must explicitly zero the FP and LR in startup paths, and
must ensure that the frame embedded in pt_regs is zeroed upon entry from
EL0. To avoid these NULL entries showing in the backtrace, unwind_frame()
is updated to avoid them.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[Mark: compare current frame against .entry.text, avoid bogus PCs]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
2017-07-22 20:45:33 +03:00
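A hedged C sketch of the layout change described in the commit message above: embed a frame record inside pt_regs so the unwinder can locate the regs at a fixed offset from a frame pointer it already trusts. Field names and the helper macro are illustrative, not the exact kernel definitions.

struct frame_record_sketch {
	unsigned long fp;	/* interrupted x29 (or zero for the final record) */
	unsigned long pc;	/* interrupted ELR, so it shows up in the backtrace */
};

struct pt_regs_sketch {
	unsigned long regs[31];
	unsigned long sp, pc, pstate;
	/* ... other saved exception state ... */
	struct frame_record_sketch stackframe;	/* what #S_STACKFRAME addresses */
};

/* Given the x29 set up by "add x29, sp, #S_STACKFRAME", recover the regs. */
#define regs_from_fp_sketch(fp)						\
	((struct pt_regs_sketch *)((char *)(fp) -			\
		__builtin_offsetof(struct pt_regs_sketch, stackframe)))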
	/*
arm64: Implement stack trace termination record
Reliable stacktracing requires that we identify when a stacktrace is
terminated early. We can do this by ensuring all tasks have a final
frame record at a known location on their task stack, and checking
that this is the final frame record in the chain.
We'd like to use task_pt_regs(task)->stackframe as the final frame
record, as this is already set up upon exception entry from EL0. For
kernel tasks we need to consistently reserve the pt_regs and point x29
at this, which we can do with small changes to __primary_switched,
__secondary_switched, and copy_process().
Since the final frame record must be at a specific location, we must
create the final frame record in __primary_switched and
__secondary_switched rather than leaving this to start_kernel and
secondary_start_kernel. Thus, __primary_switched and
__secondary_switched will now show up in stacktraces for the idle tasks.
Since the final frame record is now identified by its location rather
than by its contents, we identify it at the start of unwind_frame(),
before we read any values from it.
External debuggers may terminate the stack trace when FP == 0. In the
pt_regs->stackframe, the PC is 0 as well. So, stack traces taken in the
debugger may print an extra record 0x0 at the end. While this is not
pretty, this does not do any harm. This is a small price to pay for
having reliable stack trace termination in the kernel. That said, gdb
does not show the extra record probably because it uses DWARF and not
frame pointers for stack traces.
Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
[Mark: rebase, use ASM_BUG(), update comments, update commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210510110026.18061-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-10 14:00:26 +03:00
	 * For exceptions from EL0, create a final frame record.
2021-01-13 20:31:55 +03:00
	 * For exceptions from EL1, create a synthetic frame record so the
	 * interrupted code shows up in the backtrace.
2017-07-22 20:45:33 +03:00
	 */
	.if \el == 0
2021-04-29 13:20:04 +03:00
	stp	xzr, xzr, [sp, #S_STACKFRAME]
2017-07-22 20:45:33 +03:00
.else
	stp	x29, x22, [sp, #S_STACKFRAME]
2021-01-13 20:31:55 +03:00
.endif
2021-04-29 13:20:04 +03:00
	add	x29, sp, #S_STACKFRAME
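Taken together with the termination-record commit message above, the two stores behave roughly as in this sketch. Note the real unwinder identifies the final record by its location (task_pt_regs()->stackframe) rather than by its contents; this simplified version stops on the zeroed EL0 record instead.

/* Sketch: one unwind step over the frame records written above. */
static int unwind_step_sketch(unsigned long *fp, unsigned long *pc)
{
	const unsigned long *record = (const unsigned long *)*fp;

	if (record[0] == 0 && record[1] == 0)
		return -1;		/* final record from an EL0 exception: stop */

	*fp = record[0];		/* for EL1, the interrupted code's x29 */
	*pc = record[1];		/* ... and its ELR, so it appears in the trace */
	return 0;
}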
2017-07-22 20:45:33 +03:00
2016-09-02 16:54:03 +03:00
#ifdef CONFIG_ARM64_SW_TTBR0_PAN
2020-07-21 11:33:15 +03:00
alternative_if_not ARM64_HAS_PAN
	bl	__swpan_entry_el\el
2016-09-02 16:54:03 +03:00
alternative_else_nop_endif
#endif
2012-03-05 15:49:27 +04:00
	stp	x22, x23, [sp, #S_PC]
2017-08-01 17:35:54 +03:00
/* Not in a syscall by default (el0_svc overwrites for real syscall) */
2012-03-05 15:49:27 +04:00
	.if	\el == 0
2017-08-01 17:35:54 +03:00
	mov	w21, #NO_SYSCALL
arm64: syscallno is secretly an int, make it official
The upper 32 bits of the syscallno field in thread_struct are
handled inconsistently, being sometimes zero extended and sometimes
sign-extended. In fact, only the lower 32 bits seem to have any
real significance for the behaviour of the code: it's been OK to
handle the upper bits inconsistently because they don't matter.
Currently, the only place I can find where those bits are
significant is in calling trace_sys_enter(), which may be
unintentional: for example, if a compat tracer attempts to cancel a
syscall by passing -1 to (COMPAT_)PTRACE_SET_SYSCALL at the
syscall-enter-stop, it will be traced as syscall 4294967295
rather than -1 as might be expected (and as occurs for a native
tracer doing the same thing). Elsewhere, reads of syscallno cast
it to an int or truncate it.
There's also a conspicuous amount of code and casting to bodge
around the fact that although semantically an int, syscallno is
stored as a u64.
Let's not pretend any more.
In order to preserve the stp x instruction that stores the syscall
number in entry.S, this patch special-cases the layout of struct
pt_regs for big endian so that the newly 32-bit syscallno field
maps onto the low bits of the stored value. This is not beautiful,
but benchmarking of the getpid syscall on Juno indicates a
minor slowdown if the stp is split into an stp x and stp w.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-08-01 17:35:53 +03:00
	str	w21, [sp, #S_SYSCALLNO]
2012-03-05 15:49:27 +04:00
.endif
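The syscallno commit message above hinges on a small layout trick; a hedged sketch of the idea (illustrative types, not the verbatim kernel header): the 32-bit field is positioned so the existing 64-bit store still lands the syscall number in its low bits on both endiannesses.

struct pt_regs_syscall_sketch {
	unsigned long long orig_x0;
#ifdef __AARCH64EB__
	unsigned int unused;	/* big-endian: the 64-bit store puts its high half here */
	int syscallno;		/* ... and the low 32 bits (the real number) here */
#else
	int syscallno;		/* little-endian: low 32 bits land at the lower address */
	unsigned int unused;
#endif
};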
2022-01-12 06:24:10 +03:00
#ifdef CONFIG_ARM64_PSEUDO_NMI
arm64: add ARM64_HAS_GIC_PRIO_RELAXED_SYNC cpucap
When Priority Mask Hint Enable (PMHE) == 0b1, the GIC may use the PMR
value to determine whether to signal an IRQ to a PE, and consequently
after a change to the PMR value, a DSB SY may be required to ensure that
interrupts are signalled to a CPU in finite time. When PMHE == 0b0,
interrupts are always signalled to the relevant PE, and all masking
occurs locally, without requiring a DSB SY.
Since commit:
f226650494c6aa87 ("arm64: Relax ICC_PMR_EL1 accesses when ICC_CTLR_EL1.PMHE is clear")
... we handle this dynamically: in most cases a static key is used to
determine whether to issue a DSB SY, but the entry code must read from
ICC_CTLR_EL1 as static keys aren't accessible from plain assembly.
It would be much nicer to use an alternative instruction sequence for
the DSB, as this would avoid the need to read from ICC_CTLR_EL1 in the
entry code, and for most other code this will result in simpler code
generation with fewer instructions and fewer branches.
This patch adds a new ARM64_HAS_GIC_PRIO_RELAXED_SYNC cpucap which is
only set when ICC_CTLR_EL1.PMHE == 0b0 (and GIC priority masking is in
use). This allows us to replace the existing users of the
`gic_pmr_sync` static key with alternative sequences which default to a
DSB SY and are relaxed to a NOP when PMHE is not in use.
The entry assembly management of the PMR is slightly restructured to use
a branch (rather than multiple NOPs) when priority masking is not in
use. This is more in keeping with other alternatives in the entry
assembly, and permits the use of a separate alternative for the
PMHE-dependent DSB SY (and removal of the conditional branch this
currently requires). For consistency I've adjusted both the save and
restore paths.
According to bloat-o-meter, when building defconfig +
CONFIG_ARM64_PSEUDO_NMI=y this shrinks the kernel text by ~4KiB:
| add/remove: 4/2 grow/shrink: 42/310 up/down: 332/-5032 (-4700)
The resulting vmlinux is ~66KiB smaller, though the resulting Image size
is unchanged due to padding and alignment:
| [mark@lakrids:~/src/linux]% ls -al vmlinux-*
| -rwxr-xr-x 1 mark mark 137508344 Jan 17 14:11 vmlinux-after
| -rwxr-xr-x 1 mark mark 137575440 Jan 17 13:49 vmlinux-before
| [mark@lakrids:~/src/linux]% ls -al Image-*
| -rw-r--r-- 1 mark mark 38777344 Jan 17 14:11 Image-after
| -rw-r--r-- 1 mark mark 38777344 Jan 17 13:49 Image-before
Prior to this patch we did not verify the state of ICC_CTLR_EL1.PMHE on
secondary CPUs. As of this patch this is verified by the cpufeature code
when using GIC priority masking (i.e. when using pseudo-NMIs).
Note that since commit:
7e3a57fa6ca831fa ("arm64: Document ICC_CTLR_EL3.PMHE setting requirements")
... Documentation/arm64/booting.rst specifies:
| - ICC_CTLR_EL3.PMHE (bit 6) must be set to the same value across
| all CPUs the kernel is executing on, and must stay constant
| for the lifetime of the kernel.
... so that should not adversely affect any compliant systems, and as
we'll only check for the absence of PMHE when using pseudo-NMIs, this
will only fire when such a mismatch would adversely affect the system.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230130145429.903791-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-30 17:54:28 +03:00
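A hedged C-level sketch of the barrier policy the commit message above describes (ARM64_HAS_GIC_PRIO_RELAXED_SYNC is the real cpucap name; the capability-check helper is a stand-in for the kernel's alternatives machinery):

static int has_cap_sketch(int cap) { (void)cap; return 1; }	/* placeholder */
#define ARM64_HAS_GIC_PRIO_RELAXED_SYNC_SKETCH	0

/* Sketch: a DSB SY is only needed after a PMR write when PMHE is in use. */
static void pmr_sync_sketch(void)
{
	if (!has_cap_sketch(ARM64_HAS_GIC_PRIO_RELAXED_SYNC_SKETCH))
		asm volatile("dsb sy" ::: "memory");
	/* else: PMHE == 0, masking is purely local, no barrier required */
}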
alternative_if_not ARM64_HAS_GIC_PRIO_MASKING
	b	.Lskip_pmr_save\@
alternative_else_nop_endif
2019-01-31 17:58:46 +03:00
	mrs_s	x20, SYS_ICC_PMR_EL1
	str	x20, [sp, #S_PMR_SAVE]
arm64: entry: always set GIC_PRIO_PSR_I_SET during entry
Zenghui reports that booting a kernel with "irqchip.gicv3_pseudo_nmi=1"
on the command line hits a warning during kernel entry, due to the way
we manipulate the PMR.
Early in the entry sequence, we call lockdep_hardirqs_off() to inform
lockdep that interrupts have been masked (as the HW sets DAIF when
entering an exception). Architecturally PMR_EL1 is not affected by
exception entry, and we don't set GIC_PRIO_PSR_I_SET in the PMR early in
the exception entry sequence, so early in exception entry the PMR can
indicate that interrupts are unmasked even though they are masked by
DAIF.
If DEBUG_LOCKDEP is selected, lockdep_hardirqs_off() will check that
interrupts are masked, before we set GIC_PRIO_PSR_I_SET in any of the
exception entry paths, and hence lockdep_hardirqs_off() will WARN() that
something is amiss.
We can avoid this by consistently setting GIC_PRIO_PSR_I_SET during
exception entry so that kernel code sees a consistent environment. We
must also update local_daif_inherit() to undo this, as it currently only
touches DAIF. For other paths, local_daif_restore() will update both
DAIF and the PMR. With this done, we can remove the existing special
cases which set this later in the entry code.
We always use (GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET) for consistency with
local_daif_save(), as this will warn if it ever encounters
(GIC_PRIO_IRQOFF | GIC_PRIO_PSR_I_SET), and never sets this itself. This
matches the gic_prio_kentry_setup that we have to retain for
ret_to_user.
The original splat from Zenghui's report was:
| DEBUG_LOCKS_WARN_ON(!irqs_disabled())
| WARNING: CPU: 3 PID: 125 at kernel/locking/lockdep.c:4258 lockdep_hardirqs_off+0xd4/0xe8
| Modules linked in:
| CPU: 3 PID: 125 Comm: modprobe Tainted: G W 5.12.0-rc8+ #463
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO BTYPE=--)
| pc : lockdep_hardirqs_off+0xd4/0xe8
| lr : lockdep_hardirqs_off+0xd4/0xe8
| sp : ffff80002a39bad0
| pmr_save: 000000e0
| x29: ffff80002a39bad0 x28: ffff0000de214bc0
| x27: ffff0000de1c0400 x26: 000000000049b328
| x25: 0000000000406f30 x24: ffff0000de1c00a0
| x23: 0000000020400005 x22: ffff8000105f747c
| x21: 0000000096000044 x20: 0000000000498ef9
| x19: ffff80002a39bc88 x18: ffffffffffffffff
| x17: 0000000000000000 x16: ffff800011c61eb0
| x15: ffff800011700a88 x14: 0720072007200720
| x13: 0720072007200720 x12: 0720072007200720
| x11: 0720072007200720 x10: 0720072007200720
| x9 : ffff80002a39bad0 x8 : ffff80002a39bad0
| x7 : ffff8000119f0800 x6 : c0000000ffff7fff
| x5 : ffff8000119f07a8 x4 : 0000000000000001
| x3 : 9bcdab23f2432800 x2 : ffff800011730538
| x1 : 9bcdab23f2432800 x0 : 0000000000000000
| Call trace:
| lockdep_hardirqs_off+0xd4/0xe8
| enter_from_kernel_mode.isra.5+0x7c/0xa8
| el1_abort+0x24/0x100
| el1_sync_handler+0x80/0xd0
| el1_sync+0x6c/0x100
| __arch_clear_user+0xc/0x90
| load_elf_binary+0x9fc/0x1450
| bprm_execve+0x404/0x880
| kernel_execve+0x180/0x188
| call_usermodehelper_exec_async+0xdc/0x158
| ret_from_fork+0x10/0x18
Fixes: 23529049c684 ("arm64: entry: fix non-NMI user<->kernel transitions")
Fixes: 7cd1ea1010ac ("arm64: entry: fix non-NMI kernel<->kernel transitions")
Fixes: f0cd5ac1e4c5 ("arm64: entry: fix NMI {user, kernel}->kernel transitions")
Fixes: 2a9b3e6ac69a ("arm64: entry: fix EL1 debug transitions")
Link: https://lore.kernel.org/r/f4012761-026f-4e51-3a0c-7524e434e8b3@huawei.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210428111555.50880-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-28 14:15:55 +03:00
	mov	x20, #GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET
	msr_s	SYS_ICC_PMR_EL1, x20
2023-01-30 17:54:28 +03:00
.Lskip_pmr_save\@:
2019-09-16 13:51:17 +03:00
#endif
2012-03-05 15:49:27 +04:00
	/*
	 * Registers that may be useful after this macro is invoked:
	 *
2019-06-11 12:38:10 +03:00
	 * x20 - ICC_PMR_EL1
2012-03-05 15:49:27 +04:00
	 * x21 - aborted SP
	 * x22 - aborted PC
	 * x23 - aborted PSTATE
	 */
.endm
2015-08-19 17:57:09 +03:00
	.macro	kernel_exit, el
2016-06-20 20:28:01 +03:00
	.if	\el != 0
2017-11-02 15:12:37 +03:00
	disable_daif
2016-06-20 20:28:01 +03:00
.endif
2022-01-12 06:24:10 +03:00
#ifdef CONFIG_ARM64_PSEUDO_NMI
2023-01-30 17:54:28 +03:00
alternative_if_not ARM64_HAS_GIC_PRIO_MASKING
	b	.Lskip_pmr_restore\@
alternative_else_nop_endif
2019-01-31 17:58:46 +03:00
	ldr	x20, [sp, #S_PMR_SAVE]
	msr_s	SYS_ICC_PMR_EL1, x20
2023-01-30 17:54:28 +03:00
/* Ensure priority change is seen by redistributor */
alternative_if_not ARM64_HAS_GIC_PRIO_RELAXED_SYNC
	dsb	sy
2019-01-31 17:58:46 +03:00
alternative_else_nop_endif
2023-01-30 17:54:28 +03:00
.Lskip_pmr_restore\@:
2022-01-12 06:24:10 +03:00
#endif
2019-01-31 17:58:46 +03:00
2012-03-05 15:49:27 +04:00
	ldp	x21, x22, [sp, #S_PC]		// load ELR, SPSR
2016-09-02 16:54:03 +03:00
#ifdef CONFIG_ARM64_SW_TTBR0_PAN
2020-07-21 11:33:15 +03:00
alternative_if_not ARM64_HAS_PAN
	bl	__swpan_exit_el\el
2016-09-02 16:54:03 +03:00
alternative_else_nop_endif
#endif
	.if	\el == 0
2012-03-05 15:49:27 +04:00
	ldr	x23, [sp, #S_SP]		// load return stack pointer
2014-09-29 15:26:41 +04:00
	msr	sp_el0, x23
2017-11-14 17:24:29 +03:00
	tst	x22, #PSR_MODE32_BIT		// native task?
	b.eq	3f
2015-03-23 22:07:02 +03:00
#ifdef CONFIG_ARM64_ERRATUM_845719
2016-09-07 13:07:09 +03:00
alternative_if ARM64_WORKAROUND_845719
2015-07-22 14:21:03 +03:00
#ifdef CONFIG_PID_IN_CONTEXTIDR
	mrs	x29, contextidr_el1
	msr	contextidr_el1, x29
2015-03-23 22:07:02 +03:00
#else
2015-07-22 14:21:03 +03:00
	msr	contextidr_el1, xzr
2015-03-23 22:07:02 +03:00
#endif
2016-09-07 13:07:09 +03:00
alternative_else_nop_endif
2015-03-23 22:07:02 +03:00
#endif
2017-11-14 17:24:29 +03:00
3:
2021-05-27 13:55:29 +03:00
	scs_save tsk
2020-04-27 19:00:16 +03:00
2021-07-09 05:35:32 +03:00
/* Ignore asynchronous tag check faults in the uaccess routines */
	ldr	x0, [tsk, THREAD_SCTLR_USER]
	clear_mte_async_tcf x0
2021-03-19 06:10:53 +03:00
#ifdef CONFIG_ARM64_PTR_AUTH
alternative_if ARM64_HAS_ADDRESS_AUTH
	/*
2021-03-19 06:10:54 +03:00
	 * IA was enabled for in-kernel PAC. Disable it now if needed, or
	 * alternatively install the user's IA. All other per-task keys and
	 * SCTLR bits were updated on task switch.
	 *
	 * No kernel C function calls after this.
2021-03-19 06:10:53 +03:00
	 */
2021-03-19 06:10:54 +03:00
	tbz	x0, SCTLR_ELx_ENIA_SHIFT, 1f
	__ptrauth_keys_install_user tsk, x0, x1, x2
	b	2f
1:
2021-03-19 06:10:53 +03:00
	mrs	x0, sctlr_el1
	bic	x0, x0, SCTLR_ELx_ENIA
	msr	sctlr_el1, x0
2021-03-19 06:10:54 +03:00
2:
2021-03-19 06:10:53 +03:00
alternative_else_nop_endif
#endif
2020-03-13 12:04:51 +03:00
2020-12-22 23:01:45 +03:00
	mte_set_user_gcr tsk, x0, x1
2018-07-11 16:56:47 +03:00
	apply_ssbd 0, x0, x1
2012-03-05 15:49:27 +04:00
.endif
2016-09-02 16:54:03 +03:00
2014-09-29 15:26:41 +04:00
	msr	elr_el1, x21			// set up the return data
	msr	spsr_el1, x22
	ldp	x0, x1, [sp, #16 * 0]
	ldp	x2, x3, [sp, #16 * 1]
	ldp	x4, x5, [sp, #16 * 2]
	ldp	x6, x7, [sp, #16 * 3]
	ldp	x8, x9, [sp, #16 * 4]
	ldp	x10, x11, [sp, #16 * 5]
	ldp	x12, x13, [sp, #16 * 6]
	ldp	x14, x15, [sp, #16 * 7]
	ldp	x16, x17, [sp, #16 * 8]
	ldp	x18, x19, [sp, #16 * 9]
	ldp	x20, x21, [sp, #16 * 10]
	ldp	x22, x23, [sp, #16 * 11]
	ldp	x24, x25, [sp, #16 * 12]
	ldp	x26, x27, [sp, #16 * 13]
	ldp	x28, x29, [sp, #16 * 14]
2017-11-14 17:24:29 +03:00
	.if	\el == 0
2021-11-23 21:41:43 +03:00
alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0
	ldr	lr, [sp, #S_LR]
	add	sp, sp, #PT_REGS_SIZE		// restore sp
	eret
alternative_else_nop_endif
2017-11-14 17:38:19 +03:00
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
2021-11-23 21:41:43 +03:00
	msr	far_el1, x29
2023-04-18 17:36:04 +03:00
	ldr_this_cpu	x30, this_cpu_vector, x29
	tramp_alias	x29, tramp_exit
	msr		vbar_el1, x30		// install vector table
	ldr		lr, [sp, #S_LR]		// restore x30
	add		sp, sp, #PT_REGS_SIZE	// restore sp
	br		x29
2017-11-14 17:38:19 +03:00
#endif
2017-11-14 17:24:29 +03:00
.else
2021-11-23 21:41:43 +03:00
	ldr	lr, [sp, #S_LR]
	add	sp, sp, #PT_REGS_SIZE		// restore sp
2020-10-28 21:28:39 +03:00
/* Ensure any device/NC reads complete */
	alternative_insn nop, "dmb sy", ARM64_WORKAROUND_1508412
2017-11-14 17:24:29 +03:00
eret
.endif
2018-06-14 13:23:38 +03:00
sb
2012-03-05 15:49:27 +04:00
.endm
2020-07-21 11:33:15 +03:00
#ifdef CONFIG_ARM64_SW_TTBR0_PAN
/*
 * Set the TTBR0 PAN bit in SPSR. When the exception is taken from
 * EL0, there is no need to check the state of TTBR0_EL1 since
 * accesses are always enabled.
 * Note that the meaning of this bit differs from the ARMv8.1 PAN
 * feature as all TTBR0_EL1 accesses are disabled, not just those to
 * user mappings.
 */
SYM_CODE_START_LOCAL(__swpan_entry_el1)
	mrs	x21, ttbr0_el1
	tst	x21, #TTBR_ASID_MASK		// Check for the reserved ASID
	orr	x23, x23, #PSR_PAN_BIT		// Set the emulated PAN in the saved SPSR
	b.eq	1f				// TTBR0 access already disabled
	and	x23, x23, #~PSR_PAN_BIT		// Clear the emulated PAN in the saved SPSR
SYM_INNER_LABEL(__swpan_entry_el0, SYM_L_LOCAL)
	__uaccess_ttbr0_disable x21
1:	ret
SYM_CODE_END(__swpan_entry_el1)
/*
 * Restore access to TTBR0_EL1. If returning to EL0, no need for SPSR
 * PAN bit checking.
 */
SYM_CODE_START_LOCAL(__swpan_exit_el1)
	tbnz	x22, #22, 1f			// Skip re-enabling TTBR0 access if the PSR_PAN_BIT is set
	__uaccess_ttbr0_enable x0, x1
1:	and	x22, x22, #~PSR_PAN_BIT		// ARMv8.0 CPUs do not understand this bit
	ret
SYM_CODE_END(__swpan_exit_el1)
SYM_CODE_START_LOCAL(__swpan_exit_el0)
	__uaccess_ttbr0_enable x0, x1
	/*
	 * Enable errata workarounds only if returning to user. The only
	 * workaround currently required for TTBR0_EL1 changes is for the
	 * Cavium erratum 27456 (broadcast TLBI instructions may cause I-cache
	 * corruption).
	 */
	b	post_ttbr_update_workaround
SYM_CODE_END(__swpan_exit_el0)
#endif
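In C terms, the __swpan_entry_el1 logic above amounts to the following hedged sketch (PSR.PAN is bit 22 and the TTBR0_EL1 ASID field occupies bits 63:48; the disable helper stands in for __uaccess_ttbr0_disable):

#define PSR_PAN_BIT_SKETCH	(1UL << 22)		/* emulated PAN flag in the saved SPSR */
#define TTBR_ASID_MASK_SKETCH	(0xffffUL << 48)	/* ASID field of TTBR0_EL1 */

static void disable_user_access_sketch(void) { /* point TTBR0 at the reserved tables */ }

/* Sketch: decide whether user access was already disabled on EL1 entry. */
static unsigned long swpan_entry_el1_sketch(unsigned long ttbr0, unsigned long spsr)
{
	spsr |= PSR_PAN_BIT_SKETCH;		/* assume access was already off */
	if (!(ttbr0 & TTBR_ASID_MASK_SKETCH))
		return spsr;			/* reserved ASID: nothing more to do */

	spsr &= ~PSR_PAN_BIT_SKETCH;		/* access was on: remember to restore it */
	disable_user_access_sketch();
	return spsr;
}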
2019-01-03 16:23:10 +03:00
/* GPRs used by entry code */
2012-03-05 15:49:27 +04:00
tsk	.req	x28		// current thread_info
	.text
/*
 * Exception vectors.
 */
2016-07-08 19:35:50 +03:00
	.pushsection ".entry.text", "ax"
2012-03-05 15:49:27 +04:00
.align 11
2020-02-18 22:58:27 +03:00
SYM_CODE_START(vectors)
arm64: entry: handle all vectors with C
We have 16 architectural exception vectors, and depending on kernel
configuration we handle 8 or 12 of these with C code, with the remaining
8 or 4 of these handled as special cases in the entry assembly.
It would be nicer if the entry assembly were uniform for all exceptions,
and we deferred any specific handling of the exceptions to C code. This
way the entry assembly can be more easily templated without ifdeffery or
special cases, and it's easier to modify the handling of these cases in
future (e.g. to dump additional registers or other context).
This patch reworks the entry code so that we always have a C handler for
every architectural exception vector, with the entry assembly being
completely uniform. We now have to handle exceptions from EL1t and EL1h,
and also have to handle exceptions from AArch32 even when the kernel is
built without CONFIG_COMPAT. To make this clear and to simplify
templating, we rename the top-level exception handlers with a consistent
naming scheme:
asm: <el+sp>_<regsize>_<type>
c: <el+sp>_<regsize>_<type>_handler
.. where:
<el+sp> is `el1t`, `el1h`, or `el0t`
<regsize> is `64` or `32`
<type> is `sync`, `irq`, `fiq`, or `error`
... e.g.
asm: el1h_64_sync
c: el1h_64_sync_handler
... with lower-level handlers simply using "el1" and "compat" as today.
For unexpected exceptions, this information is passed to
__panic_unhandled(), so it can report the specific vector an unexpected
exception was taken from, e.g.
| Unhandled 64-bit el1t sync exception
For vectors we never expect to enter legitimately, the C code is
generated using a macro to avoid code duplication. The exceptions are
handled via __panic_unhandled(), replacing bad_mode() (which is
removed).
The `kernel_ventry` and `entry_handler` assembly macros are updated to
handle the new naming scheme. In theory it should be possible to
generate the entry functions at the same time as the vectors using a
single table, but this will require reworking the linker script to split
the two into separate sections, so for now we have separate tables.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210607094624.34689-15-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07 12:46:18 +03:00
	kernel_ventry	1, t, 64, sync		// Synchronous EL1t
	kernel_ventry	1, t, 64, irq		// IRQ EL1t
2022-07-21 06:05:31 +03:00
	kernel_ventry	1, t, 64, fiq		// FIQ EL1t
2021-06-07 12:46:18 +03:00
	kernel_ventry	1, t, 64, error		// Error EL1t
	kernel_ventry	1, h, 64, sync		// Synchronous EL1h
	kernel_ventry	1, h, 64, irq		// IRQ EL1h
	kernel_ventry	1, h, 64, fiq		// FIQ EL1h
	kernel_ventry	1, h, 64, error		// Error EL1h
	kernel_ventry	0, t, 64, sync		// Synchronous 64-bit EL0
	kernel_ventry	0, t, 64, irq		// IRQ 64-bit EL0
	kernel_ventry	0, t, 64, fiq		// FIQ 64-bit EL0
	kernel_ventry	0, t, 64, error		// Error 64-bit EL0
	kernel_ventry	0, t, 32, sync		// Synchronous 32-bit EL0
	kernel_ventry	0, t, 32, irq		// IRQ 32-bit EL0
	kernel_ventry	0, t, 32, fiq		// FIQ 32-bit EL0
	kernel_ventry	0, t, 32, error		// Error 32-bit EL0
2020-02-18 22:58:27 +03:00
SYM_CODE_END(vectors)
2012-03-05 15:49:27 +04:00
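For the vectors above that should never legitimately fire, the "handle all vectors with C" commit message earlier in this file describes generating the C handlers with a macro; a hedged sketch of that pattern (the exact kernel macro and the __panic_unhandled() signature may differ):

struct pt_regs;
void __panic_unhandled(struct pt_regs *regs, const char *vector);	/* assumed prototype */

/* Sketch: stamp out one panicking handler per unexpected vector. */
#define UNHANDLED_SKETCH(el, regsize, vector)					\
	void el##_##regsize##_##vector##_handler(struct pt_regs *regs)		\
	{									\
		__panic_unhandled(regs, #regsize "-bit " #el " " #vector);	\
	}

UNHANDLED_SKETCH(el1t, 64, sync)	/* -> "Unhandled 64-bit el1t sync exception" */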
arm64: add VMAP_STACK overflow detection
This patch adds stack overflow detection to arm64, usable when vmap'd stacks
are in use.
Overflow is detected in a small preamble executed for each exception entry,
which checks whether there is enough space on the current stack for the general
purpose registers to be saved. If there is not enough space, the overflow
handler is invoked on a per-cpu overflow stack. This approach preserves the
original exception information in ESR_EL1 (and where appropriate, FAR_EL1).
Task and IRQ stacks are aligned to double their size, enabling overflow to be
detected with a single bit test. For example, a 16K stack is aligned to 32K,
ensuring that bit 14 of the SP must be zero. On an overflow (or underflow),
this bit is flipped. Thus, overflow (of less than the size of the stack) can be
detected by testing whether this bit is set.
The overflow check is performed before any attempt is made to access the
stack, avoiding recursive faults (and the loss of exception information
these would entail). As logical operations cannot be performed on the SP
directly, the SP is temporarily swapped with a general purpose register
using arithmetic operations to enable the test to be performed.
This gives us a useful error message on stack overflow, as can be triggered with
the LKDTM overflow test:
[ 305.388749] lkdtm: Performing direct entry OVERFLOW
[ 305.395444] Insufficient stack space to handle exception!
[ 305.395482] ESR: 0x96000047 -- DABT (current EL)
[ 305.399890] FAR: 0xffff00000a5e7f30
[ 305.401315] Task stack: [0xffff00000a5e8000..0xffff00000a5ec000]
[ 305.403815] IRQ stack: [0xffff000008000000..0xffff000008004000]
[ 305.407035] Overflow stack: [0xffff80003efce4e0..0xffff80003efcf4e0]
[ 305.409622] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[ 305.412785] Hardware name: linux,dummy-virt (DT)
[ 305.415756] task: ffff80003d051c00 task.stack: ffff00000a5e8000
[ 305.419221] PC is at recursive_loop+0x10/0x48
[ 305.421637] LR is at recursive_loop+0x38/0x48
[ 305.423768] pc : [<ffff00000859f330>] lr : [<ffff00000859f358>] pstate: 40000145
[ 305.428020] sp : ffff00000a5e7f50
[ 305.430469] x29: ffff00000a5e8350 x28: ffff80003d051c00
[ 305.433191] x27: ffff000008981000 x26: ffff000008f80400
[ 305.439012] x25: ffff00000a5ebeb8 x24: ffff00000a5ebeb8
[ 305.440369] x23: ffff000008f80138 x22: 0000000000000009
[ 305.442241] x21: ffff80003ce65000 x20: ffff000008f80188
[ 305.444552] x19: 0000000000000013 x18: 0000000000000006
[ 305.446032] x17: 0000ffffa2601280 x16: ffff0000081fe0b8
[ 305.448252] x15: ffff000008ff546d x14: 000000000047a4c8
[ 305.450246] x13: ffff000008ff7872 x12: 0000000005f5e0ff
[ 305.452953] x11: ffff000008ed2548 x10: 000000000005ee8d
[ 305.454824] x9 : ffff000008545380 x8 : ffff00000a5e8770
[ 305.457105] x7 : 1313131313131313 x6 : 00000000000000e1
[ 305.459285] x5 : 0000000000000000 x4 : 0000000000000000
[ 305.461781] x3 : 0000000000000000 x2 : 0000000000000400
[ 305.465119] x1 : 0000000000000013 x0 : 0000000000000012
[ 305.467724] Kernel panic - not syncing: kernel stack overflow
[ 305.470561] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[ 305.473325] Hardware name: linux,dummy-virt (DT)
[ 305.475070] Call trace:
[ 305.476116] [<ffff000008088ad8>] dump_backtrace+0x0/0x378
[ 305.478991] [<ffff000008088e64>] show_stack+0x14/0x20
[ 305.481237] [<ffff00000895a178>] dump_stack+0x98/0xb8
[ 305.483294] [<ffff0000080c3288>] panic+0x118/0x280
[ 305.485673] [<ffff0000080c2e9c>] nmi_panic+0x6c/0x70
[ 305.486216] [<ffff000008089710>] handle_bad_stack+0x118/0x128
[ 305.486612] Exception stack(0xffff80003efcf3a0 to 0xffff80003efcf4e0)
[ 305.487334] f3a0: 0000000000000012 0000000000000013 0000000000000400 0000000000000000
[ 305.488025] f3c0: 0000000000000000 0000000000000000 00000000000000e1 1313131313131313
[ 305.488908] f3e0: ffff00000a5e8770 ffff000008545380 000000000005ee8d ffff000008ed2548
[ 305.489403] f400: 0000000005f5e0ff ffff000008ff7872 000000000047a4c8 ffff000008ff546d
[ 305.489759] f420: ffff0000081fe0b8 0000ffffa2601280 0000000000000006 0000000000000013
[ 305.490256] f440: ffff000008f80188 ffff80003ce65000 0000000000000009 ffff000008f80138
[ 305.490683] f460: ffff00000a5ebeb8 ffff00000a5ebeb8 ffff000008f80400 ffff000008981000
[ 305.491051] f480: ffff80003d051c00 ffff00000a5e8350 ffff00000859f358 ffff00000a5e7f50
[ 305.491444] f4a0: ffff00000859f330 0000000040000145 0000000000000000 0000000000000000
[ 305.492008] f4c0: 0001000000000000 0000000000000000 ffff00000a5e8350 ffff00000859f330
[ 305.493063] [<ffff00000808205c>] __bad_stack+0x88/0x8c
[ 305.493396] [<ffff00000859f330>] recursive_loop+0x10/0x48
[ 305.493731] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494088] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494425] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494649] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.494898] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495205] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495453] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.495708] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496000] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496302] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496644] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.496894] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.497138] [<ffff00000859f358>] recursive_loop+0x38/0x48
[ 305.497325] [<ffff00000859f3dc>] lkdtm_OVERFLOW+0x14/0x20
[ 305.497506] [<ffff00000859f314>] lkdtm_do_action+0x1c/0x28
[ 305.497786] [<ffff00000859f178>] direct_entry+0xe0/0x170
[ 305.498095] [<ffff000008345568>] full_proxy_write+0x60/0xa8
[ 305.498387] [<ffff0000081fb7f4>] __vfs_write+0x1c/0x128
[ 305.498679] [<ffff0000081fcc68>] vfs_write+0xa0/0x1b0
[ 305.498926] [<ffff0000081fe0fc>] SyS_write+0x44/0xa0
[ 305.499182] Exception stack(0xffff00000a5ebec0 to 0xffff00000a5ec000)
[ 305.499429] bec0: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[ 305.499674] bee0: 574f4c465245564f 0000000000000000 0000000000000000 8000000080808080
[ 305.499904] bf00: 0000000000000040 0000000000000038 fefefeff1b4bc2ff 7f7f7f7f7f7fff7f
[ 305.500189] bf20: 0101010101010101 0000000000000000 000000000047a4c8 0000000000000038
[ 305.500712] bf40: 0000000000000000 0000ffffa2601280 0000ffffc63f6068 00000000004b5000
[ 305.501241] bf60: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[ 305.501791] bf80: 0000000000000020 0000000000000000 00000000004b5000 000000001c4cc458
[ 305.502314] bfa0: 0000000000000000 0000ffffc63f7950 000000000040a3c4 0000ffffc63f70e0
[ 305.502762] bfc0: 0000ffffa2601268 0000000080000000 0000000000000001 0000000000000040
[ 305.503207] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 305.503680] [<ffff000008082fb0>] el0_svc_naked+0x24/0x28
[ 305.504720] Kernel Offset: disabled
[ 305.505189] CPU features: 0x002082
[ 305.505473] Memory Limit: none
[ 305.506181] ---[ end Kernel panic - not syncing: kernel stack overflow
This patch was co-authored by Ard Biesheuvel and Mark Rutland.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
2017-07-14 22:30:35 +03:00
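The overflow preamble the commit message above describes comes down to a single bit test; a small C sketch under the same assumptions as the example in the message (16 KiB stacks aligned to 32 KiB):

#define THREAD_SHIFT_SKETCH	14
#define THREAD_SIZE_SKETCH	(1UL << THREAD_SHIFT_SKETCH)	/* 16 KiB */

/* Stacks are aligned to 2 * THREAD_SIZE, so bit 14 is zero for every SP
 * inside the stack; an overflow (or underflow) of less than THREAD_SIZE
 * flips it. The assembly performs this check before touching the stack. */
static int sp_overflowed_sketch(unsigned long sp)
{
	return (sp & THREAD_SIZE_SKETCH) != 0;
}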
#ifdef CONFIG_VMAP_STACK
2021-08-04 21:17:10 +03:00
SYM_CODE_START_LOCAL(__bad_stack)
2017-07-14 22:30:35 +03:00
	/*
	 * We detected an overflow in kernel_ventry, which switched to the
	 * overflow stack. Stash the exception regs, and head to our overflow
	 * handler.
	 */
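	/*
	 * For context, kernel_ventry detects the overflow before touching the
	 * stack; the exact code lives with kernel_ventry earlier in this file.
	 * Roughly (an illustrative sketch, not the verbatim sequence; "bit 14"
	 * assumes the 16K-stack example from the VMAP_STACK commit message,
	 * and the label name is a placeholder), it swaps SP with x0 using
	 * arithmetic so the test needs no spare GPR:
	 *
	 *	add	sp, sp, x0		// sp' = sp + x0
	 *	sub	x0, sp, x0		// x0' = sp' - x0 = original sp
	 *	tbnz	x0, #14, overflow	// stack-size bit set -> overflowed
	 *	sub	x0, sp, x0		// undo: x0'' = original x0
	 *	sub	sp, sp, x0		// undo: sp'' = original sp
	 */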
2021-08-04 21:17:10 +03:00
2017-07-14 22:30:35 +03:00
/* Restore the original x0 value */
	mrs	x0, tpidrro_el0
	/*
	 * Store the original GPRs to the new stack. The original SP (minus
2021-01-12 04:58:13 +03:00
	 * PT_REGS_SIZE) was stashed in tpidr_el0 by kernel_ventry.
2017-07-14 22:30:35 +03:00
	 */
2021-01-12 04:58:13 +03:00
	sub	sp, sp, #PT_REGS_SIZE
2017-07-14 22:30:35 +03:00
	kernel_entry 1
	mrs	x0, tpidr_el0
2021-01-12 04:58:13 +03:00
	add	x0, x0, #PT_REGS_SIZE
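	/*
	 * Adding PT_REGS_SIZE back recovers the SP value at the point of
	 * overflow; it is recorded in the S_SP slot of the saved regs below.
	 */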
2017-07-14 22:30:35 +03:00
	str	x0, [sp, #S_SP]

	/* Stash the regs for handle_bad_stack */
	mov	x0, sp

	/* Time to die */
	bl	handle_bad_stack
	ASM_BUG()
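	/* handle_bad_stack() panics and is not expected to return; ASM_BUG() above traps if it somehow does. */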
2021-08-04 21:17:10 +03:00
SYM_CODE_END(__bad_stack)
2017-07-14 22:30:35 +03:00
#endif /* CONFIG_VMAP_STACK */
2012-03-05 15:49:27 +04:00
2021-06-07 12:46:18 +03:00
.macro entry_handler el:req, ht:req, regsize:req, label:req
SYM_CODE_START_LOCAL(el\el\ht\()_\regsize\()_\label)
2016-03-18 12:58:09 +03:00
	kernel_entry \el, \regsize
2012-03-05 15:49:27 +04:00
	mov	x0, sp
2021-06-07 12:46:18 +03:00
	bl	el\el\ht\()_\regsize\()_\label\()_handler
2021-06-07 12:46:17 +03:00
	.if \el == 0
	b	ret_to_user
.else
2021-06-07 12:46:14 +03:00
	b	ret_to_kernel
2021-06-07 12:46:17 +03:00
.endif
2021-06-07 12:46:18 +03:00
SYM_CODE_END(el\el\ht\()_\regsize\()_\label)
2012-03-05 15:49:27 +04:00
.endm
/*
2021-06-07 12:46:17 +03:00
 * Early exception handlers
2012-03-05 15:49:27 +04:00
 */
2021-06-07 12:46:18 +03:00
	entry_handler	1, t, 64, sync
	entry_handler	1, t, 64, irq
	entry_handler	1, t, 64, fiq
	entry_handler	1, t, 64, error

	entry_handler	1, h, 64, sync
	entry_handler	1, h, 64, irq
	entry_handler	1, h, 64, fiq
	entry_handler	1, h, 64, error

	entry_handler	0, t, 64, sync
	entry_handler	0, t, 64, irq
	entry_handler	0, t, 64, fiq
	entry_handler	0, t, 64, error

	entry_handler	0, t, 32, sync
	entry_handler	0, t, 32, irq
	entry_handler	0, t, 32, fiq
	entry_handler	0, t, 32, error
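/*
 * As an illustration (a sketch of what the macro above generates, not
 * additional code), "entry_handler 1, h, 64, sync" expands to:
 *
 * SYM_CODE_START_LOCAL(el1h_64_sync)
 *	kernel_entry 1, 64
 *	mov	x0, sp
 *	bl	el1h_64_sync_handler
 *	b	ret_to_kernel
 * SYM_CODE_END(el1h_64_sync)
 */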
2012-03-05 15:49:27 +04:00
2021-06-07 12:46:17 +03:00
SYM_CODE_START_LOCAL(ret_to_kernel)
2017-11-02 15:12:42 +03:00
	kernel_exit 1
2021-06-07 12:46:17 +03:00
SYM_ C O D E _ E N D ( r e t _ t o _ k e r n e l )
2017-11-02 15:12:42 +03:00
2020-05-01 14:54:28 +03:00
SYM_ C O D E _ S T A R T _ L O C A L ( r e t _ t o _ u s e r )
	ldr	x19, [tsk, #TSK_TI_FLAGS]	// re-check for single-step
	enable_step_tsk x19, x2
#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
	bl	stackleak_erase_on_task_stack
#endif
	kernel_exit 0
SYM_CODE_END(ret_to_user)

	.popsection				// .entry.text

// Move from tramp_pg_dir to swapper_pg_dir
	.macro tramp_map_kernel, tmp
	mrs	\tmp, ttbr1_el1
	add	\tmp, \tmp, #TRAMP_SWAPPER_OFFSET
	bic	\tmp, \tmp, #USER_ASID_FLAG
	msr	ttbr1_el1, \tmp
#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1003
alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1003
	/* ASID already in \tmp[63:48] */
	movk	\tmp, #:abs_g2_nc:(TRAMP_VALIAS >> 12)
	movk	\tmp, #:abs_g1_nc:(TRAMP_VALIAS >> 12)
	/* 2MB boundary containing the vectors, so we nobble the walk cache */
	movk	\tmp, #:abs_g0_nc:((TRAMP_VALIAS & ~(SZ_2M - 1)) >> 12)
	isb
	tlbi	vae1, \tmp
	dsb	nsh
alternative_else_nop_endif
#endif /* CONFIG_QCOM_FALKOR_ERRATUM_1003 */
	.endm

// Move from swapper_pg_dir to tramp_pg_dir
	.macro tramp_unmap_kernel, tmp
	mrs	\tmp, ttbr1_el1
	sub	\tmp, \tmp, #TRAMP_SWAPPER_OFFSET
	orr	\tmp, \tmp, #USER_ASID_FLAG
	msr	ttbr1_el1, \tmp
	/*
	 * We avoid running the post_ttbr_update_workaround here because
	 * it's only needed by Cavium ThunderX, which requires KPTI to be
	 * disabled.
	 */
	.endm
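
/*
 * For illustration, the two macros above are exact inverses on the live
 * TTBR1_EL1 value and must stay in sync:
 *
 *   map:	ttbr1 = (ttbr1 + TRAMP_SWAPPER_OFFSET) & ~USER_ASID_FLAG
 *   unmap:	ttbr1 = (ttbr1 - TRAMP_SWAPPER_OFFSET) |  USER_ASID_FLAG
 *
 * i.e. swap between tramp_pg_dir and swapper_pg_dir while flipping the ASID
 * bit that separates userspace translations from kernel ones.
 */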

	.macro		tramp_data_read_var	dst, var
#ifdef CONFIG_RELOCATABLE
	ldr		\dst, .L__tramp_data_\var
	.ifndef		.L__tramp_data_\var
	.pushsection	".entry.tramp.rodata", "a", %progbits
	.align		3
.L__tramp_data_\var:
	.quad		\var
	.popsection
	.endif
#else
	/*
	 * As !RELOCATABLE implies !RANDOMIZE_BASE the address is always a
	 * compile time constant (and hence not secret and not worth hiding).
	 *
	 * As statically allocated kernel code and data always live in the top
	 * 47 bits of the address space we can sign-extend bit 47 and avoid an
	 * instruction to load the upper 16 bits (which must be 0xFFFF).
	 */
	movz	\dst, :abs_g2_s:\var
	movk	\dst, :abs_g1_nc:\var
	movk	\dst, :abs_g0_nc:\var
#endif
	.endm
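
/*
 * For illustration, the !RELOCATABLE sequence above builds the address in
 * three 16-bit chunks, with the signed g2 relocation providing the
 * sign-extension of bit 47 into bits [63:48]:
 *
 *   movz \dst, :abs_g2_s:\var	// bits [47:32], sign-extended upwards
 *   movk \dst, :abs_g1_nc:\var	// bits [31:16]
 *   movk \dst, :abs_g0_nc:\var	// bits [15:0]
 */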

#define BHB_MITIGATION_NONE	0
#define BHB_MITIGATION_LOOP	1
#define BHB_MITIGATION_FW	2
#define BHB_MITIGATION_INSN	3
	.macro tramp_ventry, vector_start, regsize, kpti, bhb
	.align	7
1:
	.if	\regsize == 64
	msr	tpidrro_el0, x30	// Restored in kernel_ventry
	.endif

	.if	\bhb == BHB_MITIGATION_LOOP
	/*
	 * This sequence must appear before the first indirect branch. i.e. the
	 * ret out of tramp_ventry. It appears here because x30 is free.
	 */
	__mitigate_spectre_bhb_loop	x30
	.endif // \bhb == BHB_MITIGATION_LOOP

	.if	\bhb == BHB_MITIGATION_INSN
	clearbhb
	isb
	.endif // \bhb == BHB_MITIGATION_INSN

	.if	\kpti == 1
	/*
	 * Defend against branch aliasing attacks by pushing a dummy
	 * entry onto the return stack and using a RET instruction to
	 * enter the full-fat kernel vectors.
	 */
	bl	2f
	b	.
2:
	tramp_map_kernel	x30
alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
	tramp_data_read_var	x30, vectors
alternative_if_not ARM64_WORKAROUND_CAVIUM_TX2_219_PRFM
	prfm	plil1strm, [x30, #(1b - \vector_start)]
alternative_else_nop_endif

	msr	vbar_el1, x30
	isb
	.else
	adr_l	x30, vectors
	.endif // \kpti == 1

	.if	\bhb == BHB_MITIGATION_FW
	/*
	 * The firmware sequence must appear before the first indirect branch.
	 * i.e. the ret out of tramp_ventry. But it also needs the stack to be
	 * mapped to save/restore the registers the SMC clobbers.
	 */
	__mitigate_spectre_bhb_fw
	.endif // \bhb == BHB_MITIGATION_FW

	add	x30, x30, #(1b - \vector_start + 4)
	ret
.org 1b + 128	// Did we overflow the ventry slot?
	.endm
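
/*
 * For illustration: each tramp_ventry must fit in one 128-byte vector slot
 * (enforced by the ".org 1b + 128" above), and the address it returns to is
 *
 *   x30 = vectors + (1b - \vector_start) + 4
 *
 * i.e. the matching slot in the full kernel vectors, skipping that slot's
 * first instruction, which is only needed when the vector is entered
 * directly rather than via this trampoline.
 */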

	.macro	generate_tramp_vector,	kpti, bhb
.Lvector_start\@:
	.space	0x400

	.rept	4
	tramp_ventry	.Lvector_start\@, 64, \kpti, \bhb
	.endr
	.rept	4
	tramp_ventry	.Lvector_start\@, 32, \kpti, \bhb
	.endr
	.endm

#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
/*
 * Exception vectors trampoline.
 * The order must match __bp_harden_el1_vectors and the
 * arm64_bp_harden_el1_vectors enum.
 */
	.pushsection ".entry.tramp.text", "ax"
	.align	11
SYM_CODE_START_LOCAL_NOALIGN(tramp_vectors)
#ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY
	generate_tramp_vector	kpti=1, bhb=BHB_MITIGATION_LOOP
	generate_tramp_vector	kpti=1, bhb=BHB_MITIGATION_FW
	generate_tramp_vector	kpti=1, bhb=BHB_MITIGATION_INSN
#endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */
	generate_tramp_vector	kpti=1, bhb=BHB_MITIGATION_NONE
SYM_CODE_END(tramp_vectors)

SYM_CODE_START_LOCAL(tramp_exit)
	tramp_unmap_kernel	x29
	mrs	x29, far_el1		// restore x29
	eret
	sb
SYM_CODE_END(tramp_exit)
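
/*
 * For illustration: the "restore x29" above works because the kernel exit
 * path stashes the real x29 in FAR_EL1 before branching here, leaving x29
 * free to serve as the tramp_unmap_kernel scratch register.
 */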
	.popsection				// .entry.tramp.text
#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */

/*
 * Exception vectors for spectre mitigations on entry from EL1 when
 * kpti is not in use.
 */
	.macro generate_el1_vector, bhb
.Lvector_start\@:
	kernel_ventry	1, t, 64, sync		// Synchronous EL1t
	kernel_ventry	1, t, 64, irq		// IRQ EL1t
	kernel_ventry	1, t, 64, fiq		// FIQ EL1t
	kernel_ventry	1, t, 64, error		// Error EL1t

	kernel_ventry	1, h, 64, sync		// Synchronous EL1h
	kernel_ventry	1, h, 64, irq		// IRQ EL1h
	kernel_ventry	1, h, 64, fiq		// FIQ EL1h
	kernel_ventry	1, h, 64, error		// Error EL1h

	.rept	4
	tramp_ventry	.Lvector_start\@, 64, 0, \bhb
	.endr
	.rept	4
	tramp_ventry	.Lvector_start\@, 32, 0, \bhb
	.endr
	.endm

/* The order must match tramp_vecs and the arm64_bp_harden_el1_vectors enum. */
	.pushsection ".entry.text", "ax"
	.align	11
SYM_CODE_START(__bp_harden_el1_vectors)
#ifdef CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY
	generate_el1_vector	bhb=BHB_MITIGATION_LOOP
	generate_el1_vector	bhb=BHB_MITIGATION_FW
	generate_el1_vector	bhb=BHB_MITIGATION_INSN
#endif /* CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY */
SYM_CODE_END(__bp_harden_el1_vectors)
	.popsection

/*
 * Register switch for AArch64. The callee-saved registers need to be saved
 * and restored. On entry:
 *   x0 = previous task_struct (must be preserved across the switch)
 *   x1 = next task_struct
 * Previous and next are guaranteed not to be the same.
 *
 */
SYM_FUNC_START(cpu_switch_to)
	mov	x10, #THREAD_CPU_CONTEXT
	add	x8, x0, x10
	mov	x9, sp
	stp	x19, x20, [x8], #16		// store callee-saved registers
	stp	x21, x22, [x8], #16
	stp	x23, x24, [x8], #16
	stp	x25, x26, [x8], #16
	stp	x27, x28, [x8], #16
	stp	x29, x9, [x8], #16
	str	lr, [x8]
	add	x8, x1, x10
	ldp	x19, x20, [x8], #16		// restore callee-saved registers
	ldp	x21, x22, [x8], #16
	ldp	x23, x24, [x8], #16
	ldp	x25, x26, [x8], #16
	ldp	x27, x28, [x8], #16
	ldp	x29, x9, [x8], #16
	ldr	lr, [x8]
	mov	sp, x9
	msr	sp_el0, x1
	ptrauth_keys_install_kernel x1, x8, x9, x10
	scs_save x0
	scs_load_current
	ret
SYM_FUNC_END(cpu_switch_to)
NOKPROBE(cpu_switch_to)
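
/*
 * For illustration, the save area walked above (the cpu_context at offset
 * THREAD_CPU_CONTEXT within task_struct) is laid out roughly as:
 *
 *   [base +  0]  x19, x20		// callee-saved
 *   [base + 16]  x21, x22
 *   [base + 32]  x23, x24
 *   [base + 48]  x25, x26
 *   [base + 64]  x27, x28
 *   [base + 80]  x29 (fp), saved sp
 *   [base + 96]  lr (pc to resume at)
 *
 * The stp/ldp sequences above must store and load in this same order.
 */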

/*
 * This is how we return from a fork.
 */
SYM_CODE_START(ret_from_fork)
	bl	schedule_tail
	cbz	x19, 1f				// not a kernel thread
	mov	x0, x20
	blr	x19
1:	get_current_task tsk
	mov	x0, sp
	bl	asm_exit_to_user_mode
	b	ret_to_user
SYM_CODE_END(ret_from_fork)
NOKPROBE(ret_from_fork)

/*
 * void call_on_irq_stack(struct pt_regs *regs,
 *		           void (*func)(struct pt_regs *));
 *
 * Calls func(regs) using this CPU's irq stack and shadow irq stack.
 */
SYM_FUNC_START(call_on_irq_stack)
#ifdef CONFIG_SHADOW_CALL_STACK
	get_current_task x16
	scs_save x16
	ldr_this_cpu scs_sp, irq_shadow_call_stack_ptr, x17
#endif

	/* Create a frame record to save our LR and SP (implicit in FP) */
	stp	x29, x30, [sp, #-16]!
	mov	x29, sp
	ldr_this_cpu x16, irq_stack_ptr, x17

	/* Move to the new stack and call the function there */
	add	sp, x16, #IRQ_STACK_SIZE
	blr	x1

	/*
	 * Restore the SP from the FP, and restore the FP and LR from the frame
	 * record.
	 */
	mov	sp, x29
	ldp	x29, x30, [sp], #16
	scs_load_current
	ret
SYM_FUNC_END(call_on_irq_stack)
NOKPROBE(call_on_irq_stack)
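
/*
 * For illustration: from C this is reached as call_on_irq_stack(regs, func);
 * x0 (regs) is left untouched and passed straight to func via the "blr x1"
 * above once sp has moved to the IRQ stack, while the frame record pushed on
 * the task stack keeps the unwinder's view of the call chain intact.
 */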

#ifdef CONFIG_ARM_SDE_INTERFACE
#include <asm/sdei.h>
#include <uapi/linux/arm_sdei.h>

.macro sdei_handler_exit exit_mode
	/* On success, this call never returns... */
	cmp	\exit_mode, #SDEI_EXIT_SMC
	b.ne	99f
	smc	#0
	b	.
99:	hvc	#0
	b	.
.endm
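
/*
 * For illustration: \exit_mode selects the firmware conduit (SMC vs. HVC)
 * recorded in sdei_exit_mode when the SDEI interface was probed. The "b ."
 * after each call only runs if the firmware returns at all, since a
 * successful completion resumes the interrupted context instead.
 */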

#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
/*
 * The regular SDEI entry point may have been unmapped along with the rest of
 * the kernel. This trampoline restores the kernel mapping to make the x1 memory
 * argument accessible.
 *
 * This clobbers x4, __sdei_handler() will restore this from firmware's
 * copy.
 */
.pushsection ".entry.tramp.text", "ax"
SYM_CODE_START(__sdei_asm_entry_trampoline)
	mrs	x4, ttbr1_el1
	tbz	x4, #USER_ASID_BIT, 1f
	tramp_map_kernel tmp=x4
	isb
	mov	x4, xzr

	/*
	 * Remember whether to unmap the kernel on exit.
	 */
1:	str	x4, [x1, #(SDEI_EVENT_INTREGS + S_SDEI_TTBR1)]

	tramp_data_read_var     x4, __sdei_asm_handler
	br	x4
SYM_CODE_END(__sdei_asm_entry_trampoline)
NOKPROBE(__sdei_asm_entry_trampoline)

/*
 * Make the exit call and restore the original ttbr1_el1
 *
 * x0 & x1: setup for the exit API call
 * x2: exit_mode
 * x4: struct sdei_registered_event argument from registration time.
 */
SYM_CODE_START(__sdei_asm_exit_trampoline)
	ldr	x4, [x4, #(SDEI_EVENT_INTREGS + S_SDEI_TTBR1)]
	cbnz	x4, 1f
	tramp_unmap_kernel	tmp=x4
1:	sdei_handler_exit exit_mode=x2
SYM_CODE_END(__sdei_asm_exit_trampoline)
NOKPROBE(__sdei_asm_exit_trampoline)
.popsection		// .entry.tramp.text
#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */

/*
 * Software Delegated Exception entry point.
 *
 * x0: Event number
 * x1: struct sdei_registered_event argument from registration time.
 * x2: interrupted PC
 * x3: interrupted PSTATE
 * x4: maybe clobbered by the trampoline
 *
 * Firmware has preserved x0->x17 for us, we must save/restore the rest to
 * follow SMC-CC. We save (or retrieve) all the registers as the handler may
 * want them.
 */
SYM_CODE_START(__sdei_asm_handler)
	stp     x2, x3, [x1, #SDEI_EVENT_INTREGS + S_PC]
	stp     x4, x5, [x1, #SDEI_EVENT_INTREGS + 16 * 2]
	stp     x6, x7, [x1, #SDEI_EVENT_INTREGS + 16 * 3]
	stp     x8, x9, [x1, #SDEI_EVENT_INTREGS + 16 * 4]
	stp     x10, x11, [x1, #SDEI_EVENT_INTREGS + 16 * 5]
	stp     x12, x13, [x1, #SDEI_EVENT_INTREGS + 16 * 6]
	stp     x14, x15, [x1, #SDEI_EVENT_INTREGS + 16 * 7]
	stp     x16, x17, [x1, #SDEI_EVENT_INTREGS + 16 * 8]
	stp     x18, x19, [x1, #SDEI_EVENT_INTREGS + 16 * 9]
	stp     x20, x21, [x1, #SDEI_EVENT_INTREGS + 16 * 10]
	stp     x22, x23, [x1, #SDEI_EVENT_INTREGS + 16 * 11]
	stp     x24, x25, [x1, #SDEI_EVENT_INTREGS + 16 * 12]
	stp     x26, x27, [x1, #SDEI_EVENT_INTREGS + 16 * 13]
	stp     x28, x29, [x1, #SDEI_EVENT_INTREGS + 16 * 14]
	mov	x4, sp
	stp     lr, x4, [x1, #SDEI_EVENT_INTREGS + S_LR]

	mov	x19, x1

	/* Store the registered-event for crash_smp_send_stop() */
	ldrb	w4, [x19, #SDEI_EVENT_PRIORITY]
	cbnz	w4, 1f
	adr_this_cpu dst=x5, sym=sdei_active_normal_event, tmp=x6
	b	2f
1:	adr_this_cpu dst=x5, sym=sdei_active_critical_event, tmp=x6
2:	str	x19, [x5]

#ifdef CONFIG_VMAP_STACK
	/*
	 * entry.S may have been using sp as a scratch register, find whether
	 * this is a normal or critical event and switch to the appropriate
	 * stack for this CPU.
	 */
	cbnz	w4, 1f
	ldr_this_cpu dst=x5, sym=sdei_stack_normal_ptr, tmp=x6
	b	2f
1:	ldr_this_cpu dst=x5, sym=sdei_stack_critical_ptr, tmp=x6
2:	mov	x6, #SDEI_STACK_SIZE
	add	x5, x5, x6
	mov	sp, x5
#endif

#ifdef CONFIG_SHADOW_CALL_STACK
	/* Use a separate shadow call stack for normal and critical events */
	cbnz	w4, 3f
	ldr_this_cpu dst=scs_sp, sym=sdei_shadow_call_stack_normal_ptr, tmp=x6
	b	4f
3:	ldr_this_cpu dst=scs_sp, sym=sdei_shadow_call_stack_critical_ptr, tmp=x6
4:
#endif

	/*
	 * We may have interrupted userspace, or a guest, or exit-from or
	 * return-to either of these. We can't trust sp_el0, restore it.
	 */
	mrs	x28, sp_el0
	ldr_this_cpu	dst=x0, sym=__entry_task, tmp=x1
	msr	sp_el0, x0

	/* If we interrupted the kernel point to the previous stack/frame. */
	and	x0, x3, #0xc
	mrs	x1, CurrentEL
	cmp	x0, x1
	csel	x29, x29, xzr, eq	// fp, or zero
	csel	x4, x2, xzr, eq		// elr, or zero

	stp	x29, x4, [sp, #-16]!
	mov	x29, sp

	add	x0, x19, #SDEI_EVENT_INTREGS
	mov	x1, x19
	bl	__sdei_handler

	msr	sp_el0, x28
	/* restore regs >x17 that we clobbered */
	mov	x4, x19         // keep x4 for __sdei_asm_exit_trampoline
	ldp	x28, x29, [x4, #SDEI_EVENT_INTREGS + 16 * 14]
	ldp	x18, x19, [x4, #SDEI_EVENT_INTREGS + 16 * 9]
	ldp	lr, x1, [x4, #SDEI_EVENT_INTREGS + S_LR]
	mov	sp, x1
	mov	x1, x0			// address to complete_and_resume
	/*
	 * x0 = (x0 <= SDEI_EV_FAILED) ?
	 * EVENT_COMPLETE : EVENT_COMPLETE_AND_RESUME
	 */
	cmp	x0, #SDEI_EV_FAILED
	mov_q	x2, SDEI_1_0_FN_SDEI_EVENT_COMPLETE
	mov_q	x3, SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME
	csel	x0, x2, x3, ls

	ldr_l	x2, sdei_exit_mode

	/* Clear the registered-event seen by crash_smp_send_stop() */
	ldrb	w3, [x4, #SDEI_EVENT_PRIORITY]
	cbnz	w3, 1f
	adr_this_cpu dst=x5, sym=sdei_active_normal_event, tmp=x6
	b	2f
1:	adr_this_cpu dst=x5, sym=sdei_active_critical_event, tmp=x6
2:	str	xzr, [x5]

alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0
	sdei_handler_exit exit_mode=x2
alternative_else_nop_endif

#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
	tramp_alias	dst=x5, sym=__sdei_asm_exit_trampoline
	br	x5
#endif
SYM_CODE_END(__sdei_asm_handler)
NOKPROBE(__sdei_asm_handler)

SYM_CODE_START(__sdei_handler_abort)
	mov_q	x0, SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME
	adr	x1, 1f
	ldr_l	x2, sdei_exit_mode
	sdei_handler_exit exit_mode=x2
	// exit the handler and jump to the next instruction.
	// Exit will stomp x0-x17, PSTATE, ELR_ELx, and SPSR_ELx.
1:	ret
SYM_CODE_END(__sdei_handler_abort)
NOKPROBE(__sdei_handler_abort)
#endif /* CONFIG_ARM_SDE_INTERFACE */