// SPDX-License-Identifier: GPL-2.0-only
/*
 * arm64 callchain support
 *
 * Copyright (C) 2015 ARM Limited
 */
#include <linux/perf_event.h>
#include <linux/stacktrace.h>
#include <linux/uaccess.h>

#include <asm/pointer_auth.h>
struct frame_tail {
        struct frame_tail __user *fp;
        unsigned long lr;
} __attribute__((packed));

/*
 * Get the return address for a single stackframe and return a pointer to the
 * next frame tail.
 */
static struct frame_tail __user *
user_backtrace(struct frame_tail __user *tail,
               struct perf_callchain_entry_ctx *entry)
{
        struct frame_tail buftail;
        unsigned long err;
        unsigned long lr;

        /* Also check accessibility of one struct frame_tail beyond */
        if (!access_ok(tail, sizeof(buftail)))
                return NULL;

        pagefault_disable();
        err = __copy_from_user_inatomic(&buftail, tail, sizeof(buftail));
        pagefault_enable();

        if (err)
                return NULL;

        lr = ptrauth_strip_user_insn_pac(buftail.lr);

        perf_callchain_store(entry, lr);

        /*
         * Frame pointers should strictly progress back up the stack
         * (towards higher addresses).
         */
        if (tail >= buftail.fp)
                return NULL;

        return buftail.fp;
}

#ifdef CONFIG_COMPAT
/*
 * The registers we're interested in are at the end of the variable
 * length saved register structure. The fp points at the end of this
 * structure so the address of this struct is:
 * (struct compat_frame_tail *)(xxx->fp)-1
 *
 * This code has been adapted from the ARM OProfile support.
 */
struct compat_frame_tail {
        compat_uptr_t fp; /* a (struct compat_frame_tail *) in compat mode */
        u32 sp;
        u32 lr;
} __attribute__((packed));
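
/* As user_backtrace(), but walking a 32-bit (AArch32) frame record. */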
static struct compat_frame_tail __user *
compat_user_backtrace(struct compat_frame_tail __user *tail,
                      struct perf_callchain_entry_ctx *entry)
{
        struct compat_frame_tail buftail;
        unsigned long err;

        /* Also check accessibility of one struct frame_tail beyond */
        if (!access_ok(tail, sizeof(buftail)))
                return NULL;

        pagefault_disable();
        err = __copy_from_user_inatomic(&buftail, tail, sizeof(buftail));
        pagefault_enable();

        if (err)
                return NULL;

        perf_callchain_store(entry, buftail.lr);

        /*
         * Frame pointers should strictly progress back up the stack
         * (towards higher addresses).
         */
        if (tail + 1 >= (struct compat_frame_tail __user *)
                        compat_ptr(buftail.fp))
                return NULL;

        return (struct compat_frame_tail __user *)compat_ptr(buftail.fp) - 1;
}
#endif /* CONFIG_COMPAT */
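
/*
 * Walk the user-space frame-pointer chain of the interrupted context,
 * recording return addresses until the entry is full or the chain ends.
 */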
void perf_callchain_user(struct perf_callchain_entry_ctx *entry,
                         struct pt_regs *regs)
{
        if (perf_guest_state()) {
                /* We don't support guest os callchain now */
                return;
        }

        perf_callchain_store(entry, regs->pc);

        if (!compat_user_mode(regs)) {
                /* AARCH64 mode */
                struct frame_tail __user *tail;

                tail = (struct frame_tail __user *)regs->regs[29];

                while (entry->nr < entry->max_stack &&
                       tail && !((unsigned long)tail & 0x7))
                        tail = user_backtrace(tail, entry);
        } else {
#ifdef CONFIG_COMPAT
                /* AARCH32 compat mode */
                struct compat_frame_tail __user *tail;

                tail = (struct compat_frame_tail __user *)regs->compat_fp - 1;

                while ((entry->nr < entry->max_stack) &&
                        tail && !((unsigned long)tail & 0x3))
                        tail = compat_user_backtrace(tail, entry);
#endif
        }
}
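
/*
 * arch_stack_walk() callback: record one return address per unwind step
 * and stop once the callchain entry has no room for further entries.
 */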
static bool callchain_trace(void *data, unsigned long pc)
{
        struct perf_callchain_entry_ctx *entry = data;

        return perf_callchain_store(entry, pc) == 0;
}
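
/*
 * Walk the kernel stack of the interrupted context via arch_stack_walk(),
 * which starts the unwind from the PC and FP in the supplied pt_regs.
 */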
void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
                           struct pt_regs *regs)
{
        if (perf_guest_state()) {
                /* We don't support guest os callchain now */
                return;
        }

        arch_stack_walk(callchain_trace, entry, current, regs);
}
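
/* Return the instruction pointer to report for this sample. */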
unsigned long perf_instruction_pointer(struct pt_regs *regs)
{
        if (perf_guest_state())
                return perf_guest_get_ip();

        return instruction_pointer(regs);
}
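
/* Classify the sample as guest/host and user/kernel for the perf record. */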
unsigned long perf_misc_flags(struct pt_regs *regs)
{
        unsigned int guest_state = perf_guest_state();
        int misc = 0;

        if (guest_state) {
                if (guest_state & PERF_GUEST_USER)
                        misc |= PERF_RECORD_MISC_GUEST_USER;
                else
                        misc |= PERF_RECORD_MISC_GUEST_KERNEL;
        } else {
                if (user_mode(regs))
                        misc |= PERF_RECORD_MISC_USER;
                else
                        misc |= PERF_RECORD_MISC_KERNEL;
        }

        return misc;
}