IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
commit 57f0ff059e upstream.
It's later supposed to be either a correct address or NULL. Without the
initialization, it may contain an undefined value which results in the
following segmentation fault:
# perf top --sort comm -g --ignore-callees=do_idle
terminates with:
#0 0x00007ffff56b7685 in __strlen_avx2 () from /lib64/libc.so.6
#1 0x00007ffff55e3802 in strdup () from /lib64/libc.so.6
#2 0x00005555558cb139 in hist_entry__init (callchain_size=<optimized out>, sample_self=true, template=0x7fffde7fb110, he=0x7fffd801c250) at util/hist.c:489
#3 hist_entry__new (template=template@entry=0x7fffde7fb110, sample_self=sample_self@entry=true) at util/hist.c:564
#4 0x00005555558cb4ba in hists__findnew_entry (hists=hists@entry=0x5555561d9e38, entry=entry@entry=0x7fffde7fb110, al=al@entry=0x7fffde7fb420,
sample_self=sample_self@entry=true) at util/hist.c:657
#5 0x00005555558cba1b in __hists__add_entry (hists=hists@entry=0x5555561d9e38, al=0x7fffde7fb420, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0,
sample=sample@entry=0x7fffde7fb4b0, sample_self=true, ops=0x0, block_info=0x0) at util/hist.c:288
#6 0x00005555558cbb70 in hists__add_entry (sample_self=true, sample=0x7fffde7fb4b0, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=0x5555561d9e38)
at util/hist.c:1056
#7 iter_add_single_cumulative_entry (iter=0x7fffde7fb460, al=<optimized out>) at util/hist.c:1056
#8 0x00005555558cc8a4 in hist_entry_iter__add (iter=iter@entry=0x7fffde7fb460, al=al@entry=0x7fffde7fb420, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffff7db0)
at util/hist.c:1231
#9 0x00005555557cdc9a in perf_event__process_sample (machine=<optimized out>, sample=0x7fffde7fb4b0, evsel=<optimized out>, event=<optimized out>, tool=0x7fffffff7db0)
at builtin-top.c:842
#10 deliver_event (qe=<optimized out>, qevent=<optimized out>) at builtin-top.c:1202
#11 0x00005555558a9318 in do_flush (show_progress=false, oe=0x7fffffff80e0) at util/ordered-events.c:244
#12 __ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP, timestamp=timestamp@entry=0) at util/ordered-events.c:323
#13 0x00005555558a9789 in __ordered_events__flush (timestamp=<optimized out>, how=<optimized out>, oe=<optimized out>) at util/ordered-events.c:339
#14 ordered_events__flush (how=OE_FLUSH__TOP, oe=0x7fffffff80e0) at util/ordered-events.c:341
#15 ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP) at util/ordered-events.c:339
#16 0x00005555557cd631 in process_thread (arg=0x7fffffff7db0) at builtin-top.c:1114
#17 0x00007ffff7bb817a in start_thread () from /lib64/libpthread.so.0
#18 0x00007ffff5656dc3 in clone () from /lib64/libc.so.6
If you look at the frame #2, the code is:
488 if (he->srcline) {
489 he->srcline = strdup(he->srcline);
490 if (he->srcline == NULL)
491 goto err_rawdata;
492 }
If he->srcline is not NULL (it is not NULL if it is uninitialized rubbish),
it gets strdupped and strdupping a rubbish random string causes the problem.
Also, if you look at the commit 1fb7d06a50, it adds the srcline property
into the struct, but not initializing it everywhere needed.
Committer notes:
Now I see, when using --ignore-callees=do_idle we end up here at line
2189 in add_callchain_ip():
2181 if (al.sym != NULL) {
2182 if (perf_hpp_list.parent && !*parent &&
2183 symbol__match_regex(al.sym, &parent_regex))
2184 *parent = al.sym;
2185 else if (have_ignore_callees && root_al &&
2186 symbol__match_regex(al.sym, &ignore_callees_regex)) {
2187 /* Treat this symbol as the root,
2188 forgetting its callees. */
2189 *root_al = al;
2190 callchain_cursor_reset(cursor);
2191 }
2192 }
And the al that doesn't have the ->srcline field initialized will be
copied to the root_al, so then, back to:
1211 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
1212 int max_stack_depth, void *arg)
1213 {
1214 int err, err2;
1215 struct map *alm = NULL;
1216
1217 if (al)
1218 alm = map__get(al->map);
1219
1220 err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
1221 iter->evsel, al, max_stack_depth);
1222 if (err) {
1223 map__put(alm);
1224 return err;
1225 }
1226
1227 err = iter->ops->prepare_entry(iter, al);
1228 if (err)
1229 goto out;
1230
1231 err = iter->ops->add_single_entry(iter, al);
1232 if (err)
1233 goto out;
1234
That al at line 1221 is what hist_entry_iter__add() (called from
sample__resolve_callchain()) saw as 'root_al', and then:
iter->ops->add_single_entry(iter, al);
will go on with al->srcline with a bogus value, I'll add the above
sequence to the cset and apply, thanks!
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
CC: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 1fb7d06a50 ("perf report Use srcline from callchain for hist entries")
Link: https //lore.kernel.org/r/20210719145332.29747-1-mpetlan@redhat.com
Reported-by: Juri Lelli <jlelli@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5501e9229a upstream.
In some cases, the number of cpus (nr_cpus_online) is confused with the
maximum cpu number (nr_cpus_avail), which results in the error in the
example below:
Example on system with 8 cpus:
Before:
# echo 0 > /sys/devices/system/cpu/cpu2/online
# ./perf record --kcore -e intel_pt// taskset --cpu-list 7 uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.147 MB perf.data ]
# ./perf script --itrace=e
Requested CPU 7 too large. Consider raising MAX_NR_CPUS
0x25908 [0x8]: failed to process type: 68 [Invalid argument]
After:
# ./perf script --itrace=e
#
Fixes: 8c7274691f ("perf machine: Replace MAX_NR_CPUS with perf_env::nr_cpus_online")
Fixes: 7df4e36a47 ("perf session: Replace MAX_NR_CPUS with perf_env::nr_cpus_online")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210107174159.24897-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The addr in PERF_RECORD_KSYMBOL events for non-jited bpf progs points to
the bpf interpreter, ie. within kernel text section. When processing the
unregister event, this causes unexpected removal of vmlinux_map,
crashing perf later in cleanup:
# perf record -- timeout --signal=INT 2s /usr/share/bcc/tools/execsnoop
PCOMM PID PPID RET ARGS
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.208 MB perf.data (5155 samples) ]
perf: tools/include/linux/refcount.h:131: refcount_sub_and_test: Assertion `!(new > val)' failed.
Aborted (core dumped)
# perf script -D|grep KSYM
0 0xa40 [0x48]: PERF_RECORD_KSYMBOL addr ffffffffa9b6b530 len 0 type 1 flags 0x0 name bpf_prog_f958f6eb72ef5af6
0 0xab0 [0x48]: PERF_RECORD_KSYMBOL addr ffffffffa9b6b530 len 0 type 1 flags 0x0 name bpf_prog_8c42dee26e8cd4c2
0 0xb20 [0x48]: PERF_RECORD_KSYMBOL addr ffffffffa9b6b530 len 0 type 1 flags 0x0 name bpf_prog_f958f6eb72ef5af6
108563691893 0x33d98 [0x58]: PERF_RECORD_KSYMBOL addr ffffffffa9b6b3b0 len 0 type 1 flags 0x0 name bpf_prog_bc5697a410556fc2_syscall__execve
108568518458 0x34098 [0x58]: PERF_RECORD_KSYMBOL addr ffffffffa9b6b3f0 len 0 type 1 flags 0x0 name bpf_prog_45e2203c2928704d_do_ret_sys_execve
109301967895 0x34830 [0x58]: PERF_RECORD_KSYMBOL addr ffffffffa9b6b3b0 len 0 type 1 flags 0x1 name bpf_prog_bc5697a410556fc2_syscall__execve
109302007356 0x348b0 [0x58]: PERF_RECORD_KSYMBOL addr ffffffffa9b6b3f0 len 0 type 1 flags 0x1 name bpf_prog_45e2203c2928704d_do_ret_sys_execve
perf: tools/include/linux/refcount.h:131: refcount_sub_and_test: Assertion `!(new > val)' failed.
Here the addresses match the bpf interpreter:
# grep -e ffffffffa9b6b530 -e ffffffffa9b6b3b0 -e ffffffffa9b6b3f0 /proc/kallsyms
ffffffffa9b6b3b0 t __bpf_prog_run224
ffffffffa9b6b3f0 t __bpf_prog_run192
ffffffffa9b6b530 t __bpf_prog_run32
Fix by not allowing vmlinux_map to be removed by PERF_RECORD_KSYMBOL
unregister event.
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20201016114718.54332-1-tommi.t.rantala@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename enum dso_kernel_type to enum dso_space_type, which seems like
better fit.
Committer notes:
This is used with 'struct dso'->kernel, which once was a boolean, so
DSO_SPACE__USER is zero, !zero means some sort of kernel space, be it
the host kernel space or a guest kernel space.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian reported that is_bpf_image is not working the way it was intended
- passing on trampolines and dispatcher names. Instead it returned true
for all the bpf names.
The reason even this logic worked properly is that all bpf objects, even
trampolines and dispatcher, were assigned DSO_BINARY_TYPE__BPF_IMAGE
binary_type.
The later for bpf_prog objects, the binary_type was fixed in bpf load
event processing, which is executed after the ksymbol code.
Fixing the is_bpf_image logic, so it properly recognizes trampoline and
dispatcher objects.
Fixes: 3c29d4483e ("perf annotate: Add basic support for bpf_image")
Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200512122310.3154754-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In LBR call stack mode, the depth of reconstructed LBR call stack limits
to the number of LBR registers.
For example, on skylake, the depth of reconstructed LBR call stack is
always <= 32.
# To display the perf.data header info, please use
# --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 6K of event 'cycles'
# Event count (approx.): 6487119731
#
# Children Self Command Shared Object Symbol
# ........ ........ ............... ..................
# ................................
99.97% 99.97% tchain_edit tchain_edit [.] f43
|
--99.64%--f11
f12
f13
f14
f15
f16
f17
f18
f19
f20
f21
f22
f23
f24
f25
f26
f27
f28
f29
f30
f31
f32
f33
f34
f35
f36
f37
f38
f39
f40
f41
f42
f43
For a call stack which is deeper than LBR limit, HW will overwrite the
LBR register with oldest branch. Only partial call stacks can be
reconstructed.
However, the overwritten LBRs may still be retrieved from previous
sample. At that moment, HW hasn't overwritten the LBR registers yet.
Perf tools can stitch those overwritten LBRs on current call stacks to
get a more complete call stack.
To determine if LBRs can be stitched, perf tools need to compare current
sample with previous sample.
- They should have identical LBR records (Same from, to and flags
values, and the same physical index of LBR registers).
- The searching starts from the base-of-stack of current sample.
Once perf determines to stitch the previous LBRs, the corresponding LBR
cursor nodes will be copied to 'lists'. The 'lists' is to track the LBR
cursor nodes which are going to be stitched.
When the stitching is over, the nodes will not be freed immediately.
They will be moved to 'free_lists'. Next stitching may reuse the space.
Both 'lists' and 'free_lists' will be freed when all samples are
processed.
Committer notes:
Fix the intel-pt.c initialization of the union with 'struct
branch_flags', that breaks the build with its unnamed union on older gcc
versions.
Uninline thread__free_stitch_list(), as it grew big and started dragging
includes to thread.h, so move it to thread.c where what it needs in
terms of headers are already there.
This fixes the build in several systems such as debian:experimental when
cross building to the MIPS32 architecture, i.e. in the other cases what
was needed was being included by sheer luck.
In file included from builtin-sched.c:11:
util/thread.h: In function 'thread__free_stitch_list':
util/thread.h:169:3: error: implicit declaration of function 'free' [-Werror=implicit-function-declaration]
169 | free(pos);
| ^~~~
util/thread.h:169:3: error: incompatible implicit declaration of built-in function 'free' [-Werror]
util/thread.h:19:1: note: include '<stdlib.h>' or provide a declaration of 'free'
18 | #include "callchain.h"
+++ |+#include <stdlib.h>
19 |
util/thread.h:174:3: error: incompatible implicit declaration of built-in function 'free' [-Werror]
174 | free(pos);
| ^~~~
util/thread.h:174:3: note: include '<stdlib.h>' or provide a declaration of 'free'
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
Link: http://lore.kernel.org/lkml/20200319202517.23423-13-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
LBR only collect the user call stack. To reconstruct a call stack, both
kernel call stack and user call stack are required. The function
resolve_lbr_callchain_sample() mix the kernel call stack and user call
stack.
Now, with the help of HW idx, perf tool can reconstruct a more complete
call stack by adding some user call stack from previous sample. However,
current implementation is hard to be extended to support it.
Current code path for resolve_lbr_callchain_sample()
for (j = 0; j < mix_chain_nr; j++) {
if (ORDER_CALLEE) {
if (kernel callchain)
Fill callchain info
else if (LBR callchain)
Fill callchain info
} else {
if (LBR callchain)
Fill callchain info
else if (kernel callchain)
Fill callchain info
}
add_callchain_ip();
}
With the patch,
if (ORDER_CALLEE) {
for (j = 0; j < NUM of kernel callchain) {
Fill callchain info
add_callchain_ip();
}
for (; j < mix_chain_nr) {
Fill callchain info
add_callchain_ip();
}
} else {
for (; j < NUM of LBR callchain) {
Fill callchain info
add_callchain_ip();
}
for (j = 0; j < mix_chain_nr) {
Fill callchain info
add_callchain_ip();
}
}
No functional changes.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
Link: http://lore.kernel.org/lkml/20200319202517.23423-7-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The low level index of raw branch records for the most recent branch can
be recorded in a sample with PERF_SAMPLE_BRANCH_HW_INDEX
branch_sample_type. Extend struct branch_stack to support it.
However, if the PERF_SAMPLE_BRANCH_HW_INDEX is not applied, only nr and
entries[] will be output by kernel. The pointer of entries[] could be
wrong, since the output format is different with new struct
branch_stack. Add a variable no_hw_idx in struct perf_sample to
indicate whether the hw_idx is output. Add get_branch_entry() to return
corresponding pointer of entries[0].
To make dummy branch sample consistent as new branch sample, add hw_idx
in struct dummy_branch_stack for cs-etm and intel-pt.
Apply the new struct branch_stack for synthetic events as well.
Extend test case sample-parsing to support new struct branch_stack.
Committer notes:
Renamed get_branch_entries() to perf_sample__branch_entries() to have
proper namespacing and pave the way for this to be moved to libperf,
eventually.
Add 'static' to that inline as it is in a header.
Add 'hw_idx' to 'struct dummy_branch_stack' in cs-etm.c to fix the build
on arm64.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
Link: http://lore.kernel.org/lkml/20200228163011.19358-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I forgot to fill in the map_symbol->maps field in append_inlines() which
then makes code down the line segfault when trying to deref it.
It doesn't make any sense to have an addr_location with its 'map' member
not NULL while its 'maps' is NULL, after all al->maps is where al->map
is in.
It is done that way so that we don't have to have in each 'struct map' a
pointer to the 'struct maps' it is in, as we had in the past when we
would have 'map->mg', before 'struct maps' was combined with 'struct
map_groups', because there was always a one-to-one relationship for
these structs.
This fixes a segfault when processing DWARF callgraphs in 'perf report'.
Reported-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 08f6680e62 ("perf tools: Add a 'struct map_groups' pointer to 'struct map_symbol'")
Link: http://lore.kernel.org/lkml/20191129160631.GD26963@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And pick the shortest name: 'struct maps'.
The split existed because we used to have two groups of maps, one for
functions and one for variables, but that only complicated things,
sometimes we needed to figure out what was at some address and then had
to first try it on the functions group and if that failed, fall back to
the variables one.
That split is long gone, so for quite a while we had only one struct
maps per struct map_groups, simplify things by combining those structs.
First patch is the minimum needed to merge both, follow up patches will
rename 'thread->mg' to 'thread->maps', etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-hom6639ro7020o708trhxh59@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And take it into account when looking up DSOs when we have the dso_id
fields obtained from somewhere, like from PERF_RECORD_MMAP2 records.
Instances of struct map pointing to the same DSO pathname but with
anything in dso_id different are in fact different DSOs, so better have
different 'struct dso' instances to reflect that. At some point we may
want to get copies of the contents of the different objects if we want
to do correct annotation or other analysis.
With this we get 'struct map' 24 bytes leaner:
$ pahole -C map ~/bin/perf
struct map {
union {
struct rb_node rb_node __attribute__((__aligned__(8))); /* 0 24 */
struct list_head node; /* 0 16 */
} __attribute__((__aligned__(8))); /* 0 24 */
u64 start; /* 24 8 */
u64 end; /* 32 8 */
_Bool erange_warned:1; /* 40: 0 1 */
_Bool priv:1; /* 40: 1 1 */
/* XXX 6 bits hole, try to pack */
/* XXX 3 bytes hole, try to pack */
u32 prot; /* 44 4 */
u64 pgoff; /* 48 8 */
u64 reloc; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
u64 (*map_ip)(struct map *, u64); /* 64 8 */
u64 (*unmap_ip)(struct map *, u64); /* 72 8 */
struct dso * dso; /* 80 8 */
refcount_t refcnt; /* 88 4 */
u32 flags; /* 92 4 */
/* size: 96, cachelines: 2, members: 13 */
/* sum members: 92, holes: 1, sum holes: 3 */
/* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
/* forced alignments: 1 */
/* last cacheline: 32 bytes */
} __attribute__((__aligned__(8)));
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-g4hxxmraplo7wfjmk384mfsb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
At some point in the past we needed to make sure we would get the long
name of modules and not just what we get from /proc/modules, but that
need, as described in the cset that introduced the adjustment function:
Fixes: c03d5184f0 ("perf machine: Adjust dso->long_name for offline module")
Without using the buildid-cache:
# lsmod | grep trusted
# insmod trusted.ko
# lsmod | grep trusted
trusted 24576 0
# strace -e open,openat perf probe -m ./trusted.ko key_seal |& grep trusted
openat(AT_FDCWD, "/sys/module/trusted/notes/.note.gnu.build-id", O_RDONLY) = 4
openat(AT_FDCWD, "/sys/module/trusted/notes/.note.gnu.build-id", O_RDONLY) = 7
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/.debug/root/trusted.ko/dd3d355d567394d540f527e093e0f64b95879584/probes", O_RDWR|O_CREAT, 0644) = 3
openat(AT_FDCWD, "/usr/lib/debug/root/trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/debug/root/trusted.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/.debug/trusted.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
openat(AT_FDCWD, "trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, ".debug/trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 4
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
probe:key_seal (on key_seal in trusted)
# perf probe -l
probe:key_seal (on key_seal in trusted)
#
No attempt at opening '[trusted]'.
Now using the build-id cache:
# rmmod trusted
# perf buildid-cache --add ./trusted.ko
# insmod trusted.ko
# strace -e open,openat perf probe -m ./trusted.ko key_seal |& grep trusted
openat(AT_FDCWD, "/sys/module/trusted/notes/.note.gnu.build-id", O_RDONLY) = 4
openat(AT_FDCWD, "/sys/module/trusted/notes/.note.gnu.build-id", O_RDONLY) = 7
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/.debug/root/trusted.ko/dd3d355d567394d540f527e093e0f64b95879584/probes", O_RDWR|O_CREAT, 0644) = 3
openat(AT_FDCWD, "/usr/lib/debug/root/trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/debug/root/trusted.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/.debug/trusted.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
openat(AT_FDCWD, "trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, ".debug/trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 4
openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
#
Again, no attempt at reading '[trusted]'.
Finally, adding a probe to that function and then using:
[root@quaco ~]# perf trace -e probe_perf:*/max-stack=16/ --max-events=2
0.000 perf/13456 probe_perf:dso__adjust_kmod_long_name(__probe_ip: 5492263)
dso__adjust_kmod_long_name (/home/acme/bin/perf)
machine__process_kernel_mmap_event (/home/acme/bin/perf)
machine__process_mmap_event (/home/acme/bin/perf)
perf_event__process_mmap (/home/acme/bin/perf)
machines__deliver_event (/home/acme/bin/perf)
perf_session__deliver_event (/home/acme/bin/perf)
perf_session__process_event (/home/acme/bin/perf)
process_simple (/home/acme/bin/perf)
reader__process_events (/home/acme/bin/perf)
__perf_session__process_events (/home/acme/bin/perf)
perf_session__process_events (/home/acme/bin/perf)
process_buildids (/home/acme/bin/perf)
record__finish_output (/home/acme/bin/perf)
__cmd_record (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
0.055 perf/13456 probe_perf:dso__adjust_kmod_long_name(__probe_ip: 5492263)
dso__adjust_kmod_long_name (/home/acme/bin/perf)
machine__process_kernel_mmap_event (/home/acme/bin/perf)
machine__process_mmap_event (/home/acme/bin/perf)
perf_event__process_mmap (/home/acme/bin/perf)
machines__deliver_event (/home/acme/bin/perf)
perf_session__deliver_event (/home/acme/bin/perf)
perf_session__process_event (/home/acme/bin/perf)
process_simple (/home/acme/bin/perf)
reader__process_events (/home/acme/bin/perf)
__perf_session__process_events (/home/acme/bin/perf)
perf_session__process_events (/home/acme/bin/perf)
process_buildids (/home/acme/bin/perf)
record__finish_output (/home/acme/bin/perf)
__cmd_record (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
#
This was the only path I could find using the perf tools that reach at this
function, then as of november/2019, if we put a probe in the line where the
actuall setting of the dso->long_name is done:
# perf trace -e probe_perf:*
^C[root@quaco ~]
# perf stat -e probe_perf:* -I 2000
2.000404265 0 probe_perf:dso__adjust_kmod_long_name
4.001142200 0 probe_perf:dso__adjust_kmod_long_name
6.001704120 0 probe_perf:dso__adjust_kmod_long_name
8.002398316 0 probe_perf:dso__adjust_kmod_long_name
10.002984010 0 probe_perf:dso__adjust_kmod_long_name
12.003597851 0 probe_perf:dso__adjust_kmod_long_name
14.004113303 0 probe_perf:dso__adjust_kmod_long_name
16.004582773 0 probe_perf:dso__adjust_kmod_long_name
18.005176373 0 probe_perf:dso__adjust_kmod_long_name
20.005801605 0 probe_perf:dso__adjust_kmod_long_name
22.006467540 0 probe_perf:dso__adjust_kmod_long_name
^C 23.683261941 0 probe_perf:dso__adjust_kmod_long_name
#
Its not being used at all.
To further test this I used kvm.ko as the offline module, i.e. removed
if from the buildid-cache by nuking it completely (rm -rf ~/.debug) and
moved it from the normal kernel distro path, removed the modules, stoped
the kvm guest, and then installed it manually, etc.
# rmmod kvm-intel
# rmmod kvm
# lsmod | grep kvm
# modprobe kvm-intel
modprobe: ERROR: ctx=0x55d3b1722260 path=/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm/kvm.ko.xz error=No such file or directory
modprobe: ERROR: ctx=0x55d3b1722260 path=/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm/kvm.ko.xz error=No such file or directory
modprobe: ERROR: could not insert 'kvm_intel': Unknown symbol in module, or unknown parameter (see dmesg)
# insmod ./kvm.ko
# modprobe kvm-intel
modprobe: ERROR: ctx=0x562f34026260 path=/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm/kvm.ko.xz error=No such file or directory
modprobe: ERROR: ctx=0x562f34026260 path=/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm/kvm.ko.xz error=No such file or directory
# lsmod | grep kvm
kvm_intel 299008 0
kvm 765952 1 kvm_intel
irqbypass 16384 1 kvm
#
# perf probe -x ~/bin/perf machine__findnew_module_map:12 mname=m.name:string filename=filename:string 'dso_long_name=map->dso->long_name:string' 'dso_name=map->dso->name:string'
# perf probe -l
probe_perf:machine__findnew_module_map (on machine__findnew_module_map:12@util/machine.c in /home/acme/bin/perf with mname filename dso_long_name dso_name)
# perf record
^C[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 3.416 MB perf.data (33956 samples) ]
# perf trace -e probe_perf:machine*
<SNIP>
6.322 perf/23099 probe_perf:machine__findnew_module_map(__probe_ip: 5492493, mname: "[salsa20_generic]", filename: "/lib/modules/5.3.8-200.fc30.x86_64/kernel/crypto/salsa20_generic.ko.xz", dso_long_name: "/lib/modules/5.3.8-200.fc30.x86_64/kernel/crypto/salsa20_generic.ko.xz", dso_name: "[salsa20_generic]")
6.375 perf/23099 probe_perf:machine__findnew_module_map(__probe_ip: 5492493, mname: "[kvm]", filename: "[kvm]", dso_long_name: "[kvm]", dso_name: "[kvm]")
<SNIP>
The filename doesn't come with the path, no point in trying to set the dso->long_name.
[root@quaco ~]# strace -e open,openat perf probe -m ./kvm.ko kvm_apic_local_deliver |& egrep 'open.*kvm'
openat(AT_FDCWD, "/sys/module/kvm_intel/notes/.note.gnu.build-id", O_RDONLY) = 4
openat(AT_FDCWD, "/sys/module/kvm/notes/.note.gnu.build-id", O_RDONLY) = 4
openat(AT_FDCWD, "/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 7
openat(AT_FDCWD, "/sys/module/kvm_intel/notes/.note.gnu.build-id", O_RDONLY) = 8
openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/.debug/root/kvm.ko/5955f426cb93f03f30f3e876814be2db80ab0b55/probes", O_RDWR|O_CREAT, 0644) = 3
openat(AT_FDCWD, "/usr/lib/debug/root/kvm.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/debug/root/kvm.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/.debug/kvm.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
openat(AT_FDCWD, "kvm.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, ".debug/kvm.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "kvm.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 4
openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
[root@quaco ~]#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-jlfew3lyb24d58egrp0o72o2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we pass that substructure around and with it consolidate lots of
functions that receive a (map, symbol) pair and now can receive just a
'struct map_symbol' pointer.
This further paves the way to add 'struct map_groups' to 'struct
map_symbol' so that we can have all we need for annotation so that we
can ditch 'struct map'->groups, i.e. have the map_groups pointer in a
more central place, avoiding the pointer in the 'struct map' that have
tons of instances.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-fs90ttd9q12l7989fo7pw81q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>