b2ad9549bf
15111 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Ravi Bangoria
|
b2ad9549bf |
perf evsel amd: Fix IBS error message
AMD IBS can do per-process profiling[1] and is no longer restricted to per-cpu or systemwide only. Remove stale error message. Also, checking just exclude_kernel is not sufficient since IBS does not support any privilege filters. So include all exclude_* checks. And finally, move these checks under tools/perf/arch/x86/ from generic code. Before: $ sudo ./perf record -e ibs_op//k -C 0 Error: AMD IBS may only be available in system-wide/per-cpu mode. Try using -a, or -C and workload affinity After: $ sudo ./perf record -e ibs_op//k -C 0 Error: AMD IBS doesn't support privilege filtering. Try again without the privilege modifiers (like 'k') at the end. [1] https://git.kernel.org/torvalds/c/30093056f7b2 Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: ananth.narayan@amd.com Cc: sandipan.das@amd.com Cc: santosh.shukla@amd.com Cc: irogers@google.com Cc: peterz@infradead.org Cc: adrian.hunter@intel.com Cc: acme@kernel.org Cc: jolsa@kernel.org Link: https://lore.kernel.org/r/20230630085230.437-1-ravi.bangoria@amd.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Vincent Whitchurch
|
5f06267b6e |
perf: unwind: Fix symfs with libdw
Pass the full path including the symfs (if any) to libdw. Without this unwinding fails with errors like this when a symfs is used: unwind: failed with 'No such file or directory'" Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: kernel@axis.com Cc: Ian Rogers <irogers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Link: https://lore.kernel.org/r/20230630-perf-libdw-symfs-v2-1-469760dd4d5b@axis.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
James Clark
|
78a175c462 |
perf symbol: Fix uninitialized return value in symbols__find_by_name()
found_idx and s aren't initialized, so if no symbol is found then the
assert at the end will index off the end of the array causing a
segfault. The function also doesn't return NULL when the symbol isn't
found even if the assert passes. Fix it by initializing the values and
only setting them when something is found.
Fixes the following test failure:
$ perf test 1
1: vmlinux symtab matches kallsyms : FAILED!
Fixes:
|
||
Namhyung Kim
|
2aefb4cc90 |
perf test: Test perf lock contention CSV output
To verify CSV output, just check the number of separators (",") using the tr and wc commands like this. grep -v "^#" ${result} | tr -d -c | wc -c Now it expects 6 columns (and 5 separators) in the output, but it may be changed later so count the field in the header first and compare it to the actual output lines. $ cat ${result} # output: contended, total wait, max wait, avg wait, type, caller 1, 28787, 28787, 28787, spinlock, raw_spin_rq_lock_nested+0x1b The test looks like below now: $ sudo ./perf test -v contention 86: kernel lock contention analysis test : --- start --- test child forked, pid 2705822 Testing perf lock record and perf lock contention Testing perf lock contention --use-bpf Testing perf lock record and perf lock contention at the same time Testing perf lock contention --threads Testing perf lock contention --lock-addr Testing perf lock contention --type-filter (w/ spinlock) Testing perf lock contention --lock-filter (w/ tasklist_lock) Testing perf lock contention --callstack-filter (w/ unix_stream) Testing perf lock contention --callstack-filter with task aggregation Testing perf lock contention CSV output test child finished with 0 ---- end ---- kernel lock contention analysis test: Ok Acked-by: Ian Rogers <irogers@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230628200141.2739587-5-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Namhyung Kim
|
f6027053f8 |
perf lock contention: Add --output option
To avoid formatting failures for example in CSV output due to debug messages, add --output option to put the result in a file. Unfortunately the short -o option was taken by the --owner already. $ sudo ./perf lock con -ab --output lock-out.txt -v sleep 1 Looking at the vmlinux_path (8 entries long) symsrc__init: cannot get elf header. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols $ head lock-out.txt contended total wait max wait avg wait type caller 3 76.79 us 26.89 us 25.60 us rwlock:R ep_poll_callback+0x2d 0xffffffff9a23f4b5 _raw_read_lock_irqsave+0x45 0xffffffff99bbd4dd ep_poll_callback+0x2d 0xffffffff999029f3 __wake_up_common+0x73 0xffffffff99902b82 __wake_up_common_lock+0x82 0xffffffff99fa5b1c sock_def_readable+0x3c 0xffffffff9a11521d unix_stream_sendmsg+0x18d 0xffffffff99f9fc9c sock_sendmsg+0x5c Suggested-by: Ian Rogers <irogers@google.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230628200141.2739587-4-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Namhyung Kim
|
69c5c9930d |
perf lock contention: Add -x option for CSV style output
Sometimes we want to process the output by external programs. Let's add the -x option to specify the field separator like perf stat. $ sudo ./perf lock con -ab -x, sleep 1 # output: contended, total wait, max wait, avg wait, type, caller 19, 194232, 21415, 10222, spinlock, process_one_work+0x1f0 15, 162748, 23843, 10849, rwsem:R, do_user_addr_fault+0x40e 4, 86740, 23415, 21685, rwlock:R, ep_poll_callback+0x2d 1, 84281, 84281, 84281, mutex, iwl_mvm_async_handlers_wk+0x135 8, 67608, 27404, 8451, spinlock, __queue_work+0x174 3, 58616, 31125, 19538, rwsem:W, do_mprotect_pkey+0xff 3, 52953, 21172, 17651, rwlock:W, do_epoll_wait+0x248 2, 30324, 19704, 15162, rwsem:R, do_madvise+0x3ad 1, 24619, 24619, 24619, spinlock, rcu_core+0xd4 The first line is a comment that shows the output format. Each line is separated by the given string ("," in this case). The time is printed in nsec without the unit so that it can be parsed easily. The characters can be used in the output like (":", "+" and ".") are not allowed for the -x option. $ ./perf lock con -x: Cannot use the separator that is already used Usage: perf lock contention [<options>] -x, --field-separator <separator> print result in CSV format with custom separator The stacktraces are printed in the same line separated by ":". The header is updated to show the stacktrace. Also the debug output is added at the end as a comment. $ sudo ./perf lock con -abv -x, -F wait_total sleep 1 Looking at the vmlinux_path (8 entries long) symsrc__init: cannot get elf header. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols # output: total wait, type, caller, stacktrace 37134, spinlock, rcu_core+0xd4, 0xffffffff9d0401e4 _raw_spin_lock_irqsave+0x44: 0xffffffff9c738114 rcu_core+0xd4: ... 21213, spinlock, raw_spin_rq_lock_nested+0x1b, 0xffffffff9d0407c0 _raw_spin_lock+0x30: 0xffffffff9c6d9cfb raw_spin_rq_lock_nested+0x1b: ... 20506, rwlock:W, ep_done_scan+0x2d, 0xffffffff9c9bc4dd ep_done_scan+0x2d: 0xffffffff9c9bd5f1 do_epoll_wait+0x6d1: ... 18044, rwlock:R, ep_poll_callback+0x2d, 0xffffffff9d040555 _raw_read_lock_irqsave+0x45: 0xffffffff9c9bc81d ep_poll_callback+0x2d: ... 17890, rwlock:W, do_epoll_wait+0x47b, 0xffffffff9c9bd39b do_epoll_wait+0x47b: 0xffffffff9c9be9ef __x64_sys_epoll_wait+0x6d1: ... 12114, spinlock, futex_wait_queue+0x60, 0xffffffff9d0407c0 _raw_spin_lock+0x30: 0xffffffff9d037cae __schedule+0xbe: ... # debug: total=7, bad=0, bad_task=0, bad_stack=0, bad_time=0, bad_data=0 Also note that some field (like lock symbols) can be empty. $ sudo ./perf lock con -abl -x, -E 10 sleep 1 # output: contended, total wait, max wait, avg wait, address, symbol, type 6, 275025, 61764, 45837, ffff9dcc9f7d60d0, , spinlock 18, 87716, 11196, 4873, ffff9dc540059000, , spinlock 2, 6472, 5499, 3236, ffff9dcc7f730e00, rq_lock, spinlock 3, 4429, 2341, 1476, ffff9dcc7f7b0e00, rq_lock, spinlock 3, 3974, 1635, 1324, ffff9dcc7f7f0e00, rq_lock, spinlock 4, 3290, 1326, 822, ffff9dc5f4e2cde0, , rwlock 3, 2894, 1023, 964, ffffffff9e0d7700, rcu_state, spinlock 1, 2567, 2567, 2567, ffff9dcc7f6b0e00, rq_lock, spinlock 4, 1259, 596, 314, ffff9dc69c2adde0, , rwlock 1, 934, 934, 934, ffff9dcc7f670e00, rq_lock, spinlock Acked-by: Ian Rogers <irogers@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230628200141.2739587-3-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Namhyung Kim
|
7b83d597c8 |
perf lock: Remove stale comments
The comment was for symbol_conf.sort_by_name which was deleted already. Let's get rid of the stale comments as well. Acked-by: Ian Rogers <irogers@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230628200141.2739587-2-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Ian Rogers
|
887e845f8c |
perf vendor events intel: Update tigerlake to 1.13
Updates were released in:
|
||
Ian Rogers
|
b5d2644da6 |
perf vendor events intel: Update skylakex to 1.31
Updates were released in:
|
||
Ian Rogers
|
ea3eafa08a |
perf vendor events intel: Update skylake to 57
Updates were released in:
|
||
Ian Rogers
|
938e4ad310 |
perf vendor events intel: Update sapphirerapids to 1.14
Updates were released in:
|
||
Ian Rogers
|
663655c91c |
perf vendor events intel: Update icelakex to 1.21
Updates were released in:
|
||
Ian Rogers
|
d1363b9454 |
perf vendor events intel: Update icelake to 1.19
Updates were released in:
|
||
Ian Rogers
|
0c9e39421c |
perf vendor events intel: Update cascadelakex to 1.19
Updates were released in:
|
||
Ian Rogers
|
dfc83cc877 |
perf vendor events intel: Update meteorlake to 1.03
1.03 events were released in:
|
||
Ian Rogers
|
7e74ece31a |
perf vendor events intel: Add rocketlake events/metrics
Add RocketLake events added to Intel perfmon in:
|
||
Ian Rogers
|
8076dc8c68 |
perf vendor metrics intel: Make transaction metrics conditional
Make the transaction metrics conditional on the cycles-tx event being present. This event may not be present when TSX extensions have been disabled. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Sohom Datta <sohomdatta1@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: Ingo Molnar <mingo@redhat.com> Link: https://lore.kernel.org/r/20230623151016.4193660-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Ian Rogers
|
eadc0040a9 |
perf jevents: Support for has_event function
Support for the new has_event function in metrics. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Sohom Datta <sohomdatta1@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: Ingo Molnar <mingo@redhat.com> Link: https://lore.kernel.org/r/20230623151016.4193660-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Ian Rogers
|
4a4a9bf907 |
perf expr: Add has_event function
Some events are dependent on firmware/kernel enablement. Allow such events to be detected when the metric is parsed so that the metric's event parsing doesn't fail. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Sohom Datta <sohomdatta1@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Cc: Ingo Molnar <mingo@redhat.com> Link: https://lore.kernel.org/r/20230623151016.4193660-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Namhyung Kim
|
36cee69f57 |
perf tools: Do not remove addr_location.thread in thread__find_map()
The thread__find_map() is to find a map for a given address in the
given thread's address space. It searches maps based on the cpu mode
and fills various information in the addr_location data structure.
It might change al->maps and al->map, but not al->thread. Then I
think no reason to not set the al->thread at the beginning.
Also get rid of the duplicate 'al->map = NULL' part.
Fixes:
|
||
Ian Rogers
|
628eaa4e87 |
perf pmus: Add placeholder core PMU
If loading a core PMU fails, legacy hardware/cache events may segv due to there being no PMU. Create a placeholder empty PMU for this case. This was discussed in: https://lore.kernel.org/lkml/20230614151625.2077-1-yangjihong1@huawei.com/ Reported-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Yang Jihong <yangjihong1@huawei.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Rob Herring <robh@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Link: https://lore.kernel.org/r/20230627182834.117565-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Namhyung Kim
|
ad5f604e18 |
perf test: Fix a compile error on pe-file-parsing.c
The dso__find_symbol_by_name() should be have idx pointer argument.
Found during the build-test.
$ make build-test
...
CC /tmp/tmp.6JwPK1xbWG/tests/pe-file-parsing.o
tests/pe-file-parsing.c: In function ‘run_dir’:
tests/pe-file-parsing.c:64:15: error: too few arguments to function ‘dso__find_symbol_by_name’
64 | sym = dso__find_symbol_by_name(dso, "main");
| ^~~~~~~~~~~~~~~~~~~~~~~~
In file included from tests/pe-file-parsing.c:16:
/usr/local/google/home/namhyung/project/linux/tools/perf/util/symbol.h:135:16: note: declared here
135 | struct symbol *dso__find_symbol_by_name(struct dso *dso, const char *name, size_t *idx);
| ^~~~~~~~~~~~~~~~~~~~~~~~
Fixes:
|
||
Fangrui Song
|
78987bb02a |
perf: Replace deprecated -target with --target= for Clang
-target has been deprecated since Clang 3.4 in 2013. Use the preferred
--target=bpf form instead. This matches how we use --target= in
scripts/Makefile.clang.
Signed-off-by: Fangrui Song <maskray@google.com>
Acked-by: Yonghong Song <yhs@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: llvm@lists.linux.dev
Cc: Ingo Molnar <mingo@redhat.com>
Cc: bpf@vger.kernel.org
Link:
|
||
Ian Rogers
|
710dffc969 |
perf pmu: Correct auto_merge_stats test
The original logic was to check is_pmu_hybrid() like in the below.
It just checks the name of PMU specifically for Intel hybrid systems
which means uncore PMU events should return false.
https://lore.kernel.org/all/20230527072210.2900565-35-irogers@google.com/
The is_pmu_hybrid() was replaced by arch-agnostic way but with the
incorrect condition which was fixed for core PMUs but not uncore.
This change fixes both.
Fixes:
|
||
Jiri Olsa
|
49a5e3edd3 |
perf tools: Add missing else to cmd_daemon subcommand condition
Namhyung reported segfault in perf daemon start command.
It's caused by extra check on argv[0] which is set to NULL by previous
__cmd_start call. Adding missing else to skip the extra check.
Fixes:
|
||
Yang Jihong
|
929ff679b6 |
perf tools: Add printing perf_event_attr config symbol in perf_event_attr__fprintf()
When printing perf_event_attr, always display perf_event_attr config and its symbol to improve the readability of debugging information. Before: # perf --debug verbose=2 record -e cycles,cpu-clock,sched:sched_switch,branch-load-misses,r101,mem:0x0 -C 0 true <SNIP> ------------------------------------------------------------ perf_event_attr: size 136 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 5 ------------------------------------------------------------ perf_event_attr: type 1 size 136 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 6 ------------------------------------------------------------ perf_event_attr: type 2 size 136 config 0x143 { sample_period, sample_freq } 1 sample_type IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER read_format ID disabled 1 inherit 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 7 ------------------------------------------------------------ perf_event_attr: type 3 size 136 config 0x10005 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 9 ------------------------------------------------------------ perf_event_attr: type 4 size 136 config 0x101 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 10 ------------------------------------------------------------ perf_event_attr: type 5 size 136 { sample_period, sample_freq } 1 sample_type IP|TID|TIME|CPU|IDENTIFIER read_format ID disabled 1 inherit 1 sample_id_all 1 exclude_guest 1 bp_type 3 { bp_len, config2 } 0x4 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 11 <SNIP> After: # perf --debug verbose=2 record -e cycles,cpu-clock,sched:sched_switch,branch-load-misses,r101,mem:0x0 -C 0 true <SNIP> ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 5 ------------------------------------------------------------ perf_event_attr: type 1 (PERF_TYPE_SOFTWARE) size 136 config 0 (PERF_COUNT_SW_CPU_CLOCK) { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 6 ------------------------------------------------------------ perf_event_attr: type 2 (PERF_TYPE_TRACEPOINT) size 136 config 0x143 (sched:sched_switch) { sample_period, sample_freq } 1 sample_type IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER read_format ID disabled 1 inherit 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 7 ------------------------------------------------------------ perf_event_attr: type 3 (PERF_TYPE_HW_CACHE) size 136 config 0x10005 (PERF_COUNT_HW_CACHE_RESULT_MISS | PERF_COUNT_HW_CACHE_OP_READ | PERF_COUNT_HW_CACHE_BPU) { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 9 ------------------------------------------------------------ perf_event_attr: type 4 (PERF_TYPE_RAW) size 136 config 0x101 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 10 ------------------------------------------------------------ perf_event_attr: type 5 (PERF_TYPE_BREAKPOINT) size 136 config 0 { sample_period, sample_freq } 1 sample_type IP|TID|TIME|CPU|IDENTIFIER read_format ID disabled 1 inherit 1 sample_id_all 1 exclude_guest 1 bp_type 3 { bp_len, config2 } 0x4 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 11 ------------------------------------------------------------ perf_event_attr: type 1 (PERF_TYPE_SOFTWARE) size 136 config 0x9 (PERF_COUNT_SW_DUMMY) { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID inherit 1 mmap 1 comm 1 freq 1 task 1 sample_id_all 1 mmap2 1 comm_exec 1 ksymbol 1 bpf_event 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 12 <SNIP> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: anshuman.khandual@arm.com Cc: mark.rutland@arm.com Cc: irogers@google.com Cc: jesussanp@google.com Cc: peterz@infradead.org Cc: acme@kernel.org Cc: jolsa@kernel.org Cc: alexander.shishkin@linux.intel.com Cc: mingo@redhat.com Link: https://lore.kernel.org/r/20230623054416.160858-5-yangjihong1@huawei.com [ fix perf import test by adding a dummy tracepoint_id__to_name() ] Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Yang Jihong
|
2e4dd65d7d |
perf tools: Add printing perf_event_attr type symbol in perf_event_attr__fprintf()
When printing perf_event_attr, always display perf_event_attr type and its symbol to improve the readability of debugging information. Before: # perf --debug verbose=2 record -e cycles,cpu-clock,sched:sched_switch,branch-load-misses,r101,mem:0x0 -C 0 true <SNIP> ------------------------------------------------------------ perf_event_attr: size 136 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 5 ------------------------------------------------------------ perf_event_attr: type 1 size 136 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 6 ------------------------------------------------------------ perf_event_attr: type 2 size 136 config 0x143 { sample_period, sample_freq } 1 sample_type IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER read_format ID disabled 1 inherit 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 7 ------------------------------------------------------------ perf_event_attr: type 3 size 136 config 0x10005 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 9 ------------------------------------------------------------ perf_event_attr: type 4 size 136 config 0x101 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 10 ------------------------------------------------------------ perf_event_attr: type 5 size 136 { sample_period, sample_freq } 1 sample_type IP|TID|TIME|CPU|IDENTIFIER read_format ID disabled 1 inherit 1 sample_id_all 1 exclude_guest 1 bp_type 3 { bp_len, config2 } 0x4 ------------------------------------------------------------ <SNIP> After: # perf --debug verbose=2 record -e cycles,cpu-clock,sched:sched_switch,branch-load-misses,r101,mem:0x0 -C 0 true <SNIP> ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 5 ------------------------------------------------------------ perf_event_attr: type 1 (PERF_TYPE_SOFTWARE) size 136 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 6 ------------------------------------------------------------ perf_event_attr: type 2 (PERF_TYPE_TRACEPOINT) size 136 config 0x143 { sample_period, sample_freq } 1 sample_type IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER read_format ID disabled 1 inherit 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 7 ------------------------------------------------------------ perf_event_attr: type 3 (PERF_TYPE_HW_CACHE) size 136 config 0x10005 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 9 ------------------------------------------------------------ perf_event_attr: type 4 (PERF_TYPE_RAW) size 136 config 0x101 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 10 ------------------------------------------------------------ perf_event_attr: type 5 (PERF_TYPE_BREAKPOINT) size 136 { sample_period, sample_freq } 1 sample_type IP|TID|TIME|CPU|IDENTIFIER read_format ID disabled 1 inherit 1 sample_id_all 1 exclude_guest 1 bp_type 3 { bp_len, config2 } 0x4 ------------------------------------------------------------ <SNIP> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: anshuman.khandual@arm.com Cc: mark.rutland@arm.com Cc: irogers@google.com Cc: jesussanp@google.com Cc: peterz@infradead.org Cc: acme@kernel.org Cc: jolsa@kernel.org Cc: alexander.shishkin@linux.intel.com Cc: mingo@redhat.com Link: https://lore.kernel.org/r/20230623054416.160858-4-yangjihong1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Yang Jihong
|
5492e72500 |
perf tools: Extend PRINT_ATTRf to support printing of members with a value of 0
When printing attr, members whose value is 0 will not be printed, we want to print the case where attr->type is 0(PERF_TYPE_HARDWARE), add `_a` param to PRINT_ATTRf macro to always print member when it is true No functional change. Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: anshuman.khandual@arm.com Cc: mark.rutland@arm.com Cc: irogers@google.com Cc: jesussanp@google.com Cc: peterz@infradead.org Cc: acme@kernel.org Cc: jolsa@kernel.org Cc: alexander.shishkin@linux.intel.com Cc: mingo@redhat.com Link: https://lore.kernel.org/r/20230623054416.160858-3-yangjihong1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Yang Jihong
|
220f88b5a1 |
perf trace-event-info: Add tracepoint_id_to_name() helper
Add tracepoint_id_to_name() helper to search for the trace events directory by given event id and return the corresponding tracepoint. Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: anshuman.khandual@arm.com Cc: mark.rutland@arm.com Cc: irogers@google.com Cc: jesussanp@google.com Cc: peterz@infradead.org Cc: acme@kernel.org Cc: jolsa@kernel.org Cc: alexander.shishkin@linux.intel.com Cc: mingo@redhat.com Link: https://lore.kernel.org/r/20230623054416.160858-2-yangjihong1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Ian Rogers
|
d82257d7f6 |
perf symbol: Remove now unused symbol_conf.sort_by_name
Previously used to specify symbol_name_rb_node was in use. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Carsten Haitzler <carsten.haitzler@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Jason Wang <wangborong@cdjrlc.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/20230623054520.4118442-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Ian Rogers
|
259dce914e |
perf symbol: Remove symbol_name_rb_node
Most perf commands want to sort symbols by name and this is done via an invasive rbtree that on 64-bit systems costs 24 bytes. Sorting the symbols in a DSO by name is optional and not done by default, however, if sorting is requested the 24 bytes is allocated for every symbol. This change removes the rbtree and uses a sorted array of symbol pointers instead (costing 8 bytes per symbol). As the array is created on demand then there are further memory savings. The complexity of sorting the array and using the rbtree are the same. To support going to the next symbol, the index of the current symbol needs to be passed around as a pair with the current symbol. This requires some API changes. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Carsten Haitzler <carsten.haitzler@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Jason Wang <wangborong@cdjrlc.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/20230623054520.4118442-3-irogers@google.com [ minimize change in symbols__sort_by_name() ] Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Ian Rogers
|
ce5b293405 |
perf dso: Sort symbols under lock
Determine if symbols are sorted, set the sorted flag and sort under the dso lock. Done in the interest of thread safety. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Carsten Haitzler <carsten.haitzler@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Jason Wang <wangborong@cdjrlc.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/20230623054520.4118442-2-irogers@google.com [ handle the similar code in util/probe-event.c ] Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Ian Rogers
|
ae7eb5baad |
perf build: Filter out BTF sources without a .BTF section
If generating vmlinux.h, make the code to generate it more tolerant by filtering out paths to kernels that lack a .BTF section. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Ingo Molnar <mingo@redhat.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230623041405.4039475-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Ian Rogers
|
06c39e742d |
perf test: Add build tests for BUILD_BPF_SKEL
Add tests with and without generating vmlinux.h. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Ingo Molnar <mingo@redhat.com> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230623041405.4039475-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Ian Rogers
|
5c45b21047 |
perf bpf: Move the declaration of struct rq
struct rq is defined in vmlinux.h when the vmlinux.h is generated,
this causes a redefinition failure if it is declared in
lock_contention.bpf.c. Move the definition to vmlinux.h for
consistency with the generated version.
Fixes:
|
||
Ian Rogers
|
b7a2d774c9 |
perf build: Add ability to build with a generated vmlinux.h
Commit |
||
Namhyung Kim
|
4d60e83dfc |
perf test: Skip metrics w/o event name in stat STD output linter
This test checks if the output of perf stat to match event names and
metrics. So it wants the output lines to have both event name and
metric. Otherwise it should skip the line.
On AMD machines, the instruction event has two metrics and they are printed
in separate lines. It makes the line without event name like below:
# perf stat -a sleep 1
Performance counter stats for 'system wide':
64,383.34 msec cpu-clock # 64.048 CPUs utilized
14,526 context-switches # 225.617 /sec
112 cpu-migrations # 1.740 /sec
190 page-faults # 2.951 /sec
807,558,652 cycles # 0.013 GHz (83.30%)
69,809,799 stalled-cycles-frontend # 8.64% frontend cycles idle (83.30%)
196,983,266 stalled-cycles-backend # 24.39% backend cycles idle (83.30%)
424,876,008 instructions # 0.53 insn per cycle
(here) ---> # 0.46 stalled cycles per insn (83.30%)
97,788,321 branches # 1.519 M/sec (83.34%)
4,147,377 branch-misses # 4.24% of all branches (83.46%)
1.005241409 seconds time elapsed
Also modern Intel machines have TopDown metrics which also don't have
event names.
# perf stat -a sleep 1
Performance counter stats for 'system wide':
8,015.39 msec cpu-clock # 7.996 CPUs utilized
5,823 context-switches # 726.477 /sec
189 cpu-migrations # 23.580 /sec
139 page-faults # 17.342 /sec
435,139,308 cycles # 0.054 GHz
193,891,345 instructions # 0.45 insn per cycle
42,773,028 branches # 5.336 M/sec
2,298,113 branch-misses # 5.37% of all branches
TopdownL1 # 25.5 % tma_backend_bound
/--> # 7.9 % tma_bad_speculation
(here) --+ # 55.7 % tma_frontend_bound
\--> # 10.9 % tma_retiring
1.002395924 seconds time elapsed
There is a check to skip TopdownL1 and TopdownL2 specifically but it
does not cover every affected lines.
So there is another check to skip the line if it has nothing on the left
side of # sign. Well.. it seems ok but that's not enough too.
When aggregation mode (like --per-socket or --per-thread) is used, it
adds some prefix (e.g. CPU socket, task name and PID) in the output
line. So the test code ignores them to normalize result.
A problem can happen for per-thread mode when task name contains one or
more spaces. It'd only ignore the first part of the task name, and it
thinks there's something more in the line so it would not skip.
# perf stat -a --perf-thread sleep 1
...
perf-21276 # 70.2 % tma_backend_bound
perf-21276 # 3.9 % tma_bad_speculation
perf-21276 # 10.5 % tma_frontend_bound
perf-21276 # 15.3 % tma_retiring
^^^^^^^^^^
(ignored)
my task-21328 # 70.2 % tma_backend_bound
my task-21328 # 3.9 % tma_bad_speculation
my task-21328 # 10.5 % tma_frontend_bound
my task-21328 # 15.3 % tma_retiring
^^
(ignored)
So I think it should look at the metric names instead. Add skip_metric
to hold the list of names to skip. It would contain 'stalled cycles per
insn' and metrics started by 'tma_'.
Fixes:
|
||
Namhyung Kim
|
8d3df7c39b |
perf test: Reorder event name checks in stat STD output linter
On AMD machines, the perf stat STD output test failed like below:
$ sudo ./perf test -v 98
98: perf stat STD output linter :
--- start ---
test child forked, pid 1841901
Checking STD output: no argswrong event metric.
expected 'GHz' in 108,121 stalled-cycles-frontend # 10.88% frontend cycles idle
test child finished with -1
---- end ----
perf stat STD output linter: FAILED!
This is because there are stalled-cycles-{frontend,backend} events are
used by default. The current logic checks the event_name array to find
which event it's running. But 'cycles' event comes before those stalled
cycles event and it matches first. So it tries to find 'GHz' metric
in the output (which is for the 'cycles') and fails.
Move the stalled-cycles-{frontend,backend} events before 'cycles' so
that it can find the stalled cycles events first.
Also add a space after 'no args' test name for consistency.
Fixes:
|
||
Ian Rogers
|
d06593aa00 |
perf pmu: Remove a hard coded cpu PMU assumption
The property of "cpu" when it has no cpu map is true on S390 with the PMU cpum_cf. Rather than maintain a list of such PMUs, reuse the is_core test result from the caller. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Link: https://lore.kernel.org/r/20230623043843.4080180-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Ian Rogers
|
d685819b40 |
perf pmus: Add notion of default PMU for JSON events
JSON events created in pmu-events.c by jevents.py may not specify a PMU they are associated with, in which case it is implied that it is the first core PMU. Care is needed to select this for regular 'cpu', s390 'cpum_cf' and ARMs many names as at the point the name is first needed the core PMUs list hasn't been initialized. Add a helper in perf_pmus to create this value, in the worst case by scanning sysfs. v2. Add missing close if fdopendir fails. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Link: https://lore.kernel.org/r/20230623043843.4080180-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Ian Rogers
|
33941dbd14 |
perf unwind: Fix map reference counts
The result of thread__find_map is the map in the passed in addr_location. Calling addr_location__exit puts that map and so copies need to do a map__get. Add in the corresponding map__puts. v2. Add missing map__put when dso is missing. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ivan Babrou <ivan@cloudflare.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Link: https://lore.kernel.org/r/20230623043107.4077510-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Namhyung Kim
|
e4ef3ef1bc |
perf test: Set PERF_EXEC_PATH for script execution
The task-analyzer.py script (actually every other scripts too) requires
PERF_EXEC_PATH env to find dependent libraries and scripts. For scripts
test to run correctly, it needs to set PERF_EXEC_PATH to the perf tool
source directory.
Instead of blindly update the env, let's check the directory structure
to make sure it points to the correct location.
Fixes:
|
||
Namhyung Kim
|
2d7f5540b8 |
perf script: Initialize buffer for regs_map()
The buffer is used to save register mapping in a sample. Normally
perf samples don't have any register so the string should be empty.
But it missed to initialize the buffer when the size is 0. And it's
passed to PyUnicode_FromString() with a garbage data.
So it returns NULL due to invalid input (instead of an empty unicode
string object) which causes a segfault like below:
Thread 2.1 "perf" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7c83780 (LWP 193775)]
0x00007ffff6dbca2e in PyDict_SetItem () from /lib/x86_64-linux-gnu/libpython3.11.so.1.0
(gdb) bt
#0 0x00007ffff6dbca2e in PyDict_SetItem () from /lib/x86_64-linux-gnu/libpython3.11.so.1.0
#1 0x00007ffff6dbf848 in PyDict_SetItemString () from /lib/x86_64-linux-gnu/libpython3.11.so.1.0
#2 0x000055555575824d in pydict_set_item_string_decref (val=0x0, key=0x5555557f96e3 "iregs", dict=0x7ffff5f7f780)
at util/scripting-engines/trace-event-python.c:145
#3 set_regs_in_dict (evsel=0x555555efc370, sample=0x7fffffffb870, dict=0x7ffff5f7f780)
at util/scripting-engines/trace-event-python.c:776
#4 get_perf_sample_dict (sample=sample@entry=0x7fffffffb870, evsel=evsel@entry=0x555555efc370, al=al@entry=0x7fffffffb2e0,
addr_al=addr_al@entry=0x0, callchain=callchain@entry=0x7ffff63ef440) at util/scripting-engines/trace-event-python.c:923
#5 0x0000555555758ec1 in python_process_tracepoint (sample=0x7fffffffb870, evsel=0x555555efc370, al=0x7fffffffb2e0, addr_al=0x0)
at util/scripting-engines/trace-event-python.c:1044
#6 0x00005555555c5db8 in process_sample_event (tool=<optimized out>, event=<optimized out>, sample=<optimized out>,
evsel=0x555555efc370, machine=0x555555ef4d68) at builtin-script.c:2421
#7 0x00005555556b7793 in perf_session__deliver_event (session=0x555555ef4b60, event=0x7ffff62ff7d0, tool=0x7fffffffc150,
file_offset=30672, file_path=0x555555efb8a0 "perf.data") at util/session.c:1639
#8 0x00005555556bc864 in do_flush (show_progress=true, oe=0x555555efb700) at util/ordered-events.c:245
#9 __ordered_events__flush (oe=oe@entry=0x555555efb700, how=how@entry=OE_FLUSH__FINAL, timestamp=timestamp@entry=0)
at util/ordered-events.c:324
#10 0x00005555556bd06e in ordered_events__flush (oe=oe@entry=0x555555efb700, how=how@entry=OE_FLUSH__FINAL)
at util/ordered-events.c:342
#11 0x00005555556b9d63 in __perf_session__process_events (session=0x555555ef4b60) at util/session.c:2465
#12 perf_session__process_events (session=0x555555ef4b60) at util/session.c:2627
#13 0x00005555555cb1d0 in __cmd_script (script=0x7fffffffc150) at builtin-script.c:2839
#14 cmd_script (argc=<optimized out>, argv=<optimized out>) at builtin-script.c:4365
#15 0x0000555555650811 in run_builtin (p=p@entry=0x555555ed8948 <commands+456>, argc=argc@entry=4, argv=argv@entry=0x7fffffffe240)
at perf.c:323
#16 0x0000555555597eb3 in handle_internal_command (argv=0x7fffffffe240, argc=4) at perf.c:377
#17 run_argv (argv=<synthetic pointer>, argcp=<synthetic pointer>) at perf.c:421
#18 main (argc=4, argv=0x7fffffffe240) at perf.c:537
Fixes:
|
||
James Clark
|
33fe7c0844 |
perf tests: Fix test_arm_callgraph_fp variable expansion
$TEST_PROGRAM is a command with spaces so it's supposed to be word
split. The referenced fix to fix the shellcheck warnings incorrectly
quoted this string so unquote it to fix the test.
At the same time silence the shellcheck warning for that line and fix
two more shellcheck errors at the end of the script.
Fixes:
|
||
Tiezhu Yang
|
765be32b97 |
perf symbol: Add LoongArch case in get_plt_sizes()
We can see the following definitions in bfd/elfnn-loongarch.c: #define PLT_HEADER_INSNS 8 #define PLT_HEADER_SIZE (PLT_HEADER_INSNS * 4) #define PLT_ENTRY_INSNS 4 #define PLT_ENTRY_SIZE (PLT_ENTRY_INSNS * 4) so plt header size is 32 and plt entry size is 16 on LoongArch, let us add LoongArch case in get_plt_sizes(). Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Acked-by: Huacai Chen <chenhuacai@loongson.cn> Reviewed-by: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: loongarch@lists.linux.dev Cc: loongson-kernel@lists.loongnix.cn Cc: Ingo Molnar <mingo@redhat.com> Link: https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=bfd/elfnn-loongarch.c Link: https://lore.kernel.org/r/1684835873-15956-1-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Namhyung Kim
|
d7c2d34d72 |
perf test: Remove x permission from lib/stat_output.sh
The commit |
||
Weilin Wang
|
1203a63da0 |
perf test: Rerun failed metrics with longer workload
Rerun failed metrics with longer workload to avoid false failure because sometimes metric value test fails when running in very short amount of time. Skip rerun if equal to or more than 20 metrics fail. Signed-off-by: Weilin Wang <weilin.wang@intel.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Cc: ravi.bangoria@amd.com Cc: Ian Rogers <irogers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Link: https://lore.kernel.org/r/20230620170027.1861012-4-weilin.wang@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Weilin Wang
|
a0f1cc18f9 |
perf test: Add skip list for metrics known would fail
Add skip list for metrics known would fail because some of the metrics are very likely to fail due to multiplexing or other errors. So add all of the flaky tests into the skip list. Signed-off-by: Weilin Wang <weilin.wang@intel.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Cc: ravi.bangoria@amd.com Cc: Ian Rogers <irogers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Link: https://lore.kernel.org/r/20230620170027.1861012-3-weilin.wang@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
Weilin Wang
|
3ad7092f51 |
perf test: Add metric value validation test
Add metric value validation test to check if metric values are with in correct value ranges. There are three types of tests included: 1) positive-value test checks if all the metrics collected are non-negative; 2) single-value test checks if the list of metrics have values in given value ranges; 3) relationship test checks if multiple metrics follow a given relationship, e.g. memory_bandwidth_read + memory_bandwidth_write = memory_bandwidth_total. Signed-off-by: Weilin Wang <weilin.wang@intel.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Cc: ravi.bangoria@amd.com Cc: Ian Rogers <irogers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Link: https://lore.kernel.org/r/20230620170027.1861012-2-weilin.wang@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> |
||
elisabeth
|
362f9c907f |
perf jit: Fix incorrect file name in DWARF line table
Fixes an issue where an incorrect filename was added in the DWARF line table of an ELF object file when calling 'perf inject --jit' due to not checking the filename of a debug entry against the repeated name marker (/xff/0). The marker is mentioned in the tools/perf/util/jitdump.h header, which describes the jitdump binary format, and indicitates that the filename in a debug entry is the same as the previous enrty. In the function emit_lineno_info(), in the file tools/perf/util/genelf-debug.c, the debug entry filename gets compared to the previous entry filename. If they are not the same, a new filename is added to the DWARF line table. However, since there is no check against '\xff\0', in some cases '\xff\0' is inserted as the filename into the DWARF line table. This can be seen with `objdump --dwarf=line` on the ELF file after `perf inject --jit`. It also makes no source code information show up in 'perf annotate'. Signed-off-by: Elisabeth Panholzer <elisabeth@leaningtech.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230602123815.255001-1-paniii94@gmail.com [ Fixed a trailing white space, removed a subject prefix ] Signed-off-by: Namhyung Kim <namhyung@kernel.org> |