The current display code for perf stat iterates over the given CPUs and
builds the aggr map to collect the event data for the aggregation mode.
But uncore events have their own CPU maps, and there is no guarantee
that they match the aggr map. For example, per-package uncore events
generate a single value for each socket. When the user asks for per-core
aggregation, the output would contain 0 values for the other cores.
Thus it needs to check the uncore PMU's cpumask and whether it matches
the current aggregation id.
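A minimal standalone sketch of the check (the topology data and helper
below are illustrative stand-ins, not the actual perf structures or
aggr-map APIs): an aggregation id is printed only if the event's own
cpumask contains at least one CPU that maps to it.

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative only: a per-package uncore event counting on CPU 0, and a
 * toy topology mapping each CPU to its core.  In perf itself the event's
 * CPUs come from the uncore PMU's cpumask and the id from the aggr map. */
static const int event_cpus[] = { 0 };           /* the uncore PMU's cpumask */
static const int cpu_core_id[] = { 0, 1, 2, 3 }; /* core id of each CPU */

static bool event_counts_on_core(int core)
{
	for (size_t i = 0; i < sizeof(event_cpus) / sizeof(event_cpus[0]); i++) {
		if (cpu_core_id[event_cpus[i]] == core)
			return true;
	}
	return false;
}

int main(void)
{
	for (int core = 0; core < 4; core++) {
		if (!event_counts_on_core(core))
			continue;              /* skip S0-D0-C1..C3: no matching CPU */
		printf("S0-D0-C%d    1    <value> Joules power/energy-pkg/\n", core);
	}
	return 0;
}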
Before:
$ sudo ./perf stat -a --per-core -e power/energy-pkg/ sleep 1
Performance counter stats for 'system wide':
S0-D0-C0 1 3.73 Joules power/energy-pkg/
S0-D0-C1 0 <not counted> Joules power/energy-pkg/
S0-D0-C2 0 <not counted> Joules power/energy-pkg/
S0-D0-C3 0 <not counted> Joules power/energy-pkg/
1.001404046 seconds time elapsed
Some events weren't counted. Try disabling the NMI watchdog:
echo 0 > /proc/sys/kernel/nmi_watchdog
perf stat ...
echo 1 > /proc/sys/kernel/nmi_watchdog
Cores 1, 2 and 3 should not be printed because the event is handled on a
CPU in core 0 only. With this change, the output becomes as below.
After:
$ sudo ./perf stat -a --per-core -e power/energy-pkg/ sleep 1
Performance counter stats for 'system wide':
S0-D0-C0 1 2.09 Joules power/energy-pkg/
Fixes: b897613510 ("perf stat: Update event skip condition for system-wide per-thread mode and merged uncore and hybrid events")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230125192431.2929677-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In 'perf stat' with the CSV output option, the number of fields in the
metrics output does not match the number of fields in the other event
output lines.
Sample output below (after applying the patch to fix printing of
os->prefix):
# ./perf stat -x, --per-socket -a -C 1 ls
S0,1,82.11,msec,cpu-clock,82111626,100.00,1.000,CPUs utilized
S0,1,2,,context-switches,82109314,100.00,24.358,/sec
------
====> S0,1,,,,,,,1.71,stalled cycles per insn
The above command line uses "," as the field separator via the "-x,"
option, and the per-socket option displays the socket value as the first
field. But here the last line, for "stalled cycles per insn", has more
separators. Each CSV output line is expected to have 8 field separators
(for the 9 fields), whereas the last line has 9 "," in the result. The
patch fixes this issue.
The counter stats are displayed by the function
"perf_stat__print_shadow_stats" in "util/stat-shadow.c". While printing
the stats info for "stalled cycles per insn", the function
"new_line_csv" is used as the new_line callback.
The fields printed in each line are: "Socket_id,aggr
nr,Avg,unit,event_name,run,enable_percent,ratio,unit"
The metric output prints Socket_id, aggr nr, ratio and unit. It has to
skip the remaining five fields, i.e.
Avg, unit, event_name, run, enable_percent. The CSV line callback uses
"os->nfields" to know the number of fields to skip to match the other
lines.
Currently it is set as:
os.nfields = 3 + aggr_fields[config->aggr_mode] + (counter->cgrp ? 1 : 0);
But for the aggregation modes, csv_sep already gets printed along with
each field (function "aggr_printout" in util/stat-display.c), so
aggr_fields can be removed from nfields and the number of fields to skip
can be fixed at 4. This is to skip the fields for avg, unit, event name,
run and enable_percent, which needs 4 CSV separators. The patch removes
aggr_fields and uses 4 as the fixed number of os->nfields to skip.
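Roughly, the column matching can be pictured with the standalone sketch
below. These are simplified stand-ins, not the real callbacks from
util/stat-display.c, and it assumes (for illustration) that the metric
printer emits one leading separator of its own before the value, which
together with the aggregation prefix and the 4 padding separators yields
the same 8 separators as the event lines.

#include <stdio.h>

/* Simplified stand-in for the CSV new-line callback: re-print the
 * aggregation prefix, then emit 'nfields' empty fields so the metric
 * value lines up with the event lines. */
static void new_line_csv(FILE *fp, const char *aggr_prefix,
			 const char *sep, int nfields)
{
	fputc('\n', fp);
	fputs(aggr_prefix, fp);                /* e.g. "S0,1," from aggr_printout */
	for (int i = 0; i < nfields; i++)
		fputs(sep, fp);                /* skip avg/unit/event/run/enable% */
}

/* Simplified stand-in for the metric printer: leading separator, value,
 * separator, unit. */
static void print_metric_csv(FILE *fp, const char *sep,
			     const char *val, const char *unit)
{
	fprintf(fp, "%s%s%s%s", sep, val, sep, unit);
}

int main(void)
{
	printf("S0,1,79.08,msec,cpu-clock,79085956,100.00,1.000,CPUs utilized");
	new_line_csv(stdout, "S0,1,", ",", 4);
	print_metric_csv(stdout, ",", "0.81", "stalled cycles per insn");
	putchar('\n');
	return 0;
}

Run as-is, this prints an event line and a metric line that both carry
8 separators (9 fields).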
After the patch:
# ./perf stat -x, --per-socket -a -C 1 ls
S0,1,79.08,msec,cpu-clock,79085956,100.00,1.000,CPUs utilized
S0,1,7,,context-switches,79084176,100.00,88.514,/sec
------
====> S0,1,,,,,,0.81,stalled cycles per insn
Fixes: 92a61f6412 ("perf stat: Implement CSV metrics output")
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20221205042852.83382-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We need to check if we have an OS prefix (os->prefix), otherwise we
stumble on a metric segv that I'm now seeing in Arnaldo's tree:
$ gdb --args perf stat -M Backend true
...
Performance counter stats for 'true':
4,712,355 TOPDOWN.SLOTS # 17.3 % tma_core_bound
Program received signal SIGSEGV, Segmentation fault.
__strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:77
77 ../sysdeps/x86_64/multiarch/strlen-evex.S: No such file or directory.
(gdb) bt
#0 __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:77
#1 0x00007ffff74749a5 in __GI__IO_fputs (str=0x0, fp=0x7ffff75f5680 <_IO_2_1_stderr_>)
#2 0x0000555555779f28 in do_new_line_std (config=0x555555e077c0 <stat_config>, os=0x7fffffffbf10) at util/stat-display.c:356
#3 0x000055555577a081 in print_metric_std (config=0x555555e077c0 <stat_config>, ctx=0x7fffffffbf10, color=0x0, fmt=0x5555558b77b5 "%8.1f", unit=0x7fffffffbb10 "% tma_memory_bound", val=13.165355724442199) at util/stat-display.c:380
#4 0x00005555557768b6 in generic_metric (config=0x555555e077c0 <stat_config>, metric_expr=0x55555593d5b7 "((CYCLE_ACTIVITY.STALLS_MEM_ANY + EXE_ACTIVITY.BOUND_ON_STORES) / (CYCLE_ACTIVITY.STALLS_TOTAL + (EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL) + EXE_ACTIVITY.BOUND_ON_STORES))"..., metric_events=0x555555f334e0, metric_refs=0x555555ec81d0, name=0x555555f32e80 "TOPDOWN.SLOTS", metric_name=0x555555f26c80 "tma_memory_bound", metric_unit=0x55555593d5b1 "100%", runtime=0, map_idx=0, out=0x7fffffffbd90, st=0x555555e9e620 <rt_stat>) at util/stat-shadow.c:934
#5 0x0000555555778cac in perf_stat__print_shadow_stats (config=0x555555e077c0 <stat_config>, evsel=0x555555f289d0, avg=4712355, map_idx=0, out=0x7fffffffbd90, metric_events=0x555555e078e8 <stat_config+296>, st=0x555555e9e620 <rt_stat>) at util/stat-shadow.c:1329
#6 0x000055555577b6a0 in printout (config=0x555555e077c0 <stat_config>, os=0x7fffffffbf10, uval=4712355, run=325322, ena=325322, noise=4712355, map_idx=0) at util/stat-display.c:741
#7 0x000055555577bc74 in print_counter_aggrdata (config=0x555555e077c0 <stat_config>, counter=0x555555f289d0, s=0, os=0x7fffffffbf10) at util/stat-display.c:838
#8 0x000055555577c1d8 in print_counter (config=0x555555e077c0 <stat_config>, counter=0x555555f289d0, os=0x7fffffffbf10) at util/stat-display.c:957
#9 0x000055555577dba0 in evlist__print_counters (evlist=0x555555ec3610, config=0x555555e077c0 <stat_config>, _target=0x555555e01c80 <target>, ts=0x0, argc=1, argv=0x7fffffffe450) at util/stat-display.c:1413
#10 0x00005555555fc821 in print_counters (ts=0x0, argc=1, argv=0x7fffffffe450) at builtin-stat.c:1040
#11 0x000055555560091a in cmd_stat (argc=1, argv=0x7fffffffe450) at builtin-stat.c:2665
#12 0x00005555556b1eea in run_builtin (p=0x555555e11f70 <commands+336>, argc=4, argv=0x7fffffffe450) at perf.c:322
#13 0x00005555556b2181 in handle_internal_command (argc=4, argv=0x7fffffffe450) at perf.c:376
#14 0x00005555556b22d7 in run_argv (argcp=0x7fffffffe27c, argv=0x7fffffffe270) at perf.c:420
#15 0x00005555556b26ef in main (argc=4, argv=0x7fffffffe450) at perf.c:550
(gdb)
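The failing call is the fputs() of a NULL os->prefix reached from
do_new_line_std() in frame #2. A minimal standalone illustration of the
guard (the struct and function below are simplified stand-ins, not the
actual perf internals):

#include <stdio.h>

/* Simplified stand-in for the output state: prefix is NULL unless an
 * interval/CSV prefix is in use. */
struct outstate {
	FILE *fh;
	const char *prefix;
};

static void do_new_line(struct outstate *os)
{
	fputc('\n', os->fh);
	if (os->prefix)                /* the missing check: fputs(NULL) crashes in strlen */
		fputs(os->prefix, os->fh);
}

int main(void)
{
	struct outstate os = { .fh = stderr, .prefix = NULL };

	do_new_line(&os);              /* safe: nothing printed for a NULL prefix */
	os.prefix = "     ";
	do_new_line(&os);              /* prefix printed before the continuation line */
	return 0;
}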
Fixes: f123b2d84e ("perf stat: Remove prefix argument in print_metric_headers()")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/CAP-5=fUOjSM5HajU9TCD6prY39LbX4OQbkEbtKPPGRBPBN=_VQ@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It printed empty strings for each metric. I guess that is needed for the
CSV output to match the column number. We could just ignore the empty
metrics in JSON, but that ended up with a broken JSON object with a
trailing comma.
So I added a dummy '"metric-value" : "none"' part. To do that, it needs
to pass struct outstate to print_metric_end() to check whether any
metric value has been printed or not.
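A rough standalone sketch of that approach (the names follow the
description above, but the bodies are illustrative, not the actual perf
code): the output state remembers whether any metric value was printed,
and print_metric_end() emits the dummy entry otherwise.

#include <stdbool.h>
#include <stdio.h>

/* Illustrative stand-in for the JSON output state. */
struct outstate {
	FILE *fh;
	bool  first;   /* no metric value printed for this entry yet */
};

/* The aggregation prefix already ends with a comma, expecting at least
 * one metric to follow. */
static void print_entry_start(struct outstate *os)
{
	fprintf(os->fh,
		"{\"socket\" : \"S0\", \"cpu-count\" : 8, \"cgroup\" : \"system.slice\", ");
	os->first = true;
}

static void print_metric_json(struct outstate *os, const char *name,
			      const char *val)
{
	if (val == NULL || *val == '\0')
		return;                        /* ignore empty metrics ... */
	if (!os->first)
		fputs(", ", os->fh);
	fprintf(os->fh, "\"%s\" : \"%s\"", name, val);
	os->first = false;
}

static void print_metric_end(struct outstate *os)
{
	/* ... but if nothing was printed, the trailing comma from the
	 * prefix would leave a broken object, so emit a dummy value. */
	if (os->first)
		fputs("\"metric-value\" : \"none\"", os->fh);
	fputs("}\n", os->fh);
}

int main(void)
{
	struct outstate os = { .fh = stdout };

	print_entry_start(&os);
	print_metric_json(&os, "GHz", "");     /* empty metric: skipped */
	print_metric_end(&os);
	return 0;
}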
Before:
# perf stat -aj --metric-only --per-socket --for-each-cgroup system.slice true
{"socket" : "S0", "cpu-count" : 8, "cgroup" : "system.slice", "" : "", "" : "", "" : "", "" : "", "" : "", "" : "", "" : "", "" : ""}
After:
# perf stat -aj --metric-only --per-socket --for-each-cgroup system.slice true
{"socket" : "S0", "cpu-count" : 8, "cgroup" : "system.slice", "metric-value" : "none"}
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221123180208.2068936-16-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It generated broken JSON output when an aggregation mode or cgroup is
used with the --metric-only option. Also get rid of the header line and
make the output a single line for each entry.
It needs to know whether the current metric is the first one or not, so
add a 'first' field in the outstate and mark it false after printing.
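As a standalone illustration of the comma handling (simplified
stand-ins, not the actual perf code): the 'first' flag decides whether a
separating comma is needed, so all metrics of an entry land in one JSON
object on one line.

#include <stdbool.h>
#include <stdio.h>

struct outstate {
	FILE *fh;
	bool  first;
};

static void print_metric_json(struct outstate *os, const char *unit,
			      const char *val)
{
	fprintf(os->fh, "%s\"%s\" : \"%s\"", os->first ? "" : ", ", unit, val);
	os->first = false;             /* only the first metric omits the comma */
}

int main(void)
{
	struct outstate os = { .fh = stdout, .first = true };

	fputc('{', os.fh);
	print_metric_json(&os, "GHz", "0.990");
	print_metric_json(&os, "insn per cycle", "2.06");
	print_metric_json(&os, "branch-misses of all branches", "0.59");
	fputs("}\n", os.fh);
	return 0;
}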
Before:
# perf stat -a -j --metric-only true
{"unit" : "GHz"}{"unit" : "insn per cycle"}{"unit" : "branch-misses of all branches"}
{{"metric-value" : "0.797"}{"metric-value" : "1.65"}{"metric-value" : "0.89"}
^
# perf stat -a -j --metric-only --per-socket true
{"unit" : "GHz"}{"unit" : "insn per cycle"}{"unit" : "branch-misses of all branches"}
{"socket" : "S0", "aggregate-number" : 8, {"metric-value" : "0.295"}{"metric-value" : "1.88"}{"metric-value" : "0.64"}
^
After:
# perf stat -a -j --metric-only true
{"GHz" : "0.990", "insn per cycle" : "2.06", "branch-misses of all branches" : "0.59"}
# perf stat -a -j --metric-only --per-socket true
{"socket" : "S0", "aggregate-number" : 8, "GHz" : "0.439", "insn per cycle" : "2.14", "branch-misses of all branches" : "0.51"}
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221123180208.2068936-14-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We don't care about the alignment in the CSV output as it's intended for machine
processing. Let's get rid of it to make the output more compact.
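As a toy illustration of what dropping the alignment means (these format
strings are illustrative, not the ones used by perf): the aligned output
pads each value to a fixed width, the compact output prints just the
value.

#include <stdio.h>

int main(void)
{
	const char *sep = ",";

	/* Aligned: the value is right-justified in a fixed-width column. */
	printf("%s%18.2f%s%s\n", sep, 224.75, sep, "msec");
	/* Compact: no padding, which is all a CSV consumer needs. */
	printf("%s%.2f%s%s\n", sep, 224.75, sep, "msec");
	return 0;
}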
Before:
# perf stat -a --summary -I 1 -x, true
0.001149309,219.20,msec,cpu-clock,219322251,100.00,219.200,CPUs utilized
0.001149309,144,,context-switches,219241902,100.00,656.935,/sec
0.001149309,38,,cpu-migrations,219173705,100.00,173.358,/sec
0.001149309,61,,page-faults,219093635,100.00,278.285,/sec
0.001149309,10679310,,cycles,218746228,100.00,0.049,GHz
0.001149309,6288296,,instructions,218589869,100.00,0.59,insn per cycle
0.001149309,1386904,,branches,218428851,100.00,6.327,M/sec
0.001149309,56863,,branch-misses,218219951,100.00,4.10,of all branches
summary,219.20,msec,cpu-clock,219322251,100.00,20.025,CPUs utilized
summary,144,,context-switches,219241902,100.00,656.935,/sec
summary,38,,cpu-migrations,219173705,100.00,173.358,/sec
summary,61,,page-faults,219093635,100.00,278.285,/sec
summary,10679310,,cycles,218746228,100.00,0.049,GHz
summary,6288296,,instructions,218589869,100.00,0.59,insn per cycle
summary,1386904,,branches,218428851,100.00,6.327,M/sec
summary,56863,,branch-misses,218219951,100.00,4.10,of all branches
After:
0.001148449,224.75,msec,cpu-clock,224870589,100.00,224.747,CPUs utilized
0.001148449,176,,context-switches,224775564,100.00,783.103,/sec
0.001148449,38,,cpu-migrations,224707428,100.00,169.079,/sec
0.001148449,61,,page-faults,224629326,100.00,271.416,/sec
0.001148449,12172071,,cycles,224266368,100.00,0.054,GHz
0.001148449,6901907,,instructions,224108764,100.00,0.57,insn per cycle
0.001148449,1515655,,branches,223946693,100.00,6.744,M/sec
0.001148449,70027,,branch-misses,223735385,100.00,4.62,of all branches
summary,224.75,msec,cpu-clock,224870589,100.00,21.066,CPUs utilized
summary,176,,context-switches,224775564,100.00,783.103,/sec
summary,38,,cpu-migrations,224707428,100.00,169.079,/sec
summary,61,,page-faults,224629326,100.00,271.416,/sec
summary,12172071,,cycles,224266368,100.00,0.054,GHz
summary,6901907,,instructions,224108764,100.00,0.57,insn per cycle
summary,1515655,,branches,223946693,100.00,6.744,M/sec
summary,70027,,branch-misses,223735385,100.00,4.62,of all branches
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221123180208.2068936-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>