f52679b788
perf tools: Support reading PERF_FORMAT_LOST
...
The recent kernel added lost count can be read from either read(2) or
ring buffer data with PERF_SAMPLE_READ. As it's a variable length data
we need to access it according to the format info.
But for perf tools use cases, PERF_FORMAT_ID is always set. So we can
only check PERF_FORMAT_LOST bit to determine the data format.
Add sample_read_value_size() and next_sample_read_value() helpers to
make it a bit easier to access. Use them in all places where it reads
the struct sample_read_value.
Signed-off-by: Namhyung Kim <namhyung@kernel.org >
Acked-by: Jiri Olsa <jolsa@kernel.org >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://lore.kernel.org/r/20220819003644.508916-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-19 15:56:56 -03:00
cf1258ac37
perf beauty: Update copy of linux/socket.h with the kernel sources
...
To pick the changes in:
7fa875b8e5
("net: copy from user before calling __copy_msghdr")
ebe73a284f
("net: Allow custom iter handler in msghdr")
7c701d92b2
("skbuff: carry external ubuf_info in msghdr")
c04245328d
("net: make __sys_accept4_file() static")
That don't result in any changes in the tables generated from that
header.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h
Cc: David Ahern <dsahern@kernel.org >
Cc: David S. Miller <davem@davemloft.net >
Cc: Dylan Yudaken <dylany@fb.com >
Cc: Jakub Kicinski <kuba@kernel.org >
Cc: Jens Axboe <axboe@kernel.dk >
Cc: Pavel Begunkov <asml.silence@gmail.com >
Cc: Yajun Deng <yajun.deng@linux.dev >
Link: https://lore.kernel.org/lkml/YvzYs+F+Xzq8Hvvp@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-19 15:30:33 -03:00
b2f10cd4e8
perf cpumap: Fix alignment for masks in event encoding
...
A mask encoding of a cpu map is laid out as:
u16 nr
u16 long_size
unsigned long mask[];
However, the mask may be 8-byte aligned meaning there is a 4-byte pad
after long_size. This means 32-bit and 64-bit builds see the mask as
being at different offsets. On top of this the structure is in the byte
data[] encoded as:
u16 type
char data[]
This means the mask's struct isn't the required 4 or 8 byte aligned, but
is offset by 2. Consequently the long reads and writes are causing
undefined behavior as the alignment is broken.
Fix the mask struct by creating explicit 32 and 64-bit variants, use a
union to avoid data[] and casts; the struct must be packed so the
layout matches the existing perf.data layout. Taking an address of a
member of a packed struct breaks alignment so pass the packed
perf_record_cpu_map_data to functions, so they can access variables with
the right alignment.
As the 64-bit version has 4 bytes of padding, optimizing writing to only
write the 32-bit version.
Committer notes:
Disable warnings about 'packed' that break the build in some arches like
riscv64, but just around that specific struct.
Signed-off-by: Ian Rogers <irogers@google.com >
Acked-by: Jiri Olsa <jolsa@kernel.org >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com >
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com >
Cc: Colin Ian King <colin.king@intel.com >
Cc: Dave Marchevsky <davemarchevsky@fb.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Kees Kook <keescook@chromium.org >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Riccardo Mancini <rickyman7@gmail.com >
Cc: Song Liu <songliubraving@fb.com >
Cc: Stephane Eranian <eranian@google.com >
Link: https://lore.kernel.org/r/20220614143353.1559597-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-19 15:30:28 -03:00
28526478cc
perf cpumap: Compute mask size in constant time
...
perf_cpu_map__max() computes the cpumap's maximum value, no need to
iterate over all values.
Signed-off-by: Ian Rogers <irogers@google.com >
Acked-by: Jiri Olsa <jolsa@kernel.org >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com >
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com >
Cc: Colin Ian King <colin.king@intel.com >
Cc: Dave Marchevsky <davemarchevsky@fb.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Kees Kook <keescook@chromium.org >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Riccardo Mancini <rickyman7@gmail.com >
Cc: Song Liu <songliubraving@fb.com >
Cc: Stephane Eranian <eranian@google.com >
Link: https://lore.kernel.org/r/20220614143353.1559597-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-19 12:26:58 -03:00
35ae6f09d8
perf cpumap: Synthetic events and const/static
...
Make the cpumap arguments const to make it clearer they are in rather
than out arguments. Make two functions static and remove external
declarations.
Signed-off-by: Ian Rogers <irogers@google.com >
Acked-by: Jiri Olsa <jolsa@kernel.org >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com >
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com >
Cc: Colin Ian King <colin.king@intel.com >
Cc: Dave Marchevsky <davemarchevsky@fb.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Kees Kook <keescook@chromium.org >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Riccardo Mancini <rickyman7@gmail.com >
Cc: Song Liu <songliubraving@fb.com >
Cc: Stephane Eranian <eranian@google.com >
Link: https://lore.kernel.org/r/20220614143353.1559597-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-19 12:26:58 -03:00
7391db6459
perf test: Refactor shell tests allowing subdirs
...
This is a prelude to adding more tests to shell tests and in order to
support putting those tests into subdirectories, I need to change the
test code that scans/finds and runs them.
To support subdirs I have to recurse so it's time to refactor the code
to allow this and centralize the shell script finding into one location
and only one single scan that builds a list of all the found tests in
memory instead of it being duplicated in 3 places.
This code also optimizes things like knowing the max width of desciption
strings (as we can do that while we scan instead of a whole new pass of
opening files).
It also more cleanly filters scripts to see only *.sh files thus
skipping random other files in directories like *~ backup files, other
random junk/data files that may appear and the scripts must be
executable to make the cut (this ensures the script lib dir is not seen
as scripts to run).
This avoids perf test running previous older versions of test scripts
that are editor backup files as well as skipping perf.data files that
may appear and so on.
Reviewed-by: Leo Yan <leo.yan@linaro.org >
Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com >
Tested-by: Leo Yan <leo.yan@linaro.org >
Cc: Mathieu Poirier <mathieu.poirier@linaro.org >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Suzuki Poulouse <suzuki.poulose@arm.com >
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20220812121641.336465-2-carsten.haitzler@foss.arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:13:20 -03:00
aa0d6e9cc2
perf vendor events: Update events for snowridgex
...
Update the events to v1.20, update events for snowridgex by the latest
event converter tools.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the snowridgex files into perf.
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Tested-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://lore.kernel.org/r/20220812085239.3089231-12-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:08:31 -03:00
ce87616d0d
perf vendor events: Update events and metrics for skylakex
...
Update the events to v1.28, the metrics are based on TMA 4.4 full, update
events and metrics for skylakex by the latest event converter tools.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the skylakex files into perf.
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Tested-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://lore.kernel.org/r/20220812085239.3089231-11-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:08:20 -03:00
74d8ca6d85
perf vendor events: Update metrics for sapphirerapids
...
The metrics are based on TMA 4.4 full, add new metrics “UNCORE_FREQ” for
sapphirerapids.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the sapphirerapids files into perf.
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Tested-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://lore.kernel.org/r/20220812085239.3089231-10-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:07:31 -03:00
107630e6a5
perf vendor events: Update events for knightslanding
...
Update the events to v9, update events for knightslanding by the latest
event converter tools.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the knightslanding files into perf.
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Tested-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://lore.kernel.org/r/20220812085239.3089231-9-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:07:25 -03:00
b823ee183d
perf vendor events: Update metrics for jaketown
...
The metrics are based on TMA 4.4 full, add new metrics “UNCORE_FREQ” for
jaketown.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the jaketown files into perf.
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Tested-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://lore.kernel.org/r/20220812085239.3089231-8-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:06:54 -03:00
cb73eeb95a
perf vendor events: Update metrics for ivytown
...
The metrics are based on TMA 4.4 full, add new metrics “UNCORE_FREQ” for
ivytown.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the ivytown files into perf.
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Tested-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://lore.kernel.org/r/20220812085239.3089231-7-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:06:38 -03:00
b8d4fbfb04
perf vendor events: Update events and metrics for icelakex
...
Update the events to v1.15, the metrics are based on TMA 4.4 full, update
events and metrics for icelakex by the latest event converter tools.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the icelakex files into perf.
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Tested-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://lore.kernel.org/r/20220812085239.3089231-6-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:06:24 -03:00
575c3640a4
perf vendor events: Update events and metrics for haswellx
...
Update the events to v25, the metrics are based on TMA 4.4 full, update
events and metrics for haswellx by the latest event converter tools.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the haswellx files into perf.
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Tested-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://lore.kernel.org/r/20220812085239.3089231-5-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:06:07 -03:00
c6e9c04418
perf vendor events: Update events and metrics for cascadelakex
...
Update to v16, the metrics are based on TMA 4.4 full, update events and add
new metrics “UNCORE_FREQ” for cascadelakex.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the cascadelakex files into perf.
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Tested-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://lore.kernel.org/r/20220812085239.3089231-4-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:05:51 -03:00
e349fa6cc8
perf vendor events: Update events and metrics for broadwellx
...
Update to v19, the metrics are based on TMA 4.4 full, update events and add
new metrics “UNCORE_FREQ” for broadwellx.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the broadwellx files into perf.
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Tested-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ian Rogers <irogers@google.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://lore.kernel.org/r/20220812085239.3089231-3-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:05:40 -03:00
bf79e18fdf
perf vendor events: Update metrics for broadwellde
...
The metrics are based on TMA 4.4 full, add new metrics “UNCORE_FREQ” for
broadwellde.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the broadwellde files into perf.
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Tested-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://lore.kernel.org/r/20220812085239.3089231-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:04:49 -03:00
d0313e629f
perf jevents: Fold strings optimization
...
If a shorter string ends a longer string then the shorter string may
reuse the longer string at an offset. For example, on x86 the event
arith.cycles_div_busy and cycles_div_busy can be folded, even though
they have difference names the strings are identical after 6
characters. cycles_div_busy can reuse the arith.cycles_div_busy string
at an offset of 6.
In pmu-events.c this looks like the following where the 'also:' lists
folded strings:
/* offset=177541 */ "arith.cycles_div_busy\000\000pipeline\000Cycles the divider is busy\000\000\000event=0x14,period=2000000,umask=0x1\000\000\000\000\000\000\000\000\000" /* also: cycles_div_busy\000\000pipeline\000Cycles the divider is busy\000\000\000event=0x14,period=2000000,umask=0x1\000\000\000\000\000\000\000\000\000 */
As jevents.py combines multiple strings for an event into a larger
string, the amount of folding is minimal as all parts of the event must
align. Other organizations can benefit more from folding, but lose space
by say recording more offsets.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220812230949.683239-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:03:09 -03:00
9118259c1d
perf jevents: Compress the pmu_events_table
...
The pmu_events array requires 15 pointers per entry which in position
independent code need relocating. Change the array to be an array of
offsets within a big C string. Only the offset of the first variable is
required, subsequent variables are stored in order after the \0
terminator (requiring a byte per variable rather than 4 bytes per
offset).
The file size savings are:
no jevents - the same 19,788,464bytes
x86 jevents - ~16.7% file size saving 23,744,288bytes vs 28,502,632bytes
all jevents - ~19.5% file size saving 24,469,056bytes vs 30,379,920bytes
default build options plus NO_LIBBFD=1.
For example, the x86 build savings come from .rela.dyn and
.data.rel.ro becoming smaller by 3,157,032bytes and 3,042,016bytes
respectively. .rodata increases by 1,432,448bytes, giving an overall
4,766,600bytes saving.
To make metric strings more shareable, the topic is changed from say
'skx metrics' to just 'metrics'.
To try to help with the memory layout the pmu_events are ordered as used
by perf qsort comparator functions.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220812230949.683239-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:02:32 -03:00
d3abd7b8bd
perf metrics: Copy entire pmu_event in find metric
...
The pmu_event passed to the pmu_events_table_for_each_event is invalid
after the loop. Copy the entire struct in metricgroup__find_metric.
Reduce the scope of this function to static.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220812230949.683239-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:02:21 -03:00
1ba3752aec
perf pmu-events: Hide the pmu_events
...
Hide that the pmu_event structs are an array with a new wrapper struct.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220812230949.683239-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:02:08 -03:00
660842e468
perf pmu-events: Don't assume pmu_event is an array
...
The current code assumes that a struct pmu_event can be iterated over
forward until a NULL pmu_event is encountered.
This makes it difficult to refactor pmu_event.
Add a loop function taking a callback function that's passed the struct
pmu_event.
This way the pmu_event is only needed for one element and not an entire
array.
Switch existing code iterating over the pmu_event arrays to use the new
loop function pmu_events_table_for_each_event.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220812230949.683239-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:01:31 -03:00
7ae5c03a27
perf pmu-events: Move test events/metrics to JSON
...
Move arrays of pmu_events into the JSON code so that it may be
regenerated and modified by the jevents.py script.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220812230949.683239-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:01:15 -03:00
64234c141b
perf test: Use full metric resolution
...
The simple metric resolution doesn't handle recursion properly, switch
to use the full resolution as with the parse-metric tests which also
increases coverage. Don't set the values for the metric backward as
failures to generate a result are ignored.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220812230949.683239-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:01:03 -03:00
29be2fe0c1
perf pmu-events: Hide pmu_events_map
...
Move usage of the table to pmu-events.c so it may be hidden. By
abstracting the table the implementation can later be changed.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220812230949.683239-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:00:47 -03:00
eeac773041
perf pmu-events: Avoid passing pmu_events_map
...
Preparation for hiding pmu_events_map as an implementation detail. While
the map is passed, the table of events is all that is normally wanted.
While modifying the function's types, rename pmu_events_map__find to
pmu_events_table__find to match later encapsulation. Similarly rename
pmu_add_cpu_aliases_map to pmu_add_cpu_aliases_table.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220812230949.683239-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:00:32 -03:00
2519db2a9d
perf pmu-events: Hide pmu_sys_event_tables
...
Move usage of the table to pmu-events.c so it may be hidden. By
abstracting the table the implementation can later be changed.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220812230949.683239-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 15:00:16 -03:00
7b2f844c43
perf jevents: Sort JSON files entries
...
Sort the JSON files entries on conversion to C. The sort order tries to
replicated cmp_sevent from pmu.c so that the input there is already
sorted except for sysfs events. Specifically, the sort order is given by
the tuple:
(not j.desc is None, fix_none(j.topic), fix_none(j.name), fix_none(j.pmu), fix_none(j.metric_name))
which is putting events with descriptions and topics before those
without, then sorting by name, then pmu and finally metric_name
Add the topic to JsonEvent on reading to simplify. Remove an unnecessary
lambda in the JSON reading.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220812230949.683239-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 14:59:44 -03:00
ee2ce6fdc8
perf jevents: Provide path to JSON file on error
...
If a JSONDecoderError or similar is raised then it is useful to know the
path. Print this and then raise the exception agan.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220812230949.683239-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 14:59:26 -03:00
f793ae185e
perf jevents: Remove the type/version variables
...
pmu_events_map has a type variable that is always initialized to "core"
and a version variable that is never read. Remove these from the API as
it is straightforward to add them back when necessary.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220812230949.683239-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 14:58:40 -03:00
099b157c08
perf jevent: Add an 'all' architecture argument
...
When 'all' is passed as the architecture generate a mapping table for
all architectures. This simplifies testing. To identify the table for an
architecture add an arch variable to the pmu_events_map.
Signed-off-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Andi Kleen <ak@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kan Liang <kan.liang@linux.intel.com >
Cc: Leo Yan <leo.yan@linaro.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ravi Bangoria <ravi.bangoria@amd.com >
Cc: Stephane Eranian <eranian@google.com >
Cc: Will Deacon <will@kernel.org >
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220812230949.683239-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-13 14:58:20 -03:00
8d33834f9f
perf stat: Remove duplicated include in builtin-stat.c
...
util/topdown.h is included twice in builtin-stat.c,
remove one of them.
Reported-by: Abaci Robot <abaci@linux.alibaba.com >
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com >
Tested-by: Ian Rogers <irogers@google.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=1818
Link: https://lore.kernel.org/r/20220804005213.71990-1-yang.lee@linux.alibaba.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 16:51:31 -03:00
0029e8ace1
perf scripting python: Delete repeated word in comments
...
Delete the repeated word "into" in comments.
Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: http://lore.kernel.org/lkml/20220807160239.474-1-dengshaomin@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 16:45:53 -03:00
987f5cbd2f
perf tools: Fix double word in comments
...
Delete the repeated word "to" in comments.
Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: http://lore.kernel.org/lkml/20220807155549.30953-1-dengshaomin@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 16:45:24 -03:00
632f5c224e
perf trace: Fix double word in comments
...
Delete repeated word "and" in comments.
Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: http://lore.kernel.org/lkml/20220807084629.23121-1-dengshaomin@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 16:44:56 -03:00
ae4e4a0ba3
perf script: Delete repeated word "from"
...
Delete the repeated word "from" in code.
Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: http://lore.kernel.org/lkml/20220807080642.13004-1-dengshaomin@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 16:44:30 -03:00
f3c96bec7c
perf test: Fix double word in comments
...
Delete the redundant word "then" in comments.
Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Link: http://lore.kernel.org/lkml/20220807074753.7857-1-dengshaomin@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 16:42:55 -03:00
1bf7d836e5
perf record: Improve error message of -p not_existing_pid
...
When one uses -p $not_existing_pid, the output of --help is printed:
$ perf record -p 123456789 2>&1 | head -n3
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
Let's change it something similar what perf top -p $not_existing_pid
prints:
$ ./perf top -p 123456789 --stdio
Error:
Couldn't create thread/CPU maps: No such process
Newly suggested error message:
$ ./perf record -p 123456789
Couldn't create thread/CPU maps: No such process
Signed-off-by: Martin Liška <mliska@suse.cz >
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com >
Link: http://lore.kernel.org/lkml/8e00eda1-4de0-2c44-ce67-d4df48ac1f7c@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 10:27:21 -03:00
a072a7a026
perf build-id: Print debuginfod queries if -v option is used
...
When ending a 'perf record' session, the querying of a debuginfod server
can take quite some time. Inform a user about it when -v options is
used.
Signed-off-by: Martin Liška <mliska@suse.cz >
Link: http://lore.kernel.org/lkml/325871cf-b71f-6237-8793-82182272ece8@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 10:27:20 -03:00
34575ded68
perf build-id: Fix coding style, replace 8 spaces by tabs
...
Use tabs instead of 8 spaces for the indentation.
Signed-off-by: Martin Liška <mliska@suse.cz >
Link: http://lore.kernel.org/lkml/2983e2e0-6850-ad59-79d8-efe83b22cffe@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-12 10:22:49 -03:00
e754dd7e8b
perf c2c: Update documentation for new display option 'peer'
...
Since the new display option 'peer' is introduced, this patch is to
update the documentation to reflect it.
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-16-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:32 -03:00
ead42a0f9b
perf c2c: Use 'peer' as default display for Arm64
...
Since Arm64 arch doesn't support HITMs flags, this patch changes to use
'peer' as default display if user doesn't specify any type; for other
arches, it still uses 'tot' as default display type if user doesn't
specify it.
This patch changes to call perf_session__new() in an earlier place, so
session environment can be initialized ahead and arch info can be used
for setting display type.
Suggested-by: Ali Saidi <alisaidi@amazon.com >
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-15-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:31 -03:00
f37c5d914e
perf c2c: Sort on peer snooping for load operations
...
This patch adds a new option 'peer' so can sort on the cache hit for
peer snooping.
For displaying with option 'peer', the "Shared Data Cache Line Table"
and "Shared Cache Line Distribution Pareto" both sort with the metrics
"tot_peer".
As result, we can get the 'peer' display:
# perf c2c report -d peer --coalesce tid,pid,iaddr,dso -N --stdio
=================================================
Shared Data Cache Line Table
=================================================
#
# ----------- Cacheline ---------- Peer ------- Load Peer ------- Total Total Total --------- Stores -------- ----- Core Load Hit ----- - LLC Load Hit -- - RMT Load Hit -- --- Load Dram ----
# Index Address Node PA cnt Snoop Total Local Remote records Loads Stores L1Hit L1Miss N/A FB L1 L2 LclHit LclHitm RmtHit RmtHitm Lcl Rmt
# ..... .................. .... ...... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ....... ........ ....... ........ ....... ........ ........
#
0 0xaaaac17d6000 N/A 0 100.00% 99 99 0 18851 18851 0 0 0 0 0 18752 0 99 0 0 0 0 0
=================================================
Shared Cache Line Distribution Pareto
=================================================
#
# -- Peer Snoop -- ------- Store Refs ------ --------- Data address --------- ---------- cycles ---------- Total cpu Shared
# Num Rmt Lcl L1 Hit L1 Miss N/A Offset Node PA cnt Pid Tid Code address rmt peer lcl peer load records cnt Symbol Object Source:Line Node{cpus %peers %stores}
# ..... ....... ....... ....... ....... ....... .................. .... ...... ....... ................. .................. ........ ........ ........ ....... ........ ...................... ................ ............... ....
#
----------------------------------------------------------------------
0 0 99 0 0 0 0xaaaac17d6000
----------------------------------------------------------------------
0.00% 3.03% 0.00% 0.00% 0.00% 0x20 N/A 0 3603 3603:memstress 0xaaaac17c25ac 0 376 41 9314 2 [.] 0x00000000000025ac memstress memstress[25ac] 0{ 2 100.0% n/a}
0.00% 3.03% 0.00% 0.00% 0.00% 0x20 N/A 0 3603 3606:memstress 0xaaaac17c25ac 0 375 44 9155 1 [.] 0x00000000000025ac memstress memstress[25ac] 0{ 1 100.0% n/a}
0.00% 48.48% 0.00% 0.00% 0.00% 0x29 N/A 0 3603 3606:memstress 0xaaaac17c3e88 0 180 170 65 1 [.] 0x0000000000003e88 memstress memstress[3e88] 0{ 1 100.0% n/a}
0.00% 45.45% 0.00% 0.00% 0.00% 0x29 N/A 0 3603 3603:memstress 0xaaaac17c3e88 0 180 175 70 2 [.] 0x0000000000003e88 memstress memstress[3e88] 0{ 2 100.0% n/a}
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-14-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:25 -03:00
faa30dfab5
perf c2c: Refactor display string
...
The display type is shown by combination the display string array and a
suffix string "HITMs", which is not friendly to extend display for other
sorting type (e.g. extension for peer operations).
This patch moves the suffix string "HITMs" into display string array for
HITM types, so it can allow us to not necessarily to output string
"HITMs" for new incoming display type.
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-13-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:25 -03:00
7c10b65a42
perf c2c: Refactor node header
...
The node header array contains 3 items, each item is used for one of
the 3 flavors for node accessing info. To extend sorting on other
snooping type and not always stick to HITMs, the second header string
"Node{cpus %hitms %stores}" should be adjusted (e.g. it's changed as
"Node{cpus %peer %stores}").
For this reason, this patch changes the node header array to three
flat variables and uses switch-case in function setup_nodes_header(),
thus it is easier for altering the header string.
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-12-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:24 -03:00
2be0bc7529
perf c2c: Rename dimension from 'percent_hitm' to 'percent_costly_snoop'
...
Use more general naming for the main sort dimension, this can allow us
not to sort only on HITM snoop type, so it can be extended to support
other costly snooping operations. So rename the dimension to the prefix
'percent_costly_".
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-11-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:22 -03:00
c82ccc3a3d
perf c2c: Use explicit names for display macros
...
Perf c2c tool has an assumption that it heavily depends on HITM snoop
type to detect cache false sharing, unfortunately, HITM is not supported
on some architectures.
Essentially, perf c2c tool wants to find some very costly snooping
operations for false cache sharing, this means it's not necessarily
to stick using HITM tags and we can explore other snooping types
(e.g. SNOOPX_PEER).
For this reason, this patch renames HITM related display macros with
suffix '_HITM', so it can be distinct if later add more display types
for on other snooping type.
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-10-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:21 -03:00
682352e59b
perf c2c: Add mean dimensions for peer operations
...
This patch adds two dimensions for the mean value of peer operations.
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-9-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:20 -03:00
9082282fce
perf c2c: Add dimensions of peer metrics for cache line view
...
This patch adds dimensions of peer ops, which will be used for Shared
cache line distribution pareto.
It adds the percentage dimensions for local and remote peer operations,
and the dimensions for accounting operation numbers which is used for
stdio mode.
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-8-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:19 -03:00
63e74ab5e4
perf c2c: Add dimensions for peer load operations
...
This patch adds three dimensions for peer load operations of 'lcl_peer',
'rmt_peer' and 'tot_peer'. These three dimensions will be used in the
shared data cache line table.
Reviewed-by: Ali Saidi <alisaidi@amazon.com >
Signed-off-by: Leo Yan <leo.yan@linaro.org >
Tested-by: Ali Saidi <alisaidi@amazon.com >
Acked-by: Ian Rogers <irogers@google.com >
Cc: Adrian Hunter <adrian.hunter@intel.com >
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com >
Cc: Anshuman Khandual <anshuman.khandual@arm.com >
Cc: German Gomez <german.gomez@arm.com >
Cc: Gustavo A. R. Silva <gustavoars@kernel.org >
Cc: Ingo Molnar <mingo@redhat.com >
Cc: James Clark <james.clark@arm.com >
Cc: Jiri Olsa <jolsa@kernel.org >
Cc: John Garry <john.garry@huawei.com >
Cc: Kajol Jain <kjain@linux.ibm.com >
Cc: Like Xu <likexu@tencent.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Mike Leach <mike.leach@linaro.org >
Cc: Namhyung Kim <namhyung@kernel.org >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Timothy Hayes <timothy.hayes@arm.com >
Cc: Will Deacon <will@kernel.org >
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-7-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com >
2022-08-11 19:12:18 -03:00