IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
When the lock contention events are used, there's no tracking of
acquire and release. So the state machine is simplified to use
UNINITIALIZED -> CONTENDED -> ACQUIRED only.
Note that CONTENDED state is re-entrant since mutex locks can hit two
or more consecutive contention_begin events for optimistic spinning
and sleep.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220615163222.1275500-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When LOCKDEP and LOCK_STAT events are not available, it falls back to
record the new lock contention tracepoints.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220615163222.1275500-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The debug output is meaningful when there are bad lock sequences.
Skip it unless there's one or -v option is given.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220615163222.1275500-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add --vmlinux and --kallsyms options to support data file from
different kernels.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220615163222.1275500-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently it only prints the time in nsec but it's a bit hard to read
and takes longer in the screen. We can change it to use different units
and keep the number small to save the space.
Before:
$ perf lock report
Name acquired contended avg wait (ns) total wait (ns) max wait (ns) min wait (ns)
jiffies_lock 433 32 2778 88908 13570 692
&lruvec->lru_lock 747 5 11254 56272 18317 1412
slock-AF_INET6 7 1 23543 23543 23543 23543
&newf->file_lock 706 15 1025 15388 2279 618
slock-AF_INET6 8 1 10379 10379 10379 10379
&rq->__lock 2143 5 2037 10185 3462 939
After:
Name acquired contended avg wait total wait max wait min wait
jiffies_lock 433 32 2.78 us 88.91 us 13.57 us 692 ns
&lruvec->lru_lock 747 5 11.25 us 56.27 us 18.32 us 1.41 us
slock-AF_INET6 7 1 23.54 us 23.54 us 23.54 us 23.54 us
&newf->file_lock 706 15 1.02 us 15.39 us 2.28 us 618 ns
slock-AF_INET6 8 1 10.38 us 10.38 us 10.38 us 10.38 us
&rq->__lock 2143 5 2.04 us 10.19 us 3.46 us 939 ns
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220615163222.1275500-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a self test for branch stack sampling, to check that we get the
expected branch types, and filters behave as expected.
Suggested-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220705150511.473919-2-german.gomez@arm.com
Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Metric names are truncated so don't try to match all of one.
Allow AMX metrics to skip as floating point ones do.
Metrics for optane memory can also skip rather than fail.
Add a system wide check for uncore metrics.
Restructure code to avoid extensive nesting.
Some impetus for this in:
https://lore.kernel.org/lkml/d32376b5-5538-ff00-6620-e74ad4b4abf2@huawei.com/
Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220707153449.202409-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Printing out the metric name and architecture makes finding the source
of a failure easier.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220707153449.202409-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove files and build rules.
Remove test for comparing with jevents.py as there is no longer a binary
to compare with.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Forrington <nick.forrington@arm.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220629182505.406269-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Generate pmu-events.c using jevents.py rather than the binary built from
jevents.c.
Add a new config variable NO_JEVENTS that is set when there is no
architecture json or an appropriate python interpreter isn't present.
When NO_JEVENTS is defined the file pmu-events/empty-pmu-events.c is
copied and used as the pmu-events.c file.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Ian Rogers <rogers.email@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Forrington <nick.forrington@arm.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220629182505.406269-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
jevents.c is large, has a dependency on an old forked version of jsmn,
and is challenging to work upon. A lot of jevents.c's complexity comes
from needing to write json and csv parsing from first principles. In
contrast python has this functionality in standard libraries and is
already a build pre-requisite for tools like asciidoc (that builds all
of the perf man pages).
Introduce jevents.py that produces identical output to jevents.c. Add a
test that runs both converter tools and validates there are no output
differences. The test can be invoked with a phony build target like:
$ make -C tools/perf jevents-py-test
The python code deliberately tries to replicate the behavior of
jevents.c so that the output matches and transitioning tools shouldn't
introduce regressions. In some cases the code isn't as elegant as hoped,
but fixing this can be done as follow up.
Committer testing:
$ make -C tools/perf jevents-py-test
make: Entering directory '/var/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j32' parallel build
HOSTCC fixdep.o
HOSTLD fixdep-in.o
LINK fixdep
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... libbfd: [ on ]
... libbfd-buildid: [ on ]
... libcap: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libcrypto: [ OFF ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]
... libaio: [ on ]
... libzstd: [ on ]
... disassembler-four-args: [ on ]
HOSTCC pmu-events/json.o
HOSTCC pmu-events/jsmn.o
HOSTCC pmu-events/jevents.o
HOSTLD pmu-events/jevents-in.o
LINK pmu-events/jevents
Checking architecture: arm64
Generating using jevents.c
Generating using jevents.py
Diffing
Checking architecture: nds32
Generating using jevents.c
Generating using jevents.py
Diffing
Checking architecture: powerpc
Generating using jevents.c
Generating using jevents.py
Diffing
Checking architecture: s390
Generating using jevents.c
Generating using jevents.py
Diffing
Checking architecture: x86
Generating using jevents.c
Generating using jevents.py
Diffing
make: Leaving directory '/var/home/acme/git/perf/tools/perf'
$
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Forrington <nick.forrington@arm.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220629182505.406269-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The PYTHON_AUTO code orders the preference for the PYTHON command to be
python3, python and then python2. python3 makes a more logical
preference as python2 is no longer supported:
https://www.python.org/doc/sunset-python-2/
Reorder the priority of the PYTHON command to be python2, python and
then python3.
Reported-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: John Garry <john.garry@huawei.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Forrington <nick.forrington@arm.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220629182505.406269-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
IBS support has been enhanced with two new features in upcoming uarch:
1. DataSrc extension
2. L3 miss filtering.
Additional set of bits has been introduced in IBS registers to exploit
these features.
New bits are already defining in arch/x86/ header. Sync it with tools
header file. Also rename existing ibs_op_data field 'data_src' to
'data_src_lo'.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rrichter@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: like.xu.linux@gmail.com
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20220604044519.594-8-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PMUs advertise their capabilities via sysfs attribute files but
the perf tool currently parses only core(CPU) or hybrid core PMU
capabilities. Add support of recording non-core PMU capabilities
int perf.data header.
Note that a newly proposed HEADER_PMU_CAPS is replacing existing
HEADER_HYBRID_CPU_PMU_CAPS. Special care is taken for hybrid core
PMUs by writing their capabilities first in the perf.data header
to make sure new perf.data file being read by old perf tool does
not break.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rrichter@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: like.xu.linux@gmail.com
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20220604044519.594-6-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently all capabilities are stored in a single string separated by
NULL character. Instead, store them in an array which makes searching of
capability easier.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rrichter@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: like.xu.linux@gmail.com
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20220604044519.594-5-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Avoid unnecessary conditional code to check if pmu name is NULL
or not by passing "cpu" pmu name to the printing function.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rrichter@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: like.xu.linux@gmail.com
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20220604044519.594-4-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In addition to returning nr_caps, cache it locally in struct perf_pmu.
Similarly, cache status of whether caps sysfs has already been parsed
or not. These will help to avoid parsing sysfs every time the function
gets called.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rrichter@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: like.xu.linux@gmail.com
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20220604044519.594-3-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Samples without an L3 miss are discarded and counter is reset with
random value (between 1-15 for fetch PMU and 1-127 for op PMU) when IBS
L3 miss filtering is enabled. This causes a sampling period skew but
there is no way to reconstruct aggregated sampling period. So print a
warning at perf record if user sets l3missonly=1.
Ex:
# perf record -c 10000 -C 0 -e ibs_op/l3missonly=1/
WARNING: Hw internally resets sampling period when L3 Miss Filtering is enabled
and tagged operation does not cause L3 Miss. This causes sampling period skew.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rrichter@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: like.xu.linux@gmail.com
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20220604044519.594-2-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the -D option is used, the details of thread-map, cpu-map and
event-update events are not currently dumped. Add prints so that they are.
Example:
# perf record --kcore sleep 0.1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.021 MB perf.data (7 samples) ]
# perf script -D | grep 'THREAD\|CPU'
0 0x4950 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 35116
0 0x4978 [0x20]: PERF_RECORD_CPU_MAP: 0-7
# perf script -D | grep -A4 'UPDATE'
0 0x4920 [0x30]: PERF_RECORD_EVENT_UPDATE
... id: 147
... 0-7
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220610113316.6682-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In preparation for recording sideband events in a virtual machine guest so
that they can be injected into a host perf.data file.
This is needed to enable injecting events after the initial synthesized
user events (that have an all zero id sample) but before regular events.
Committer notes:
Add entry about PERF_RECORD_FINISHED_INIT to
tools/perf/Documentation/perf.data-file-format.txt.
Committer testing:
Before:
# perf report -D | grep FINISHED
0 0x5910 [0x8]: PERF_RECORD_FINISHED_ROUND
FINISHED_ROUND events: 1 ( 0.5%)
#
After:
# perf record -- sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.020 MB perf.data (7 samples) ]
# perf report -D | grep FINISHED
0 0x5068 [0x8]: PERF_RECORD_FINISHED_INIT: unhandled!
0 0x5390 [0x8]: PERF_RECORD_FINISHED_ROUND
FINISHED_ROUND events: 1 ( 0.5%)
FINISHED_INIT events: 1 ( 0.5%)
#
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220610113316.6682-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In preparation for recording sideband events in a virtual machine guest so
that they can be injected into a host perf.data file.
Adjust the logic so that if there are IDs then the id index is recorded.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220610113316.6682-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
kcore provides a copy of the running kernel including any modified code.
A trace that benefits from that also benefits from text_poke events, so
enable them.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220610113316.6682-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When CPU has been explicitly sampled (via --sample-cpu), prefer this
sampled value over the thread CPU value when exporting to JSON.
Signed-off-by: Shawn M. Chapla <schapla@codeweavers.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220526201506.2028281-1-schapla@codeweavers.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We may have no events for a metric evaluated to a constant. In such a
case ensure a tool event is at least evaluated for metric parsing and
displaying.
Fixes: 8586d2744f ("perf metrics: Don't add all tool events for sharing")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220618013957.999321-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Except for memory load and store operations, ARM SPE records also can
support other operation types, bug when set the data source field the
current code assumes a record is a either load operation or store
operation, this leads to wrongly synthesize memory samples.
This patch strictly checks the record operation type, it only sets data
source only for the operation types ARM_SPE_LD and ARM_SPE_ST,
otherwise, returns zero for data source. Therefore, we can synthesize
memory samples only when data source is a non-zero value, the function
arm_spe__is_memory_event() is useless and removed.
Fixes: e55ed3423c ("perf arm-spe: Synthesize memory event")
Reviewed-by: Ali Saidi <alisaidi@amazon.com>
Reviewed-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: alisaidi@amazon.com
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Forrington <nick.forrington@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: http://lore.kernel.org/lkml/20220517020326.18580-5-alisaidi@amazon.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pass the optional exponent component through to strtod that already
supports it. We already have exponents in ScaleUnit and so this adds
uniformity.
Reported-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Reviewed-By: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20220527020653.4160884-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
commit cfd7092c31 ("perf test session topology: Fix test to skip
the test in guest environment") added check to skip the testcase if the
socket_id can't be fetched from topology info.
But the condition check uses strncmp which should be changed to !strncmp
and to correctly match platform.
Fix this condition check.
Fixes: cfd7092c31 ("perf test session topology: Fix test to skip the test in guest environment")
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Link: https://lore.kernel.org/r/20220610135939.63361-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The testcase 'Check Arm64 callgraphs are complete in fp mode' wants to
see the following output:
610 leaf
62f parent
648 main
However, without excluding kernel callchains, the output might look like:
ffffc2ff40ef1b5c arch_local_irq_enable
ffffc2ff419d032c __schedule
ffffc2ff419d06c0 schedule
ffffc2ff40e4da30 do_notify_resume
ffffc2ff40e421b0 work_pending
610 leaf
62f parent
648 main
Adding '--user-callchains' leaves only the wanted symbols in the chain.
Fixes: cd6382d827 ("perf test arm64: Test unwinding using fame-pointer (fp) mode")
Suggested-by: German Gomez <german.gomez@arm.com>
Reviewed-by: German Gomez <german.gomez@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220614105207.26223-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes in:
f94fd25cb0 ("tcp: pass back data left in socket after receive")
That don't result in any changes in the tables generated from that
header.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/all/YqORj9d58AiGYl8b@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix:
tests/bp_account.c:154:9: runtime error: variable length array bound evaluates to non-positive value 0
by switching from a variable length to an allocated array.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220610180247.444798-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf test -F 83 ("perf stat CSV output linter") fails on s390.
Reason is the wrong number of fields for certain CPU core/die/socket
related output.
On x84_64 the output of command:
# ./perf stat -x, -A -a --no-merge true
CPU0,1.50,msec,cpu-clock,1502781,100.00,1.052,CPUs utilized
CPU1,1.48,msec,cpu-clock,1476113,100.00,1.034,CPUs utilized
...
results in 8 fields with 7 comma separators.
On s390 the output of command:
# ./perf stat -x, -A -a --no-merge -- true
0.95,msec,cpu-clock,949800,100.00,1.060,CPUs utilized
...
results in 7 fields with 6 comma separators. Therefore this tests
fails on s390. Similar issues exist for per-die and per-socket output
which is not supported on s390.
I have rewritten the python program to count commas in each output line
into a bash function to achieve the same result. I hope this makes it a
bit easier.
Output before:
# ./perf test -F 83
83: perf stat CSV output linter :
Checking CSV output: no args [Success]
Checking CSV output: system wide [Success]
Checking CSV output: system wide Checking CSV output: \
system wide no aggregation 6.92,msec,cpu-clock,\
6918131,100.00,6.972,CPUs utilized
...
RuntimeError: wrong number of fields. expected 7 in \
6.92,msec,cpu-clock,6918131,100.00,6.972,CPUs utilized
FAILED!
#
Output after:
# ./perf test -F 83
83: perf stat CSV output linter :
Checking CSV output: no args [Success]
Checking CSV output: system wide [Success]
Checking CSV output: system wide Checking CSV output:\
system wide no aggregation [Success]
Checking CSV output: interval [Success]
Checking CSV output: event [Success]
Checking CSV output: per core [Success]
Checking CSV output: per thread [Success]
Checking CSV output: per die [Success]
Checking CSV output: per node [Success]
Checking CSV output: per socket [Success]
Ok
#
Committer notes:
Continues to work on x86_64
$ perf test lint
89: perf stat CSV output linter : Ok
$ perf test -v lint
Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
89: perf stat CSV output linter :
--- start ---
test child forked, pid 53133
Checking CSV output: no args [Success]
Checking CSV output: system wide [Skip] paranoid and not root
Checking CSV output: system wide [Skip] paranoid and not root
Checking CSV output: interval [Success]
Checking CSV output: event [Success]
Checking CSV output: per core [Skip] paranoid and not root
Checking CSV output: per thread [Skip] paranoid and not root
Checking CSV output: per die [Skip] paranoid and not root
Checking CSV output: per node [Skip] paranoid and not root
Checking CSV output: per socket [Skip] paranoid and not root
test child finished with 0
---- end ----
perf stat CSV output linter: Ok
$
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Claire Jensen <cjense@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: linux390-list@tuxmaker.boeblingen.de.ibm.com
Link: https://lore.kernel.org/r/20220603113034.2009728-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update JSON metrics for Alderlake to perf.
It included both P-core and E-core metrics.
P-core metrics based on TMA 4.4 (TMA_Metrics-full.csv)
E-core metrics based on E-core TMA 2.0 (E-core_TMA_Metrics.csv)
https://download.01.org/perfmon/
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220528095933.1784141-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The function percent_rmt_hitm_cmp() wrongly uses local HITMs for
sorting remote HITMs.
Since this function is to sort cache lines for remote HITMs, this patch
changes to use 'rmt_hitm' field for correct sorting.
Fixes: 9cb3500afc ("perf c2c report: Add hitm/store percent related sort keys")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220530084253.750190-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, Arm SPE events don't trace physical address, therefore, the
field 'phys_addr' is always zero in synthesized memory samples. This
leads to perf c2c tool cannot locate the memory node for samples.
This patch enables configuration 'pa_enable' for Arm SPE events, so the
physical address packet can be traced, finally this can allow perf c2c
tool to locate properly for memory node.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220530083645.253432-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update IBM zEC12/zBC12 event counter description to the latest level
as described in the documents
1. SA23-2260-07:
"The Load-Program-Parameter and the CPU-Measurement Facilities."
released on May, 2022
for the following counter sets:
* Basic counter set
* Problem counter set
* Crypto counter set
2. SA23-2261-07:
"The CPU-Measurement Facility Extended Counters Definition
for z10, z196/z114, zEC12/zBC12, z13/z13s, z14, z15 and z16"
released on April 29, 2022
for the following counter sets:
* Extended counter set
* MT-Diagnostic counter set
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20220531092706.1931503-7-tmricht@linux.ibm.com
Cc: acme@kernel.org
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: svens@linux.ibm.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Update IBM z196/z114 event counter description to the latest level
as described in the documents
1. SA23-2260-07:
"The Load-Program-Parameter and the CPU-Measurement Facilities."
released on May, 2022
for the following counter sets:
* Basic counter set
* Problem counter set
* Crypto counter set
2. SA23-2261-07:
"The CPU-Measurement Facility Extended Counters Definition
for z10, z196/z114, zEC12/zBC12, z13/z13s, z14, z15 and z16"
released on April 29, 2022
for the following counter sets:
* Extended counter set
* MT-Diagnostic counter set
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20220531092706.1931503-6-tmricht@linux.ibm.com
Cc: acme@kernel.org
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: svens@linux.ibm.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Update IBM z15 event counter description to the latest level
as described in the documents
1. SA23-2260-07:
"The Load-Program-Parameter and the CPU-Measurement Facilities."
released on May, 2022
for the following counter sets:
* Basic counter set
* Problem counter set
* Crypto counter set
2. SA23-2261-07:
"The CPU-Measurement Facility Extended Counters Definition
for z10, z196/z114, zEC12/zBC12, z13/z13s, z14, z15 and z16"
released on April 29, 2022
for the following counter sets:
* Extended counter set
* MT-Diagnostic counter set
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20220531092706.1931503-5-tmricht@linux.ibm.com
Cc: acme@kernel.org
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: svens@linux.ibm.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Update IBM z14 event counter description to the latest level
as described in the documents
1. SA23-2260-07:
"The Load-Program-Parameter and the CPU-Measurement Facilities."
released on May, 2022
for the following counter sets:
* Basic counter set
* Problem counter set
* Crypto counter set
2. SA23-2261-07:
"The CPU-Measurement Facility Extended Counters Definition
for z10, z196/z114, zEC12/zBC12, z13/z13s, z14, z15 and z16"
released on April 29, 2022
for the following counter sets:
* Extended counter set
* MT-Diagnostic counter set
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20220531092706.1931503-4-tmricht@linux.ibm.com
Cc: acme@kernel.org
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: svens@linux.ibm.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Update IBM z13 event counter description to the latest level
as described in the documents
1. SA23-2260-07:
"The Load-Program-Parameter and the CPU-Measurement Facilities."
released on May, 2022
for the following counter sets:
* Basic counter set
* Problem counter set
* Crypto counter set
2. SA23-2261-07:
"The CPU-Measurement Facility Extended Counters Definition
for z10, z196/z114, zEC12/zBC12, z13/z13s, z14, z15 and z16"
released on April 29, 2022
for the following counter sets:
* Extended counter set
* MT-Diagnostic counter set
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20220531092706.1931503-3-tmricht@linux.ibm.com
Cc: acme@kernel.org
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: svens@linux.ibm.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Update IBM z10 event counter description to the latest level
as described in the documents
1. SA23-2260-07:
"The Load-Program-Parameter and the CPU-Measurement Facilities."
released on May, 2022
for the following counter sets:
* Basic counter set
* Problem counter set
* Crypto counter set
2. SA23-2261-07:
"The CPU-Measurement Facility Extended Counters Definition
for z10, z196/z114, zEC12/zBC12, z13/z13s, z14, z15 and z16"
released on April 29, 2022
for the following counter sets:
* Extended counter set
* MT-Diagnostic counter set
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20220531092706.1931503-2-tmricht@linux.ibm.com
Cc: acme@kernel.org
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: svens@linux.ibm.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Update IBM z16 counter description using document SA23-2260-07:
"The Load-Program-Parameter and the CPU-Measurement Facilities"
released in May, 2022, to include counter definitions for IBM z16
counter sets:
* Basic counter set
* Problem/user counter set
* Crypto counter set
Use document SA23-2261-07:
"The CPU-Measurement Facility Extended Counters Definition
for z10, z196/z114, zEC12/zBC12, z13/z13s, z14, z15 and z16"
released on April 29, 2022 to include counter definitions for IBM z16
* Extended counter set
* MT-Diagnostic counter set
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20220531092706.1931503-1-tmricht@linux.ibm.com
Cc: acme@kernel.org
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: svens@linux.ibm.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
With the hardware TopDown metrics feature, the sample-read feature should
be supported for a TopDown group, e.g., sample a non-topdown event and read
a Topdown metric group. But the current perf record code errors are out.
For a TopDown metric group,the slots event must be the leader of the group,
but the leader slots event doesn't support sampling. To support sample-read
the TopDown metric group, uses the 2nd event of the group as the "leader"
for the purposes of sampling.
Only the platform with the TopDown metric feature supports sample-read the
topdown group. In commit acb65150a4 ("perf record: Support sample-read
topdown metric group"), it adds arch_topdown_sample_read() to indicate
whether the TopDown group supports sample-read, it should only work on the
non-hybrid systems, this patch extends the support for hybrid platforms.
Before:
# ./perf record -e "{cpu_core/slots/,cpu_core/cycles/,cpu_core/topdown-retiring/}:S" -a sleep 1
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_core/topdown-retiring/).
/bin/dmesg | grep -i perf may provide additional information.
After:
# ./perf record -e "{cpu_core/slots/,cpu_core/cycles/,cpu_core/topdown-retiring/}:S" -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.238 MB perf.data (369 samples) ]
Fixes: acb65150a4 ("perf record: Support sample-read topdown metric group")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220602153603.1884710-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With -t/--threads option, it needs to display task names so synthesize
task related events at the beginning.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Fixes: 7c3bcbdf44 ("perf lock: Add -t/--thread option for report")
Link: http://lore.kernel.org/lkml/20220601065846.456965-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
segbase is the address of .eh_frame_hdr and table_data is segbase plus
the header size. find_proc_info computes segbase as `map->start +
segbase - map->pgoff` which is wrong when
* .eh_frame_hdr and .text are in different PT_LOAD program headers
* and their p_vaddr difference does not equal their p_offset difference
Since 10.0, ld.lld's default --rosegment -z noseparate-code layout has
such R and RX PT_LOAD program headers.
ld.lld (default) => perf report fails to unwind `perf record
--call-graph dwarf` recorded data
ld.lld --no-rosegment => ok (trivial, no R PT_LOAD)
ld.lld -z separate-code => ok but by luck: there are two PT_LOAD but
their p_vaddr difference equals p_offset difference
ld.bfd -z noseparate-code => ok (trivial, no R PT_LOAD)
ld.bfd -z separate-code (default for Linux/x86) => ok but by luck:
there are two PT_LOAD but their p_vaddr difference equals p_offset
difference
To fix the issue, compute segbase as dso's base address plus
PT_GNU_EH_FRAME's p_vaddr. The base address is computed by iterating
over all dso-associated maps and then subtract the first PT_LOAD p_vaddr
(the minimum guaranteed by generic ABI) from the minimum address.
In libunwind, find_proc_info transitively called by unw_step is cached,
so the iteration overhead is acceptable.
Reported-by: Sebastian Ullrich <sebasti@nullri.ch>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Fangrui Song <maskray@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: llvm@lists.linux.dev
Link: https://github.com/ClangBuiltLinux/linux/issues/1646
Link: https://lore.kernel.org/r/20220527182039.673248-1-maskray@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add shell test to check if perf-record hangs when recording an arm_spe
event with forks.
The test FAILS if the Kernel is not patched with Commit 961c391217 ("perf:
Always wake the parent event").
Unpatched Kernel:
$ perf test -v 90
90: Check Arm SPE doesn't hang when there are forks
--- start ---
test child forked, pid 14232
Recording workload with fork
Log lines = 90 /tmp/__perf_test.stderr.0Nu0U
Log lines after 1 second = 90 /tmp/__perf_test.stderr.0Nu0U
SPE hang test: FAIL
test child finished with -1
---- end ----
Check Arm SPE trace data in workload with forks: FAILED!
Patched Kernel:
$ perf test -v 90
90: Check Arm SPE doesn't hang when there are forks
--- start ---
test child forked, pid 2930
Compiling test program...
Recording workload...
Log lines = 478 /tmp/__perf_test.log.026AI
Log lines after 1 second = 557 /tmp/__perf_test.log.026AI
SPE hang test: PASS
Cleaning up files...
test child finished with 0
---- end ----
Check Arm SPE trace data in workload with forks: Ok
Reviewed-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: German Gomez <german.gomez@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220228165655.3920-1-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>