15648 Commits

Author SHA1 Message Date
Ian Rogers
f9cdeb58a9 perf evlist: Avoid frequency mode for the dummy event
Dummy events are created with an attribute where the period and freq
are zero. evsel__config will then see the uninitialized values and
initialize them in evsel__default_freq_period. As fequency mode is
used by default the dummy event would be set to use frequency
mode. However, this has no effect on the dummy event but does cause
unnecessary timers/interrupts. Avoid this overhead by setting the
period to 1 for dummy events.

evlist__add_aux_dummy calls evlist__add_dummy then sets freq=0 and
period=1. This isn't necessary after this change and so the setting is
removed.

From Stephane:

The dummy event is not counting anything. It is used to collect mmap
records and avoid a race condition during the synthesize mmap phase of
perf record. As such, it should not cause any overhead during active
profiling. Yet, it did. Because of a bug the dummy event was
programmed as a sampling event in frequency mode. Events in that mode
incur more kernel overheads because on timer tick, the kernel has to
look at the number of samples for each event and potentially adjust
the sampling period to achieve the desired frequency. The dummy event
was therefore adding a frequency event to task and ctx contexts we may
otherwise not have any, e.g.,

  perf record -a -e cpu/event=0x3c,period=10000000/.

On each timer tick the perf_adjust_freq_unthr_context() is invoked and
if ctx->nr_freq is non-zero, then the kernel will loop over ALL the
events of the context looking for frequency mode ones. In doing, so it
locks the context, and enable/disable the PMU of each hw event. If all
the events of the context are in period mode, the kernel will have to
traverse the list for nothing incurring overhead. The overhead is
multiplied by a very large factor when this happens in a guest kernel.
There is no need for the dummy event to be in frequency mode, it does
not count anything and therefore should not cause extra overhead for
no reason.

Fixes: 5bae0250237f ("perf evlist: Introduce perf_evlist__new_dummy constructor")
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20230916035640.1074422-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-26 21:43:20 -07:00
Charles Han
c87b8cc816 perf vendors events: Remove repeated word in comments
Remove the repeated word "of" in comments.

Signed-off-by: Charles Han <hanchunchao@inspur.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: james.clark@arm.com
Cc: nick.forrington@arm.com
Cc: leo.yan@linaro.org
Cc: mike.leach@linaro.org
Cc: john.g.garry@oracle.com
Cc: ilkka@os.amperecomputing.com
Link: https://lore.kernel.org/r/20230918033623.159213-1-hanchunchao@inspur.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-26 21:43:20 -07:00
Ilkka Koskinen
59faeaf80d perf vendor events arm64: Fix for AmpereOne metrics
This patch addresses review comments that were given for
705ed549148f ("perf vendor events arm64: Add AmpereOne metrics")
but didn't make it to the original patch [1][2]

Changes include: A fix for backend_memory formula, use of standard metrics
when possible, using #slots, renaming metrics to avoid spaces in the names,
and cleanup.

[1] https://lore.kernel.org/linux-perf-users/e9bdacb-a231-36af-6a2e-6918ee7effa@os.amperecomputing.com/
[2] https://lore.kernel.org/linux-perf-users/20230826192352.3043220-1-ilkka@os.amperecomputing.com/

Fixes: 705ed549148f ("perf vendor events arm64: Add AmpereOne metrics")
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Dave Kleikamp <dave.kleikamp@oracle.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: D Scott Phillips <scott@os.amperecomputing.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230920061839.2437413-1-ilkka@os.amperecomputing.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-26 21:43:20 -07:00
Wyes Karny
48a3adcf47 perf pmu: Fix perf stat output with correct scale and unit
The perf_pmu__parse_* functions for the sysfs files of pmu event’s
scale, unit, per-pkg and snapshot were updated in commit 7b723dbb96e8
("perf pmu: Be lazy about loading event info files from sysfs").
However, the paths for these sysfs files were incorrect. This resulted
in perf stat reporting values with wrong scaling and missing units. This
is fixed by correcting the paths for these sysfs files.

Before this fix:

 $sudo perf stat -e power/energy-pkg/ -- sleep 2

 Performance counter stats for 'system wide':

   351,217,188,864      power/energy-pkg/

          2.004127961 seconds time elapsed

After this fix:

 $sudo perf stat -e power/energy-pkg/ -- sleep 2

 Performance counter stats for 'system wide':

             80.58 Joules power/energy-pkg/

 	     2.004009749 seconds time elapsed

Fixes: 7b723dbb96e8 ("perf pmu: Be lazy about loading event info files from sysfs")
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: ravi.bangoria@amd.com
Cc: sandipan.das@amd.com
Cc: james.clark@arm.com
Cc: kan.liang@linux.intel.com
Link: https://lore.kernel.org/r/20230920122349.418673-1-wyes.karny@amd.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-26 21:41:50 -07:00
Veronika Molnarova
29441ab3a3 perf test lock_contention.sh: Skip test if not enough CPUs
Machines with less then 4 CPUs weren't consistently triggering lock
events required for the test.

Skip the test on those machines. The limit of 4 CPUs is set as it
generates around 100 lock events for a test.

Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Acked-by: Michael Petlan <mpetlan@redhat.com>
Link: https://lore.kernel.org/r/20230919150419.23193-2-vmolnaro@redhat.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-21 11:55:13 -07:00
Veronika Molnarova
fa52d995d1 perf test stat+shadow_stat.sh: Add threshold for rounding errors
The test was failing in specific scenarios due to imperfection of FP
arithmetics. The `bc` command wasn't correctly rounding the result of
division causing the failure.

Replace the `bc` with `awk` which should work with more decimal places
and add a threshold to catch any possible rounding errors.  The
acceptable rounding error is set to 0.01 when the test passes with a
warning message.

Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Acked-by: Michael Petlan <mpetlan@redhat.com>
Link: https://lore.kernel.org/r/20230919150419.23193-1-vmolnaro@redhat.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-21 11:53:45 -07:00
Xu Yang
e49be27e18 perf jevents: fix no member named 'entries' issue
The struct "pmu_events_table" has been changed after commit
2e255b4f9f41 (perf jevents: Group events by PMU, 2023-08-23).
So there doesn't exist 'entries' in pmu_events_table anymore.
This will align the members with that commit. Othewise, below
errors will be printed when run jevent.py:

pmu-events/pmu-events.c:5485:26: error: ‘struct pmu_metrics_table’ has no member named ‘entries’
 5485 |                         .entries = pmu_metrics__freescale_imx8dxl_sys,

Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230919080929.3807123-1-xu.yang_2@nxp.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-19 21:16:26 -07:00
Ian Rogers
ede72dca45 perf parse-events: Fix tracepoint name memory leak
Fuzzing found that an invalid tracepoint name would create a memory
leak with an address sanitizer build:
```
$ perf stat -e '*:o/' true
event syntax error: '*:o/'
                       \___ parser error
Run 'perf list' for a list of valid events

 Usage: perf stat [<options>] [<command>]

    -e, --event <event>   event selector. use 'perf list' to list available events

=================================================================
==59380==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 4 byte(s) in 2 object(s) allocated from:
    #0 0x7f38ac07077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439
    #1 0x55f2f41be73b in str util/parse-events.l:49
    #2 0x55f2f41d08e8 in parse_events_lex util/parse-events.l:338
    #3 0x55f2f41dc3b1 in parse_events_parse util/parse-events-bison.c:1464
    #4 0x55f2f410b8b3 in parse_events__scanner util/parse-events.c:1822
    #5 0x55f2f410d1b9 in __parse_events util/parse-events.c:2094
    #6 0x55f2f410e57f in parse_events_option util/parse-events.c:2279
    #7 0x55f2f4427b56 in get_value tools/lib/subcmd/parse-options.c:251
    #8 0x55f2f4428d98 in parse_short_opt tools/lib/subcmd/parse-options.c:351
    #9 0x55f2f4429d80 in parse_options_step tools/lib/subcmd/parse-options.c:539
    #10 0x55f2f442acb9 in parse_options_subcommand tools/lib/subcmd/parse-options.c:654
    #11 0x55f2f3ec99fc in cmd_stat tools/perf/builtin-stat.c:2501
    #12 0x55f2f4093289 in run_builtin tools/perf/perf.c:322
    #13 0x55f2f40937f5 in handle_internal_command tools/perf/perf.c:375
    #14 0x55f2f4093bbd in run_argv tools/perf/perf.c:419
    #15 0x55f2f409412b in main tools/perf/perf.c:535

SUMMARY: AddressSanitizer: 4 byte(s) leaked in 2 allocation(s).
```
Fix by adding the missing destructor.

Fixes: 865582c3f48e ("perf tools: Adds the tracepoint name parsing support")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: He Kuang <hekuang@huawei.com>
Link: https://lore.kernel.org/r/20230914164028.363220-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-18 16:47:56 -07:00
Ian Rogers
b4f48f34f9 perf test: Detect off-cpu support from build options
Use perf version to detect whether BPF skeletons were enabled in a
build rather than a failing perf record.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Patrice Duroux <patrice.duroux@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230914211948.814999-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-18 16:46:53 -07:00
Ian Rogers
c2ac838ef7 perf test: Ensure EXTRA_TESTS is covered in build test
Add to run variable.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Patrice Duroux <patrice.duroux@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230914211948.814999-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-18 16:46:53 -07:00
Ian Rogers
c67c631d52 perf test: Update build test for changed BPF skeleton defaults
Fix a target name and set BUILD_BPF_SKEL to 0 rather than 1.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Patrice Duroux <patrice.duroux@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230914211948.814999-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-18 16:46:45 -07:00
Ian Rogers
9925495d96 perf build: Default BUILD_BPF_SKEL, warn/disable for missing deps
LIBBPF is dependent on zlib so move the NO_ZLIB and feature check
early to avoid statically building when zlib is disabled. This avoids
a linkage failure with perf and static libbpf when zlib isn't
specified.

Move BUILD_BPF_SKEL logic to one place and if not defined set
BUILD_BPF_SKEL to 1. Detect dependencies of building with BPF
skeletons and warn/disable if the dependencies aren't present.

Change Makefile.perf to contain BPF skeleton logic dependent on the
Makefile.config result and refresh the comment about BUILD_BPF_SKEL.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Patrice Duroux <patrice.duroux@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230914211948.814999-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-18 16:46:26 -07:00
Ian Rogers
727e431437 perf version: Add status of bpf skeletons
Add status for BPF skeletons, to see if a build has them enabled:
```
$ perf version --build-options
perf version 6.6.rc1.g0381ae36d1a6
                 dwarf: [ OFF ]  # HAVE_DWARF_SUPPORT
    dwarf_getlocations: [ OFF ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
         syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
                libbfd: [ OFF ]  # HAVE_LIBBFD_SUPPORT
            debuginfod: [ OFF ]  # HAVE_DEBUGINFOD_SUPPORT
                libelf: [ OFF ]  # HAVE_LIBELF_SUPPORT
               libnuma: [ OFF ]  # HAVE_LIBNUMA_SUPPORT
numa_num_possible_cpus: [ OFF ]  # HAVE_LIBNUMA_SUPPORT
               libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
             libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
              libslang: [ on  ]  # HAVE_SLANG_SUPPORT
             libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
             libunwind: [ OFF ]  # HAVE_LIBUNWIND_SUPPORT
    libdw-dwarf-unwind: [ OFF ]  # HAVE_DWARF_SUPPORT
                  zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                  lzma: [ on  ]  # HAVE_LZMA_SUPPORT
             get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                   bpf: [ OFF ]  # HAVE_LIBBPF_SUPPORT
                   aio: [ on  ]  # HAVE_AIO_SUPPORT
                  zstd: [ on  ]  # HAVE_ZSTD_SUPPORT
               libpfm4: [ on  ]  # HAVE_LIBPFM
         libtraceevent: [ on  ]  # HAVE_LIBTRACEEVENT
         bpf_skeletons: [ OFF ]  # HAVE_BPF_SKEL
```

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Patrice Duroux <patrice.duroux@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230914211948.814999-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-18 16:46:15 -07:00
Yang Li
3ecf87b2d8 perf kwork top: Simplify bool conversion
./tools/perf/util/bpf_kwork_top.c:120:53-58: WARNING: conversion to bool not needed here

Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230915063832.120274-1-yang.lee@linux.alibaba.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-18 15:38:46 -07:00
Thomas Richter
e47749f179 perf jevent: fix core dump on software events on s390
Running commands such as
 # ./perf stat -e cs -- true
 Segmentation fault (core dumped)
 # ./perf stat -e cpu-clock-- true
 Segmentation fault (core dumped)
 #

dump core. This should not happen as these events are defined
even when no hardware PMU is available.
Debugging this reveals this call chain:

  perf_pmus__find_by_type(type=1)
  +--> pmu_read_sysfs(core_only=false)
       +--> perf_pmu__find2(dirfd=3, name=0x152a113 "software")
            +--> perf_pmu__lookup(pmus=0x14f0568 <other_pmus>, dirfd=3,
                                  lookup_name=0x152a113 "software")
                 +--> perf_pmu__find_events_table (pmu=0x1532130)

Now the pmu is "software" and it tries to find a proper table
generated by the pmu-event generation process for s390:

 # cd pmu-events/
 # ./jevents.py  s390 all /root/linux/tools/perf/pmu-events/arch |\
        grep -E '^const struct pmu_table_entry'
 const struct pmu_table_entry pmu_events__cf_z10[] = {
 const struct pmu_table_entry pmu_events__cf_z13[] = {
 const struct pmu_table_entry pmu_metrics__cf_z13[] = {
 const struct pmu_table_entry pmu_events__cf_z14[] = {
 const struct pmu_table_entry pmu_metrics__cf_z14[] = {
 const struct pmu_table_entry pmu_events__cf_z15[] = {
 const struct pmu_table_entry pmu_metrics__cf_z15[] = {
 const struct pmu_table_entry pmu_events__cf_z16[] = {
 const struct pmu_table_entry pmu_metrics__cf_z16[] = {
 const struct pmu_table_entry pmu_events__cf_z196[] = {
 const struct pmu_table_entry pmu_events__cf_zec12[] = {
 const struct pmu_table_entry pmu_metrics__cf_zec12[] = {
 const struct pmu_table_entry pmu_events__test_soc_cpu[] = {
 const struct pmu_table_entry pmu_metrics__test_soc_cpu[] = {
 const struct pmu_table_entry pmu_events__test_soc_sys[] = {
 #

However event "software" is not listed, as can be seen in the
generated const struct pmu_events_map pmu_events_map[].
So in function perf_pmu__find_events_table(), the variable
table is initialized to NULL, but never set to a proper
value. The function scans all generated &pmu_events_map[]
tables, but no table matches, because the tables are
s390 CPU Measurement unit specific:

  i = 0;
  for (;;) {
      const struct pmu_events_map *map = &pmu_events_map[i++];
      if (!map->arch)
           break;

      --> the maps are there because the build generated them

           if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
                table = &map->event_table;
                break;
           }
      --> Since no matching CPU string the table var remains 0x0
      }
      free(cpuid);
      if (!pmu)
           return table;

      --> The pmu is "software" so it exists and no return

      --> and here perf dies because table is 0x0
      for (i = 0; i < table->num_pmus; i++) {
	      ...
      }
      return NULL;

Fix this and do not access the table variable. Instead return 0x0
which is the same return code when the for-loop was not successful.

Output after:
 # ./perf stat -e cs -- true

 Performance counter stats for 'true':

                 0      cs

       0.000853105 seconds time elapsed

       0.000061000 seconds user
       0.000827000 seconds sys

 # ./perf stat -e cpu-clock -- true

 Performance counter stats for 'true':

              0.25 msec cpu-clock #    0.341 CPUs utilized

       0.000728383 seconds time elapsed

       0.000055000 seconds user
       0.000706000 seconds sys

 # ./perf stat -e cycles -- true

 Performance counter stats for 'true':

   <not supported>      cycles

       0.000767298 seconds time elapsed

       0.000055000 seconds user
       0.000739000 seconds sys

 #

Fixes: 7c52f10c0d4d8 ("perf pmu: Cache JSON events table")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: dengler@linux.ibm.com
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: sumanthk@linux.ibm.com
Cc: svens@linux.ibm.com
Link: https://lore.kernel.org/r/20230913125157.2790375-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-17 15:51:57 -07:00
Ian Rogers
eaaebb01a7 perf pmu: Ensure all alias variables are initialized
Fix an error detected by memory sanitizer:
```
==4033==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x55fb0fbedfc7 in read_alias_info tools/perf/util/pmu.c:457:6
    #1 0x55fb0fbea339 in check_info_data tools/perf/util/pmu.c:1434:2
    #2 0x55fb0fbea339 in perf_pmu__check_alias tools/perf/util/pmu.c:1504:9
    #3 0x55fb0fbdca85 in parse_events_add_pmu tools/perf/util/parse-events.c:1429:32
    #4 0x55fb0f965230 in parse_events_parse tools/perf/util/parse-events.y:299:6
    #5 0x55fb0fbdf6b2 in parse_events__scanner tools/perf/util/parse-events.c:1822:8
    #6 0x55fb0fbdf8c1 in __parse_events tools/perf/util/parse-events.c:2094:8
    #7 0x55fb0fa8ffa9 in parse_events tools/perf/util/parse-events.h:41:9
    #8 0x55fb0fa8ffa9 in test_event tools/perf/tests/parse-events.c:2393:8
    #9 0x55fb0fa8f458 in test__pmu_events tools/perf/tests/parse-events.c:2551:15
    #10 0x55fb0fa6d93f in run_test tools/perf/tests/builtin-test.c:242:9
    #11 0x55fb0fa6d93f in test_and_print tools/perf/tests/builtin-test.c:271:8
    #12 0x55fb0fa6d082 in __cmd_test tools/perf/tests/builtin-test.c:442:5
    #13 0x55fb0fa6d082 in cmd_test tools/perf/tests/builtin-test.c:564:9
    #14 0x55fb0f942720 in run_builtin tools/perf/perf.c:322:11
    #15 0x55fb0f942486 in handle_internal_command tools/perf/perf.c:375:8
    #16 0x55fb0f941dab in run_argv tools/perf/perf.c:419:2
    #17 0x55fb0f941dab in main tools/perf/perf.c:535:3
```

Fixes: 7b723dbb96e8 ("perf pmu: Be lazy about loading event info files from sysfs")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20230914022425.1489035-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-17 15:51:42 -07:00
Ian Rogers
d1bac78e26 perf jevents metric: Fix type of strcmp_cpuid_str
The parser wraps all strings as Events, so the input is an
Event. Using a string would be bad as functions like Simplify are
called on the arguments, which wouldn't be present on a string.

Fixes: 9d5da30e4ae9 ("perf jevents: Add a new expression builtin strcmp_cpuid_str()")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20230914022204.1488383-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-17 15:51:35 -07:00
Ian Rogers
33b725ce7b perf trace: Avoid compile error wrt redefining bool
Make part of an existing TODO conditional to avoid the following build
error:
```
tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c:26:14: error: cannot combine with previous 'char' declaration specifier
   26 | typedef char bool;
      |              ^
include/stdbool.h:20:14: note: expanded from macro 'bool'
   20 | #define bool _Bool
      |              ^
tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c:26:1: error: typedef requires a name [-Werror,-Wmissing-declarations]
   26 | typedef char bool;
      | ^~~~~~~~~~~~~~~~~
2 errors generated.
```

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230913184957.230076-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-17 15:51:32 -07:00
Ian Rogers
4a73fca226 perf bpf-prologue: Remove unused file
Commit 3d6dfae88917 ("perf parse-events: Remove BPF event support")
removed building bpf-prologue.c but failed to remove the actual file.

Fixes: 3d6dfae88917 ("perf parse-events: Remove BPF event support")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230913184534.227961-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-17 15:51:28 -07:00
Yang Jihong
a132b784db perf test: Fix test-record-dummy-C0 failure for supported PERF_FORMAT_LOST feature kernel
For kernel that supports PERF_FORMAT_LOST, attr->read_format has
PERF_FORMAT_LOST bit. Update expected value of
attr->read_format of test-record-dummy-C0 for this scenario.

Before:

  # ./perf test 17 -vv
   17: Setup struct perf_event_attr                                    :
  --- start ---
  test child forked, pid 1609441
  <SNIP>
  running './tests/attr/test-record-dummy-C0'
    'PERF_TEST_ATTR=/tmp/tmpm3s60aji ./perf record -o /tmp/tmpm3s60aji/perf.data --no-bpf-event -e dummy -C 0 kill >/dev/null 2>&1' ret '1', expected '1'
  expected read_format=4, got 20
  FAILED './tests/attr/test-record-dummy-C0' - match failure
  test child finished with -1
  ---- end ----
  Setup struct perf_event_attr: FAILED!

After:

  # ./perf test 17 -vv
   17: Setup struct perf_event_attr                                    :
  --- start ---
  test child forked, pid 1609441
  <SNIP>
  running './tests/attr/test-record-dummy-C0'
    'PERF_TEST_ATTR=/tmp/tmppa9vxcb7 ./perf record -o /tmp/tmppa9vxcb7/perf.data --no-bpf-event -e dummy -C 0 kill >/dev/null 2>&1' ret '1', expected '1'
  <SNIP>
  test child finished with 0
  ---- end ----
  Setup struct perf_event_attr: Ok

Reported-and-Tested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20230916091641.776031-1-yangjihong1@huawei.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-16 22:36:36 -07:00
Colin Ian King
79df8365e3 perf kwork: Fix spelling mistake "COMMMAND" -> "COMMAND"
There is a spelling mistake in a literal string. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: kernel-janitors@vger.kernel.org
Link: https://lore.kernel.org/r/20230915090910.30182-1-colin.i.king@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-15 16:46:40 -07:00
Namhyung Kim
486021e04b perf annotate: Add more x86 mov instruction cases
Instructions with sign- and zero- extention like movsbl and movzwq were
not handled properly.  As it can check different size suffix (-b, -w, -l
or -q) we can omit that and add the common parts even though some
combinations are not possible.

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20230908052216.566148-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-15 16:46:40 -07:00
James Clark
70360fad91 perf pmu: Remove unused function
pmu_events_table__find() is no longer used so remove it and its Arm
specific version.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230913153355.138331-4-james.clark@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-15 16:46:40 -07:00
James Clark
105e5b433e perf pmus: Simplify perf_pmus__find_core_pmu()
Currently the while loop always either exits on the first iteration with
a core PMU, or exits with NULL on heterogeneous systems or when not all
CPUs are online.

Both of the latter behaviors are undesirable for platforms other than
Arm so simplify it to always return the first core PMU, or NULL if none
exist.

This behavior was depended on by the Arm version of
pmu_metrics_table__find(), so the logic has been moved there instead.

Signed-off-by: James Clark <james.clark@arm.com>
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230913153355.138331-3-james.clark@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-15 16:46:40 -07:00
James Clark
3d0f5f456a perf pmu: Move pmu__find_core_pmu() to pmus.c
pmu__find_core_pmu() more logically belongs in pmus.c because it
iterates over all PMUs, so move it to pmus.c

At the same time rename it to perf_pmus__find_core_pmu() to match the
naming convention in this file.

list_prepare_entry() can't be used in perf_pmus__scan_core() anymore now
that it's called from the same compilation unit. This is with -O2
(specifically -O1 -ftree-vrp -finline-functions
-finline-small-functions) which allow the bounds of the array
access to be determined at compile time. list_prepare_entry() subtracts
the offset of the 'list' member in struct perf_pmu from &core_pmus,
which isn't a struct perf_pmu. The compiler sees that pmu results in
&core_pmus - 8 and refuses to compile. At runtime this works because
list_for_each_entry_continue() always adds the offset back again before
dereferencing ->next, but it's technically undefined behavior. With
-fsanitize=undefined an additional warning is generated.

Using list_first_entry_or_null() to get the first entry here avoids
doing &core_pmus - 8 but has the same result and fixes both the compile
warning and the undefined behavior warning. There are other uses of
list_prepare_entry() in pmus.c, but the compiler doesn't seem to be
able to see that they can also be called with &core_pmus, so I won't
change any at this time.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230913153355.138331-2-james.clark@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-15 16:46:40 -07:00
Ian Rogers
21ce931e55 perf symbol: Avoid an undefined behavior warning
The node (nd) may be NULL and pointer arithmetic on NULL is undefined
behavior. Move the computation of next below the NULL check on the
node.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20230914044233.1550195-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-15 16:46:40 -07:00
Arnaldo Carvalho de Melo
678ddf730a perf bench sched-seccomp-notify: Use the tools copy of seccomp.h UAPI
To keep perf building in systems where types and defines used in this
new benchmark are not available, such as:

  12    13.46 centos:stream                 : FAIL gcc version 8.5.0 20210514 (Red Hat 8.5.0-20) (GCC)
    bench/sched-seccomp-notify.c: In function 'user_notif_syscall':
    bench/sched-seccomp-notify.c:55:27: error: 'SECCOMP_RET_USER_NOTIF' undeclared (first use in this function); did you mean 'SECCOMP_RET_ERRNO'?
       BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_USER_NOTIF),
                               ^~~~~~~~~~~~~~~~~~~~~~
    /git/perf-6.6.0-rc1/tools/include/uapi/linux/filter.h:49:59: note: in definition of macro 'BPF_STMT'
     #define BPF_STMT(code, k) { (unsigned short)(code), 0, 0, k }
                                                               ^
    bench/sched-seccomp-notify.c:55:27: note: each undeclared identifier is reported only once for each function it appears in
       BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_USER_NOTIF),
                               ^~~~~~~~~~~~~~~~~~~~~~
    /git/perf-6.6.0-rc1/tools/include/uapi/linux/filter.h:49:59: note: in definition of macro 'BPF_STMT'
     #define BPF_STMT(code, k) { (unsigned short)(code), 0, 0, k }
                                                               ^
    bench/sched-seccomp-notify.c:55:3: error: missing initializer for field 'k' of 'struct sock_filter' [-Werror=missing-field-initializers]
       BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_USER_NOTIF),
       ^~~~~~~~
    In file included from bench/sched-seccomp-notify.c:5:
    /git/perf-6.6.0-rc1/tools/include/uapi/linux/filter.h:28:8: note: 'k' declared here
      __u32 k;      /* Generic multiuse field */
            ^
    bench/sched-seccomp-notify.c: In function 'user_notification_sync_loop':
    bench/sched-seccomp-notify.c:70:28: error: storage size of 'resp' isn't known
      struct seccomp_notif_resp resp;
                                ^~~~
    bench/sched-seccomp-notify.c:71:23: error: storage size of 'req' isn't known
      struct seccomp_notif req;
                           ^~~
    bench/sched-seccomp-notify.c:76:23: error: 'SECCOMP_IOCTL_NOTIF_RECV' undeclared (first use in this function); did you mean 'SECCOMP_MODE_STRICT'?
       if (ioctl(listener, SECCOMP_IOCTL_NOTIF_RECV, &req))
                           ^~~~~~~~~~~~~~~~~~~~~~~~
                           SECCOMP_MODE_STRICT
    bench/sched-seccomp-notify.c:86:23: error: 'SECCOMP_IOCTL_NOTIF_SEND' undeclared (first use in this function); did you mean 'SECCOMP_RET_ACTION'?
       if (ioctl(listener, SECCOMP_IOCTL_NOTIF_SEND, &resp))
                           ^~~~~~~~~~~~~~~~~~~~~~~~
                           SECCOMP_RET_ACTION
    bench/sched-seccomp-notify.c:71:23: error: unused variable 'req' [-Werror=unused-variable]
      struct seccomp_notif req;
                           ^~~
    bench/sched-seccomp-notify.c:70:28: error: unused variable 'resp' [-Werror=unused-variable]
      struct seccomp_notif_resp resp;
                                ^~~~

  14    11.31 debian:10                     : FAIL gcc version 8.3.0 (Debian 8.3.0-6)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Kook <keescook@chromium.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/ZQGhjaojgOGtSNk6@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-13 08:49:00 -03:00
Arnaldo Carvalho de Melo
417ecb614f tools headers UAPI: Copy seccomp.h to be able to build 'perf bench' in older systems
The new 'perf bench' for sched-seccomp-notify uses defines and types not
available in older systems where we want to have perf available, so grab
a copy of this UAPI from the kernel sources to allow that.

This will be checked in the future for drift from the original when we
build the perf tool, that will warn when that happens like:

  make: Entering directory '/var/home/acme/git/perf-tools/tools/perf'
    BUILD:   Doing 'make -j32' parallel build
  Warning: Kernel ABI header differences:

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Kook <keescook@chromium.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/ZQGhMXtwX7RvV3ya@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-13 08:48:48 -03:00
Arnaldo Carvalho de Melo
f7875966dc tools headers UAPI: Sync files changed by new fchmodat2 and map_shadow_stack syscalls with the kernel sources
To pick the changes in these csets:

  c35559f94ebc3e3b ("x86/shstk: Introduce map_shadow_stack syscall")
  78252deb023cf087 ("arch: Register fchmodat2, usually as syscall 452")

That add support for this new syscall in tools such as 'perf trace'.

For instance, this is now possible:

  # perf trace -v -e fchmodat*,map_shadow_stack --max-events=4
  Using CPUID AuthenticAMD-25-21-0
  Reusing "openat" BPF sys_enter augmenter for "fchmodat"
  event qualifier tracepoint filter: (common_pid != 3499340 && common_pid != 11259) && (id == 268 || id == 452 || id == 453)
  ^C#

  And it'll work as with other syscalls, for instance openat:

  # perf trace -e openat* --max-events=4
     0.000 ( 0.015 ms): systemd-oomd/1150 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC)    = 11
     0.068 ( 0.019 ms): systemd-oomd/1150 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1001.slice/user@1001.service/memory.pressure", flags: RDONLY|CLOEXEC) = 11
     0.119 ( 0.008 ms): systemd-oomd/1150 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1001.slice/user@1001.service/memory.current", flags: RDONLY|CLOEXEC) = 11
     0.138 ( 0.006 ms): systemd-oomd/1150 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1001.slice/user@1001.service/memory.min", flags: RDONLY|CLOEXEC) = 11
  #

That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.

  $ find tools/perf/arch/ -name "syscall*tbl" | xargs grep -E fchmodat\|sys_map_shadow_stack
  tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl:258	n64	fchmodat			sys_fchmodat
  tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl:452	n64	fchmodat2			sys_fchmodat2
  tools/perf/arch/powerpc/entry/syscalls/syscall.tbl:297	common	fchmodat			sys_fchmodat
  tools/perf/arch/powerpc/entry/syscalls/syscall.tbl:452	common	fchmodat2			sys_fchmodat2
  tools/perf/arch/s390/entry/syscalls/syscall.tbl:299  common	fchmodat		sys_fchmodat			sys_fchmodat
  tools/perf/arch/s390/entry/syscalls/syscall.tbl:452  common	fchmodat2		sys_fchmodat2			sys_fchmodat2
  tools/perf/arch/x86/entry/syscalls/syscall_64.tbl:268	common	fchmodat		sys_fchmodat
  tools/perf/arch/x86/entry/syscalls/syscall_64.tbl:452	common	fchmodat2		sys_fchmodat2
  tools/perf/arch/x86/entry/syscalls/syscall_64.tbl:453	64	map_shadow_stack	sys_map_shadow_stack
  $

  $ grep -Ew map_shadow_stack\|fchmodat2 /tmp/build/perf-tools/arch/x86/include/generated/asm/syscalls_64.c
	[452] = "fchmodat2",
	[453] = "map_shadow_stack",
  $

This addresses these perf build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
    diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
    diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Link: https://lore.kernel.org/lkml/ZP8bE7aXDBu%2Fdrak@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-13 08:24:51 -03:00
Ian Rogers
999b81b907 perf bpf-filter: Add YYDEBUG
YYDEBUG enables line numbers and other error helpers in the generated
bpf-filter-bison.c. Conditionally enabled only for debug builds.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230911170559.4037734-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:50:36 -03:00
Ian Rogers
f0f4cd1003 perf pmu: Add YYDEBUG
YYDEBUG enables line numbers and other error helpers in the generated
pmu-bison.c. Conditionally enabled only for debug builds.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230911170559.4037734-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:47:00 -03:00
Ian Rogers
1344a7077d perf expr: Make YYDEBUG dependent on doing a debug build
YYDEBUG enables line numbers and other error helpers in the generated
expr-bison.c. These shouldn't be generated when debugging
isn't enabled.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230911170559.4037734-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:46:50 -03:00
Ian Rogers
d4ce60190e perf parse-events: Make YYDEBUG dependent on doing a debug build
YYDEBUG enables line numbers and other error helpers in the generated
parse-events-bison.c. These shouldn't be generated when debugging
isn't enabled.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230911170559.4037734-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:46:16 -03:00
Ian Rogers
dc2cfef9a9 perf parse-events: Remove unused header files
The fnmatch header is now used in the PMU matching logic in pmu.c.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230911170559.4037734-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:46:03 -03:00
Athira Rajeev
f5d98b8bdc perf tools: Add includes for detected configs in Makefile.perf
Makefile.perf uses "CONFIG_*" checks in the code. Example the config for
libtraceevent is used to set PYTHON_EXT_SRCS

	ifeq ($(CONFIG_LIBTRACEEVENT),y)
	  PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
	else
	  PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' util/python-ext-sources)
	endif

But this is not picking the value for CONFIG_LIBTRACEEVENT that is set
using the settings in Makefile.config. Include the file
".config-detected" so that make will use the system detected
configuration in the CONFIG checks.

This will fix isues that could arise when other "CONFIG_*" checks are
added to Makefile.perf in future as well.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230912063807.74250-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
Ruidong Tian
bb35084796 perf test: Update cs_etm testcase for Arm ETE
Add ETE as one of the supported device types in perf cs_etm testcase.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230911065541.91293-1-tianruidong@linux.alibaba.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
James Clark
5cdb51baf3 perf vendor events arm64: Add V1 metrics using Arm telemetry repo
Metrics for V1 weren't previously included in the Perf Jsons, so add
them using the telemetry source [1].

After generation any parts identical to the default metrics in sbsa.json
were manually removed.

[1]: https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/blob/main/data/pmu/cpu/neoverse/neoverse-v1.json

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Forrington <nick.forrington@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230831161618.134738-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
James Clark
a484e64580 perf vendor events arm64: Update V1 events using Arm telemetry repo
The new data [1] includes descriptions that may have product specific
details and new groupings that will be consistent with other products.

The following command was used to generate the jsons:

 $ telemetry-solution/tools/perf_json_generator/generate.py \
   linux/tools/perf/ --telemetry-files \
   telemetry-solution/data/pmu/cpu/neoverse/neoverse-v1.json

[1]: https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/blob/main/data/pmu/cpu/neoverse/neoverse-v1.json

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Forrington <nick.forrington@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230831161618.134738-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
James Clark
a1ebf7718e perf test: Add a test for strcmp_cpuid_str() expression
Test that the new expression builtin returns a match when the current
escaped CPU ID is given, and that it doesn't match when "0x0" is given.

The CPU ID in test__expr() has to be changed to perf_pmu__getcpuid()
which returns the CPU ID string, rather than the raw CPU ID that
get_cpuid() returns because that can't be used with strcmp_cpuid_str().
It doesn't affect the is_intel test because both versions contain
"Intel".

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chen Zhongjin <chenzhongjin@huawei.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230904095104.1162928-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
James Clark
8a55c1e2c9 perf util: Add a function for replacing characters in a string
It finds all occurrences of a single character and replaces them with
a multi character string. This will be used in a test in a following
commit.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chen Zhongjin <chenzhongjin@huawei.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230904095104.1162928-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
James Clark
f561fc78c5 perf jevents: Remove unused keyword
'cpuid_not_more_than' was the working title of the new
'strcmp_cpuid_str' keyword and was accidentally left in. It was never
used so tidying it up has no effect.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chen Zhongjin <chenzhongjin@huawei.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230904095104.1162928-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
James Clark
d19a353cdd perf test: Check result of has_event(cycles) test
Currently the function always returns 0, so even when the has_event()
test fails, the test still passes. Fix it by returning ret instead.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chen Zhongjin <chenzhongjin@huawei.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230904095104.1162928-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
Ian Rogers
6bd8c2ea6b perf list pfm: Retry supported test with exclude_kernel
With paranoia set at 2 evsel__open will fail with EACCES for non-root
users. To avoid this stopping libpfm4 events from being printed, retry
with exclude_kernel enabled - copying the regular is_event_supported
test.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230906234416.3472339-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
Ian Rogers
4f19fc1839 perf list: Avoid a hardcoded cpu PMU name
Use the first core PMU instead.

On a Raspberry Pi, before:

  $ perf list
  ...
    cpu/t1=v1[,t2=v2,t3 ...]/modifier                  [Raw hardware event descriptor]
         [(see 'man perf-list' on how to encode it)]
  ...

After:

  $ perf list
  ...
    armv8_cortex_a72/t1=v1[,t2=v2,t3 ...]/modifier     [Raw hardware event descriptor]
         [(see 'man perf-list' on how to encode it)]
  ...
  ```

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230906234416.3472339-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
Namhyung Kim
e44b47b931 perf test shell lock_contention: Add cgroup aggregation and filter tests
Add cgroup aggregation and filter tests.

  $ sudo ./perf test -v contention
   84: kernel lock contention analysis test                            :
  --- start ---
  test child forked, pid 222423
  Testing perf lock record and perf lock contention
  Testing perf lock contention --use-bpf
  Testing perf lock record and perf lock contention at the same time
  Testing perf lock contention --threads
  Testing perf lock contention --lock-addr
  Testing perf lock contention --lock-cgroup
  Testing perf lock contention --type-filter (w/ spinlock)
  Testing perf lock contention --lock-filter (w/ tasklist_lock)
  Testing perf lock contention --callstack-filter (w/ unix_stream)
  Testing perf lock contention --callstack-filter with task aggregation
  Testing perf lock contention --cgroup-filter
  Testing perf lock contention CSV output
  test child finished with 0
  ---- end ----
  kernel lock contention analysis test: Ok

Committer testing:

  [root@quaco ~]# uname -a
  Linux quaco 6.4.10-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Aug 11 12:20:29 UTC 2023 x86_64 GNU/Linux
  [root@quaco ~]# perf test -v contention
   84: kernel lock contention analysis test                            :
  --- start ---
  test child forked, pid 452625
  Testing perf lock record and perf lock contention
  Testing perf lock contention --use-bpf
  Testing perf lock record and perf lock contention at the same time
  Testing perf lock contention --threads
  Testing perf lock contention --lock-addr
  Testing perf lock contention --lock-cgroup
  Testing perf lock contention --type-filter (w/ spinlock)
  Testing perf lock contention --lock-filter (w/ tasklist_lock)
  Testing perf lock contention --callstack-filter (w/ unix_stream)
  Testing perf lock contention --callstack-filter with task aggregation
  Testing perf lock contention --cgroup-filter
  Testing perf lock contention CSV output
  test child finished with 0
  ---- end ----
  kernel lock contention analysis test: Ok
  [root@quaco ~]#

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230906174903.346486-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
Namhyung Kim
4fd06bd2dc perf lock contention: Add -G/--cgroup-filter option
The -G/--cgroup-filter is to limit lock contention collection on the
tasks in the specific cgroups only.

  $ sudo ./perf lock con -abt -G /user.slice/.../vte-spawn-52221fb8-b33f-4a52-b5c3-e35d1e6fc0e0.scope \
    ./perf bench sched messaging
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 10 groups == 400 processes run

       Total time: 0.174 [sec]
   contended   total wait     max wait     avg wait          pid   comm

           4    114.45 us     60.06 us     28.61 us       214847   sched-messaging
           2    111.40 us     60.84 us     55.70 us       214848   sched-messaging
           2    106.09 us     59.42 us     53.04 us       214837   sched-messaging
           1     81.70 us     81.70 us     81.70 us       214709   sched-messaging
          68     78.44 us      6.83 us      1.15 us       214633   sched-messaging
          69     73.71 us      2.69 us      1.07 us       214632   sched-messaging
           4     72.62 us     60.83 us     18.15 us       214850   sched-messaging
           2     71.75 us     67.60 us     35.88 us       214840   sched-messaging
           2     69.29 us     67.53 us     34.65 us       214804   sched-messaging
           2     69.00 us     68.23 us     34.50 us       214826   sched-messaging
  ...

Export cgroup__new() function as it's needed from outside.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230906174903.346486-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
Namhyung Kim
4d1792d0a2 perf lock contention: Add --lock-cgroup option
The --lock-cgroup option shows lock contention stats break down by
cgroups.

Add LOCK_AGGR_CGROUP mode and use it instead of use_cgroup field.

  $ sudo ./perf lock con -ab --lock-cgroup sleep 1
   contended   total wait     max wait     avg wait   cgroup

           8     15.70 us      6.34 us      1.96 us   /
           2      1.48 us       747 ns       738 ns   /user.slice/.../app.slice/app-gnome-google\x2dchrome-6442.scope
           1       848 ns       848 ns       848 ns   /user.slice/.../session.slice/org.gnome.Shell@x11.service
           1       220 ns       220 ns       220 ns   /user.slice/.../session.slice/pipewire-pulse.service

For now, the cgroup mode only works with BPF (-b).

Committer notes:

Remove -g as it is used in the other tools with a clear meaning of
collect/show callchains. As agreed with Namhyung off list.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230906174903.346486-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
Namhyung Kim
d0c502e46e perf lock contention: Prepare to handle cgroups
Save cgroup info and display cgroup names if requested.  This is a
preparation for the next patch.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230906174903.346486-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
Namhyung Kim
2bc12abce8 perf tools: Add read_all_cgroups() and __cgroup_find()
The read_all_cgroups() is to build a tree of cgroups in the system and
users can look up a cgroup using __cgroup_find().

Committer notes:

Had to do this to cover that #else block:

  -static inline u64 __read_cgroup_id(const char *path) { return -1ULL; }
  +static inline u64 __read_cgroup_id(const char *path __maybe_unused) { return -1ULL; }

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230906174903.346486-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:32:00 -03:00
Yang Jihong
36019dff30 perf kwork top: Add BPF-based statistics on softirq event support
Use BPF to collect statistics on softirq events based on perf BPF skeletons.

Example usage:

  # perf kwork top -b
  Starting trace, Hit <Ctrl+C> to stop and report
  ^C
  Total  : 135445.704 ms, 8 cpus
  %Cpu(s):  28.35% id,   0.00% hi,   0.25% si
  %Cpu0   [||||||||||||||||||||            69.85%]
  %Cpu1   [||||||||||||||||||||||          74.10%]
  %Cpu2   [|||||||||||||||||||||           71.18%]
  %Cpu3   [||||||||||||||||||||            69.61%]
  %Cpu4   [||||||||||||||||||||||          74.05%]
  %Cpu5   [||||||||||||||||||||            69.33%]
  %Cpu6   [||||||||||||||||||||            69.71%]
  %Cpu7   [||||||||||||||||||||||          73.77%]

        PID     SPID    %CPU           RUNTIME  COMMMAND
    -------------------------------------------------------------
          0        0   30.43       5271.005 ms  [swapper/5]
          0        0   30.17       5226.644 ms  [swapper/3]
          0        0   30.08       5210.257 ms  [swapper/6]
          0        0   29.89       5177.177 ms  [swapper/0]
          0        0   28.51       4938.672 ms  [swapper/2]
          0        0   25.93       4223.464 ms  [swapper/7]
          0        0   25.69       4181.411 ms  [swapper/4]
          0        0   25.63       4173.804 ms  [swapper/1]
      16665    16265    2.16        360.600 ms  sched-messaging
      16537    16265    2.05        356.275 ms  sched-messaging
      16503    16265    2.01        343.063 ms  sched-messaging
      16424    16265    1.97        336.876 ms  sched-messaging
      16580    16265    1.94        323.658 ms  sched-messaging
      16515    16265    1.92        321.616 ms  sched-messaging
      16659    16265    1.91        325.538 ms  sched-messaging
      16634    16265    1.88        327.766 ms  sched-messaging
      16454    16265    1.87        326.843 ms  sched-messaging
      16382    16265    1.87        322.591 ms  sched-messaging
      16642    16265    1.86        320.506 ms  sched-messaging
      16582    16265    1.86        320.164 ms  sched-messaging
      16315    16265    1.86        326.872 ms  sched-messaging
      16637    16265    1.85        323.766 ms  sched-messaging
      16506    16265    1.82        311.688 ms  sched-messaging
      16512    16265    1.81        304.643 ms  sched-messaging
      16560    16265    1.80        314.751 ms  sched-messaging
      16320    16265    1.80        313.405 ms  sched-messaging
      16442    16265    1.80        314.403 ms  sched-messaging
      16626    16265    1.78        295.380 ms  sched-messaging
      16600    16265    1.77        309.444 ms  sched-messaging
      16550    16265    1.76        301.161 ms  sched-messaging
      16525    16265    1.75        296.560 ms  sched-messaging
      16314    16265    1.75        298.338 ms  sched-messaging
      16595    16265    1.74        304.390 ms  sched-messaging
      16555    16265    1.74        287.564 ms  sched-messaging
      16520    16265    1.74        295.734 ms  sched-messaging
      16507    16265    1.73        293.956 ms  sched-messaging
      16593    16265    1.72        296.443 ms  sched-messaging
      16531    16265    1.72        299.950 ms  sched-messaging
      16281    16265    1.72        301.339 ms  sched-messaging
  <SNIP>

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230812084917.169338-17-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-12 17:31:59 -03:00