15393 Commits

Author SHA1 Message Date
Adrian Hunter
ee534525ba perf intel-pt: Fix async branch flags
commit f2d87895cbc4af80649850dcf5da36de6b2ed3dd upstream.

Ensure PERF_IP_FLAG_ASYNC is set always for asynchronous branches (i.e.
interrupts etc).

Fixes: 90e457f7be08 ("perf tools: Add Intel PT support")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230928072953.19369-1-adrian.hunter@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28 17:19:55 +00:00
Ian Rogers
0c67c9a2d4 perf vendor events intel: Add broadwellde two metrics
[ Upstream commit 19a214bffdf7abb8d472895bb944d9c269ab1699 ]

Add tma_info_system_socket_clks and uncore_freq metrics that require a
broadwellx style uncore event for UNC_CLOCK.

The associated converter script fix is in:
https://github.com/intel/perfmon/pull/112

Fixes: 7d124303d620 ("perf vendor events intel: Update broadwell variant events/metrics")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Link: https://lore.kernel.org/r/20230926205948.1399594-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:29 +01:00
Ian Rogers
8a1be9decb perf vendor events intel: Fix broadwellde tma_info_system_dram_bw_use metric
[ Upstream commit 3779416eed25a843364b940decee452620a1de4b ]

Broadwell-de has a consumer core and server uncore. The uncore_arb PMU
isn't present and the broadwellx style cbox PMU should be used
instead. Fix the tma_info_system_dram_bw_use metric to use the server
metric rather than client.

The associated converter script fix is in:
https://github.com/intel/perfmon/pull/111

Fixes: 7d124303d620 ("perf vendor events intel: Update broadwell variant events/metrics")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Link: https://lore.kernel.org/r/20230926031034.1201145-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:29 +01:00
Ian Rogers
581570045a perf hist: Add missing puts to hist__account_cycles
[ Upstream commit c1149037f65bcf0334886180ebe3d5efcf214912 ]

Caught using reference count checking on perf top with
"--call-graph=lbr". After this no memory leaks were detected.

Fixes: 57849998e2cd ("perf report: Add processing for cycle histograms")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: liuwenyu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20231024222353.3024098-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:29 +01:00
Ian Rogers
c139d4d928 perf machine: Avoid out of bounds LBR memory read
[ Upstream commit ab8ce150781d326c6bfbe1e09f175ffde1186f80 ]

Running perf top with address sanitizer and "--call-graph=lbr" fails
due to reading sample 0 when no samples exist. Add a guard to prevent
this.

Fixes: e2b23483eb1d ("perf machine: Factor out lbr_callchain_add_lbr_ip()")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: liuwenyu <liuwenyu7@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20231024222353.3024098-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:29 +01:00
Kajol Jain
e2eb96ae62 perf vendor events: Update PMC used in PM_RUN_INST_CMPL event for power10 platform
[ Upstream commit 3f8b6e5b11192dacb721d2d28ea4589917f5e822 ]

The CPI_STALL_RATIO metric group can be used to present the high
level CPI stall breakdown metrics in powerpc, which will show:

- DISPATCH_STALL_CPI ( Dispatch stall cycles per insn )
- ISSUE_STALL_CPI ( Issue stall cycles per insn )
- EXECUTION_STALL_CPI ( Execution stall cycles per insn )
- COMPLETION_STALL_CPI ( Completion stall cycles per insn )

Commit cf26e043c2a9 ("perf vendor events power10: Add JSON
metric events to present CPI stall cycles in powerpc)" which added
the CPI_STALL_RATIO metric group, also modified
the PMC value used in PM_RUN_INST_CMPL event from PMC4 to PMC5,
to avoid multiplexing of events.
But that got revert in recent changes. Fix this issue by changing
back the PMC value used in PM_RUN_INST_CMPL to PMC5.

Result with the fix:

 ./perf stat --metric-no-group -M CPI_STALL_RATIO <workload>

 Performance counter stats for 'workload':

        68,745,426      PM_CMPL_STALL                    #     0.21 COMPLETION_STALL_CPI
         7,692,827      PM_ISSUE_STALL                   #     0.02 ISSUE_STALL_CPI
       322,638,223      PM_RUN_INST_CMPL                 #     0.05 DISPATCH_STALL_CPI
                                                  #     0.48 EXECUTION_STALL_CPI
        16,858,553      PM_DISP_STALL_CYC
       153,880,133      PM_EXEC_STALL

       0.089774592 seconds time elapsed

"--metric-no-group" is used for forcing PM_RUN_INST_CMPL to be scheduled
in all group for more accuracy.

Fixes: 7d473f475b2a ("perf vendor events: Move JSON/events to appropriate files for power10 platform")
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel<disgoel@linux.ibm.com>
Cc: maddy@linux.ibm.com
Link: https://lore.kernel.org/r/20231016143110.244255-1-kjain@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:29 +01:00
Thomas Richter
b18be5a561 perf trace: Use the right bpf_probe_read(_str) variant for reading user data
[ Upstream commit 5069211e2f0b47e75119805e23ae6352d871e263 ]

Perf test case 111 Check open filename arg using perf trace + vfs_getname
fails on s390. This is caused by a failing function
bpf_probe_read() in file util/bpf_skel/augmented_raw_syscalls.bpf.c.

The root cause is the lookup by address. Function bpf_probe_read()
is used. This function works only for architectures
with ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE.

On s390 is not possible to determine from the address to which
address space the address belongs to (user or kernel space).

Replace bpf_probe_read() by bpf_probe_read_kernel()
and bpf_probe_read_str() by bpf_probe_read_user_str() to
explicity specify the address space the address refers to.

Output before:
 # ./perf trace -eopen,openat -- touch /tmp/111
 libbpf: prog 'sys_enter': BPF program load failed: Invalid argument
 libbpf: prog 'sys_enter': -- BEGIN PROG LOAD LOG --
 reg type unsupported for arg#0 function sys_enter#75
 0: R1=ctx(off=0,imm=0) R10=fp0
 ; int sys_enter(struct syscall_enter_args *args)
 0: (bf) r6 = r1           ; R1=ctx(off=0,imm=0) R6_w=ctx(off=0,imm=0)
 ; return bpf_get_current_pid_tgid();
 1: (85) call bpf_get_current_pid_tgid#14      ; R0_w=scalar()
 2: (63) *(u32 *)(r10 -8) = r0 ; R0_w=scalar() R10=fp0 fp-8=????mmmm
 3: (bf) r2 = r10              ; R2_w=fp0 R10=fp0
 ;
 .....
 lines deleted here
 .....
 23: (bf) r3 = r6              ; R3_w=ctx(off=0,imm=0) R6=ctx(off=0,imm=0)
 24: (85) call bpf_probe_read#4
 unknown func bpf_probe_read#4
 processed 23 insns (limit 1000000) max_states_per_insn 0 \
	 total_states 2 peak_states 2 mark_read 2
 -- END PROG LOAD LOG --
 libbpf: prog 'sys_enter': failed to load: -22
 libbpf: failed to load object 'augmented_raw_syscalls_bpf'
 libbpf: failed to load BPF skeleton 'augmented_raw_syscalls_bpf': -22
 ....

Output after:
 # ./perf test -Fv 111
 111: Check open filename arg using perf trace + vfs_getname          :
 --- start ---
     1.085 ( 0.011 ms): touch/320753 openat(dfd: CWD, filename: \
	"/tmp/temporary_file.SWH85", \
	flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: IRUGO|IWUGO) = 3
 ---- end ----
 Check open filename arg using perf trace + vfs_getname: Ok
 #

Test with the sleep command shows:
Output before:
 # ./perf trace -e *sleep sleep 1.234567890
     0.000 (1234.681 ms): sleep/63114 clock_nanosleep(rqtp: \
         { .tv_sec: 0, .tv_nsec: 0 }, rmtp: 0x3ffe0979720) = 0
 #

Output after:
 # ./perf trace -e *sleep sleep 1.234567890
     0.000 (1234.686 ms): sleep/64277 clock_nanosleep(rqtp: \
         { .tv_sec: 1, .tv_nsec: 234567890 }, rmtp: 0x3fff3df9ea0) = 0
 #

Fixes: 14e4b9f4289a ("perf trace: Raw augmented syscalls fix libbpf 1.0+ compatibility")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Co-developed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: sumanthk@linux.ibm.com
Cc: svens@linux.ibm.com
Link: https://lore.kernel.org/r/20231019082642.3286650-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:28 +01:00
Namhyung Kim
517310f7ce perf tools: Do not ignore the default vmlinux.h
[ Upstream commit 1f36b190ad2dea68e3a7e84b7b2f24ce8c4063ea ]

The recent change made it possible to generate vmlinux.h from BTF and
to ignore the file.  But we also have a minimal vmlinux.h that will be
used by default.  It should not be ignored by GIT.

Fixes: b7a2d774c9c5 ("perf build: Add ability to build with a generated vmlinux.h")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310110451.rvdUZJEY-lkp@intel.com/
Cc: oe-kbuild-all@lists.linux.dev
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:28 +01:00
Ian Rogers
650b41987a perf mem-events: Avoid uninitialized read
[ Upstream commit 85f73c377b2ac9988a204b119aebb33ca5c60083 ]

pmu should be initialized to NULL before perf_pmus__scan loop. Fix and
shrink the scope of pmu at the same time. Issue detected by clang-tidy.

Fixes: 5752c20f3787 ("perf mem: Scan all PMUs instead of just core ones")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: llvm@lists.linux.dev
Cc: Ming Wang <wangming01@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20231009183920.200859-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:28 +01:00
Ian Rogers
4fa4152950 perf parse-events: Fix for term values that are raw events
[ Upstream commit b20576fd7fe39554b212095c3c0d7a3dff512515 ]

Raw events can be strings like 'r0xead' but the 0x is optional so they
can also be 'read'. On IcelakeX uncore_imc_free_running has an event
called 'read' which may be programmed as:
```
$ perf stat -e 'uncore_imc_free_running/event=read/' -a sleep 1
```
However, the PE_RAW type isn't allowed on the right of a term, even
though in this case we just want to interpret it as a string. This
leads to the following error on IcelakeX:
```
$ perf stat -e 'uncore_imc_free_running/event=read/' -a sleep 1
event syntax error: '..nning/event=read/'
                                  \___ parser error
Run 'perf list' for a list of valid events

 Usage: perf stat [<options>] [<command>]

    -e, --event <event> event selector. use 'perf list' to list available events
```
Fix this by allowing raw types on the right of terms and treat them as
strings, just as is already done for PE_LEGACY_CACHE. Make this
consistent by just entirely removing name_or_legacy and always using
name_or_raw that covers all three cases.

Fixes: 6fd1e5191591 ("perf parse-events: Support PMUs for legacy cache events")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20230928004431.1926969-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:28 +01:00
Arnaldo Carvalho de Melo
221d9cbc5d perf build: Add missing comment about NO_LIBTRACEEVENT=1
[ Upstream commit c1783ddfb62420c44cdf4672dad2046f056c624b ]

By default perf will fail the build if the development files for
libtraceevent are not available.

To build perf without libtraceevent support, disabling several features
such as 'perf trace', one needs to add NO_LIBTRACEVENT=1 to the make
command line.

Add the missing comments about that to the tools/perf/Makefile.perf
file, just like all the other such command line toggles.

Fixes: 378ef0f5d9d7f465 ("perf build: Use libtraceevent from the system")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/ZR6+MhXtLnv6ow6E@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:27 +01:00
Kajol Jain
84f254d901 tools/perf: Update call stack check in builtin-lock.c
[ Upstream commit d7c9ae8d5d1be0c4156d1e20e4369a77b711a4cc ]

The perf test named "kernel lock contention analysis test"
fails in powerpc system with below error:

  [command]# ./perf test 81 -vv
   81: kernel lock contention analysis test                            :
   --- start ---
  test child forked, pid 2140
  Testing perf lock record and perf lock contention
  Testing perf lock contention --use-bpf
  [Skip] No BPF support
  Testing perf lock record and perf lock contention at the same time
  Testing perf lock contention --threads
  Testing perf lock contention --lock-addr
  Testing perf lock contention --type-filter (w/ spinlock)
  Testing perf lock contention --lock-filter (w/ tasklist_lock)
  Testing perf lock contention --callstack-filter (w/ unix_stream)
  [Fail] Recorded result should have a lock from unix_stream:
  test child finished with -1
   ---- end ----
  kernel lock contention analysis test: FAILED!

The test is failing because we get an address entry with 0 in
perf lock samples for powerpc, and code for lock contention
option "--callstack-filter" will not check further entries after
address 0.

Below are some of the samples from test generated perf.data file, which
have 0 address in the 2nd entry of callstack:
 --------
sched-messaging    3409 [001]  7152.904029: lock:contention_begin: 0xc00000c80904ef00 (flags=SPIN)
        c0000000001e926c __traceiter_contention_begin+0x6c ([kernel.kallsyms])
                       0 [unknown] ([unknown])
        c000000000f8a178 native_queued_spin_lock_slowpath+0x1f8 ([kernel.kallsyms])
        c000000000f89f44 _raw_spin_lock_irqsave+0x84 ([kernel.kallsyms])
        c0000000001d9fd0 prepare_to_wait+0x50 ([kernel.kallsyms])
        c000000000c80f50 sock_alloc_send_pskb+0x1b0 ([kernel.kallsyms])
        c000000000e82298 unix_stream_sendmsg+0x2b8 ([kernel.kallsyms])
        c000000000c78980 sock_sendmsg+0x80 ([kernel.kallsyms])

sched-messaging    3408 [005]  7152.904036: lock:contention_begin: 0xc00000c80904ef00 (flags=SPIN)
        c0000000001e926c __traceiter_contention_begin+0x6c ([kernel.kallsyms])
                       0 [unknown] ([unknown])
        c000000000f8a178 native_queued_spin_lock_slowpath+0x1f8 ([kernel.kallsyms])
        c000000000f89f44 _raw_spin_lock_irqsave+0x84 ([kernel.kallsyms])
        c0000000001d9fd0 prepare_to_wait+0x50 ([kernel.kallsyms])
        c000000000c80f50 sock_alloc_send_pskb+0x1b0 ([kernel.kallsyms])
        c000000000e82298 unix_stream_sendmsg+0x2b8 ([kernel.kallsyms])
        c000000000c78980 sock_sendmsg+0x80 ([kernel.kallsyms])
 --------

Based on commit 20002ded4d93 ("perf_counter: powerpc: Add callchain support"),
incase of powerpc, the callchain saved by kernel always includes first
three entries as the NIP (next instruction pointer), LR (link register), and
the contents of LR save area in the second stack frame. In certain scenarios
its possible to have invalid kernel instruction addresses in either of LR or the
second stack frame's LR. In that case, kernel will store the address as zer0.
Hence, its possible to have 2nd or 3rd callstack entry as 0.

As per the current code in match_callstack_filter function, we skip the callstack
check incase we get 0 address. And hence the test case is failing in powerpc.

Fix this issue by updating the check in match_callstack_filter function,
to not skip callstack check if the 2nd or 3rd entry have 0 address
for powerpc.

Result in powerpc after patch changes:

  [command]# ./perf test 81 -vv
   81: kernel lock contention analysis test                            :
   --- start ---
  test child forked, pid 4570
  Testing perf lock record and perf lock contention
  Testing perf lock contention --use-bpf
  [Skip] No BPF support
  Testing perf lock record and perf lock contention at the same time
  Testing perf lock contention --threads
  Testing perf lock contention --lock-addr
  Testing perf lock contention --type-filter (w/ spinlock)
  Testing perf lock contention --lock-filter (w/ tasklist_lock)
  [Skip] Could not find 'tasklist_lock'
  Testing perf lock contention --callstack-filter (w/ unix_stream)
  Testing perf lock contention --callstack-filter with task aggregation
  Testing perf lock contention CSV output
  [Skip] No BPF support
  test child finished with 0
   ---- end ----
  kernel lock contention analysis test: Ok

Fixes: ebab291641be ("perf lock contention: Support filters for different aggregation")
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Cc: maddy@linux.ibm.com
Cc: atrajeev@linux.vnet.ibm.com
Link: https://lore.kernel.org/r/20231003092113.252380-1-kjain@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:26 +01:00
Namhyung Kim
d9bd26de86 perf record: Fix BTF type checks in the off-cpu profiling
[ Upstream commit 0e501a65d35bf72414379fed0e31a0b6b81ab57d ]

The BTF func proto for a tracepoint has one more argument than the
actual tracepoint function since it has a context argument at the
begining.  So it should compare to 5 when the tracepoint has 4
arguments.

  typedef void (*btf_trace_sched_switch)(void *, bool, struct task_struct *, struct task_struct *, unsigned int);

Also, recent change in the perf tool would use a hand-written minimal
vmlinux.h to generate BTF in the skeleton.  So it won't have the info
of the tracepoint.  Anyway it should use the kernel's vmlinux BTF to
check the type in the kernel.

Fixes: b36888f71c85 ("perf record: Handle argument change in sched_switch")
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Song Liu <song@kernel.org>
Cc: Hao Luo <haoluo@google.com>
CC: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230922234444.3115821-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:25 +01:00
Ilkka Koskinen
fe9d8c08de perf vendor events arm64: Fix for AmpereOne metrics
[ Upstream commit 59faeaf80d0239412056c4495cc49097e9cea4da ]

This patch addresses review comments that were given for
705ed549148f ("perf vendor events arm64: Add AmpereOne metrics")
but didn't make it to the original patch [1][2]

Changes include: A fix for backend_memory formula, use of standard metrics
when possible, using #slots, renaming metrics to avoid spaces in the names,
and cleanup.

[1] https://lore.kernel.org/linux-perf-users/e9bdacb-a231-36af-6a2e-6918ee7effa@os.amperecomputing.com/
[2] https://lore.kernel.org/linux-perf-users/20230826192352.3043220-1-ilkka@os.amperecomputing.com/

Fixes: 705ed549148f ("perf vendor events arm64: Add AmpereOne metrics")
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Dave Kleikamp <dave.kleikamp@oracle.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: D Scott Phillips <scott@os.amperecomputing.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230920061839.2437413-1-ilkka@os.amperecomputing.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:25 +01:00
Ian Rogers
e3c2dc9d60 perf parse-events: Fix tracepoint name memory leak
[ Upstream commit ede72dca45b1893f3e9236b6ad6c4e790db232f6 ]

Fuzzing found that an invalid tracepoint name would create a memory
leak with an address sanitizer build:
```
$ perf stat -e '*:o/' true
event syntax error: '*:o/'
                       \___ parser error
Run 'perf list' for a list of valid events

 Usage: perf stat [<options>] [<command>]

    -e, --event <event>   event selector. use 'perf list' to list available events

=================================================================
==59380==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 4 byte(s) in 2 object(s) allocated from:
    #0 0x7f38ac07077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439
    #1 0x55f2f41be73b in str util/parse-events.l:49
    #2 0x55f2f41d08e8 in parse_events_lex util/parse-events.l:338
    #3 0x55f2f41dc3b1 in parse_events_parse util/parse-events-bison.c:1464
    #4 0x55f2f410b8b3 in parse_events__scanner util/parse-events.c:1822
    #5 0x55f2f410d1b9 in __parse_events util/parse-events.c:2094
    #6 0x55f2f410e57f in parse_events_option util/parse-events.c:2279
    #7 0x55f2f4427b56 in get_value tools/lib/subcmd/parse-options.c:251
    #8 0x55f2f4428d98 in parse_short_opt tools/lib/subcmd/parse-options.c:351
    #9 0x55f2f4429d80 in parse_options_step tools/lib/subcmd/parse-options.c:539
    #10 0x55f2f442acb9 in parse_options_subcommand tools/lib/subcmd/parse-options.c:654
    #11 0x55f2f3ec99fc in cmd_stat tools/perf/builtin-stat.c:2501
    #12 0x55f2f4093289 in run_builtin tools/perf/perf.c:322
    #13 0x55f2f40937f5 in handle_internal_command tools/perf/perf.c:375
    #14 0x55f2f4093bbd in run_argv tools/perf/perf.c:419
    #15 0x55f2f409412b in main tools/perf/perf.c:535

SUMMARY: AddressSanitizer: 4 byte(s) leaked in 2 allocation(s).
```
Fix by adding the missing destructor.

Fixes: 865582c3f48e ("perf tools: Adds the tracepoint name parsing support")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: He Kuang <hekuang@huawei.com>
Link: https://lore.kernel.org/r/20230914164028.363220-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:25 +01:00
Yang Jihong
dfc4cb3462 perf kwork: Set ordered_events to true in 'struct perf_tool'
[ Upstream commit 0c526579a4b2b6ecd540472f2e34c2850cf70f76 ]

'perf kwork' processes data based on timestamps and needs to sort events.

Fixes: f98919ec4fccdacf ("perf kwork: Implement 'report' subcommand")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20230812084917.169338-4-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:25 +01:00
Yang Jihong
85ae4383ae perf kwork: Add the supported subcommands to the document
[ Upstream commit 76e0d8c821bbd952730799cc7af841f9de67b7f7 ]

Add missing report, latency and timehist subcommands to the document.

Fixes: f98919ec4fccdacf ("perf kwork: Implement 'report' subcommand")
Fixes: ad3d9f7a929ab2df ("perf kwork: Implement perf kwork latency")
Fixes: bcc8b3e88d6fa1a3 ("perf kwork: Implement perf kwork timehist")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230812084917.169338-3-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:25 +01:00
Yang Jihong
ccd75740cb perf kwork: Fix incorrect and missing free atom in work_push_atom()
[ Upstream commit d39710088d82ef100b33cdf4a9de3546fb0bb5df ]

1. Atoms are managed in page mode and should be released using atom_free()
   instead of free().
2. When the event does not match, the atom needs to free.

Fixes: f98919ec4fccdacf ("perf kwork: Implement 'report' subcommand")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20230812084917.169338-2-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:24 +01:00
Ian Rogers
4b9d563d97 perf stat: Fix aggr mode initialization
[ Upstream commit a84fbf205609313594b86065c67e823f09ebe29b ]

Generating metrics llc_code_read_mpi_demand_plus_prefetch,
llc_data_read_mpi_demand_plus_prefetch,
llc_miss_local_memory_bandwidth_read,
llc_miss_local_memory_bandwidth_write,
nllc_miss_remote_memory_bandwidth_read, memory_bandwidth_read,
memory_bandwidth_write, uncore_frequency, upi_data_transmit_bw,
C2_Pkg_Residency, C3_Core_Residency, C3_Pkg_Residency,
C6_Core_Residency, C6_Pkg_Residency, C7_Core_Residency,
C7_Pkg_Residency, UNCORE_FREQ and tma_info_system_socket_clks would
trigger an address sanitizer heap-buffer-overflows on a SkylakeX.

```
==2567752==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5020003ed098 at pc 0x5621a816654e bp 0x7fffb55d4da0 sp 0x7fffb55d4d98
READ of size 4 at 0x5020003eee78 thread T0
    #0 0x558265d6654d in aggr_cpu_id__is_empty tools/perf/util/cpumap.c:694:12
    #1 0x558265c914da in perf_stat__get_aggr tools/perf/builtin-stat.c:1490:6
    #2 0x558265c914da in perf_stat__get_global_cached tools/perf/builtin-stat.c:1530:9
    #3 0x558265e53290 in should_skip_zero_counter tools/perf/util/stat-display.c:947:31
    #4 0x558265e53290 in print_counter_aggrdata tools/perf/util/stat-display.c:985:18
    #5 0x558265e51931 in print_counter tools/perf/util/stat-display.c:1110:3
    #6 0x558265e51931 in evlist__print_counters tools/perf/util/stat-display.c:1571:5
    #7 0x558265c8ec87 in print_counters tools/perf/builtin-stat.c:981:2
    #8 0x558265c8cc71 in cmd_stat tools/perf/builtin-stat.c:2837:3
    #9 0x558265bb9bd4 in run_builtin tools/perf/perf.c:323:11
    #10 0x558265bb98eb in handle_internal_command tools/perf/perf.c:377:8
    #11 0x558265bb9389 in run_argv tools/perf/perf.c:421:2
    #12 0x558265bb9389 in main tools/perf/perf.c:537:3
```

The issue was the use of testing a cpumap with NULL rather than using
empty, as a map containing the dummy value isn't NULL and the -1
results in an empty aggr map being allocated which legitimately
overflows when any member is accessed.

Fixes: 8a96f454f5668572 ("perf stat: Avoid SEGV if core.cpus isn't set")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230906003912.3317462-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20 11:59:24 +01:00
Ian Rogers
67a21d9b93 perf evlist: Avoid frequency mode for the dummy event
[ Upstream commit f9cdeb58a9cf46c09b56f5f661ea8da24b6458c3 ]

Dummy events are created with an attribute where the period and freq
are zero. evsel__config will then see the uninitialized values and
initialize them in evsel__default_freq_period. As fequency mode is
used by default the dummy event would be set to use frequency
mode. However, this has no effect on the dummy event but does cause
unnecessary timers/interrupts. Avoid this overhead by setting the
period to 1 for dummy events.

evlist__add_aux_dummy calls evlist__add_dummy then sets freq=0 and
period=1. This isn't necessary after this change and so the setting is
removed.

From Stephane:

The dummy event is not counting anything. It is used to collect mmap
records and avoid a race condition during the synthesize mmap phase of
perf record. As such, it should not cause any overhead during active
profiling. Yet, it did. Because of a bug the dummy event was
programmed as a sampling event in frequency mode. Events in that mode
incur more kernel overheads because on timer tick, the kernel has to
look at the number of samples for each event and potentially adjust
the sampling period to achieve the desired frequency. The dummy event
was therefore adding a frequency event to task and ctx contexts we may
otherwise not have any, e.g.,

  perf record -a -e cpu/event=0x3c,period=10000000/.

On each timer tick the perf_adjust_freq_unthr_context() is invoked and
if ctx->nr_freq is non-zero, then the kernel will loop over ALL the
events of the context looking for frequency mode ones. In doing, so it
locks the context, and enable/disable the PMU of each hw event. If all
the events of the context are in period mode, the kernel will have to
traverse the list for nothing incurring overhead. The overhead is
multiplied by a very large factor when this happens in a guest kernel.
There is no need for the dummy event to be in frequency mode, it does
not count anything and therefore should not cause extra overhead for
no reason.

Fixes: 5bae0250237f ("perf evlist: Introduce perf_evlist__new_dummy constructor")
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20230916035640.1074422-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-08 11:56:20 +01:00
Adrian Hunter
f38f547314 perf dlfilter: Add a test for object_code()
Extend the "dlfilter C API" test to test
perf_dlfilter_fns.object_code().

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20230928091033.33998-1-adrian.hunter@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-30 00:09:06 -07:00
Adrian Hunter
7a48b58eb5 perf dlfilter: Fix use of addr_location__exit() in dlfilter__object_code()
Stop calling addr_location__exit() when addr_location__init() was not
called.

Fixes: 0dd5041c9a0e ("perf addr_location: Add init/exit/copy functions")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230928071605.17624-1-adrian.hunter@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-29 23:55:05 -07:00
Wyes Karny
48a3adcf47 perf pmu: Fix perf stat output with correct scale and unit
The perf_pmu__parse_* functions for the sysfs files of pmu event’s
scale, unit, per-pkg and snapshot were updated in commit 7b723dbb96e8
("perf pmu: Be lazy about loading event info files from sysfs").
However, the paths for these sysfs files were incorrect. This resulted
in perf stat reporting values with wrong scaling and missing units. This
is fixed by correcting the paths for these sysfs files.

Before this fix:

 $sudo perf stat -e power/energy-pkg/ -- sleep 2

 Performance counter stats for 'system wide':

   351,217,188,864      power/energy-pkg/

          2.004127961 seconds time elapsed

After this fix:

 $sudo perf stat -e power/energy-pkg/ -- sleep 2

 Performance counter stats for 'system wide':

             80.58 Joules power/energy-pkg/

 	     2.004009749 seconds time elapsed

Fixes: 7b723dbb96e8 ("perf pmu: Be lazy about loading event info files from sysfs")
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: ravi.bangoria@amd.com
Cc: sandipan.das@amd.com
Cc: james.clark@arm.com
Cc: kan.liang@linux.intel.com
Link: https://lore.kernel.org/r/20230920122349.418673-1-wyes.karny@amd.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-26 21:41:50 -07:00
Thomas Richter
e47749f179 perf jevent: fix core dump on software events on s390
Running commands such as
 # ./perf stat -e cs -- true
 Segmentation fault (core dumped)
 # ./perf stat -e cpu-clock-- true
 Segmentation fault (core dumped)
 #

dump core. This should not happen as these events are defined
even when no hardware PMU is available.
Debugging this reveals this call chain:

  perf_pmus__find_by_type(type=1)
  +--> pmu_read_sysfs(core_only=false)
       +--> perf_pmu__find2(dirfd=3, name=0x152a113 "software")
            +--> perf_pmu__lookup(pmus=0x14f0568 <other_pmus>, dirfd=3,
                                  lookup_name=0x152a113 "software")
                 +--> perf_pmu__find_events_table (pmu=0x1532130)

Now the pmu is "software" and it tries to find a proper table
generated by the pmu-event generation process for s390:

 # cd pmu-events/
 # ./jevents.py  s390 all /root/linux/tools/perf/pmu-events/arch |\
        grep -E '^const struct pmu_table_entry'
 const struct pmu_table_entry pmu_events__cf_z10[] = {
 const struct pmu_table_entry pmu_events__cf_z13[] = {
 const struct pmu_table_entry pmu_metrics__cf_z13[] = {
 const struct pmu_table_entry pmu_events__cf_z14[] = {
 const struct pmu_table_entry pmu_metrics__cf_z14[] = {
 const struct pmu_table_entry pmu_events__cf_z15[] = {
 const struct pmu_table_entry pmu_metrics__cf_z15[] = {
 const struct pmu_table_entry pmu_events__cf_z16[] = {
 const struct pmu_table_entry pmu_metrics__cf_z16[] = {
 const struct pmu_table_entry pmu_events__cf_z196[] = {
 const struct pmu_table_entry pmu_events__cf_zec12[] = {
 const struct pmu_table_entry pmu_metrics__cf_zec12[] = {
 const struct pmu_table_entry pmu_events__test_soc_cpu[] = {
 const struct pmu_table_entry pmu_metrics__test_soc_cpu[] = {
 const struct pmu_table_entry pmu_events__test_soc_sys[] = {
 #

However event "software" is not listed, as can be seen in the
generated const struct pmu_events_map pmu_events_map[].
So in function perf_pmu__find_events_table(), the variable
table is initialized to NULL, but never set to a proper
value. The function scans all generated &pmu_events_map[]
tables, but no table matches, because the tables are
s390 CPU Measurement unit specific:

  i = 0;
  for (;;) {
      const struct pmu_events_map *map = &pmu_events_map[i++];
      if (!map->arch)
           break;

      --> the maps are there because the build generated them

           if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
                table = &map->event_table;
                break;
           }
      --> Since no matching CPU string the table var remains 0x0
      }
      free(cpuid);
      if (!pmu)
           return table;

      --> The pmu is "software" so it exists and no return

      --> and here perf dies because table is 0x0
      for (i = 0; i < table->num_pmus; i++) {
	      ...
      }
      return NULL;

Fix this and do not access the table variable. Instead return 0x0
which is the same return code when the for-loop was not successful.

Output after:
 # ./perf stat -e cs -- true

 Performance counter stats for 'true':

                 0      cs

       0.000853105 seconds time elapsed

       0.000061000 seconds user
       0.000827000 seconds sys

 # ./perf stat -e cpu-clock -- true

 Performance counter stats for 'true':

              0.25 msec cpu-clock #    0.341 CPUs utilized

       0.000728383 seconds time elapsed

       0.000055000 seconds user
       0.000706000 seconds sys

 # ./perf stat -e cycles -- true

 Performance counter stats for 'true':

   <not supported>      cycles

       0.000767298 seconds time elapsed

       0.000055000 seconds user
       0.000739000 seconds sys

 #

Fixes: 7c52f10c0d4d8 ("perf pmu: Cache JSON events table")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: dengler@linux.ibm.com
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: sumanthk@linux.ibm.com
Cc: svens@linux.ibm.com
Link: https://lore.kernel.org/r/20230913125157.2790375-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-17 15:51:57 -07:00
Ian Rogers
eaaebb01a7 perf pmu: Ensure all alias variables are initialized
Fix an error detected by memory sanitizer:
```
==4033==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x55fb0fbedfc7 in read_alias_info tools/perf/util/pmu.c:457:6
    #1 0x55fb0fbea339 in check_info_data tools/perf/util/pmu.c:1434:2
    #2 0x55fb0fbea339 in perf_pmu__check_alias tools/perf/util/pmu.c:1504:9
    #3 0x55fb0fbdca85 in parse_events_add_pmu tools/perf/util/parse-events.c:1429:32
    #4 0x55fb0f965230 in parse_events_parse tools/perf/util/parse-events.y:299:6
    #5 0x55fb0fbdf6b2 in parse_events__scanner tools/perf/util/parse-events.c:1822:8
    #6 0x55fb0fbdf8c1 in __parse_events tools/perf/util/parse-events.c:2094:8
    #7 0x55fb0fa8ffa9 in parse_events tools/perf/util/parse-events.h:41:9
    #8 0x55fb0fa8ffa9 in test_event tools/perf/tests/parse-events.c:2393:8
    #9 0x55fb0fa8f458 in test__pmu_events tools/perf/tests/parse-events.c:2551:15
    #10 0x55fb0fa6d93f in run_test tools/perf/tests/builtin-test.c:242:9
    #11 0x55fb0fa6d93f in test_and_print tools/perf/tests/builtin-test.c:271:8
    #12 0x55fb0fa6d082 in __cmd_test tools/perf/tests/builtin-test.c:442:5
    #13 0x55fb0fa6d082 in cmd_test tools/perf/tests/builtin-test.c:564:9
    #14 0x55fb0f942720 in run_builtin tools/perf/perf.c:322:11
    #15 0x55fb0f942486 in handle_internal_command tools/perf/perf.c:375:8
    #16 0x55fb0f941dab in run_argv tools/perf/perf.c:419:2
    #17 0x55fb0f941dab in main tools/perf/perf.c:535:3
```

Fixes: 7b723dbb96e8 ("perf pmu: Be lazy about loading event info files from sysfs")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20230914022425.1489035-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-17 15:51:42 -07:00
Ian Rogers
d1bac78e26 perf jevents metric: Fix type of strcmp_cpuid_str
The parser wraps all strings as Events, so the input is an
Event. Using a string would be bad as functions like Simplify are
called on the arguments, which wouldn't be present on a string.

Fixes: 9d5da30e4ae9 ("perf jevents: Add a new expression builtin strcmp_cpuid_str()")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20230914022204.1488383-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-17 15:51:35 -07:00
Ian Rogers
33b725ce7b perf trace: Avoid compile error wrt redefining bool
Make part of an existing TODO conditional to avoid the following build
error:
```
tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c:26:14: error: cannot combine with previous 'char' declaration specifier
   26 | typedef char bool;
      |              ^
include/stdbool.h:20:14: note: expanded from macro 'bool'
   20 | #define bool _Bool
      |              ^
tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c:26:1: error: typedef requires a name [-Werror,-Wmissing-declarations]
   26 | typedef char bool;
      | ^~~~~~~~~~~~~~~~~
2 errors generated.
```

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230913184957.230076-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-17 15:51:32 -07:00
Ian Rogers
4a73fca226 perf bpf-prologue: Remove unused file
Commit 3d6dfae88917 ("perf parse-events: Remove BPF event support")
removed building bpf-prologue.c but failed to remove the actual file.

Fixes: 3d6dfae88917 ("perf parse-events: Remove BPF event support")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230913184534.227961-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-09-17 15:51:28 -07:00
Arnaldo Carvalho de Melo
678ddf730a perf bench sched-seccomp-notify: Use the tools copy of seccomp.h UAPI
To keep perf building in systems where types and defines used in this
new benchmark are not available, such as:

  12    13.46 centos:stream                 : FAIL gcc version 8.5.0 20210514 (Red Hat 8.5.0-20) (GCC)
    bench/sched-seccomp-notify.c: In function 'user_notif_syscall':
    bench/sched-seccomp-notify.c:55:27: error: 'SECCOMP_RET_USER_NOTIF' undeclared (first use in this function); did you mean 'SECCOMP_RET_ERRNO'?
       BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_USER_NOTIF),
                               ^~~~~~~~~~~~~~~~~~~~~~
    /git/perf-6.6.0-rc1/tools/include/uapi/linux/filter.h:49:59: note: in definition of macro 'BPF_STMT'
     #define BPF_STMT(code, k) { (unsigned short)(code), 0, 0, k }
                                                               ^
    bench/sched-seccomp-notify.c:55:27: note: each undeclared identifier is reported only once for each function it appears in
       BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_USER_NOTIF),
                               ^~~~~~~~~~~~~~~~~~~~~~
    /git/perf-6.6.0-rc1/tools/include/uapi/linux/filter.h:49:59: note: in definition of macro 'BPF_STMT'
     #define BPF_STMT(code, k) { (unsigned short)(code), 0, 0, k }
                                                               ^
    bench/sched-seccomp-notify.c:55:3: error: missing initializer for field 'k' of 'struct sock_filter' [-Werror=missing-field-initializers]
       BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_USER_NOTIF),
       ^~~~~~~~
    In file included from bench/sched-seccomp-notify.c:5:
    /git/perf-6.6.0-rc1/tools/include/uapi/linux/filter.h:28:8: note: 'k' declared here
      __u32 k;      /* Generic multiuse field */
            ^
    bench/sched-seccomp-notify.c: In function 'user_notification_sync_loop':
    bench/sched-seccomp-notify.c:70:28: error: storage size of 'resp' isn't known
      struct seccomp_notif_resp resp;
                                ^~~~
    bench/sched-seccomp-notify.c:71:23: error: storage size of 'req' isn't known
      struct seccomp_notif req;
                           ^~~
    bench/sched-seccomp-notify.c:76:23: error: 'SECCOMP_IOCTL_NOTIF_RECV' undeclared (first use in this function); did you mean 'SECCOMP_MODE_STRICT'?
       if (ioctl(listener, SECCOMP_IOCTL_NOTIF_RECV, &req))
                           ^~~~~~~~~~~~~~~~~~~~~~~~
                           SECCOMP_MODE_STRICT
    bench/sched-seccomp-notify.c:86:23: error: 'SECCOMP_IOCTL_NOTIF_SEND' undeclared (first use in this function); did you mean 'SECCOMP_RET_ACTION'?
       if (ioctl(listener, SECCOMP_IOCTL_NOTIF_SEND, &resp))
                           ^~~~~~~~~~~~~~~~~~~~~~~~
                           SECCOMP_RET_ACTION
    bench/sched-seccomp-notify.c:71:23: error: unused variable 'req' [-Werror=unused-variable]
      struct seccomp_notif req;
                           ^~~
    bench/sched-seccomp-notify.c:70:28: error: unused variable 'resp' [-Werror=unused-variable]
      struct seccomp_notif_resp resp;
                                ^~~~

  14    11.31 debian:10                     : FAIL gcc version 8.3.0 (Debian 8.3.0-6)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Kook <keescook@chromium.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/ZQGhjaojgOGtSNk6@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-13 08:49:00 -03:00
Arnaldo Carvalho de Melo
417ecb614f tools headers UAPI: Copy seccomp.h to be able to build 'perf bench' in older systems
The new 'perf bench' for sched-seccomp-notify uses defines and types not
available in older systems where we want to have perf available, so grab
a copy of this UAPI from the kernel sources to allow that.

This will be checked in the future for drift from the original when we
build the perf tool, that will warn when that happens like:

  make: Entering directory '/var/home/acme/git/perf-tools/tools/perf'
    BUILD:   Doing 'make -j32' parallel build
  Warning: Kernel ABI header differences:

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Kook <keescook@chromium.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/ZQGhMXtwX7RvV3ya@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-13 08:48:48 -03:00
Arnaldo Carvalho de Melo
f7875966dc tools headers UAPI: Sync files changed by new fchmodat2 and map_shadow_stack syscalls with the kernel sources
To pick the changes in these csets:

  c35559f94ebc3e3b ("x86/shstk: Introduce map_shadow_stack syscall")
  78252deb023cf087 ("arch: Register fchmodat2, usually as syscall 452")

That add support for this new syscall in tools such as 'perf trace'.

For instance, this is now possible:

  # perf trace -v -e fchmodat*,map_shadow_stack --max-events=4
  Using CPUID AuthenticAMD-25-21-0
  Reusing "openat" BPF sys_enter augmenter for "fchmodat"
  event qualifier tracepoint filter: (common_pid != 3499340 && common_pid != 11259) && (id == 268 || id == 452 || id == 453)
  ^C#

  And it'll work as with other syscalls, for instance openat:

  # perf trace -e openat* --max-events=4
     0.000 ( 0.015 ms): systemd-oomd/1150 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC)    = 11
     0.068 ( 0.019 ms): systemd-oomd/1150 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1001.slice/user@1001.service/memory.pressure", flags: RDONLY|CLOEXEC) = 11
     0.119 ( 0.008 ms): systemd-oomd/1150 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1001.slice/user@1001.service/memory.current", flags: RDONLY|CLOEXEC) = 11
     0.138 ( 0.006 ms): systemd-oomd/1150 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1001.slice/user@1001.service/memory.min", flags: RDONLY|CLOEXEC) = 11
  #

That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.

  $ find tools/perf/arch/ -name "syscall*tbl" | xargs grep -E fchmodat\|sys_map_shadow_stack
  tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl:258	n64	fchmodat			sys_fchmodat
  tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl:452	n64	fchmodat2			sys_fchmodat2
  tools/perf/arch/powerpc/entry/syscalls/syscall.tbl:297	common	fchmodat			sys_fchmodat
  tools/perf/arch/powerpc/entry/syscalls/syscall.tbl:452	common	fchmodat2			sys_fchmodat2
  tools/perf/arch/s390/entry/syscalls/syscall.tbl:299  common	fchmodat		sys_fchmodat			sys_fchmodat
  tools/perf/arch/s390/entry/syscalls/syscall.tbl:452  common	fchmodat2		sys_fchmodat2			sys_fchmodat2
  tools/perf/arch/x86/entry/syscalls/syscall_64.tbl:268	common	fchmodat		sys_fchmodat
  tools/perf/arch/x86/entry/syscalls/syscall_64.tbl:452	common	fchmodat2		sys_fchmodat2
  tools/perf/arch/x86/entry/syscalls/syscall_64.tbl:453	64	map_shadow_stack	sys_map_shadow_stack
  $

  $ grep -Ew map_shadow_stack\|fchmodat2 /tmp/build/perf-tools/arch/x86/include/generated/asm/syscalls_64.c
	[452] = "fchmodat2",
	[453] = "map_shadow_stack",
  $

This addresses these perf build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
    diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
    diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Link: https://lore.kernel.org/lkml/ZP8bE7aXDBu%2Fdrak@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-13 08:24:51 -03:00
Arnaldo Carvalho de Melo
a6e414a4cb perf tools: Update copy of libbpf's hashmap.c
To pick the changes in:

  a3e7e6b17946f48b ("libbpf: Remove HASHMAP_INIT static initialization helper")

That don't entail any changes in tools/perf.

This addresses this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/perf/util/hashmap.h tools/lib/bpf/hashmap.h

Not a kernel ABI, its just that this uses the mechanism in place for
checking kernel ABI files drift.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-11 10:31:02 -03:00
Linus Torvalds
535a265d7f perf tools changes for v6.6:
perf tools maintainership:
 
 - Add git information for perf-tools and perf-tools-next trees/branches to the
   MAINTAINERS file. That is where development now takes place and myself and
   Namhyung Kim have write access, more people to come as we emulate other
   maintainer groups.
 
 perf record:
 
 - Record kernel data maps when 'perf record --data' is used, so that global variables can
   be resolved and used in tools that do data profiling.
 
 perf trace:
 
 - Remove the old, experimental support for BPF events in which a .c file was passed as
   an event: "perf trace -e hello.c" to then get compiled and loaded.
 
   The only known usage for that, that shipped with the kernel as an example for such events,
   augmented the raw_syscalls tracepoints and was converted to a libbpf skeleton, reusing all
   the user space components and the BPF code connected to the syscalls.
 
   In the end just the way to glue the BPF part and the user space type beautifiers changed,
   now being performed by libbpf skeletons.
 
   The next step is to use BTF to do pretty printing of all syscall types, as discussed with
   Alan Maguire and others.
 
   Now, on a perf built with BUILD_BPF_SKEL=1 we get most if not all path/filenames/strings,
   some of the networking data structures, perf_event_attr, etc, i.e. systemwide tracing of
   nanosleep calls and perf_event_open syscalls while 'perf stat' runs 'sleep' for 5 seconds:
 
   # perf trace -a -e *nanosleep,perf* perf stat -e cycles,instructions sleep 5
      0.000 (   9.034 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
      9.039 (   0.006 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf-exec), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
          ? (           ): gpm/991  ... [continued]: clock_nanosleep())               = 0
     10.133 (           ): sleep/327642 clock_nanosleep(rqtp: { .tv_sec: 5, .tv_nsec: 0 }, rmtp: 0x7ffd36f83ed0) ...
          ? (           ): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
     30.276 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
    223.215 (1000.430 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
     30.276 (2000.394 ms): gpm/991  ... [continued]: clock_nanosleep())               = 0
   1230.814 (           ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
   1230.814 (1000.404 ms): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
   2030.886 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
   2237.709 (1000.153 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
          ? (           ): crond/1172  ... [continued]: clock_nanosleep())            = 0
   3242.699 (           ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
   2030.886 (2000.385 ms): gpm/991  ... [continued]: clock_nanosleep())               = 0
   3728.078 (           ): crond/1172 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffe0971dcf0) ...
   3242.699 (1000.158 ms): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
   4031.409 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
     10.133 (5000.375 ms): sleep/327642  ... [continued]: clock_nanosleep())          = 0
 
  Performance counter stats for 'sleep 5':
 
          2,617,347      cycles
          1,855,997      instructions                     #    0.71  insn per cycle
 
        5.002282128 seconds time elapsed
 
        0.000855000 seconds user
        0.000852000 seconds sys
   #
 
 perf annotate:
 
 - Building with binutils' libopcode now is opt-in (BUILD_NONDISTRO=1) for
   licensing reasons, and we missed a build test on tools/perf/tests makefile.
 
   Since we now default to NDEBUG=1, we ended up segfaulting when building with
   BUILD_NONDISTRO=1 because a needed initialization routine was being "error
   checked" via an assert.
 
   Fix it by explicitly checking the result and aborting instead if it fails.
 
   We better back propagate the error, but at least 'perf annotate' on samples
   collected for a BPF program is back working when perf is built with
   BUILD_NONDISTRO=1.
 
 perf report/top:
 
 - Add back TUI hierarchy mode header, that is seen when using 'perf report/top --hierarchy'.
 
 - Fix the number of entries for 'e' key in the TUI that was preventing navigation of
   lines when expanding an entry.
 
 perf report/script:
 
 - Support cross platform register handling, allowing a perf.data file collected
   on one architecture to have registers sampled correctly displayed when
   analysis tools such as 'perf report' and 'perf script' are used on a different
   architecture.
 
 - Fix handling of event attributes in pipe mode, i.e. when one uses:
 
 	perf record -o - | perf report -i -
 
   When no perf.data files are used.
 
 - Handle files generated via pipe mode with a version of perf and then read
   also via pipe mode with a different version of perf, where the event attr
   record may have changed, use the record size field to properly support this
   version mismatch.
 
 perf probe:
 
 - Accessing global variables from uprobes isn't supported, make the error
   message state that instead of stating that some minimal kernel version is
   needed to have that feature. This seems just a tool limitation, the kernel
   probably has all that is needed.
 
 perf tests:
 
 - Fix a reference count related leak in the dlfilter v0 API where the result
   of a thread__find_symbol_fb() is not matched with an addr_location__exit()
   to drop the reference counts of the resolved components (machine, thread, map,
   symbol, etc). Add a dlfilter test to make sure that doesn't regresses.
 
 - Lots of fixes for the 'perf test' written in shell script related to problems
   found with the shellcheck utility.
 
 - Fixes for 'perf test' shell scripts testing features enabled when perf is
   built with BUILD_BPF_SKEL=1, such as 'perf stat' bpf counters.
 
 - Add perf record sample filtering test, things like the following example, that gets
   implemented as a BPF filter attached to the event:
 
    # perf record -e task-clock -c 10000 --filter 'ip < 0xffffffff00000000'
 
 - Improve the way the task_analyzer test checks if libtraceevent is linked,
   using 'perf version --build-options' instead of the more expensinve
   'perf record -e "sched:sched_switch"'.
 
 - Add support for riscv in the mmap-basic test. (This went as well via the RiscV tree, same contents).
 
 libperf:
 
 - Implement riscv mmap support (This went as well via the RiscV tree, same contents).
 
 perf script:
 
 - New tool that converts perf.data files to the firefox profiler format so that one can use
   the visualizer at https://profiler.firefox.com/. Done by Anup Sharma as part of this year's
   Google Summer of Code.
 
   One can generate the output and upload it to the web interface but Anup also automated
   everything:
 
      perf script gecko -F 99 -a sleep 60
 
 - Support syscall name parsing on arm64.
 
 - Print "cgroup" field on the same line as "comm".
 
 perf bench:
 
 - Add new 'uprobe' benchmark to measure the overhead of uprobes with/without
   BPF programs attached to it.
 
 - breakpoints are not available on power9, skip that test.
 
 perf stat:
 
 - Add #num_cpus_online literal to be used in 'perf stat' metrics, and add this extra
   'perf test' check that exemplifies its purpose:
 
 	TEST_ASSERT_VAL("#num_cpus_online",
                        expr__parse(&num_cpus_online, ctx, "#num_cpus_online") == 0);
 	TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
 	TEST_ASSERT_VAL("#num_cpus >= #num_cpus_online", num_cpus >= num_cpus_online);
 
 Miscellaneous:
 
 - Improve tool startup time by lazily reading PMU, JSON, sysfs data.
 
 - Improve error reporting in the parsing of events, passing YYLTYPE to error routines,
   so that the output can show were the parsing error was found.
 
 - Add 'perf test' entries to check the parsing of events improvements.
 
 - Fix various leak for things detected by -fsanitize=address, mostly things that would
   be freed at tool exit, including:
 
   - Free evsel->filter on the destructor.
 
   - Allow tools to register a thread->priv destructor and use it in 'perf trace'.
 
   - Free evsel->priv in 'perf trace'.
 
   - Free string returned by synthesize_perf_probe_point() when the caller fails
     to do all it needs.
 
 - Adjust various compiler options to not consider errors some warnings when
   building with broken headers found in things like python, flex, bison, as we
   otherwise build with -Werror. Some for gcc, some for clang, some for some
   specific version of those, some for some specific version of flex or bison, or
   some specific combination of these components, bah.
 
 - Allow customization of clang options for BPF target, this helps building on
   gentoo where there are other oddities where BPF targets gets passed some compiler
   options intended for the native build, so building with WERROR=0 helps while
   these oddities are fixed.
 
 - Dont pass ERR_PTR() values to perf_session__delete() in 'perf top' and 'perf lock',
   fixing some segfaults when handling some odd failures.
 
 - Add LTO build option.
 
 - Fix format of unordered lists in the perf docs (tools/perf/Documentation).
 
 - Overhaul the bison files, using constructs such as YYNOMEM.
 
 - Remove unused tokens from the bison .y files.
 
 - Add more comments to various structs.
 
 - A few LoongArch enablement patches.
 
 Vendor events (JSON):
 
 - Add JSON metrics for Yitian 710 DDR (aarch64). Things like:
 
 	EventName, BriefDescription
 	visible_window_limit_reached_rd, "At least one entry in read queue reaches the visible window limit.",
 	visible_window_limit_reached_wr, "At least one entry in write queue reaches the visible window limit.",
 	op_is_dqsosc_mpc	       , "A DQS Oscillator MPC command to DRAM.",
 	op_is_dqsosc_mrr	       , "A DQS Oscillator MRR command to DRAM.",
 	op_is_tcr_mrr		       , "A Temperature Compensated Refresh(TCR) MRR command to DRAM.",
 
 - Add AmpereOne metrics (aarch64).
 
 - Update N2 and V2 metrics (aarch64) and events using Arm telemetry repo.
 
 - Update scale units and descriptions of common topdown metrics on aarch64. Things like:
 
   - "MetricExpr": "stall_slot_frontend / (#slots * cpu_cycles)",
   - "BriefDescription": "Frontend bound L1 topdown metric",
   + "MetricExpr": "100 * (stall_slot_frontend / (#slots * cpu_cycles))",
   + "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the frontend of the processor.",
 
 - Update events for intel: meteorlake to 1.04, sapphirerapids to 1.15, Icelake+ metric constraints.
 
 - Update files for the power10 platform.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZPfJZgAKCRCyPKLppCJ+
 J1/eAP9lgtavD0V75wy1p5zyotkceOmPTkk1DYFVx2Euhxa/lAD/YW/JvuVSo0Gr
 HqJP52XaV0tF8gG+YxL+Lay/Ke0P5AQ=
 =d12c
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Arnaldo Carvalho de Melo:
 "perf tools maintainership:

   - Add git information for perf-tools and perf-tools-next trees and
     branches to the MAINTAINERS file. That is where development now
     takes place and myself and Namhyung Kim have write access, more
     people to come as we emulate other maintainer groups.

  perf record:

   - Record kernel data maps when 'perf record --data' is used, so that
     global variables can be resolved and used in tools that do data
     profiling.

  perf trace:

   - Remove the old, experimental support for BPF events in which a .c
     file was passed as an event: "perf trace -e hello.c" to then get
     compiled and loaded.

     The only known usage for that, that shipped with the kernel as an
     example for such events, augmented the raw_syscalls tracepoints and
     was converted to a libbpf skeleton, reusing all the user space
     components and the BPF code connected to the syscalls.

     In the end just the way to glue the BPF part and the user space
     type beautifiers changed, now being performed by libbpf skeletons.

     The next step is to use BTF to do pretty printing of all syscall
     types, as discussed with Alan Maguire and others.

     Now, on a perf built with BUILD_BPF_SKEL=1 we get most if not all
     path/filenames/strings, some of the networking data structures,
     perf_event_attr, etc, i.e. systemwide tracing of nanosleep calls
     and perf_event_open syscalls while 'perf stat' runs 'sleep' for 5
     seconds:

      # perf trace -a -e *nanosleep,perf* perf stat -e cycles,instructions sleep 5
         0.000 (   9.034 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
         9.039 (   0.006 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf-exec), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
             ? (           ): gpm/991  ... [continued]: clock_nanosleep())               = 0
        10.133 (           ): sleep/327642 clock_nanosleep(rqtp: { .tv_sec: 5, .tv_nsec: 0 }, rmtp: 0x7ffd36f83ed0) ...
             ? (           ): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
        30.276 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
       223.215 (1000.430 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
        30.276 (2000.394 ms): gpm/991  ... [continued]: clock_nanosleep())               = 0
      1230.814 (           ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
      1230.814 (1000.404 ms): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
      2030.886 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
      2237.709 (1000.153 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
             ? (           ): crond/1172  ... [continued]: clock_nanosleep())            = 0
      3242.699 (           ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
      2030.886 (2000.385 ms): gpm/991  ... [continued]: clock_nanosleep())               = 0
      3728.078 (           ): crond/1172 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffe0971dcf0) ...
      3242.699 (1000.158 ms): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
      4031.409 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
        10.133 (5000.375 ms): sleep/327642  ... [continued]: clock_nanosleep())          = 0

      Performance counter stats for 'sleep 5':

             2,617,347      cycles
             1,855,997      instructions                     #    0.71  insn per cycle

           5.002282128 seconds time elapsed

           0.000855000 seconds user
           0.000852000 seconds sys

  perf annotate:

   - Building with binutils' libopcode now is opt-in (BUILD_NONDISTRO=1)
     for licensing reasons, and we missed a build test on
     tools/perf/tests makefile.

     Since we now default to NDEBUG=1, we ended up segfaulting when
     building with BUILD_NONDISTRO=1 because a needed initialization
     routine was being "error checked" via an assert.

     Fix it by explicitly checking the result and aborting instead if it
     fails.

     We better back propagate the error, but at least 'perf annotate' on
     samples collected for a BPF program is back working when perf is
     built with BUILD_NONDISTRO=1.

  perf report/top:

   - Add back TUI hierarchy mode header, that is seen when using 'perf
     report/top --hierarchy'.

   - Fix the number of entries for 'e' key in the TUI that was
     preventing navigation of lines when expanding an entry.

  perf report/script:

   - Support cross platform register handling, allowing a perf.data file
     collected on one architecture to have registers sampled correctly
     displayed when analysis tools such as 'perf report' and 'perf
     script' are used on a different architecture.

   - Fix handling of event attributes in pipe mode, i.e. when one uses:

  	perf record -o - | perf report -i -

     When no perf.data files are used.

   - Handle files generated via pipe mode with a version of perf and
     then read also via pipe mode with a different version of perf,
     where the event attr record may have changed, use the record size
     field to properly support this version mismatch.

  perf probe:

   - Accessing global variables from uprobes isn't supported, make the
     error message state that instead of stating that some minimal
     kernel version is needed to have that feature. This seems just a
     tool limitation, the kernel probably has all that is needed.

  perf tests:

   - Fix a reference count related leak in the dlfilter v0 API where the
     result of a thread__find_symbol_fb() is not matched with an
     addr_location__exit() to drop the reference counts of the resolved
     components (machine, thread, map, symbol, etc). Add a dlfilter test
     to make sure that doesn't regresses.

   - Lots of fixes for the 'perf test' written in shell script related
     to problems found with the shellcheck utility.

   - Fixes for 'perf test' shell scripts testing features enabled when
     perf is built with BUILD_BPF_SKEL=1, such as 'perf stat' bpf
     counters.

   - Add perf record sample filtering test, things like the following
     example, that gets implemented as a BPF filter attached to the
     event:

       # perf record -e task-clock -c 10000 --filter 'ip < 0xffffffff00000000'

   - Improve the way the task_analyzer test checks if libtraceevent is
     linked, using 'perf version --build-options' instead of the more
     expensinve 'perf record -e "sched:sched_switch"'.

   - Add support for riscv in the mmap-basic test. (This went as well
     via the RiscV tree, same contents).

  libperf:

   - Implement riscv mmap support (This went as well via the RiscV tree,
     same contents).

  perf script:

   - New tool that converts perf.data files to the firefox profiler
     format so that one can use the visualizer at
     https://profiler.firefox.com/. Done by Anup Sharma as part of this
     year's Google Summer of Code.

     One can generate the output and upload it to the web interface but
     Anup also automated everything:

       perf script gecko -F 99 -a sleep 60

   - Support syscall name parsing on arm64.

   - Print "cgroup" field on the same line as "comm".

  perf bench:

   - Add new 'uprobe' benchmark to measure the overhead of uprobes
     with/without BPF programs attached to it.

   - breakpoints are not available on power9, skip that test.

  perf stat:

   - Add #num_cpus_online literal to be used in 'perf stat' metrics, and
     add this extra 'perf test' check that exemplifies its purpose:

  	TEST_ASSERT_VAL("#num_cpus_online",
                         expr__parse(&num_cpus_online, ctx, "#num_cpus_online") == 0);
  	TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
  	TEST_ASSERT_VAL("#num_cpus >= #num_cpus_online", num_cpus >= num_cpus_online);

  Miscellaneous:

   - Improve tool startup time by lazily reading PMU, JSON, sysfs data.

   - Improve error reporting in the parsing of events, passing YYLTYPE
     to error routines, so that the output can show were the parsing
     error was found.

   - Add 'perf test' entries to check the parsing of events
     improvements.

   - Fix various leak for things detected by -fsanitize=address, mostly
     things that would be freed at tool exit, including:

       - Free evsel->filter on the destructor.

       - Allow tools to register a thread->priv destructor and use it in
         'perf trace'.

       - Free evsel->priv in 'perf trace'.

       - Free string returned by synthesize_perf_probe_point() when the
         caller fails to do all it needs.

   - Adjust various compiler options to not consider errors some
     warnings when building with broken headers found in things like
     python, flex, bison, as we otherwise build with -Werror. Some for
     gcc, some for clang, some for some specific version of those, some
     for some specific version of flex or bison, or some specific
     combination of these components, bah.

   - Allow customization of clang options for BPF target, this helps
     building on gentoo where there are other oddities where BPF targets
     gets passed some compiler options intended for the native build, so
     building with WERROR=0 helps while these oddities are fixed.

   - Dont pass ERR_PTR() values to perf_session__delete() in 'perf top'
     and 'perf lock', fixing some segfaults when handling some odd
     failures.

   - Add LTO build option.

   - Fix format of unordered lists in the perf docs
     (tools/perf/Documentation)

   - Overhaul the bison files, using constructs such as YYNOMEM.

   - Remove unused tokens from the bison .y files.

   - Add more comments to various structs.

   - A few LoongArch enablement patches.

  Vendor events (JSON):

   - Add JSON metrics for Yitian 710 DDR (aarch64). Things like:

  	EventName, BriefDescription
  	visible_window_limit_reached_rd, "At least one entry in read queue reaches the visible window limit.",
  	visible_window_limit_reached_wr, "At least one entry in write queue reaches the visible window limit.",
  	op_is_dqsosc_mpc	       , "A DQS Oscillator MPC command to DRAM.",
  	op_is_dqsosc_mrr	       , "A DQS Oscillator MRR command to DRAM.",
  	op_is_tcr_mrr		       , "A Temperature Compensated Refresh(TCR) MRR command to DRAM.",

   - Add AmpereOne metrics (aarch64).

   - Update N2 and V2 metrics (aarch64) and events using Arm telemetry
     repo.

   - Update scale units and descriptions of common topdown metrics on
     aarch64. Things like:
       - "MetricExpr": "stall_slot_frontend / (#slots * cpu_cycles)",
       - "BriefDescription": "Frontend bound L1 topdown metric",
       + "MetricExpr": "100 * (stall_slot_frontend / (#slots * cpu_cycles))",
       + "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the frontend of the processor.",

   - Update events for intel: meteorlake to 1.04, sapphirerapids to
     1.15, Icelake+ metric constraints.

   - Update files for the power10 platform"

* tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (217 commits)
  perf parse-events: Fix driver config term
  perf parse-events: Fixes relating to no_value terms
  perf parse-events: Fix propagation of term's no_value when cloning
  perf parse-events: Name the two term enums
  perf list: Don't print Unit for "default_core"
  perf vendor events intel: Fix modifier in tma_info_system_mem_parallel_reads for skylake
  perf dlfilter: Avoid leak in v0 API test use of resolve_address()
  perf metric: Add #num_cpus_online literal
  perf pmu: Remove str from perf_pmu_alias
  perf parse-events: Make common term list to strbuf helper
  perf parse-events: Minor help message improvements
  perf pmu: Avoid uninitialized use of alias->str
  perf jevents: Use "default_core" for events with no Unit
  perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test
  perf test shell stat_bpf_counters: Fix test on Intel
  perf test shell record_bpf_filter: Skip 6.2 kernel
  libperf: Get rid of attr.id field
  perf tools: Convert to perf_record_header_attr_id()
  libperf: Add perf_record_header_attr_id()
  perf tools: Handle old data in PERF_RECORD_ATTR
  ...
2023-09-09 20:06:17 -07:00
Ian Rogers
45fc4628c1 perf parse-events: Fix driver config term
Inadvertently deleted in commit 30f4ade33d649aa0 ("perf tools: Revert
enable indices setting syntax for BPF map").

Fixes: 30f4ade33d649aa0 ("perf tools: Revert enable indices setting syntax for BPF map")
Reported-by: James Clark <james.clark@arm.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230905033805.3094293-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-05 09:39:13 -03:00
Ian Rogers
9ea150a8d0 perf parse-events: Fixes relating to no_value terms
A term may have no value in which case it is assumed to have a value
of 1. It doesn't just apply to alias/event terms so change the
parse_events_term__to_strbuf assert.

Commit 99e7138eb7897aa0 ("perf tools: Fail on using multiple bits long
terms without value") made it so that no_value terms could only be for a
single bit. Prior to commit 64199ae4b8a3 ("perf parse-events: Fix
propagation of term's no_value when cloning") this missed a test case
where config1 had no_value.

Fixes: 64199ae4b8a36038 ("perf parse-events: Fix propagation of term's no_value when cloning")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230901233949.2930562-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-09-02 08:12:15 -03:00
Linus Torvalds
e0152e7481 RISC-V Patches for the 6.6 Merge Window, Part 1
* Support for the new "riscv,isa-extensions" and "riscv,isa-base" device
   tree interfaces for probing extensions.
 * Support for userspace access to the performance counters.
 * Support for more instructions in kprobes.
 * Crash kernels can be allocated above 4GiB.
 * Support for KCFI.
 * Support for ELFs in !MMU configurations.
 * ARCH_KMALLOC_MINALIGN has been reduced to 8.
 * mmap() defaults to sv48-sized addresses, with longer addresses hidden
   behind a hint (similar to Arm and Intel).
 * Also various fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmTx96kTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiVjRD/9DYVLlkQ/OEDJjPaEcYCP49xgIVUUU
 lhs3XbSs2VNHBaiG114f6Q0AaT/uNi+uqSej3CeTmEot2kZkBk/f2yu+UNIriPZ9
 GQiZsdyXhu921C+5VFtiI47KDWOVZ+Jpy3M1ll61IWt3yPSQHr1xOP0AOiyHHqe3
 cmqpNnzjajlfVDoXPc2mGGzUJt/7ar4thcwnMNi98raXR5Qh7SP6rrHjoQhE1oFk
 LMP3CHqEAcHE2tE4CxZVpc6HOQ5m0LpQIOK7ypufGMyoIYESm5dt/JOT4MlhTtDw
 6JzyVKtiM7lartUnUaW3ZoX4trQYT5gbXxWrJ2gCnUGy3VulikoXr1Rpz0qfdeOR
 XN8OLkVAqHfTGFI7oKk24f9Adw96R5NPZcdCay90h4J/kMfCiC7ZyUUI1XIa5iy1
 np5pZCkf8HNcdywML7qcFd5n2O0wchyFnRLFZo6kJP9Ls5cEi6kBx/1jSdTcNgx/
 fUKXyoEcriGoQiiwn29+4RZnU69gJV3zqQNLPpuwDQ5F/Q1zHTlrr+dqzezKkzcO
 dRTV2d2Q4A5vIDXPptzNNLlRQdrc8qxPJ1lxQVkPIU4/mtqczmZBwlyY2u9zwPyS
 sehJgJZnoAf+jm71NgQAKLck4MUBsMnMogOWunhXkVRCoZlbbkUWX4ECZYwPKsVk
 W7zVPmLvSM0l5g==
 =/tXb
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.6-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for the new "riscv,isa-extensions" and "riscv,isa-base"
   device tree interfaces for probing extensions

 - Support for userspace access to the performance counters

 - Support for more instructions in kprobes

 - Crash kernels can be allocated above 4GiB

 - Support for KCFI

 - Support for ELFs in !MMU configurations

 - ARCH_KMALLOC_MINALIGN has been reduced to 8

 - mmap() defaults to sv48-sized addresses, with longer addresses hidden
   behind a hint (similar to Arm and Intel)

 - Also various fixes and cleanups

* tag 'riscv-for-linus-6.6-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits)
  lib/Kconfig.debug: Restrict DEBUG_INFO_SPLIT for RISC-V
  riscv: support PREEMPT_DYNAMIC with static keys
  riscv: Move create_tmp_mapping() to init sections
  riscv: Mark KASAN tmp* page tables variables as static
  riscv: mm: use bitmap_zero() API
  riscv: enable DEBUG_FORCE_FUNCTION_ALIGN_64B
  riscv: remove redundant mv instructions
  RISC-V: mm: Document mmap changes
  RISC-V: mm: Update pgtable comment documentation
  RISC-V: mm: Add tests for RISC-V mm
  RISC-V: mm: Restrict address space for sv39,sv48,sv57
  riscv: enable DMA_BOUNCE_UNALIGNED_KMALLOC for !dma_coherent
  riscv: allow kmalloc() caches aligned to the smallest value
  riscv: support the elf-fdpic binfmt loader
  binfmt_elf_fdpic: support 64-bit systems
  riscv: Allow CONFIG_CFI_CLANG to be selected
  riscv/purgatory: Disable CFI
  riscv: Add CFI error handling
  riscv: Add ftrace_stub_graph
  riscv: Add types to indirectly called assembly functions
  ...
2023-09-01 08:09:48 -07:00
Ian Rogers
64199ae4b8 perf parse-events: Fix propagation of term's no_value when cloning
The no_value field in 'struct parse_events_term' indicates that the val
variable isn't used, the case for an event name.

Cloning wasn't propagating this, making cloned event name terms
appearing to have a constant assinged to them.

Working around the bug would check for a value of 1 assigned to value,
but then this meant a user value of 1 couldn't be differentiated causing
the value to be lost in debug printing and perf list.

The change fixes the cloning and updates the "val.num ==/!= 1" tests to
use no_value instead.

To better check the no_value is set appropriately parameter comments are
added for constant values.

This found that no_value wasn't set correctly in parse_events_multi_pmu_add,
which matters now that no_value is used to indicate an event name.

Fixes: 7a6e91644708d514 ("perf parse-events: Make common term list to strbuf helper")
Fixes: 99e7138eb7897aa0 ("perf tools: Fail on using multiple bits long terms without value")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230831071421.2201358-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-31 16:24:59 -03:00
Ian Rogers
58d3a4cea4 perf parse-events: Name the two term enums
Name the enums used by 'struct parse_events_term' to
parse_events__term_val_type and parse_events__term_type.

This allows greater compile time error checking.

Fix -Wswitch related issues by explicitly listing all enum values prior
to default.

Add config_term_name to safely look up a parse_events__term_type name,
bounds checking the array access first.

Add documentation to 'struct parse_events_terms' and reorder to save
space.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230831071421.2201358-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-31 16:24:55 -03:00
Ian Rogers
478c3f5dcd perf list: Don't print Unit for "default_core"
"default_core" was added as a way to demark JSON events whose PMU should
be whatever the default core PMU is, previously this had been assumed to
be "cpu" but that fails on s390 and ARM.

'perf list' displays the PMU in the event description to save storing it
in JSON, but was still comparing against "cpu" and not "default_core",
so update this.

Fixes: d2045f87154bf67a ("perf jevents: Use "default_core" for events with no Unit")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230831071421.2201358-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-31 16:24:47 -03:00
Ian Rogers
bdc6012991 perf vendor events intel: Fix modifier in tma_info_system_mem_parallel_reads for skylake
The metric is using the wrong format encoding. This fix is in the
converter script PR: https://github.com/intel/perfmon/pull/101

Committer testing:

Tested on a Lenovo t480s, before 'perf test 100' was failing with:

 # perf test 100
  100: perf all metrics test                                           : FAILED!

With 'perf test -vv 100' we can see:

<SNIP>
  Testing MemoryBW
  Not grouping metric tma_fb_full's events.
  Try disabling the NMI watchdog to comply NO_NMI_WATCHDOG metric constraint:
      echo 0 > /proc/sys/kernel/nmi_watchdog
      perf stat ...
      echo 1 > /proc/sys/kernel/nmi_watchdog
  event syntax error: '...DATA_READ/thresh=1,metric-id=UNC_ARB_TRK_OCCUPANCY.DATA_READ!3thresh!21!3/,UNC_ARB_TRK_OCCUPANCY.DATA_READ/metric-id=UNC_ARB_TRK_OCCUPANCY.DATA_READ/}:W,duration_time'
                                    \___ Bad event or PMU

  Unable to find PMU or event on a PMU of 'UNC_ARB_TRK_OCCUPANCY.DATA_READ'
<SNIP>

With the patch this problem is gone.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20230830175543.1911892-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-30 23:03:03 -03:00
Adrian Hunter
45210e1ada perf dlfilter: Avoid leak in v0 API test use of resolve_address()
The introduction of reference counting causes the v0 API
perf_dlfilter_fns.resolve_address() to leak.

v2 API introduced perf_dlfilter_fns.al_cleanup() to prevent that.

For the v0 API, avoid the leak by exiting the addr_location immediately,
since the documentation makes it clear that pointers obtained via
perf_dlfilter_fns are not necessarily valid (dereferenceable) after
'filter_event' and 'filter_event_early' return.

Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Closes: https://lore.kernel.org/oe-lkp/202308232146.94d82cb4-oliver.sang@intel.com
Link: http://lore.kernel.org/lkml/20230830090539.68206-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-30 23:03:03 -03:00
Ian Rogers
f0005f1732 perf metric: Add #num_cpus_online literal
Returns the number of CPUs online, unlike #num_cpus that returns the
number present.

Add a test of the property.

This will be used in future Intel metrics.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230830073026.1829912-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-30 23:03:03 -03:00
Ian Rogers
30f0b435bb perf pmu: Remove str from perf_pmu_alias
Currently the value is only used in perf list.

Compute the value just when needed to avoid unnecessary overhead.

Recycle the strbuf to avoid memory allocation overhead.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230830070753.1821629-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-30 23:03:03 -03:00
Ian Rogers
7a6e916447 perf parse-events: Make common term list to strbuf helper
A term list is turned into a string for debug output and for the str
value in the alias.

Add a helper to do this based on existing code, but then fix for
situations like events being identified.

Use strbuf to manage the dynamic memory allocation and remove the 256
byte limit.

Use in various places the string of the term list is required.

Before:

  $ sudo perf stat -vv -e inst_retired.any true
  Using CPUID GenuineIntel-6-8D-1
  intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
  Attempting to add event pmu 'cpu' with 'inst_retired.any,' that may result in non-fatal errors
  After aliases, add event pmu 'cpu' with 'event,period,' that may result in non-fatal errors
  inst_retired.any -> cpu/inst_retired.any/
  ...

After:

$ sudo perf stat -vv -e inst_retired.any true

  Using CPUID GenuineIntel-6-8D-1
  intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
  Attempt to add: cpu/inst_retired.any/
  ..after resolving event: cpu/event=0xc0,period=0x1e8483/
  inst_retired.any -> cpu/event=0xc0,period=0x1e8483/
  ...

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230830070753.1821629-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-30 23:03:03 -03:00
Ian Rogers
6beb6cfddf perf parse-events: Minor help message improvements
Be more specific and fix a typo.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230830070753.1821629-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-30 23:03:03 -03:00
Ian Rogers
196e355877 perf pmu: Avoid uninitialized use of alias->str
alias is allocated with malloc allowing uninitialized memory to be
accessed.

The initialization of str was moved late after it could have been
updated by a JSON event, however, this create a potential for an
uninitialized use.

Fix this by assigning str to NULL early.

Testing on ARM (Raspberry Pi) showed a memory leak in the same code so
add a zfree.

Fixes: f63a536f03a2f64f ("perf pmu: Merge JSON events with sysfs at load time")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20230830000545.1638964-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-30 23:03:02 -03:00
Linus Torvalds
1687d8aca5 * Rework apic callbacks, getting rid of unnecessary ones and
coalescing lots of silly duplicates.
  * Use static_calls() instead of indirect calls for apic->foo()
  * Tons of cleanups an crap removal along the way
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmTvfO8ACgkQaDWVMHDJ
 krAP2A//ccii/LuvtTnNEIMMR5w2rwTdHv91ancgFkC8pOeNk37Z8sSLq8tKuLFA
 vgjBIysVIqunuRcNCJ+eqwIIxYfU+UGCWHppzLwO+DY3Q7o9EoTL0BgytdAqxpQQ
 ntEVarqWq25QYXKFoAqbUTJ1UXa42/8HfiXAX/jvP+ACXfilkGPZre6ASxlXeOhm
 XbgPuNQPmXi2WYQH9GCQEsz2Nh80hKap8upK2WbQzzJ3lXsm+xA//4klab0HCYwl
 Uc302uVZozyXRMKbAlwmgasTFOLiV8KKriJ0oHoktBpWgkpdR9uv/RDeSaFR3DAl
 aFmecD4k/Hqezg4yVl+4YpEn2KjxiwARCm4PMW5AV7lpWBPBHAOOai65yJlAi9U6
 bP8pM0+aIx9xg7oWfsTnQ7RkIJ+GZ0w+KZ9LXFM59iu3eV1pAJE3UVyUehe/J1q9
 n8OcH0UeHRlAb8HckqVm1AC7IPvfHw4OAPtUq7z3NFDwbq6i651Tu7f+i2bj31cX
 77Ames+fx6WjxUjyFbJwaK44E7Qez3waztdBfn91qw+m0b+gnKE3ieDNpJTqmm5b
 mKulV7KJwwS6cdqY3+Kr+pIlN+uuGAv7wGzVLcaEAXucDsVn/YAMJHY2+v97xv+n
 J9N+yeaYtmSXVlDsJ6dndMrTQMmcasK1CVXKxs+VYq5Lgf+A68w=
 =eoKm
 -----END PGP SIGNATURE-----

Merge tag 'x86_apic_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 apic updates from Dave Hansen:
 "This includes a very thorough rework of the 'struct apic' handlers.
  Quite a variety of them popped up over the years, especially in the
  32-bit days when odd apics were much more in vogue.

  The end result speaks for itself, which is a removal of a ton of code
  and static calls to replace indirect calls.

  If there's any breakage here, it's likely to be around the 32-bit
  museum pieces that get light to no testing these days.

  Summary:

   - Rework apic callbacks, getting rid of unnecessary ones and
     coalescing lots of silly duplicates.

   - Use static_calls() instead of indirect calls for apic->foo()

   - Tons of cleanups an crap removal along the way"

* tag 'x86_apic_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits)
  x86/apic: Turn on static calls
  x86/apic: Provide static call infrastructure for APIC callbacks
  x86/apic: Wrap IPI calls into helper functions
  x86/apic: Mark all hotpath APIC callback wrappers __always_inline
  x86/xen/apic: Mark apic __ro_after_init
  x86/apic: Convert other overrides to apic_update_callback()
  x86/apic: Replace acpi_wake_cpu_handler_update() and apic_set_eoi_cb()
  x86/apic: Provide apic_update_callback()
  x86/xen/apic: Use standard apic driver mechanism for Xen PV
  x86/apic: Provide common init infrastructure
  x86/apic: Wrap apic->native_eoi() into a helper
  x86/apic: Nuke ack_APIC_irq()
  x86/apic: Remove pointless arguments from [native_]eoi_write()
  x86/apic/noop: Tidy up the code
  x86/apic: Remove pointless NULL initializations
  x86/apic: Sanitize APIC ID range validation
  x86/apic: Prepare x2APIC for using apic::max_apic_id
  x86/apic: Simplify X2APIC ID validation
  x86/apic: Add max_apic_id member
  x86/apic: Wrap APIC ID validation into an inline
  ...
2023-08-30 10:44:46 -07:00
Ian Rogers
d2045f8715 perf jevents: Use "default_core" for events with no Unit
The JSON Unit field encodes the name of the PMU to match the events
to. When no name is given it has meant the "cpu" core PMU except for
tests.

On ARM, Intel hybrid and s390 the core PMU is named differently which
means that using "cpu" for this case causes the events not to get
matched to the PMU.

Introduce a new "default_core" string for this case and in the
pmu__name_match force all core PMUs to match this name.

Fixes: 2e255b4f9f41f137 ("perf jevents: Group events by PMU")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230826062203.1058041-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:15 -03:00
Namhyung Kim
a84260e314 perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test
It has system-wide test and cpu-list test but the cpu-list test fails
sometimes.  It runs sleep command on CPU1 and measure both user.slice
and system.slice cgroups by default (on systemd-based systems).

But if the system was idle enough, sometime the system.slice gets no
count and it makes the test failing.  Maybe that's because it only looks
at the CPU1, let's add CPU0 to increase the chance it finds some tasks.

Fixes: 7901086014bbaa3a ("perf test: Add a new test for perf stat cgroup BPF counter")
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230825164152.165610-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:15 -03:00
Namhyung Kim
68ca249c96 perf test shell stat_bpf_counters: Fix test on Intel
As of now, bpf counters (bperf) don't support event groups.  But the
default perf stat includes topdown metrics if supported (on recent Intel
machines) which require groups.  That makes perf stat exiting.

  $ sudo perf stat --bpf-counter true
  bpf managed perf events do not yet support groups.

Actually the test explicitly uses cycles event only, but it missed to
pass the option when it checks the availability of the command.

Fixes: 2c0cb9f56020d2ea ("perf test: Add a shell test for 'perf stat --bpf-counters' new option")
Reviewed-by: Song Liu <song@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230825164152.165610-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-29 14:16:14 -03:00