linux/tools/perf
Arnaldo Carvalho de Melo b397f8468f perf evlist: Use unshare(CLONE_FS) in sb threads to let setns(CLONE_NEWNS) work
When we started using a thread to catch the PERF_RECORD_BPF_EVENT meta
data events to then ask the kernel for further info (BTF, etc) for BPF
programs shortly after they get loaded, we forgot to use
unshare(CLONE_FS) as was done in:

  868a832918 ("perf top: Support lookup of symbols in other mount namespaces.")

Do it so that we can enter the namespaces to read the build-ids at the
end of a 'perf record' session for the DSOs that had hits.
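
As an illustration of why the unshare(CLONE_FS) call matters, here is a
minimal sketch, not perf code. It relies on the setns(2) rule that a task
cannot join another mount namespace while it shares its filesystem
attributes (cwd, root, umask) with other tasks. The names sb_thread_demo()
and cwd_is_private_to_thread() are hypothetical; the sketch only shows that
after unshare(CLONE_FS) a thread's chdir() no longer affects its siblings,
and it does not call setns() itself, since that needs privileges.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <pthread.h>
#include <sched.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical thread body (not perf code): detach this thread's
 * filesystem attributes from the rest of the process, then change
 * directory. A thread in this state is also one that may later call
 * setns(fd, CLONE_NEWNS).
 */
static void *sb_thread_demo(void *arg)
{
	int *ret = arg;

	*ret = unshare(CLONE_FS);
	if (*ret == 0)
		*ret = chdir("/");	/* now private to this thread */
	return NULL;
}

/*
 * Returns 1 if the caller's cwd was untouched by the thread's chdir("/"),
 * 0 if it changed, and -1 if any step failed (for instance when a
 * sandbox blocks unshare()).
 */
int cwd_is_private_to_thread(void)
{
	char before[PATH_MAX], after[PATH_MAX];
	pthread_t th;
	int ret = -1;

	if (chdir("/tmp") || !getcwd(before, sizeof(before)))
		return -1;
	if (pthread_create(&th, NULL, sb_thread_demo, &ret))
		return -1;
	pthread_join(th, NULL);
	if (ret || !getcwd(after, sizeof(after)))
		return -1;
	return strcmp(before, after) == 0;
}
```

Without the unshare(CLONE_FS) call the thread's chdir("/") would move the
whole process, and a later setns(fd, CLONE_NEWNS) from any thread would be
expected to fail with EINVAL.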

Before:

Start 'stress-ng --cpus 8' inside a container and then, outside the
container, run:

  # perf record -a --namespaces sleep 5
  # perf buildid-list | grep stress-ng
  #

We would end up with a 'perf.data' file that had no entry in its
build-id table for the /usr/bin/stress-ng binary inside the container
that got tons of PERF_RECORD_SAMPLEs.

After:

  # perf buildid-list | grep stress-ng
  f2ed02c68341183a124b9b0f6e2e6c493c465b29 /usr/bin/stress-ng
  #

Then it's just a matter of making sure that the debuginfo package for
that binary is available in a place where 'perf report' looks for
build-id keyed ELF files, which, in my case, on an f30 notebook, was a
matter of installing the debuginfo file for the distro used in the
container, fedora 31:

  # rpm -ivh http://fedora.c3sl.ufpr.br/linux/development/31/Everything/x86_64/debug/tree/Packages/s/stress-ng-debuginfo-0.07.29-10.fc31.x86_64.rpm

Then, because perf currently looks for those debuginfo files (richer ELF
symtab) inside that namespace (look at the setns calls):

  openat(AT_FDCWD, "/proc/self/ns/mnt", O_RDONLY) = 137
  openat(AT_FDCWD, "/proc/13169/ns/mnt", O_RDONLY) = 139
  setns(139, CLONE_NEWNS)                 = 0
  stat("/usr/bin/stress-ng", {st_mode=S_IFREG|0755, st_size=3065416, ...}) = 0
  openat(AT_FDCWD, "/usr/bin/stress-ng", O_RDONLY) = 140
  fcntl(140, F_GETFD)                     = 0
  fstat(140, {st_mode=S_IFREG|0755, st_size=3065416, ...}) = 0
  mmap(NULL, 3065416, PROT_READ, MAP_PRIVATE, 140, 0) = 0x7ff2fdc5b000
  munmap(0x7ff2fdc5b000, 3065416)         = 0
  close(140)                              = 0
  stat("stress-ng-0.07.29-10.fc31.x86_64.debug", 0x7fff45d71260) = -1 ENOENT (No such file or directory)
  stat("/usr/bin/stress-ng-0.07.29-10.fc31.x86_64.debug", 0x7fff45d71260) = -1 ENOENT (No such file or directory)
  stat("/usr/bin/.debug/stress-ng-0.07.29-10.fc31.x86_64.debug", 0x7fff45d71260) = -1 ENOENT (No such file or directory)
  stat("/usr/lib/debug/usr/bin/stress-ng-0.07.29-10.fc31.x86_64.debug", 0x7fff45d71260) = -1 ENOENT (No such file or directory)
  stat("/root/.debug/.build-id/f2/ed02c68341183a124b9b0f6e2e6c493c465b29", 0x7fff45d711e0) = -1 ENOENT (No such file or directory)

Only to then go back to the "host" namespace and look just in the user's
~/.debug cache:

  setns(137, CLONE_NEWNS)                 = 0
  chdir("/root")                          = 0
  close(137)                              = 0
  close(139)                              = 0
  stat("/root/.debug/.build-id/f2/ed02c68341183a124b9b0f6e2e6c493c465b29/elf", 0x7fff45d732e0) = -1 ENOENT (No such file or directory)
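
The four stat() calls on the .debug file in the first trace above suggest a
fixed search order relative to the DSO's directory. The sketch below
rebuilds that list from the trace alone; it is a reading of the strace
output, not a copy of perf's lookup code. The function name
debuglink_candidates() and the CAND_MAX limit are hypothetical, and the
.debug file name is assumed to come from the binary itself.

```c
#include <stdio.h>
#include <string.h>

#define CAND_MAX 4096

/*
 * Hypothetical helper: fill out[0..3] with the candidate paths in the
 * order they show up in the trace. Returns 0 on success, -1 if the
 * longest candidate would not fit in CAND_MAX bytes.
 */
int debuglink_candidates(const char *dso_dir, const char *debuglink,
			 char out[4][CAND_MAX])
{
	/* Longest candidate: "/usr/lib/debug" + dso_dir + "/" + debuglink */
	if (strlen("/usr/lib/debug") + strlen(dso_dir) + 1 +
	    strlen(debuglink) >= CAND_MAX)
		return -1;

	snprintf(out[0], CAND_MAX, "%s", debuglink);
	snprintf(out[1], CAND_MAX, "%s/%s", dso_dir, debuglink);
	snprintf(out[2], CAND_MAX, "%s/.debug/%s", dso_dir, debuglink);
	snprintf(out[3], CAND_MAX, "/usr/lib/debug%s/%s", dso_dir, debuglink);
	return 0;
}
```

All four paths are resolved while still inside the container's mount
namespace, which is why installing the debuginfo package on the host alone
does not help here.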

It continues to fail to resolve symbols:

  # perf report | grep stress-ng | head -5
     9.50%  stress-ng-cpu    stress-ng    [.] 0x0000000000021ac1
     8.58%  stress-ng-cpu    stress-ng    [.] 0x0000000000021ab4
     8.51%  stress-ng-cpu    stress-ng    [.] 0x0000000000021489
     7.17%  stress-ng-cpu    stress-ng    [.] 0x00000000000219b6
     3.93%  stress-ng-cpu    stress-ng    [.] 0x0000000000021478
  #

To overcome that we use:

  # perf buildid-cache -v --add /usr/lib/debug/usr/bin/stress-ng-0.07.29-10.fc31.x86_64.debug
  Adding f2ed02c68341183a124b9b0f6e2e6c493c465b29 /usr/lib/debug/usr/bin/stress-ng-0.07.29-10.fc31.x86_64.debug: Ok
  #
  # ls -la /root/.debug/.build-id/f2/ed02c68341183a124b9b0f6e2e6c493c465b29/elf
  -rw-r--r--. 3 root root 2401184 Jul 27 07:03 /root/.debug/.build-id/f2/ed02c68341183a124b9b0f6e2e6c493c465b29/elf
  # file /root/.debug/.build-id/f2/ed02c68341183a124b9b0f6e2e6c493c465b29/elf
  /root/.debug/.build-id/f2/ed02c68341183a124b9b0f6e2e6c493c465b29/elf: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter \004, BuildID[sha1]=f2ed02c68341183a124b9b0f6e2e6c493c465b29, for GNU/Linux 3.2.0, with debug_info, not stripped, too many notes (256)
  #
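
The cache entry listed above follows a simple layout: the first two hex
digits of the build-id name a directory, the remaining digits name a
subdirectory, and the ELF file inside it is called "elf". A minimal sketch
of that mapping, with a hypothetical helper name (buildid_cache_path() is
not a perf function) and the layout taken only from the paths shown in this
message:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write
 *   <debugdir>/.build-id/<first two hex digits>/<remaining digits>/elf
 * into buf. Returns 0 on success, -1 if the build-id string is too
 * short or buf is too small.
 */
int buildid_cache_path(const char *debugdir, const char *sbuild_id,
		       char *buf, size_t size)
{
	int n;

	if (strlen(sbuild_id) < 3)
		return -1;

	n = snprintf(buf, size, "%s/.build-id/%.2s/%s/elf",
		     debugdir, sbuild_id, sbuild_id + 2);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Because the key is derived from the file's content, the same path is valid
no matter which mount namespace the binary was originally found in.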

Now it finally works:

  # perf report | grep stress-ng | head -5
    23.59%  stress-ng-cpu    stress-ng    [.] ackermann
    23.33%  stress-ng-cpu    stress-ng    [.] is_prime
    17.36%  stress-ng-cpu    stress-ng    [.] stress_cpu_sieve
     6.08%  stress-ng-cpu    stress-ng    [.] stress_cpu_correlate
     3.55%  stress-ng-cpu    stress-ng    [.] queens_try
  #

I'll make sure that it looks for the build-id keyed files in both the
"host" namespace (the namespace the user running 'perf record' was in at
the time of the recording) and in the container namespace, as it
shouldn't matter where a content based key lookup finds the ELF file to
use in resolving symbols, etc.

Reported-by: Karl Rister <krister@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Fixes: 657ee55319 ("perf evlist: Introduce side band thread")
Link: https://lkml.kernel.org/n/tip-g79k0jz41adiaeuqud742t2l@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-29 08:36:12 -03:00
..
arch libperf: Rename the PERF_RECORD_ structs to have a "perf" prefix 2019-08-29 08:36:12 -03:00
bench Merge remote-tracking branch 'torvalds/master' into perf/core 2019-08-12 16:25:00 -03:00
Documentation perf report: Add --switch-on/--switch-off events 2019-08-16 12:14:33 -03:00
examples/bpf perf augmented_raw_syscalls: Reduce perf_event_output() boilerplate 2019-08-26 11:58:29 -03:00
include/bpf perf include bpf: Add bpf_tail_call() prototype 2019-07-29 18:34:40 -03:00
jvmti tools build: Check if gettid() is available before providing helper 2019-07-07 17:53:09 -03:00
lib libperf: Move 'enum perf_user_event_type' to perf/event.h 2019-08-29 08:36:12 -03:00
pmu-events perf vendor events intel: Add Tremontx event file v1.02 2019-08-15 12:04:04 -03:00
python treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 407 2019-06-05 17:37:14 +02:00
scripts perf scripts python: export-to-postgresql.py: Export switch events 2019-07-10 13:05:12 -03:00
tests libperf: Rename the PERF_RECORD_ structs to have a "perf" prefix 2019-08-29 08:36:12 -03:00
trace perf trace beauty ioctl: Fix off-by-one error in cmd->string table 2019-08-26 11:58:29 -03:00
ui perf script: Fix memory leaks in list_scripts() 2019-08-26 11:58:30 -03:00
util perf evlist: Use unshare(CLONE_FS) in sb threads to let setns(CLONE_NEWNS) work 2019-08-29 08:36:12 -03:00
.gitignore
Build
builtin-annotate.c perf evsel: Rename struct perf_evsel to struct evsel 2019-07-29 18:34:42 -03:00
builtin-bench.c tools lib: Adopt zalloc()/zfree() from tools/perf 2019-07-09 10:13:26 -03:00
builtin-buildid-cache.c perf tools: Remove needless util.h include from builtin.h 2019-08-28 17:19:34 -03:00
builtin-buildid-list.c
builtin-c2c.c perf cacheline: Move cacheline related routines to separate files 2019-08-26 11:58:29 -03:00
builtin-config.c perf tools: Add missing headers, mostly stdlib.h 2019-07-09 10:13:22 -03:00
builtin-data.c
builtin-diff.c perf srcline: Add missing srcline.h header to files needing its defs 2019-08-26 11:58:29 -03:00
builtin-evlist.c libperf: Move perf_event_attr field from perf's evsel to libperf's perf_evsel 2019-07-29 18:34:45 -03:00
builtin-ftrace.c libperf: Add perf_thread_map__nr/perf_thread_map__pid functions 2019-08-22 17:16:57 -03:00
builtin-help.c tools lib: Adopt zalloc()/zfree() from tools/perf 2019-07-09 10:13:26 -03:00
builtin-inject.c libperf: Move perf_event_attr field from perf's evsel to libperf's perf_evsel 2019-07-29 18:34:45 -03:00
builtin-kallsyms.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 251 2019-06-05 17:30:26 +02:00
builtin-kmem.c perf evsel: Rename struct perf_evsel to struct evsel 2019-07-29 18:34:42 -03:00
builtin-kvm.c libperf: Add threads to struct perf_evlist 2019-07-29 18:34:45 -03:00
builtin-list.c perf list: Output tool events 2019-04-01 14:49:25 -03:00
builtin-lock.c perf evsel: Rename struct perf_evsel to struct evsel 2019-07-29 18:34:42 -03:00
builtin-mem.c perf evsel: Rename struct perf_evsel to struct evsel 2019-07-29 18:34:42 -03:00
builtin-probe.c perf probe: Avoid calling freeing routine multiple times for same pointer 2019-07-23 09:04:41 -03:00
builtin-record.c libperf: Rename the PERF_RECORD_ structs to have a "perf" prefix 2019-08-29 08:36:12 -03:00
builtin-report.c libperf: Add PERF_RECORD_HEADER_FEATURE 'struct feature_event' to perf/event.h 2019-08-29 08:36:12 -03:00
builtin-sched.c libperf: Add PERF_RECORD_LOST 'struct lost_event' to perf/event.h 2019-08-26 19:39:09 -03:00
builtin-script.c libperf: Rename the PERF_RECORD_ structs to have a "perf" prefix 2019-08-29 08:36:12 -03:00
builtin-stat.c libperf: Rename the PERF_RECORD_ structs to have a "perf" prefix 2019-08-29 08:36:12 -03:00
builtin-timechart.c libperf: Move perf_event_attr field from perf's evsel to libperf's perf_evsel 2019-07-29 18:34:45 -03:00
builtin-top.c perf top: Fix event group with more than two events 2019-08-28 18:15:03 -03:00
builtin-trace.c perf evlist: Remove needless util.h from evlist.h 2019-08-28 17:19:35 -03:00
builtin-version.c perf version: Fix segfault due to missing OPT_END() 2019-07-15 07:59:05 -03:00
builtin.h perf tools: Remove needless util.h include from builtin.h 2019-08-28 17:19:34 -03:00
check-headers.sh tools headers: Grab copy of linux/const.h, needed by linux/bits.h 2019-08-20 12:08:23 -03:00
command-list.txt
CREDITS
design.txt
Makefile
Makefile.config perf tools: tools/include should come before tools/uapi/include 2019-08-20 12:07:22 -03:00
Makefile.perf tools build: Add capability-related feature detection 2019-08-12 17:14:14 -03:00
MANIFEST tools lib: Adopt zalloc()/zfree() from tools/perf 2019-07-09 10:13:26 -03:00
perf-archive.sh
perf-completion.sh
perf-read-vdso.c
perf-sys.h
perf-with-kcore.sh Merge branch 'x86/cpu' into perf/core, to pick up dependent changes 2019-06-17 12:29:16 +02:00
perf.c perf tools: Remove needless util.h include from builtin.h 2019-08-28 17:19:34 -03:00
perf.h perf record: Move record_opts and other record decls out of perf.h 2019-08-26 11:58:22 -03:00