linux/tools/perf
Alexey Budankov 10ccbc1cc0 perf report: Prefer DWARF callstacks to LBR ones when captured both
Display DWARF based callchains when the perf.data file contains raw thread
stack data as LBR callstack data.

Commiter testing:

This changes the output from the branch stack based one, i.e. without
this patch, for the same file as in the previous csets:

  # perf report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Total Lost Samples: 0
  #
  # Samples: 13  of event 'cycles'
  # Event count (approx.): 13
  #
  # Overhead  Command  Source Shared Object  Source Symbol                Target Symbol                              Basic Block Cycles
  # ........  .......  ....................  ...........................  .........................................  ..................
  #
       7.69%  ls       libpthread-2.29.so    [.] _init                    [.] __pthread_initialize_minimal_internal  6827
       7.69%  ls       ld-2.29.so            [k] _start                   [k] _dl_start                              -
       7.69%  ls       ld-2.29.so            [.] _dl_start_user           [.] _dl_init                               -24790
       7.69%  ls       ld-2.29.so            [k] _dl_start                [k] _dl_sysdep_start                       278
       7.69%  ls       ld-2.29.so            [k] dl_main                  [k] _dl_map_object_deps                    15581
       7.69%  ls       ld-2.29.so            [k] open_verify.constprop.0  [k] lseek64                                4228
       7.69%  ls       ld-2.29.so            [k] _dl_map_object           [k] open_verify.constprop.0                55
       7.69%  ls       ld-2.29.so            [k] openaux                  [k] _dl_map_object                         67
       7.69%  ls       ld-2.29.so            [k] _dl_map_object_deps      [k] 0x00007f441b57c090                     112
       7.69%  ls       ld-2.29.so            [.] call_init.part.0         [.] _init                                  334
       7.69%  ls       ld-2.29.so            [.] _dl_init                 [.] call_init.part.0                       383
       7.69%  ls       ld-2.29.so            [k] _dl_sysdep_start         [k] dl_main                                45
       7.69%  ls       ld-2.29.so            [k] _dl_catch_exception      [k] openaux                                116

  #
  # (Tip: For memory address profiling, try: perf mem record / perf mem report)
  #

To the one that shows call chains:

  # perf report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 10  of event 'cycles'
  # Event count (approx.): 3204047
  #
  # Children      Self  Command  Shared Object       Symbol
  # ........  ........  .......  ..................  .........................................
  #
      55.01%     0.00%  ls       [kernel.vmlinux]    [k] entry_SYSCALL_64_after_hwframe
              |
              ---entry_SYSCALL_64_after_hwframe
                 do_syscall_64
                 |
                  --16.01%--__x64_sys_execve
                            __do_execve_file.isra.0
                            search_binary_handler
                            load_elf_binary
                            elf_map
                            vm_mmap_pgoff
                            do_mmap
                            mmap_region
                            perf_event_mmap
                            perf_iterate_sb
                            perf_iterate_ctx
                            perf_event_mmap_output
                            perf_output_copy
                            memcpy_erms

      55.01%    39.00%  ls       [kernel.vmlinux]    [k] do_syscall_64
              |
              |--39.00%--0xffffffffffffffff
              |          _dl_map_object
              |          open_verify.constprop.0
              |          __lseek64 (inlined)
              |          entry_SYSCALL_64_after_hwframe
              |          do_syscall_64
              |
               --16.01%--do_syscall_64
                         __x64_sys_execve
                         __do_execve_file.isra.0
                         search_binary_handler
                         load_elf_binary
                         elf_map
                         vm_mmap_pgoff
                         do_mmap
                         mmap_region
                         perf_event_mmap
                         perf_iterate_sb
                         perf_iterate_ctx
                         perf_event_mmap_output
                         perf_output_copy
                         memcpy_erms

      42.95%    42.95%  ls       libpthread-2.29.so  [.] __pthread_initialize_minimal_internal
              |
              ---_init
                 __pthread_initialize_minimal_internal

      42.95%     0.00%  ls       libpthread-2.29.so  [.] _init
              |
              ---_init
                 __pthread_initialize_minimal_internal

  <SNIP>

  #
  # (Tip: Profiling branch (mis)predictions with: perf record -b / perf report)
  #
  #

The branch stack view be explicitely selected using:

  # perf report -h branch-stack

   Usage: perf report [<options>]

      -b, --branch-stack    use branch records for per branch histogram filling

  #

I.e. after this patch:

  # perf report -b --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 13  of event 'cycles'
  # Event count (approx.): 13
  #
  # Overhead  Command  Source Shared Object  Source Symbol                Target Symbol                              Basic Block Cycles
  # ........  .......  ....................  ...........................  .........................................  ..................
  #
       7.69%  ls       libpthread-2.29.so    [.] _init                    [.] __pthread_initialize_minimal_internal  6827
       7.69%  ls       ld-2.29.so            [k] _start                   [k] _dl_start                              -
       7.69%  ls       ld-2.29.so            [.] _dl_start_user           [.] _dl_init                               -24790
       7.69%  ls       ld-2.29.so            [k] _dl_start                [k] _dl_sysdep_start                       278
       7.69%  ls       ld-2.29.so            [k] dl_main                  [k] _dl_map_object_deps                    15581
       7.69%  ls       ld-2.29.so            [k] open_verify.constprop.0  [k] lseek64                                4228
       7.69%  ls       ld-2.29.so            [k] _dl_map_object           [k] open_verify.constprop.0                55
       7.69%  ls       ld-2.29.so            [k] openaux                  [k] _dl_map_object                         67
       7.69%  ls       ld-2.29.so            [k] _dl_map_object_deps      [k] 0x00007f441b57c090                     112
       7.69%  ls       ld-2.29.so            [.] call_init.part.0         [.] _init                                  334
       7.69%  ls       ld-2.29.so            [.] _dl_init                 [.] call_init.part.0                       383
       7.69%  ls       ld-2.29.so            [k] _dl_sysdep_start         [k] dl_main                                45
       7.69%  ls       ld-2.29.so            [k] _dl_catch_exception      [k] openaux                                116

  #
  # (Tip: Show current config key-value pairs: perf config --list)
  #
  #

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/ccbd9583-82f4-dec5-7e84-64bf56e351fb@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-08-20 12:20:16 -03:00
..
arch perf intel-pt: Process options for PEBS event synthesis 2019-08-14 10:59:59 -03:00
bench Merge remote-tracking branch 'torvalds/master' into perf/core 2019-08-12 16:25:00 -03:00
Documentation perf report: Add --switch-on/--switch-off events 2019-08-16 12:14:33 -03:00
examples/bpf perf trace beauty: Add BPF augmenter for the 'rename' syscall 2019-07-29 18:34:42 -03:00
include/bpf perf include bpf: Add bpf_tail_call() prototype 2019-07-29 18:34:40 -03:00
jvmti tools build: Check if gettid() is available before providing helper 2019-07-07 17:53:09 -03:00
lib libperf: Initial documentation 2019-07-29 18:34:47 -03:00
pmu-events perf vendor events intel: Add Tremontx event file v1.02 2019-08-15 12:04:04 -03:00
python treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 407 2019-06-05 17:37:14 +02:00
scripts perf scripts python: export-to-postgresql.py: Export switch events 2019-07-10 13:05:12 -03:00
tests perf tools: Add NO_LIBCAP=1 to the minimal build test 2019-08-14 10:59:59 -03:00
trace tools perf beauty: Fix usbdevfs_ioctl table generator to handle _IOC() 2019-07-29 09:03:42 -03:00
ui perf ui: No need to set ui_browser to 1 twice 2019-08-14 11:00:00 -03:00
util perf report: Dump LBR callstack data by -D jointly with thread stack 2019-08-20 12:19:44 -03:00
.gitignore
Build perf tools: Rename build libperf to perf 2019-02-14 15:18:08 -03:00
builtin-annotate.c perf evsel: Rename struct perf_evsel to struct evsel 2019-07-29 18:34:42 -03:00
builtin-bench.c tools lib: Adopt zalloc()/zfree() from tools/perf 2019-07-09 10:13:26 -03:00
builtin-buildid-cache.c perf data: Add global path holder 2019-02-22 16:52:07 -03:00
builtin-buildid-list.c perf data: Add global path holder 2019-02-22 16:52:07 -03:00
builtin-c2c.c perf evlist: Rename struct perf_evlist to struct evlist 2019-07-29 18:34:42 -03:00
builtin-config.c perf tools: Add missing headers, mostly stdlib.h 2019-07-09 10:13:22 -03:00
builtin-data.c
builtin-diff.c perf evlist: Rename struct perf_evlist to struct evlist 2019-07-29 18:34:42 -03:00
builtin-evlist.c libperf: Move perf_event_attr field from perf's evsel to libperf's perf_evsel 2019-07-29 18:34:45 -03:00
builtin-ftrace.c perf ftrace: Improve error message about capability to use ftrace 2019-08-14 10:59:59 -03:00
builtin-help.c tools lib: Adopt zalloc()/zfree() from tools/perf 2019-07-09 10:13:26 -03:00
builtin-inject.c libperf: Move perf_event_attr field from perf's evsel to libperf's perf_evsel 2019-07-29 18:34:45 -03:00
builtin-kallsyms.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 251 2019-06-05 17:30:26 +02:00
builtin-kmem.c perf evsel: Rename struct perf_evsel to struct evsel 2019-07-29 18:34:42 -03:00
builtin-kvm.c libperf: Add threads to struct perf_evlist 2019-07-29 18:34:45 -03:00
builtin-list.c perf list: Output tool events 2019-04-01 14:49:25 -03:00
builtin-lock.c perf evsel: Rename struct perf_evsel to struct evsel 2019-07-29 18:34:42 -03:00
builtin-mem.c perf evsel: Rename struct perf_evsel to struct evsel 2019-07-29 18:34:42 -03:00
builtin-probe.c perf probe: Avoid calling freeing routine multiple times for same pointer 2019-07-23 09:04:41 -03:00
builtin-record.c perf record: Add an option to take an AUX snapshot on exit 2019-08-14 10:59:59 -03:00
builtin-report.c perf report: Prefer DWARF callstacks to LBR ones when captured both 2019-08-20 12:20:16 -03:00
builtin-sched.c libperf: Add perf_cpu_map__new()/perf_cpu_map__read() functions 2019-07-29 18:34:45 -03:00
builtin-script.c perf evswitch: Introduce init() method to set the on/off evsels from the command line 2019-08-15 12:25:55 -03:00
builtin-stat.c libperf: Move nr_members from perf's evsel to libperf's perf_evsel 2019-07-29 18:34:46 -03:00
builtin-timechart.c libperf: Move perf_event_attr field from perf's evsel to libperf's perf_evsel 2019-07-29 18:34:45 -03:00
builtin-top.c perf top: Add --switch-on/--switch-off events 2019-08-15 16:03:26 -03:00
builtin-trace.c perf trace: Add --switch-on/--switch-off events 2019-08-15 12:26:21 -03:00
builtin-version.c perf version: Fix segfault due to missing OPT_END() 2019-07-15 07:59:05 -03:00
builtin.h perf script: Add array bound checking to list_scripts 2019-03-11 16:33:19 -03:00
check-headers.sh tools headers: Grab copy of linux/const.h, needed by linux/bits.h 2019-08-20 12:08:23 -03:00
command-list.txt perf help: Add missing subcommand version 2018-09-19 14:53:36 -03:00
CREDITS
design.txt perf/doc: Update design.txt for exclude_{host|guest} flags 2019-01-21 11:01:18 +01:00
Makefile perf tools: Disable parallelism for 'make clean' 2018-08-20 08:54:58 -03:00
Makefile.config perf tools: tools/include should come before tools/uapi/include 2019-08-20 12:07:22 -03:00
Makefile.perf tools build: Add capability-related feature detection 2019-08-12 17:14:14 -03:00
MANIFEST tools lib: Adopt zalloc()/zfree() from tools/perf 2019-07-09 10:13:26 -03:00
perf-archive.sh
perf-completion.sh
perf-read-vdso.c perf tools: Make find_vdso_map() more modular 2019-01-08 13:28:13 -03:00
perf-sys.h
perf-with-kcore.sh Merge branch 'x86/cpu' into perf/core, to pick up dependent changes 2019-06-17 12:29:16 +02:00
perf.c perf config: Honour $PERF_CONFIG env var to specify alternate .perfconfig 2019-08-12 16:26:02 -03:00
perf.h perf record: Add an option to take an AUX snapshot on exit 2019-08-14 10:59:59 -03:00