linux/tools/perf
Kan Liang 94ba462d69 perf diff: Support for different binaries
Currently, the perf diff only works with same binaries. That's because
it compares the symbol start address. It doesn't work if the perf.data
comes from different binaries. This patch matches the symbol names.

Actually, perf diff once intended to compare the symbol names.  The
commit as below can look for a pair by name.

604c5c9297 (perf diff: Change the default sort order to "dso,symbol")
However, at that time, perf diff used a global list of dsos. That means
the binaries which has same name can only be loaded once. That's a
problem for comparing different binaries.

For example, we have an old binary and an updated binary. They very
likely have same name and most of the functions, so only dsos from old
binary will be loaded. When processing the data from updated binary,
perf still use the symbol information from old binary. That's wrong.

Then the commit as below used IP to replace symbol name.
9c443dfdd3 ("perf diff: Fix support for all --sort combinations")
>From that time, perf diff starts to compare the symbol address.

The global dsos is discarded from a patch in 2010.
a1645ce12a ("perf: 'perf kvm' tool for monitoring guest performance
from host")
However, at that time, perf diff already compared by address. So perf
diff cannot work for different binaries as well.

This patch actually rolls back the perf diff to original design. The
document is also changed, so everybody knows the original design is to
compare the symbol names.

Here are some examples:

The only difference between example_v1.c and example_v2.c is the
location of f2 and f3. There is no change in behavior, but the previous
perf diff display the wrong differential profile.

example_v1.c
noinline void f3(void)
{
        volatile int i;
        for (i = 0; i < 10000;) {

                if(i%2)
                        i++;
                else
                        i++;
        }
}

noinline void f2(void)
{
        volatile int a = 100, b, c;
        for (b = 0; b < 10000; b++)
                c = a * b;

}

noinline void f1(void)
{
                f2();
                f3();
}

int main()
{
        int i;
        for (i = 0; i < 100000; i++)
                f1();
}

example_v2.c
noinline void f2(void)
{
        volatile int a = 100, b, c;
        for (b = 0; b < 10000; b++)
                c = a * b;
}

noinline void f3(void)
{
        volatile int i;
        for (i = 0; i < 10000;) {
                if(i%2)
                        i++;
                else
                        i++;
        }
}

noinline void f1(void)
{
                f2();
                f3();
}

int main()
{
        int i;
        for (i = 0; i < 100000; i++)
                f1();
}

[lk@localhost perf_diff]$ gcc example_v1.c -o example
[lk@localhost perf_diff]$ perf record -o example_v1.data ./example
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.813 MB example_v1.data (~35522 samples) ]

[lk@localhost perf_diff]$ gcc example_v2.c -o example
[lk@localhost perf_diff]$ perf record -o example_v2.data ./example
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.824 MB example_v2.data (~36015 samples) ]

Old perf diff result:

[lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
 Event 'cycles'
 Baseline    Delta  Shared Object     Symbol
 ........  .......  ................  ...............................

                     [kernel.vmlinux]  [k] __perf_event_task_sched_out
     0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
                     [kernel.vmlinux]  [k] idle_cpu
                     [kernel.vmlinux]  [k] intel_pstate_timer_func
                     [kernel.vmlinux]  [k] native_read_msr_safe
     0.00%           [kernel.vmlinux]  [k] native_read_tsc
     0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
                     [kernel.vmlinux]  [k] ntp_tick_length
     0.00%           [kernel.vmlinux]  [k] rb_erase
     0.00%           [kernel.vmlinux]  [k] tick_sched_timer
     0.00%           [kernel.vmlinux]  [k] unmap_single_vma
     0.00%           [kernel.vmlinux]  [k] update_wall_time
     0.00%           example           [.] f1
    46.24%           example           [.] f2
    53.71%   -7.55%  example           [.] f3
            +53.81%  example           [.] f3
     0.02%           example           [.] main

New perf diff result:

[lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
                     [kernel.vmlinux]  [k] __perf_event_task_sched_out
     0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
                     [kernel.vmlinux]  [k] idle_cpu
                     [kernel.vmlinux]  [k] intel_pstate_timer_func
                     [kernel.vmlinux]  [k] native_read_msr_safe
     0.00%           [kernel.vmlinux]  [k] native_read_tsc
     0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
                     [kernel.vmlinux]  [k] ntp_tick_length
     0.00%           [kernel.vmlinux]  [k] rb_erase
     0.00%           [kernel.vmlinux]  [k] tick_sched_timer
     0.00%           [kernel.vmlinux]  [k] unmap_single_vma
     0.00%           [kernel.vmlinux]  [k] update_wall_time
     0.00%           example           [.] f1
    46.24%   -0.08%  example           [.] f2
    53.71%   +0.11%  example           [.] f3
     0.02%           example           [.] main

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1423460384-11645-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-27 10:08:38 -03:00
..
arch perf build: Add arch sparc objects building 2015-02-12 13:22:01 -03:00
bench perf build: Add bench objects building 2015-02-12 11:32:32 -03:00
config perf data: Add perf data to CTF conversion support 2015-02-25 16:13:12 -03:00
Documentation perf diff: Support for different binaries 2015-02-27 10:08:38 -03:00
python perf python: Remove duplicate TID bit from mask 2013-08-07 17:35:25 -03:00
scripts perf build: Add scripts objects building 2015-02-12 11:49:53 -03:00
tests perf build: Add tests objects building 2015-02-12 11:40:32 -03:00
ui perf build: Add gtk objects building 2015-02-12 11:49:12 -03:00
util perf diff: Support for different binaries 2015-02-27 10:08:38 -03:00
.gitignore perf tools: Add perf-read-vdso32 and perf-read-vdsox32 to .gitignore 2014-11-19 12:34:24 -03:00
Build perf tools: Add new 'perf data' command 2015-02-25 12:42:25 -03:00
builtin-annotate.c perf report: Show progress bar for output resorting 2014-12-23 12:01:37 -03:00
builtin-bench.c perf bench: Add --repeat option 2014-06-19 16:13:15 -03:00
builtin-buildid-cache.c perf buildid-cache: Add new buildid cache if update target is not cached 2015-02-27 10:08:37 -03:00
builtin-buildid-list.c perf session: Separating data file properties from session 2013-10-21 17:33:25 -03:00
builtin-data.c perf data: Add perf data to CTF conversion support 2015-02-25 16:13:12 -03:00
builtin-diff.c perf diff: Fix -o/--order option behavior 2015-01-21 13:24:35 -03:00
builtin-evlist.c perf tools: Modify error code for when perf_session__new() fails 2014-09-26 12:32:58 -03:00
builtin-help.c perf help: Use strerror_r instead of strerror 2014-08-15 13:08:26 -03:00
builtin-inject.c perf tools: Use perf_data_file__fd() consistently 2015-01-29 16:58:24 -03:00
builtin-kmem.c perf tools: Modify error code for when perf_session__new() fails 2014-09-26 12:32:58 -03:00
builtin-kvm.c perf kvm stat live: Mark events as (x86 only) in help output 2014-12-10 12:08:59 -03:00
builtin-list.c perf list: Place the header text in its right position 2015-02-13 11:57:50 -03:00
builtin-lock.c perf tools: Modify error code for when perf_session__new() fails 2014-09-26 12:32:58 -03:00
builtin-mem.c perf mem: Move the mem_operations global to struct perf_mem 2015-01-21 13:24:31 -03:00
builtin-probe.c perf probe: Add --quiet option to suppress output result message 2014-10-29 10:32:49 -02:00
builtin-record.c perf record: Support recording running/enabled time 2015-02-25 12:42:23 -03:00
builtin-report.c perf tools: Enable LBR call stack support 2015-02-18 17:16:17 +01:00
builtin-sched.c perf evlist: Adopt events_stats from perf_session 2015-02-22 22:22:57 -03:00
builtin-script.c perf tools: Export usage string and option table of perf record 2014-10-29 10:32:47 -02:00
builtin-stat.c perf tools: Remove EOL whitespaces 2015-01-21 13:24:31 -03:00
builtin-timechart.c perf tools: Export usage string and option table of perf record 2014-10-29 10:32:47 -02:00
builtin-top.c perf evlist: Adopt events_stats from perf_session 2015-02-22 22:22:57 -03:00
builtin-trace.c perf trace: Fix SIGBUS failures due to misaligned accesses 2015-02-26 11:59:04 -03:00
builtin.h perf tools: Add new 'perf data' command 2015-02-25 12:42:25 -03:00
command-list.txt perf tools: Add new 'perf data' command 2015-02-25 12:42:25 -03:00
CREDITS
design.txt perf tools: Update some code references in design.txt 2014-03-18 18:17:06 -03:00
Makefile perf tools: Add 'build-test' make target 2014-01-16 16:26:26 -03:00
Makefile.perf perf tools: Add feature check for libbabeltrace 2015-02-25 12:42:24 -03:00
MANIFEST tools build: Add new build support 2015-02-11 18:30:03 -03:00
perf-archive.sh perf archive: Make 'f' the last parameter for tar 2012-09-17 13:10:42 -03:00
perf-completion.sh perf sched: Introduce --list-cmds for use by scripts 2014-04-16 17:16:05 +02:00
perf-read-vdso.c perf tools: Build programs to copy 32-bit compatibility 2014-10-29 10:32:48 -02:00
perf-sys.h perf tools: Avoid build splat for syscall numbers with uclibc 2015-01-16 17:49:29 -03:00
perf-with-kcore.sh perf tools: Add perf-with-kcore script 2014-09-17 17:08:08 -03:00
perf.c perf tools: Add new 'perf data' command 2015-02-25 12:42:25 -03:00
perf.h perf record: Support recording running/enabled time 2015-02-25 12:42:23 -03:00