IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Move the 'pager' function prototypes into a new pager.h so that the
pager code can be moved out to a library.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/ba7c316474dd6bfc047e5c6dc4dcab39a982caf5.1449965119.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
[The kernel patch needed for this is in tip now (b16a5b52eb9 perf/x86:
Add option to disable ...) So this user tools patch to make use of it
should be merged now]
Automatically disable collecting branch flags and cycles with
--call-graph lbr. This allows avoiding a bunch of extra MSR
reads in the PMI on Skylake.
When the kernel doesn't support the new flags they are automatically
cleared in the fallback code.
v2: Switch to use branch_sample_type instead of sample_type.
Adjust description.
Fix the fallback logic.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1449879144-29074-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We should always return from thread__new(), the constructor, with the
object with a reference count of one, so that:
struct thread *thread = thread__new();
thread__put(thread);
Will call thread__delete().
If any reference is made to that 'thread' variable, it better use
thread__get(thread) to hold a reference.
We were returning with thread->refcnt set to zero, fix it and some cases
where thread__delete() was being called, which were not a problem
because just one reference was being used, now that we set it to 1, use
thread__put() instead.
Reported-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-4b9mkuk66to4ecckpmpvqx6s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I.e. don't exit with the signal number, instead set the signal handler
to the default one and then raise it again.
Noticed while trying to dump the stack at segfaults in the 'perf test'
forked process used to run each test, that inspects signal info at
each test.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-5x5r176wnoqxi5p6id05wv9w@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There are so many test cases use stack allocated 'struct machine'.
Including:
test__hists_link
test__hists_filter
test__mmap_thread_lookup
test__thread_mg_share
test__hists_output
test__hists_cumulate
Also, in non-test code (for example, machine__new_host()) there are
code use 'malloc()' to alloc struct machine.
These are dangerous operations, cause some tests fail or hung in
machines__exit(). For example, in
machines__exit ->
machine__destroy_kernel_maps ->
map_groups__remove ->
maps__remove ->
pthread_rwlock_wrlock
a incorrectly initialized lock causes unintended behavior.
This patch memset(0) that structure in machine__init() to ensure all
fields in 'struct machine' are initialized to zero.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1449541544-67621-17-git-send-email-wangnan0@huawei.com
[ Use memset, see 'man bzero' ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add hexadecimal u32 to base data type, which is useful for raw output
because raw data is u32 aligned.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1449541544-67621-12-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix dso__load_sym to put dso because dsos__add already got it.
Refcnt debugger explain the problem:
----
==== [0] ====
Unreclaimed dso: 0x19dd200
Refcount +1 => 1 at
./perf(dso__new+0x1ff) [0x4a62df]
./perf(dso__load_sym+0xe89) [0x503509]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount +1 => 2 at
./perf(dso__get+0x34) [0x4a65f4]
./perf(map__new2+0x76) [0x4be216]
./perf(dso__load_sym+0xee1) [0x503561]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount +1 => 3 at
./perf(dsos__add+0xf3) [0x4a6bc3]
./perf(dso__load_sym+0xfc1) [0x503641]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount -1 => 2 at
./perf(dso__put+0x2f) [0x4a664f]
./perf(map_groups__exit+0xb9) [0x4bee29]
./perf(machine__delete+0xb0) [0x4b93d0]
./perf(exit_probe_symbol_maps+0x28) [0x506718]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount -1 => 1 at
./perf(dso__put+0x2f) [0x4a664f]
./perf(machine__delete+0xfe) [0x4b941e]
./perf(exit_probe_symbol_maps+0x28) [0x506718]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
----
So, in the dso__load_sym, dso is gotten 3 times, by dso__new,
map__new2, and dsos__add. The last 2 is actually released by
map_groups and machine__delete correspondingly. However, the
first reference by dso__new, is never released.
Committer note:
Changed the place where the reference count is dropped to:
Fix it by dropping it right after creating curr_map, since we know that
either that operation failed and we need to drop the dso refcount or
that it succeed and we have it referenced via curr_map->dso.
Then only drop the curr_map refcount after we call dsos__add() to make
sure we hold a reference to it via curr_map->dso.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021118.10245.49869.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Note that since the thread was already inserted to the session
list, it will be released when the session is released.
Also, in perf_session__register_idle_thread() failure path,
the thread should be put before returning.
Refcnt debugger shows that the perf_session__register_idle_thread
gets the returned thread, but the caller (__cmd_top) does not
put the returned idle thread.
----
==== [0] ====
Unreclaimed thread@0x24e6240
Refcount +1 => 0 at
./perf(thread__new+0xe5) [0x4c8a75]
./perf(machine__findnew_thread+0x9a) [0x4bbdba]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
Refcount +1 => 1 at
./perf(thread__get+0x2c) [0x4c8bcc]
./perf(machine__findnew_thread+0xee) [0x4bbe0e]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
Refcount +1 => 2 at
./perf(thread__get+0x2c) [0x4c8bcc]
./perf(machine__findnew_thread+0x112) [0x4bbe32]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
----
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021122.10245.69707.stgit@localhost.localdomain
[ Drop the refcount in perf_session__register_idle_thread() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
After sample processing is done, hist entries are in both of
hists->entries and hists->entries_in (or hists->entries_collapsed). So
I guess perf report does not have leaks on hists.
But for perf top, it's possible to have half-processed entries which are
only in hists->entries_in. Eventually they will go to the
hists->entries and get freed but they cannot be deleted by current
hists__delete_entries(). This patch adds hists__delete_all_entries
function to delete those entries.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-and-Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1449734015-9148-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since all of its users call before setup_browser(), there's no need to
call exit_browser() inside of the function.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449716459-23004-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move cmd_version() to its own file so that help.c can be moved to a
library.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/e908b1b68f20ab6d8d33941d5571c23110622e60.1449548395.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_env__set_cmdline() only saves the arguments the first time it's
called. It doesn't need to be called every time the options and
suboptions are parsed. Instead it can just be called once.
This also has the advantage of making the option parsing code less
perf-specific so it can be moved out to a library.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/19b76a5aa1b688bd635bd65d80bbc103a978d75e.1449548395.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The term functions are needed by help.c which is going to be moved into
a separate library. Move them out of util.c and into their own file.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/9a39c854dd156b55ebda57e427594c9a59dcb40f.1449548395.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix write_numa_topology to put cpu_map instead of free because cpu_map
is managed based on refcnt.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021135.10245.79046.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix machine.vmlinux_maps to make sure to clear the old one if it is
renewal. This can leak the previous maps on the vmlinux_maps because
those are just overwritten.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021133.10245.93730.stgit@localhost.localdomain
[ Simplified the memset, same end result ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since hists__init doesn't set the destructor of hists_evsel (which is an
extended evsel structure), when hists_evsel is released, the extended
part of the hists_evsel is not deleted (note that the hists_evsel object
itself is freed).
This fixes it to add a destructor for hists__evsel and to set it up.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021129.10245.28710.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add basic support to parse ARM assembly.
This:
* enables perf to correctly show the disassembly, rather than chopping
some constants off at the '#' (which is not a comment character on
ARM).
* allows perf to identify ARM instructions that branch to other parts
within the same function, thereby properly annotating them.
* allows perf to identify function calls, allowing called functions to
be followed in the annotated view.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/n/tip-owp1uj0nmcgfrlppfyeetuyf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use perf_evsel__(enable|disable) functions in perf_evlist__(enable|disable)
functions in order to centralize ioctl enable/disable calls. This way we
eliminate 2 places calling directly ioctl.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449133606-14429-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding perf_evsel__disable function to have complement for
perf_evsel__enable function. Both will be used in following patch to
factor perf_evlist__(enable|disable).
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449133606-14429-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
All events now share proper cpu and thread maps. There's no need to pass
those maps from evlist, it's safe to use evsel maps for enabling event.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449133606-14429-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There's a mistake in dso__adjust_kmod_long_name() that it use strdup()
to dup the new long_name of a dso, but passes the original string to
dso__set_long_name(). Which causes random crash during cleanup.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Fixes: c03d5184f0e9 ("perf machine: Adjust dso->long_name for offline module")
Link: http://lkml.kernel.org/r/1449455785-42020-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Following patches are going to introduce BPF object level configuration
to enable setting values into BPF maps. To avoid confusion, this patch
renames existing 'config' in bpf-loader.c to 'program config'. Following
patches would introduce 'object config'.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1448614067-197576-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If user gives a filter, perf marks the corresponding column elided and
omits the output. But it should process and aggregates samples using
the field, otherwise samples will be aggregated as if the column was not
there resulted in incorrect output.
For example, I'd like to set a filter on native_write_msr_safe. The
original overhead of the function is negligible.
$ perf report | grep native_write_msr_safe
0.00% swapper [kernel.vmlinux] native_write_msr_safe
0.00% perf [kernel.vmlinux] native_write_msr_safe
However adding -S option gives different output.
$ perf report -S native_write_msr_safe --percentage absolute | \
> grep -e swapper -e perf
51.47% swapper [kernel.vmlinux]
4.14% perf [kernel.vmlinux]
Since it aggregated samples using comm and dso only. In fact, the above
values are same when it sorts with -s comm,dso.
$ perf report -s comm,dso | grep -e swapper -e perf
51.47% swapper [kernel.vmlinux]
4.14% perf [kernel.vmlinux]
This resulted in TUI failure with -ERANGE since it tries to increase
sample hit count for annotation with wrong symbols due to incorrect
aggregation.
This patch fixes it not to skip elided fields when comparing samples in
order to insert them to the hists.
Commiter note:
After the patch, with a different workloads:
# perf report --show-total-period -S native_write_msr_safe --stdio
#
# symbol: native_write_msr_safe
#
# Samples: 455 of event 'cycles:pp'
# Event count (approx.): 134787489
#
# Overhead Period Command Shared Object
# ........ ...... ............... ................
#
0.22% 293081 qemu-system-x86 [vmlinux]
0.19% 255914 swapper [vmlinux]
0.00% 2054 Timer [vmlinux]
0.00% 1021 firefox [vmlinux]
0.00% 2 perf [vmlinux]
# perf report --show-total-period | grep native_write_msr_safe
Failed to open /tmp/perf-14838.map, continuing without symbols
0.22% 293081 qemu-system-x86 [vmlinux] [k] native_write_msr_safe
0.19% 255914 swapper [vmlinux] [k] native_write_msr_safe
0.00% 2054 Timer [vmlinux] [k] native_write_msr_safe
0.00% 1021 firefox [vmlinux] [k] native_write_msr_safe
0.00% 2 perf [vmlinux] [k] native_write_msr_safe
#
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1448645559-31167-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") added
PERF_COUNT_SW_BPF_OUTPUT we ended up with a new entry in the event_symbols_sw
array that wasn't initialized, thus set to NULL, fix print_symbol_events()
to check for that case so that we don't crash if this happens again.
(gdb) bt
#0 __match_glob (ignore_space=false, pat=<optimized out>, str=<optimized out>) at util/string.c:198
#1 strglobmatch (str=<optimized out>, pat=pat@entry=0x7fffffffe61d "stall") at util/string.c:252
#2 0x00000000004993a5 in print_symbol_events (type=1, syms=0x872880 <event_symbols_sw+160>, max=11, name_only=false, event_glob=0x7fffffffe61d "stall")
at util/parse-events.c:1615
#3 print_events (event_glob=event_glob@entry=0x7fffffffe61d "stall", name_only=false) at util/parse-events.c:1675
#4 0x000000000042c79e in cmd_list (argc=1, argv=0x7fffffffe390, prefix=<optimized out>) at builtin-list.c:68
#5 0x00000000004788a5 in run_builtin (p=p@entry=0x871758 <commands+120>, argc=argc@entry=2, argv=argv@entry=0x7fffffffe390) at perf.c:370
#6 0x0000000000420ab0 in handle_internal_command (argv=0x7fffffffe390, argc=2) at perf.c:429
#7 run_argv (argv=0x7fffffffe110, argcp=0x7fffffffe11c) at perf.c:473
#8 main (argc=2, argv=0x7fffffffe390) at perf.c:588
(gdb) p event_symbols_sw[PERF_COUNT_SW_BPF_OUTPUT]
$4 = {symbol = 0x0, alias = 0x0}
(gdb)
A patch to robustify perf to not segfault when the next counter gets added in
the kernel will follow this one.
Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-57wysblcjfrseb0zg5u7ek10@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When PERF_COUNT_SW_BPF_OUTPUT was added to the kernel we should've
added it to tools/perf, where it is used just to list events.
This ended up causing a segfault in commands like "perf list stall".
Fix it by adding that new software counter.
A patch to robustify perf to not segfault when the next counter gets
added in the kernel will follow this one.
Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-uya354upi3eprsey6mi5962d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The --kernel option of perf buildid-list tool should show the running
kernel buildid. The functionality has been lost during other changes of
the related code.
The build_id__sprintf() function should return length of the build-id
string, but it was the length of the build-id raw data instead. Due to
that, some return value checking caused that the final string was not
printed out.
With this patch the build_id__sprintf() returns the correct value, so
the --kernel option works again.
Before:
# perf buildid-list --kernel
#
After:
# perf buildid-list --kernel
972c1edab5bdc06cc224af45d510af662a3c6972
#
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
LPU-Reference: 1448632089.24573.114.camel@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently when debuginfo is separated to vmlinux.debug, it's contents
get ignored. Let's change that and add it to the vmlinux_path list.
Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Acked-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1448469166-61363-3-git-send-email-tumanova@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Refactor vmlinux_path__init() to ease subsequent additions of new
vmlinux locations.
Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Acked-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1448469166-61363-2-git-send-email-tumanova@linux.vnet.ibm.com
[ Rename vmlinux_path__update() to vmlinux_path__add() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When parsing /proc/xxx/maps, the sscanf in perf_event__synthesize_mmap_events
truncate the map name at the space in "/anon_hugepage (deleted)".
is_anon_memory() then only receives the string "/anon_hugepage" and does
not detect it. We change is_anon_memory() to only compare the first
part of the string, effectively ignoring if " (deleted)" is there.
Signed-off-by: Yannick Brosseau <scientist@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Joshua Zhu <zhu.wen-jie@hp.com>
Cc: kernel-team@fb.com
Link: http://lkml.kernel.org/r/1448538152-2898-1-git-send-email-scientist@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Something unexpected may happen if copy statically linked perf to a
production environment:
# ./perf probe -m ./mymodule.ko my_func
[mymodule] with build id 326ab42550ef3d24944f53c817533728367effeb not found, continuing without symbols
Failed to find symbol my_func in /home/wangnan/kmodule/mymodule.ko
Error: Failed to add events.
# ./perf buildid-cache -a ./mymodule.ko
# ./perf probe -m ./mymodule.ko my_func
Added new event:
probe:my_func (on my_func in /home/wangnan/kmodule/mymodule.ko)
You can now use it in all perf tools, such as:
perf record -e probe:my_func -aR sleep 1
Where:
# ldd ./perf
not a dynamic executable
# strace -e open ./perf probe -m ./mymodule.ko my_func
...
open("/home/wangnan/kmodule/mymodule.ko", O_RDONLY) = 3
open("/home/wangnan/kmodule/../lib64/elfutils/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
...
open("/lib64/tls/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib64/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib64/tls/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib64/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("[mymodule]", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/wangnan/.debug/.build-id/32/6ab42550ef3d24944f53c817533728367effeb", O_RDONLY) = -1 ENOENT (No such file or directory)
open("[mymodule]", O_RDONLY) = -1 ENOENT (No such file or directory)
In the above example, probe fails before we put the module into
buildid-cache. However, user would expect it success in both case
because perf is able to find probe points actually.
The reason is because perf won't utilize module's full path if it failed
to open debuginfo. In:
convert_to_probe_trace_events ->
find_probe_trace_events_from_map ->
get_target_map ->
kernel_get_module_map ->
machine__findnew_module_map ->
map_groups__find_by_name
map_groups__find_by_name() is able to find the map of that module, but
this information is found from /proc/module before it knows the real
path of the offline module. Therefore, the map->dso->long_name is set to
something like '[mymodule]', which prevent dso__load() find the real
path of the module file.
In another aspect, if dso__load() can get the offline module through
buildid cache, it can read symble table from that ko. Even if debuginfo
is not available, 'perf probe' can success if the '.symtab' can be
found.
This patch improves machine__findnew_module_map(): when dso->long_name
is leading with '[' (doesn't find path of module when parsing
/proc/modules), fixes it by dso__set_long_name(), so following
dso__load() is possible to find the symbol table.
This patch won't interfere with buildid matching. Here is the test
result:
# ./perf probe -m ./mymodule.ko my_func
Added new event:
probe:my_func (on my_func in /home/wangnan/kmodule/mymodule.ko)
You can now use it in all perf tools, such as:
perf record -e probe:my_func -aR sleep 1
# ./perf probe -d '*'
Removed event: probe:my_func
# mv ./mymodule.{ko,.bak}
# mv ./moduleb.ko mymodule.ko
# ./perf probe -m ./mymodule.ko my_func
/home/wangnan/kmodule/mymodule.ko with build id 326ab42550ef3d24944f53c817533728367effeb not found, continuing without symbols
Failed to find symbol my_func in /home/wangnan/kmodule/mymodule.ko
Error: Failed to add events.
# ./perf probe -v -m ./mymodule.ko my_func
probe-definition(0): my_func
symbol:my_func file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Could not open debuginfo. Try to use symbols.
symsrc__init: build id mismatch for /home/wangnan/kmodule/mymodule.ko.
/home/wangnan/kmodule/mymodule.ko with build id 326ab42550ef3d24944f53c817533728367effeb not found, continuing without symbols
Failed to find symbol my_func in /home/wangnan/kmodule/mymodule.ko
Error: Failed to add events. Reason: No such file or directory (Code: -2)
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1448510397-187965-1-git-send-email-wangnan0@huawei.com
[ Renamed adjust_dso_long_name() do dso__adjust_kmod_long_name() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The callchain rbtree is rebuilt periodically, so it needs to
reinitialize the root everytime. Otherwise it can be stuck in the
rbtree insertion with stale pointers.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1448521700-32062-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If user requested to hide unresolved entries, skip unresolved callchains
as well as hist entries.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1448521700-32062-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The commit 05c8d802fa52 ("perf probe: Fix to free temporal Dwarf_Frame")
tried to fix the memory leak of Dwarf_Frame, but it released the frame
at wrong point. Since the dwarf_frame_cfa(frame, &pf->fb_ops, &nops) can
return an address inside the frame data structure to pf->fb_ops, we can
not release the frame before using pf->fb_ops.
This reverts the commit and releases the frame afterwards (right before
returning from call_probe_finder) correctly.
Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Reported-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 05c8d802fa52 ("perf probe: Fix to free temporal Dwarf_Frame")
LPU-Reference: 20151125103432.1473.31009.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As reported by Milian, currently for DWARF unwind (both libdw and
libunwind) we display callchain in callee order only.
Adding the support to follow callchain order setup to libdw DWARF
unwinder, so we could get following output for report:
$ perf record --call-graph dwarf ls
...
$ perf report --no-children --stdio
21.12% ls libc-2.21.so [.] __strcoll_l
|
---__strcoll_l
mpsort_with_tmp
mpsort_with_tmp
mpsort_with_tmp
sort_files
main
__libc_start_main
_start
$ perf report --stdio --no-children -g caller
21.12% ls libc-2.21.so [.] __strcoll_l
|
---_start
__libc_start_main
main
sort_files
mpsort_with_tmp
mpsort_with_tmp
mpsort_with_tmp
__strcoll_l
Reported-and-Tested-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Wang Nan <wangnan0@huawei.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jan Kratochvil <jkratoch@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20151119130119.GA26617@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As reported by Milian, currently for DWARF unwind (both libdw and
libunwind) we display callchain in callee order only.
Adding the support to follow callchain order setup to libunwind DWARF
unwinder, so we could get following output for report:
$ perf record --call-graph dwarf ls
...
$ perf report --no-children --stdio
39.26% ls libc-2.21.so [.] __strcoll_l
|
---__strcoll_l
mpsort_with_tmp
mpsort_with_tmp
sort_files
main
__libc_start_main
_start
0
$ perf report -g caller --no-children --stdio
...
39.26% ls libc-2.21.so [.] __strcoll_l
|
---0
_start
__libc_start_main
main
sort_files
mpsort_with_tmp
mpsort_with_tmp
__strcoll_l
Based-on-patch-by: Milian Wolff <milian.wolff@kdab.com>
Reported-and-Tested-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Wang Nan <wangnan0@huawei.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20151118075247.GA5416@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving initial entry call into get_entries function so all entries
processing is on one place. It will be useful for next change that adds
ordering logic.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1447772739-18471-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The flat callchain mode is to print all chains in a single, simple
hierarchy so make it easy to see.
Currently perf report --tui doesn't show flat callchains properly. With
flat callchains, only leaf nodes are added to the final rbtree so it
should show entries in parent nodes. To do that, add parent_val list to
struct callchain_node and show them along with the (normal) val list.
For example, consider following callchains with '-g graph'.
$ perf report -g graph
- 39.93% swapper [kernel.vmlinux] [k] intel_idle
intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
- cpu_startup_entry
28.63% start_secondary
- 11.30% rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel
Before:
$ perf report -g flat
- 39.93% swapper [kernel.vmlinux] [k] intel_idle
28.63% start_secondary
- 11.30% rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel
After:
$ perf report -g flat
- 39.93% swapper [kernel.vmlinux] [k] intel_idle
- 28.63% intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
start_secondary
- 11.30% intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
start_kernel
x86_64_start_reservations
x86_64_start_kernel
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1447047946-1691-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now -g/--call-graph option supports how to display callchain values.
Possible values are 'percent', 'period' and 'count'. The percent is
same as before and it's the default behavior. The period displays the
raw period value rather than the percentage. The count displays the
number of occurrences.
$ perf report --no-children --stdio -g percent
...
39.93% swapper [kernel.vmlinux] [k] intel_idel
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--28.63%-- start_secondary
|
--11.30%-- rest_init
$ perf report --no-children --show-total-period --stdio -g period
...
39.93% 13018705 swapper [kernel.vmlinux] [k] intel_idel
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--9334403-- start_secondary
|
--3684302-- rest_init
$ perf report --no-children --show-nr-samples --stdio -g count
...
39.93% 80 swapper [kernel.vmlinux] [k] intel_idel
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--57-- start_secondary
|
--23-- rest_init
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1447047946-1691-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It's to track the count of occurrences of the callchains.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1447047946-1691-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is a preparation to support for printing other type of callchain
value like count or period.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1447047946-1691-4-git-send-email-namhyung@kernel.org
[ renamed new _sprintf_ operation to _scnprintf_ ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add new call chain option (-g) 'folded' to print callchains in a line.
The callchains are separated by semicolons, and preceded by (absolute)
percent values and a space.
For example, the following 20 lines can be printed in 3 lines with the
folded output mode:
$ perf report -g flat --no-children | grep -v ^# | head -20
60.48% swapper [kernel.vmlinux] [k] intel_idle
54.60%
intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
start_secondary
5.88%
intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel
$ perf report -g folded --no-children | grep -v ^# | head -3
60.48% swapper [kernel.vmlinux] [k] intel_idle
54.60% intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
5.88% intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
This mode is supported only for --stdio now and intended to be used by
some scripts like in FlameGraphs[1]. Support for other UI might be
added later.
[1] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html
Requested-and-Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1447047946-1691-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix machine__findnew_module_map to drop the reference to the dso because
it is already referenced by both machine__findnew_module_dso() and
map__new2().
Refcnt debugger shows:
==== [1] ====
Unreclaimed dso: 0x1ffd980
Refcount +1 => 1 at
./perf(dso__new+0x1ff) [0x4a62df]
./perf(__dsos__addnew+0x29) [0x4a6e19]
./perf() [0x4b8b91]
./perf(modules__parse+0xfc) [0x4a9d5c]
./perf() [0x4b8460]
./perf(machine__create_kernel_maps+0x150) [0x4bb550]
./perf(machine__new_host+0xfa) [0x4bb75a]
./perf(init_probe_symbol_maps+0x93) [0x506623]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5]
./perf() [0x4220a9]
This map_groups__insert(0x4b8b91) already gets a reference to the new
dso:
----
eu-addr2line -e ./perf -f 0x4b8b91
map_groups__insert inlined at util/machine.c:586 in
machine__create_module
util/map.h:207
----
So this dso refcnt will be released when map_groups gets released.
[snip]
Refcount +1 => 2 at
./perf(dso__get+0x34) [0x4a65f4]
./perf() [0x4b8b35]
./perf(modules__parse+0xfc) [0x4a9d5c]
./perf() [0x4b8460]
./perf(machine__create_kernel_maps+0x150) [0x4bb550]
./perf(machine__new_host+0xfa) [0x4bb75a]
./perf(init_probe_symbol_maps+0x93) [0x506623]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5]
./perf() [0x4220a9]
Here, machine__findnew_module_dso(0x4b8b35) gets the dso (and stores it
in a local variable):
----
# eu-addr2line -e ./perf -f 0x4b8b35
machine__findnew_module_dso inlined at util/machine.c:578 in
machine__create_module
util/machine.c:514
----
Refcount +1 => 3 at
./perf(dso__get+0x34) [0x4a65f4]
./perf(map__new2+0x76) [0x4be1c6]
./perf() [0x4b8b4f]
./perf(modules__parse+0xfc) [0x4a9d5c]
./perf() [0x4b8460]
./perf(machine__create_kernel_maps+0x150) [0x4bb550]
./perf(machine__new_host+0xfa) [0x4bb75a]
./perf(init_probe_symbol_maps+0x93) [0x506623]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5]
./perf() [0x4220a9]
But also map__new2() gets the dso which will be put when the map is
released.
So, we have to drop the constructor reference obtained in
machine__findnew_module_dso().
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151118064035.30709.58824.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
__dsos__addnew should drop the constructor reference to dso after adding
it to the list, because __dsos__add() will get a reference that will be
kept while it is in the list.
This fixes DSO leaks when entries are removed to the list and the refcount
never gets to zero.
Refcnt debugger shows:
==== [0] ====
Unreclaimed dso: 0x2fccab0
Refcount +1 => 1 at
./perf(dso__new+0x1ff) [0x4a62df]
./perf(__dsos__addnew+0x29) [0x4a6e19]
./perf(dsos__findnew+0xd1) [0x4a7281]
./perf(machine__findnew_kernel+0x27) [0x4a5e17]
./perf() [0x4b8df2]
./perf(machine__create_kernel_maps+0x28) [0x4bb528]
./perf(machine__new_host+0xfa) [0x4bb84a]
./perf(init_probe_symbol_maps+0x93) [0x506713]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f46df132af5]
./perf() [0x4220a9]
Refcount +1 => 2 at
./perf(__dsos__addnew+0xfb) [0x4a6eeb]
./perf(dsos__findnew+0xd1) [0x4a7281]
./perf(machine__findnew_kernel+0x27) [0x4a5e17]
./perf() [0x4b8df2]
./perf(machine__create_kernel_maps+0x28) [0x4bb528]
./perf(machine__new_host+0xfa) [0x4bb84a]
./perf(init_probe_symbol_maps+0x93) [0x506713]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f46df132af5]
./perf() [0x4220a9]
Refcount +1 => 3 at
./perf(dsos__findnew+0x7e) [0x4a722e]
./perf(machine__findnew_kernel+0x27) [0x4a5e17]
./perf() [0x4b8df2]
./perf(machine__create_kernel_maps+0x28) [0x4bb528]
./perf(machine__new_host+0xfa) [0x4bb84a]
./perf(init_probe_symbol_maps+0x93) [0x506713]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f46df132af5]
./perf() [0x4220a9]
[snip]
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151118064031.30709.81460.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>