perf-top(1)
===========

NAME
----
perf-top - System profiling tool.

SYNOPSIS
--------
[verse]
'perf top' [-e <EVENT> | --event=EVENT] [<options>]

DESCRIPTION
-----------
This command generates and displays a performance counter profile in real time.

OPTIONS
-------
-a::
--all-cpus::
System-wide collection. (default)

-c <count>::
--count=<count>::
Event period to sample.

-C <cpu-list>::
--cpu=<cpu-list>::
Monitor only the CPUs in the given list. Multiple CPUs can be provided as a
comma-separated list with no spaces: 0,1. Ranges of CPUs are specified with -: 0-2.
Default is to monitor all CPUs.

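The CPU list syntax above can be illustrated with a small Python sketch. This is not perf's own parser; the function name `parse_cpu_list` is hypothetical, and rejecting spaces or reversed ranges is an assumption based on the "no spaces" wording above.

```python
from typing import List


def parse_cpu_list(spec: str) -> List[int]:
    """Expand a CPU list such as "0,1" or "0-2" into sorted CPU numbers."""
    if " " in spec:
        # The list is documented as comma-separated with no spaces.
        raise ValueError("CPU list must not contain spaces: %r" % spec)
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            # A range such as "0-2" covers both endpoints.
            first, last = part.split("-", 1)
            low, high = int(first), int(last)
            if low > high:
                raise ValueError("reversed CPU range: %r" % part)
            cpus.update(range(low, high + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


if __name__ == "__main__":
    print(parse_cpu_list("0,1"))
    print(parse_cpu_list("0-2"))
```

Under these assumptions, a mixed list such as `0-2,5` expands to CPUs 0, 1, 2 and 5.
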
-d <seconds>::
--delay=<seconds>::
Number of seconds to delay between refreshes.

-e <event>::
--event=<event>::
Select the PMU event. Selection can be a symbolic event name
(use 'perf list' to list all events) or a raw PMU
event (eventsel+umask) in the form of rNNN where NNN is a
hexadecimal event descriptor.

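As an illustration of the rNNN form, the following Python sketch splits a raw event descriptor into its event select and unit mask. The byte layout used here (event select in the low byte, umask in the next byte) is an assumption that matches common x86 PMUs; this page does not specify it, and other architectures encode raw events differently. The function names are hypothetical.

```python
def decode_raw_event(spec: str) -> dict:
    """Split "rNNN" into eventsel and umask, assuming an x86-style layout."""
    if not spec.startswith("r"):
        raise ValueError("raw events start with 'r': %r" % spec)
    value = int(spec[1:], 16)  # NNN is hexadecimal
    return {
        "eventsel": value & 0xFF,      # assumed: bits 0-7
        "umask": (value >> 8) & 0xFF,  # assumed: bits 8-15
    }


def encode_raw_event(eventsel: int, umask: int) -> str:
    """Build an rNNN string from eventsel and umask (same assumed layout)."""
    return "r%x" % ((umask << 8) | eventsel)


if __name__ == "__main__":
    print(decode_raw_event("r1a8"))
    print(encode_raw_event(0xA8, 0x01))
```

With this layout, `r1a8` corresponds to event select 0xa8 with unit mask 0x01.
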
-E <entries>::
--entries=<entries>::
Display this many functions.

-f <count>::
--count-filter=<count>::
Only display functions with more events than this.

-g::
--group::
Put the counters into a counter group.

-F <freq>::
--freq=<freq>::
Profile at this frequency.

-i::
--inherit::
Child tasks do not inherit counters.

-k <path>::
--vmlinux=<path>::
Path to vmlinux. Required for annotation functionality.

-m <pages>::
--mmap-pages=<pages>::
Number of mmapped data pages.

-p <pid>::
--pid=<pid>::
Profile events on existing Process ID (comma separated list).

-t <tid>::
--tid=<tid>::
Profile events on existing thread ID (comma separated list).

-u <uid>::
--uid=<uid>::
Record events in threads owned by this user, given as a user name or a numeric uid.

-r <priority>::
--realtime=<priority>::
Collect data with this RT SCHED_FIFO priority.

-s <symbol>::
--sym-annotate=<symbol>::
Annotate this symbol.

-K::
--hide_kernel_symbols::
Hide kernel symbols.

-U::
--hide_user_symbols::
Hide user symbols.

-D::
--dump-symtab::
Dump the symbol table used for profiling.

-v::
--verbose::
Be more verbose (show counter open errors, etc).

-z::
--zero::
Zero history across display updates.

-s::
--sort::
Sort by key(s): pid, comm, dso, symbol, parent, srcline, weight, local_weight.

-n::
--show-nr-samples::
Show a column with the number of samples.

--show-total-period::
Show a column with the sum of periods.

--dsos::
Only consider symbols in these dsos.

--comms::
Only consider symbols in these comms.

--symbols::
Only consider these symbols.

-M::
--disassembler-style=::
Set disassembler style for objdump.

--source::
Interleave source code with assembly code. Enabled by default,
disable with --no-source.

--asm-raw::
Show raw instruction encoding of assembly instructions.

-G [type,min,order]::
--call-graph::
Display call chains using type, min percent threshold and order.
type can be either:
- flat: single column, linear exposure of call chains.
- graph: use a graph tree, displaying absolute overhead rates.
- fractal: like graph, but displays relative rates. Each branch of
  the tree is considered as a new profiled object.

order can be either:
- callee: callee based call graph.
- caller: inverted caller based call graph.

Default: fractal,0.5,callee.

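To make the `type,min,order` argument concrete, here is a minimal Python sketch that splits such a string and fills in the documented defaults. It is not perf's parser: the names `CallGraphSpec` and `parse_call_graph` are hypothetical, and the handling of omitted or empty fields is an assumption.

```python
from typing import NamedTuple

TYPES = ("flat", "graph", "fractal")
ORDERS = ("callee", "caller")


class CallGraphSpec(NamedTuple):
    kind: str = "fractal"       # documented default type
    min_percent: float = 0.5    # documented default threshold
    order: str = "callee"       # documented default order


def parse_call_graph(arg: str = "") -> CallGraphSpec:
    """Parse "type,min,order"; omitted fields keep their defaults (assumed)."""
    fields = arg.split(",") if arg else []
    if len(fields) > 3:
        raise ValueError("expected at most type,min,order: %r" % arg)
    spec = CallGraphSpec()
    if len(fields) >= 1 and fields[0]:
        if fields[0] not in TYPES:
            raise ValueError("unknown type: %r" % fields[0])
        spec = spec._replace(kind=fields[0])
    if len(fields) >= 2 and fields[1]:
        spec = spec._replace(min_percent=float(fields[1]))
    if len(fields) == 3 and fields[2]:
        if fields[2] not in ORDERS:
            raise ValueError("unknown order: %r" % fields[2])
        spec = spec._replace(order=fields[2])
    return spec


if __name__ == "__main__":
    print(parse_call_graph())
    print(parse_call_graph("graph,2,caller"))
```
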
--ignore-callees=<regex>::
Ignore callees of the function(s) matching the given regex.
This has the effect of collecting the callers of each such
function into one place in the call-graph tree.

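The effect of ignoring callees can be sketched in Python as truncating each call chain at the matching function. This is only an illustration, not perf's implementation: the function names in the sample chains are hypothetical, Python's `re` syntax differs from the regex dialect perf uses, and cutting at the outermost match when several frames match is an assumption.

```python
import re
from typing import List


def ignore_callees(chain: List[str], pattern: str) -> List[str]:
    """Drop every frame below the outermost function matching pattern.

    chain lists the sampled function first and the outermost caller last.
    """
    regex = re.compile(pattern)
    # Walk from the outermost caller inward; the first match found is the
    # outermost matching frame, and it becomes the new start of the chain.
    for index in range(len(chain) - 1, -1, -1):
        if regex.search(chain[index]):
            return chain[index:]
    return chain  # no match: the chain is left untouched


if __name__ == "__main__":
    chains = [
        ["mark", "mark_children", "collect", "alloc", "parse"],
        ["sweep", "collect", "alloc", "load"],
    ]
    for chain in chains:
        print(ignore_callees(chain, "^collect$"))
```

Both sample chains now start at `collect`, so the call graph groups them under that one function and the tree below it shows who called it.
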
--percent-limit::
Do not show entries which have an overhead under that percent.
(Default: 0).

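A percent limit of this kind can be sketched in Python as a simple filter over per-symbol sample counts. The function name `apply_percent_limit` and the input format are hypothetical; overhead is assumed here to be a symbol's share of all samples.

```python
from typing import Dict


def apply_percent_limit(samples: Dict[str, int], limit: float = 0.0) -> Dict[str, float]:
    """Return overhead percentages, hiding entries under the given limit."""
    total = sum(samples.values())
    if total == 0:
        return {}
    shown = {}
    for symbol, count in samples.items():
        overhead = 100.0 * count / total
        if overhead >= limit:  # entries under the limit are not shown
            shown[symbol] = overhead
    return shown


if __name__ == "__main__":
    counts = {"parse": 90, "alloc": 9, "free": 1}
    print(apply_percent_limit(counts, limit=5))
```

With the default limit of 0, every entry is kept.
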
INTERACTIVE PROMPTING KEYS
--------------------------

[d]::
Display refresh delay.

[e]::
Number of entries to display.

[E]::
Event to display when multiple counters are active.

[f]::
Profile display filter (>= hit count).

[F]::
Annotation display filter (>= % of total).

[s]::
Annotate symbol.

[S]::
Stop annotation, return to full profile display.

[z]::
Toggle event count zeroing across display updates.

[qQ]::
Quit.

Pressing any unmapped key displays a menu, and prompts for input.

SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-list[1]