IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Now hpp formats are linked using perf_hpp_list_node when hierarchy is
enabled. Use this info to print entries with multiple sort keys in a
single hierarchy properly.
For example, the below example shows using 4 sort keys with 2 levels.
$ perf report --hierarchy -s '{prev_pid,prev_comm},{next_pid,next_comm}' \
--percent-limit 1 -i perf.data.sched
...
# Overhead prev_pid+prev_comm / next_pid+next_comm
# ........... .......................................
#
22.36% 0 swapper/0
9.48% 17773 transmission-gt
5.25% 109 kworker/0:1H
1.53% 6524 Xephyr
21.39% 17773 transmission-gt
9.52% 0 swapper/0
9.04% 0 swapper/2
1.78% 0 swapper/3
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457361308-514-6-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The dynamic entries are right-aligned unlike other entries since it
usually has numeric value. But for the hierarchy mode, left alignment
is more appropriate IMHO. Also trim spaces on the left so that we can
easily identify the hierarchy.
Before:
$ perf report --hierarchy -i perf.data.kmem -s gfp_flags,ptr,bytes_req --stdio -g none
...
#
# Overhead gfp_flags / ptr / bytes_req
# .............. .................................................................................................
#
91.67% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
37.50% 0xffff8803f7669400
37.50% 448
8.33% 0xffff8803f766be00
8.33% 96
4.17% 0xffff8800d156dc00
4.17% 704
After:
# Overhead gfp_flags / ptr / bytes_req
# .............. ....................................
#
91.67% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
37.50% 0xffff8803f7669400
37.50% 448
8.33% 0xffff8803f766be00
8.33% 96
4.17% 0xffff8800d156dc00
4.17% 704
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456512767-1164-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When dynamic entries are used in the hierarchy mode with multiple
events, the output might not be aligned properly. In the hierarchy
mode, the each sort column is indented using total number of sort keys.
So it keeps track of number of sort keys when adding them. However
a dynamic sort key can be added more than once when multiple events have
same field names. This results in unnecessarily long indentation in the
output.
For example perf kmem records following events:
$ perf evlist --trace-fields -i perf.data.kmem
kmem:kmalloc: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags
kmem:kmalloc_node: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags,node
kmem:kfree: trace_fields: call_site,ptr
kmem:kmem_cache_alloc: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags
kmem:kmem_cache_alloc_node: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags,node
kmem:kmem_cache_free: trace_fields: call_site,ptr
kmem:mm_page_alloc: trace_fields: page,order,gfp_flags,migratetype
kmem:mm_page_free: trace_fields: page,order
As you can see, many field names shared between kmem events. So adding
'ptr' dynamic sort key alone will set nr_sort_keys to 6. And this adds
many unnecessary spaces between columns.
Before:
$ perf report -i perf.data.kmem --hierarchy -s ptr -g none --stdio
...
# Overhead ptr
# ....................... ...................................
#
99.89% 0xffff8803ffb79720
0.06% 0xffff8803d228a000
0.03% 0xffff8803f7678f00
0.00% 0xffff880401dc5280
0.00% 0xffff880406172380
0.00% 0xffff8803ffac3a00
0.00% 0xffff8803ffac1600
After:
# Overhead ptr
# ........ ....................
#
99.89% 0xffff8803ffb79720
0.06% 0xffff8803d228a000
0.03% 0xffff8803f7678f00
0.00% 0xffff880401dc5280
0.00% 0xffff880406172380
0.00% 0xffff8803ffac3a00
0.00% 0xffff8803ffac1600
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456512767-1164-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the hierarchy mode is used, some entries might be omiited due to a
percent limit or filter. In this case the output hierarchy is different
than other entries. Add an informative message to users about this.
For example, when 4% of percent limit is applied:
Before:
# Overhead Command / Shared Object / Symbol
# .............. ..........................................
#
49.09% swapper
48.67% [kernel.vmlinux]
34.42% [k] intel_idle
11.51% firefox
8.87% libpthread-2.22.so
6.60% [.] __GI___libc_recvmsg
10.49% gnome-shell
4.74% libc-2.22.so
10.08% Xorg
6.11% libc-2.22.so
5.27% [.] __memcpy_sse2_unaligned
6.15% perf
Note that, gnome-shell/libc has no symbols and perf has no dso/symbols.
With that patch the output will look like below:
After:
# Overhead Command / Shared Object / Symbol
# .............. ..........................................
#
49.09% swapper
48.67% [kernel.vmlinux]
34.42% [k] intel_idle
11.51% firefox
8.87% libpthread-2.22.so
6.60% [.] __GI___libc_recvmsg
10.49% gnome-shell
4.74% libc-2.22.so
no entry >= 4.00%
10.08% Xorg
6.11% libc-2.22.so
5.27% [.] __memcpy_sse2_unaligned
6.15% perf
no entry >= 4.00%
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456488800-28124-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were doing column alignment in the format function for each cell,
returning a string padded with spaces so that when the next column is
printed the cursor is at its column alignment.
This ends up needlessly printing trailing spaces, do it at the format
iterator, that is where we know if it is needed, i.e. if there is more
columns to be printed.
This eliminates the need for triming lines when doing a dump using 'P'
in the TUI browser and also produces far saner results with things like
piping 'perf report' to 'less'.
Right now only the formatters for sym->name and the 'locked' column
(perf mem report), that are the ones that end up at the end of lines
in the default 'perf report', 'perf top' and 'perf mem report' tools,
the others will be done in a subsequent patch.
In the end the 'width' parameter for the formatters now mean, in
'printf' terms, the 'precision', where before it was the field 'width'.
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-s7iwl2gj23w92l6tibnrcqzr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When there's only a single callchain, perf doesn't print its percentage
in front of the symbols. This is because it assumes that the percentage
is same as parents. But if a percent limit is applied, it's possible
that there are actually a couple of child nodes but only one of them is
shown. In this case it should display the percent to prevent
misunderstanding of its percentage is same as the parent's.
For example, let's see the following callchain.
$ perf report -s comm --percent-limit 0.01 --stdio
...
9.95% swapper
|
|--7.57%--intel_idle
| cpuidle_enter_state
| cpuidle_enter
| call_cpuidle
| cpu_startup_entry
| |
| |--4.89%--start_secondary
| |
| --2.68%--rest_init
| start_kernel
| x86_64_start_reservations
| x86_64_start_kernel
|
|--0.15%--__schedule
| |
| |--0.13%--schedule
| | schedule_preempt_disable
| | cpu_startup_entry
| | |
| | |--0.09%--start_secondary
| | |
| | --0.04%--rest_init
| | start_kernel
| | x86_64_start_reservations
| | x86_64_start_kernel
| |
| --0.01%--schedule_preempt_disabled
| cpu_startup_entry
...
Current code omits the percent if 'intel_idle' becomes the only node
when percent limit is set to 0.5%, its percent is not 9.95% but users
will assume it incorrectly.
Before:
$ perf report --percent-limit 0.5 --stdio
...
9.95% swapper
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--4.89%--start_secondary
|
--2.68%--rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel
After:
$ perf report --percent-limit 0.5 --stdio
...
9.95% swapper
|
--7.57%--intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--4.89%--start_secondary
|
--2.68%--rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-7-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now -g/--call-graph option supports how to display callchain values.
Possible values are 'percent', 'period' and 'count'. The percent is
same as before and it's the default behavior. The period displays the
raw period value rather than the percentage. The count displays the
number of occurrences.
$ perf report --no-children --stdio -g percent
...
39.93% swapper [kernel.vmlinux] [k] intel_idel
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--28.63%-- start_secondary
|
--11.30%-- rest_init
$ perf report --no-children --show-total-period --stdio -g period
...
39.93% 13018705 swapper [kernel.vmlinux] [k] intel_idel
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--9334403-- start_secondary
|
--3684302-- rest_init
$ perf report --no-children --show-nr-samples --stdio -g count
...
39.93% 80 swapper [kernel.vmlinux] [k] intel_idel
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--57-- start_secondary
|
--23-- rest_init
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1447047946-1691-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add new call chain option (-g) 'folded' to print callchains in a line.
The callchains are separated by semicolons, and preceded by (absolute)
percent values and a space.
For example, the following 20 lines can be printed in 3 lines with the
folded output mode:
$ perf report -g flat --no-children | grep -v ^# | head -20
60.48% swapper [kernel.vmlinux] [k] intel_idle
54.60%
intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
start_secondary
5.88%
intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel
$ perf report -g folded --no-children | grep -v ^# | head -3
60.48% swapper [kernel.vmlinux] [k] intel_idle
54.60% intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
5.88% intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
This mode is supported only for --stdio now and intended to be used by
some scripts like in FlameGraphs[1]. Support for other UI might be
added later.
[1] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html
Requested-and-Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1447047946-1691-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On stdio, there's a problem that it shows invalid values for
callchains in cumulated hist entries. It's because it only cares
about the self period. But with --children behavior, we always add
callchain info to the cumulated entries so it should use the value in
that case.
Before:
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ................
#
61.22% 0.32% swapper [kernel.kallsyms] [k] cpu_idle
|
--- cpu_idle
|
|--16530.76%-- start_secondary
|
|--2758.70%-- rest_init
| start_kernel
| x86_64_start_reservations
| x86_64_start_kernel
--6837850969203030.00%-- [...]
After:
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ................
#
61.22% 0.32% swapper [kernel.kallsyms] [k] cpu_idle
|
--- cpu_idle
|
|--85.70%-- start_secondary
|
--14.30%-- rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-24-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Until now the hpp and sort functions do similar jobs different ways.
Since the sort functions converted/wrapped to hpp formats it can do
the job in a uniform way.
The perf_hpp__sort_list has a list of hpp formats to sort entries and
the perf_hpp__list has a list of hpp formats to print output result.
To have a backward compatibility, it automatically adds 'overhead'
field in front of sort list. And then all of fields in sort list
added to the output list (if it's not already there).
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/n/tip-7g3h86woz2sckg3h1lj42ygj@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>