IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Move the config customization into a site-local file
tdc_config_local.py, so that updates of the tdc test
software does not require hand-editing of the config.
This patch includes a template for the site-local
customization file.
In addition, this makes it easy to revert to a stock
tdc environment for testing the test framework and/or
the core tests.
Also it makes it harder for any custom config to be
submitted back to the kernel tdc.
Signed-off-by: Brenda J. Butler <bjb@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ignore .pyc files, "python compiled" files, that get created
when a python script is run. They should never be committed.
Signed-off-by: Brenda J. Butler <bjb@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As part of documentation, supply some very simple test cases
to illustrate how test cases work. One test case shows
commands in the setup, command, verify and teardown stages.
Other test cases show how to have a working test case that
does not have commands in the setup, verify and/or teardown
stages.
Specifically, the command lists for setup and teardown can
be empty. And the verify command must have a command, but
it can be /bin/true. The regex must have a string, we
recommend a single space, and the count of matches must be
zero if you do not want to use the match feature of verify.
Verify will always look for a return code of success (0)
so we give /bin/true when we do not want to make a check
there.
Also, update the documentation for testcases to be more
specific in the cases of:
- accepting non-success return codes in setup and
teardown stages
- how to write the test when no setup, teardown
and/or verify are desired.
To run the example test cases:
$ sudo -E ./tdc.py -f creating-testcases/example.json -l
1f: (example) simple test to test framework
2f: (example) simple test, no need for verify
3f: (example) simple test, no need for setup or teardown (or verify)
$ sudo -E ./tdc.py -f creating-testcases/example.json
Test 1f: simple test to test framework
Test 2f: simple test, no need for verify
Test 3f: simple test, no need for setup or teardown (or verify)
All test results:
1..3
ok 1 1f simple test to test framework
ok 2 2f simple test, no need for verify
ok 3 3f simple test, no need for setup or teardown (or verify)
$
Signed-off-by: Brenda J. Butler <bjb@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change run_tests to print individual test results to console by default.
Introduce "summary" option to print individual test results to a file
/tmp/test_name and just print the summary to the console.
This change is necessary to support use-cases where test machines get
rebooted once tests are run and the console log should contain the full
results.
In the following example, individual test results with "summary=1" option
are written to /tmp/kcmp_test
make --silent TARGETS=kcmp kselftest
TAP version 13
selftests: kcmp_test
========================================
pid1: 30126 pid2: 30127 FD: 2 FILES: 2 VM: 1 FS: 2 SIGHAND: 2 IO:
0 SYSVSEM: 0 INV: -1
PASS: 0 returned as expected
PASS: 0 returned as expected
FAIL: 0 expected but -1 returned (Invalid argument)
Pass 2 Fail 1 Xfail 0 Xpass 0 Skip 0 Error 0
1..3
Bail out!
Pass 2 Fail 1 Xfail 0 Xpass 0 Skip 0 Error 0
1..3
Pass 0 Fail 0 Xfail 0 Xpass 0 Skip 0 Error 0
1..0
ok 1..1 selftests: kcmp_test [PASS]
make --silent TARGETS=kcmp summary=1 kselftest
TAP version 13
selftests: kcmp_test
========================================
ok 1..1 selftests: kcmp_test [PASS]
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
For some unknown reason there is no entry in tracefs's syscalls for
kcmp, i.e. no tracefs/events/syscalls/sys_{enter,exit}_kcmp, so we need
to provide a data dictionary for the fields.
To beautify the 'type' argument we automatically generate a strarray
from tools/include/uapi/kcmp.h, the idx1 and idx2 args, nowadays used
only if type == KCMP_FILE, are masked for all the other types and a
lookup is made for the thread and fd to show the path, if possible,
getting it from the probe:vfs_getname if in place or from procfs, races
allowing.
A system wide strace like tracing session, with callchains shows just
one user so far in this fedora 25 machine:
# perf trace --max-stack 5 -e kcmp
<SNIP>
1502914.400 ( 0.001 ms): systemd/1 kcmp(pid1: 1 (systemd), pid2: 1 (systemd), type: FILE, idx1: 271<socket:[4723475]>, idx2: 25<socket:[4788686]>) = -1 ENOSYS Function not implemented
syscall (/usr/lib64/libc-2.25.so)
same_fd (/usr/lib/systemd/libsystemd-shared-233.so)
service_add_fd_store (/usr/lib/systemd/systemd)
service_notify_message.lto_priv.127 (/usr/lib/systemd/systemd)
1502914.407 ( 0.001 ms): systemd/1 kcmp(pid1: 1 (systemd), pid2: 1 (systemd), type: FILE, idx1: 270<socket:[4726396]>, idx2: 25<socket:[4788686]>) = -1 ENOSYS Function not implemented
syscall (/usr/lib64/libc-2.25.so)
same_fd (/usr/lib/systemd/libsystemd-shared-233.so)
service_add_fd_store (/usr/lib/systemd/systemd)
service_notify_message.lto_priv.127 (/usr/lib/systemd/systemd)
<SNIP>
The backtraces seem to agree this is really kcmp(), but this system
doesn't have the sys_kcmp(), bummer:
# uname -a
Linux jouet 4.14.0-rc3+ #1 SMP Fri Oct 13 12:21:12 -03 2017 x86_64 x86_64 x86_64 GNU/Linux
# grep kcmp /proc/kallsyms
ffffffffb60b8890 W sys_kcmp
$ grep CONFIG_CHECKPOINT_RESTORE ../build/v4.14.0-rc3+/.config
# CONFIG_CHECKPOINT_RESTORE is not set
$
So systemd uses it, good fedora kernel config has it:
$ grep CONFIG_CHECKPOINT_RESTORE /boot/config-4.13.4-200.fc26.x86_64
CONFIG_CHECKPOINT_RESTORE=y
[acme@jouet linux]$
/me goes to rebuild a kernel...
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-gz5fca968viw8m7hryjqvrln@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
One that given a pid and a fd, will try to get the path for that fd.
Will be used in the upcoming kcmp's KCMP_FILE beautifier.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-7ketygp2dvs9h13wuakfncws@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We will use it to generate tables for beautifying kcmp's 'type' arg.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-r35zr79invmpinfe1zu57cas@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Milian Wolff found a problem he described in [1] and that for him would
get fixed:
"Note how most of the large offset values are now gone. Most notably, we
get proper srcline resolution for the random.h and complex headers."
Then Namhyung found the root cause:
"I looked into it and found a bug handling cumulative (children)
entries. For children entries that have no self period, the al->addr (so
he->ip) ends up having an doubly-mapped address.
It seems to be there from the beginning but only affects entries that
have no srclines - finding srcline itself is done using a different
address but it will show the invalid address if no srcline was found. I
think we should fix the commit c7405d85d7a3 ("perf tools: Update cpumode
for each cumulative entry")."
[1] https://lkml.kernel.org/r/20171018185350.14893-7-milian.wolff@kdab.com
Reported-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: kernel-team@lge.com
Fixes: c7405d85d7a3 ("perf tools: Update cpumode for each cumulative entry")
Link: https://lkml.kernel.org/r/20171020051533.GA2746@sejong
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes: 31c2611b66e0 ("selftests: Introduce a new test case to tc testsuite")
Fixes: 76b903ee198d ("selftests: Introduce tc testsuite")
Signed-off-by: Brenda J. Butler <bjb@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We should support this because it would allow easily to collect metrics
for different threads in applications.
Original patch from posted by Jin Yao in here [1].
1. Current output, for example:
root@skl:/tmp# perf stat --per-thread -p 21623
^C
Performance counter stats for process id '21623':
vmstat-21623 0.517479 task-clock (msec) # 0.000 CPUs utilized
vmstat-21623 1 context-switches
vmstat-21623 0 cpu-migrations
vmstat-21623 0 page-faults
vmstat-21623 461,306 cycles
vmstat-21623 630,724 instructions
vmstat-21623 136,265 branches
vmstat-21623 2,520 branch-misses
1.444020756 seconds time elapsed
root@skl:/tmp# perf stat --per-thread --metrics ipc -p 21623
^C
Performance counter stats for process id '21623':
vmstat-21623 631,185 inst_retired.any
vmstat-21623 605,893 cpu_clk_unhalted.thread
1.415679293 seconds time elapsed
2. With this patch, the result would be:
root@skl:/tmp# perf stat --per-thread -p 21623
^C
Performance counter stats for process id '21623':
vmstat-21623 0.533759 task-clock (msec) # 0.000 CPUs utilized
vmstat-21623 1 context-switches # 0.002 M/sec
vmstat-21623 0 cpu-migrations # 0.000 K/sec
vmstat-21623 0 page-faults # 0.000 K/sec
vmstat-21623 473,896 cycles # 0.888 GHz
vmstat-21623 631,072 instructions # 1.33 insn per cycle
vmstat-21623 136,307 branches # 255.372 M/sec
vmstat-21623 2,524 branch-misses # 1.85% of all branches
1.544862861 seconds time elapsed
root@skl:/tmp# perf stat --per-thread --metrics ipc -p 21623
^C
Performance counter stats for process id '21623':
vmstat-21623 1,259,104 inst_retired.any # 1.2 IPC
vmstat-21623 1,056,756 cpu_clk_unhalted.thread
2.040954502 seconds time elapsed
[1] https://marc.info/?l=linux-kernel&m=150777054620511&w=2
Originally-from: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-tr8ntktxmy4qc5769ajg5u6c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the shadow stats scale computation to the
perf_stat__update_shadow_stats() function, so it's centralized and we
don't forget to do it. It also saves few lines of code.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-htg7mmyxv6pcrf57qyo6msid@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding perf_data_file__write function to provide single file write
operation.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-c3f9p4xzykr845ktqcek6p4t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add struct perf_data_file to represent a single file within a perf_data
struct.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-c3f9p4xzykr845ktqcek6p4t@git.kernel.org
[ Fixup recent changes in 'perf script --per-event-dump' ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename struct perf_data_file to perf_data, because we will add the
possibility to have multiple files under perf.data, so the 'perf_data'
name fits better.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-39wn4d77phel3dgkzo3lyan0@git.kernel.org
[ Fixup recent changes in 'perf script --per-event-dump' ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Several conflicts here.
NFP driver bug fix adding nfp_netdev_is_nfp_repr() check to
nfp_fl_output() needed some adjustments because the code block is in
an else block now.
Parallel additions to net/pkt_cls.h and net/sch_generic.h
A bug fix in __tcp_retransmit_skb() conflicted with some of
the rbtree changes in net-next.
The tc action RCU callback fixes in 'net' had some overlap with some
of the recent tcf_block reworking.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix route leak in xfrm_bundle_create().
2) In mac80211, validate user rate mask before configuring it. From
Johannes Berg.
3) Properly enforce memory limits in fair queueing code, from Toke
Hoiland-Jorgensen.
4) Fix lockdep splat in inet_csk_route_req(), from Eric Dumazet.
5) Fix TSO header allocation and management in mvpp2 driver, from Yan
Markman.
6) Don't take socket lock in BH handler in strparser code, from Tom
Herbert.
7) Don't show sockets from other namespaces in AF_UNIX code, from
Andrei Vagin.
8) Fix double free in error path of tap_open(), from Girish Moodalbail.
9) Fix TX map failure path in igb and ixgbe, from Jean-Philippe Brucker
and Alexander Duyck.
10) Fix DCB mode programming in stmmac driver, from Jose Abreu.
11) Fix err_count handling in various tunnels (ipip, ip6_gre). From Xin
Long.
12) Properly align SKB head before building SKB in tuntap, from Jason
Wang.
13) Avoid matching qdiscs with a zero handle during lookups, from Cong
Wang.
14) Fix various endianness bugs in sctp, from Xin Long.
15) Fix tc filter callback races and add selftests which trigger the
problem, from Cong Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
selftests: Introduce a new test case to tc testsuite
selftests: Introduce a new script to generate tc batch file
net_sched: fix call_rcu() race on act_sample module removal
net_sched: add rtnl assertion to tcf_exts_destroy()
net_sched: use tcf_queue_work() in tcindex filter
net_sched: use tcf_queue_work() in rsvp filter
net_sched: use tcf_queue_work() in route filter
net_sched: use tcf_queue_work() in u32 filter
net_sched: use tcf_queue_work() in matchall filter
net_sched: use tcf_queue_work() in fw filter
net_sched: use tcf_queue_work() in flower filter
net_sched: use tcf_queue_work() in flow filter
net_sched: use tcf_queue_work() in cgroup filter
net_sched: use tcf_queue_work() in bpf filter
net_sched: use tcf_queue_work() in basic filter
net_sched: introduce a workqueue for RCU callbacks of tc filter
sctp: fix some type cast warnings introduced since very beginning
sctp: fix a type cast warnings that causes a_rwnd gets the wrong value
sctp: fix some type cast warnings introduced by transport rhashtable
sctp: fix some type cast warnings introduced by stream reconf
...
In this patchset, we fixed a tc bug. This patch adds the test case
that reproduces the bug. To run this test case, user should specify
an existing NIC device:
# sudo ./tdc.py -d enp4s0f0
This test case belongs to category "flower". If user doesn't specify
a NIC device, the test cases belong to "flower" will not be run.
In this test case, we create 1M filters and all filters share the same
action. When destroying all filters, kernel should not panic. It takes
about 18s to run it.
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
# ./tdc_batch.py -h
usage: tdc_batch.py [-h] [-n NUMBER] [-o] [-s] [-p] device file
TC batch file generator
positional arguments:
device device name
file batch file name
optional arguments:
-h, --help show this help message and exit
-n NUMBER, --number NUMBER
how many lines in batch file
-o, --skip_sw skip_sw (offload), by default skip_hw
-s, --share_action all filters share the same action
-p, --prio all filters have different prio
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a completion file for bash. The completion function runs bpftool
when needed, making it smart enough to help users complete ids or tags
for eBPF programs and maps currently on the system.
Update Makefile to install completion file to
/usr/share/bash-completion/completions when running `make install`.
Emacs file mode and (at the end) Vim modeline have been added, to keep
the style in use for most existing bash completion files. In this, it
differs from tools/perf/perf-completion.sh, which seems to be the only
other completion file among the kernel sources repository. This is also
valid for indent style: 4-space indents, as in other completion files.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent additions to support multiple programs in cgroups impose
a strict requirement, "all yes is yes, any no is no". To enforce
this the infrastructure requires the 'no' return code, SK_DROP in
this case, to be 0.
To apply these rules to SK_SKB program types the sk_actions return
codes need to be adjusted.
This fix adds SK_PASS and makes 'SK_DROP = 0'. Finally, remove
SK_ABORTED to remove any chance that the API may allow aborted
program flows to be passed up the stack. This would be incorrect
behavior and allow programs to break existing policies.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have defined YY_USER_ACTION to keep trace of the column location
during events parsing, but we need to clean it up when we call REJECT.
When REJECT is called, the lexer shrinks the text and re-runs the
matching, so we need to address it in resuming the previous location
value to keep it correct for error display, like:
Before:
$ perf stat -e 'cpu/uops_executed.core,krava/' true
event syntax error: '..38;5;9:mi=01;05;37;41:su=48;5;196;38;5;15:sg=48;5;1\
1;38;5;16:ca=48;5;196;38;5;226:tw=48;5;10;38;5;16:ow=48;5;10;38;5;21:st=48;5;\
21;38;50
�'
\___ unknown term
After:
$ ./perf stat -e 'cpu/uops_executed.core,krava/' true
event syntax error: '..cuted.core,krava/'
\___ unknown term
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Andi Kleen <ak@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vug2hchlny30jfsfrumbym26@git.kernel.org
Link: http://lkml.kernel.org/r/20171009140944.GD28623@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We will use it to generate tables for beautifying prctl's 'option' arg
and some of the others eventually.
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-cg8mpmz4hk9nfih685emnbk9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce a new option to dump trace output to files named by the
monitored events and update perf-script documentation accordingly.
Shown below is output of perf script command with the newly introduced
option.
$ perf record -e cycles -e cs -ag -- sleep 1
$ perf script --per-event-dump
$ ls
perf.data.cycles.dump perf.data.cs.dump
Without per-event-dump support, drawing flamegraphs for different events
would require post processing to separate events. You can monitor only
one event at a time if you want to get flamegraphs for different events.
Using this option, you can get the trace output files named by the
monitored events, and could draw flamegraphs according to the event's
name.
Based-on-a-patch-by: yuzhoujian <yuzhoujian@didichuxing.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1508921599-10832-3-git-send-email-yuzhoujian@didichuxing.com
Link: http://lkml.kernel.org/n/tip-8ngzsjdhgiovkupl3r5yy570@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When we started using it for stats and did it not just in
builtin-stat.c, but also for builtin-script.c, then it stopped being a
tool private area, so introduce a new pointer for these stats and leave
->priv to its original purpose.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: yuzhoujian <yuzhoujian@didichuxing.com>
Fixes: cfc8874a4859 ("perf script: Process cpu/threads maps")
Link: http://lkml.kernel.org/n/tip-jtpzx3rjqo78snmmsdzwb2eb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Another case where we a1a587073ccd ("perf script: Use fprintf like
printing uniformly") forgot to redirect output to the FILE descriptor,
fix this too.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: yuzhoujian <yuzhoujian@didichuxing.com>
Link: http://lkml.kernel.org/n/tip-jmwx4pgfezw98ezfoj9t957s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We have facilities for reporting unexpected, unlikely errors, use them.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: yuzhoujian <yuzhoujian@didichuxing.com>
Link: http://lkml.kernel.org/n/tip-c7j22xfjf1j773g7ufp607q0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In a1a587073ccd ("perf script: Use fprintf like printing uniformly")
there were a few cases that were missed, fix it.
Reported-by: yuzhoujian <yuzhoujian@didichuxing.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-sq9hvfk5mkjdqzlpyiq7jkos@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
One possible cause of failure for `bpftool {prog|map} pin * file FILE`
is the FILE not being in an eBPF virtual file system (bpffs). In this
case, make bpftool attempt to mount bpffs on the parent directory of the
FILE. Then, if this operation is successful, try again to pin the
object.
The code for mnt_bpffs() is a copy of function bpf_mnt_fs() from
iproute2 package (under lib/bpf.c, taken at commit 4b73d52f8a81), with
modifications regarding handling of error messages.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Perf top is often crashing at very random locations on powerpc. After
investigating, I found the crash only happens when sample is of zero
length symbol. Powerpc kernel has many such symbols which does not
contain length details in vmlinux binary and thus start and end
addresses of such symbols are same.
Structure
struct sym_hist {
u64 nr_samples;
u64 period;
struct sym_hist_entry addr[0];
};
has last member 'addr[]' of size zero. 'addr[]' is an array of addresses
that belongs to one symbol (function). If function consist of 100
instructions, 'addr' points to an array of 100 'struct sym_hist_entry'
elements. For zero length symbol, it points to the *empty* array, i.e.
no members in the array and thus offset 0 is also invalid for such
array.
static int __symbol__inc_addr_samples(...)
{
...
offset = addr - sym->start;
h = annotation__histogram(notes, evidx);
h->nr_samples++;
h->addr[offset].nr_samples++;
h->period += sample->period;
h->addr[offset].period += sample->period;
...
}
Here, when 'addr' is same as 'sym->start', 'offset' becomes 0, which is
valid for normal symbols but *invalid* for zero length symbols and thus
updating h->addr[offset] causes memory corruption.
Fix this by adding one dummy element for zero length symbols.
Link: https://lkml.org/lkml/2016/10/10/148
Fixes: edee44be5919 ("perf annotate: Don't throw error for zero length symbols")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/1508854806-10542-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that we have caches in place to speed up the process of finding
inlined frames and srcline information repeatedly, we can enable this
useful option by default.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171019113836.5548-6-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On one hand this ensures that the memory is properly freed when the DSO
gets freed. On the other hand this significantly speeds up the
processing of the callchain nodes when lots of srclines are requested.
For one of my data files e.g.:
Before:
Performance counter stats for 'perf report -s srcline -g srcline --stdio':
52496.495043 task-clock (msec) # 0.999 CPUs utilized
634 context-switches # 0.012 K/sec
2 cpu-migrations # 0.000 K/sec
191,561 page-faults # 0.004 M/sec
165,074,498,235 cycles # 3.144 GHz
334,170,832,408 instructions # 2.02 insn per cycle
90,220,029,745 branches # 1718.591 M/sec
654,525,177 branch-misses # 0.73% of all branches
52.533273822 seconds time elapsedProcessed 236605 events and lost 40 chunks!
After:
Performance counter stats for 'perf report -s srcline -g srcline --stdio':
22606.323706 task-clock (msec) # 1.000 CPUs utilized
31 context-switches # 0.001 K/sec
0 cpu-migrations # 0.000 K/sec
185,471 page-faults # 0.008 M/sec
71,188,113,681 cycles # 3.149 GHz
133,204,943,083 instructions # 1.87 insn per cycle
34,886,384,979 branches # 1543.214 M/sec
278,214,495 branch-misses # 0.80% of all branches
22.609857253 seconds time elapsed
Note that the difference is only this large when `--inline` is not
passed. In such situations, we would use the inliner cache and thus do
not run this code path that often.
I think that this cache should actually be used in other places, too.
When looking at the valgrind leak report for perf report, we see tons of
srclines being leaked, most notably from calls to
hist_entry__get_srcline. The problem is that get_srcline has many
different formatting options (show_sym, show_addr, potentially even
unwind_inlines when calling __get_srcline directly). As such, the
srcline cannot easily be cached for all calls, or we'd have to add
caches for all formatting combinations (6 so far). An alternative would
be to remove the formatting options and handle that on a different level
- i.e. print the sym/addr on demand wherever we actually output
something. And the unwind_inlines could be moved into a separate
function that does not return the srcline.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171019113836.5548-4-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When no inlined frames could be found for a given address, we did not
store this information anywhere. That means we potentially do the costly
inliner lookup repeatedly for cases where we know it can never succeed.
This patch makes dso__parse_addr_inlines always return a valid
inline_node. It will be empty when no inliners are found. This enables
us to cache the empty list in the DSO, thereby improving the performance
when many addresses fail to find the inliners.
For my trivial example, the performance impact is already quite
significant:
Before:
~~~~~
Performance counter stats for 'perf report --stdio --inline -g srcline -s srcline' (5 runs):
594.804032 task-clock (msec) # 0.998 CPUs utilized ( +- 0.07% )
53 context-switches # 0.089 K/sec ( +- 4.09% )
0 cpu-migrations # 0.000 K/sec ( +-100.00% )
5,687 page-faults # 0.010 M/sec ( +- 0.02% )
2,300,918,213 cycles # 3.868 GHz ( +- 0.09% )
4,395,839,080 instructions # 1.91 insn per cycle ( +- 0.00% )
939,177,205 branches # 1578.969 M/sec ( +- 0.00% )
11,824,633 branch-misses # 1.26% of all branches ( +- 0.10% )
0.596246531 seconds time elapsed ( +- 0.07% )
~~~~~
After:
~~~~~
Performance counter stats for 'perf report --stdio --inline -g srcline -s srcline' (5 runs):
113.111405 task-clock (msec) # 0.990 CPUs utilized ( +- 0.89% )
29 context-switches # 0.255 K/sec ( +- 54.25% )
0 cpu-migrations # 0.000 K/sec
5,380 page-faults # 0.048 M/sec ( +- 0.01% )
432,378,779 cycles # 3.823 GHz ( +- 0.75% )
670,057,633 instructions # 1.55 insn per cycle ( +- 0.01% )
141,001,247 branches # 1246.570 M/sec ( +- 0.01% )
2,346,845 branch-misses # 1.66% of all branches ( +- 0.19% )
0.114222393 seconds time elapsed ( +- 1.19% )
~~~~~
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171019113836.5548-3-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some of the code paths I introduced before returned too early without
running the code to handle a node's branch count. By refactoring
match_chain to only have one exit point, this can be remedied.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1707691.qaJ269GSZW@agathebauer
Link: http://lkml.kernel.org/r/20171018185350.14893-2-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Please do not apply this to mainline directly, instead please re-run the
coccinelle script shown below and apply its output.
For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't harmful, and changing them results in
churn.
However, for some features, the read/write distinction is critical to
correct operation. To distinguish these cases, separate read/write
accessors must be used. This patch migrates (most) remaining
ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
coccinelle script:
----
// Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
// WRITE_ONCE()
// $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch
virtual patch
@ depends on patch @
expression E1, E2;
@@
- ACCESS_ONCE(E1) = E2
+ WRITE_ONCE(E1, E2)
@ depends on patch @
expression E;
@@
- ACCESS_ONCE(E)
+ READ_ONCE(E)
----
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davem@davemloft.net
Cc: linux-arch@vger.kernel.org
Cc: mpe@ellerman.id.au
Cc: shuah@kernel.org
Cc: snitzer@redhat.com
Cc: thor.thayer@linux.intel.com
Cc: tj@kernel.org
Cc: viro@zeniv.linux.org.uk
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't currently harmful.
However, for some features it is necessary to instrument reads and
writes separately, which is not possible with ACCESS_ONCE(). This
distinction is critical to correct operation.
The bulk of the kernel code can be transformed via Coccinelle to use
{READ,WRITE}_ONCE(), though this only modifies users of ACCESS_ONCE(),
and not the implementation itself. As such, it has the potential to
break homebrew ACCESS_ONCE() macros seen in some user code in the kernel
tree (e.g. the virtio code, as fixed in commit ea9156fb3b71d9f7).
To avoid fragility if/when that transformation occurs, this patch
reworks the definitions of {READ,WRITE}_ONCE() in the rcutorture formal
tests, and removes the unused ACCESS_ONCE() helper. There should be no
functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davem@davemloft.net
Cc: linux-arch@vger.kernel.org
Cc: mpe@ellerman.id.au
Cc: shuah@kernel.org
Cc: snitzer@redhat.com
Cc: thor.thayer@linux.intel.com
Cc: tj@kernel.org
Cc: viro@zeniv.linux.org.uk
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1508792849-3115-13-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't currently harmful.
However, for some features it is necessary to instrument reads and
writes separately, which is not possible with ACCESS_ONCE(). This
distinction is critical to correct operation.
The bulk of the kernel code can be transformed via Coccinelle to use
{READ,WRITE}_ONCE(), though this only modifies users of ACCESS_ONCE(),
and not the implementation itself. As such, it has the potential to
break homebrew ACCESS_ONCE() macros seen in some user code in the kernel
tree (e.g. the virtio code, as fixed in commit ea9156fb3b71d9f7).
To avoid fragility if/when that transformation occurs, and to align with
the preferred usage of {READ,WRITE}_ONCE(), this patch updates the DSCR
selftest code to use READ_ONCE() rather than ACCESS_ONCE(). There should
be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davem@davemloft.net
Cc: linux-arch@vger.kernel.org
Cc: snitzer@redhat.com
Cc: thor.thayer@linux.intel.com
Cc: tj@kernel.org
Cc: viro@zeniv.linux.org.uk
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1508792849-3115-11-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The fake symbols we create for inlined frames will represent different
functions but can use the symbol start address. This leads to issues
when different inline branches all lead to the same function.
Before:
~~~~~
$ perf report -s sym -i perf.inlining.data --inline --stdio -g function
...
--38.86%--_start
__libc_start_main
main
|
--37.57%--std::norm<double> (inlined)
std::_Norm_helper<true>::_S_do_it<double> (inlined)
|
--36.36%--std::abs<double> (inlined)
std::__complex_abs (inlined)
|
--12.24%--std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>::operator() (inlined)
std::__detail::__mod<unsigned long, 2147483647ul, 16807ul, 0ul> (inlined)
std::__detail::_Mod<unsigned long, 2147483647ul, 16807ul, 0ul, true, true>::__calc (inlined)
~~~~~
Note that this backtrace representation is completely bogus.
Complex abs does not call the linear congruential engine! It
is just a side-effect of a longer inlined stack being appended
to a shorter, different inlined stack, both of which originate
in the same function (main).
This patch fixes the issue:
~~~~~
$ perf report -s sym -i perf.inlining.data --inline --stdio -g function
...
--38.86%--_start
__libc_start_main
main
|
|--35.59%--std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > (inlined)
| std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > (inlined)
| |
| --34.37%--std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() (inlined)
| std::generate_canonical<double, 53ul, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > (inlined)
| |
| --12.24%--std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>::operator() (inlined)
| std::__detail::__mod<unsigned long, 2147483647ul, 16807ul, 0ul> (inlined)
| std::__detail::_Mod<unsigned long, 2147483647ul, 16807ul, 0ul, true, true>::__calc (inlined)
|
--1.99%--std::norm<double> (inlined)
std::_Norm_helper<true>::_S_do_it<double> (inlined)
std::abs<double> (inlined)
std::__complex_abs (inlined)
~~~~~
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Yao Jin <yao.jin@linux.intel.com>
Link: http://lkml.kernel.org/r/20171009203310.17362-10-milian.wolff@kdab.com
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
[ Fix up conflict with c1fbc0cf81f1 ("perf callchain: Compare dsos (as well) for CCKEY_FUNCTION"), remove unneeded hunk ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The original patch that introduced inline frame output in the various
browsers used this suffix already. The new centralized approach that
uses fake symbols for inlined frames was missing this approach so far.
Instead of changing the symbol name itself, we only print the suffix
where needed. This allows us to efficiently lookup the symbol for a
given name without first having to append the suffix before the lookup.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yao Jin <yao.jin@linux.intel.com>
Link: http://lkml.kernel.org/r/20171009203310.17362-8-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a callchain entry has no srcline available, we ended up comparing
the instruction pointer. I consider this to be not too useful. Rather, I
think we should group the entries by function name, which this patch
adds. For people who want to split the data on the IP boundary, using
`-g address` is the correct choice.
Before:
~~~~~
100.00% 38.86% [.] main
|
|--61.14%--main inlining.cpp:14
| std::norm<double> complex:664
| std::_Norm_helper<true>::_S_do_it<double> complex:654
| std::abs<double> complex:597
| std::__complex_abs complex:589
| |
| |--56.03%--hypot
| | |
| | |--8.45%--__hypot_finite
| | |
| | |--7.62%--__hypot_finite
| | |
| | |--2.29%--__hypot_finite
| | |
| | |--2.24%--__hypot_finite
| | |
| | |--2.06%--__hypot_finite
| | |
| | |--1.81%--__hypot_finite
...
~~~~~
After:
~~~~~
100.00% 38.86% [.] main
|
|--61.14%--main inlining.cpp:14
| std::norm<double> complex:664
| std::_Norm_helper<true>::_S_do_it<double> complex:654
| std::abs<double> complex:597
| std::__complex_abs complex:589
| |
| |--60.29%--hypot
| | |
| | --56.03%--__hypot_finite
| |
| --0.85%--cabs
~~~~~
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yao Jin <yao.jin@linux.intel.com>
Link: http://lkml.kernel.org/r/20171009203310.17362-7-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The inline_node structs are maintained by the new dso->inlines tree.
This in turn keeps ownership of the fake symbols and srcline string
representing an inline frame.
This tree is sorted by address to allow quick lookups. All other entries
of the symbol beside the function name are unused for inline frames. The
advantage of this approach is that all existing users of the callchain
API can now transparently display inlined frames without having to patch
their code.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yao Jin <yao.jin@linux.intel.com>
Link: http://lkml.kernel.org/r/20171009203310.17362-6-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is a preparation for the creation of real callchain entries for
inlined frames. The rest of the perf code uses the srcline string. As
such, using that also for the srcline API allows us to simplify some of
the upcoming code. Most notably, it will allow us to cache the srcline
for a given inline node and reuse it for different callchain entries.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yao Jin <yao.jin@linux.intel.com>
Link: http://lkml.kernel.org/r/20171009203310.17362-5-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is a requirement to create real callchain entries for inlined
frames.
Since the list of inlines usually contains the target symbol too, i.e.
the location where the frames get inlined to, we alias that symbol and
reuse it as-is is. This ensures that other dependent functionality keeps
working, most notably annotation of the target frames.
For all other entries in the inline_list, a fake symbol is created.
These are marked by new 'inlined' member which is set to true. Only
those symbols are managed by the inline_list and get freed when the
inline_list is deleted from within inline_node__delete.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yao Jin <yao.jin@linux.intel.com>
Link: http://lkml.kernel.org/r/20171009203310.17362-4-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>