IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
That got broken by d3a72fd8187b ("perf report: Fix indentation of
dynamic entries in hierarchy"), by using the evlist in setup_sorting()
without checking if it is NULL, as done in some 'perf test' entries:
$ find tools/ -name "*.c" | xargs grep 'setup_sorting(NULL);'
tools/perf/tests/hists_output.c: setup_sorting(NULL);
tools/perf/tests/hists_output.c: setup_sorting(NULL);
tools/perf/tests/hists_output.c: setup_sorting(NULL);
tools/perf/tests/hists_output.c: setup_sorting(NULL);
tools/perf/tests/hists_output.c: setup_sorting(NULL);
tools/perf/tests/hists_cumulate.c: setup_sorting(NULL);
tools/perf/tests/hists_cumulate.c: setup_sorting(NULL);
tools/perf/tests/hists_cumulate.c: setup_sorting(NULL);
tools/perf/tests/hists_cumulate.c: setup_sorting(NULL);
$
Fix it.
Before:
[root@jouet ~]# perf test
<SNIP>
15: Test matching and linking multiple hists : FAILED!
16: Try 'import perf' in python, checking link problems : Ok
17: Test breakpoint overflow signal handler : Ok
18: Test breakpoint overflow sampling : Ok
19: Test number of exit event of a simple workload : Ok
20: Test software clock events have valid period values : Ok
21: Test object code reading : Ok
22: Test sample parsing : Ok
23: Test using a dummy software event to keep tracking : Ok
24: Test parsing with no sample_id_all bit set : Ok
25: Test filtering hist entries : FAILED!
26: Test mmap thread lookup : Ok
27: Test thread mg sharing : Ok
28: Test output sorting of hist entries : FAILED!
29: Test cumulation of child hist entries : FAILED!
<SNIP>
After the patch the above failed tests complete successfully.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: d3a72fd8187b ("perf report: Fix indentation of dynamic entries in hierarchy")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a long value is read on 32 bit machines for 64 bit output, the
parsing needs to change "%lu" into "%llu", as the value is read
natively.
Unfortunately, if "%llu" is already there, the code will add another "l"
to it and fail to parse it properly.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20160209204237.337024613@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Had a bug where on error of parsing __print_array() where the fields are
freed after they were allocated, but since they were not set to NULL,
the freeing of the arg also tried to free the already freed fields
causing a double free.
Fix process_hex() while at it.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20160209204237.188327674@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When rounding to microseconds, if the timestamp subsecond is between
.999999500 and .999999999, it is rounded to .1000000, when it should
instead increment the second counter due to the overflow.
For example, if the timestamp is 1234.999999501 instead of seeing:
1235.000000
we see:
1234.1000000
Signed-off-by: Chaos.Chen <rainboy1215@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20160209204236.824426460@goodmis.org
[ fixed incrementing "secs" instead of decrementing it ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'command_line' variable is free'd twice if db_export__branch_types()
fails. To avoid this, defer the free'ing of 'command_line' to after this
call so that the error return path will just free 'command_line' once.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Javi Merino <javi.merino@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1456875980-25606-1-git-send-email-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The "man gcc" says .i extension represents the file is C source code
that should not be preprocessed. Here, .s should be used.
For clarification,
.c ---(preprocess)---> .i
.S ---(preprocess)---> .s
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Aaro Koskinen <aaro.koskinen@nokia.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Lukas Wunner <lukas@wunner.de>
Link: http://lkml.kernel.org/r/1454263140-19670-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now support CSV output for metrics. With the new output callbacks this
is relatively straight forward by creating new callbacks.
This allows to easily plot metrics from CSV files.
The new line callback needs to know the number of fields to skip them
correctly
Example output before:
% perf stat -x, true
0.200687,,task-clock,200687,100.00
0,,context-switches,200687,100.00
0,,cpu-migrations,200687,100.00
40,,page-faults,200687,100.00
730871,,cycles,203601,100.00
551056,,stalled-cycles-frontend,203601,100.00
<not supported>,,stalled-cycles-backend,0,100.00
385523,,instructions,203601,100.00
78028,,branches,203601,100.00
3946,,branch-misses,203601,100.00
After:
% perf stat -x, true
.502457,,task-clock,502457,100.00,0.485,CPUs utilized
0,,context-switches,502457,100.00,0.000,K/sec
0,,cpu-migrations,502457,100.00,0.000,K/sec
45,,page-faults,502457,100.00,0.090,M/sec
644692,,cycles,509102,100.00,1.283,GHz
423470,,stalled-cycles-frontend,509102,100.00,65.69,frontend cycles idle
<not supported>,,stalled-cycles-backend,0,100.00,,,,
492701,,instructions,509102,100.00,0.76,insn per cycle
,,,,,0.86,stalled cycles per insn
97767,,branches,509102,100.00,194.578,M/sec
4788,,branch-misses,509102,100.00,4.90,of all branches
or easier readable
$ perf stat -x, -o x.csv true
$ column -s, -t x.csv
0.490635 task-clock 490635 100.00 0.489 CPUs utilized
0 context-switches 490635 100.00 0.000 K/sec
0 cpu-migrations 490635 100.00 0.000 K/sec
45 page-faults 490635 100.00 0.092 M/sec
629080 cycles 497698 100.00 1.282 GHz
409498 stalled-cycles-frontend 497698 100.00 65.09 frontend cycles idle
<not supported> stalled-cycles-backend 0 100.00
491424 instructions 497698 100.00 0.78 insn per cycle
0.83 stalled cycles per insn
97278 branches 497698 100.00 198.270 M/sec
4569 branch-misses 497698 100.00 4.70 of all branches
Two new fields are added: metric value and metric name.
v2: Split out function argument changes
v3: Reenable metrics for real.
v4: Fix wrong hunk from refactoring.
v5: Remove extra "noise" printing (Jiri), but add it to the not counted case.
Print empty metrics for not counted.
v6: Avoid outputting metric on empty format.
v7: Print metric at the end
v8: Remove extra run, ena fields
v9: Avoid extra new line for unsupported counters
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/1456785386-19481-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_evlist__mmap_ex() can fail without setting errno (for example, fail
in condition checking. In this case all syscall is success).
If this happen, record__open() incorrectly returns 0. Force setting rc
is a quick way to avoid this problem, or we have to follow all possible
code path in perf_evlist__mmap_ex() to make sure there's at least one
system call before returning an error.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-30-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move code for finalizing 'perf.data' to record__finish_output(). It will
be used by following commits to split output to multiple files.
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-23-git-send-email-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Create record__synthesize(). It can be used to create tracking events
for each perf.data after perf supporting splitting into multiple
outputs.
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-20-git-send-email-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commits in a BPF patchkit will extract kernel and module synthesizing
code into a separated function and call it multiple times. This patch
replace 'if (err < 0)' using WARN_ONCE, makes sure the error message
show one time.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-19-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
After babeltrace commit 5cec03e402aa ("ir: copy variants and sequences
when setting a field path"), 'perf data convert' gets incorrect result
if there's bpf output data. For example:
# perf data convert --to-ctf ./out.ctf
# babeltrace ./out.ctf
[10:44:31.186045346] (+?.?????????) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810E7DD1, perf_tid = 23819, perf_pid = 23819, perf_id = 518, raw_len = 3, raw_data = [ [0] = 0xC028E32F, [1] = 0x815D0100, [2] = 0x1000000 ] }
[10:44:31.286101003] (+0.100055657) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF8105B609, perf_tid = 23819, perf_pid = 23819, perf_id = 518, raw_len = 3, raw_data = [ [0] = 0x35D9F1EB, [1] = 0x15D81, [2] = 0x2 ] }
The expected result of the first sample should be:
raw_data = [ [0] = 0x2FE328C0, [1] = 0x15D81, [2] = 0x1 ] }
however, 'perf data convert' output big endian value to resuling CTF
file.
The reason is a internal change (or a bug?) of babeltrace.
Before this patch, at the first add_bpf_output_values(), byte order of
all integer type is uncertain (is 0, neither 1234 (le) nor 4321 (be)).
It would be fixed by:
perf_evlist__deliver_sample
-> process_sample_event
-> ctf_stream
...
->bt_ctf_trace_add_stream_class
->bt_ctf_field_type_structure_set_byte_order
->bt_ctf_field_type_integer_set_byte_order
during creating the stream.
However, the babeltrace commit mentioned above duplicates types in
sequence to prevent potential conflict in following call stack and link
the newly allocated type into the 'raw_data' sequence:
perf_evlist__deliver_sample
-> process_sample_event
-> ctf_stream
...
-> bt_ctf_trace_add_stream_class
-> bt_ctf_stream_class_resolve_types
...
-> bt_ctf_field_type_sequence_copy
->bt_ctf_field_type_integer_copy
This happens before byte order setting, so only the newly allocated
type is initialized, the byte order of original type perf choose to
create the first raw_data is still uncertain.
Byte order in CTF output is not related to byte order in perf.data.
Setting it to anything other than BT_CTF_BYTE_ORDER_NATIVE solves this
problem (only BT_CTF_BYTE_ORDER_NATIVE needs to be fixed). To reduce
behavior changing, set byte order according to compiling options.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-10-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ingo reported regression on display format of big numbers, which is
missing separators (in default perf stat output).
triton:~/tip> perf stat -a sleep 1
...
127008602 cycles # 0.011 GHz
279538533 stalled-cycles-frontend # 220.09% frontend cycles idle
119213269 instructions # 0.94 insn per cycle
This is caused by recent change:
perf stat: Check existence of frontend/backed stalled cycles
that added call to pmu_have_event, that subsequently calls
perf_pmu__parse_scale, which has a bug in locale handling.
The lc string returned from setlocale, that we use to store old locale
value, may be allocated in static storage. Getting a dynamic copy to
make it survive another setlocale call.
$ perf stat ls
...
2,360,602 cycles # 3.080 GHz
2,703,090 instructions # 1.15 insn per cycle
546,031 branches # 712.511 M/sec
Committer note:
Since the patch introducing the regression didn't made to perf/core,
move it to just before where the regression was introduced, so that we
don't break bisection for this feature.
Reported-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20160303095348.GA24511@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
virtio_ring currently sends the device (usually a hypervisor)
physical addresses of its I/O buffers. This is okay when DMA
addresses and physical addresses are the same thing, but this isn't
always the case. For example, this never works on Xen guests, and
it is likely to fail if a physical "virtio" device ever ends up
behind an IOMMU or swiotlb.
The immediate use case for me is to enable virtio on Xen guests.
For that to work, we need vring to support DMA address translation
as well as a corresponding change to virtio_pci or to another
driver.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Load up the non volatile FPU and VMX regs and ensure that they are the
expected value in a signal handler
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Loop in assembly checking the registers with many threads.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Test that the non volatile floating point and Altivec registers get
correctly preserved across the fork() syscall.
fork() works nicely for this purpose, the registers should be the same for
both parent and child
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
[mpe: Add include guards to basic_asm.h, minor formatting]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
LTO can cause GCC to inline some functions which have attributes set.
The act of inlining the functions can lead to GCC forgetting about the
attributes which leads to incorrect tests.
Notable example being: __attribute__((__target__("no-vsx")))
LTO can also interact strangely with custom assembly functions and cause
tests to intermittently fail.
Both these cases are hard to detect and require manual inspection of
binaries which is unlikely to happen for all tests. Furthermore, LTO
optimisations are not necessary for selftests and correctness is
paramount and as such it is best to disable LTO.
LTO can be enabled on a per test basis.
A pseries_le_defconfig kernel on a POWER8 was used to determine that the
same subset of selftests pass and fail with and without -flto in the
common Makefile.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Gcc helpfully points out that we're accessing past the end of the gprs
array:
tm-signal-msr-resv.c: In function 'signal_usr1':
tm-signal-msr-resv.c:43:37: error: array subscript is above array bounds [-Werror=array-bounds]
ucp->uc_mcontext.regs->gpr[PT_MSR] |= (7ULL);
We haven't noticed previously because -flto was hiding it somehow.
The code is confused, PT_MSR isn't a gpr, instead it's in
uc_regs->gregs, so fix it.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We want the fixes in here, and others are sending us pull requests based
on this kernel tree.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently there's a single function that is used to display a record's
data in human readable format. That's pevent_print_event().
Unfortunately, this gives little room for adding other output within the
line without updating that function call.
I've decided to split that function into 3 parts.
pevent_print_event_task() which prints the task comm, pid and the CPU
pevent_print_event_time() which outputs the record's timestamp
pevent_print_event_data() which outputs the rest of the event data.
pevent_print_event() now simply calls these three functions.
To save time from doing the search for event from the record's type, I
created a new helper function called pevent_find_event_by_record(),
which returns the record's event, and this event has to be passed to the
above functions.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20160229090128.43a56704@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Format fields of a syscall have the first variable '__syscall_nr' or
'nr' that mean the syscall number. But it isn't relevant here so drop
it.
'nr' among fields of syscall was renamed '__syscall_nr'. So add
exception handling to drop '__syscall_nr' and modify the comment for
this excpetion handling.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1456492465-5946-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The util/python-ext-sources file contains source files required to build
the python extension relative to $(srctree)/tools/perf,
Such a file path $(FILE).c is handed over to the python extension build
system, which builds the final object in the
$(PYTHON_EXTBUILD)/tmp/$(FILE).o path.
After the build is done all files from $(PYTHON_EXTBUILD)lib/ are
carried as the result binaries.
Above system fails when we add source file relative to ../lib, which we
do for:
../lib/bitmap.c
../lib/find_bit.c
../lib/hweight.c
../lib/rbtree.c
All above objects will be built like:
$(PYTHON_EXTBUILD)/tmp/../lib/bitmap.c
$(PYTHON_EXTBUILD)/tmp/../lib/find_bit.c
$(PYTHON_EXTBUILD)/tmp/../lib/hweight.c
$(PYTHON_EXTBUILD)/tmp/../lib/rbtree.c
which accidentally happens to be final library path:
$(PYTHON_EXTBUILD)/lib/
Changing setup.py to pass full paths of source files to Extension build
class and thus keep all built objects under $(PYTHON_EXTBUILD)tmp
directory.
Reported-by: Jeff Bastian <jbastian@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@vger.kernel.org # v4.2+
Link: http://lkml.kernel.org/r/20160227201350.GB28494@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This warning triggers if the .so library has already been linked:
triton:~/tip/tools/lib/lockdep> make
CC common.o
CC lockdep.o
CC rbtree.o
LD liblockdep-in.o
LD liblockdep.a
ln: failed to create symbolic link ‘liblockdep.so’: File exists
LD liblockdep.so.4.5.0-rc6
Overwrite the link.
Cc: Alfredo Alvarez Fernandez <alfredoalvarezfernandez@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add test for AA and 2 threaded ABBA locking.
Rename AA.c to ABA.c since it was implementing an ABA instead of a pure
AA. Now both cases are covered.
The expected output for AA.c is that the process blocks and lockdep
reports a deadlock.
ABBA_2threads.c differs from ABBA.c in that lockdep keeps separate chains
of held locks per task. This can lead to different behaviour regarding
lock detection. The expected output for this test is that the process
blocks and lockdep reports a circular locking dependency.
These tests found a lockdep bug - fixed by the next commit.
Signed-off-by: Alfredo Alvarez Fernandez <alfredoalvarezfernandez@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1455864533-7536-3-git-send-email-alfredoalvarezernandez@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This was added to the kernel code in <1658d35ead5d> ("list: Use
READ_ONCE() when testing for empty lists").
There's nothing special we need to do about it in userspace.
Signed-off-by: Alfredo Alvarez Fernandez <alfredoalvarezfernandez@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1455864533-7536-2-git-send-email-alfredoalvarezernandez@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The following upstream commit:
4a389810bc3c ("kernel/locking/lockdep.c: convert hash tables to hlists")
broke the tools/lib/lockdep build. Add trivial RCU wrappers to fix it.
These wrappers should probably be moved into their own header file.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Krinkin <krinkin.m.u@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This adds a host tool named objtool which has a "check" subcommand which
analyzes .o files to ensure the validity of stack metadata. It enforces
a set of rules on asm code and C inline assembly code so that stack
traces can be reliable.
For each function, it recursively follows all possible code paths and
validates the correct frame pointer state at each instruction.
It also follows code paths involving kernel special sections, like
.altinstructions, __jump_table, and __ex_table, which can add
alternative execution paths to a given instruction (or set of
instructions). Similarly, it knows how to follow switch statements, for
which gcc sometimes uses jump tables.
Here are some of the benefits of validating stack metadata:
a) More reliable stack traces for frame pointer enabled kernels
Frame pointers are used for debugging purposes. They allow runtime
code and debug tools to be able to walk the stack to determine the
chain of function call sites that led to the currently executing
code.
For some architectures, frame pointers are enabled by
CONFIG_FRAME_POINTER. For some other architectures they may be
required by the ABI (sometimes referred to as "backchain pointers").
For C code, gcc automatically generates instructions for setting up
frame pointers when the -fno-omit-frame-pointer option is used.
But for asm code, the frame setup instructions have to be written by
hand, which most people don't do. So the end result is that
CONFIG_FRAME_POINTER is honored for C code but not for most asm code.
For stack traces based on frame pointers to be reliable, all
functions which call other functions must first create a stack frame
and update the frame pointer. If a first function doesn't properly
create a stack frame before calling a second function, the *caller*
of the first function will be skipped on the stack trace.
For example, consider the following example backtrace with frame
pointers enabled:
[<ffffffff81812584>] dump_stack+0x4b/0x63
[<ffffffff812d6dc2>] cmdline_proc_show+0x12/0x30
[<ffffffff8127f568>] seq_read+0x108/0x3e0
[<ffffffff812cce62>] proc_reg_read+0x42/0x70
[<ffffffff81256197>] __vfs_read+0x37/0x100
[<ffffffff81256b16>] vfs_read+0x86/0x130
[<ffffffff81257898>] SyS_read+0x58/0xd0
[<ffffffff8181c1f2>] entry_SYSCALL_64_fastpath+0x12/0x76
It correctly shows that the caller of cmdline_proc_show() is
seq_read().
If we remove the frame pointer logic from cmdline_proc_show() by
replacing the frame pointer related instructions with nops, here's
what it looks like instead:
[<ffffffff81812584>] dump_stack+0x4b/0x63
[<ffffffff812d6dc2>] cmdline_proc_show+0x12/0x30
[<ffffffff812cce62>] proc_reg_read+0x42/0x70
[<ffffffff81256197>] __vfs_read+0x37/0x100
[<ffffffff81256b16>] vfs_read+0x86/0x130
[<ffffffff81257898>] SyS_read+0x58/0xd0
[<ffffffff8181c1f2>] entry_SYSCALL_64_fastpath+0x12/0x76
Notice that cmdline_proc_show()'s caller, seq_read(), has been
skipped. Instead the stack trace seems to show that
cmdline_proc_show() was called by proc_reg_read().
The benefit of "objtool check" here is that because it ensures that
*all* functions honor CONFIG_FRAME_POINTER, no functions will ever[*]
be skipped on a stack trace.
[*] unless an interrupt or exception has occurred at the very
beginning of a function before the stack frame has been created,
or at the very end of the function after the stack frame has been
destroyed. This is an inherent limitation of frame pointers.
b) 100% reliable stack traces for DWARF enabled kernels
This is not yet implemented. For more details about what is planned,
see tools/objtool/Documentation/stack-validation.txt.
c) Higher live patching compatibility rate
This is not yet implemented. For more details about what is planned,
see tools/objtool/Documentation/stack-validation.txt.
To achieve the validation, "objtool check" enforces the following rules:
1. Each callable function must be annotated as such with the ELF
function type. In asm code, this is typically done using the
ENTRY/ENDPROC macros. If objtool finds a return instruction
outside of a function, it flags an error since that usually indicates
callable code which should be annotated accordingly.
This rule is needed so that objtool can properly identify each
callable function in order to analyze its stack metadata.
2. Conversely, each section of code which is *not* callable should *not*
be annotated as an ELF function. The ENDPROC macro shouldn't be used
in this case.
This rule is needed so that objtool can ignore non-callable code.
Such code doesn't have to follow any of the other rules.
3. Each callable function which calls another function must have the
correct frame pointer logic, if required by CONFIG_FRAME_POINTER or
the architecture's back chain rules. This can by done in asm code
with the FRAME_BEGIN/FRAME_END macros.
This rule ensures that frame pointer based stack traces will work as
designed. If function A doesn't create a stack frame before calling
function B, the _caller_ of function A will be skipped on the stack
trace.
4. Dynamic jumps and jumps to undefined symbols are only allowed if:
a) the jump is part of a switch statement; or
b) the jump matches sibling call semantics and the frame pointer has
the same value it had on function entry.
This rule is needed so that objtool can reliably analyze all of a
function's code paths. If a function jumps to code in another file,
and it's not a sibling call, objtool has no way to follow the jump
because it only analyzes a single file at a time.
5. A callable function may not execute kernel entry/exit instructions.
The only code which needs such instructions is kernel entry code,
which shouldn't be be in callable functions anyway.
This rule is just a sanity check to ensure that callable functions
return normally.
It currently only supports x86_64. I tried to make the code generic so
that support for other architectures can hopefully be plugged in
relatively easily.
On my Lenovo laptop with a i7-4810MQ 4-core/8-thread CPU, building the
kernel with objtool checking every .o file adds about three seconds of
total build time. It hasn't been optimized for performance yet, so
there are probably some opportunities for better build performance.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/f3efb173de43bd067b060de73f856567c0fa1174.1456719558.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Without this patch BPF map configuration is not applied.
Command like this:
# ./perf trace --ev bpf-output/no-inherit,name=evt/ \
--ev ./test_bpf_trace.c/map:channel.event=evt/ \
usleep 100000
Load BPF files without error, but since map:channel.event=evt is not
applied, bpf-output event not work.
This patch allows 'perf trace' load and run BPF scripts.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-7-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_evlist__set_filter() tries to set filter to every evsel linked in
the evlist. However, since filters can only be applied to tracepoints,
checking type of evsel before calling perf_evsel__set_filter() would be
better.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before this patch each subcommand calls perf_config() by themself,
reading the default configuration together with subcommand specific
options. If a subcommand doesn't have it own options, it needs to call
'perf_config(perf_default_config, NULL)' to ensure .perfconfig is
loaded.
This patch brings perf_config(perf_default_config, NULL) to the very
start of main(), so subcommands don't need to do it.
After this patch, 'llvm.clang-path' works for 'perf trace'.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The column width of dynamic entries is updated when comparing hist
entries. However some unique entries can miss the chance to update. So
move the update to output resort stage to make sure every entry will get
called before display.
To do that, abuse ->sort callback to update the width when the third
argument is NULL. When resorting entries in normal path, it never be
NULL so it should be fine IMHO.
Before:
# Overhead ptr / bytes_req / gfp_flags
# .............. ..........................................
#
37.50% 0xffff8803f7669400
37.50% 448
37.50% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
10.42% 0xffff8803f766be00
8.33% 96
8.33% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
2.08% 512
2.08% GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP <-- here
After:
# Overhead ptr / bytes_req / gfp_flags
# .............. .....................................................
#
37.50% 0xffff8803f7669400
37.50% 448
37.50% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
10.42% 0xffff8803f766be00
8.33% 96
8.33% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
2.08% 512
2.08% GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP_NOMEMALLOC
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456512767-1164-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When dynamic sort key is used it might not show pretty printed output.
This is because the trace output was not set only for the first dynamic
sort key. During hierarchy_insert_entry() it missed to pass the
trace_output to dynamic entries. Also even if it did, only first entry
will have it. Subsequent entries might set it during collapsing stage
but it's not guaranteed.
Before:
$ perf report --hierarchy --stdio -s ptr,bytes_req,gfp_flags -g none
#
# Overhead ptr / bytes_req / gfp_flags
# .............. ..........................................
#
37.50% 0xffff8803f7669400
37.50% 448
37.50% 66080
10.42% 0xffff8803f766be00
8.33% 96
8.33% 66080
2.08% 512
2.08% 67280
After:
#
# Overhead ptr / bytes_req / gfp_flags
# .............. ..........................................
#
37.50% 0xffff8803f7669400
37.50% 448
37.50% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
10.42% 0xffff8803f766be00
8.33% 96
8.33% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
2.08% 512
2.08% GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456512767-1164-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The dynamic entries are right-aligned unlike other entries since it
usually has numeric value. But for the hierarchy mode, left alignment
is more appropriate IMHO. Also trim spaces on the left so that we can
easily identify the hierarchy.
Before:
$ perf report --hierarchy -i perf.data.kmem -s gfp_flags,ptr,bytes_req --stdio -g none
...
#
# Overhead gfp_flags / ptr / bytes_req
# .............. .................................................................................................
#
91.67% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
37.50% 0xffff8803f7669400
37.50% 448
8.33% 0xffff8803f766be00
8.33% 96
4.17% 0xffff8800d156dc00
4.17% 704
After:
# Overhead gfp_flags / ptr / bytes_req
# .............. ....................................
#
91.67% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
37.50% 0xffff8803f7669400
37.50% 448
8.33% 0xffff8803f766be00
8.33% 96
4.17% 0xffff8800d156dc00
4.17% 704
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456512767-1164-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When dynamic entries are used in the hierarchy mode with multiple
events, the output might not be aligned properly. In the hierarchy
mode, the each sort column is indented using total number of sort keys.
So it keeps track of number of sort keys when adding them. However
a dynamic sort key can be added more than once when multiple events have
same field names. This results in unnecessarily long indentation in the
output.
For example perf kmem records following events:
$ perf evlist --trace-fields -i perf.data.kmem
kmem:kmalloc: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags
kmem:kmalloc_node: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags,node
kmem:kfree: trace_fields: call_site,ptr
kmem:kmem_cache_alloc: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags
kmem:kmem_cache_alloc_node: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags,node
kmem:kmem_cache_free: trace_fields: call_site,ptr
kmem:mm_page_alloc: trace_fields: page,order,gfp_flags,migratetype
kmem:mm_page_free: trace_fields: page,order
As you can see, many field names shared between kmem events. So adding
'ptr' dynamic sort key alone will set nr_sort_keys to 6. And this adds
many unnecessary spaces between columns.
Before:
$ perf report -i perf.data.kmem --hierarchy -s ptr -g none --stdio
...
# Overhead ptr
# ....................... ...................................
#
99.89% 0xffff8803ffb79720
0.06% 0xffff8803d228a000
0.03% 0xffff8803f7678f00
0.00% 0xffff880401dc5280
0.00% 0xffff880406172380
0.00% 0xffff8803ffac3a00
0.00% 0xffff8803ffac1600
After:
# Overhead ptr
# ........ ....................
#
99.89% 0xffff8803ffb79720
0.06% 0xffff8803d228a000
0.03% 0xffff8803f7678f00
0.00% 0xffff880401dc5280
0.00% 0xffff880406172380
0.00% 0xffff8803ffac3a00
0.00% 0xffff8803ffac1600
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456512767-1164-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When hist_entry__cmp() and hist_entry__collapse() are called, they
should check if the dynamic entry is comparing matching hists only.
Otherwise it might access different hists resulting in incorrect output.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456512767-1164-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Like the stdio, it should show messages about omitted hierarchy
entries. Please refer the previous commit for more details.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456488800-28124-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Like the stdio, it should show messages about omitted hierarchy entries.
Please refer the previous commit for more details.
As it needs to check an entry is omitted or not multiple times, add the
has_no_entry field in the hist entry.
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456488800-28124-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The previous patch introduced __rb_hierarchy_next() function with
various move direction like HMD_FORCE_CHILD but missed to change using
it some place.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456488800-28124-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the hierarchy mode is used, some entries might be omiited due to a
percent limit or filter. In this case the output hierarchy is different
than other entries. Add an informative message to users about this.
For example, when 4% of percent limit is applied:
Before:
# Overhead Command / Shared Object / Symbol
# .............. ..........................................
#
49.09% swapper
48.67% [kernel.vmlinux]
34.42% [k] intel_idle
11.51% firefox
8.87% libpthread-2.22.so
6.60% [.] __GI___libc_recvmsg
10.49% gnome-shell
4.74% libc-2.22.so
10.08% Xorg
6.11% libc-2.22.so
5.27% [.] __memcpy_sse2_unaligned
6.15% perf
Note that, gnome-shell/libc has no symbols and perf has no dso/symbols.
With that patch the output will look like below:
After:
# Overhead Command / Shared Object / Symbol
# .............. ..........................................
#
49.09% swapper
48.67% [kernel.vmlinux]
34.42% [k] intel_idle
11.51% firefox
8.87% libpthread-2.22.so
6.60% [.] __GI___libc_recvmsg
10.49% gnome-shell
4.74% libc-2.22.so
no entry >= 4.00%
10.08% Xorg
6.11% libc-2.22.so
5.27% [.] __memcpy_sse2_unaligned
6.15% perf
no entry >= 4.00%
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456488800-28124-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The hists__overhead_width() is to calculate width occupied by the
overhead (and others) columns before the sort columns.
The hist_entry__has_hiearchy_children() is to check whether an entry has
lower entries (children) in the hierarchy to be shown in the output.
This means the children should not be filtered out and above the percent
limit.
These two functions will be used to show information when all children
of an entry is omitted by the percent limit (or filter).
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456488800-28124-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull libnvdimm fixes from Dan Williams:
- Two fixes for compatibility with the ACPI 6.1 specification.
Without these fixes multi-interface DIMMs will fail to be probed, and
address range scrub commands to find memory errors will give results
that the kernel will mis-interpret. For multi-interface DIMMs Linux
will accept either the original 6.0 implementation or 6.1.
For address range scrub we'll only support 6.1 since ACPI formalized
this DSM differently than the original example [1] implemented in
v4.2. The expectation is that production systems will only ever ship
the ACPI 6.1 address range scrub command definition.
- The wider async address range scrub work targeting 4.6 discovered
that the original synchronous implementation in 4.5 is not sizing its
return buffer correctly.
- Arnd caught that my recent fix to the size of the pfn_t flags missed
updating the flags variable used in the pmem driver.
- Toshi found that we mishandle the memremap() return value in
devm_memremap().
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nvdimm: use 'u64' for pfn flags
devm_memremap: Fix error value when memremap failed
nfit: update address range scrub commands to the acpi 6.1 format
libnvdimm, tools/testing/nvdimm: fix 'ars_status' output buffer sizing
nfit: fix multi-interface dimm handling, acpi6.1 compatibility