IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Starting with binutils 2.28, aarch64 objdump adds comments to the
disassembly output to show the alternative names of a condition code
[1].
It is assumed that commas in objdump comments could occur in other
arches now or in the future, so this fix is arch-independent.
The fix could have been done with arm64 specific jump__parse and
jump__scnprintf functions, but the jump__scnprintf instruction would
have to have its comment character be a literal, since the scnprintf
functions cannot receive a struct arch easily.
This inconvenience also applies to the generic jump__scnprintf, which is
why we add a raw_comment pointer to struct ins_operands, so the __parse
function assigns it to be re-used by its corresponding __scnprintf
function.
Example differences in 'perf annotate --stdio2' output on an aarch64
perf.data file:
BEFORE: → b.cs ffff200008133d1c <unwind_frame+0x18c> // b.hs, dffff7ecc47b
AFTER : ↓ b.cs 18c
BEFORE: → b.cc ffff200008d8d9cc <get_alloc_profile+0x31c> // b.lo, b.ul, dffff727295b
AFTER : ↓ b.cc 31c
The branch target labels 18c and 31c also now appear in the output:
BEFORE: add x26, x29, #0x80
AFTER : 18c: add x26, x29, #0x80
BEFORE: add x21, x21, #0x8
AFTER : 31c: add x21, x21, #0x8
The Fixes: tag below is added so stable branches will get the update; it
doesn't necessarily mean that commit was broken at the time, rather it
didn't withstand the aarch64 objdump update.
Tested no difference in output for sample x86_64, power arch perf.data files.
[1] https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=bb7eff5206e4795ac79c177a80fe9f4630aaf730
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Fixes: b13bbeee5e ("perf annotate: Fix branch instruction with multiple operands")
Link: http://lkml.kernel.org/r/20180827125340.a2f7e291901d17cea05daba4@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add --percent-type option to set annotation percent type from following
choices:
global-period, local-period, global-hits, local-hits
Examples:
$ perf annotate --percent-type period-local --stdio | head -1
Percent | Source code ... es, percent: local period)
$ perf annotate --percent-type hits-local --stdio | head -1
Percent | Source code ... es, percent: local hits)
$ perf annotate --percent-type hits-global --stdio | head -1
Percent | Source code ... es, percent: global hits)
$ perf annotate --percent-type period-global --stdio | head -1
Percent | Source code ... es, percent: global period)
The local/global keywords set if the percentage is computed in the scope
of the function (local) or the whole data (global).
The period/hits keywords set the base the percentage is computed on -
the samples period or the number of samples (hits).
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20180804130521.11408-20-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We have more current function tto get the title for annotation,
which is hists__scnprintf_title. They both have same output as
far as the annotation's header line goes.
They differ in counting of the nr_samples, hists__scnprintf_title
provides more accurate number based on the setup of the
symbol_conf.filter_relative variable.
Plus it also displays any uid/thread/dso/socket filters/zooms
if there are set any, which annotation__scnprintf_samples_period
does not.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20180804130521.11408-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When we have multiple groups in an evlist, say:
$ perf stat -e '{cycles,instructions},{cache-references,cache-misses}' sleep 1
Performance counter stats for 'sleep 1':
343,134 cycles:u
249,292 instructions:u # 0.73 insn per cycle
15,556 cache-references:u
8,925 cache-misses:u # 57.373 % of all cache refs
1.000957550 seconds time elapsed
$
Then the perf_evsel instances for the two group leaders ("cycles" and
"cache-references") will have evsel->nr_members set to 2, while all the
evsel->evlist->nr_entries will be set to 4, so we can't use
evsel->evlist->nr_entries everywhere, as event groups need to be taken
into account.
But this probably requires us to audit at least the forced-group code,
where we want all of the events to be in a "group", to see them all in
the screen, one column for each, even knowing that they were not
necessarily scheduled to count at the same time by the kernel perf
subsystem.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-2g0vwqnc49wl4ttjk8dvpgcc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Because they all really check if we can access data structures/visual
constructs where a "jump" instruction targets code in the same function,
i.e. things like:
__pthread_mutex_lock /usr/lib64/libpthread-2.26.so
1.95 │ mov __pthread_force_elision,%ecx
│ ┌──test %ecx,%ecx
0.07 │ ├──je 60
│ │ test $0x300,%esi
│ │↓ jne 60
│ │ or $0x100,%esi
│ │ mov %esi,0x10(%rdi)
│ 42:│ mov %esi,%edx
│ │ lea 0x16(%r8),%rsi
│ │ mov %r8,%rdi
│ │ and $0x80,%edx
│ │ add $0x8,%rsp
│ │→ jmpq __lll_lock_elision
│ │ nop
0.29 │ 60:└─→and $0x80,%esi
0.07 │ mov $0x1,%edi
0.29 │ xor %eax,%eax
2.53 │ lock cmpxchg %edi,(%r8)
And not things like that "jmpq __lll_lock_elision", that instead should behave
like a "call" instruction and "jump" to the disassembly of "___lll_lock_elision".
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-3cwx39u3h66dfw9xjrlt7ca2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Things like this in _cpp_lex_token (gcc's cc1 program):
cpp_named_operator2name@@Base+0xa72
Point to a place that is after the cpp_named_operator2name boundaries,
i.e. in the ELF symbol table for cc1 cpp_named_operator2name is marked
as being 32-bytes long, but it in fact is much larger than that, so we
seem to need a symbols__find() routine that looks for >= current->start
and < next_symbol->start, possibly just for C++ objects?
For now lets just make some progress by marking jumps to outside the
current function as call like.
Actual navigation will come next, with further understanding of how the
symbol searching and disassembly should be done.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-aiys0a0bsgm3e00hbi6fg7yy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just like we have in the histograms browser used as the main screen for
'perf top --tui' and 'perf report --tui', to print the current
annotation to a file with a named composed by the symbol name and the
".annotation" suffix.
Here is one example of pressing 'A' on 'perf top' to live annotate a
kernel function and then press 'P' to dump that annotation, the
resulting file:
# cat _raw_spin_lock_irqsave.annotation
_raw_spin_lock_irqsave() /proc/kcore
Event: cycles:ppp
7.14 nop
21.43 push %rbx
7.14 pushfq
pop %rax
nop
mov %rax,%rbx
cli
nop
xor %eax,%eax
mov $0x1,%edx
64.29 lock cmpxchg %edx,(%rdi)
test %eax,%eax
↓ jne 2b
mov %rbx,%rax
pop %rbx
← retq
2b: mov %eax,%esi
→ callq queued_spin_lock_slowpath
mov %rbx,%rax
pop %rbx
← retq
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-zzmnrwugb5vtk7bvg0rbx150@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>