linux/tools/perf/ui/browsers/annotate.c

1093 lines
29 KiB
C

License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. 
- Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). 
- when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. 
Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 15:07:57 +01:00
// SPDX-License-Identifier: GPL-2.0
#include "../../util/util.h"
#include "../browser.h"
#include "../helpline.h"
#include "../ui.h"
#include "../util.h"
#include "../../util/annotate.h"
#include "../../util/hist.h"
#include "../../util/sort.h"
#include "../../util/symbol.h"
#include "../../util/evsel.h"
#include "../../util/config.h"
perf annotate: Check for fused instructions

Macro fusion merges two instructions into a single micro-op. Intel Core platforms perform this hardware optimization under limited circumstances. For example, CMP + JCC can be "fused" and executed/retired together. With sampling, this can result in the sample sometimes being on the JCC and sometimes on the CMP, so the two instructions of a fused pair can be considered together.

On Nehalem, fused instruction pairs: cmp/test + jcc.
On other new CPU: cmp/test/add/sub/and/inc/dec + jcc.

This patch adds an x86-specific function which checks if 2 instructions are in a "fused" pair. For non-x86 arch, the function is just NULL.

Changelog:

v4: Move the CPU model checking to symbol__disassemble and save the CPU family/model in arch structure. It avoids checking every time a jump arrow is printed.

v3: Add checking for Nehalem (CMP, TEST). For other newer Intel CPUs just check it by default (CMP, TEST, ADD, SUB, AND, INC, DEC).

v2: Remove the original weak function. Arnaldo points out that doing it as a weak function that will be overridden by the host arch doesn't work. So now it's implemented as an arch-specific function.

Committer fix:

Do not access evsel->evlist->env->cpuid, ->env can be null, introduce perf_evsel__env_cpuid(), just like perf_evsel__env_arch(), also used in this function call. The original patch was segfaulting 'perf top' + annotation. But this essentially disables this fused instructions augmentation in 'perf top', the right thing is to get the cpuid from the running kernel, left for a later patch tho.

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1499403995-19857-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-07 13:06:34 +08:00
#include "../../util/evlist.h"
#include <inttypes.h>
#include <pthread.h>
#include <linux/kernel.h>
#include <linux/string.h>
#include <sys/ttydefaults.h>
struct disasm_line_samples {
double percent;
struct sym_hist_entry he;
};
static struct annotation_options annotate_browser__opts = {
.use_offset = true,
.jump_arrows = true,
};
struct arch;
struct annotate_browser {
struct ui_browser b;
struct rb_root entries;
struct rb_node *curr_hot;
struct annotation_line *selection;
struct arch *arch;
int nr_asm_entries;
int nr_entries;
bool searching_backwards;
u8 addr_width;
u8 jumps_width;
u8 target_width;
u8 min_addr_width;
u8 max_addr_width;
char search_bf[128];
};
static inline struct annotation *browser__annotation(struct ui_browser *browser)
{
struct map_symbol *ms = browser->priv;
return symbol__annotation(ms->sym);
}
static bool disasm_line__filter(struct ui_browser *browser, void *entry)
{
struct annotation *notes = browser__annotation(browser);
if (notes->options->hide_src_code) {
struct annotation_line *al = list_entry(entry, struct annotation_line, node);
return al->offset == -1;
}
return false;
}
static int ui_browser__jumps_percent_color(struct ui_browser *browser, int nr, bool current)
{
struct annotation *notes = browser__annotation(browser);
if (current && (!browser->use_navkeypressed || browser->navkeypressed))
return HE_COLORSET_SELECTED;
if (nr == notes->max_jump_sources)
return HE_COLORSET_TOP;
if (nr > 1)
return HE_COLORSET_MEDIUM;
return HE_COLORSET_NORMAL;
}
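The tiering in ui_browser__jumps_percent_color() can be read in isolation as a small priority chain. The following is a minimal sketch of that chain, not the perf implementation: the enum values are hypothetical stand-ins for the HE_COLORSET_* constants, and the `selected` flag is assumed to already combine `current` with the navkeypressed state.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the HE_COLORSET_* values used above. */
enum sketch_color {
	SKETCH_COLOR_NORMAL,
	SKETCH_COLOR_MEDIUM,
	SKETCH_COLOR_TOP,
	SKETCH_COLOR_SELECTED,
};

/*
 * Pick a color for the jump-source count of one line: the selected
 * row wins, then the line with the most jump sources, then any line
 * reached by more than one jump, otherwise the default color.
 */
static enum sketch_color sketch_jumps_color(int nr, int max_jump_sources,
					    bool selected)
{
	if (selected)
		return SKETCH_COLOR_SELECTED;
	if (nr == max_jump_sources)
		return SKETCH_COLOR_TOP;
	if (nr > 1)
		return SKETCH_COLOR_MEDIUM;
	return SKETCH_COLOR_NORMAL;
}
```

Because the checks run in order, a line that is both selected and the most-targeted one is drawn with the selection color.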
static int ui_browser__set_jumps_percent_color(struct ui_browser *browser, int nr, bool current)
{
int color = ui_browser__jumps_percent_color(browser, nr, current);
return ui_browser__set_color(browser, color);
}
static void disasm_line__write(struct disasm_line *dl, struct ui_browser *browser,
char *bf, size_t size)
{
struct annotation *notes = browser__annotation(browser);
if (dl->ins.ops && dl->ins.ops->scnprintf) {
if (ins__is_jump(&dl->ins)) {
bool fwd = dl->ops.target.offset > dl->al.offset;
ui_browser__write_graph(browser, fwd ? SLSMG_DARROW_CHAR :
SLSMG_UARROW_CHAR);
SLsmg_write_char(' ');
} else if (ins__is_call(&dl->ins)) {
ui_browser__write_graph(browser, SLSMG_RARROW_CHAR);
SLsmg_write_char(' ');
} else if (ins__is_ret(&dl->ins)) {
ui_browser__write_graph(browser, SLSMG_LARROW_CHAR);
SLsmg_write_char(' ');
} else {
ui_browser__write_nstring(browser, " ", 2);
}
} else {
ui_browser__write_nstring(browser, " ", 2);
}
disasm_line__scnprintf(dl, bf, size, !notes->options->use_offset);
}
static void annotate_browser__write(struct ui_browser *browser, void *entry, int row)
{
struct annotate_browser *ab = container_of(browser, struct annotate_browser, b);
struct annotation *notes = browser__annotation(browser);
struct annotation_line *al = list_entry(entry, struct annotation_line, node);
bool current_entry = ui_browser__is_current_entry(browser, row);
bool change_color = (!notes->options->hide_src_code &&
(!current_entry || (browser->use_navkeypressed &&
!browser->navkeypressed)));
int width = browser->width, printed;
int i, pcnt_width = annotation__pcnt_width(notes),
cycles_width = annotation__cycles_width(notes);
double percent_max = 0.0;
char bf[256];
bool show_title = false;
for (i = 0; i < notes->nr_events; i++) {
if (al->samples[i].percent > percent_max)
percent_max = al->samples[i].percent;
}
if ((row == 0) && (al->offset == -1 || percent_max == 0.0)) {
if (notes->have_cycles) {
if (al->ipc == 0.0 && al->cycles == 0)
show_title = true;
} else
show_title = true;
}
if (al->offset != -1 && percent_max != 0.0) {
for (i = 0; i < notes->nr_events; i++) {
ui_browser__set_percent_color(browser,
al->samples[i].percent,
current_entry);
if (notes->options->show_total_period) {
ui_browser__printf(browser, "%11" PRIu64 " ",
al->samples[i].he.period);
} else if (notes->options->show_nr_samples) {
ui_browser__printf(browser, "%6" PRIu64 " ",
al->samples[i].he.nr_samples);
} else {
ui_browser__printf(browser, "%6.2f ",
al->samples[i].percent);
}
}
} else {
ui_browser__set_percent_color(browser, 0, current_entry);
if (!show_title)
ui_browser__write_nstring(browser, " ", pcnt_width);
else {
ui_browser__printf(browser, "%*s", pcnt_width,
notes->options->show_total_period ? "Period" :
notes->options->show_nr_samples ? "Samples" : "Percent");
}
}
if (notes->have_cycles) {
if (al->ipc)
ui_browser__printf(browser, "%*.2f ", ANNOTATION__IPC_WIDTH - 1, al->ipc);
else if (!show_title)
ui_browser__write_nstring(browser, " ", ANNOTATION__IPC_WIDTH);
else
ui_browser__printf(browser, "%*s ", ANNOTATION__IPC_WIDTH - 1, "IPC");
if (al->cycles)
ui_browser__printf(browser, "%*" PRIu64 " ",
ANNOTATION__CYCLES_WIDTH - 1, al->cycles);
else if (!show_title)
ui_browser__write_nstring(browser, " ", ANNOTATION__CYCLES_WIDTH);
else
ui_browser__printf(browser, "%*s ", ANNOTATION__CYCLES_WIDTH - 1, "Cycle");
}
SLsmg_write_char(' ');
/* The scroll bar isn't being used */
if (!browser->navkeypressed)
width += 1;
if (!*al->line)
ui_browser__write_nstring(browser, " ", width - pcnt_width - cycles_width);
else if (al->offset == -1) {
if (al->line_nr && notes->options->show_linenr)
printed = scnprintf(bf, sizeof(bf), "%-*d ",
ab->addr_width + 1, al->line_nr);
else
printed = scnprintf(bf, sizeof(bf), "%*s ",
ab->addr_width, " ");
ui_browser__write_nstring(browser, bf, printed);
ui_browser__write_nstring(browser, al->line, width - printed - pcnt_width - cycles_width + 1);
} else {
u64 addr = al->offset;
int color = -1;
if (!notes->options->use_offset)
addr += notes->start;
if (!notes->options->use_offset) {
printed = scnprintf(bf, sizeof(bf), "%" PRIx64 ": ", addr);
} else {
if (al->jump_sources) {
if (notes->options->show_nr_jumps) {
int prev;
printed = scnprintf(bf, sizeof(bf), "%*d ",
ab->jumps_width,
al->jump_sources);
prev = ui_browser__set_jumps_percent_color(browser, al->jump_sources,
current_entry);
ui_browser__write_nstring(browser, bf, printed);
ui_browser__set_color(browser, prev);
}
printed = scnprintf(bf, sizeof(bf), "%*" PRIx64 ": ",
ab->target_width, addr);
} else {
printed = scnprintf(bf, sizeof(bf), "%*s ",
ab->addr_width, " ");
}
}
perf annotate browser: Hide non jump target addresses in offset mode

This:

  0.00 : ffffffff8116bd00: lock btsl $0x0,(%r12)
100.00 : ffffffff8116bd07: sbb %eax,%eax
  0.00 : ffffffff8116bd09: test %eax,%eax
  0.00 : ffffffff8116bd0b: jne ffffffff8116bf5f <__mem_cgroup_commit_charge+0x28f>
  0.00 : ffffffff8116bd11: mov (%r12),%rax
  0.00 : ffffffff8116bd15: test $0x2,%al
  0.00 : ffffffff8116bd17: jne ffffffff8116bf6e <__mem_cgroup_commit_charge+0x29e>
  0.00 : ffffffff8116bd1d: test %r9b,%r9b
  0.00 : ffffffff8116bd20: jne ffffffff8116be30 <__mem_cgroup_commit_charge+0x160>
  0.00 : ffffffff8116bd26: xor %eax,%eax
  0.00 : ffffffff8116bd28: mov %r13,0x8(%r12)
  0.00 : ffffffff8116bd2d: lock orb $0x2,(%r12)
  0.00 : ffffffff8116bd33: test %r9b,%r9b
  0.00 : ffffffff8116bd36: je ffffffff8116bdf3 <__mem_cgroup_commit_charge+0x123>

Becomes:

  0.00 :  30: lock btsl $0x0,(%r12)
100.00 :      sbb %eax,%eax
  0.00 :      test %eax,%eax
  0.00 :      jne 28f
  0.00 :      mov (%r12),%rax
  0.00 :      test $0x2,%al
  0.00 :      jne 29e
  0.00 :      test %r9b,%r9b
  0.00 :      jne 160
  0.00 :  56: xor %eax,%eax
  0.00 :  58: mov %r13,0x8(%r12)
  0.00 :      lock orb $0x2,(%r12)
  0.00 :      test %r9b,%r9b
  0.00 :      je 123

I.e. we throw away all those useless addresses and keep just jump labels.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-r2vmbtgz0l8coluj8flztgrn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-04-19 12:19:22 -03:00
if (change_color)
color = ui_browser__set_color(browser, HE_COLORSET_ADDR);
ui_browser__write_nstring(browser, bf, printed);
if (change_color)
ui_browser__set_color(browser, color);
disasm_line__write(disasm_line(al), browser, bf, sizeof(bf));
ui_browser__write_nstring(browser, bf, width - pcnt_width - cycles_width - 3 - printed);
}
if (current_entry)
ab->selection = al;
}
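The address column written by annotate_browser__write() has three cases: a full address when offsets are disabled, an offset label when the line is a jump target, and blank padding otherwise. The sketch below isolates that decision under some assumptions: `sketch_format_addr` and its parameters are hypothetical names, plain snprintf stands in for the kernel's scnprintf, and the blank case is padded with two spaces so that it lines up with the ": " suffix of the label case.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format the address column for one disassembly line.
 * start:        address of the function (notes->start in the code above)
 * offset:       offset of the line inside the function
 * jump_sources: how many jumps target this line
 * width:        column width reserved for the offset label
 */
static int sketch_format_addr(char *bf, size_t size, uint64_t start,
			      uint64_t offset, bool use_offset,
			      int jump_sources, int width)
{
	/* Offsets disabled: always show the absolute address. */
	if (!use_offset)
		return snprintf(bf, size, "%" PRIx64 ": ", start + offset);

	/* Offset mode: only jump targets keep their label. */
	if (jump_sources > 0)
		return snprintf(bf, size, "%*" PRIx64 ": ", width, offset);

	/* Everything else gets blank padding of the same total width. */
	return snprintf(bf, size, "%*s  ", width, " ");
}
```

With a function starting at 0xffffffff8116bcd0, the line at offset 0x30 renders as "  30: " when something jumps to it and as six spaces when nothing does, which matches the shape of the before/after listing in the commit message.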
static bool is_fused(struct annotate_browser *ab, struct disasm_line *cursor)
{
struct disasm_line *pos = list_prev_entry(cursor, al.node);
const char *name;
if (!pos)
return false;
if (ins__is_lock(&pos->ins))
name = pos->ops.locked.ins.name;
else
name = pos->ins.name;
if (!name || !cursor->ins.name)
return false;
return ins__is_fused(ab->arch, name, cursor->ins.name);
}
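is_fused() only picks the two instruction names and defers the actual decision to ins__is_fused(), which is not shown here. As a rough illustration of the rule stated in the "Check for fused instructions" commit message (cmp/test + jcc on Nehalem, cmp/test/add/sub/and/inc/dec + jcc on newer CPUs), a check could look like the sketch below. All names are hypothetical, and exact mnemonic matching is an assumption: real disassembly may carry operand-size suffixes that this sketch does not handle.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true when name is one of the n strings in set. */
static bool sketch_name_in(const char *name, const char *const *set, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(name, set[i]) == 0)
			return true;
	return false;
}

/*
 * Decide whether first + second can form a macro-fused pair.
 * nehalem selects the narrower cmp/test-only rule.
 */
static bool sketch_is_fused(bool nehalem, const char *first, const char *second)
{
	static const char *const nehalem_set[] = { "cmp", "test" };
	static const char *const newer_set[] = {
		"cmp", "test", "add", "sub", "and", "inc", "dec",
	};

	/* The second instruction must be a conditional jump, not jmp. */
	if (second[0] != 'j' || strcmp(second, "jmp") == 0)
		return false;

	if (nehalem)
		return sketch_name_in(first, nehalem_set,
				      sizeof(nehalem_set) / sizeof(nehalem_set[0]));
	return sketch_name_in(first, newer_set,
			      sizeof(newer_set) / sizeof(newer_set[0]));
}
```

For a `lock`-prefixed predecessor, the caller above passes the name of the locked instruction, so the same lookup applies unchanged.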
static void annotate_browser__draw_current_jump(struct ui_browser *browser)
{
struct annotate_browser *ab = container_of(browser, struct annotate_browser, b);
struct disasm_line *cursor = disasm_line(ab->selection);
struct annotation_line *target;
unsigned int from, to;
struct map_symbol *ms = ab->b.priv;
struct symbol *sym = ms->sym;
struct annotation *notes = symbol__annotation(sym);
u8 pcnt_width = annotation__pcnt_width(notes);
int width;
/* PLT symbols contain external offsets */
if (strstr(sym->name, "@plt"))
return;
if (!disasm_line__is_valid_jump(cursor, sym))
return;
/*
* This first was seen with a gcc function, _cpp_lex_token, that
* has the usual jumps:
*
* 1159e6c: jne 115aa32 <_cpp_lex_token@@Base+0xf92>
*
* I.e. jumps to a label inside that function (_cpp_lex_token), and
* those work, but also this kind:
*
* 1159e8b: jne c469be <cpp_named_operator2name@@Base+0xa72>
*
* I.e. jumps to another function, outside _cpp_lex_token, which
* are not being handled correctly and, as a side effect, generate
* references to notes->offsets[] entries that are set to NULL, so to
* make this code more robust, check that here.
*
* A proper fix will be put in place, looking at the function
* name right after the '<' token and probably treating this like a
* 'call' instruction.
*/
target = notes->offsets[cursor->ops.target.offset];
if (target == NULL) {
ui_helpline__printf("WARN: jump target inconsistency, press 'o', notes->offsets[%#x] = NULL\n",
cursor->ops.target.offset);
return;
}
if (notes->options->hide_src_code) {
from = cursor->al.idx_asm;
to = target->idx_asm;
} else {
from = (u64)cursor->al.idx;
to = (u64)target->idx;
}
width = annotation__cycles_width(notes);
perf report: Fix wrong jump arrow When we use perf report interactive annotate view, we can see the position of jump arrow is not correct. For example, 1. perf record -b ... 2. perf report 3. In interactive mode, select Annotate 'function' Percent│ IPC Cycle │ if (flag) 1.37 │0.4┌── 1 ↓ je 82 │ │ x += x / y + y / x; 0.00 │0.4│ 1310 movsd (%rsp),%xmm0 0.00 │0.4│ 565 movsd 0x8(%rsp),%xmm4 │0.4│ movsd 0x8(%rsp),%xmm1 │0.4│ movsd (%rsp),%xmm3 │0.4│ divsd %xmm4,%xmm0 0.00 │0.4│ 579 divsd %xmm3,%xmm1 │0.4│ movsd (%rsp),%xmm2 │0.4│ addsd %xmm1,%xmm0 │0.4│ addsd %xmm2,%xmm0 0.00 │0.4│ movsd %xmm0,(%rsp) │ │ volatile double x = 1212121212, y = 121212; │ │ │ │ s_randseed = time(0); │ │ srand(s_randseed); │ │ │ │ for (i = 0; i < 2000000000; i++) { 1.37 │0.4└─→ 82: sub $0x1,%ebx 28.21 │0.48 17 ↑ jne 38 The jump arrow in above example is not correct. It should add the width of IPC and Cycle. With this patch, the result is: Percent│ IPC Cycle │ if (flag) 1.37 │0.48 1 ┌──je 82 │ │ x += x / y + y / x; 0.00 │0.48 1310 │ movsd (%rsp),%xmm0 0.00 │0.48 565 │ movsd 0x8(%rsp),%xmm4 │0.48 │ movsd 0x8(%rsp),%xmm1 │0.48 │ movsd (%rsp),%xmm3 │0.48 │ divsd %xmm4,%xmm0 0.00 │0.48 579 │ divsd %xmm3,%xmm1 │0.48 │ movsd (%rsp),%xmm2 │0.48 │ addsd %xmm1,%xmm0 │0.48 │ addsd %xmm2,%xmm0 0.00 │0.48 │ movsd %xmm0,(%rsp) │ │ volatile double x = 1212121212, y = 121212; │ │ │ │ s_randseed = time(0); │ │ srand(s_randseed); │ │ │ │ for (i = 0; i < 2000000000; i++) { 1.37 │0.48 82:└─→sub $0x1,%ebx 28.21 │0.48 17 ↑ jne 38 Committer notes: Please note that only from LBRv5 (according to Jiri) onwards, i.e. >= Skylake is that we'll have the cycles counts in each branch record entry, so to see the Cycles and IPC columns, and be able to test this patch, one need a capable hardware. While applying this I first tested it on a Broadwell class machine and couldn't get those columns, will add code to the annotate browser to warn the user about that, i.e. 
you have branch records, but no cycles, use a more recent hardware to get the cycles and IPC columns. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1517223473-14750-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-29 18:57:53 +08:00
ui_browser__set_color(browser, HE_COLORSET_JUMP_ARROWS);
__ui_browser__line_arrow(browser,
pcnt_width + 2 + ab->addr_width + width,
from, to);
if (is_fused(ab, cursor)) {
ui_browser__mark_fused(browser,
pcnt_width + 3 + ab->addr_width + width,
from - 1,
to > from);
}
}
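The column arithmetic in annotate_browser__draw_current_jump() is what the "Fix wrong jump arrow" commit corrected: the arrow has to start after the percent column, the IPC/Cycle columns when present, and the address column. A minimal sketch of that sum follows; the helper names are hypothetical, and the constants 2 and 3 are simply the ones used in the two calls above.

```c
/*
 * Column at which the jump arrow is drawn. cycles_width is 0 when no
 * IPC/Cycle columns are shown, which reproduces the layout the code
 * assumed before those columns were taken into account.
 */
static int sketch_arrow_column(int pcnt_width, int addr_width,
			       int cycles_width)
{
	return pcnt_width + 2 + addr_width + cycles_width;
}

/* The fused-pair marker sits one column to the right of the arrow. */
static int sketch_fused_column(int pcnt_width, int addr_width,
			       int cycles_width)
{
	return pcnt_width + 3 + addr_width + cycles_width;
}
```

Leaving `cycles_width` out of the sum places the arrow inside the IPC/Cycle columns, which is the misplacement shown in the first listing of that commit message.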
static unsigned int annotate_browser__refresh(struct ui_browser *browser)
{
struct annotation *notes = browser__annotation(browser);
int ret = ui_browser__list_head_refresh(browser);
int pcnt_width = annotation__pcnt_width(notes);
if (notes->options->jump_arrows)
annotate_browser__draw_current_jump(browser);
ui_browser__set_color(browser, HE_COLORSET_NORMAL);
__ui_browser__vline(browser, pcnt_width, 0, browser->height - 1);
return ret;
}
static int disasm__cmp(struct annotation_line *a, struct annotation_line *b)
{
int i;
for (i = 0; i < a->samples_nr; i++) {
if (a->samples[i].percent == b->samples[i].percent)
continue;
return a->samples[i].percent < b->samples[i].percent;
}
return 0;
}
static void disasm_rb_tree__insert(struct rb_root *root, struct annotation_line *al)
{
struct rb_node **p = &root->rb_node;
struct rb_node *parent = NULL;
struct annotation_line *l;
while (*p != NULL) {
parent = *p;
l = rb_entry(parent, struct annotation_line, rb_node);
if (disasm__cmp(al, l))
p = &(*p)->rb_left;
else
p = &(*p)->rb_right;
}
rb_link_node(&al->rb_node, parent, p);
rb_insert_color(&al->rb_node, root);
}
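disasm__cmp() orders lines by comparing their per-event percentages one event at a time, and the tree insertion uses that order so the hottest line ends up at rb_last(). The same ordering can be sketched without the kernel rbtree: the example below swaps in qsort() over a plain array, which is not what the browser does, and uses a hypothetical `struct sketch_line` with a fixed number of events.

```c
#include <stdlib.h>

#define SKETCH_NR_EVENTS 2

/* Hypothetical, reduced stand-in for struct annotation_line. */
struct sketch_line {
	int offset;
	double percent[SKETCH_NR_EVENTS];
};

/*
 * Lexicographic comparison over the per-event percentages: the first
 * event that differs decides, later events only break ties.
 */
static int sketch_line_cmp(const void *pa, const void *pb)
{
	const struct sketch_line *a = pa, *b = pb;

	for (int i = 0; i < SKETCH_NR_EVENTS; i++) {
		if (a->percent[i] == b->percent[i])
			continue;
		return a->percent[i] < b->percent[i] ? -1 : 1;
	}
	return 0;
}

/* Sort ascending, so the hottest line is the last element. */
static void sketch_sort_lines(struct sketch_line *lines, size_t n)
{
	qsort(lines, n, sizeof(*lines), sketch_line_cmp);
}
```

The tree form is preferable in the browser because the hot-line navigation walks neighbours through curr_hot; the array form is only meant to make the ordering itself easy to check.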
static void annotate_browser__set_top(struct annotate_browser *browser,
struct annotation_line *pos, u32 idx)
{
unsigned back;
ui_browser__refresh_dimensions(&browser->b);
back = browser->b.height / 2;
browser->b.top_idx = browser->b.index = idx;
while (browser->b.top_idx != 0 && back != 0) {
pos = list_entry(pos->node.prev, struct annotation_line, node);
if (disasm_line__filter(&browser->b, &pos->node))
continue;
--browser->b.top_idx;
--back;
}
browser->b.top = pos;
browser->b.navkeypressed = true;
}
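annotate_browser__set_top() centers the selected line by walking backwards over up to half a screen of visible lines, skipping the ones that the filter hides. The sketch below reproduces that walk over an array instead of a linked list; `struct sketch_view`, the `hidden` array and the function name are hypothetical, and the caller is assumed to pass an `idx` that equals the number of visible lines before `pos`.

```c
#include <stdbool.h>

struct sketch_view {
	unsigned top_idx;	/* index of the top row among visible lines */
	unsigned top_pos;	/* position of the top row in the full list */
};

/*
 * hidden[i] is true when list position i is filtered out (for example
 * a source line while source code is hidden).
 * pos:    list position of the selected line
 * idx:    index of the selected line among the visible lines
 * height: number of rows in the browser
 */
static struct sketch_view sketch_set_top(const bool *hidden, unsigned pos,
					 unsigned idx, unsigned height)
{
	unsigned back = height / 2;
	struct sketch_view v = { .top_idx = idx, .top_pos = pos };

	while (v.top_idx != 0 && back != 0) {
		--v.top_pos;
		/* Hidden lines are stepped over without using up a row. */
		if (hidden[v.top_pos])
			continue;
		--v.top_idx;
		--back;
	}
	return v;
}
```

The loop ends either after height / 2 visible lines or when the first visible line is reached, so a selection near the start of the function is shown with the listing scrolled to the top.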
static void annotate_browser__set_rb_top(struct annotate_browser *browser,
struct rb_node *nd)
{
struct annotation *notes = browser__annotation(&browser->b);
struct annotation_line * pos = rb_entry(nd, struct annotation_line, rb_node);
u32 idx = pos->idx;
if (notes->options->hide_src_code)
idx = pos->idx_asm;
annotate_browser__set_top(browser, pos, idx);
browser->curr_hot = nd;
}
static void annotate_browser__calc_percent(struct annotate_browser *browser,
struct perf_evsel *evsel)
{
struct map_symbol *ms = browser->b.priv;
struct symbol *sym = ms->sym;
struct annotation *notes = symbol__annotation(sym);
struct disasm_line *pos;
browser->entries = RB_ROOT;
pthread_mutex_lock(&notes->lock);
symbol__calc_percent(sym, evsel);
list_for_each_entry(pos, &notes->src->source, al.node) {
double max_percent = 0.0;
int i;
if (pos->al.offset == -1) {
RB_CLEAR_NODE(&pos->al.rb_node);
continue;
}
for (i = 0; i < pos->al.samples_nr; i++) {
struct annotation_data *sample = &pos->al.samples[i];
if (max_percent < sample->percent)
max_percent = sample->percent;
}
if (max_percent < 0.01 && pos->al.ipc == 0) {
RB_CLEAR_NODE(&pos->al.rb_node);
continue;
}
disasm_rb_tree__insert(&browser->entries, &pos->al);
}
pthread_mutex_unlock(&notes->lock);
browser->curr_hot = rb_last(&browser->entries);
}
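The loop in annotate_browser__calc_percent() keeps a line in the hot-line tree only when its largest per-event percentage reaches 0.01 or when it has an IPC value. A minimal sketch of that predicate, using hypothetical helper names and a plain array of percentages, might look like this:

```c
#include <stdbool.h>
#include <stddef.h>

/* Largest percentage of a line across all events. */
static double sketch_max_percent(const double *percent, size_t nr_events)
{
	double max = 0.0;

	for (size_t i = 0; i < nr_events; i++)
		if (max < percent[i])
			max = percent[i];
	return max;
}

/*
 * A line is worth inserting into the hot-line tree when at least one
 * event reaches the 0.01 threshold or the line carries IPC data.
 */
static bool sketch_is_hot(const double *percent, size_t nr_events, double ipc)
{
	return sketch_max_percent(percent, nr_events) >= 0.01 || ipc != 0.0;
}
```

Lines that fail the predicate are not dropped from the listing; in the code above they only get their rb_node cleared, so hot-line navigation skips them.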
static bool annotate_browser__toggle_source(struct annotate_browser *browser)
{
struct annotation *notes = browser__annotation(&browser->b);
struct annotation_line *al;
off_t offset = browser->b.index - browser->b.top_idx;
browser->b.seek(&browser->b, offset, SEEK_CUR);
al = list_entry(browser->b.top, struct annotation_line, node);
if (notes->options->hide_src_code) {
if (al->idx_asm < offset)
offset = al->idx;
browser->b.nr_entries = browser->nr_entries;
notes->options->hide_src_code = false;
browser->b.seek(&browser->b, -offset, SEEK_CUR);
browser->b.top_idx = al->idx - offset;
browser->b.index = al->idx;
} else {
if (al->idx_asm < 0) {
ui_helpline__puts("Only available for assembly lines.");
browser->b.seek(&browser->b, -offset, SEEK_CUR);
return false;
}
if (al->idx_asm < offset)
offset = al->idx_asm;
browser->b.nr_entries = browser->nr_asm_entries;
notes->options->hide_src_code = true;
browser->b.seek(&browser->b, -offset, SEEK_CUR);
browser->b.top_idx = al->idx_asm - offset;
browser->b.index = al->idx_asm;
}
return true;
}
static void annotate_browser__init_asm_mode(struct annotate_browser *browser)
{
ui_browser__reset_index(&browser->b);
browser->b.nr_entries = browser->nr_asm_entries;
}
#define SYM_TITLE_MAX_SIZE (PATH_MAX + 64)
static int sym_title(struct symbol *sym, struct map *map, char *title,
size_t sz)
{
return snprintf(title, sz, "%s %s", sym->name, map->dso->long_name);
}
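sym_title() returns the result of snprintf(), which is the length the full string would have had, not the number of characters actually stored. When a caller needs the stored length, the return value has to be clamped. The following is a small sketch of that clamping with a hypothetical helper; it takes plain strings rather than the symbol and map structures used above.

```c
#include <stdio.h>

/*
 * Build "<symbol> <dso>" in title and return the number of characters
 * actually stored, excluding the terminating NUL.
 */
static size_t sketch_title(char *title, size_t sz, const char *sym,
			   const char *dso)
{
	int len = snprintf(title, sz, "%s %s", sym, dso);

	if (len < 0 || sz == 0)
		return 0;
	/* snprintf reports the untruncated length; clamp it to the buffer. */
	return (size_t)len < sz ? (size_t)len : sz - 1;
}
```

With a buffer of SYM_TITLE_MAX_SIZE bytes (PATH_MAX + 64), truncation is unlikely in practice, since only an unusually long symbol name could overflow it.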
static bool annotate_browser__callq(struct annotate_browser *browser,
struct perf_evsel *evsel,
struct hist_browser_timer *hbt)
{
struct map_symbol *ms = browser->b.priv;
struct disasm_line *dl = disasm_line(browser->selection);
struct annotation *notes;
char title[SYM_TITLE_MAX_SIZE];
if (!ins__is_call(&dl->ins))
return false;
if (!dl->ops.target.sym) {
ui_helpline__puts("The called function was not found.");
return true;
}
notes = symbol__annotation(dl->ops.target.sym);
pthread_mutex_lock(&notes->lock);
if (notes->src == NULL && symbol__alloc_hist(dl->ops.target.sym) < 0) {
pthread_mutex_unlock(&notes->lock);
ui__warning("Not enough memory for annotating '%s' symbol!\n",
dl->ops.target.sym->name);
return true;
}
pthread_mutex_unlock(&notes->lock);
symbol__tui_annotate(dl->ops.target.sym, ms->map, evsel, hbt);
sym_title(ms->sym, ms->map, title, sizeof(title));
ui_browser__show_title(&browser->b, title);
return true;
}
static
struct disasm_line *annotate_browser__find_offset(struct annotate_browser *browser,
s64 offset, s64 *idx)
{
struct annotation *notes = browser__annotation(&browser->b);
struct disasm_line *pos;
*idx = 0;
list_for_each_entry(pos, &notes->src->source, al.node) {
if (pos->al.offset == offset)
return pos;
if (!disasm_line__filter(&browser->b, &pos->al.node))
++*idx;
}
return NULL;
}
static bool annotate_browser__jump(struct annotate_browser *browser)
{
struct disasm_line *dl = disasm_line(browser->selection);
u64 offset;
s64 idx;
if (!ins__is_jump(&dl->ins))
return false;
offset = dl->ops.target.offset;
dl = annotate_browser__find_offset(browser, offset, &idx);
if (dl == NULL) {
ui_helpline__printf("Invalid jump offset: %" PRIx64, offset);
return true;
}
annotate_browser__set_top(browser, &dl->al, idx);
return true;
}
static
struct annotation_line *annotate_browser__find_string(struct annotate_browser *browser,
char *s, s64 *idx)
{
struct annotation *notes = browser__annotation(&browser->b);
struct annotation_line *al = browser->selection;
*idx = browser->b.index;
list_for_each_entry_continue(al, &notes->src->source, node) {
if (disasm_line__filter(&browser->b, &al->node))
continue;
++*idx;
if (al->line && strstr(al->line, s) != NULL)
return al;
}
return NULL;
}
static bool __annotate_browser__search(struct annotate_browser *browser)
{
struct annotation_line *al;
s64 idx;
al = annotate_browser__find_string(browser, browser->search_bf, &idx);
if (al == NULL) {
ui_helpline__puts("String not found!");
return false;
}
annotate_browser__set_top(browser, al, idx);
browser->searching_backwards = false;
return true;
}
static
struct annotation_line *annotate_browser__find_string_reverse(struct annotate_browser *browser,
char *s, s64 *idx)
{
struct annotation *notes = browser__annotation(&browser->b);
struct annotation_line *al = browser->selection;
*idx = browser->b.index;
list_for_each_entry_continue_reverse(al, &notes->src->source, node) {
if (disasm_line__filter(&browser->b, &al->node))
continue;
--*idx;
if (al->line && strstr(al->line, s) != NULL)
return al;
}
return NULL;
}
static bool __annotate_browser__search_reverse(struct annotate_browser *browser)
{
struct annotation_line *al;
s64 idx;
al = annotate_browser__find_string_reverse(browser, browser->search_bf, &idx);
if (al == NULL) {
ui_helpline__puts("String not found!");
return false;
}
annotate_browser__set_top(browser, al, idx);
browser->searching_backwards = true;
return true;
}
static bool annotate_browser__search_window(struct annotate_browser *browser,
int delay_secs)
{
if (ui_browser__input_window("Search", "String: ", browser->search_bf,
"ENTER: OK, ESC: Cancel",
delay_secs * 2) != K_ENTER ||
!*browser->search_bf)
return false;
return true;
}
static bool annotate_browser__search(struct annotate_browser *browser, int delay_secs)
{
if (annotate_browser__search_window(browser, delay_secs))
return __annotate_browser__search(browser);
return false;
}
static bool annotate_browser__continue_search(struct annotate_browser *browser,
int delay_secs)
{
if (!*browser->search_bf)
return annotate_browser__search(browser, delay_secs);
return __annotate_browser__search(browser);
}
static bool annotate_browser__search_reverse(struct annotate_browser *browser,
int delay_secs)
{
if (annotate_browser__search_window(browser, delay_secs))
return __annotate_browser__search_reverse(browser);
return false;
}
static
bool annotate_browser__continue_search_reverse(struct annotate_browser *browser,
int delay_secs)
{
if (!*browser->search_bf)
return annotate_browser__search_reverse(browser, delay_secs);
return __annotate_browser__search_reverse(browser);
}
static void annotate_browser__update_addr_width(struct annotate_browser *browser)
{
struct annotation *notes = browser__annotation(&browser->b);
if (notes->options->use_offset)
browser->target_width = browser->min_addr_width;
else
browser->target_width = browser->max_addr_width;
browser->addr_width = browser->target_width;
if (notes->options->show_nr_jumps)
browser->addr_width += browser->jumps_width + 1;
}
static int annotate_browser__run(struct annotate_browser *browser,
struct perf_evsel *evsel,
struct hist_browser_timer *hbt)
{
struct rb_node *nd = NULL;
struct map_symbol *ms = browser->b.priv;
struct symbol *sym = ms->sym;
struct annotation *notes = symbol__annotation(ms->sym);
const char *help = "Press 'h' for help on key bindings";
int delay_secs = hbt ? hbt->refresh : 0;
int key;
char title[SYM_TITLE_MAX_SIZE];
sym_title(sym, ms->map, title, sizeof(title));
if (ui_browser__show(&browser->b, title, help) < 0)
return -1;
annotate_browser__calc_percent(browser, evsel);
if (browser->curr_hot) {
annotate_browser__set_rb_top(browser, browser->curr_hot);
browser->b.navkeypressed = false;
}
nd = browser->curr_hot;
while (1) {
key = ui_browser__run(&browser->b, delay_secs);
if (delay_secs != 0) {
annotate_browser__calc_percent(browser, evsel);
/*
* Current line focus got out of the list of most active
* lines, NULL it so that if TAB|UNTAB is pressed, we
* move to curr_hot (current hottest line).
*/
if (nd != NULL && RB_EMPTY_NODE(nd))
nd = NULL;
}
switch (key) {
case K_TIMER:
if (hbt)
hbt->timer(hbt->arg);
if (delay_secs != 0)
symbol__annotate_decay_histogram(sym, evsel->idx);
continue;
case K_TAB:
if (nd != NULL) {
nd = rb_prev(nd);
if (nd == NULL)
nd = rb_last(&browser->entries);
} else
nd = browser->curr_hot;
break;
case K_UNTAB:
if (nd != NULL) {
nd = rb_next(nd);
if (nd == NULL)
nd = rb_first(&browser->entries);
} else
nd = browser->curr_hot;
break;
case K_F1:
case 'h':
ui_browser__help_window(&browser->b,
"UP/DOWN/PGUP\n"
"PGDN/SPACE Navigate\n"
"q/ESC/CTRL+C Exit\n\n"
"ENTER Go to target\n"
"ESC Exit\n"
"H Go to hottest instruction\n"
"TAB/shift+TAB Cycle thru hottest instructions\n"
"j Toggle showing jump to target arrows\n"
"J Toggle showing number of jump sources on targets\n"
"n Search next string\n"
"o Toggle disassembler output/simplified view\n"
"s Toggle source code view\n"
"t Circulate percent, total period, samples view\n"
"/ Search string\n"
"k Toggle line numbers\n"
"r Run available scripts\n"
"? Search string backwards\n");
continue;
case 'r':
{
script_browse(NULL);
continue;
}
case 'k':
notes->options->show_linenr = !notes->options->show_linenr;
break;
case 'H':
nd = browser->curr_hot;
break;
case 's':
if (annotate_browser__toggle_source(browser))
ui_helpline__puts(help);
continue;
case 'o':
notes->options->use_offset = !notes->options->use_offset;
annotate_browser__update_addr_width(browser);
continue;
case 'j':
notes->options->jump_arrows = !notes->options->jump_arrows;
continue;
case 'J':
notes->options->show_nr_jumps = !notes->options->show_nr_jumps;
annotate_browser__update_addr_width(browser);
continue;
case '/':
if (annotate_browser__search(browser, delay_secs)) {
show_help:
ui_helpline__puts(help);
}
continue;
case 'n':
if (browser->searching_backwards ?
annotate_browser__continue_search_reverse(browser, delay_secs) :
annotate_browser__continue_search(browser, delay_secs))
goto show_help;
continue;
case '?':
if (annotate_browser__search_reverse(browser, delay_secs))
goto show_help;
continue;
case 'D': {
static int seq;
ui_helpline__pop();
ui_helpline__fpush("%d: nr_ent=%d, height=%d, idx=%d, top_idx=%d, nr_asm_entries=%d",
seq++, browser->b.nr_entries,
browser->b.height,
browser->b.index,
browser->b.top_idx,
browser->nr_asm_entries);
}
continue;
case K_ENTER:
case K_RIGHT:
{
struct disasm_line *dl = disasm_line(browser->selection);
if (browser->selection == NULL)
ui_helpline__puts("Huh? No selection. Report to linux-kernel@vger.kernel.org");
else if (browser->selection->offset == -1)
ui_helpline__puts("Actions are only available for assembly lines.");
else if (!dl->ins.ops)
goto show_sup_ins;
else if (ins__is_ret(&dl->ins))
goto out;
else if (!(annotate_browser__jump(browser) ||
annotate_browser__callq(browser, evsel, hbt))) {
show_sup_ins:
ui_helpline__puts("Actions are only available for function call/return & jump/branch instructions.");
}
continue;
}
case 't':
if (notes->options->show_total_period) {
notes->options->show_total_period = false;
notes->options->show_nr_samples = true;
} else if (notes->options->show_nr_samples)
notes->options->show_nr_samples = false;
else
notes->options->show_total_period = true;
annotate_browser__update_addr_width(browser);
continue;
case K_LEFT:
case K_ESC:
case 'q':
case CTRL('c'):
goto out;
default:
continue;
}
if (nd != NULL)
annotate_browser__set_rb_top(browser, nd);
}
out:
ui_browser__hide(&browser->b);
return key;
}
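/*
 * Illustrative sketch, not part of perf: the TAB/UNTAB handling in
 * annotate_browser__run() cycles through the hot entries and wraps at
 * either end, falling back to the hottest entry when there is no current
 * focus. The version below models that with plain array indices instead
 * of rb_node pointers. It assumes entries are ordered by ascending
 * percentage (as curr_hot = rb_last() suggests); demo_tab(), demo_untab()
 * and DEMO_NO_FOCUS are hypothetical names.
 */

```c
#include <assert.h>

#define DEMO_NO_FOCUS (-1)	/* stands in for nd == NULL */

/*
 * TAB: step to the next colder entry. Index nr - 1 is the hottest
 * entry, the role rb_last() plays in the browser.
 */
static int demo_tab(int cur, int nr)
{
	assert(nr > 0);
	if (cur == DEMO_NO_FOCUS)
		return nr - 1;			/* no focus: go to hottest */
	return cur > 0 ? cur - 1 : nr - 1;	/* wrap back to hottest */
}

/* UNTAB: step to the next hotter entry, wrapping to the coldest. */
static int demo_untab(int cur, int nr)
{
	assert(nr > 0);
	if (cur == DEMO_NO_FOCUS)
		return nr - 1;			/* no focus: go to hottest */
	return cur < nr - 1 ? cur + 1 : 0;	/* wrap to coldest */
}
```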
int map_symbol__tui_annotate(struct map_symbol *ms, struct perf_evsel *evsel,
struct hist_browser_timer *hbt)
{
/* Set default value for show_total_period and show_nr_samples */
annotate_browser__opts.show_total_period =
symbol_conf.show_total_period;
annotate_browser__opts.show_nr_samples =
symbol_conf.show_nr_samples;
return symbol__tui_annotate(ms->sym, ms->map, evsel, hbt);
}
int hist_entry__tui_annotate(struct hist_entry *he, struct perf_evsel *evsel,
struct hist_browser_timer *hbt)
{
/* reset abort key so that it can get Ctrl-C as a key */
SLang_reset_tty();
SLang_init_tty(0, 0, 0);
return map_symbol__tui_annotate(&he->ms, evsel, hbt);
}
static inline int width_jumps(int n)
{
if (n >= 100)
return 5;
if (n / 10)
return 2;
return 1;
}
int symbol__tui_annotate(struct symbol *sym, struct map *map,
struct perf_evsel *evsel,
struct hist_browser_timer *hbt)
{
struct annotation_line *al;
struct annotation *notes = symbol__annotation(sym);
size_t size;
struct map_symbol ms = {
.map = map,
.sym = sym,
};
struct annotate_browser browser = {
.b = {
.refresh = annotate_browser__refresh,
.seek = ui_browser__list_head_seek,
.write = annotate_browser__write,
.filter = disasm_line__filter,
.priv = &ms,
.use_navkeypressed = true,
},
};
int ret = -1, err;
int nr_pcnt = 1;
if (sym == NULL)
return -1;
size = symbol__size(sym);
if (map->dso->annotate_warned)
return -1;
notes->options = &annotate_browser__opts;
notes->offsets = zalloc(size * sizeof(struct annotation_line *));
if (notes->offsets == NULL) {
ui__error("Not enough memory!");
return -1;
}
if (perf_evsel__is_group_event(evsel))
nr_pcnt = evsel->nr_members;
err = symbol__annotate(sym, map, evsel, 0, &browser.arch);
if (err) {
char msg[BUFSIZ];
symbol__strerror_disassemble(sym, map, err, msg, sizeof(msg));
ui__error("Couldn't annotate %s:\n%s", sym->name, msg);
goto out_free_offsets;
}
symbol__calc_percent(sym, evsel);
ui_helpline__push("Press ESC to exit");
notes->start = map__rip_2objdump(map, sym->start);
list_for_each_entry(al, &notes->src->source, node) {
size_t line_len = strlen(al->line);
if (browser.b.width < line_len)
browser.b.width = line_len;
al->idx = browser.nr_entries++;
if (al->offset != -1) {
al->idx_asm = browser.nr_asm_entries++;
/*
* FIXME: short term bandaid to cope with assembly
* routines that come with labels in the same column
* as the address in objdump, sigh.
*
* E.g. copy_user_generic_unrolled
*/
if (al->offset < (s64)size)
notes->offsets[al->offset] = al;
} else
al->idx_asm = -1;
}
annotation__mark_jump_targets(notes, sym);
annotation__compute_ipc(notes, size);
browser.addr_width = browser.target_width = browser.min_addr_width = hex_width(size);
browser.max_addr_width = hex_width(sym->end);
browser.jumps_width = width_jumps(notes->max_jump_sources);
notes->nr_events = nr_pcnt;
browser.b.nr_entries = browser.nr_entries;
browser.b.entries = &notes->src->source;
browser.b.width += 18; /* Percentage */
if (notes->options->hide_src_code)
annotate_browser__init_asm_mode(&browser);
annotate_browser__update_addr_width(&browser);
ret = annotate_browser__run(&browser, evsel, hbt);
annotated_source__purge(notes->src);
out_free_offsets:
zfree(&notes->offsets);
return ret;
}
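/*
 * Illustrative sketch, not part of perf: symbol__tui_annotate() fills
 * notes->offsets[] so that a byte offset inside the symbol maps back to
 * its annotation line, and guards the store with a bounds check because
 * some hand-written assembly routines yield offsets past the symbol
 * size. The stand-alone version below shows the same guarded table
 * build. struct demo_line and demo_build_offsets() are hypothetical, and
 * calloc() stands in for zalloc().
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct demo_line {
	int64_t offset;	/* -1 for source lines, else byte offset in the symbol */
};

/*
 * Build an offset -> line table with `size` slots. Lines whose offset
 * is negative or not below `size` are skipped rather than written out
 * of bounds. Returns NULL on allocation failure; the caller frees it.
 */
static struct demo_line **demo_build_offsets(struct demo_line *lines,
					     size_t nr_lines, size_t size)
{
	struct demo_line **offsets;
	size_t i;

	assert(lines != NULL || nr_lines == 0);

	offsets = calloc(size ? size : 1, sizeof(*offsets));
	if (offsets == NULL)
		return NULL;

	for (i = 0; i < nr_lines; i++) {
		int64_t off = lines[i].offset;

		if (off >= 0 && (uint64_t)off < size)
			offsets[off] = &lines[i];
	}
	return offsets;
}
```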
#define ANNOTATE_CFG(n) \
{ .name = #n, .value = &annotate_browser__opts.n, }
/*
* Keep the entries sorted, they are bsearch'ed
*/
static struct annotate_config {
const char *name;
bool *value;
} annotate__configs[] = {
ANNOTATE_CFG(hide_src_code),
ANNOTATE_CFG(jump_arrows),
ANNOTATE_CFG(show_linenr),
ANNOTATE_CFG(show_nr_jumps),
ANNOTATE_CFG(show_nr_samples),
ANNOTATE_CFG(show_total_period),
ANNOTATE_CFG(use_offset),
};
#undef ANNOTATE_CFG
static int annotate_config__cmp(const void *name, const void *cfgp)
{
const struct annotate_config *cfg = cfgp;
return strcmp(name, cfg->name);
}
static int annotate__config(const char *var, const char *value,
void *data __maybe_unused)
{
struct annotate_config *cfg;
const char *name;
if (!strstarts(var, "annotate."))
return 0;
name = var + sizeof("annotate.") - 1; /* skip the "annotate." prefix */
cfg = bsearch(name, annotate__configs, ARRAY_SIZE(annotate__configs),
sizeof(struct annotate_config), annotate_config__cmp);
if (cfg == NULL)
ui__warning("%s variable unknown, ignoring...", var);
else
*cfg->value = perf_config_bool(name, value);
return 0;
}
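/*
 * Illustrative sketch, not part of perf: annotate__configs[] has to stay
 * sorted by name because annotate__config() looks entries up with
 * bsearch(), which silently misses keys in an unsorted array. The
 * stand-alone version below shows the same pattern on a smaller table and
 * adds an assert() that checks the ordering. All demo_* names are
 * hypothetical, and the bool is passed in directly instead of going
 * through perf_config_bool().
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical option flags, standing in for annotate_browser__opts. */
static bool demo_hide_src_code, demo_jump_arrows, demo_use_offset;

struct demo_config {
	const char *name;
	bool *value;
};

/* Must stay sorted by name: bsearch() assumes ascending order. */
static struct demo_config demo_configs[] = {
	{ "hide_src_code", &demo_hide_src_code },
	{ "jump_arrows",   &demo_jump_arrows },
	{ "use_offset",    &demo_use_offset },
};

#define DEMO_NR_CONFIGS (sizeof(demo_configs) / sizeof(demo_configs[0]))

/* bsearch() passes the search key first and the array element second. */
static int demo_config_cmp(const void *name, const void *cfgp)
{
	const struct demo_config *cfg = cfgp;

	return strcmp(name, cfg->name);
}

/* Returns true if "annotate.<name>" was recognised and stored. */
static bool demo_config_set(const char *var, bool value)
{
	static const char prefix[] = "annotate.";
	struct demo_config *cfg;
	size_t i;

	/* Catch an unsorted table early instead of missing keys later. */
	for (i = 1; i < DEMO_NR_CONFIGS; i++)
		assert(strcmp(demo_configs[i - 1].name,
			      demo_configs[i].name) < 0);

	if (strncmp(var, prefix, sizeof(prefix) - 1) != 0)
		return false;

	cfg = bsearch(var + sizeof(prefix) - 1, demo_configs, DEMO_NR_CONFIGS,
		      sizeof(demo_configs[0]), demo_config_cmp);
	if (cfg == NULL)
		return false;

	*cfg->value = value;
	return true;
}
```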
void annotate_browser__init(void)
{
perf_config(annotate__config, NULL);
}