IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
data->id has been initialized at line 2362, remove duplicate initialization.
Fixes: 3ad31d8a0df2 ("perf evsel: Centralize perf_sample initialization")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240127025756.4041808-1-yangjihong1@huawei.com
Since get_states() assumes the existence of libtraceevent, so move
to where it should belong, i.e, util/trace-event-parse.c, and also
rename it to parse_task_states().
Leave evsel_getstate() untouched as it fits well in the evsel
category.
Also make some necessary tweaks for python support, and get it
verified with: perf test python.
Signed-off-by: Ze Gao <zegao@tencent.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240123070210.1669843-2-zegao@tencent.com
I found the hierarchy mode useful, but it's easy to make a typo when
using it. Let's add a short option for that.
Also update the documentation. :)
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20240125055124.1579617-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
The msr PMU is a software one, meaning msr events may be grouped
with events in a hardware context. As the msr PMU isn't marked as a
software PMU by perf_pmu__is_software, groups with the msr PMU in
are broken and the msr events placed in a different group. This
may lead to multiplexing errors where a hardware event isn't
counted while the msr event, such as tsc, is. Fix all of this by
marking the msr PMU as software, which agrees with the driver.
Before:
```
$ perf stat -e '{slots,tsc}' -a true
WARNING: events were regrouped to match PMUs
Performance counter stats for 'system wide':
1,750,335 slots
4,243,557 tsc
0.001456717 seconds time elapsed
```
After:
```
$ perf stat -e '{slots,tsc}' -a true
Performance counter stats for 'system wide':
12,526,380 slots
3,415,163 tsc
0.001488360 seconds time elapsed
```
Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20240124234200.1510417-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Even though this is a frame pointer unwind test, it's testing that a
frame pointer stack can be augmented correctly with a partial
Dwarf unwind. So add a feature check so that this test skips instead of
fails if Dwarf unwinding isn't present.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Spoorthy S <spoorts2@in.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Link: https://lore.kernel.org/r/20240123163903.350306-3-james.clark@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Even though unwinding depends on either HAVE_DWARF_SUPPORT or
HAVE_LIBUNWIND, scripts testing unwinding can't just look for the
existence of either of those flags. This is because
NO_LIBDW_DWARF_UNWIND=1 can disable unwinding with libdw, but libdw will
still be linked leaving HAVE_DWARF_SUPPORT turned on. Presumably because
it is used for things other than unwinding, so I don't think this needs
to be fixed.
HAVE_DWARF_UNWIND_SUPPORT already takes the combination of all those
things into account, and is used to gate the built in tests like "Test
dwarf unwind", so add it to the feature list output so that it can be
used by the script tests too.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Spoorthy S <spoorts2@in.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20240123163903.350306-2-james.clark@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
The 'Session topology' test currently fails with this message when
evlist__new_default() opens more than one event:
32: Session topology :
--- start ---
templ file: /tmp/perf-test-vv5YzZ
Using CPUID 0x00000000410fd070
Opening: unknown-hardware:HG
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0xb00000000
disabled 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 4
Opening: unknown-hardware:HG
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0xa00000000
disabled 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 5
non matching sample_type
FAILED tests/topology.c:73 can't get session
---- end ----
Session topology: FAILED!
This is because when re-opening the file and parsing the header, Perf
expects that any file that has more than one event has the sample ID
flag set. Perf record already sets the flag in a similar way when there
is more than one event, so add the same logic to evlist__new_default().
evlist__new_default() is only currently used in tests, so I don't
expect this change to have any other side effects. The other tests that
use it don't save and re-open the file so don't hit this issue.
The session topology test has been failing on Arm big.LITTLE platforms
since commit 251aa040244a ("perf parse-events: Wildcard most
"numeric" events") when evlist__new_default() started opening multiple
events for 'cycles'.
Fixes: 251aa040244a ("perf parse-events: Wildcard most "numeric" events")
Closes: https://lore.kernel.org/lkml/CAP-5=fWVQ-7ijjK3-w1q+k2WYVNHbAcejb-xY0ptbjRw476VKA@mail.gmail.com/
Tested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20240124094358.489372-1-james.clark@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
The mem_events can be retrieved from the struct perf_pmu now. An ARCH
specific perf_mem_events__ptr() is not required anymore. Remove all of
them.
The Intel hybrid has multiple mem-events-supported PMUs. But they share
the same mem_events. Other ARCHs only support one mem-events-supported
PMU. In the configuration, it's good enough to only configure the
mem_events for one PMU. Add perf_mem_events_find_pmu() which returns the
first mem-events-supported PMU.
In the perf_mem_events__init(), the perf_pmus__scan() is not required
anymore. It avoids checking the sysfs for every PMU on the system.
Make the perf_mem_events__record_args() more generic. Remove the
perf_mem_events__print_unsupport_hybrid().
Since pmu is added as a new parameter, rename perf_mem_events__ptr() to
perf_pmu__mem_events_ptr(). Several other functions also do a similar
rename.
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Tested-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: Kajol jain <kjain@linux.ibm.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: james.clark@arm.com
Cc: will@kernel.org
Cc: leo.yan@linaro.org
Cc: mike.leach@linaro.org
Cc: renyu.zj@linux.alibaba.com
Cc: yuhaixin.yhx@linux.alibaba.com
Cc: tmricht@linux.ibm.com
Cc: atrajeev@linux.vnet.ibm.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: john.g.garry@oracle.com
Link: https://lore.kernel.org/r/20240123185036.3461837-3-kan.liang@linux.intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Now that we have evsel__taskstate() which no longer relies on the
hardcoded task state string and has good backward compatibility,
we have a good reason to use it.
Note TASK_STATE_TO_CHAR_STR and task bitmasks are useless now so
we remove them for good. And now we pass the state info back and
forth in a symbolic char which explains itself well instead.
Signed-off-by: Ze Gao <zegao@tencent.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20240123022425.1611483-1-zegao@tencent.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Now that we have the __prinf_flags() parsing routines, we add a new
helper evsel__taskstate() to extract the task state info from the
recorded data.
Signed-off-by: Ze Gao <zegao@tencent.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20240122070859.1394479-5-zegao@tencent.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Perf uses a hard coded string "RSDTtXZPI" to index the sched_switch
prev_state field raw bitmask value. This works well except for when
the kernel changes this string, in which case this will break again.
Instead we add a new way to parse task state string from tracepoint
print format already recorded by perf, which eliminates the further
dependencies with this hardcode and unmaintainable macro, and this
is exactly what libtraceevent[1] does for now.
So we borrow the print flags parsing logic from libtraceevent[1].
And in get_states(), we walk the print arguments until the
__print_flags() for the target state field is found, and use that to
build the states string for future parsing.
[1]: https://lore.kernel.org/linux-trace-devel/20231224140732.7d41698d@rorschach.local.home/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Ze Gao <zegao@tencent.com>
Link: https://lore.kernel.org/r/20240122070859.1394479-4-zegao@tencent.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Update state char array to match the latest kernel definitions and
remove unused state mapping macros.
Note this is the preparing patch for get rid of the way to parse
process state from raw bitmask value. Instead we are going to
parse it from the recorded tracepoint print format, and this change
marks why we're doing it.
Signed-off-by: Ze Gao <zegao@tencent.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20240122070859.1394479-3-zegao@tencent.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Minor code style alignment cleanup for perf_data__switch() and
perf_data__write().
No functional change.
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240119040304.3708522-4-yangjihong1@huawei.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
In pipe mode, no need to switch perf data output, therefore,
'--timestamp-filename' option should not take effect.
Check the conflict before recording and output WARNING.
In this case, the check pipe mode in perf_data__switch() can be removed.
Before:
# perf record --timestamp-filename -o- perf test -w noploop | perf report -i- --percent-limit=1
# To display the perf.data header info, please use --header/--header-only options.
#
[ perf record: Woken up 1 times to write data ]
[ perf record: Dump -.2024011812110182 ]
#
# Total Lost Samples: 0
#
# Samples: 4K of event 'cycles:P'
# Event count (approx.): 2176784359
#
# Overhead Command Shared Object Symbol
# ........ ....... .................... ......................................
#
97.83% perf perf [.] noploop
#
# (Tip: Print event counts in CSV format with: perf stat -x,)
#
After:
# perf record --timestamp-filename -o- perf test -w noploop | perf report -i- --percent-limit=1
WARNING: --timestamp-filename option is not available in pipe mode.
# To display the perf.data header info, please use --header/--header-only options.
#
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
#
# Total Lost Samples: 0
#
# Samples: 4K of event 'cycles:P'
# Event count (approx.): 2185575421
#
# Overhead Command Shared Object Symbol
# ........ ....... ..................... .............................................
#
97.75% perf perf [.] noploop
#
# (Tip: Profiling branch (mis)predictions with: perf record -b / perf report)
#
Fixes: ecfd7a9c044e ("perf record: Add '--timestamp-filename' option to append timestamp to output file name")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240119040304.3708522-3-yangjihong1@huawei.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
perf_data__switch() may not assign a legal value to 'new_filename'.
In this case, 'new_filename' uses the on-stack value, which may cause a
incorrect free and unexpected result.
Fixes: 03724b2e9c45 ("perf record: Allow to limit number of reported perf.data files")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240119040304.3708522-2-yangjihong1@huawei.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
The DWARF location expression can be fairly complex and it'd be hard
to match it with the condition correctly. So let's be conservative
and only allow simple expressions. For now it just checks the first
operation in the list. The following operations looks ok:
* DW_OP_stack_value
* DW_OP_deref_size
* DW_OP_deref
* DW_OP_piece
To refuse complex (and unsupported) location expressions, add
check_allowed_ops() to compare the rest of the list. It seems earlier
result contained those unsupported expressions. For example, I found
some local struct variable is placed like below.
<2><43d1517>: Abbrev Number: 62 (DW_TAG_variable)
<43d1518> DW_AT_location : 15 byte block: 91 50 93 8 91 78 93 4 93 84 8 91 68 93 4
(DW_OP_fbreg: -48; DW_OP_piece: 8;
DW_OP_fbreg: -8; DW_OP_piece: 4;
DW_OP_piece: 1028;
DW_OP_fbreg: -24; DW_OP_piece: 4)
Another example is something like this.
0057c8be ffffffffffffffff ffffffff812109f0 (base address)
0057c8ce ffffffff812112b5 ffffffff812112c8 (DW_OP_breg3 (rbx): 0;
DW_OP_constu: 18446744073709551612;
DW_OP_and;
DW_OP_stack_value)
It should refuse them. After the change, the stat shows:
Annotate data type stats:
total 294, ok 158 (53.7%), bad 136 (46.3%)
-----------------------------------------------------------
30 : no_sym
32 : no_mem_ops
53 : no_var
14 : no_typeinfo
7 : bad_offset
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240117062657.985479-10-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Local variables are allocated in the stack and the location list
should look like base register(s) and an offset. Extend the
die_find_variable_by_reg() to handle the following expressions
* DW_OP_breg{0..31}
* DW_OP_bregx
* DW_OP_fbreg
Ususally DWARF subprogram entries have frame base information and
use it to locate stack variable like below:
<2><43d1575>: Abbrev Number: 62 (DW_TAG_variable)
<43d1576> DW_AT_location : 2 byte block: 91 7c (DW_OP_fbreg: -4) <--- here
<43d1579> DW_AT_name : (indirect string, offset: 0x2c00c9): i
<43d157d> DW_AT_decl_file : 1
<43d157e> DW_AT_decl_line : 78
<43d157f> DW_AT_type : <0x43d19d7>
I found some differences on saving the frame base between gcc and clang.
The gcc uses the CFA to get the base so it needs to check the current
frame's CFI info. In this case, stack offset needs to be adjusted from
the start of the CFA.
<1><1bb8d>: Abbrev Number: 102 (DW_TAG_subprogram)
<1bb8e> DW_AT_name : (indirect string, offset: 0x74d41): kernel_init
<1bb92> DW_AT_decl_file : 2
<1bb92> DW_AT_decl_line : 1440
<1bb94> DW_AT_decl_column : 18
<1bb95> DW_AT_prototyped : 1
<1bb95> DW_AT_type : <0xcc>
<1bb99> DW_AT_low_pc : 0xffffffff81bab9e0
<1bba1> DW_AT_high_pc : 0x1b2
<1bba9> DW_AT_frame_base : 1 byte block: 9c (DW_OP_call_frame_cfa) <------ here
<1bbab> DW_AT_call_all_calls: 1
<1bbab> DW_AT_sibling : <0x1bf5a>
While clang sets it to a register directly and it can check the register
and offset in the instruction directly.
<1><43d1542>: Abbrev Number: 60 (DW_TAG_subprogram)
<43d1543> DW_AT_low_pc : 0xffffffff816a7c60
<43d154b> DW_AT_high_pc : 0x98
<43d154f> DW_AT_frame_base : 1 byte block: 56 (DW_OP_reg6 (rbp)) <---------- here
<43d1551> DW_AT_GNU_all_call_sites: 1
<43d1551> DW_AT_name : (indirect string, offset: 0x3bce91): foo
<43d1555> DW_AT_decl_file : 1
<43d1556> DW_AT_decl_line : 75
<43d1557> DW_AT_prototyped : 1
<43d1557> DW_AT_type : <0x43c7332>
<43d155b> DW_AT_external : 1
Also it needs to update the offset after finding the type like global
variables since the offset was from the frame base. Factor out
match_var_offset() to check global and local variables in the same way.
The type stats are improved too:
Annotate data type stats:
total 294, ok 160 (54.4%), bad 134 (45.6%)
-----------------------------------------------------------
30 : no_sym
32 : no_mem_ops
51 : no_var
14 : no_typeinfo
7 : bad_offset
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-9-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
The die_get_cfa() is to get frame base register and offset at the given
instruction address (pc). This info will be used to locate stack
variables which have location expression using DW_OP_fbreg.
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-8-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Global variables are accessed using PC-relative address so it needs to
be handled separately. The PC-rel addressing is detected by using
DWARF_REG_PC. On x86, %rip register would be used.
The address can be calculated using the ip and offset in the
instruction. But it should start from the next instruction so add
calculate_pcrel_addr() to do it properly.
But global variables defined in a different file would only have a
declaration which doesn't include a location list. So it first tries
to get the type info using the address, and then looks up the variable
declarations using name. The name of global variables should be get
from the symbol table. The declaration would have the type info.
So extend find_var_type() to take both address and name for global
variables.
The stat is now looks like:
Annotate data type stats:
total 294, ok 153 (52.0%), bad 141 (48.0%)
-----------------------------------------------------------
30 : no_sym
32 : no_mem_ops
61 : no_var
10 : no_typeinfo
8 : bad_offset
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-7-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Extend find_data_type_die() to find data type from PC-relative address
using die_find_variable_by_addr(). Users need to pass the address for
the (global) variable.
The offset for the variable should be updated after finding the type
because the offset in the instruction is just to calcuate the address
for the variable. So it changed to pass a pointer to offset and renamed
it to 'poffset'.
First it searches variables in the CU DIE as it's likely that the global
variables are defined in the file level. And then it iterates the scope
DIEs to find a local (static) variable.
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-6-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
A typical function prologue and epilogue include multiple stack
operations to save and restore the current value of registers.
On x86, it looks like below:
push r15
push r14
push r13
push r12
...
pop r12
pop r13
pop r14
pop r15
ret
As these all touches the stack memory region, chances are high that they
appear in a memory profile data. But these are not used for any real
purpose yet so it'd return no types.
One of my profile type shows that non neglible portion of data came from
the stack operations. It also seems GCC generates more stack operations
than clang.
Annotate Instruction stats
total 264, ok 169 (64.0%), bad 95 (36.0%)
Name : Good Bad
-----------------------------------------------------------
movq : 49 27
movl : 24 9
popq : 0 19 <-- here
cmpl : 17 2
addq : 14 1
cmpq : 12 2
cmpxchgl : 3 7
Instead of dealing them as unknown, let's create a seperate pseudo type
to represent those stack operations separately.
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-5-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
On x86, instructions for array access often looks like below.
mov 0x1234(%rax,%rbx,8), %rcx
Usually the first register holds the type information and the second one
has the index. And the current code only looks up a variable for the
first register. But it's possible to be in the other way around so it
needs to check the second register if the first one failed.
The stat changed like this.
Annotate data type stats:
total 294, ok 148 (50.3%), bad 146 (49.7%)
-----------------------------------------------------------
30 : no_sym
32 : no_mem_ops
66 : no_var
10 : no_typeinfo
8 : bad_offset
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-4-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
When a sample was come from a conditional branch without a memory
operand, it could be due to a macro fusion with a previous instruction.
So it needs to check the memory operand in the previous one.
This improves the stat like below:
Annotate data type stats:
total 294, ok 147 (50.0%), bad 147 (50.0%)
-----------------------------------------------------------
30 : no_sym
32 : no_mem_ops
71 : no_var
6 : no_typeinfo
8 : bad_offset
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
For the performance reason, I prefer llvm-objdump over GNU's. But I
found that llvm-objdump puts x86 lock prefix in a separate line like
below.
ffffffff81000695: f0 lock
ffffffff81000696: ff 83 54 0b 00 00 incl 2900(%rbx)
This should be parsed properly, but I just changed to find the insn
with next offset for now.
This improves the statistics as it can process more instructions.
Annotate data type stats:
total 294, ok 144 (49.0%), bad 150 (51.0%)
-----------------------------------------------------------
30 : no_sym
35 : no_mem_ops
71 : no_var
6 : no_typeinfo
8 : bad_offset
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240117062657.985479-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
If pkg-config is not installed when libtraceevent is linked, the build fails.
The error information is as follows:
$ make
<SNIP>
In file included from /home/yjh/projects_linux/perf-tool-next/linux/tools/perf/util/evsel.c:43:
/home/yjh/projects_linux/perf-tool-next/linux/tools/perf/util/trace-event.h:149:62: error: operator '&&' has no right operand
149 | #if defined(LIBTRACEEVENT_VERSION) && LIBTRACEEVENT_VERSION >= MAKE_LIBTRACEEVENT_VERSION(1, 5, 0)
| ^~
error: command '/usr/bin/gcc' failed with exit code 1
cp: cannot stat 'python_ext_build/lib/perf*.so': No such file or directory
make[2]: *** [Makefile.perf:668: python/perf.cpython-310-x86_64-linux-gnu.so] Error 1
make[2]: *** Waiting for unfinished jobs....
Because pkg-config is not installed, fail to get libtraceevent version in
Makefile.config file. As a result, LIBTRACEEVENT_VERSION is empty.
However, the preceding error information is not user-friendly.
Identify errors in advance by checking that pkg-config is installed at
compile time.
The build results of various scenarios are as follows:
1. build successful when libtraceevent is not linked and pkg-config is not installed
$ pkg-config --version
-bash: /usr/bin/pkg-config: No such file or directory
$ make clean >/dev/null
$ make NO_LIBTRACEEVENT=1 >/dev/null
Makefile.config:1133: No alternatives command found, you need to set JDIR= to point to the root of your Java directory
PERF_VERSION = 6.7.rc6.gd988c9f511af
$ echo $?
0
2. dummy pkg-config is missing when libtraceevent is linked
$ pkg-config --version
-bash: /usr/bin/pkg-config: No such file or directory
$ make clean >/dev/null
$ make >/dev/null
Makefile.config:221: *** Error: pkg-config needed by libtraceevent is missing on this system, please install it. Stop.
make[1]: *** [Makefile.perf:251: sub-make] Error 2
make: *** [Makefile:70: all] Error 2
$ echo $?
2
3. build successful when libtraceevent is linked and pkg-config is installed
$ pkg-config --version
0.29.2
$ make clean >/dev/null
$ make >/dev/null
Makefile.config:1133: No alternatives command found, you need to set JDIR= to point to the root of your Java directory
PERF_VERSION = 6.7.rc6.gd988c9f511af
$ echo $?
0
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240112034019.3558584-1-yangjihong1@huawei.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
This test case often fails on s390 (about 2 out of 10) because the
10% percent limit on the difference between --bpf-counters event counting
and s390 hardware counting is more than 10% in all failure cases.
Raise the limit to 20% on s390 and the test case succeeds.
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: sumanthk@linux.ibm.com
Cc: svens@linux.ibm.com
Link: https://lore.kernel.org/r/20240108084009.3959211-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
- assorted prep work for disk space accounting rewrite
- BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this
makes our trigger context more explicit
- A few fixes to avoid excessive transaction restarts on multithreaded
workloads: fstests (in addition to ktest tests) are now checking
slowpath counters, and that's shaking out a few bugs
- Assorted tracepoint improvements
- Starting to break up bcachefs_format.h and move on disk types so
they're with the code they belong to; this will make room to start
documenting the on disk format better.
- A few minor fixes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmWtjOsACgkQE6szbY3K
bnbyXRAAsx+yM81TFqsLzRRqf8oocRwf2dj5XzExz9Ig/lYQS5LIVROS2OxwDsAc
DeaYQSTcph9dkOswCrNR96bBnEgmmZ1ClfVI6WRXvm6vs4rjhSMNbNaVyySrMUVn
5p/Lsn1/RKl0lWMYlHrdryo+106zRcr6z1Hiv9QCXkXhzdkV8wFYDkfbMveShUsu
KobC29wvd2EfZr04nqsIXS/y/iRIXhtZqJmFCiAguN70UWrwUwArpELHI5Ve+WPZ
9VjgFXW6Ka3QxJs/20tX+t24DrC+eDXR44DzQmxwG5mPBBpXkcSk5UgRw/EUag5U
5+mDZQ5Ei3gvZvUwrilMosVy3pIw0IuvqeqwDGFoFXs1cce01QCMN+NG/dBTQw9i
KGGxJw5sOrZ8fIiFnypk1M+r9NVtA8MjriLNR5bJjCWPSpWqzkT2HzxFXc6HmTZu
vsE/AxwC1RLA6B2HZlDEqLOdHE3cofkDiIzWM5ABvb4p118iyk9hE6HhAufk5UdE
HaG646kGB8pUY/sCxBIOD6K2pgthDFv+fftTM7X+uIazD3bovvPQCEInu48/KAHn
/KmslSPO0txyjnRFMbXFJvd4Fgfo44GcBCeqGpy3B79aEJ3nroyRZ0qNnnsqj0Gl
picUWjTn4W561Q1zBXuE/6cLWEp+sfaqYQcM8L3CCitRTVDPaCQ=
=yd+F
-----END PGP SIGNATURE-----
Merge tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs
Pull more bcachefs updates from Kent Overstreet:
"Some fixes, Some refactoring, some minor features:
- Assorted prep work for disk space accounting rewrite
- BTREE_TRIGGER_ATOMIC: after combining our trigger callbacks, this
makes our trigger context more explicit
- A few fixes to avoid excessive transaction restarts on
multithreaded workloads: fstests (in addition to ktest tests) are
now checking slowpath counters, and that's shaking out a few bugs
- Assorted tracepoint improvements
- Starting to break up bcachefs_format.h and move on disk types so
they're with the code they belong to; this will make room to start
documenting the on disk format better.
- A few minor fixes"
* tag 'bcachefs-2024-01-21' of https://evilpiepirate.org/git/bcachefs: (46 commits)
bcachefs: Improve inode_to_text()
bcachefs: logged_ops_format.h
bcachefs: reflink_format.h
bcachefs; extents_format.h
bcachefs: ec_format.h
bcachefs: subvolume_format.h
bcachefs: snapshot_format.h
bcachefs: alloc_background_format.h
bcachefs: xattr_format.h
bcachefs: dirent_format.h
bcachefs: inode_format.h
bcachefs; quota_format.h
bcachefs: sb-counters_format.h
bcachefs: counters.c -> sb-counters.c
bcachefs: comment bch_subvolume
bcachefs: bch_snapshot::btime
bcachefs: add missing __GFP_NOWARN
bcachefs: opts->compression can now also be applied in the background
bcachefs: Prep work for variable size btree node buffers
bcachefs: grab s_umount only if snapshotting
...
- A fix for the idle and iowait time accounting vs. CPU hotplug.
The time is reset on CPU hotplug which makes the accumulated
systemwide time jump backwards.
- Assorted fixes and improvements for clocksource/event drivers
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmWtTLgTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoUXiD/4uN4Ntps8TwxSdg1X11M6++rizg9q9
EmIfwWcfQQJDM5Ss5FE88ye55NxIOwJ1brYo08+yTAXjnnZ/yNP1BBegHbMNiGil
NCHye7tYKZle25+hErdgfBB9n6brPz7dPOvV04/wRRWW+9p2ejt/5nEvojkyco9Y
S9KgBCxkvUqScMbdKKFW1UsThWh2euxwQXRGiWhTPPkbKcVynPvQJjvVyRxn01NS
eEhTn8YUNcAPT+1YApouGXrSCxo/IzBJ36CxOoCoUfaXcJ6FG1LLeAjNxKZ26Dfs
Ah0e3Hhyv6KOsBvBNwwabXDwryd6L8rZd8yL2KakI1vIC51uS2wneFy8GCieDVGh
xmy3U/tfkS0L7pmN+dQW2l4k9PHRNrwvbISKhs0UAHSOgGIMHZcjE6aFbYKru5i4
1W+dEjiktlceZ94mrEHbLpKmxWH2z5P8m0BzUs4kt3nkaOf6CTUKqa/qdAiU5dv+
lovKT26L8HBrMXf48I70UpgW/bYzOUGk55sR6hiLTXAelz1z02D1uYHFkshc0NCO
/O4wvHcgvMM46CtWVbim42AlRcyyWCr+FrY+jvfiG2icOcHPLqc81iHL8EKj7pJl
IxLgyPHVckgnE5gx+GQ8aDkg/qwCZnj4rFWgub8QMYtjI+pO+9T9kPAYPCxFhP7J
gmcJxZAB2RnKXA==
=RD6E
-----END PGP SIGNATURE-----
Merge tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
"Updates for time and clocksources:
- A fix for the idle and iowait time accounting vs CPU hotplug.
The time is reset on CPU hotplug which makes the accumulated
systemwide time jump backwards.
- Assorted fixes and improvements for clocksource/event drivers"
* tag 'timers-core-2024-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug
clocksource/drivers/ep93xx: Fix error handling during probe
clocksource/drivers/cadence-ttc: Fix some kernel-doc warnings
clocksource/drivers/timer-ti-dm: Fix make W=n kerneldoc warnings
clocksource/timer-riscv: Add riscv_clock_shutdown callback
dt-bindings: timer: Add StarFive JH8100 clint
dt-bindings: timer: thead,c900-aclint-mtimer: separate mtime and mtimecmp regs
- 18f14afe2816 powerpc/64s: Increase default stack size to 32KB BY: Michael Ellerman
Thanks to:
Michael Ellerman
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTYs9CDOrDQRwKRmtrJvCLnGrjHVgUCZayxkgAKCRDJvCLnGrjH
Vv2hAQDwvyYydFw64D7bnaFJDLvOwi3SL02OBaFYV1JTr8rf/QEA8NcTuqXis5o5
NedFYVE5PhYGWfyPD63aL+JpUKxsXwc=
=Ud9v
-----END PGP SIGNATURE-----
Merge tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Aneesh Kumar:
- Increase default stack size to 32KB for Book3S
Thanks to Michael Ellerman.
* tag 'powerpc-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Increase default stack size to 32KB