When an MMAP2 record has the PERF_RECORD_MISC_MMAP_BUILD_ID flag set, the
record already carries the build-id info. So perf inject marks the DSO as
hit, so that the same DSO is skipped if a later record for it happens to
lack the build-id.
But it failed to copy the MMAP2 record itself, so samples in those regions
could not be symbolized.
For example, the following generates 249 MMAP2 events:
$ perf record --buildid-mmap -o- true | perf report --stat -i- | grep MMAP2
MMAP2 events: 249 (86.8%)
Adding perf inject should not change the number of events, like this:
$ perf record --buildid-mmap -o- true | perf inject -b | \
> perf report --stat -i- | grep MMAP2
MMAP2 events: 249 (86.5%)
But when --buildid-all is used, most of the MMAP2 events are dropped:
$ perf record --buildid-mmap -o- true | perf inject -b --buildid-all | \
> perf report --stat -i- | grep MMAP2
MMAP2 events: 1 ( 2.5%)
With this patch, the original number is shown again:
$ perf record --buildid-mmap -o- true | perf inject -b --buildid-all | \
> perf report --stat -i- | grep MMAP2
MMAP2 events: 249 (86.5%)
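The control flow of the fix can be sketched in C as follows. This is a minimal illustration, not the code in builtin-inject.c: the struct and function names are hypothetical, the flag value is an assumption of the sketch that mirrors PERF_RECORD_MISC_MMAP_BUILD_ID in the perf_event UAPI header, and the --buildid-all details are left out. The point is that the MMAP2 record is copied to the output on both paths, not only when a BUILD_ID record has to be injected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for this sketch; see PERF_RECORD_MISC_MMAP_BUILD_ID in linux/perf_event.h. */
#define SKETCH_MISC_MMAP_BUILD_ID (1u << 14)

/* Hypothetical stand-ins for a perf DSO and the inject output stream. */
struct sketch_dso {
	bool hit; /* build-id for this DSO is already accounted for */
};

struct sketch_output {
	unsigned int mmap2_copied;       /* MMAP2 records written to the output */
	unsigned int build_ids_injected; /* separate BUILD_ID records written */
};

/* Handle one MMAP2 record whose header.misc field is 'misc'. */
void sketch_handle_mmap2(struct sketch_output *out, struct sketch_dso *dso, uint16_t misc)
{
	assert(out && dso);

	if (misc & SKETCH_MISC_MMAP_BUILD_ID) {
		/* The record already carries the build-id: no BUILD_ID record is needed. */
		dso->hit = true;
	} else if (!dso->hit) {
		/* Otherwise inject a BUILD_ID record, once per DSO. */
		out->build_ids_injected++;
		dso->hit = true;
	}

	/* The fix: the MMAP2 record itself is copied in every case. */
	out->mmap2_copied++;
}
```

Before the fix, the equivalent of the first branch returned early, so the final copy step was never reached for records that carried a build-id.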
Committer testing:
Before:
$ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b | perf report --stat -i- | grep MMAP2
MMAP2 events: 58 (36.2%)
$ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf report --stat -i- | grep MMAP2
MMAP2 events: 58 (36.2%)
$ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b --buildid-all | perf report --stat -i- | grep MMAP2
MMAP2 events: 2 ( 1.9%)
$
After:
$ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b | perf report --stat -i- | grep MMAP2
MMAP2 events: 58 (29.3%)
$ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf report --stat -i- | grep MMAP2
MMAP2 events: 58 (34.3%)
$ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b --buildid-all | perf report --stat -i- | grep MMAP2
MMAP2 events: 58 (38.4%)
$
Fixes: f7fc0d1c91 ("perf inject: Do not inject BUILD_ID record if MMAP2 has it")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230223070155.54251-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
copy_bytes() reads data from the (input) fd and writes it to the output
file. But it always used read(2), which mixed buffered and unbuffered
I/O on the same input.
The problem shows up when using pipes:
$ perf record -e intel_pt// -o- true | perf inject -b > /dev/null
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
0x45c0 [0x30]: failed to process type: 71
It should use perf_data__read() to honor the 'use_stdio' setting.
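A minimal C sketch of the idea behind honoring 'use_stdio' follows. The struct and function names are hypothetical stand-ins, not the perf_data API; the sketch only shows a reader that picks fread(3) or read(2) according to how the input is already being consumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the input side of struct perf_data. */
struct sketch_input {
	bool  use_stdio; /* true: the input is consumed through 'fp' */
	FILE *fp;        /* buffered stream, valid when use_stdio is true */
	int   fd;        /* raw descriptor, valid when use_stdio is false */
};

/*
 * Read up to 'len' bytes through whichever interface the input already uses.
 * Calling read(2) on the descriptor underneath a FILE would skip whatever
 * stdio has already pulled into its buffer, so the two must not be mixed.
 * Returns the number of bytes read, 0 at end of input, -1 on error.
 */
ssize_t sketch_input_read(struct sketch_input *in, void *buf, size_t len)
{
	assert(in && buf);

	if (in->use_stdio) {
		size_t n = fread(buf, 1, len, in->fp);

		if (n == 0 && ferror(in->fp))
			return -1;
		return (ssize_t)n;
	}
	return read(in->fd, buf, len);
}
```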
Fixes: 601366678c ("perf data: Allow to use stdio functions for pipe mode")
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230131023350.1903992-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command
line variables.
If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the
build, don't compile in libtraceevent and libtracefs support.
This also disables CONFIG_TRACE that controls "perf trace".
CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles;
HAVE_LIBTRACEEVENT is used in C code.
Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the
commands kmem, kwork, lock, sched and timechart are removed. The
majority of commands continue to work including "perf test".
Committer notes:
Fixed up a tools/perf/util/Build reject and added:
#include <traceevent/event-parse.h>
to tools/perf/util/scripting-engines/trace-event-perl.c.
Committer testing:
$ rpm -qi libtraceevent-devel
Name : libtraceevent-devel
Version : 1.5.3
Release : 2.fc36
Architecture: x86_64
Install Date: Mon 25 Jul 2022 03:20:19 PM -03
Group : Unspecified
Size : 27728
License : LGPLv2+ and GPLv2+
Signature : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4
Source RPM : libtraceevent-1.5.3-2.fc36.src.rpm
Build Date : Fri 15 Apr 2022 10:57:01 AM -03
Build Host : buildvm-x86-05.iad2.fedoraproject.org
Packager : Fedora Project
Vendor : Fedora Project
URL : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/
Bug URL : https://bugz.fedoraproject.org/libtraceevent
Summary : Development headers of libtraceevent
Description :
Development headers of libtraceevent-libs
$
Default build:
$ ldd ~/bin/perf | grep tracee
libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000)
$
# perf trace -e sched:* --max-events 10
0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1)
0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1)
0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120)
1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120)
1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120)
0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2)
0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2)
0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120)
1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1)
1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120)
#
Had to tweak tools/perf/util/setup.py to make sure the python binding
shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is
present in CFLAGS.
Building with NO_LIBTRACEEVENT=1 uncovered some more build failures:
- Make building of data-convert-bt.c depend on CONFIG_LIBTRACEEVENT=y
- perf-$(CONFIG_LIBTRACEEVENT) += scripts/
- bpf_kwork.o also needs to depend on CONFIG_LIBTRACEEVENT=y
- The python binding needed some fixups and util/trace-event.c can't be
built and linked with the python binding shared object, so remove it
in tools/perf/util/setup.py and exclude it from the list of
dependencies in the python/perf.so Makefile.perf target.
Building without libtraceevent-devel installed uncovered more build
failures:
- The python binding tools/perf/util/python.c was assuming that
traceevent/parse-events.h was always available, which was the case
when we defaulted to using the in-kernel tools/lib/traceevent/ files.
Now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like
the other parts of it that deal with tracepoints.
- We have to ifdef the rules in the Build files with
CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and
tools/perf/trace/beauty/, as CONFIG_TRACE=y was only left unset when
NO_LIBTRACEEVENT=1 was passed in the make command line, not when
libtraceevent-devel was not detected in the system. The clean way is to
simplify this so that there is a single way of disabling
builtin-trace.c, with CONFIG_TRACE=y not set whenever
libtraceevent-devel isn't installed.
From Athira:
<quote>
tools/perf/arch/powerpc/util/Build
-perf-y += kvm-stat.o
+perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o
</quote>
Then, ditto for arm64 and s390, detected by container cross build tests.
- s/390 uses test__checkevent_tracepoint(), which is now only available if
HAVE_LIBTRACEEVENT is defined, so enclose the callsite with ifdef HAVE_LIBTRACEEVENT.
Also from Athira:
<quote>
With this change, I could successfully compile in these environment:
- Without libtraceevent-devel installed
- With libtraceevent-devel installed
- With “make NO_LIBTRACEEVENT=1”
</quote>
Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for
consistency with other libraries detected in tools/perf/.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This commit adds the option --known-build-ids to perf inject.
It allows the user to explicitly specify the build id for a given
path, instead of retrieving it from the current system. This is
useful in cases where a perf.data file is processed on a different
system from where it was collected, or if some of the binaries are
no longer available.
The build ids and paths are specified in pairs in the command line.
Using the file:// specifier, build ids can be loaded from a file
directly generated by perf buildid-list. This is convenient to copy
build ids from one perf.data file to another.
** Example: In this example we use perf record to create two
perf.data files, one with build ids and another without, and use
perf buildid-list and perf inject to copy the build ids from the
first file to the second.
$ perf record ls /tmp
$ perf record --no-buildid -o perf.data.no-buildid ls /tmp
$ perf buildid-list > build-ids.txt
$ perf inject -b --known-build-ids='file://build-ids.txt' \
-i perf.data.no-buildid -o perf.data.buildid
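As an illustration of the kind of input the file:// specifier consumes, the sketch below parses one line in the "<hex build id> <path>" form printed by perf buildid-list. It is not the parser used by perf inject: the type, function names and size limits are hypothetical and chosen for this example only.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUILD_ID_MAX 20   /* assumed maximum build id length in bytes */
#define SKETCH_PATH_MAX     4096 /* assumed maximum path length */

/* Hypothetical record for one "<hex build id> <path>" pair. */
struct sketch_known_build_id {
	unsigned char id[SKETCH_BUILD_ID_MAX];
	size_t        id_len;
	char          path[SKETCH_PATH_MAX];
};

/* Convert one hex digit to its value, or return -1 if it is not a hex digit. */
static int sketch_hex_nibble(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = tolower(c);
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/* Parse one line; returns 0 on success, -1 on malformed input. */
int sketch_parse_build_id_line(const char *line, struct sketch_known_build_id *out)
{
	const char *sep;
	size_t hex_len, path_len, i;

	assert(line && out);

	/* The build id is everything before the first space. */
	sep = strchr(line, ' ');
	if (!sep)
		return -1;
	hex_len = (size_t)(sep - line);
	if (hex_len == 0 || hex_len % 2 || hex_len / 2 > SKETCH_BUILD_ID_MAX)
		return -1;

	for (i = 0; i < hex_len / 2; i++) {
		int hi = sketch_hex_nibble((unsigned char)line[2 * i]);
		int lo = sketch_hex_nibble((unsigned char)line[2 * i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		out->id[i] = (unsigned char)(hi << 4 | lo);
	}
	out->id_len = hex_len / 2;

	/* The path is the rest of the line, without a trailing newline. */
	path_len = strcspn(sep + 1, "\n");
	if (path_len == 0 || path_len >= SKETCH_PATH_MAX)
		return -1;
	memcpy(out->path, sep + 1, path_len);
	out->path[path_len] = '\0';
	return 0;
}
```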
Signed-off-by: Raul Silvera <rsilvera@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220815225922.2118745-1-rsilvera@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Inject events from a perf.data file recorded in a virtual machine into
a perf.data file recorded on the host at the same time.
Only side band events (e.g. mmap, comm, fork, exit etc) and build IDs are
injected. Additionally, the guest kcore_dir is copied as kcore_dir__
with the machine PID appended.
This is non-trivial because:
o It is not possible to process 2 sessions simultaneously so instead
events are first written to a temporary file.
o To avoid conflict, guest sample IDs are replaced with new unused sample
IDs.
o Guest event's CPU is changed to be the host CPU because it is more
useful for reporting and analysis.
o Sample ID is mapped to machine PID which is recorded with VCPU in the
id index. This is important to allow guest events to be related to the
guest machine and VCPU.
o Timestamps must be converted.
o Events are inserted to obey finished-round ordering.
The anticipated use-case is:
- start recording sideband events in a guest machine
- start recording an AUX area trace on the host which can also trace the
guest (e.g. Intel PT)
- run test case on the guest
- stop recording on the host
- stop recording on the guest
- copy the guest perf.data file to the host
- inject the guest perf.data file sideband events into the host perf.data
file using perf inject
- the resulting perf.data file can now be used
Subsequent patches provide Intel PT support for this.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-25-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To update the perf/core codebase.
Fix conflict by moving arch__post_evsel_config(evsel, attr) to the end
of evsel__config(), after what was added in:
49c692b7df ("perf offcpu: Accept allowed sample types only")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In preparation for recording sideband events in a virtual machine guest so
that they can be injected into a host perf.data file.
This is needed to enable injecting events after the initial synthesized
user events (that have an all zero id sample) but before regular events.
Committer notes:
Add entry about PERF_RECORD_FINISHED_INIT to
tools/perf/Documentation/perf.data-file-format.txt.
Committer testing:
Before:
# perf report -D | grep FINISHED
0 0x5910 [0x8]: PERF_RECORD_FINISHED_ROUND
FINISHED_ROUND events: 1 ( 0.5%)
#
After:
# perf record -- sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.020 MB perf.data (7 samples) ]
# perf report -D | grep FINISHED
0 0x5068 [0x8]: PERF_RECORD_FINISHED_INIT: unhandled!
0 0x5390 [0x8]: PERF_RECORD_FINISHED_ROUND
FINISHED_ROUND events: 1 ( 0.5%)
FINISHED_INIT events: 1 ( 0.5%)
#
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220610113316.6682-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If the input perf.data has a kcore_dir, copy it into the output, since
at least the kallsyms in the kcore_dir will be useful to the output.
Example:
Before:
$ ls -lR perf.data-from-desktop
perf.data-from-desktop:
total 916
-rw------- 1 user user 931756 May 19 09:55 data
drwx------ 2 user user 4096 May 19 09:55 kcore_dir
perf.data-from-desktop/kcore_dir:
total 42952
-r-------- 1 user user 7582467 May 19 09:55 kallsyms
-r-------- 1 user user 36388864 May 19 09:55 kcore
-r-------- 1 user user 4828 May 19 09:55 modules
$ perf inject -i perf.data-from-desktop -o injected-perf.data
$ ls -lR injected-perf.data
-rw------- 1 user user 931320 May 20 15:08 injected-perf.data
After:
$ perf inject -i perf.data-from-desktop -o injected-perf.data
$ ls -lR injected-perf.data
injected-perf.data:
total 916
-rw------- 1 user user 931320 May 20 15:21 data
drwx------ 2 user user 4096 May 20 15:21 kcore_dir
injected-perf.data/kcore_dir:
total 42952
-r-------- 1 user user 7582467 May 20 15:21 kallsyms
-r-------- 1 user user 36388864 May 20 15:21 kcore
-r-------- 1 user user 4828 May 20 15:21 modules
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220520132404.25853-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The fixed commit attempts to get the output file descriptor even if the
file was never opened e.g.
$ perf record uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (7 samples) ]
$ perf inject -i perf.data --vm-time-correlation=dry-run
Segmentation fault (core dumped)
$ gdb --quiet perf
Reading symbols from perf...
(gdb) r inject -i perf.data --vm-time-correlation=dry-run
Starting program: /home/ahunter/bin/perf inject -i perf.data --vm-time-correlation=dry-run
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
__GI___fileno (fp=0x0) at fileno.c:35
35 fileno.c: No such file or directory.
(gdb) bt
#0 __GI___fileno (fp=0x0) at fileno.c:35
#1 0x00005621e48dd987 in perf_data__fd (data=0x7fff4c68bd08) at util/data.h:72
#2 perf_data__fd (data=0x7fff4c68bd08) at util/data.h:69
#3 cmd_inject (argc=<optimized out>, argv=0x7fff4c69c1f0) at builtin-inject.c:1017
#4 0x00005621e4936783 in run_builtin (p=0x5621e4ee6878 <commands+600>, argc=4, argv=0x7fff4c69c1f0) at perf.c:313
#5 0x00005621e4897d5c in handle_internal_command (argv=<optimized out>, argc=<optimized out>) at perf.c:365
#6 run_argv (argcp=<optimized out>, argv=<optimized out>) at perf.c:409
#7 main (argc=4, argv=0x7fff4c69c1f0) at perf.c:539
(gdb)
Fixes: 0ae0389362 ("perf tools: Pass a fd to perf_file_header__read_pipe()")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211213084829.114772-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The fixed commit attempts to close inject.output even if it was never
opened e.g.
$ perf record uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (7 samples) ]
$ perf inject -i perf.data --vm-time-correlation=dry-run
Segmentation fault (core dumped)
$ gdb --quiet perf
Reading symbols from perf...
(gdb) r inject -i perf.data --vm-time-correlation=dry-run
Starting program: /home/ahunter/bin/perf inject -i perf.data --vm-time-correlation=dry-run
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00007eff8afeef5b in _IO_new_fclose (fp=0x0) at iofclose.c:48
48 iofclose.c: No such file or directory.
(gdb) bt
#0 0x00007eff8afeef5b in _IO_new_fclose (fp=0x0) at iofclose.c:48
#1 0x0000557fc7b74f92 in perf_data__close (data=data@entry=0x7ffcdafa6578) at util/data.c:376
#2 0x0000557fc7a6b807 in cmd_inject (argc=<optimized out>, argv=<optimized out>) at builtin-inject.c:1085
#3 0x0000557fc7ac4783 in run_builtin (p=0x557fc8074878 <commands+600>, argc=4, argv=0x7ffcdafb6a60) at perf.c:313
#4 0x0000557fc7a25d5c in handle_internal_command (argv=<optimized out>, argc=<optimized out>) at perf.c:365
#5 run_argv (argcp=<optimized out>, argv=<optimized out>) at perf.c:409
#6 main (argc=4, argv=0x7ffcdafb6a60) at perf.c:539
(gdb)
Fixes: 02e6246f53 ("perf inject: Close inject.output on exit")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211213084829.114772-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The space allowed for new attributes can be too small if existing header
information is large. That can happen, for example, if there are very
many CPUs, due to having an event ID per CPU per event being stored in the
header information.
Fix by adding the existing header.data_offset. Also increase the extra
space allowed to 8KiB and align to a 4KiB boundary for neatness.
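The arithmetic described above can be sketched as follows. The function and macro names are hypothetical; only the numbers (the existing data offset, 8KiB of extra space, 4KiB alignment) come from the description.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_EXTRA_SPACE 8192u /* 8KiB of room for new attributes */
#define SKETCH_ALIGNMENT   4096u /* align to a 4KiB boundary */

/* Round 'x' up to the next multiple of 'align'. */
uint64_t sketch_round_up(uint64_t x, uint64_t align)
{
	assert(align != 0);
	return (x + align - 1) / align * align;
}

/*
 * Where event data starts in the output file: the input's data offset
 * (which already covers the existing header information) plus the extra
 * space, rounded up to the alignment.
 */
uint64_t sketch_output_data_offset(uint64_t input_data_offset)
{
	return sketch_round_up(input_data_offset + SKETCH_EXTRA_SPACE, SKETCH_ALIGNMENT);
}
```

For example, an input data offset of 4097 bytes gives 4097 + 8192 = 12289, which rounds up to 16384.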
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20211125071457.2066863-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the input is a regular file but the output is a pipe, perf inject
should write a pipe header. But just repiping would write a portion of
the existing file header, which has a different 'size' value. So we need
to prevent that and write a new pipe header along with other information
like event attributes and features.
This can handle something like this:
# perf record -a -B sleep 1
# perf inject -b -i perf.data | perf report -i -
Factor out perf_event__synthesize_for_pipe() to be shared between perf
record and inject.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719223153.1618812-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Sometimes the perf inject output needs to be saved to a file for
debugging. But normally perf inject assumes the same format for input and
output, so the end result cannot be used due to a broken format.
# perf record -a -o - sleep 1 | perf inject -b -o my.data
# perf report -i my.data --stdio
0x208 [0]: failed to process type: 0 [Invalid argument]
Error:
failed to process sample
# To display the perf.data header info, please use --header/--header-only options.
#
In this case, perf inject assumed the data has a regular file header since
the output is not a pipe. But actually it doesn't have one and has a pipe
file header. At the end of the session, it tries to rewrite the regular
file header with updated features, and so it overwrites the data that
immediately follows the pipe header.
Fix it by checking whether either the input or the output is a pipe.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719223153.1618812-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel PT timestamps are affected by virtualization. Add a new option
that will allow the Intel PT decoder to correlate the timestamps and
translate the virtual machine timestamps to host timestamps.
The advantages of making this a separate step, rather than a part of
normal decoding are that it is simpler to implement, and it needs to
be done only once.
This patch adds only the option. Later patches add Intel PT support.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210430070309.17624-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since commit 14d3d54052 ("perf session: Try to read pipe data from
file") 'perf inject' has started printing "PERFILE2h" when not processing
pipes.
The commit exposed perf to the possibility that the input is not a pipe
but the 'repipe' parameter gets used. That causes the printing because
perf inject always sets 'repipe' to true.
The 'repipe' parameter of perf_session__new() is used by 2 functions:
- perf_file_header__read_pipe()
- trace_report()
In both cases, the functions copy data to STDOUT_FILENO when 'repipe' is
true.
Fix by setting 'repipe' to true only if the output is a pipe.
Fixes: e558a5bd8b ("perf inject: Work with files")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andrew Vagin <avagin@openvz.org>
Link: http://lore.kernel.org/lkml/20210401103605.9000-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch fixes "perf inject --jit" to properly operate on
namespaced/containerized processes:
* jitdump files are generated by the process, thus they should be
looked up in its mount NS.
* DSOs of injected MMAP events will later be looked up in the process
mount NS, so write them into its NS.
* PIDs & TIDs from jitdump events need to be translated to the PID as
seen by "perf record" before being written into MMAP events.
For a process in a different PID NS, the TID & PID given in the jitdump
event are actually ignored; I use the TID & PID of the thread which
mmap()ed the jitdump file. This is simplified and won't do for forks of
the initial process, if they continue using the same jitdump file.
Future patches might improve it.
This was tested by recording a NodeJS process running with
"--perf-prof", inside a Docker container, and by recording another
NodeJS process running in the same namespaces as perf itself, to make
sure it's not broken for non-containerized processes.
Signed-off-by: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20201105015604.1726943-1-yonatan.goldschmidt@granulate.io
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
"perf inject" can create corrupt files when synthesizing sample events from AUX
data. This happens when in the input file, the first event (for the AUX data)
has a different sample_type from the second event (generally dummy).
Specifically, they differ in the bits that indicate the standard fields
appended to perf records in the mmap buffer. "perf inject" deletes the first
event and moves up the second event to first position.
The problem is with the synthetic PERF_RECORD_MMAP (etc.) events created
by "perf record".
Since these are synthetic versions of events which are normally produced
by the kernel, they have to have the standard fields appended as
described by sample_type.
"perf record" fills these in with zeroes, including the IDENTIFIER
field; perf readers interpret records with zero IDENTIFIER using the
descriptor for the first event in the file.
Since "perf inject" changes the first event, these synthetic records are
then processed with the wrong value of sample_type, and the perf reader
reads bad data, reports on incorrect length records etc.
Mismatching sample_types are seen with "perf record -e cs_etm//", where the AUX
event has TID|TIME|CPU|IDENTIFIER and the dummy event has TID|TIME|IDENTIFIER.
Perhaps they could be the same, but it isn't normally a problem if they aren't
- perf has no problems reading the file.
The sample_types have to agree on the position of IDENTIFIER, because
that's how perf finds the right event descriptor in the first place, but
they don't normally have to agree on other fields, and perf doesn't
check that they do.
The problem is specific to the way "perf inject" reorganizes the events
and the way synthetic MMAP events are recorded with a zero identifier. A
simple solution is to stop "perf inject" deleting the tracing event.
Committer testing
Removed the now unused 'evsel' variable, update the comment about the
evsel removal not being performed anymore, and apply the patch manually
as it failed with this warning:
warning: Patch sent with format=flowed; space at the end of lines might be lost.
Testing it with:
$ perf bench internals inject-build-id
# Running 'internals/inject-build-id' benchmark:
Average build-id injection took: 8.543 msec (+- 0.130 msec)
Average time per event: 0.838 usec (+- 0.013 usec)
Average memory usage: 12717 KB (+- 9 KB)
Average build-id-all injection took: 5.710 msec (+- 0.058 msec)
Average time per event: 0.560 usec (+- 0.006 usec)
Average memory usage: 12079 KB (+- 7 KB)
$
Signed-off-by: Al Grant <al.grant@arm.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LPU-Reference: b9cf5611-daae-2390-3439-6617f8f0a34b@foss.arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Replace build_id byte array with struct build_id object and all the code
that references it.
The objective is to carry size together with build id array, so it's
better to keep both together.
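A minimal sketch of what keeping the size next to the bytes looks like is shown below. The names and the 20-byte capacity are assumptions of this example, not the definitions added by the patch; the comparison helper is included only to show why the size has to travel with the array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BUILD_ID_SIZE 20 /* assumed capacity for this sketch */

/* Build id bytes together with the number of valid bytes. */
struct sketch_build_id {
	uint8_t data[SKETCH_BUILD_ID_SIZE];
	size_t  size; /* number of valid bytes in data[] */
};

/* Fill 'bid' from 'size' bytes at 'data'; unused bytes are zeroed. */
void sketch_build_id_init(struct sketch_build_id *bid, const uint8_t *data, size_t size)
{
	assert(bid && data && size <= SKETCH_BUILD_ID_SIZE);
	memset(bid, 0, sizeof(*bid));
	memcpy(bid->data, data, size);
	bid->size = size;
}

/* Two build ids match only if both the length and the valid bytes agree. */
int sketch_build_id_equal(const struct sketch_build_id *a, const struct sketch_build_id *b)
{
	assert(a && b);
	return a->size == b->size && memcmp(a->data, b->data, a->size) == 0;
}
```

With a bare byte array, a 16-byte id padded with zeroes could not be told apart from a 20-byte id that happens to end in zeroes; carrying the size removes that ambiguity.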
This is a preparatory change for the following patches, and there's no
functional change.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20201013192441.1299447-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
**perf-<pid>.map and jit-<pid>.dump designs:
When a JIT generates code to be executed, it must allocate memory and
mark it executable using an mmap call.
*** perf-<pid>.map design
The perf-<pid>.map assumes that any sample recorded in an anonymous
memory page is JIT code. It then tries to resolve the symbol name by
looking at the process' perf-<pid>.map.
*** jit-<pid>.dump design
The jit-<pid>.dump mechanism takes a different approach. It requires a
JIT to write a `<path>/jit-<pid>.dump` file. This file must also be
mmapped so that perf inject --jit can find the file. The JIT must also
add JIT_CODE_LOAD records for any functions it generates. The records
are timestamped using a clock which can be correlated to the perf record
clock.
After perf record, the `perf inject --jit` pass parses the recording
looking for a `<path>/jit-<pid>.dump` file. When it finds the file, it
parses it and for each JIT_CODE_LOAD record:
* creates an elf file `<path>/jitted-<pid>-<code_index>.so`
* injects a new mmap record mapping the new elf file into the process.
*** Coexistence design
The kernel and perf support both of these mechanisms. We need to make
sure perf works on an app supporting either or both of these mechanisms.
Both designs rely on mmap records to determine how to resolve an ip
address.
The mmap records of both techniques by definition overlap. When the JIT
compiles a method, it must:
* allocate memory (mmap)
* add execution privilege (mprotect or mmap; either will
generate an mmap event from the kernel to perf)
* compile code into memory
* add a function record to perf-<pid>.map and/or jit-<pid>.dump
Because the jit-<pid>.dump mechanism supports greater capabilities, perf
prefers the symbols from jit-<pid>.dump. It implements this based on
timestamp ordering of events. There is an implicit ASSUMPTION that the
JIT_CODE_LOAD record timestamp will be after the // anon mmap event that
was generated during memory allocation or adding the execution privilege setting.
*** Problems with the ASSUMPTION
The ASSUMPTION made in the Coexistence design section above is violated
in the following scenario.
*** Scenario
While a JIT is jitting code it will eventually need to commit more
pages and change these pages to executable permissions. Typically the
JIT will want these collocated to minimize branch displacements.
The kernel will coalesce these anonymous mappings with identical
permissions before sending an MMAP event for the new pages. The address
range of the new mmap will not be just the most recently mmapped pages.
It will include the entire coalesced mmap region.
See mm/mmap.c
unsigned long mmap_region(struct file *file, unsigned long addr,
unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
struct list_head *uf)
{
...
/*
* Can we just expand an old mapping?
*/
...
perf_event_mmap(vma);
...
}
*** Symptoms
The coalesced // anon mmap event will be timestamped after the
JIT_CODE_LOAD records. This means it will be used as the most recent
mapping for that entire address range. For remaining events it will look
at the inferior perf-<pid>.map for symbols.
If both mechanisms are supported, the symbol will appear twice with
different module names. This causes weird behavior in reporting.
If only jit-<pid>.dump is supported, the symbol will no longer be resolved.
** Implemented solution
This patch solves the issue by removing // anon mmap events for any
process which has a valid jit-<pid>.dump file.
It tracks this on a per-process basis to handle the case where some running
apps support jit-<pid>.dump, but some only support perf-<pid>.map.
It adds new assumptions:
* // anon mmap events are only required for perf-<pid>.map support.
* An app that uses jit-<pid>.dump no longer needs
perf-<pid>.map support. It assumes that any perf-<pid>.map info is
inferior.
*** Details
Use thread->priv to store whether a jitdump file has been processed.
During "perf inject --jit", discard "//anon*" mmap events for any pid which
has successfully processed a jitdump file.
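The filtering rule can be sketched as below. This is not the patch's
code: the patch keeps the flag in thread->priv, whereas this standalone
sketch uses a small hypothetical table (jitdump_pids) and hypothetical
helpers mark_jitdump_processed() and should_drop_mmap().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_TRACKED 64

/* Hypothetical table of pids whose jitdump file was processed. */
static int jitdump_pids[MAX_TRACKED];
static int nr_jitdump_pids;

static bool pid_has_jitdump(int pid)
{
	for (int i = 0; i < nr_jitdump_pids; i++)
		if (jitdump_pids[i] == pid)
			return true;
	return false;
}

/* Record that a jitdump file was successfully processed for 'pid'. */
static void mark_jitdump_processed(int pid)
{
	if (pid_has_jitdump(pid))
		return;
	assert(nr_jitdump_pids < MAX_TRACKED);
	jitdump_pids[nr_jitdump_pids++] = pid;
}

/*
 * Decide whether an mmap event should be discarded: only "//anon*"
 * mappings, and only for pids that have a processed jitdump file.
 */
static bool should_drop_mmap(int pid, const char *filename)
{
	if (strncmp(filename, "//anon", strlen("//anon")) != 0)
		return false;
	return pid_has_jitdump(pid);
}
```

Events for file-backed mappings and for processes without a jitdump file
pass through unchanged, which matches the two cases exercised in the
testing section.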
** Testing:
// jitdump case
perf record <app with jitdump>
perf inject --jit --input perf.data --output perfjit.data
// verify mmap "//anon" events present initially
perf script --input perf.data --show-mmap-events | grep '//anon'
// verify mmap "//anon" events removed
perf script --input perfjit.data --show-mmap-events | grep '//anon'
// no jitdump case
perf record <app without jitdump>
perf inject --jit --input perf.data --output perfjit.data
// verify mmap "//anon" events present initially
perf script --input perf.data --show-mmap-events | grep '//anon'
// verify mmap "//anon" events not removed
perf script --input perfjit.data --show-mmap-events | grep '//anon'
** Repro:
This issue was discovered while testing the initial CoreCLR jitdump
implementation. https://github.com/dotnet/coreclr/pull/26897.
** Alternate solutions considered
These were also briefly considered:
* Change kernel to not coalesce mmap regions.
* Change kernel reporting of coalesced mmap regions to perf. Only
include newly mapped memory.
* Only strip parts of // anon mmap events overlapping existing
jitted-<pid>-<code_index>.so mmap events.
Signed-off-by: Steve MacLean <Steve.MacLean@Microsoft.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/1590544271-125795-1-git-send-email-steve.maclean@linux.microsoft.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array
member[1][2], introduced in C99:
struct foo {
	int stuff;
	struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning in
case the flexible array does not occur last in the structure, which will
help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that dynamic memory allocations won't be affected by this
change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
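A typical allocation of such a structure looks like the sketch below. It
reuses struct foo from the example above; the contents of struct boo and
the helper foo_alloc() are hypothetical, and overflow checking of the
size computation is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct boo {
	int value;
};

struct foo {
	int stuff;
	struct boo array[];	/* flexible array member, must be last */
};

/* The array begins no later than the end of the fixed part. */
static_assert(offsetof(struct foo, array) <= sizeof(struct foo),
	      "flexible array member starts within sizeof(struct foo)");

/*
 * Allocate a struct foo with room for n trailing elements.
 * sizeof(*f) covers only the fixed part, so the storage for the
 * elements is added explicitly, using the element size rather than
 * sizeof applied to the array itself.
 */
static struct foo *foo_alloc(size_t n)
{
	struct foo *f = malloc(sizeof(*f) + n * sizeof(f->array[0]));

	if (!f)
		return NULL;
	f->stuff = (int)n;
	for (size_t i = 0; i < n; i++)
		f->array[i].value = 0;
	return f;
}
```

The same size expression works whether the last member is declared as
array[0] or array[], which is why existing allocations keep working after
the conversion.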
sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get rid of those sorts of issues completely.
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavo@embeddedor.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200515172926.GA31976@embeddedor
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>