IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Daniel Borkmann says:
====================
bpf-next 2022-07-22
We've added 73 non-merge commits during the last 12 day(s) which contain
a total of 88 files changed, 3458 insertions(+), 860 deletions(-).
The main changes are:
1) Implement BPF trampoline for arm64 JIT, from Xu Kuohai.
2) Add ksyscall/kretsyscall section support to libbpf to simplify tracing kernel
syscalls through kprobe mechanism, from Andrii Nakryiko.
3) Allow for livepatch (KLP) and BPF trampolines to attach to the same kernel
function, from Song Liu & Jiri Olsa.
4) Add new kfunc infrastructure for netfilter's CT e.g. to insert and change
entries, from Kumar Kartikeya Dwivedi & Lorenzo Bianconi.
5) Add a ksym BPF iterator to allow for more flexible and efficient interactions
with kernel symbols, from Alan Maguire.
6) Bug fixes in libbpf e.g. for uprobe binary path resolution, from Dan Carpenter.
7) Fix BPF subprog function names in stack traces, from Alexei Starovoitov.
8) libbpf support for writing custom perf event readers, from Jon Doron.
9) Switch to use SPDX tag for BPF helper man page, from Alejandro Colomar.
10) Fix xsk send-only sockets when in busy poll mode, from Maciej Fijalkowski.
11) Reparent BPF maps and their charging on memcg offlining, from Roman Gushchin.
12) Multiple follow-up fixes around BPF lsm cgroup infra, from Stanislav Fomichev.
13) Use bootstrap version of bpftool where possible to speed up builds, from Pu Lehui.
14) Cleanup BPF verifier's check_func_arg() handling, from Joanne Koong.
15) Make non-prealloced BPF map allocations low priority to play better with
memcg limits, from Yafang Shao.
16) Fix BPF test runner to reject zero-length data for skbs, from Zhengchao Shao.
17) Various smaller cleanups and improvements all over the place.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (73 commits)
bpf: Simplify bpf_prog_pack_[size|mask]
bpf: Support bpf_trampoline on functions with IPMODIFY (e.g. livepatch)
bpf, x64: Allow to use caller address from stack
ftrace: Allow IPMODIFY and DIRECT ops on the same function
ftrace: Add modify_ftrace_direct_multi_nolock
bpf/selftests: Fix couldn't retrieve pinned program in xdp veth test
bpf: Fix build error in case of !CONFIG_DEBUG_INFO_BTF
selftests/bpf: Fix test_verifier failed test in unprivileged mode
selftests/bpf: Add negative tests for new nf_conntrack kfuncs
selftests/bpf: Add tests for new nf_conntrack kfuncs
selftests/bpf: Add verifier tests for trusted kfunc args
net: netfilter: Add kfuncs to set and change CT status
net: netfilter: Add kfuncs to set and change CT timeout
net: netfilter: Add kfuncs to allocate and insert CT
net: netfilter: Deduplicate code in bpf_{xdp,skb}_ct_lookup
bpf: Add documentation for kfuncs
bpf: Add support for forcing kfunc args to be trusted
bpf: Switch to new kfunc flags infrastructure
tools/resolve_btfids: Add support for 8-byte BTF sets
bpf: Introduce 8-byte BTF set
...
====================
Link: https://lore.kernel.org/r/20220722221218.29943-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Before change:
selftests: bpf: test_xdp_veth.sh
Couldn't retrieve pinned program '/sys/fs/bpf/test_xdp_veth/progs/redirect_map_0': No such file or directory
selftests: xdp_veth [SKIP]
ok 20 selftests: bpf: test_xdp_veth.sh # SKIP
After change:
PING 10.1.1.33 (10.1.1.33) 56(84) bytes of data.
64 bytes from 10.1.1.33: icmp_seq=1 ttl=64 time=0.320 ms
--- 10.1.1.33 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.320/0.320/0.320/0.000 ms
selftests: xdp_veth [PASS]
For the test case, the following can be found:
ls /sys/fs/bpf/test_xdp_veth/progs/redirect_map_0
ls: cannot access '/sys/fs/bpf/test_xdp_veth/progs/redirect_map_0': No such file or directory
ls /sys/fs/bpf/test_xdp_veth/progs/
xdp_redirect_map_0 xdp_redirect_map_1 xdp_redirect_map_2
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jie2x Zhou <jie2x.zhou@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220719082430.9916-1-jie2x.zhou@intel.com
Ping sockets don't appear to make any attempt to preserve flow labels
created and set by userspace using IPV6_FLOWINFO_SEND. Instead they are
clobbered by autolabels (if enabled) or zero.
Grab the flowlabel out of the msghdr similar to how rawv6_sendmsg does
it and move the memset up so it doesn't get zeroed after.
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Loading the BTF won't be permitted without privileges, hence only test
for privileged mode by setting the prog type. This makes the
test_verifier show 0 failures when unprivileged BPF is enabled.
Fixes: 41188e9e9def ("selftest/bpf: Test for use-after-free bug fix in inline_bpf_loop")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220721134245.2450-14-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Test cases we care about and ensure improper usage is caught and
rejected by the verifier.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220721134245.2450-13-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Make sure verifier rejects the bad cases and ensure the good case keeps
working. The selftests make use of the bpf_kfunc_call_test_ref kfunc
added in the previous patch only for verification.
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220721134245.2450-11-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Instead of populating multiple sets to indicate some attribute and then
researching the same BTF ID in them, prepare a single unified BTF set
which indicates whether a kfunc is allowed to be called, and also its
attributes if any at the same time. Now, only one call is needed to
perform the lookup for both kfunc availability and its attributes.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220721134245.2450-4-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
A flag is a 4-byte symbol that may follow a BTF ID in a set8. This is
used in the kernel to tag kfuncs in BTF sets with certain flags. Add
support to adjust the sorting code so that it passes size as 8 bytes
for 8-byte BTF sets.
Cc: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220721134245.2450-3-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
CXL 2.0 allows for dynamic provisioning of new memory regions (system
physical address resources like "System RAM" and "Persistent Memory").
Whereas DDR and PMEM resources are conveyed statically at boot, CXL
allows for assembling and instantiating new regions from the available
capacity of CXL memory expanders in the system.
Sysfs with an "echo $region_name > $create_region_attribute" interface
is chosen as the mechanism to initiate the provisioning process. This
was chosen over ioctl() and netlink() to keep the configuration
interface entirely in a pseudo-fs interface, and it was chosen over
configfs since, aside from this one creation event, the interface is
read-mostly. I.e. configfs supports cases where an object is designed to
be provisioned each boot, like an iSCSI storage target, and CXL region
creation is mostly for PMEM regions which are created usually once
per-lifetime of a server instance. This is an improvement over nvdimm
that pre-created "seed" devices that tended to confuse users looking to
determine which devices are active and which are idle.
Recall that the major change that CXL brings over previous persistent
memory architectures is the ability to dynamically define new regions.
Compare that to drivers like 'nfit' where the region configuration is
statically defined by platform firmware.
Regions are created as a child of a root decoder that encompasses an
address space with constraints. When created through sysfs, the root
decoder is explicit. When created from an LSA's region structure a root
decoder will possibly need to be inferred by the driver.
Upon region creation through sysfs, a vacant region is created with a
unique name. Regions have a number of attributes that must be configured
before the region can be bound to the driver where HDM decoder program
is completed.
An example of creating a new region:
- Allocate a new region name:
region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region)
- Create a new region by name:
while
region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region)
! echo $region > /sys/bus/cxl/devices/decoder0.0/create_pmem_region
do true; done
- Region now exists in sysfs:
stat -t /sys/bus/cxl/devices/decoder0.0/$region
- Delete the region, and name:
echo $region > /sys/bus/cxl/devices/decoder0.0/delete_region
Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784333909.1758207.794374602146306032.stgit@dwillia2-xfh.jf.intel.com
[djbw: simplify locking, reword changelog]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
It should be lock_text_end instead of _start.
Fixes: 0d2997f750d1de39 ("perf lock: Look up callchain for the contended locks")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220721043644.153718-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Previously the target routing specifics of switch decoders and platform
CXL window resource tracking of root decoders were factored out of
'struct cxl_decoder'. While switch decoders translate from SPA to
downstream ports, endpoint decoders translate from SPA to DPA.
This patch, 3 of 3, adds a 'struct cxl_endpoint_decoder' that tracks an
endpoint-specific Device Physical Address (DPA) resource. For now this
just defines ->dpa_res, a follow-on patch will handle requesting DPA
resource ranges from a device-DPA resource tree.
Co-developed-by: Ben Widawsky <bwidawsk@kernel.org>
Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784327088.1758207.15502834501671201192.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Currently 'struct cxl_decoder' contains the superset of attributes
needed for all decoder types. Before more type-specific attributes are
added to the common definition, reorganize 'struct cxl_decoder' into type
specific objects.
This patch, the first of three, factors out a cxl_switch_decoder type.
See the new kdoc for what a 'struct cxl_switch_decoder' represents in a
CXL topology.
Co-developed-by: Ben Widawsky <bwidawsk@kernel.org>
Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/165784325340.1758207.5064717153608954960.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The return from strcmp() is inverted so it wrongly returns true instead
of false and vice versa.
Fixes: a1c9d61b19cb ("libbpf: Improve library identification for uprobe binary path resolution")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/YtZ+/dAA195d99ak@kili
The code here is supposed to take a signed int and store it in a signed
long long. Unfortunately, the way that the type promotion works with
this conditional statement is that it takes a signed int, type promotes
it to a __u32, and then stores that as a signed long long. The result is
never negative.
This is from static analysis, but I made a little test program just to
test it before I sent the patch:
#include <stdio.h>
int main(void)
{
unsigned long long src = -1ULL;
signed long long dst1, dst2;
int is_signed = 1;
dst1 = is_signed ? *(int *)&src : *(unsigned int *)0;
dst2 = is_signed ? (signed long long)*(int *)&src : *(unsigned int *)0;
printf("%lld\n", dst1);
printf("%lld\n", dst2);
return 0;
}
Fixes: d90ec262b35b ("libbpf: Add enum64 support for btf_dump")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/YtZ+LpgPADm7BeEd@kili
This patch fixes a build error reported in the link. [0]
unix_connect.c: In function ‘unix_connect_test’:
unix_connect.c:115:55: error: expected identifier before ‘(’ token
#define offsetof(type, member) ((size_t)&((type *)0)->(member))
^
unix_connect.c:128:12: note: in expansion of macro ‘offsetof’
addrlen = offsetof(struct sockaddr_un, sun_path) + variant->len;
^~~~~~~~
We can fix this by removing () around member, but checkpatch will complain
about it, and the root cause of the build failure is that I followed the
warning and fixed this in the v2 -> v3 change of the blamed commit. [1]
CHECK: Macro argument 'member' may be better as '(member)' to avoid precedence issues
#33: FILE: tools/testing/selftests/net/af_unix/unix_connect.c:115:
+#define offsetof(type, member) ((size_t)&((type *)0)->member)
To avoid this warning, let's use offsetof() defined in stddef.h instead.
[0]: https://lore.kernel.org/linux-mm/202207182205.FrkMeDZT-lkp@intel.com/
[1]: https://lore.kernel.org/netdev/20220702154818.66761-1-kuniyu@amazon.com/
Fixes: e95ab1d85289 ("selftests: net: af_unix: Test connect() with different netns.")
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20220720005750.16600-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Noticed after switching to python3 by default on some older fedora
releases:
35 38.20 fedora:27 : FAIL clang version 5.0.2 (tags/RELEASE_502/final)
clang-5.0: error: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1' [-Werror,-Wunused-command-line-argument]
clang-5.0: error: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1' [-Werror,-Wunused-command-line-argument]
error: command 'clang' failed with exit status 1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
One in perf's CFLAGS and the other in the distro python binding
scripts.
So if use the usual technique of first -D_FORTIFY_SOURCE then -D it.
Noticed with:
opensuse tumbleweed:
gcc version 12.1.1 20220629 [revision 7811663964aa7e31c3939b859bbfa2e16919639f] (SUSE Linux)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf test case 83: perf stat CSV output linter might fail
on s390.
The reason for this is the output of the command
./perf stat -x, -A -a --no-merge true
which depends on a .config file setting. When CONFIG_SCHED_TOPOLOGY
is set, the output of above perf command is
CPU0,1.50,msec,cpu-clock,1502781,100.00,1.052,CPUs utilized
When CONFIG_SCHED_TOPOLOGY is *NOT* set the output of above perf
command is
0.95,msec,cpu-clock,949800,100.00,1.060,CPUs utilized
Fix the test case to accept both output formats.
Output before:
# perf test 83
83: perf stat CSV output linter : FAILED!
#
Output after:
# ./perf test 83
83: perf stat CSV output linter : Ok
#
Fixes: ec906102e5b7d339 ("perf test: Fix "perf stat CSV output linter" test on s390")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220720123419.220953-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The double `the' is duplicated in the comment, remove one.
Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220716044040.43123-1-wangborong@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The double `the' is duplicated in the comment, remove one.
Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Zechuan Chen <chenzechuan1@huawei.com>
Link: http://lore.kernel.org/lkml/20220716043957.42829-1-wangborong@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On gcc 12 we started seeing this:
In file included from /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/perl.h:2999,
from util/scripting-engines/trace-event-perl.c:35:
/usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h: In function 'Perl_is_utf8_valid_partial_char_flags':
/usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/handy.h:125:23: error: cast from function call of type 'STRLEN' {aka 'long unsigned int'} to non-matching type '_Bool' [-Werror=bad-function-cast]
125 | #define cBOOL(cbool) ((bool) (cbool))
| ^
/usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h:2363:12: note: in expansion of macro 'cBOOL'
2363 | return cBOOL(is_utf8_char_helper_(s0, e, flags));
| ^~~~~
In file included from /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/perl.h:7242:
/usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h: In function 'Perl_cop_file_avn':
/usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h:3489:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
3489 | const char *file = CopFILE(cop);
| ^~~~~
In file included from /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/perl.h:7243:
/usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/sv_inline.h: In function 'Perl_newSV_type':
/usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/sv_inline.h:376:5: error: enumeration value 'SVt_LAST' not handled in switch [-Werror=switch-enum]
376 | switch (type) {
| ^~~~~~
So disable those warnings to keep building with perl devel headers.
Noticed, among other distros, on opensuse tumbleweed:
gcc version 12.1.1 20220629 [revision 7811663964aa7e31c3939b859bbfa2e16919639f] (SUSE Linux)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix the following DeprecationWarning:
tools/perf/util/setup.py:31: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
Note: the setuptools module may need installing, for example:
$ sudo apt install python-setuptools
Reviewer comments:
James said:
Tested it with python 2.7 and 3.8 by running "make install-python_ext PYTHON=..."
Committer notes:
Tested with:
$ make -k BUILD_BPF_SKEL=1 PYTHON=python3 O=/tmp/build/perf -C tools/perf install-bin ; perf test python
$ make -k BUILD_BPF_SKEL=1 O=/tmp/build/perf -C tools/perf install-bin ; perf test python
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220615014206.26651-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now it is possible to decode a host Intel PT trace including guest machine
user space, add documentation for the steps needed to do it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-36-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When decoding with guest sideband information, for VMX non-root (NR)
i.e. guest events, replace the host (hypervisor) pid/tid with guest values,
and provide also the new machine_pid and vcpu values.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-35-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When decoding with guest sideband information, for VMX non-root (NR)
i.e. guest errors, replace the host (hypervisor) pid/tid with guest values,
and provide also the new machine_pid and vcpu values.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-34-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Prior to decoding, determine what guest thread, if any, is running.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-33-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The sync_switch facility attempts to better synchronize context switches
with the Intel PT trace, however it is not designed for guest machine
context switches, so disable it when guest sideband is detected.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-32-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use guest context switch events to keep track of which guest thread is
running on a particular guest machine and VCPU.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-31-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To aid debugging, add some more logging to intel_pt_walk_next_insn().
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-30-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove guest_machine_pid because it is not needed.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-29-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a helper function to determine if an event is a guest event.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-28-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If a kernel mmap event was recorded inside a guest and injected into a host
perf.data file, then it will match a host mmap_name not a guest mmap_name,
see machine__set_mmap_name(). So try matching a host mmap_name in that
case.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-27-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Prepare machine__set_current_tid() for use with guest machines that do
not currently have a machine->env->nr_cpus_avail value by making use of
realloc_array_as_needed().
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-26-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Inject events from a perf.data file recorded in a virtual machine into
a perf.data file recorded on the host at the same time.
Only side band events (e.g. mmap, comm, fork, exit etc) and build IDs are
injected. Additionally, the guest kcore_dir is copied as kcore_dir__
appended to the machine PID.
This is non-trivial because:
o It is not possible to process 2 sessions simultaneously so instead
events are first written to a temporary file.
o To avoid conflict, guest sample IDs are replaced with new unused sample
IDs.
o Guest event's CPU is changed to be the host CPU because it is more
useful for reporting and analysis.
o Sample ID is mapped to machine PID which is recorded with VCPU in the
id index. This is important to allow guest events to be related to the
guest machine and VCPU.
o Timestamps must be converted.
o Events are inserted to obey finished-round ordering.
The anticipated use-case is:
- start recording sideband events in a guest machine
- start recording an AUX area trace on the host which can trace also the
guest (e.g. Intel PT)
- run test case on the guest
- stop recording on the host
- stop recording on the guest
- copy the guest perf.data file to the host
- inject the guest perf.data file sideband events into the host perf.data
file using perf inject
- the resulting perf.data file can now be used
Subsequent patches provide Intel PT support for this.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-25-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add helper reallocarray_as_needed() to reallocate an array to a larger
size and initialize the extra entries to an arbitrary value.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-24-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When registering a guest machine using machine_pid from the id index,
check perf.data for a matching kcore_dir subdirectory and set the
kallsyms file name accordingly. If set, use it to find the machine's
kernel symbols and object code (from kcore).
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-23-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Copies of /proc/kallsyms, /proc/modules and an extract of /proc/kcore can
be stored in the perf.data output directory under the subdirectory named
kcore_dir. Guest machines will have their files also under subdirectories
beginning kcore_dir__ followed by the machine pid. Make has_kcore_dir()
return true also if there is a guest machine kcore_dir.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-22-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Copies of /proc/kallsyms, /proc/modules and an extract of /proc/kcore can
be stored in the perf.data output directory under the subdirectory named
kcore_dir. Guest machines will have their files also under subdirectories
beginning kcore_dir__ followed by the machine pid. Remove these also when
removing kcore_dir.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-21-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add machine_pid and vcpu to the intel-pt-events.py script.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-20-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add machine_pid and vcpu to struct perf_record_auxtrace_error. The existing
fmt member is used to identify the new format.
The new members make it possible to easily differentiate errors from guest
machines.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-18-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add machine_pid and vcpu to struct perf_dlfilter_sample. The 'size' can be
used to determine if the values are present, however machine_pid is zero if
unused in any case. vcpu should be ignored if machine_pid is zero.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-17-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add fields machine_pid and vcpu. These are displayed only if machine_pid is
non-zero.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-16-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If machine_pid is set, use it to find the guest machine.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-15-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When parsing a sample with a sample ID, copy machine_pid and vcpu from
perf_sample_id to perf_sample.
Note, machine_pid will be zero when unused, so only a non-zero value
represents a guest machine. vcpu should be ignored if machine_pid is zero.
Note also, machine_pid is used with events that have come from injecting a
guest perf.data file, however guest events recorded on the host (i.e. using
perf kvm) have the (QEMU) hypervisor process pid to identify them - refer
machines__find_for_cpumode().
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>