IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Instead of populating multiple sets to indicate some attribute and then
researching the same BTF ID in them, prepare a single unified BTF set
which indicates whether a kfunc is allowed to be called, and also its
attributes if any at the same time. Now, only one call is needed to
perform the lookup for both kfunc availability and its attributes.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220721134245.2450-4-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This patch fixes a build error reported in the link. [0]
unix_connect.c: In function ‘unix_connect_test’:
unix_connect.c:115:55: error: expected identifier before ‘(’ token
#define offsetof(type, member) ((size_t)&((type *)0)->(member))
^
unix_connect.c:128:12: note: in expansion of macro ‘offsetof’
addrlen = offsetof(struct sockaddr_un, sun_path) + variant->len;
^~~~~~~~
We can fix this by removing () around member, but checkpatch will complain
about it, and the root cause of the build failure is that I followed the
warning and fixed this in the v2 -> v3 change of the blamed commit. [1]
CHECK: Macro argument 'member' may be better as '(member)' to avoid precedence issues
#33: FILE: tools/testing/selftests/net/af_unix/unix_connect.c:115:
+#define offsetof(type, member) ((size_t)&((type *)0)->member)
To avoid this warning, let's use offsetof() defined in stddef.h instead.
[0]: https://lore.kernel.org/linux-mm/202207182205.FrkMeDZT-lkp@intel.com/
[1]: https://lore.kernel.org/netdev/20220702154818.66761-1-kuniyu@amazon.com/
Fixes: e95ab1d852 ("selftests: net: af_unix: Test connect() with different netns.")
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20220720005750.16600-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When building selftests out of the kernel tree the gpio.h the include
path is incorrect and the build falls back to the system includes
which may be outdated.
Add the KHDR_INCLUDES to the CFLAGS to include the gpio.h from the
build tree.
Fixes: 4f4d0af7b2 ("selftests: gpio: restore CFLAGS options")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
The snprintf() function returns the number of bytes which *would*
have been copied if there were space. In other words, it can be
> sizeof(pin_path).
Fixes: c0fa1b6c3e ("bpf: btf: Add BTF tests")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/r/YtZ+aD/tZMkgOUw+@kili
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add test validating that libbpf adjusts (and reflects adjusted) ringbuf
size early, before bpf_object is loaded. Also make sure we can't
successfully resize ringbuf map after bpf_object is loaded.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220715230952.2219271-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a simple big 16MB array and validate access to the very last byte of
it to make sure that kernel supports > KMALLOC_MAX_SIZE value_size for
BPF array maps (which are backing .bss in this case).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220715053146.1291891-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Convert few selftest that used plain SEC("kprobe") with arch-specific
syscall wrapper prefix to ksyscall/kretsyscall and corresponding
BPF_KSYSCALL macro. test_probe_user.c is especially benefiting from this
simplification.
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220714070755.3235561-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Exercise libbpf's logic for unknown __weak virtual __kconfig externs.
USDT selftests are already excercising non-weak known virtual extern
already (LINUX_HAS_BPF_COOKIE), so no need to add explicit tests for it.
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220714070755.3235561-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
In rseq_test, there are two threads, which are vCPU thread and migration
worker separately. Unfortunately, the test has the wrong PID passed to
sched_setaffinity() in the migration worker. It forces migration on the
migration worker because zeroed PID represents the calling thread, which
is the migration worker itself. It means the vCPU thread is never enforced
to migration and it can migrate at any time, which eventually leads to
failure as the following logs show.
host# uname -r
5.19.0-rc6-gavin+
host# # cat /proc/cpuinfo | grep processor | tail -n 1
processor : 223
host# pwd
/home/gavin/sandbox/linux.main/tools/testing/selftests/kvm
host# for i in `seq 1 100`; do \
echo "--------> $i"; ./rseq_test; done
--------> 1
--------> 2
--------> 3
--------> 4
--------> 5
--------> 6
==== Test Assertion Failure ====
rseq_test.c:265: rseq_cpu == cpu
pid=3925 tid=3925 errno=4 - Interrupted system call
1 0x0000000000401963: main at rseq_test.c:265 (discriminator 2)
2 0x0000ffffb044affb: ?? ??:0
3 0x0000ffffb044b0c7: ?? ??:0
4 0x0000000000401a6f: _start at ??:?
rseq CPU = 4, sched CPU = 27
Fix the issue by passing correct parameter, TID of the vCPU thread, to
sched_setaffinity() in the migration worker.
Fixes: 61e52f1630 ("KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs")
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Message-Id: <20220719020830.3479482-1-gshan@redhat.com>
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When RDRAND was introduced, there was much discussion on whether it
should be trusted and how the kernel should handle that. Initially, two
mechanisms cropped up, CONFIG_ARCH_RANDOM, a compile time switch, and
"nordrand", a boot-time switch.
Later the thinking evolved. With a properly designed RNG, using RDRAND
values alone won't harm anything, even if the outputs are malicious.
Rather, the issue is whether those values are being *trusted* to be good
or not. And so a new set of options were introduced as the real
ones that people use -- CONFIG_RANDOM_TRUST_CPU and "random.trust_cpu".
With these options, RDRAND is used, but it's not always credited. So in
the worst case, it does nothing, and in the best case, maybe it helps.
Along the way, CONFIG_ARCH_RANDOM's meaning got sort of pulled into the
center and became something certain platforms force-select.
The old options don't really help with much, and it's a bit odd to have
special handling for these instructions when the kernel can deal fine
with the existence or untrusted existence or broken existence or
non-existence of that CPU capability.
Simplify the situation by removing CONFIG_ARCH_RANDOM and using the
ordinary asm-generic fallback pattern instead, keeping the two options
that are actually used. For now it leaves "nordrand" for now, as the
removal of that will take a different route.
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Booting with vsyscall=xonly results in the following vsyscall VMA:
ffffffffff600000-ffffffffff601000 --xp ... [vsyscall]
Test does read from fixed vsyscall address to determine if kernel
supports vsyscall page but it doesn't work because, well, vsyscall
page is execute only.
Fix test by trying to execute from the first byte of the page which
contains gettimeofday() stub. This should work because vsyscall
entry points have stable addresses by design.
Alexey, avoiding parsing .config, /proc/config.gz and
/proc/cmdline at all costs.
Link: https://lkml.kernel.org/r/Ys2KgeiEMboU8Ytu@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: <dylanbhatch@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The objective is to test device migration mechanism in pages marked as
COW, for private and coherent device type. In case of writing to COW
private page(s), a page fault will migrate pages back to system memory
first. Then, these pages will be duplicated. In case of COW device
coherent type, pages are duplicated directly from device memory.
Link: https://lkml.kernel.org/r/20220715150521.18165-15-alex.sierra@amd.com
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The intention is to test hmm device coherent type under different get user
pages paths. Also, test gup with FOLL_LONGTERM flag set in device
coherent pages. These pages should get migrated back to system memory.
Link: https://lkml.kernel.org/r/20220715150521.18165-14-alex.sierra@amd.com
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add two more parameters to set spm_addr_dev0 & spm_addr_dev1 addresses.
These two parameters configure the start SP addresses for each device in
test_hmm driver. Consequently, this configures zone device type as
coherent.
Link: https://lkml.kernel.org/r/20220715150521.18165-13-alex.sierra@amd.com
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Test cases such as migrate_fault and migrate_multiple, were modified to
explicit migrate from device to sys memory without the need of page
faults, when using device coherent type.
Snapshot test case updated to read memory device type first and based on
that, get the proper returned results migrate_ping_pong test case added to
test explicit migration from device to sys memory for both private and
coherent zone types.
Helpers to migrate from device to sys memory and vicerversa were also
added.
Link: https://lkml.kernel.org/r/20220715150521.18165-12-alex.sierra@amd.com
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ipv4 arp_accept has a new option '2' to create new neighbor entries
only if the src ip is in the same subnet as an address configured on
the interface that received the garp message. This selftest tests all
options in arp_accept.
ipv6 has a sysctl endpoint, accept_untracked_na, that defines the
behavior for accepting untracked neighbor advertisements. A new option
similar to that of arp_accept for learning only from the same subnet is
added to accept_untracked_na. This selftest tests this new feature.
Signed-off-by: Jaehee Park <jhpark1013@gmail.com>
Suggested-by: Roopa Prabhu <roopa@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add some notes on how to run the test, update the policy file paths to
reflect recent upstream changes, prepare test for adding GID testing.
Signed-off-by: Micah Morton <mortonm@chromium.org>
Not sure how this bug got in here but its been there since the original
merge. I think I tested the code on a system that wouldn't let me
clone() with CLONE_NEWUSER flag set so had to comment out these
test_userns invocations.
Trying to map UID 0 inside the userns to UID 0 outside will never work,
even with CAP_SETUID. The code is supposed to test whether we can map
UID 0 in the userns to the UID of the parent process (the one with
CAP_SETUID that is writing the /proc/[pid]/uid_map file).
Signed-off-by: Micah Morton <mortonm@chromium.org>
The current vgic_init test wrongly assumes that the host cannot
multiple versions of the GIC architecture, while v2 emulation
on v3 has almost always been supported (it was supported before
the standalone v3 emulation).
Tweak the test to support multiple GIC incarnations.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Fixes: 3f4db37e20 ("KVM: arm64: selftests: Make vgic_init gic version agnostic")
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Link: https://lore.kernel.org/r/20220714154108.3531213-1-maz@kernel.org
Alexei reported crash by running test_progs -j on system
with 32 cpus.
It turned out the kprobe_multi bench test that attaches all
ftrace-able functions will race with bpf_dispatcher_update,
that calls bpf_arch_text_poke on bpf_dispatcher_xdp_func,
which is ftrace-able function.
Ftrace is not aware of this update so this will cause
ftrace_bug with:
WARNING: CPU: 6 PID: 1985 at
arch/x86/kernel/ftrace.c:94 ftrace_verify_code+0x27/0x50
...
ftrace_replace_code+0xa3/0x170
ftrace_modify_all_code+0xbd/0x150
ftrace_startup_enable+0x3f/0x50
ftrace_startup+0x98/0xf0
register_ftrace_function+0x20/0x60
register_fprobe_ips+0xbb/0xd0
bpf_kprobe_multi_link_attach+0x179/0x430
__sys_bpf+0x18a1/0x2440
...
------------[ ftrace bug ]------------
ftrace failed to modify
[<ffffffff818d9380>] bpf_dispatcher_xdp_func+0x0/0x10
actual: ffffffe9:7b:ffffff9c:77:1e
Setting ftrace call site to call ftrace function
It looks like we need some way to hide some functions
from ftrace, but meanwhile we workaround this by skipping
bpf_dispatcher_xdp_func from kprobe_multi bench test.
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220714082316.479181-1-jolsa@kernel.org
The khdr make target has been removed, so drop it from the landlock
Makefile dependencies as well as related include paths that are
standard for headers in the kernel tree.
Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Reported-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
So we have proper counters at the end of a test. We also print the
kselftest header at the end of the test, so we don't mix with the output
of the child process. There is only this one test anyhow.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: John Stultz <jstultz@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
So the user can decide how long the test should run.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: John Stultz <jstultz@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The sanity check takes a while. If you do repeated checks when
debugging, this is time consuming. Add a parameter to skip it.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: John Stultz <jstultz@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
It is easier to check if you need to add an include if the existing ones
are sorted.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: John Stultz <jstultz@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The return value from system() is a waitpid-style integer. Do not return
it directly because with the implicit masking in exit() it will always
return 0. Access it with appropriate macros to really pass on errors.
Fixes: 7290ce1423 ("selftests/timers: Add clocksource-switch test from timetest suite")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: John Stultz <jstultz@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
So we have proper counters at the end of a test, e.g.:
# Totals: pass:11 fail:0 xfail:0 xpass:0 skip:1 error:0
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: John Stultz <jstultz@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
So we have proper counters at the end of a test, e.g.:
# Totals: pass:4 fail:0 xfail:0 xpass:0 skip:8 error:0
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: John Stultz <jstultz@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Mixing up argc/argv went unnoticed because they were not used. Still,
this is worth fixing.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: John Stultz <jstultz@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Toolchains with an include file 'sys/timex.h' based on 3.18 will have a
'clock_adjtime' definition added, so it can't be static in the code:
valid-adjtimex.c:43:12: error: static declaration of ‘clock_adjtime’ follows non-static declaration
Fixes: e03a58c320 ("kselftests: timers: Add adjtimex SETOFFSET validity tests")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: John Stultz <jstultz@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Return boolean values ("true" or "false") instead of 1 or 0 from bool
functions. This fixes the following warnings from coccicheck:
tools/testing/selftests/bpf/progs/test_xdp_noinline.c:407:9-10: WARNING:
return of 0/1 in function 'decap_v4' with return type bool
tools/testing/selftests/bpf/progs/test_xdp_noinline.c:389:9-10: WARNING:
return of 0/1 in function 'decap_v6' with return type bool
tools/testing/selftests/bpf/progs/test_xdp_noinline.c:290:9-10: WARNING:
return of 0/1 in function 'encap_v6' with return type bool
tools/testing/selftests/bpf/progs/test_xdp_noinline.c:264:9-10: WARNING:
return of 0/1 in function 'parse_tcp' with return type bool
tools/testing/selftests/bpf/progs/test_xdp_noinline.c:242:9-10: WARNING:
return of 0/1 in function 'parse_udp' with return type bool
Generated by: scripts/coccinelle/misc/boolreturn.cocci
Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Linkui Xiao <xiaolinkui@kylinos.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20220714015647.25074-1-xiaolinkui@kylinos.cn
Use cpuid() to get CPUID.0x0 in cpu_vendor_string_is(), thus eliminating
the last open coded usage of CPUID (ignoring debug_regs.c, which emits
CPUID from the guest to trigger a VM-Exit and doesn't actually care about
the results of CPUID).
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-42-seanjc@google.com
Provide informative error messages for the various checks related to
requesting access to XSAVE features that are buried behind XSAVE Feature
Disabling (XFD).
Opportunistically rename the helper to have "require" in the name so that
it's somewhat obvious that the helper may skip the test.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-41-seanjc@google.com
Skip the AMX test instead of silently returning if the host kernel
doesn't support ARCH_REQ_XCOMP_GUEST_PERM. KVM didn't support XFD until
v5.17, so it's extremely unlikely allowing the test to run on a pre-v5.15
kernel is the right thing to do.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-40-seanjc@google.com
Use kvm_cpu_has() to check for XFD supported in vm_xsave_req_perm(),
simply checking host CPUID doesn't guarantee KVM supports AMX/XFD.
Opportunistically hoist the check above the bit check; if XFD isn't
supported, it's far better to get a "not supported at all" message, as
opposed to a "feature X isn't supported" message".
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-39-seanjc@google.com
Rename kvm_get_supported_cpuid_index() to __kvm_get_supported_cpuid_entry()
to better show its relationship to kvm_get_supported_cpuid_entry(), and
because the helper returns a CPUID entry, not the index of an entry.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-37-seanjc@google.com
Use kvm_get_supported_cpuid_entry() instead of
kvm_get_supported_cpuid_index() when passing in '0' for the index, which
just so happens to be the case in all remaining users of
kvm_get_supported_cpuid_index() except kvm_get_supported_cpuid_entry().
Keep the helper as there may be users in the future, and it's not doing
any harm.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-36-seanjc@google.com
Add this_cpu_has() to query an X86_FEATURE_* via cpuid(), i.e. to query a
feature from L1 (or L2) guest code. Arbitrarily select the AMX test to
be the first user.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-33-seanjc@google.com
Set the function/index for CPUID in the helper instead of relying on the
caller to do so. In addition to reducing the risk of consuming an
uninitialized ECX, having the function/index embedded in the call makes
it easier to understand what is being checked.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-32-seanjc@google.com
Tag the returned CPUID pointers from kvm_get_supported_cpuid(),
kvm_get_supported_hv_cpuid(), and vcpu_get_supported_hv_cpuid() "const"
to prevent reintroducing the broken pattern of modifying the static
"cpuid" variable used by kvm_get_supported_cpuid() to cache the results
of KVM_GET_SUPPORTED_CPUID.
Update downstream consumers as needed.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-31-seanjc@google.com
Use vcpu_{set,clear}_cpuid_feature() to toggle nested VMX support in the
vCPU CPUID module in the nVMX state test. Drop CPUID_VMX as there are
no longer any users.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-29-seanjc@google.com
Use the vCPU's persistent CPUID array directly when manipulating the set
of exposed Hyper-V CPUID features. Drop set_cpuid() to route all future
modification through the vCPU helpers; the Hyper-V features test was the
last user.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-27-seanjc@google.com
Add a new helper, vcpu_clear_cpuid_entry(), to do a RMW operation on the
vCPU's CPUID model to clear a given CPUID entry, and use it to clear
KVM's paravirt feature instead of operating on kvm_get_supported_cpuid()'s
static "cpuid" variable. This also eliminates a user of
the soon-be-defunct set_cpuid() helper.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-26-seanjc@google.com
Use vcpu_clear_cpuid_feature() to the MONITOR/MWAIT CPUID feature bit in
the MONITOR/MWAIT quirk test.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add a helper to set a vCPU's guest.MAXPHYADDR, and use it in the test
that verifies the emulator returns an error on an unknown instruction
when KVM emulates in response to an EPT violation with a GPA that is
legal in hardware but illegal with respect to the guest's MAXPHYADDR.
Add a helper even though there's only a single user at this time. Before
its removal, mmu_role_test also stuffed guest.MAXPHYADDR, and the helper
provides a small amount of clarity.
More importantly, this eliminates a set_cpuid() user and an instance of
modifying kvm_get_supported_cpuid()'s static "cpuid".
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-25-seanjc@google.com
Use vm->pa_bits to generate the mask of physical address bits that are
reserved in page table entries. vm->pa_bits is set when the VM is
created, i.e. it's guaranteed to be valid when populating page tables.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-24-seanjc@google.com
Add helpers to get a specific CPUID entry for a given vCPU, and to toggle
a specific CPUID-based feature for a vCPU. The helpers will reduce the
amount of boilerplate code needed to tweak a vCPU's CPUID model, improve
code clarity, and most importantly move tests away from modifying the
static "cpuid" returned by kvm_get_supported_cpuid().
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-23-seanjc@google.com
Rename get_cpuid() to get_cpuid_entry() to better reflect its behavior.
Leave set_cpuid() as is to avoid unnecessary churn, that helper will soon
be removed entirely.
Oppurtunistically tweak the implementation to avoid using a temporary
variable in anticipation of taggin the input @cpuid with "const".
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-21-seanjc@google.com
Don't use a static variable for the Hyper-V supported CPUID array, the
helper unconditionally reallocates the array on every invocation (and all
callers free the array immediately after use). The array is intentionally
recreated and refilled because the set of supported CPUID features is
dependent on vCPU state, e.g. whether or not eVMCS has been enabled.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-20-seanjc@google.com
Cache a vCPU's CPUID information in "struct kvm_vcpu" to allow fixing the
mess where tests, often unknowingly, modify the global/static "cpuid"
allocated by kvm_get_supported_cpuid().
Add vcpu_init_cpuid() to handle stuffing an entirely different CPUID
model, e.g. during vCPU creation or when switching to the Hyper-V enabled
CPUID model. Automatically refresh the cache on vcpu_set_cpuid() so that
any adjustments made by KVM are always reflected in the cache. Drop
vcpu_get_cpuid() entirely to force tests to use the cache, and to allow
adding e.g. vcpu_get_cpuid_entry() in the future without creating a
conflicting set of APIs where vcpu_get_cpuid() does KVM_GET_CPUID2, but
vcpu_get_cpuid_entry() does not.
Opportunistically convert the VMX nested state test and KVM PV test to
manipulating the vCPU's CPUID (because it's easy), but use
vcpu_init_cpuid() for the Hyper-V features test and "emulator error" test
to effectively retain their current behavior as they're less trivial to
convert.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-19-seanjc@google.com
Split out the computation of the effective size of a kvm_cpuid2 struct
from allocate_kvm_cpuid2(), and modify both to take an arbitrary number
of entries. Future commits will add caching of a vCPU's CPUID model, and
will (a) be able to precisely size the entries array, and (b) will need
to know the effective size of the struct in order to copy to/from the
cache.
Expose the helpers so that the Hyper-V Features test can use them in the
(somewhat distant) future. The Hyper-V test very, very subtly relies on
propagating CPUID info across vCPU instances, and will need to make a
copy of the previous vCPU's CPUID information when it switches to using
the per-vCPU cache. Alternatively, KVM could provide helpers to
duplicate and/or copy a kvm_cpuid2 instance, but each is literally a
single line of code if the helpers are exposed, and it's not like the
size of kvm_cpuid2 is secret knowledge.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-18-seanjc@google.com
In the CPUID test, verify that KVM doesn't modify the kvm_cpuid2.entries
layout, i.e. that the order of entries and their flags is identical
between what the test provides via KVM_SET_CPUID2 and what KVM returns
via KVM_GET_CPUID2.
Asserting that the layouts match simplifies the test as there's no need
to iterate over both arrays.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-17-seanjc@google.com
Use kvm_cpu_has() in the stea-ltime test instead of open coding
equivalent functionality using kvm_get_supported_cpuid_entry().
Opportunistically define all of KVM's paravirt CPUID-based features.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-15-seanjc@google.com
Remove the MMU role test, which was made obsolete by KVM commit
feb627e8d6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN"). The
ongoing costs of keeping the test updated far outweigh any benefits,
e.g. the test _might_ be useful as an example or for documentation
purposes, but otherwise the test is dead weight.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-14-seanjc@google.com
Use kvm_cpu_has() in the CR4/CPUID sync test instead of open coding
equivalent functionality using kvm_get_supported_cpuid_entry().
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-13-seanjc@google.com
Use kvm_cpu_has() in the AMX test instead of open coding equivalent
functionality using kvm_get_supported_cpuid_entry() and
kvm_get_supported_cpuid_index().
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-12-seanjc@google.com
Check for _both_ XTILE data and cfg support in the AMX test instead of
checking for _either_ feature. Practically speaking, no sane CPU or vCPU
will support one but not the other, but the effective "or" behavior is
subtle and technically incorrect.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-11-seanjc@google.com
Use kvm_cpu_has() in the XSS MSR test instead of open coding equivalent
functionality using kvm_get_supported_cpuid_index().
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-10-seanjc@google.com
Drop a redundant vcpu_set_cpuid() from the PMU test. The vCPU's CPUID is
set to KVM's supported CPUID by vm_create_with_one_vcpu(), which was also
true back when the helper was named vm_create_default().
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-9-seanjc@google.com
Use kvm_cpu_has() in the PMU test to query PDCM support instead of open
coding equivalent functionality using kvm_get_supported_cpuid_index().
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-8-seanjc@google.com
Use kvm_cpu_has() to check for nested VMX support, and drop the helpers
now that their functionality is trivial to implement.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-7-seanjc@google.com
Use kvm_cpu_has() to check for nested SVM support, and drop the helpers
now that their functionality is trivial to implement.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-6-seanjc@google.com
Use kvm_cpu_has() in the SEV migration test instead of open coding
equivalent functionality using kvm_get_supported_cpuid_entry().
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-5-seanjc@google.com
Add X86_FEATURE_* magic in the style of KVM-Unit-Tests' implementation,
where the CPUID function, index, output register, and output bit position
are embedded in the macro value. Add kvm_cpu_has() to query KVM's
supported CPUID and use it set_sregs_test, which is the most prolific
user of manual feature querying.
Opportunstically rename calc_cr4_feature_bits() to
calc_supported_cr4_feature_bits() to better capture how the CR4 bits are
chosen.
Link: https://lore.kernel.org/all/20210422005626.564163-1-ricarkol@google.com
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-4-seanjc@google.com
Rename X86_FEATURE_* macros to CPUID_* in various tests to free up the
X86_FEATURE_* names for KVM-Unit-Tests style CPUID automagic where the
function, leaf, register, and bit for the feature is embedded in its
macro value.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-3-seanjc@google.com
On x86-64, set KVM's supported CPUID as the vCPU's CPUID when recreating
a VM+vCPU to deduplicate code for state save/restore tests, and to
provide symmetry of sorts with respect to vm_create_with_one_vcpu(). The
extra KVM_SET_CPUID2 call is wasteful for Hyper-V, but ultimately is
nothing more than an expensive nop, and overriding the vCPU's CPUID with
the Hyper-V CPUID information is the only known scenario where a state
save/restore test wouldn't need/want the default CPUID.
Opportunistically use __weak for the default vm_compute_max_gfn(), it's
provided by tools' compiler.h.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-2-seanjc@google.com
Fix filename reporting in guest asserts by ensuring the GUEST_ASSERT
macro records __FILE__ and substituting REPORT_GUEST_ASSERT for many
repetitive calls to TEST_FAIL.
Previously filename was reported by using __FILE__ directly in the
selftest, wrongly assuming it would always be the same as where the
assertion failed.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
Reported-by: Ricardo Koller <ricarkol@google.com>
Fixes: 4e18bccc2e
Link: https://lore.kernel.org/r/20220615193116.806312-5-coltonlewis@google.com
[sean: convert more TEST_FAIL => REPORT_GUEST_ASSERT instances]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Write REPORT_GUEST_ASSERT macros to pair with GUEST_ASSERT to abstract
and make consistent all guest assertion reporting. Every report
includes an explanatory string, a filename, and a line number.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20220615193116.806312-4-coltonlewis@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Increase UCALL_MAX_ARGS to 7 to allow GUEST_ASSERT_4 to pass 3 builtin
ucall arguments specified in guest_assert_builtin_args plus 4
user-specified arguments.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20220615193116.806312-3-coltonlewis@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Enumerate GUEST_ASSERT arguments to avoid magic indices to ucall.args.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20220615193116.806312-2-coltonlewis@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add a "UD" clause to KVM_X86_QUIRK_MWAIT_NEVER_FAULTS to make it clear
that the quirk only controls the #UD behavior of MONITOR/MWAIT. KVM
doesn't currently enforce fault checks when MONITOR/MWAIT are supported,
but that could change in the future. SVM also has a virtualization hole
in that it checks all faults before intercepts, and so "never faults" is
already a lie when running on SVM.
Fixes: bfbcc81bb8 ("KVM: x86: Add a quirk for KVM's "MONITOR/MWAIT are NOPs!" behavior")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220711225753.1073989-4-seanjc@google.com
Do not use GCC's "A" constraint to load EAX:EDX in wrmsr_safe(). Per
GCC's documenation on x86-specific constraints, "A" will not actually
load a 64-bit value into EAX:EDX on x86-64.
The a and d registers. This class is used for instructions that return
double word results in the ax:dx register pair. Single word values will
be allocated either in ax or dx. For example on i386 the following
implements rdtsc:
unsigned long long rdtsc (void)
{
unsigned long long tick;
__asm__ __volatile__("rdtsc":"=A"(tick));
return tick;
}
This is not correct on x86-64 as it would allocate tick in either ax or
dx. You have to use the following variant instead:
unsigned long long rdtsc (void)
{
unsigned int tickl, tickh;
__asm__ __volatile__("rdtsc":"=a"(tickl),"=d"(tickh));
return ((unsigned long long)tickh << 32)|tickl;
}
Because a u64 fits in a single 64-bit register, using "A" for selftests,
which are 64-bit only, results in GCC loading the value into either RAX
or RDX instead of splitting it across EAX:EDX.
E.g.:
kvm_exit: reason MSR_WRITE rip 0x402919 info 0 0
kvm_msr: msr_write 40000118 = 0x60000000001 (#GP)
...
With "A":
48 8b 43 08 mov 0x8(%rbx),%rax
49 b9 ba da ca ba 0a movabs $0xabacadaba,%r9
00 00 00
4c 8d 15 07 00 00 00 lea 0x7(%rip),%r10 # 402f44 <guest_msr+0x34>
4c 8d 1d 06 00 00 00 lea 0x6(%rip),%r11 # 402f4a <guest_msr+0x3a>
0f 30 wrmsr
With "a"/"d":
48 8b 53 08 mov 0x8(%rbx),%rdx
89 d0 mov %edx,%eax
48 c1 ea 20 shr $0x20,%rdx
49 b9 ba da ca ba 0a movabs $0xabacadaba,%r9
00 00 00
4c 8d 15 07 00 00 00 lea 0x7(%rip),%r10 # 402fc3 <guest_msr+0xb3>
4c 8d 1d 06 00 00 00 lea 0x6(%rip),%r11 # 402fc9 <guest_msr+0xb9>
0f 30 wrmsr
Fixes: 3b23054cd3 ("KVM: selftests: Add x86-64 support for exception fixup")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints
[sean: use "& -1u", provide GCC blurb and link to documentation]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220714011115.3135828-1-seanjc@google.com
Add a couple of test-cases covering the newly introduced
features - priority update for the MPC subflow.
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Provide valid inputs for RAX, RCX, and RDX when testing whether or not
KVM injects a #UD on MONITOR/MWAIT. SVM has a virtualization hole and
checks for _all_ faults before checking for intercepts, e.g. MONITOR with
an unsupported RCX will #GP before KVM gets a chance to intercept and
emulate.
Fixes: 2325d4dd73 ("KVM: selftests: Add MONITOR/MWAIT quirk test")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220711225753.1073989-3-seanjc@google.com
Fix a copy+paste error in monitor_mwait_test by switching one of the two
"monitor" instructions to an "mwait". The intent of the test is very
much to verify the quirk handles both MONITOR and MWAIT.
Fixes: 2325d4dd73 ("KVM: selftests: Add MONITOR/MWAIT quirk test")
Reported-by: Yuan Yao <yuan.yao@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220711225753.1073989-2-seanjc@google.com
add subtest verifying BPF ksym iter behaviour. The BPF ksym
iter program shows an example of dumping a format different to
/proc/kallsyms. It adds KIND and MAX_SIZE fields which represent the
kind of symbol (core kernel, module, ftrace, bpf, or kprobe) and
the maximum size the symbol can be. The latter is calculated from
the difference between current symbol value and the next symbol
value.
The key benefit for this iterator will likely be supporting in-kernel
data-gathering rather than dumping symbol details to userspace and
parsing the results.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/1657629105-7812-3-git-send-email-alan.maguire@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Drop the KSFT_KHDR_INSTALL make target now that all use-cases have
been removed from the other kselftest Makefiles.
Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Stop using the KSFT_KHDR_INSTALL flag as installing the kernel headers
from the kselftest Makefile is causing some issues. Instead, rely on
the headers to be installed directly by the top-level Makefile
"headers_install" make target prior to building kselftest.
Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Drop the "khdr" make target as it fails when the build directory is a
sub-directory of the source tree. Rely on the "headers_install"
target to have been run first instead.
For example, here's a typical error this patch is addressing:
$ make O=build -j32 kselftest-gen_tar
make[1]: Entering directory '/home/kernelci/linux/build'
make --no-builtin-rules INSTALL_HDR_PATH=/home/kernelci/linux/build/usr \
ARCH=x86 -C ../../.. headers_install
make[3]: Entering directory '/home/kernelci/linux'
Makefile:1022: ../scripts/Makefile.extrawarn: No such file or directory
The source directory is determined in the top-level Makefile as ".."
relatively to the "build" directory, but then the kselftest Makefile
switches to "-C ../../.." so "../scripts" then points one level higher
than the source tree e.g. "linux/../scripts" - which fails obviously.
There is no other use-case in the kernel tree where a sub-directory
Makefile tries to call a top-level make target, and it appears this
isn't really a valid thing to do.
Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Make any kselftest test module (using the kselftest_module framework)
taint the kernel with TAINT_TEST on module load.
Also mark the module as a test module using MODULE_INFO(test, "Y") so
that other tools can tell this is a test module. We can't rely solely
on this, though, as these test modules are also often built-in.
Finally, update the kselftest documentation to mention that the kernel
should be tainted, and how to do so manually (as below).
Note that several selftests use kernel modules which are not based on
the kselftest_module framework, and so will not automatically taint the
kernel.
This can be done in two ways:
- Moving the module to the tools/testing directory. All modules under
this directory will taint the kernel.
- Adding the 'test' module property with:
MODULE_INFO(test, "Y")
Similarly, selftests which do not load modules into the kernel generally
should not taint the kernel (or possibly should only do so on failure),
as it's assumed that testing from user-space should be safe. Regardless,
they can write to /proc/sys/kernel/tainted if required.
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The new script was not listed in the programs to test.
By consequence, some CIs running MPTCP selftests were not validating
these new tests. Note that MPTCP CI was validating it as it executes all
.sh scripts from 'tools/testing/selftests/net/mptcp' directory.
Fixes: 259a834fad ("selftests: mptcp: functional tests for the userspace PM type")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-07-09
We've added 94 non-merge commits during the last 19 day(s) which contain
a total of 125 files changed, 5141 insertions(+), 6701 deletions(-).
The main changes are:
1) Add new way for performing BTF type queries to BPF, from Daniel Müller.
2) Add inlining of calls to bpf_loop() helper when its function callback is
statically known, from Eduard Zingerman.
3) Implement BPF TCP CC framework usability improvements, from Jörn-Thorben Hinz.
4) Add LSM flavor for attaching per-cgroup BPF programs to existing LSM
hooks, from Stanislav Fomichev.
5) Remove all deprecated libbpf APIs in prep for 1.0 release, from Andrii Nakryiko.
6) Add benchmarks around local_storage to BPF selftests, from Dave Marchevsky.
7) AF_XDP sample removal (given move to libxdp) and various improvements around AF_XDP
selftests, from Magnus Karlsson & Maciej Fijalkowski.
8) Add bpftool improvements for memcg probing and bash completion, from Quentin Monnet.
9) Add arm64 JIT support for BPF-2-BPF coupled with tail calls, from Jakub Sitnicki.
10) Sockmap optimizations around throughput of UDP transmissions which have been
improved by 61%, from Cong Wang.
11) Rework perf's BPF prologue code to remove deprecated functions, from Jiri Olsa.
12) Fix sockmap teardown path to avoid sleepable sk_psock_stop, from John Fastabend.
13) Fix libbpf's cleanup around legacy kprobe/uprobe on error case, from Chuang Wang.
14) Fix libbpf's bpf_helpers.h to work with gcc for the case of its sec/pragma
macro, from James Hilliard.
15) Fix libbpf's pt_regs macros for riscv to use a0 for RC register, from Yixun Lan.
16) Fix bpftool to show the name of type BPF_OBJ_LINK, from Yafang Shao.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (94 commits)
selftests/bpf: Fix xdp_synproxy build failure if CONFIG_NF_CONNTRACK=m/n
bpf: Correctly propagate errors up from bpf_core_composites_match
libbpf: Disable SEC pragma macro on GCC
bpf: Check attach_func_proto more carefully in check_return_code
selftests/bpf: Add test involving restrict type qualifier
bpftool: Add support for KIND_RESTRICT to gen min_core_btf command
MAINTAINERS: Add entry for AF_XDP selftests files
selftests, xsk: Rename AF_XDP testing app
bpf, docs: Remove deprecated xsk libbpf APIs description
selftests/bpf: Add benchmark for local_storage RCU Tasks Trace usage
libbpf, riscv: Use a0 for RC register
libbpf: Remove unnecessary usdt_rel_ip assignments
selftests/bpf: Fix few more compiler warnings
selftests/bpf: Fix bogus uninitialized variable warning
bpftool: Remove zlib feature test from Makefile
libbpf: Cleanup the legacy uprobe_event on failed add/attach_event()
libbpf: Fix wrong variable used in perf_event_uprobe_open_legacy()
libbpf: Cleanup the legacy kprobe_event on failed add/attach_event()
selftests/bpf: Add type match test against kernel's task_struct
selftests/bpf: Add nested type to type based tests
...
====================
Link: https://lore.kernel.org/r/20220708233145.32365-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The usage header of pm_nl_ctl command doesn't match with the context. So
this patch adds the missing userspace PM keywords 'ann', 'rem', 'csf',
'dsf', 'events' and 'listen' in it.
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There're some 'Terminated' messages in the output of userspace pm tests
script after killing './pm_nl_ctl events' processes:
Created network namespaces ns1, ns2 [OK]
./userspace_pm.sh: line 166: 13735 Terminated ip netns exec "$ns2" ./pm_nl_ctl events >> "$client_evts" 2>&1
./userspace_pm.sh: line 172: 13737 Terminated ip netns exec "$ns1" ./pm_nl_ctl events >> "$server_evts" 2>&1
Established IPv4 MPTCP Connection ns2 => ns1 [OK]
./userspace_pm.sh: line 166: 13753 Terminated ip netns exec "$ns2" ./pm_nl_ctl events >> "$client_evts" 2>&1
./userspace_pm.sh: line 172: 13755 Terminated ip netns exec "$ns1" ./pm_nl_ctl events >> "$server_evts" 2>&1
Established IPv6 MPTCP Connection ns2 => ns1 [OK]
ADD_ADDR 10.0.2.2 (ns2) => ns1, invalid token [OK]
This patch adds a helper kill_wait(), in it using 'wait $pid 2>/dev/null'
commands after 'kill $pid' to avoid printing out these Terminated messages.
Use this helper instead of using 'kill $pid'.
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds userspace pm subflow tests support for mptcp_join.sh
script. Add userspace pm create subflow and destroy test cases in
userspace_tests().
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds userspace pm tests support for mptcp_join.sh script. Add
userspace pm add_addr and rm_addr test cases in userspace_tests().
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mentioned test measures the transfer run-time to verify
that the user-space program is able to use the full aggregate B/W.
Even on (virtual) link-speed-bound tests, debug kernel can slow
down the transfer enough to cause sporadic test failures.
Instead of unconditionally raising the maximum allowed run-time,
tweak when the running kernel is a debug one, and use some simple/
rough heuristic to guess such scenarios.
Note: this intentionally avoids looking for /boot/config-<version> as
the latter file is not always available in our reference CI
environments.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Co-developed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using the Makefile from tools/testing/selftests/net/forwarding/
all tests should be installed. Add no_forwarding.sh to the list of
"to be installed tests" where it has been missing so far.
Fixes: 476a4f05d9 ("selftests: forwarding: add a no_forwarding.sh test")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When using the Makefile from tools/testing/selftests/net/forwarding/
all tests should be installed. Add local_termination.sh to the list of
"to be installed tests" where it has been missing so far.
Fixes: 90b9566aa5 ("selftests: forwarding: add a test for local_termination.sh")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When CONFIG_NF_CONNTRACK=m, struct bpf_ct_opts and enum member
BPF_F_CURRENT_NETNS are not exposed. This commit allows building the
xdp_synproxy selftest in such cases. Note that nf_conntrack must be
loaded before running the test if it's compiled as a module.
This commit also allows this selftest to be successfully compiled when
CONFIG_NF_CONNTRACK is disabled.
One unused local variable of type struct bpf_ct_opts is also removed.
Fixes: fb5cd0ce70 ("selftests/bpf: Add selftests for raw syncookie helpers")
Reported-by: Yauheni Kaliuta <ykaliuta@redhat.com>
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220708130319.1016294-1-maximmi@nvidia.com
Daniel Borkmann says:
====================
bpf 2022-07-08
We've added 3 non-merge commits during the last 2 day(s) which contain
a total of 7 files changed, 40 insertions(+), 24 deletions(-).
The main changes are:
1) Fix cBPF splat triggered by skb not having a mac header, from Eric Dumazet.
2) Fix spurious packet loss in generic XDP when pushing packets out (note
that native XDP is not affected by the issue), from Johan Almbladh.
3) Fix bpf_dynptr_{read,write}() helper signatures with flag argument before
its set in stone as UAPI, from Joanne Koong.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Add flags arg to bpf_dynptr_read and bpf_dynptr_write APIs
bpf: Make sure mac_header was set before using it
xdp: Fix spurious packet loss in generic XDP TX path
====================
Link: https://lore.kernel.org/r/20220708213418.19626-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Syzkaller reports the following crash:
RIP: 0010:check_return_code kernel/bpf/verifier.c:10575 [inline]
RIP: 0010:do_check kernel/bpf/verifier.c:12346 [inline]
RIP: 0010:do_check_common+0xb3d2/0xd250 kernel/bpf/verifier.c:14610
With the following reproducer:
bpf$PROG_LOAD_XDP(0x5, &(0x7f00000004c0)={0xd, 0x3, &(0x7f0000000000)=ANY=[@ANYBLOB="1800000000000019000000000000000095"], &(0x7f0000000300)='GPL\x00', 0x0, 0x0, 0x0, 0x0, 0x0, '\x00', 0x0, 0x2b, 0xffffffffffffffff, 0x8, 0x0, 0x0, 0x10, 0x0}, 0x80)
Because we don't enforce expected_attach_type for XDP programs,
we end up in hitting 'if (prog->expected_attach_type == BPF_LSM_CGROUP'
part in check_return_code and follow up with testing
`prog->aux->attach_func_proto->type`, but `prog->aux->attach_func_proto`
is NULL.
Add explicit prog_type check for the "Note, BPF_LSM_CGROUP that
attach ..." condition. Also, don't skip return code check for
LSM/STRUCT_OPS.
The above actually brings an issue with existing selftest which
tries to return EPERM from void inet_csk_clone. Fix the
test (and move called_socket_clone to make sure it's not
incremented in case of an error) and add a new one to explicitly
verify this condition.
Fixes: 69fd337a97 ("bpf: per-cgroup lsm flavor")
Reported-by: syzbot+5cc0730bd4b4d2c5f152@syzkaller.appspotmail.com
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220708175000.2603078-1-sdf@google.com
Selftest udmabuf for the dma-buf driver is skipped when the device file
(e.g. /dev/udmabuf) for the DMA buffer cannot be opened i.e. no DMA buffer
has been allocated.
This patch adds clarity to the SKIP message.
Signed-off-by: Soumya Negi <soumya.negi97@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
This change adds a type based test involving the restrict type qualifier
to the BPF selftests. On the btfgen path, this will verify that bpftool
correctly handles the corresponding RESTRICT BTF kind.
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220706212855.1700615-3-deso@posteo.net
Recently, xsk part of libbpf was moved to selftests/bpf directory and
lives on its own because there is an AF_XDP testing application that
needs it called xdpxceiver. That name makes it a bit hard to indicate
who maintains it as there are other XDP samples in there, whereas this
one is strictly about AF_XDP.
Do s/xdpxceiver/xskxceiver so that it will be easier to figure out who
maintains it. A follow-up patch will correct MAINTAINERS file.
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220707111613.49031-2-maciej.fijalkowski@intel.com
Commit 13bbbfbea7 ("bpf: Add bpf_dynptr_read and bpf_dynptr_write")
added the bpf_dynptr_write() and bpf_dynptr_read() APIs.
However, it will be needed for some dynptr types to pass in flags as
well (e.g. when writing to a skb, the user may like to invalidate the
hash or recompute the checksum).
This patch adds a "u64 flags" arg to the bpf_dynptr_read() and
bpf_dynptr_write() APIs before their UAPI signature freezes where
we then cannot change them anymore with a 5.19.x released kernel.
Fixes: 13bbbfbea7 ("bpf: Add bpf_dynptr_read and bpf_dynptr_write")
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20220706232547.4016651-1-joannelkoong@gmail.com
Create enclave with additional heap that consumes all physical SGX
memory and then remove it.
Depending on the available SGX memory this test could take a
significant time to run (several minutes) as it (1) creates the
enclave, (2) changes the type of every page to be trimmed,
(3) enters the enclave once per page to run EACCEPT, before
(4) the pages are finally removed.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lkml.kernel.org/r/e7c6aa2ab30cb1c41e52b776958409c06970d168.1652137848.git.reinette.chatre@intel.com
Removing a page from an initialized enclave involves three steps:
(1) the user requests changing the page type to PT_TRIM via the
SGX_IOC_ENCLAVE_MODIFY_TYPES ioctl()
(2) on success the ENCLU[EACCEPT] instruction is run from within
the enclave to accept the page removal
(3) the user initiates the actual removal of the page via the
SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl().
Remove a page that has never been accessed. This means that when the
first ioctl() requesting page removal arrives, there will be no page
table entry, yet a valid page table entry needs to exist for the
ENCLU[EACCEPT] function to succeed. In this test it is verified that
a page table entry can still be installed for a page that is in the
process of being removed.
Suggested-by: Haitao Huang <haitao.huang@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lkml.kernel.org/r/45e1b2a2fcd8c14597d04e40af5d8a9c1c5b017e.1652137848.git.reinette.chatre@intel.com
Removing a page from an initialized enclave involves three steps:
(1) the user requests changing the page type to SGX_PAGE_TYPE_TRIM
via the SGX_IOC_ENCLAVE_MODIFY_TYPES ioctl(), (2) on success the
ENCLU[EACCEPT] instruction is run from within the enclave to accept
the page removal, (3) the user initiates the actual removal of the
page via the SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl().
Test two possible invalid accesses during the page removal flow:
* Test the behavior when a request to remove the page by changing its
type to SGX_PAGE_TYPE_TRIM completes successfully but instead of
executing ENCLU[EACCEPT] from within the enclave the enclave attempts
to read from the page. Even though the page is accessible from the
page table entries its type is SGX_PAGE_TYPE_TRIM and thus not
accessible according to SGX. The expected behavior is a page fault
with the SGX flag set in the error code.
* Test the behavior when the page type is changed successfully and
ENCLU[EACCEPT] was run from within the enclave. The final ioctl(),
SGX_IOC_ENCLAVE_REMOVE_PAGES, is omitted and replaced with an
attempt to access the page. Even though the page is accessible
from the page table entries its type is SGX_PAGE_TYPE_TRIM and
thus not accessible according to SGX. The expected behavior is
a page fault with the SGX flag set in the error code.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lkml.kernel.org/r/189a86c25d6d62da7cfdd08ee97abc1a06fcc179.1652137848.git.reinette.chatre@intel.com
Removing a page from an initialized enclave involves three steps:
first the user requests changing the page type to SGX_PAGE_TYPE_TRIM
via an ioctl(), on success the ENCLU[EACCEPT] instruction needs to be
run from within the enclave to accept the page removal, finally the
user requests page removal to be completed via an ioctl(). Only after
acceptance (ENCLU[EACCEPT]) from within the enclave can the kernel
remove the page from a running enclave.
Test the behavior when the user's request to change the page type
succeeds, but the ENCLU[EACCEPT] instruction is not run before the
ioctl() requesting page removal is run. This should not be permitted.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lkml.kernel.org/r/fa5da30ebac108b7517194c3038b52995602b996.1652137848.git.reinette.chatre@intel.com
Support for changing an enclave page's type enables an initialized
enclave to be expanded with support for more threads by changing the
type of a regular enclave page to that of a Thread Control Structure
(TCS). Additionally, being able to change a TCS or regular enclave
page's type to be trimmed (SGX_PAGE_TYPE_TRIM) initiates the removal
of the page from the enclave.
Test changing page type to TCS as well as page removal flows
in two phases: In the first phase support for a new thread is
dynamically added to an initialized enclave and in the second phase
the pages associated with the new thread are removed from the enclave.
As an additional sanity check after the second phase the page used as
a TCS page during the first phase is added back as a regular page and
ensured that it can be written to (which is not possible if it was a
TCS page).
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lkml.kernel.org/r/d05b48b00338683a94dcaef9f478540fc3d6d5f9.1652137848.git.reinette.chatre@intel.com
The Thread Control Structure (TCS) contains meta-data used by the
hardware to save and restore thread specific information when
entering/exiting the enclave. A TCS can be added to an initialized
enclave by first adding a new regular enclave page, initializing the
content of the new page from within the enclave, and then changing that
page's type to a TCS.
Support the initialization of a TCS from within the enclave.
The variable information needed that should be provided from outside
the enclave is the address of the TCS, address of the State Save Area
(SSA), and the entry point that the thread should use to enter the
enclave. With this information provided all needed fields of a TCS
can be initialized.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lkml.kernel.org/r/bad6052056188bde753a54313da1ac8f1e29088a.1652137848.git.reinette.chatre@intel.com
The test enclave (test_encl.elf) is built with two initialized
Thread Control Structures (TCS) included in the binary. Both TCS are
initialized with the same entry point, encl_entry, that correctly
computes the absolute address of the stack based on the stack of each
TCS that is also built into the binary.
A new TCS can be added dynamically to the enclave and requires to be
initialized with an entry point used to enter the enclave. Since the
existing entry point, encl_entry, assumes that the TCS and its stack
exists at particular offsets within the binary it is not able to handle
a dynamically added TCS and its stack.
Introduce a new entry point, encl_dyn_entry, that initializes the
absolute address of that thread's stack to the address immediately
preceding the TCS itself. It is now possible to dynamically add a
contiguous memory region to the enclave with the new stack preceding
the new TCS. With the new TCS initialized with encl_dyn_entry as entry
point the absolute address of the stack is computed correctly on entry.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lkml.kernel.org/r/93e9c420dedf5f773ba6965c18245bc7d62aca83.1652137848.git.reinette.chatre@intel.com
Enclave pages can be added to an initialized enclave when an address
belonging to the enclave but without a backing page is accessed from
within the enclave.
Accessing memory without a backing enclave page from within an enclave
can be in different ways:
1) Pre-emptively run ENCLU[EACCEPT]. Since the addition of a page
always needs to be accepted by the enclave via ENCLU[EACCEPT] this
flow is efficient since the first execution of ENCLU[EACCEPT]
triggers the addition of the page and when execution returns to the
same instruction the second execution would be successful as an
acceptance of the page.
2) A direct read or write. The flow where a direct read or write
triggers the page addition execution cannot resume from the
instruction (read/write) that triggered the fault but instead
the enclave needs to be entered at a different entry point to
run needed ENCLU[EACCEPT] before execution can return to the
original entry point and the read/write instruction that faulted.
Add tests for both flows.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lkml.kernel.org/r/0c321e0e32790ac1de742ce5017a331e6d902ac1.1652137848.git.reinette.chatre@intel.com
EPCM permission changes could be made from within (to relax
permissions) or out (to restrict permissions) the enclave. Kernel
support is needed when permissions are restricted to be able to
call the privileged ENCLS[EMODPR] instruction. EPCM permissions
can be relaxed via ENCLU[EMODPE] from within the enclave but the
enclave still depends on the kernel to install PTEs with the needed
permissions.
Add a test that exercises a few of the enclave page permission flows:
1) Test starts with a RW (from enclave and kernel perspective)
enclave page that is mapped via a RW VMA.
2) Use the SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS ioctl() to restrict
the enclave (EPCM) page permissions to read-only.
3) Run ENCLU[EACCEPT] from within the enclave to accept the new page
permissions.
4) Attempt to write to the enclave page from within the enclave - this
should fail with a page fault on the EPCM permissions since the page
table entry continues to allow RW access.
5) Restore EPCM permissions to RW by running ENCLU[EMODPE] from within
the enclave.
6) Attempt to write to the enclave page from within the enclave - this
should succeed since both EPCM and PTE permissions allow this access.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lkml.kernel.org/r/2617bf2b2d1e27ca1d0096e1192ae5896baf3f80.1652137848.git.reinette.chatre@intel.com
This benchmark measures grace period latency and kthread cpu usage of
RCU Tasks Trace when many processes are creating/deleting BPF
local_storage. Intent here is to quantify improvement on these metrics
after Paul's recent RCU Tasks patches [0].
Specifically, fork 15k tasks which call a bpf prog that creates/destroys
task local_storage and sleep in a loop, resulting in many
call_rcu_tasks_trace calls.
To determine grace period latency, trace time elapsed between
rcu_tasks_trace_pregp_step and rcu_tasks_trace_postgp; for cpu usage
look at rcu_task_trace_kthread's stime in /proc/PID/stat.
On my virtualized test environment (Skylake, 8 cpus) benchmark results
demonstrate significant improvement:
BEFORE Paul's patches:
SUMMARY tasks_trace grace period latency avg 22298.551 us stddev 1302.165 us
SUMMARY ticks per tasks_trace grace period avg 2.291 stddev 0.324
AFTER Paul's patches:
SUMMARY tasks_trace grace period latency avg 16969.197 us stddev 2525.053 us
SUMMARY ticks per tasks_trace grace period avg 1.146 stddev 0.178
Note that since these patches are not in bpf-next benchmarking was done
by cherry-picking this patch onto rcu tree.
[0] https://lore.kernel.org/rcu/20220620225402.GA3842369@paulmck-ThinkPad-P17-Gen-1/
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220705190018.3239050-1-davemarchevsky@fb.com
This makes for faster tests, faster compile time, and allows us to ditch
ACPI finally.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
These selftests are used for much more extensive changes than just the
wireguard source files. So always call the kernel's build file, which
will do something or nothing after checking the whole tree, per usual.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Not all platforms have an RTC, and rather than trying to force one into
each, it's much easier to just set a fixed time. This is necessary
because WireGuard's latest handshakes parameter is returned in wallclock
time, and if the system time isn't set, and the system is really fast,
then this returns 0, which trips the test.
Turning this on requires setting CONFIG_COMPAT_32BIT_TIME=y, as musl
doesn't support settimeofday without it.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When compiling with -O2, GCC detects few problems with selftests/bpf, so
fix all of them. Two are real issues (uninitialized err and nums
out-of-bounds access), but two other uninitialized variables warnings
are due to GCC not being able to prove that variables are indeed
initialized under conditions under which they are used.
Fix all 4 cases, though.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20220705224818.4026623-3-andrii@kernel.org
When compiling selftests/bpf in optimized mode (-O2), GCC erroneously
complains about uninitialized token variable:
In file included from network_helpers.c:22:
network_helpers.c: In function ‘open_netns’:
test_progs.h:355:22: error: ‘token’ may be used uninitialized [-Werror=maybe-uninitialized]
355 | int ___err = libbpf_get_error(___res); \
| ^~~~~~~~~~~~~~~~~~~~~~~~
network_helpers.c:440:14: note: in expansion of macro ‘ASSERT_OK_PTR’
440 | if (!ASSERT_OK_PTR(token, "malloc token"))
| ^~~~~~~~~~~~~
In file included from /data/users/andriin/linux/tools/testing/selftests/bpf/tools/include/bpf/libbpf.h:21,
from bpf_util.h:9,
from network_helpers.c:20:
/data/users/andriin/linux/tools/testing/selftests/bpf/tools/include/bpf/libbpf_legacy.h:113:17: note: by argument 1 of type ‘const void *’ to ‘libbpf_get_error’ declared here
113 | LIBBPF_API long libbpf_get_error(const void *ptr);
| ^~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
make: *** [Makefile:522: /data/users/andriin/linux/tools/testing/selftests/bpf/network_helpers.o] Error 1
This is completely bogus becuase libbpf_get_error() doesn't dereference
pointer, but the only easy way to silence this is to allocate initialized
memory with calloc().
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20220705224818.4026623-2-andrii@kernel.org
This change updates the testing sample (pm_nl_ctl) to exercise
the updated MPTCP_PM_CMD_SET_FLAGS command for userspace PMs to
issue MP_PRIO signals over the selected subflow.
E.g. ./pm_nl_ctl set 10.0.1.2 port 47234 flags backup token 823274047 rip 10.0.1.1 rport 50003
userspace_pm.sh has a new selftest that invokes this command.
Fixes: 259a834fad ("selftests: mptcp: functional tests for the userspace PM type")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change extends the existing core_reloc/kernel test to include a
type match check of a local task_struct against the kernel's definition
-- which we assume to succeed.
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220628160127.607834-11-deso@posteo.net
This change extends the type based tests with another struct type (in
addition to a_struct) to check relocations against: a_complex_struct.
This type is nested more deeply to provide additional coverage of
certain paths in the type match logic.
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220628160127.607834-10-deso@posteo.net
This change adds another type-based self-test that specifically aims to
test some more characteristics of the TYPE_MATCH logic. Specifically, it
covers a few more potential differences between types, such as different
orders, enum variant values, and integer signedness.
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220628160127.607834-9-deso@posteo.net
Now that we have type-match logic in both libbpf and the kernel, this
change adjusts the existing BPF self tests to check this functionality.
Specifically, we extend the existing type-based tests to check the
previously introduced bpf_core_type_matches macro.
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220628160127.607834-8-deso@posteo.net
When packets are not received, they aren't received on $host1_if, so the
message talking about the second host not receiving them is incorrect.
Fix it.
Fixes: d4deb01467 ("selftests: forwarding: Add a test for FDB learning")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The first host interface has by default no interest in receiving packets
MAC DA de:ad:be:ef:13:37, so it might drop them before they hit the tc
filter and this might confuse the selftest.
Enable promiscuous mode such that the filter properly counts received
packets.
Fixes: d4deb01467 ("selftests: forwarding: Add a test for FDB learning")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
As mentioned in the blamed commit, flood_unicast_test() works by
checking the match count on a tc filter placed on the receiving
interface.
But the second host interface (host2_if) has no interest in receiving a
packet with MAC DA de:ad:be:ef:13:37, so its RX filter drops it even
before the ingress tc filter gets to be executed. So we will incorrectly
get the message "Packet was not flooded when should", when in fact, the
packet was flooded as expected but dropped due to an unrelated reason,
at some other layer on the receiving side.
Force h2 to accept this packet by temporarily placing it in promiscuous
mode. Alternatively we could either deliver to its MAC address or use
tcpdump_start, but this has the fewest complications.
This fixes the "flooding" test from bridge_vlan_aware.sh and
bridge_vlan_unaware.sh, which calls flood_test from the lib.
Fixes: 236dd50bf6 ("selftests: forwarding: Add a test for flooded traffic")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This patch add a test that checks connect()ivity between two sockets:
unnamed socket -> bound socket
* SOCK_STREAM or SOCK_DGRAM
* pathname or abstract
* same or different netns
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tests that permanent mdb entries can be added/deleted on ports with state down.
Signed-off-by: Casper Andersson <casper.casan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The test va_128TBswitch.c expects to be able to pass mmap an address hint
and length that cross the address 1<<47. On x86_64, this is not possible
without 5-level page tables, so the test fails.
The test is already only run on 64-bit powerpc and x86_64 archs, but this
patch adds an additional check on x86_64 that skips the test if
PG_TABLE_LEVELS < 5. There is precedent for checking /proc/config.gz in
selftests, e.g. in selftests/firmware.
Running the tests produces the desired output:
sudo make -C tools/testing/selftests TARGETS=vm run_tests
---------------------------
running ./va_128TBswitch.sh
---------------------------
./va_128TBswitch.sh: PG_TABLE_LEVELS=4, must be >= 5 to run this test
[SKIP]
-------------------------------
[adam@wowsignal.io: restrict the check to x86_64]
Link: https://lkml.kernel.org/r/20220628163654.337600-1-adam@wowsignal.io
[adam@wowsignal.io: fix formatting issues, rename "die" to "fail"]
Link: https://lkml.kernel.org/r/20220701163030.415735-1-adam@wowsignal.io
Link: https://lkml.kernel.org/r/20220627163912.5581-1-adam@wowsignal.io
Signed-off-by: Adam Sindelar <adam@wowsignal.io>
Cc: Adam Sindelar <ats@fb.com>
Cc: David Vernet <void@manifault.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
On Android this test is getting stuck in an infinite loop due to
indeterminate behavior:
The local variables steps and signalled were being reset to 1 and 0
respectively after every jump back to sigsetjmp by siglongjmp in the
signal handler. The test was incrementing them and expecting them to
retain their incremented values. The documentation for siglongjmp says:
All accessible objects have values as of the time sigsetjmp() was called,
except that the values of objects of automatic storage duration which are
local to the function containing the invocation of the corresponding
sigsetjmp() which do not have volatile-qualified type and which are
changed between the sigsetjmp() invocation and siglongjmp() call are
indeterminate.
Tagging steps and signalled with volatile enabled the test to pass.
Link: https://lkml.kernel.org/r/20220613233321.431282-1-edliaw@google.com
Signed-off-by: Edward Liaw <edliaw@google.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Global variables do not need to be initialized to 0 and checkpatch
flags this error in tools/testing/selftests/timers/alarmtimer-suspend.c:
ERROR: do not initialise globals to 0
+int final_ret = 0;
Fix this checkpatch error.
Signed-off-by: Zan Aziz <zanaziz313@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Since commit 8fffa0e345 ("selftests/bpf: Normalize XDP section names in
selftests") the xdp_dummy.o's section name has changed to xdp. But some
tests are still using "section xdp_dummy", which make the tests failed.
Fix them by updating to the new section name.
Fixes: 8fffa0e345 ("selftests/bpf: Normalize XDP section names in selftests")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220630062228.3453016-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Borkmann says:
====================
pull-request: bpf 2022-07-02
We've added 7 non-merge commits during the last 14 day(s) which contain
a total of 6 files changed, 193 insertions(+), 86 deletions(-).
The main changes are:
1) Fix clearing of page contiguity when unmapping XSK pool, from Ivan Malov.
2) Two verifier fixes around bounds data propagation, from Daniel Borkmann.
3) Fix fprobe sample module's parameter descriptions, from Masami Hiramatsu.
4) General BPF maintainer entry revamp to better scale patch reviews.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf, selftests: Add verifier test case for jmp32's jeq/jne
bpf, selftests: Add verifier test case for imm=0,umin=0,umax=1 scalar
bpf: Fix insufficient bounds propagation from adjust_scalar_min_max_vals
bpf: Fix incorrect verifier simulation around jmp32's jeq/jne
xsk: Clear page contiguity bit when unmapping pool
bpf, docs: Better scale maintenance of BPF subsystem
fprobe, samples: Add module parameter descriptions
====================
Link: https://lore.kernel.org/r/20220701230121.10354-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a test case to trigger the verifier's incorrect conclusion in the
case of jmp32's jeq/jne. Also here, make use of dead code elimination,
so that we can see the verifier bailing out on unfixed kernels.
Before:
# ./test_verifier 724
#724/p jeq32/jne32: bounds checking FAIL
Failed to load prog 'Permission denied'!
R4 !read_ok
verification time 8 usec
stack depth 0
processed 8 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 0
Summary: 0 PASSED, 0 SKIPPED, 1 FAILED
After:
# ./test_verifier 724
#724/p jeq32/jne32: bounds checking OK
Summary: 1 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220701124727.11153-4-daniel@iogearbox.net
Add a test case to trigger the constant scalar issue which leaves the
register in scalar(imm=0,umin=0,umax=1,var_off=(0x0; 0x0)) state. Make
use of dead code elimination, so that we can see the verifier bailing
out on unfixed kernels. For the condition, we use jle given it checks
on umax bound.
Before:
# ./test_verifier 743
#743/p jump & dead code elimination FAIL
Failed to load prog 'Permission denied'!
R4 !read_ok
verification time 11 usec
stack depth 0
processed 13 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
Summary: 0 PASSED, 0 SKIPPED, 1 FAILED
After:
# ./test_verifier 743
#743/p jump & dead code elimination OK
Summary: 1 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220701124727.11153-3-daniel@iogearbox.net