IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
When the width of a fill is smaller than the width of the preceding
spill, the information about scalar boundaries can still be preserved,
as long as it's coerced to the right width (done by coerce_reg_to_size).
Even further, if the actual value fits into the fill width, the ID can
be preserved as well for further tracking of equal scalars.
Implement the above improvements, which makes narrowing fills behave the
same as narrowing spills and MOVs between registers.
Two tests are adjusted to accommodate for endianness differences and to
take into account that it's now allowed to do a narrowing fill from the
least significant bits.
reg_bounds_sync is added to coerce_reg_to_size to correctly adjust
umin/umax boundaries after the var_off truncation, for example, a 64-bit
value 0xXXXXXXXX00000000, when read as a 32-bit, gets umin = 0, umax =
0xFFFFFFFF, var_off = (0x0; 0xffffffff00000000), which needs to be
synced down to umax = 0, otherwise reg_bounds_sanity_check doesn't pass.
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240127175237.526726-4-maxtram95@gmail.com
The previous commit added tracking for unbounded scalars on spill. Add
the test case to check the new functionality.
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240127175237.526726-3-maxtram95@gmail.com
Support the pattern where an unbounded scalar is spilled to the stack,
then boundary checks are performed on the src register, after which the
stack frame slot is refilled into a register.
Before this commit, the verifier didn't treat the src register and the
stack slot as related if the src register was an unbounded scalar. The
register state wasn't copied, the id wasn't preserved, and the stack
slot was marked as STACK_MISC. Subsequent boundary checks on the src
register wouldn't result in updating the boundaries of the spilled
variable on the stack.
After this commit, the verifier will preserve the bond between src and
dst even if src is unbounded, which permits to do boundary checks on src
and refill dst later, still remembering its boundaries. Such a pattern
is sometimes generated by clang when compiling complex long functions.
One test is adjusted to reflect that now unbounded scalars are tracked.
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240127175237.526726-2-maxtram95@gmail.com
Some benchmarks don't have either "consumer" or "producer" sides. For
example, trig-tp and other BPF triggering benchmarks don't have
consumers, as they only do "producing" by calling into syscall or
predefined uproes. As such it's valid for some benchmarks to have zero
consumers or producers. So allows to specify `-c0` explicitly.
This triggers another problem. If benchmark doesn't support either
consumer or producer side, consumer_thread/producer_thread callback will
be NULL, but benchmark runner will attempt to use those NULL callback to
create threads anyways. So instead of crashing with SIGSEGV in case of
misconfigured benchmark, detect the condition and report error.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240201172027.604869-6-andrii@kernel.org
Enable inline bpf_kptr_xchg() test for RV64, and the test have passed as
show below:
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20240130124659.670321-3-pulehui@huaweicloud.com
This commit marks kfuncs as such inside the .BTF_ids section. The upshot
of these annotations is that we'll be able to automatically generate
kfunc prototypes for downstream users. The process is as follows:
1. In source, use BTF_KFUNCS_START/END macro pair to mark kfuncs
2. During build, pahole injects into BTF a "bpf_kfunc" BTF_DECL_TAG for
each function inside BTF_KFUNCS sets
3. At runtime, vmlinux or module BTF is made available in sysfs
4. At runtime, bpftool (or similar) can look at provided BTF and
generate appropriate prototypes for functions with "bpf_kfunc" tag
To ensure future kfunc are similarly tagged, we now also return error
inside kfunc registration for untagged kfuncs. For vmlinux kfuncs,
we also WARN(), as initcall machinery does not handle errors.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Acked-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/e55150ceecbf0a5d961e608941165c0bee7bc943.1706491398.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
After a recent change in the vmtest runner, this test started failing
sporadically.
Investigation showed that this test was subject to race condition which
got exacerbated after the vm runner change. The symptoms being that the
logic that waited for an ICMPv4 packet is naive and will break if 5 or
more non-ICMPv4 packets make it to tap0.
When ICMPv6 is enabled, the kernel will generate traffic such as ICMPv6
router solicitation...
On a system with good performance, the expected ICMPv4 packet would very
likely make it to the network interface promptly, but on a system with
poor performance, those "guarantees" do not hold true anymore.
Given that the test is IPv4 only, this change disable IPv6 in the test
netns by setting `net.ipv6.conf.all.disable_ipv6` to 1.
This essentially leaves "ping" as the sole generator of traffic in the
network namespace.
If this test was to be made IPv6 compatible, the logic in
`wait_for_packet` would need to be modified.
In more details...
At a high level, the test does:
- create a new namespace
- in `setup_redirect_target` set up lo, tap0, and link_err interfaces as
well as add 2 routes that attaches ingress/egress sections of
`test_lwt_redirect.bpf.o` to the xmit path.
- in `send_and_capture_test_packets` send an ICMP packet and read off
the tap interface (using `wait_for_packet`) to check that a ICMP packet
with the right size is read.
`wait_for_packet` will try to read `max_retry` (5) times from the tap0
fd looking for an ICMPv4 packet matching some criteria.
The problem is that when we set up the `tap0` interface, because IPv6 is
enabled by default, traffic such as Router solicitation is sent through
tap0, as in:
# tcpdump -r /tmp/lwt_redirect.pc
reading from file /tmp/lwt_redirect.pcap, link-type EN10MB (Ethernet)
04:46:23.578352 IP6 :: > ff02::1:ffc0:4427: ICMP6, neighbor solicitation, who has fe80::fcba:dff:fec0:4427, length 32
04:46:23.659522 IP6 :: > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
04:46:24.389169 IP 10.0.0.1 > 20.0.0.9: ICMP echo request, id 122, seq 1, length 108
04:46:24.618599 IP6 fe80::fcba:dff:fec0:4427 > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
04:46:24.619985 IP6 fe80::fcba:dff:fec0:4427 > ff02::2: ICMP6, router solicitation, length 16
04:46:24.767326 IP6 fe80::fcba:dff:fec0:4427 > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
04:46:28.936402 IP6 fe80::fcba:dff:fec0:4427 > ff02::2: ICMP6, router solicitation, length 16
If `wait_for_packet` sees 5 non-ICMPv4 packets, it will return 0, which is what we see in:
2024-01-31T03:51:25.0336992Z test_lwt_redirect_run:PASS:netns_create 0 nsec
2024-01-31T03:51:25.0341309Z open_netns:PASS:malloc token 0 nsec
2024-01-31T03:51:25.0344844Z open_netns:PASS:open /proc/self/ns/net 0 nsec
2024-01-31T03:51:25.0350071Z open_netns:PASS:open netns fd 0 nsec
2024-01-31T03:51:25.0353516Z open_netns:PASS:setns 0 nsec
2024-01-31T03:51:25.0356560Z test_lwt_redirect_run:PASS:setns 0 nsec
2024-01-31T03:51:25.0360140Z open_tuntap:PASS:open(/dev/net/tun) 0 nsec
2024-01-31T03:51:25.0363822Z open_tuntap:PASS:ioctl(TUNSETIFF) 0 nsec
2024-01-31T03:51:25.0367402Z open_tuntap:PASS:fcntl(O_NONBLOCK) 0 nsec
2024-01-31T03:51:25.0371167Z setup_redirect_target:PASS:open_tuntap 0 nsec
2024-01-31T03:51:25.0375180Z setup_redirect_target:PASS:if_nametoindex 0 nsec
2024-01-31T03:51:25.0379929Z setup_redirect_target:PASS:ip link add link_err type dummy 0 nsec
2024-01-31T03:51:25.0384874Z setup_redirect_target:PASS:ip link set lo up 0 nsec
2024-01-31T03:51:25.0389678Z setup_redirect_target:PASS:ip addr add dev lo 10.0.0.1/32 0 nsec
2024-01-31T03:51:25.0394814Z setup_redirect_target:PASS:ip link set link_err up 0 nsec
2024-01-31T03:51:25.0399874Z setup_redirect_target:PASS:ip link set tap0 up 0 nsec
2024-01-31T03:51:25.0407731Z setup_redirect_target:PASS:ip route add 10.0.0.0/24 dev link_err encap bpf xmit obj test_lwt_redirect.bpf.o sec redir_ingress 0 nsec
2024-01-31T03:51:25.0419105Z setup_redirect_target:PASS:ip route add 20.0.0.0/24 dev link_err encap bpf xmit obj test_lwt_redirect.bpf.o sec redir_egress 0 nsec
2024-01-31T03:51:25.0427209Z test_lwt_redirect_normal:PASS:setup_redirect_target 0 nsec
2024-01-31T03:51:25.0431424Z ping_dev:PASS:if_nametoindex 0 nsec
2024-01-31T03:51:25.0437222Z send_and_capture_test_packets:FAIL:wait_for_epacket unexpected wait_for_epacket: actual 0 != expected 1
2024-01-31T03:51:25.0448298Z (/tmp/work/bpf/bpf/tools/testing/selftests/bpf/prog_tests/lwt_redirect.c:175: errno: Success) test_lwt_redirect_normal egress test fails
2024-01-31T03:51:25.0457124Z close_netns:PASS:setns 0 nsec
When running in a VM which potential resource contrains, the odds that calling
`ping` is not scheduled very soon after bringing `tap0` up increases,
and with this the chances to get our ICMP packet pushed to position 6+
in the network trace.
To confirm this indeed solves the issue, I ran the test 100 times in a
row with:
errors=0
successes=0
for i in `seq 1 100`
do
./test_progs -t lwt_redirect/lwt_redirect_normal
if [ $? -eq 0 ]; then
successes=$((successes+1))
else
errors=$((errors+1))
fi
done
echo "successes: $successes/errors: $errors"
While this test would at least fail a couple of time every 10 runs, here
it ran 100 times with no error.
Fixes: 43a7c3ef8a15 ("selftests/bpf: Add lwt_xmit tests for BPF_REDIRECT")
Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240131053212.2247527-1-chantr4@gmail.com
Use more ergonomic bpf_core_cast() macro instead of bpf_rdonly_cast() in
selftests code.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240130212023.183765-3-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Add bpf_core_cast() macro that wraps bpf_rdonly_cast() kfunc. It's more
ergonomic than kfunc, as it automatically extracts btf_id with
bpf_core_type_id_kernel(), and works with type names. It also casts result
to (T *) pointer. See the definition of the macro, it's self-explanatory.
libbpf declares bpf_rdonly_cast() extern as __weak __ksym and should be
safe to not conflict with other possible declarations in user code.
But we do have a conflict with current BPF selftests that declare their
externs with first argument as `void *obj`, while libbpf opts into more
permissive `const void *obj`. This causes conflict, so we fix up BPF
selftests uses in the same patch.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240130212023.183765-2-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Add a bunch of test cases validating behavior of __arg_trusted and its
combination with __arg_nullable tag. We also validate CO-RE flavor
support by kernel for __arg_trusted args.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240130000648.2144827-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Clang supports enabling/disabling certain conversion diagnostics via
the -W[no-]compare-distinct-pointer-types command line options.
Disabling this warning is required by some BPF selftests due to
-Werror. Until very recently GCC would emit these warnings
unconditionally, which was a problem for gcc-bpf, but we added support
for the command-line options to GCC upstream [1].
This patch moves the -Wno-cmopare-distinct-pointer-types from
CLANG_CFLAGS to BPF_CFLAGS in selftests/bpf/Makefile so the option
is also used in gcc-bpf builds, not just in clang builds.
Tested in bpf-next master.
No regressions.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627769.html
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240130113624.24940-1-jose.marchesi@oracle.com
A few BPF selftests perform type punning and they may break strict
aliasing rules, which are exploited by both GCC and clang by default
while optimizing. This can lead to broken compiled programs.
This patch disables strict aliasing for these particular tests, by
mean of the -fno-strict-aliasing command line option. This will make
sure these tests are optimized properly even if some strict aliasing
rule gets violated.
After this patch, GCC is able to build all the selftests without
warning about potential strict aliasing issue.
bpf@vger discussion on strict aliasing and BPF selftests:
https://lore.kernel.org/bpf/bae1205a-b6e5-4e46-8e20-520d7c327f7a@linux.dev/T/#t
Tested in bpf-next master.
No regressions.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/bae1205a-b6e5-4e46-8e20-520d7c327f7a@linux.dev
Link: https://lore.kernel.org/bpf/20240130110343.11217-1-jose.marchesi@oracle.com
Certain BPF selftests contain code that, albeit being legal C, trigger
warnings in GCC that cannot be disabled. This is the case for example
for the tests
progs/btf_dump_test_case_bitfields.c
progs/btf_dump_test_case_namespacing.c
progs/btf_dump_test_case_packing.c
progs/btf_dump_test_case_padding.c
progs/btf_dump_test_case_syntax.c
which contain struct type declarations inside function parameter
lists. This is problematic, because:
- The BPF selftests are built with -Werror.
- The Clang and GCC compilers sometimes differ when it comes to handle
warnings. in the handling of warnings. One compiler may emit
warnings for code that the other compiles compiles silently, and one
compiler may offer the possibility to disable certain warnings, while
the other doesn't.
In order to overcome this problem, this patch modifies the
tools/testing/selftests/bpf/Makefile in order to:
1. Enable the possibility of specifing per-source-file extra CFLAGS.
This is done by defining a make variable like:
<source-filename>-CFLAGS := <whateverflags>
And then modifying the proper Make rule in order to use these flags
when compiling <source-filename>.
2. Use the mechanism above to add -Wno-error to CFLAGS for the
following selftests:
progs/btf_dump_test_case_bitfields.c
progs/btf_dump_test_case_namespacing.c
progs/btf_dump_test_case_packing.c
progs/btf_dump_test_case_padding.c
progs/btf_dump_test_case_syntax.c
Note the corresponding -CFLAGS variables for these files are
defined only if the selftests are being built with GCC.
Note that, while compiler pragmas can generally be used to disable
particular warnings per file, this 1) is only possible for warning
that actually can be disabled in the command line, i.e. that have
-Wno-FOO options, and 2) doesn't apply to -Wno-error.
Tested in bpf-next master branch.
No regressions.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240127100702.21549-1-jose.marchesi@oracle.com
In s390, CI reported that the sock_iter_batch selftest
hits this error very often:
2024-01-26T16:56:49.3091804Z Bind /proc/self/ns/net -> /run/netns/sock_iter_batch_netns failed: No such file or directory
2024-01-26T16:56:49.3149524Z Cannot remove namespace file "/run/netns/sock_iter_batch_netns": No such file or directory
2024-01-26T16:56:49.3772213Z test_sock_iter_batch:FAIL:ip netns add sock_iter_batch_netns unexpected error: 256 (errno 0)
It happens very often in s390 but Manu also noticed it happens very
sparsely in other arch also.
It turns out the default dash shell does not recognize "&>"
as a redirection operator, so the command went to the background.
In the sock_iter_batch selftest, the "ip netns delete" went
into background and then race with the following "ip netns add"
command.
This patch replaces the "&> /dev/null" usage with ">/dev/null 2>&1"
and does this redirection in the SYS_NOFAIL macro instead of doing
it individually by its caller. The SYS_NOFAIL callers do not care
about failure, so it is no harm to do this redirection even if
some of the existing callers do not redirect to /dev/null now.
It touches different test files, so I skipped the Fixes tags
in this patch. Some of the changed tests do not use "&>"
but they use the SYS_NOFAIL, so these tests are also
changed to avoid doing its own redirection because
SYS_NOFAIL does it internally now.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240127025017.950825-1-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZbQV+gAKCRDbK58LschI
g2OeAP0VvhZS9SPiS+/AMAFuw2W1BkMrFNbfBTc3nzRnyJSmNAD+NG4CLLJvsKI9
olu7VC20B8pLTGLUGIUSwqnjOC+Kkgc=
=wVMl
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2024-01-26
We've added 107 non-merge commits during the last 4 day(s) which contain
a total of 101 files changed, 6009 insertions(+), 1260 deletions(-).
The main changes are:
1) Add BPF token support to delegate a subset of BPF subsystem
functionality from privileged system-wide daemons such as systemd
through special mount options for userns-bound BPF fs to a trusted
& unprivileged application. With addressed changes from Christian
and Linus' reviews, from Andrii Nakryiko.
2) Support registration of struct_ops types from modules which helps
projects like fuse-bpf that seeks to implement a new struct_ops type,
from Kui-Feng Lee.
3) Add support for retrieval of cookies for perf/kprobe multi links,
from Jiri Olsa.
4) Bigger batch of prep-work for the BPF verifier to eventually support
preserving boundaries and tracking scalars on narrowing fills,
from Maxim Mikityanskiy.
5) Extend the tc BPF flavor to support arbitrary TCP SYN cookies to help
with the scenario of SYN floods, from Kuniyuki Iwashima.
6) Add code generation to inline the bpf_kptr_xchg() helper which
improves performance when stashing/popping the allocated BPF objects,
from Hou Tao.
7) Extend BPF verifier to track aligned ST stores as imprecise spilled
registers, from Yonghong Song.
8) Several fixes to BPF selftests around inline asm constraints and
unsupported VLA code generation, from Jose E. Marchesi.
9) Various updates to the BPF IETF instruction set draft document such
as the introduction of conformance groups for instructions,
from Dave Thaler.
10) Fix BPF verifier to make infinite loop detection in is_state_visited()
exact to catch some too lax spill/fill corner cases,
from Eduard Zingerman.
11) Refactor the BPF verifier pointer ALU check to allow ALU explicitly
instead of implicitly for various register types, from Hao Sun.
12) Fix the flaky tc_redirect_dtime BPF selftest due to slowness
in neighbor advertisement at setup time, from Martin KaFai Lau.
13) Change BPF selftests to skip callback tests for the case when the
JIT is disabled, from Tiezhu Yang.
14) Add a small extension to libbpf which allows to auto create
a map-in-map's inner map, from Andrey Grafin.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (107 commits)
selftests/bpf: Add missing line break in test_verifier
bpf, docs: Clarify definitions of various instructions
bpf: Fix error checks against bpf_get_btf_vmlinux().
bpf: One more maintainer for libbpf and BPF selftests
selftests/bpf: Incorporate LSM policy to token-based tests
selftests/bpf: Add tests for LIBBPF_BPF_TOKEN_PATH envvar
libbpf: Support BPF token path setting through LIBBPF_BPF_TOKEN_PATH envvar
selftests/bpf: Add tests for BPF object load with implicit token
selftests/bpf: Add BPF object loading tests with explicit token passing
libbpf: Wire up BPF token support at BPF object level
libbpf: Wire up token_fd into feature probing logic
libbpf: Move feature detection code into its own file
libbpf: Further decouple feature checking logic from bpf_object
libbpf: Split feature detectors definitions from cached results
selftests/bpf: Utilize string values for delegate_xxx mount options
bpf: Support symbolic BPF FS delegation mount options
bpf: Fail BPF_TOKEN_CREATE if no delegation option was set on BPF FS
bpf,selinux: Allocate bpf_security_struct per BPF token
selftests/bpf: Add BPF token-enabled tests
libbpf: Add BPF token support to bpf_prog_load() API
...
====================
Link: https://lore.kernel.org/r/20240126215710.19855-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There are no break lines in the test log for test_verifier #106 ~ #111
if jit is disabled, add the missing line break at the end of printf()
to fix it.
Without this patch:
[root@linux bpf]# echo 0 > /proc/sys/net/core/bpf_jit_enable
[root@linux bpf]# ./test_verifier 106
#106/p inline simple bpf_loop call SKIP (requires BPF JIT)Summary: 0 PASSED, 1 SKIPPED, 0 FAILED
With this patch:
[root@linux bpf]# echo 0 > /proc/sys/net/core/bpf_jit_enable
[root@linux bpf]# ./test_verifier 106
#106/p inline simple bpf_loop call SKIP (requires BPF JIT)
Summary: 0 PASSED, 1 SKIPPED, 0 FAILED
Fixes: 0b50478fd877 ("selftests/bpf: Skip callback tests if jit is disabled in test_verifier")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240126015736.655-1-yangtiezhu@loongson.cn
ping_group_range sysctl has a compound value which does not go
through the various function layers in tact. Create a helper
function to bypass the layers and correctly set the value.
Signed-off-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240124214117.24687-3-dsahern@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Allow fcnal-test.sh to be run from top level directory in the
kernel repo as well as from tools/testing/selftests/net by
setting the PATH to find the in-tree nettest.
Signed-off-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240124214117.24687-2-dsahern@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
SOCK_SEQPACKET is supported for virtio transport, so do not interpret
such type of socket as unknown.
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20240124193255.3417803-1-avkrasnov@salutedevices.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As of today tests throwing exceptions in setup/teardown phase are
treated as skipped but they should really be failures.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/20240124181933.75724-6-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For the longest time tdc ran only actions and qdiscs tests.
It's time to enable all the remaining tests so every user visible
piece of TC is tested by the downstream CIs.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/20240124181933.75724-5-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If 'jq' is not available the taprio tests might enter an infinite loop,
use the "dependsOn" feature from tdc to check if jq is present. If it's
not the test is skipped.
Suggested-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/20240124181933.75724-3-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
On a default config + tc-testing config build, tdc will miss
all the netfilter related tests because it's missing:
CONFIG_NETFILTER=y
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/20240124181933.75724-2-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Current release - regressions:
- bpf: fix a kernel crash for the riscv 64 JIT
- bnxt_en: fix memory leak in bnxt_hwrm_get_rings()
- revert "net: macsec: use skb_ensure_writable_head_tail to expand the skb"
Previous releases - regressions:
- core: fix removing a namespace with conflicting altnames
- tc/flower: fix chain template offload memory leak
- tcp:
- make sure init the accept_queue's spinlocks once
- fix autocork on CPUs with weak memory model
- udp: fix busy polling
- mlx5e:
- fix out-of-bound read in port timestamping
- fix peer flow lists corruption
- iwlwifi: fix a memory corruption
Previous releases - always broken:
- netfilter:
- nft_chain_filter: handle NETDEV_UNREGISTER for inet/ingress basechain
- nft_limit: reject configurations that cause integer overflow
- bpf: fix bpf_xdp_adjust_tail() with XSK zero-copy mbuf, avoiding
a NULL pointer dereference upon shrinking
- llc: make llc_ui_sendmsg() more robust against bonding changes
- smc: fix illegal rmb_desc access in SMC-D connection dump
- dpll: fix pin dump crash for rebound module
- bnxt_en: fix possible crash after creating sw mqprio TCs
- hv_netvsc: calculate correct ring size when PAGE_SIZE is not 4 Kbytes
Misc:
- several self-tests fixes for better integration with the netdev CI
- added several missing modules descriptions
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmWyUSISHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkiuIP/0IChNqw3KtJJQOb4eIu12qRTblMmucU
Yf+Q4ZBHI7Epz3HFWrmqoj7N7CaWAj+U8b9JEHbcZxP7cwo4mc7ScXmm78IOvpEl
ypWdjWVW89UVz6GI/Yz/MNC2H7By51NUkiDpbbAA4pZVK6N1+rO8oAU9Fy2IiFTb
ixt61S1zsVYdmUjHOD/PiU9b8i5cRBukaXF8jnznRj8nAT/cU9XUV/YyFixj33vQ
Rbs3HoKxD9Mk9KdJ7jgEMi7Vazb40w5TmfMLWyNglJQdNz4DUg+9tqQGCkEf5UGU
CpRKu4RVr2uzAn9N5Hav4O0He2kDUVH1MoPqgS6MnJAERzCDKoDFxo8ljTmHBk6b
ISmstRzRon/AXcp+94pwU5RT78B7HekXYlZPcj5tGVKiM7HMdgLiodOcZcsG5fdW
8okeYhpCL5ew/fxGOZnbNS/BiODZBaa+/e6ns8NasmWZHgGJa9uHiO865g5I53/H
jWnm53Bi1Zmkgu/+NwXEx1I1vWSa9GCBA8ia5oSuEQDWhHhm3EcUcr44kjrD/R+S
6elNScjrJ2kMF+fvOb1BEITUf77fk7/ATJarCk8oybJCAt7do+DZ1E47NN0M9Km8
gARKvThije9rmc4OZ7RYR7R9iQgvwpb7fgGJZw+SI3XFK/WxmcDJHV4dXLiA6+Mu
vvt+x7EmWiWf
=ER+y
-----END PGP SIGNATURE-----
Merge tag 'net-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf, netfilter and WiFi.
Jakub is doing a lot of work to include the self-tests in our CI, as a
result a significant amount of self-tests related fixes is flowing in
(and will likely continue in the next few weeks).
Current release - regressions:
- bpf: fix a kernel crash for the riscv 64 JIT
- bnxt_en: fix memory leak in bnxt_hwrm_get_rings()
- revert "net: macsec: use skb_ensure_writable_head_tail to expand
the skb"
Previous releases - regressions:
- core: fix removing a namespace with conflicting altnames
- tc/flower: fix chain template offload memory leak
- tcp:
- make sure init the accept_queue's spinlocks once
- fix autocork on CPUs with weak memory model
- udp: fix busy polling
- mlx5e:
- fix out-of-bound read in port timestamping
- fix peer flow lists corruption
- iwlwifi: fix a memory corruption
Previous releases - always broken:
- netfilter:
- nft_chain_filter: handle NETDEV_UNREGISTER for inet/ingress
basechain
- nft_limit: reject configurations that cause integer overflow
- bpf: fix bpf_xdp_adjust_tail() with XSK zero-copy mbuf, avoiding a
NULL pointer dereference upon shrinking
- llc: make llc_ui_sendmsg() more robust against bonding changes
- smc: fix illegal rmb_desc access in SMC-D connection dump
- dpll: fix pin dump crash for rebound module
- bnxt_en: fix possible crash after creating sw mqprio TCs
- hv_netvsc: calculate correct ring size when PAGE_SIZE is not 4kB
Misc:
- several self-tests fixes for better integration with the netdev CI
- added several missing modules descriptions"
* tag 'net-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits)
tsnep: Fix XDP_RING_NEED_WAKEUP for empty fill ring
tsnep: Remove FCS for XDP data path
net: fec: fix the unhandled context fault from smmu
selftests: bonding: do not test arp/ns target with mode balance-alb/tlb
fjes: fix memleaks in fjes_hw_setup
i40e: update xdp_rxq_info::frag_size for ZC enabled Rx queue
i40e: set xdp_rxq_info::frag_size
xdp: reflect tail increase for MEM_TYPE_XSK_BUFF_POOL
ice: update xdp_rxq_info::frag_size for ZC enabled Rx queue
intel: xsk: initialize skb_frag_t::bv_offset in ZC drivers
ice: remove redundant xdp_rxq_info registration
i40e: handle multi-buffer packets that are shrunk by xdp prog
ice: work on pre-XDP prog frag count
xsk: fix usage of multi-buffer BPF helpers for ZC XDP
xsk: make xsk_buff_pool responsible for clearing xdp_buff::flags
xsk: recycle buffer in case Rx queue was full
net: fill in MODULE_DESCRIPTION()s for rvu_mbox
net: fill in MODULE_DESCRIPTION()s for litex
net: fill in MODULE_DESCRIPTION()s for fsl_pq_mdio
net: fill in MODULE_DESCRIPTION()s for fec
...
The prio_arp/ns tests hard code the mode to active-backup. At the same
time, The balance-alb/tlb modes do not support arp/ns target. So remove
the prio_arp/ns tests from the loop and only test active-backup mode.
Fixes: 481b56e0391e ("selftests: bonding: re-format bond option tests")
Reported-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Closes: https://lore.kernel.org/netdev/17415.1705965957@famine/
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Link: https://lore.kernel.org/r/20240123075917.1576360-1-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Implement port for given CID as input argument instead of using
hardcoded value '1234'. This allows to run different test instances
on a single CID. Port argument is not required parameter and if it is
not set, then default value will be '1234' - thus we preserve previous
behaviour.
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20240123072750.4084181-1-avkrasnov@salutedevices.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add tests for LSM interactions (both bpf_token_capable and bpf_token_cmd
LSM hooks) with BPF token in bpf() subsystem. Now child process passes
back token FD for parent to be able to do tests with token originating
in "wrong" userns. But we also create token in initns and check that
token LSMs don't accidentally reject BPF operations when capable()
checks pass without BPF token.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-31-andrii@kernel.org
Add a test to validate libbpf's implicit BPF token creation from default
BPF FS location (/sys/fs/bpf). Also validate that disabling this
implicit BPF token creation works.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-28-andrii@kernel.org
Add a few tests that attempt to load BPF object containing privileged
map, program, and the one requiring mandatory BTF uploading into the
kernel (to validate token FD propagation to BPF_BTF_LOAD command).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-27-andrii@kernel.org
Use both hex-based and string-based way to specify delegate mount
options for BPF FS.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-21-andrii@kernel.org
Add a selftest that attempts to conceptually replicate intended BPF
token use cases inside user namespaced container.
Child process is forked. It is then put into its own userns and mountns.
Child creates BPF FS context object. This ensures child userns is
captured as the owning userns for this instance of BPF FS. Given setting
delegation mount options is privileged operation, we ensure that child
cannot set them.
This context is passed back to privileged parent process through Unix
socket, where parent sets up delegation options, creates, and mounts it
as a detached mount. This mount FD is passed back to the child to be
used for BPF token creation, which allows otherwise privileged BPF
operations to succeed inside userns.
We validate that all of token-enabled privileged commands (BPF_BTF_LOAD,
BPF_MAP_CREATE, and BPF_PROG_LOAD) work as intended. They should only
succeed inside the userns if a) BPF token is provided with proper
allowed sets of commands and types; and b) namespaces CAP_BPF and other
privileges are set. Lacking a) or b) should lead to -EPERM failures.
Based on suggested workflow by Christian Brauner ([0]).
[0] https://lore.kernel.org/bpf/20230704-hochverdient-lehne-eeb9eeef785e@brauner/
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-17-andrii@kernel.org
Add basic support of BPF token to BPF_PROG_LOAD. BPF_F_TOKEN_FD flag
should be set in prog_flags field when providing prog_token_fd.
Wire through a set of allowed BPF program types and attach types,
derived from BPF FS at BPF token creation time. Then make sure we
perform bpf_token_capable() checks everywhere where it's relevant.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-7-andrii@kernel.org
Allow providing token_fd for BPF_MAP_CREATE command to allow controlled
BPF map creation from unprivileged process through delegated BPF token.
New BPF_F_TOKEN_FD flag is added to specify together with BPF token FD
for BPF_MAP_CREATE command.
Wire through a set of allowed BPF map types to BPF token, derived from
BPF FS at BPF token creation time. This, in combination with allowed_cmds
allows to create a narrowly-focused BPF token (controlled by privileged
agent) with a restrictive set of BPF maps that application can attempt
to create.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-5-andrii@kernel.org
This test is missing a whole bunch of checks for interface
renaming and one ifup. Presumably it was only used on a system
with renaming disabled and NetworkManager running.
Fixes: 91f430b2c49d ("selftests: net: add a test for UDP tunnel info infra")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240123060529.1033912-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If there is more than 32 cpus the bitmask will start to contain
commas, leading to:
./rps_default_mask.sh: line 36: [: 00000000,00000000: integer expression expected
Remove the commas, bash doesn't interpret leading zeroes as oct
so that should be good enough. Switch to bash, Simon reports that
not all shells support this type of substitution.
Fixes: c12e0d5f267d ("self-tests: introduce self-tests for RPS default mask")
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240122195815.638997-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
After the previous patch that speeded up the test (by avoiding neigh
discovery in IPv6), the BPF CI occasionally hits this error:
rcv tstamp unexpected pkt rcv tstamp: actual 0 == expected 0
The test complains about the cmsg returned from the recvmsg() does not
have the rcv timestamp. Setting skb->tstamp or not is
controlled by a kernel static key "netstamp_needed_key". The static
key is enabled whenever this is at least one sk with the SOCK_TIMESTAMP
set.
The test_redirect_dtime does use setsockopt() to turn on
the SOCK_TIMESTAMP for the reading sk. In the kernel
net_enable_timestamp() has a delay to enable the "netstamp_needed_key"
when CONFIG_JUMP_LABEL is set. This potential delay is the likely reason
for packet missing rcv timestamp occasionally.
This patch is to create udp sockets with SOCK_TIMESTAMP set.
It sends and receives some packets until the received packet
has a rcv timestamp. It currently retries at most 5 times with 1s
in between. This should be enough to wait for the "netstamp_needed_key".
It then holds on to the socket and only closes it at the end of the test.
This guarantees that the test has the "netstamp_needed_key" key turned
on from the beginning.
To simplify the udp sockets setup, they are sending/receiving packets
in the same netns (ns_dst is used) and communicate over the "lo" dev.
Hence, the patch enables the "lo" dev in the ns_dst.
Fixes: c803475fd8dd ("bpf: selftests: test skb->tstamp in redirect_neigh")
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240120060518.3604920-2-martin.lau@linux.dev
BPF CI has been reporting the tc_redirect_dtime test failing
from time to time:
test_inet_dtime:PASS:setns src 0 nsec
(network_helpers.c:253: errno: No route to host) Failed to connect to server
close_netns:PASS:setns 0 nsec
test_inet_dtime:FAIL:connect_to_fd unexpected connect_to_fd: actual -1 < expected 0
test_tcp_clear_dtime:PASS:tcp ip6 clear dtime ingress_fwdns_p100 0 nsec
The connect_to_fd failure (EHOSTUNREACH) is from the
test_tcp_clear_dtime() test and it is the very first IPv6 traffic
after setting up all the links, addresses, and routes.
The symptom is this first connect() is always slow. In my setup, it
could take ~3s.
After some tracing and tcpdump, the slowness is mostly spent in
the neighbor solicitation in the "ns_fwd" namespace while
the "ns_src" and "ns_dst" are fine.
I forced the kernel to drop the neighbor solicitation messages.
I can then reproduce EHOSTUNREACH. What actually happen could be:
- the neighbor advertisement came back a little slow.
- the "ns_fwd" namespace concluded a neighbor discovery failure
and triggered the ndisc_error_report() => ip6_link_failure() =>
icmpv6_send(skb, ICMPV6_DEST_UNREACH, ICMPV6_ADDR_UNREACH, 0)
- the client's connect() reports EHOSTUNREACH after receiving
the ICMPV6_DEST_UNREACH message.
The neigh table of both "ns_src" and "ns_dst" namespace has already
been manually populated but not the "ns_fwd" namespace. This patch
fixes it by manually populating the neigh table also in the "ns_fwd"
namespace.
Although the namespace configuration part had been existed before
the tc_redirect_dtime test, still Fixes-tagging the patch when
the tc_redirect_dtime test was added since it is the only test
hitting it so far.
Fixes: c803475fd8dd ("bpf: selftests: test skb->tstamp in redirect_neigh")
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240120060518.3604920-1-martin.lau@linux.dev
If CONFIG_BPF_JIT_ALWAYS_ON is not set and bpf_jit_enable is 0, there
exist 6 failed tests.
[root@linux bpf]# echo 0 > /proc/sys/net/core/bpf_jit_enable
[root@linux bpf]# echo 0 > /proc/sys/kernel/unprivileged_bpf_disabled
[root@linux bpf]# ./test_verifier | grep FAIL
#106/p inline simple bpf_loop call FAIL
#107/p don't inline bpf_loop call, flags non-zero FAIL
#108/p don't inline bpf_loop call, callback non-constant FAIL
#109/p bpf_loop_inline and a dead func FAIL
#110/p bpf_loop_inline stack locations for loop vars FAIL
#111/p inline bpf_loop call in a big program FAIL
Summary: 768 PASSED, 15 SKIPPED, 6 FAILED
The test log shows that callbacks are not allowed in non-JITed programs,
interpreter doesn't support them yet, thus these tests should be skipped
if jit is disabled.
Add an explicit flag F_NEEDS_JIT_ENABLED to those tests to mark that they
require JIT enabled in bpf_loop_inline.c, check the flag and jit_disabled
at the beginning of do_test_single() to handle this case.
With this patch:
[root@linux bpf]# echo 0 > /proc/sys/net/core/bpf_jit_enable
[root@linux bpf]# echo 0 > /proc/sys/kernel/unprivileged_bpf_disabled
[root@linux bpf]# ./test_verifier | grep FAIL
Summary: 768 PASSED, 21 SKIPPED, 0 FAILED
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240123090351.2207-3-yangtiezhu@loongson.cn
Currently, is_jit_enabled() is only used in test_progs, move it into
testing_helpers so that it can be used in test_verifier. While at it,
remove the second argument "0" of open() as Hou Tao suggested.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Hou Tao <houtao1@huawei.com>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/20240123090351.2207-2-yangtiezhu@loongson.cn
Create a new struct_ops type called bpf_testmod_ops within the bpf_testmod
module. When a struct_ops object is registered, the bpf_testmod module will
invoke test_2 from the module.
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240119225005.668602-15-thinker.li@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Adding fill_link_info test for perf event and testing we
get its values back through the bpf_link_info interface.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240119110505.400573-7-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Now that we get cookies for perf_event probes, adding tests
for cookie for kprobe/uprobe/tracepoint.
The perf_event test needs to be added completely and is coming
in following change.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240119110505.400573-6-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Adding cookies check for kprobe_multi fill_link_info test,
plus tests for invalid values related to cookies.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240119110505.400573-5-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Some of the BPF selftests use the "p" constraint in inline assembly
snippets, for input operands for MOV (rN = rM) instructions.
This is mainly done via the __imm_ptr macro defined in
tools/testing/selftests/bpf/progs/bpf_misc.h:
#define __imm_ptr(name) [name]"p"(&name)
Example:
int consume_first_item_only(void *ctx)
{
struct bpf_iter_num iter;
asm volatile (
/* create iterator */
"r1 = %[iter];"
[...]
:
: __imm_ptr(iter)
: CLOBBERS);
[...]
}
The "p" constraint is a tricky one. It is documented in the GCC manual
section "Simple Constraints":
An operand that is a valid memory address is allowed. This is for
``load address'' and ``push address'' instructions.
p in the constraint must be accompanied by address_operand as the
predicate in the match_operand. This predicate interprets the mode
specified in the match_operand as the mode of the memory reference for
which the address would be valid.
There are two problems:
1. It is questionable whether that constraint was ever intended to be
used in inline assembly templates, because its behavior really
depends on compiler internals. A "memory address" is not the same
than a "memory operand" or a "memory reference" (constraint "m"), and
in fact its usage in the template above results in an error in both
x86_64-linux-gnu and bpf-unkonwn-none:
foo.c: In function ‘bar’:
foo.c:6:3: error: invalid 'asm': invalid expression as operand
6 | asm volatile ("r1 = %[jorl]" : : [jorl]"p"(&jorl));
| ^~~
I would assume the same happens with aarch64, riscv, and most/all
other targets in GCC, that do not accept operands of the form A + B
that are not wrapped either in a const or in a memory reference.
To avoid that error, the usage of the "p" constraint in internal GCC
instruction templates is supposed to be complemented by the 'a'
modifier, like in:
asm volatile ("r1 = %a[jorl]" : : [jorl]"p"(&jorl));
Internally documented (in GCC's final.cc) as:
%aN means expect operand N to be a memory address
(not a memory reference!) and print a reference
to that address.
That works because when the modifier 'a' is found, GCC prints an
"operand address", which is not the same than an "operand".
But...
2. Even if we used the internal 'a' modifier (we shouldn't) the 'rN =
rM' instruction really requires a register argument. In cases
involving automatics, like in the examples above, we easily end with:
bar:
#APP
r1 = r10-4
#NO_APP
In other cases we could conceibly also end with a 64-bit label that
may overflow the 32-bit immediate operand of `rN = imm32'
instructions:
r1 = foo
All of which is clearly wrong.
clang happens to do "the right thing" in the current usage of __imm_ptr
in the BPF tests, because even with -O2 it seems to "reload" the
fp-relative address of the automatic to a register like in:
bar:
r1 = r10
r1 += -4
#APP
r1 = r1
#NO_APP
Which is what GCC would generate with -O0. Whether this is by chance
or by design, the compiler shouln't be expected to do that reload
driven by the "p" constraint.
This patch changes the usage of the "p" constraint in the BPF
selftests macros to use the "r" constraint instead. If a register is
what is required, we should let the compiler know.
Previous discussion in bpf@vger:
https://lore.kernel.org/bpf/87h6p5ebpb.fsf@oracle.com/T/#ef0df83d6975c34dff20bf0dd52e078f5b8ca2767
Tested in bpf-next master.
No regressions.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240123181309.19853-1-jose.marchesi@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>