14686 Commits

Author SHA1 Message Date
Maxim Mikityanskiy
c1e6148cb4 bpf: Preserve boundaries and track scalars on narrowing fill
When the width of a fill is smaller than the width of the preceding
spill, the information about scalar boundaries can still be preserved,
as long as it's coerced to the right width (done by coerce_reg_to_size).
Even further, if the actual value fits into the fill width, the ID can
be preserved as well for further tracking of equal scalars.

Implement the above improvements, which make narrowing fills behave the
same as narrowing spills and MOVs between registers.

Two tests are adjusted to accommodate for endianness differences and to
take into account that it's now allowed to do a narrowing fill from the
least significant bits.

reg_bounds_sync is added to coerce_reg_to_size to correctly adjust
umin/umax boundaries after the var_off truncation. For example, a 64-bit
value 0xXXXXXXXX00000000, when read as a 32-bit value, gets umin = 0,
umax = 0xFFFFFFFF, var_off = (0x0; 0xffffffff00000000), which needs to be
synced down to umax = 0; otherwise reg_bounds_sanity_check doesn't pass.
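
The umax = 0 example can be reproduced with a small user-space model. This
is only a sketch under stated assumptions: toy_tnum, toy_reg and the toy_*
functions are hypothetical stand-ins and not the verifier's real data
structures or code, and the var_off from the message is taken as the state
before truncation (an interpretation, not something the message states).

```c
#include <stdint.h>

/* Toy tristate number: bits set in mask are unknown, all other bits are
 * known and equal to the corresponding bits of value. */
struct toy_tnum {
	uint64_t value;
	uint64_t mask;
};

/* Hypothetical reduced register state; not the kernel's bpf_reg_state. */
struct toy_reg {
	struct toy_tnum var_off;
	uint64_t umin;
	uint64_t umax;
};

/* All-ones mask covering the low `size` bytes (size is 1..8). */
static uint64_t toy_size_mask(unsigned int size)
{
	return size >= 8 ? UINT64_MAX : (1ULL << (size * 8)) - 1;
}

/* Tighten the unsigned bounds with what var_off already proves. */
static void toy_bounds_sync(struct toy_reg *reg)
{
	uint64_t lo = reg->var_off.value;                     /* unknown bits = 0 */
	uint64_t hi = reg->var_off.value | reg->var_off.mask; /* unknown bits = 1 */

	if (reg->umin < lo)
		reg->umin = lo;
	if (reg->umax > hi)
		reg->umax = hi;
}

/* Simplified narrowing: truncate var_off, reset the unsigned range to the
 * full range of the new width, then sync the bounds. */
static void toy_coerce_reg_to_size(struct toy_reg *reg, unsigned int size)
{
	uint64_t keep = toy_size_mask(size);

	reg->var_off.value &= keep;
	reg->var_off.mask &= keep;
	reg->umin = 0;
	reg->umax = keep;
	toy_bounds_sync(reg);
}
```

Without the final toy_bounds_sync() call, the model would be left with
umax = 0xFFFFFFFF next to a fully known var_off of 0, which is the
inconsistency the message describes.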

Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240127175237.526726-4-maxtram95@gmail.com
2024-02-02 13:22:14 -08:00
Maxim Mikityanskiy
6be503cec6 selftests/bpf: Test tracking spilled unbounded scalars
The previous commit added tracking for unbounded scalars on spill. Add
the test case to check the new functionality.

Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240127175237.526726-3-maxtram95@gmail.com
2024-02-02 13:22:14 -08:00
Maxim Mikityanskiy
e67ddd9b1c bpf: Track spilled unbounded scalars
Support the pattern where an unbounded scalar is spilled to the stack,
then boundary checks are performed on the src register, after which the
stack frame slot is refilled into a register.

Before this commit, the verifier didn't treat the src register and the
stack slot as related if the src register was an unbounded scalar. The
register state wasn't copied, the id wasn't preserved, and the stack
slot was marked as STACK_MISC. Subsequent boundary checks on the src
register wouldn't result in updating the boundaries of the spilled
variable on the stack.

After this commit, the verifier will preserve the bond between src and
dst even if src is unbounded, which makes it possible to do boundary
checks on src and refill dst later, still remembering its boundaries.
Such a pattern is sometimes generated by clang when compiling complex
long functions.
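
The idea of keeping src and the stack slot linked can be illustrated with a
toy model. This is a sketch, not verifier code: toy_scalar, toy_spill and
toy_apply_upper_bound are hypothetical names, and the model assumes that
linking is expressed through a shared non-zero id, as the message suggests.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy scalar state: id 0 means "not linked to any other scalar",
 * umax == UINT64_MAX means "no known upper bound". */
struct toy_scalar {
	uint32_t id;
	uint64_t umax;
};

static uint32_t toy_next_id = 1;

/* Spill: copy the full state to the stack slot and make sure both copies
 * share a non-zero id, even when src is unbounded. */
static void toy_spill(struct toy_scalar *src, struct toy_scalar *slot)
{
	if (src->id == 0)
		src->id = toy_next_id++;
	*slot = *src;
}

static void toy_narrow(struct toy_scalar *s, uint64_t limit)
{
	if (s->umax > limit)
		s->umax = limit;
}

/* Model of the "true" branch of `if (checked <= limit)`: the new bound is
 * applied to the checked scalar and to every other scalar with its id. */
static void toy_apply_upper_bound(struct toy_scalar *checked, uint64_t limit,
				  struct toy_scalar **others, size_t n)
{
	size_t i;

	toy_narrow(checked, limit);
	if (checked->id == 0)
		return;
	for (i = 0; i < n; i++)
		if (others[i]->id == checked->id)
			toy_narrow(others[i], limit);
}
```

In this model, a bound learned on the register after the spill is also
visible on the slot, so a later fill would see the narrowed range.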

One test is adjusted to reflect that now unbounded scalars are tracked.

Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240127175237.526726-2-maxtram95@gmail.com
2024-02-02 13:22:14 -08:00
Andrii Nakryiko
943b043aee selftests/bpf: Fix bench runner SIGSEGV
Some benchmarks don't have either "consumer" or "producer" sides. For
example, trig-tp and other BPF triggering benchmarks don't have
consumers, as they only do "producing" by calling into syscall or
predefined uprobes. As such it's valid for some benchmarks to have zero
consumers or producers. So allow specifying `-c0` explicitly.

This triggers another problem. If a benchmark doesn't support either the
consumer or producer side, the consumer_thread/producer_thread callback
will be NULL, but the benchmark runner will attempt to use those NULL
callbacks to create threads anyway. So instead of crashing with SIGSEGV
in case of a misconfigured benchmark, detect the condition and report an
error.
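
A check of this kind might look like the following sketch. The struct and
function names (toy_bench, toy_validate_bench, toy_trigger_producer) are
hypothetical and only model the producer_thread/consumer_thread callbacks
named above; this is not the bench runner's actual code.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, reduced benchmark description. */
struct toy_bench {
	const char *name;
	void *(*producer_thread)(void *ctx);
	void *(*consumer_thread)(void *ctx);
};

/* Example producer callback for a producer-only benchmark. */
static void *toy_trigger_producer(void *ctx)
{
	return ctx;
}

/* Return 0 if the requested thread counts can be served by the benchmark,
 * -1 (with a message on stderr) if a needed callback is missing. */
static int toy_validate_bench(const struct toy_bench *b,
			      int producers, int consumers)
{
	if (producers > 0 && !b->producer_thread) {
		fprintf(stderr, "benchmark '%s' doesn't support producers\n",
			b->name);
		return -1;
	}
	if (consumers > 0 && !b->consumer_thread) {
		fprintf(stderr, "benchmark '%s' doesn't support consumers\n",
			b->name);
		return -1;
	}
	return 0;
}
```

Running the check before any thread is created turns a NULL function
pointer call into an ordinary error path.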

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240201172027.604869-6-andrii@kernel.org
2024-02-01 22:16:12 +01:00
Pu Lehui
994ff2f797 selftests/bpf: Enable inline bpf_kptr_xchg() test for RV64
Enable inline bpf_kptr_xchg() test for RV64; the test has passed, as
shown below:

Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20240130124659.670321-3-pulehui@huaweicloud.com
2024-02-01 11:35:48 +01:00
Daniel Xu
6f3189f38a bpf: treewide: Annotate BPF kfuncs in BTF
This commit marks kfuncs as such inside the .BTF_ids section. The upshot
of these annotations is that we'll be able to automatically generate
kfunc prototypes for downstream users. The process is as follows:

1. In source, use BTF_KFUNCS_START/END macro pair to mark kfuncs
2. During build, pahole injects into BTF a "bpf_kfunc" BTF_DECL_TAG for
   each function inside BTF_KFUNCS sets
3. At runtime, vmlinux or module BTF is made available in sysfs
4. At runtime, bpftool (or similar) can look at provided BTF and
   generate appropriate prototypes for functions with "bpf_kfunc" tag

To ensure future kfuncs are similarly tagged, we now also return an error
inside kfunc registration for untagged kfuncs. For vmlinux kfuncs,
we also WARN(), as initcall machinery does not handle errors.

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Acked-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/e55150ceecbf0a5d961e608941165c0bee7bc943.1706491398.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-31 20:40:56 -08:00
Manu Bretelle
2ef61296d2 selftests/bpf: Disable IPv6 for lwt_redirect test
After a recent change in the vmtest runner, this test started failing
sporadically.

Investigation showed that this test was subject to a race condition which
got exacerbated after the vm runner change. The symptom is that the
logic that waits for an ICMPv4 packet is naive and will break if 5 or
more non-ICMPv4 packets make it to tap0.
When ICMPv6 is enabled, the kernel will generate traffic such as ICMPv6
router solicitation...
On a system with good performance, the expected ICMPv4 packet would very
likely make it to the network interface promptly, but on a system with
poor performance, those "guarantees" do not hold true anymore.

Given that the test is IPv4 only, this change disables IPv6 in the test
netns by setting `net.ipv6.conf.all.disable_ipv6` to 1.
This essentially leaves "ping" as the sole generator of traffic in the
network namespace.
If this test were to be made IPv6 compatible, the logic in
`wait_for_packet` would need to be modified.

In more detail:

At a high level, the test does:
- create a new namespace
- in `setup_redirect_target` set up lo, tap0, and link_err interfaces as
  well as add 2 routes that attach ingress/egress sections of
  `test_lwt_redirect.bpf.o` to the xmit path.
- in `send_and_capture_test_packets` send an ICMP packet and read off
  the tap interface (using `wait_for_packet`) to check that an ICMP packet
  with the right size is read.

`wait_for_packet` will try to read `max_retry` (5) times from the tap0
fd looking for an ICMPv4 packet matching some criteria.
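
The effect of the retry limit can be shown with a simplified model. This is
a sketch only: toy_wait_for_packet reads from an in-memory array instead of
a tap fd, and the enum and function names are hypothetical; the real helper
also deals with timeouts and read errors, which are left out here.

```c
#include <stddef.h>

/* Hypothetical packet classification for the model. */
enum toy_pkt { TOY_ICMPV4, TOY_ICMPV6, TOY_OTHER };

#define TOY_MAX_RETRY 5

/* Look at no more than TOY_MAX_RETRY packets, in arrival order.
 * Return 1 if one of them is the ICMPv4 packet, 0 otherwise. */
static int toy_wait_for_packet(const enum toy_pkt *queue, size_t len)
{
	size_t i;

	for (i = 0; i < len && i < TOY_MAX_RETRY; i++)
		if (queue[i] == TOY_ICMPV4)
			return 1;
	return 0;
}
```

With the ICMPv4 packet in third position the model returns 1; once five
IPv6 packets arrive ahead of it, the model returns 0 even though the
expected packet is in the queue.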

The problem is that when we set up the `tap0` interface, because IPv6 is
enabled by default, traffic such as Router solicitation is sent through
tap0, as in:

  # tcpdump -r /tmp/lwt_redirect.pcap
  reading from file /tmp/lwt_redirect.pcap, link-type EN10MB (Ethernet)
  04:46:23.578352 IP6 :: > ff02::1:ffc0:4427: ICMP6, neighbor solicitation, who has fe80::fcba:dff:fec0:4427, length 32
  04:46:23.659522 IP6 :: > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
  04:46:24.389169 IP 10.0.0.1 > 20.0.0.9: ICMP echo request, id 122, seq 1, length 108
  04:46:24.618599 IP6 fe80::fcba:dff:fec0:4427 > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
  04:46:24.619985 IP6 fe80::fcba:dff:fec0:4427 > ff02::2: ICMP6, router solicitation, length 16
  04:46:24.767326 IP6 fe80::fcba:dff:fec0:4427 > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
  04:46:28.936402 IP6 fe80::fcba:dff:fec0:4427 > ff02::2: ICMP6, router solicitation, length 16

If `wait_for_packet` sees 5 non-ICMPv4 packets, it will return 0, which is what we see in:

  2024-01-31T03:51:25.0336992Z test_lwt_redirect_run:PASS:netns_create 0 nsec
  2024-01-31T03:51:25.0341309Z open_netns:PASS:malloc token 0 nsec
  2024-01-31T03:51:25.0344844Z open_netns:PASS:open /proc/self/ns/net 0 nsec
  2024-01-31T03:51:25.0350071Z open_netns:PASS:open netns fd 0 nsec
  2024-01-31T03:51:25.0353516Z open_netns:PASS:setns 0 nsec
  2024-01-31T03:51:25.0356560Z test_lwt_redirect_run:PASS:setns 0 nsec
  2024-01-31T03:51:25.0360140Z open_tuntap:PASS:open(/dev/net/tun) 0 nsec
  2024-01-31T03:51:25.0363822Z open_tuntap:PASS:ioctl(TUNSETIFF) 0 nsec
  2024-01-31T03:51:25.0367402Z open_tuntap:PASS:fcntl(O_NONBLOCK) 0 nsec
  2024-01-31T03:51:25.0371167Z setup_redirect_target:PASS:open_tuntap 0 nsec
  2024-01-31T03:51:25.0375180Z setup_redirect_target:PASS:if_nametoindex 0 nsec
  2024-01-31T03:51:25.0379929Z setup_redirect_target:PASS:ip link add link_err type dummy 0 nsec
  2024-01-31T03:51:25.0384874Z setup_redirect_target:PASS:ip link set lo up 0 nsec
  2024-01-31T03:51:25.0389678Z setup_redirect_target:PASS:ip addr add dev lo 10.0.0.1/32 0 nsec
  2024-01-31T03:51:25.0394814Z setup_redirect_target:PASS:ip link set link_err up 0 nsec
  2024-01-31T03:51:25.0399874Z setup_redirect_target:PASS:ip link set tap0 up 0 nsec
  2024-01-31T03:51:25.0407731Z setup_redirect_target:PASS:ip route add 10.0.0.0/24 dev link_err encap bpf xmit obj test_lwt_redirect.bpf.o sec redir_ingress 0 nsec
  2024-01-31T03:51:25.0419105Z setup_redirect_target:PASS:ip route add 20.0.0.0/24 dev link_err encap bpf xmit obj test_lwt_redirect.bpf.o sec redir_egress 0 nsec
  2024-01-31T03:51:25.0427209Z test_lwt_redirect_normal:PASS:setup_redirect_target 0 nsec
  2024-01-31T03:51:25.0431424Z ping_dev:PASS:if_nametoindex 0 nsec
  2024-01-31T03:51:25.0437222Z send_and_capture_test_packets:FAIL:wait_for_epacket unexpected wait_for_epacket: actual 0 != expected 1
  2024-01-31T03:51:25.0448298Z (/tmp/work/bpf/bpf/tools/testing/selftests/bpf/prog_tests/lwt_redirect.c:175: errno: Success) test_lwt_redirect_normal egress test fails
  2024-01-31T03:51:25.0457124Z close_netns:PASS:setns 0 nsec

When running in a VM with potential resource constraints, the odds that
`ping` is not scheduled very soon after bringing `tap0` up increase,
and with them the chances of our ICMP packet being pushed to position 6+
in the network trace.

To confirm this indeed solves the issue, I ran the test 100 times in a
row with:

  errors=0
  successes=0
  for i in `seq 1 100`
  do
    ./test_progs -t lwt_redirect/lwt_redirect_normal
    if [ $? -eq 0 ]; then
      successes=$((successes+1))
    else
      errors=$((errors+1))
    fi
  done
  echo "successes: $successes/errors: $errors"

While this test would previously fail at least a couple of times every
10 runs, here it ran 100 times with no error.

Fixes: 43a7c3ef8a15 ("selftests/bpf: Add lwt_xmit tests for BPF_REDIRECT")
Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240131053212.2247527-1-chantr4@gmail.com
2024-01-31 09:14:03 -08:00
Andrii Nakryiko
ea9d561686 selftests/bpf: convert bpf_rdonly_cast() uses to bpf_core_cast() macro
Use more ergonomic bpf_core_cast() macro instead of bpf_rdonly_cast() in
selftests code.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240130212023.183765-3-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-01-30 15:55:50 -08:00
Andrii Nakryiko
20d59ee551 libbpf: add bpf_core_cast() macro
Add bpf_core_cast() macro that wraps bpf_rdonly_cast() kfunc. It's more
ergonomic than the kfunc, as it automatically extracts btf_id with
bpf_core_type_id_kernel(), and works with type names. It also casts the
result to a (T *) pointer. See the definition of the macro; it's
self-explanatory.

libbpf declares bpf_rdonly_cast() extern as __weak __ksym and should be
safe to not conflict with other possible declarations in user code.

But we do have a conflict with current BPF selftests that declare their
externs with first argument as `void *obj`, while libbpf opts into more
permissive `const void *obj`. This causes conflict, so we fix up BPF
selftests uses in the same patch.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240130212023.183765-2-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-01-30 15:55:50 -08:00
Andrii Nakryiko
c381203ead selftests/bpf: add trusted global subprog arg tests
Add a bunch of test cases validating behavior of __arg_trusted and its
combination with __arg_nullable tag. We also validate CO-RE flavor
support by kernel for __arg_trusted args.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240130000648.2144827-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-30 09:41:50 -08:00
Jose E. Marchesi
2421905680 bpf: Move -Wno-compare-distinct-pointer-types to BPF_CFLAGS
Clang supports enabling/disabling certain conversion diagnostics via
the -W[no-]compare-distinct-pointer-types command line options.
Disabling this warning is required by some BPF selftests due to
-Werror.  Until very recently GCC would emit these warnings
unconditionally, which was a problem for gcc-bpf, but we added support
for the command-line options to GCC upstream [1].

This patch moves the -Wno-compare-distinct-pointer-types from
CLANG_CFLAGS to BPF_CFLAGS in selftests/bpf/Makefile so the option
is also used in gcc-bpf builds, not just in clang builds.

Tested in bpf-next master.
No regressions.

  [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627769.html

Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240130113624.24940-1-jose.marchesi@oracle.com
2024-01-30 16:13:40 +01:00
Jose E. Marchesi
27a90b14b9 bpf: Build type-punning BPF selftests with -fno-strict-aliasing
A few BPF selftests perform type punning and they may break strict
aliasing rules, which are exploited by both GCC and clang by default
while optimizing.  This can lead to broken compiled programs.

This patch disables strict aliasing for these particular tests, by
means of the -fno-strict-aliasing command line option.  This will make
sure these tests are optimized properly even if some strict aliasing
rule gets violated.

After this patch, GCC is able to build all the selftests without
warning about potential strict aliasing issues.

bpf@vger discussion on strict aliasing and BPF selftests:
https://lore.kernel.org/bpf/bae1205a-b6e5-4e46-8e20-520d7c327f7a@linux.dev/T/#t

Tested in bpf-next master.
No regressions.

Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/bae1205a-b6e5-4e46-8e20-520d7c327f7a@linux.dev
Link: https://lore.kernel.org/bpf/20240130110343.11217-1-jose.marchesi@oracle.com
2024-01-30 16:08:18 +01:00
Jose E. Marchesi
646751d523 bpf: Use -Wno-error in certain tests when building with GCC
Certain BPF selftests contain code that, albeit being legal C, triggers
warnings in GCC that cannot be disabled.  This is the case, for example,
for the tests

  progs/btf_dump_test_case_bitfields.c
  progs/btf_dump_test_case_namespacing.c
  progs/btf_dump_test_case_packing.c
  progs/btf_dump_test_case_padding.c
  progs/btf_dump_test_case_syntax.c

which contain struct type declarations inside function parameter
lists.  This is problematic, because:

- The BPF selftests are built with -Werror.

- The Clang and GCC compilers sometimes differ in the handling of
  warnings.  One compiler may emit warnings for code that the other
  compiles silently, and one compiler may offer the possibility to
  disable certain warnings, while the other doesn't.

In order to overcome this problem, this patch modifies the
tools/testing/selftests/bpf/Makefile in order to:

1. Enable the possibility of specifying per-source-file extra CFLAGS.
   This is done by defining a make variable like:

   <source-filename>-CFLAGS := <whateverflags>

   And then modifying the proper Make rule in order to use these flags
   when compiling <source-filename>.

2. Use the mechanism above to add -Wno-error to CFLAGS for the
   following selftests:

   progs/btf_dump_test_case_bitfields.c
   progs/btf_dump_test_case_namespacing.c
   progs/btf_dump_test_case_packing.c
   progs/btf_dump_test_case_padding.c
   progs/btf_dump_test_case_syntax.c

   Note the corresponding -CFLAGS variables for these files are
   defined only if the selftests are being built with GCC.

Note that, while compiler pragmas can generally be used to disable
particular warnings per file, this 1) is only possible for warnings
that can actually be disabled on the command line, i.e. that have
-Wno-FOO options, and 2) doesn't apply to -Wno-error.

Tested in bpf-next master branch.
No regressions.

Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240127100702.21549-1-jose.marchesi@oracle.com
2024-01-29 22:38:48 +01:00
Martin KaFai Lau
fbaf59a9f5 selftests/bpf: Remove "&>" usage in the selftests
In s390, CI reported that the sock_iter_batch selftest
hits this error very often:

2024-01-26T16:56:49.3091804Z Bind /proc/self/ns/net -> /run/netns/sock_iter_batch_netns failed: No such file or directory
2024-01-26T16:56:49.3149524Z Cannot remove namespace file "/run/netns/sock_iter_batch_netns": No such file or directory
2024-01-26T16:56:49.3772213Z test_sock_iter_batch:FAIL:ip netns add sock_iter_batch_netns unexpected error: 256 (errno 0)

It happens very often on s390, but Manu also noticed that it happens
occasionally on other architectures.

It turns out the default dash shell does not recognize "&>"
as a redirection operator, so the command went to the background.
In the sock_iter_batch selftest, the "ip netns delete" went
into the background and then raced with the following "ip netns add"
command.

This patch replaces the "&> /dev/null" usage with ">/dev/null 2>&1"
and does this redirection in the SYS_NOFAIL macro instead of doing
it individually in each caller. The SYS_NOFAIL callers do not care
about failure, so there is no harm in doing this redirection even if
some of the existing callers do not redirect to /dev/null now.

It touches different test files, so I skipped the Fixes tags
in this patch. Some of the changed tests do not use "&>"
but they use SYS_NOFAIL, so these tests are also
changed to avoid doing their own redirection because
SYS_NOFAIL does it internally now.
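
A helper that always appends the portable redirection could be sketched as
follows. toy_sys_nofail and TOY_ALL_TO_DEV_NULL are hypothetical names for
illustration; this is a plain function written under the assumption that
commands are run through system(), not the selftests' actual SYS_NOFAIL
macro. In POSIX sh, "cmd &> file" parses as "cmd &" followed by a separate
"> file", which is why ">/dev/null 2>&1" is used instead.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Portable way to discard both stdout and stderr in POSIX sh. */
#define TOY_ALL_TO_DEV_NULL " >/dev/null 2>&1"

/* Run cmd through system() (the system shell) with all output discarded.
 * Returns system()'s status, or -1 if the command does not fit the buffer. */
static int toy_sys_nofail(const char *cmd)
{
	char buf[1024];
	int n;

	n = snprintf(buf, sizeof(buf), "%s%s", cmd, TOY_ALL_TO_DEV_NULL);
	if (n < 0 || (size_t)n >= sizeof(buf))
		return -1;
	return system(buf);
}
```

Because the redirection lives in one place, callers pass only the command
itself, and the command runs in the foreground under any POSIX shell.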

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240127025017.950825-1-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-29 12:48:32 -08:00
Geliang Tang
f149d03f45 selftests/bpf: Drop return in bpf_testmod_exit
bpf_testmod_exit() does not need to have a return value (given the void),
so this patch drops this useless 'return' in it.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/5765b287ea088f0c820f2a834faf9b20fb2f8215.1706442113.git.tanggeliang@kylinos.cn
2024-01-29 16:41:29 +01:00
Jakub Kicinski
92046e83c0 bpf-next-for-netdev

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-01-26

We've added 107 non-merge commits during the last 4 day(s) which contain
a total of 101 files changed, 6009 insertions(+), 1260 deletions(-).

The main changes are:

1) Add BPF token support to delegate a subset of BPF subsystem
   functionality from privileged system-wide daemons such as systemd
   through special mount options for userns-bound BPF fs to a trusted
   & unprivileged application. With addressed changes from Christian
   and Linus' reviews, from Andrii Nakryiko.

2) Support registration of struct_ops types from modules which helps
   projects like fuse-bpf that seek to implement a new struct_ops type,
   from Kui-Feng Lee.

3) Add support for retrieval of cookies for perf/kprobe multi links,
   from Jiri Olsa.

4) Bigger batch of prep-work for the BPF verifier to eventually support
   preserving boundaries and tracking scalars on narrowing fills,
   from Maxim Mikityanskiy.

5) Extend the tc BPF flavor to support arbitrary TCP SYN cookies to help
   with the scenario of SYN floods, from Kuniyuki Iwashima.

6) Add code generation to inline the bpf_kptr_xchg() helper which
   improves performance when stashing/popping the allocated BPF objects,
   from Hou Tao.

7) Extend BPF verifier to track aligned ST stores as imprecise spilled
   registers, from Yonghong Song.

8) Several fixes to BPF selftests around inline asm constraints and
   unsupported VLA code generation, from Jose E. Marchesi.

9) Various updates to the BPF IETF instruction set draft document such
   as the introduction of conformance groups for instructions,
   from Dave Thaler.

10) Fix BPF verifier to make infinite loop detection in is_state_visited()
    exact to catch some too lax spill/fill corner cases,
    from Eduard Zingerman.

11) Refactor the BPF verifier pointer ALU check to allow ALU explicitly
    instead of implicitly for various register types, from Hao Sun.

12) Fix the flaky tc_redirect_dtime BPF selftest due to slowness
    in neighbor advertisement at setup time, from Martin KaFai Lau.

13) Change BPF selftests to skip callback tests for the case when the
    JIT is disabled, from Tiezhu Yang.

14) Add a small extension to libbpf which allows to auto create
    a map-in-map's inner map, from Andrey Grafin.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (107 commits)
  selftests/bpf: Add missing line break in test_verifier
  bpf, docs: Clarify definitions of various instructions
  bpf: Fix error checks against bpf_get_btf_vmlinux().
  bpf: One more maintainer for libbpf and BPF selftests
  selftests/bpf: Incorporate LSM policy to token-based tests
  selftests/bpf: Add tests for LIBBPF_BPF_TOKEN_PATH envvar
  libbpf: Support BPF token path setting through LIBBPF_BPF_TOKEN_PATH envvar
  selftests/bpf: Add tests for BPF object load with implicit token
  selftests/bpf: Add BPF object loading tests with explicit token passing
  libbpf: Wire up BPF token support at BPF object level
  libbpf: Wire up token_fd into feature probing logic
  libbpf: Move feature detection code into its own file
  libbpf: Further decouple feature checking logic from bpf_object
  libbpf: Split feature detectors definitions from cached results
  selftests/bpf: Utilize string values for delegate_xxx mount options
  bpf: Support symbolic BPF FS delegation mount options
  bpf: Fail BPF_TOKEN_CREATE if no delegation option was set on BPF FS
  bpf,selinux: Allocate bpf_security_struct per BPF token
  selftests/bpf: Add BPF token-enabled tests
  libbpf: Add BPF token support to bpf_prog_load() API
  ...
====================

Link: https://lore.kernel.org/r/20240126215710.19855-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-26 21:08:22 -08:00
Tiezhu Yang
fa7178b0f1 selftests/bpf: Add missing line break in test_verifier
There are no line breaks in the test log for test_verifier #106 ~ #111
if jit is disabled; add the missing line break at the end of printf()
to fix it.

Without this patch:

  [root@linux bpf]# echo 0 > /proc/sys/net/core/bpf_jit_enable
  [root@linux bpf]# ./test_verifier 106
  #106/p inline simple bpf_loop call SKIP (requires BPF JIT)Summary: 0 PASSED, 1 SKIPPED, 0 FAILED

With this patch:

  [root@linux bpf]# echo 0 > /proc/sys/net/core/bpf_jit_enable
  [root@linux bpf]# ./test_verifier 106
  #106/p inline simple bpf_loop call SKIP (requires BPF JIT)
  Summary: 0 PASSED, 1 SKIPPED, 0 FAILED

Fixes: 0b50478fd877 ("selftests/bpf: Skip callback tests if jit is disabled in test_verifier")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240126015736.655-1-yangtiezhu@loongson.cn
2024-01-26 11:09:32 -08:00
David Ahern
70863c902d selftest: Show expected and actual return codes for test failures in fcnal-test
Capture expected and actual return codes for a test that fails in
the fcnal-test suite.

Signed-off-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240124214117.24687-4-dsahern@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-25 17:14:11 -08:00
David Ahern
79bf0d4a07 selftest: Fix set of ping_group_range in fcnal-test
ping_group_range sysctl has a compound value which does not go
through the various function layers intact. Create a helper
function to bypass the layers and correctly set the value.

Signed-off-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240124214117.24687-3-dsahern@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-25 17:14:11 -08:00
David Ahern
ad9b701aed selftest: Update PATH for nettest in fcnal-test
Allow fcnal-test.sh to be run from top level directory in the
kernel repo as well as from tools/testing/selftests/net by
setting the PATH to find the in-tree nettest.

Signed-off-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240124214117.24687-2-dsahern@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-25 17:14:11 -08:00
Arseniy Krasnov
767ec326f9 vsock/test: print type for SOCK_SEQPACKET
SOCK_SEQPACKET is supported for virtio transport, so do not interpret
such type of socket as unknown.

Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20240124193255.3417803-1-avkrasnov@salutedevices.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-25 16:39:21 -08:00
Pedro Tammela
8981a85e1b selftests: tc-testing: return fail if a test fails in setup/teardown
As of today tests throwing exceptions in setup/teardown phase are
treated as skipped but they should really be failures.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/20240124181933.75724-6-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-25 16:38:16 -08:00
Pedro Tammela
d17d0e3337 selftests: tc-testing: enable all tdc tests
For the longest time tdc ran only actions and qdiscs tests.
It's time to enable all the remaining tests so every user visible
piece of TC is tested by the downstream CIs.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/20240124181933.75724-5-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-25 16:38:16 -08:00
Pedro Tammela
3007d8712c selftests: tc-testing: adjust fq test to latest iproute2
Adjust the fq verify regex to the latest iproute2

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/20240124181933.75724-4-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-25 16:38:16 -08:00
Pedro Tammela
4f4d384121 selftests: tc-testing: check if 'jq' is available in taprio tests
If 'jq' is not available, the taprio tests might enter an infinite loop.
Use the "dependsOn" feature from tdc to check if jq is present; if it's
not, the test is skipped.

Suggested-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/20240124181933.75724-3-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-25 16:38:16 -08:00
Pedro Tammela
14a12e6c0b selftests: tc-testing: add missing netfilter config
On a default config + tc-testing config build, tdc will miss
all the netfilter related tests because it's missing:
   CONFIG_NETFILTER=y

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/20240124181933.75724-2-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-25 16:38:16 -08:00
Jakub Kicinski
06f609b311 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-25 14:20:08 -08:00
Linus Torvalds
ecb1b8288d Including fixes from bpf, netfilter and WiFi.
Current release - regressions:
 
   - bpf: fix a kernel crash for the riscv 64 JIT
 
   - bnxt_en: fix memory leak in bnxt_hwrm_get_rings()
 
   - revert "net: macsec: use skb_ensure_writable_head_tail to expand the skb"
 
 Previous releases - regressions:
 
   - core: fix removing a namespace with conflicting altnames
 
   - tc/flower: fix chain template offload memory leak
 
   - tcp:
     - make sure init the accept_queue's spinlocks once
     - fix autocork on CPUs with weak memory model
 
   - udp: fix busy polling
 
   - mlx5e:
     - fix out-of-bound read in port timestamping
     - fix peer flow lists corruption
 
   - iwlwifi: fix a memory corruption
 
 Previous releases - always broken:
 
   - netfilter:
     - nft_chain_filter: handle NETDEV_UNREGISTER for inet/ingress basechain
     - nft_limit: reject configurations that cause integer overflow
 
   - bpf: fix bpf_xdp_adjust_tail() with XSK zero-copy mbuf, avoiding
     a NULL pointer dereference upon shrinking
 
   - llc: make llc_ui_sendmsg() more robust against bonding changes
 
   - smc: fix illegal rmb_desc access in SMC-D connection dump
 
   - dpll: fix pin dump crash for rebound module
 
   - bnxt_en: fix possible crash after creating sw mqprio TCs
 
   - hv_netvsc: calculate correct ring size when PAGE_SIZE is not 4 Kbytes
 
 Misc:
 
   - several self-tests fixes for better integration with the netdev CI
 
   - added several missing modules descriptions
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'net-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bpf, netfilter and WiFi.

  Jakub is doing a lot of work to include the self-tests in our CI, as a
  result a significant amount of self-tests related fixes is flowing in
  (and will likely continue in the next few weeks).

  Current release - regressions:

   - bpf: fix a kernel crash for the riscv 64 JIT

   - bnxt_en: fix memory leak in bnxt_hwrm_get_rings()

   - revert "net: macsec: use skb_ensure_writable_head_tail to expand
     the skb"

  Previous releases - regressions:

   - core: fix removing a namespace with conflicting altnames

   - tc/flower: fix chain template offload memory leak

   - tcp:
      - make sure init the accept_queue's spinlocks once
      - fix autocork on CPUs with weak memory model

   - udp: fix busy polling

   - mlx5e:
      - fix out-of-bound read in port timestamping
      - fix peer flow lists corruption

   - iwlwifi: fix a memory corruption

  Previous releases - always broken:

   - netfilter:
      - nft_chain_filter: handle NETDEV_UNREGISTER for inet/ingress
        basechain
      - nft_limit: reject configurations that cause integer overflow

   - bpf: fix bpf_xdp_adjust_tail() with XSK zero-copy mbuf, avoiding a
     NULL pointer dereference upon shrinking

   - llc: make llc_ui_sendmsg() more robust against bonding changes

   - smc: fix illegal rmb_desc access in SMC-D connection dump

   - dpll: fix pin dump crash for rebound module

   - bnxt_en: fix possible crash after creating sw mqprio TCs

   - hv_netvsc: calculate correct ring size when PAGE_SIZE is not 4kB

  Misc:

   - several self-tests fixes for better integration with the netdev CI

   - added several missing modules descriptions"

* tag 'net-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits)
  tsnep: Fix XDP_RING_NEED_WAKEUP for empty fill ring
  tsnep: Remove FCS for XDP data path
  net: fec: fix the unhandled context fault from smmu
  selftests: bonding: do not test arp/ns target with mode balance-alb/tlb
  fjes: fix memleaks in fjes_hw_setup
  i40e: update xdp_rxq_info::frag_size for ZC enabled Rx queue
  i40e: set xdp_rxq_info::frag_size
  xdp: reflect tail increase for MEM_TYPE_XSK_BUFF_POOL
  ice: update xdp_rxq_info::frag_size for ZC enabled Rx queue
  intel: xsk: initialize skb_frag_t::bv_offset in ZC drivers
  ice: remove redundant xdp_rxq_info registration
  i40e: handle multi-buffer packets that are shrunk by xdp prog
  ice: work on pre-XDP prog frag count
  xsk: fix usage of multi-buffer BPF helpers for ZC XDP
  xsk: make xsk_buff_pool responsible for clearing xdp_buff::flags
  xsk: recycle buffer in case Rx queue was full
  net: fill in MODULE_DESCRIPTION()s for rvu_mbox
  net: fill in MODULE_DESCRIPTION()s for litex
  net: fill in MODULE_DESCRIPTION()s for fsl_pq_mdio
  net: fill in MODULE_DESCRIPTION()s for fec
  ...
2024-01-25 10:58:35 -08:00
Hangbin Liu
a2933a8759 selftests: bonding: do not test arp/ns target with mode balance-alb/tlb
The prio_arp/ns tests hard code the mode to active-backup. At the same
time, the balance-alb/tlb modes do not support arp/ns targets. So remove
the prio_arp/ns tests from the loop and only test active-backup mode.

Fixes: 481b56e0391e ("selftests: bonding: re-format bond option tests")
Reported-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Closes: https://lore.kernel.org/netdev/17415.1705965957@famine/
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Link: https://lore.kernel.org/r/20240123075917.1576360-1-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-25 09:50:54 +01:00
Arseniy Krasnov
e18c709230 vsock/test: add '--peer-port' input argument
Accept the port for a given CID as an input argument instead of using the
hardcoded value '1234'. This allows running different test instances
on a single CID. The port argument is optional; if it is not set, the
default value is '1234', which preserves the previous
behaviour.
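
As a rough illustration of an optional '--peer-port' argument with a
default of 1234, a minimal sketch using getopt_long() could look like the
following. The helper name parse_peer_port() and the macro
DEFAULT_PEER_PORT are hypothetical and are not taken from the vsock test
sources.

```c
#include <getopt.h>
#include <stdlib.h>

#define DEFAULT_PEER_PORT 1234	/* matches the previously hardcoded value */

/*
 * Hypothetical helper: return the value passed via --peer-port, or
 * DEFAULT_PEER_PORT when the option is absent.
 */
unsigned int parse_peer_port(int argc, char **argv)
{
	static const struct option longopts[] = {
		{ "peer-port", required_argument, NULL, 'p' },
		{ NULL, 0, NULL, 0 },
	};
	unsigned int port = DEFAULT_PEER_PORT;
	int opt;

	optind = 1;	/* restart scanning so the helper can be called again */
	while ((opt = getopt_long(argc, argv, "", longopts, NULL)) != -1) {
		if (opt == 'p')
			port = (unsigned int)strtoul(optarg, NULL, 10);
	}
	return port;
}
```

Calling the helper without '--peer-port' yields 1234, so existing
invocations would keep their behaviour.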

Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20240123072750.4084181-1-avkrasnov@salutedevices.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-24 17:47:35 -08:00
Andrii Nakryiko
906ee42cb1 selftests/bpf: Incorporate LSM policy to token-based tests
Add tests for LSM interactions (both bpf_token_capable and bpf_token_cmd
LSM hooks) with BPF token in bpf() subsystem. Now child process passes
back token FD for parent to be able to do tests with token originating
in "wrong" userns. But we also create token in initns and check that
token LSMs don't accidentally reject BPF operations when capable()
checks pass without BPF token.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-31-andrii@kernel.org
2024-01-24 16:21:03 -08:00
Andrii Nakryiko
fadf54935e selftests/bpf: Add tests for LIBBPF_BPF_TOKEN_PATH envvar
Add new subtest validating LIBBPF_BPF_TOKEN_PATH envvar semantics.
Extend the existing test to validate that LIBBPF_BPF_TOKEN_PATH allows
disabling implicit BPF token creation by setting the envvar to an empty
string.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-30-andrii@kernel.org
2024-01-24 16:21:03 -08:00
Andrii Nakryiko
b73d08d131 selftests/bpf: Add tests for BPF object load with implicit token
Add a test to validate libbpf's implicit BPF token creation from default
BPF FS location (/sys/fs/bpf). Also validate that disabling this
implicit BPF token creation works.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-28-andrii@kernel.org
2024-01-24 16:21:03 -08:00
Andrii Nakryiko
d5baf0cac6 selftests/bpf: Add BPF object loading tests with explicit token passing
Add a few tests that attempt to load BPF object containing privileged
map, program, and the one requiring mandatory BTF uploading into the
kernel (to validate token FD propagation to BPF_BTF_LOAD command).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-27-andrii@kernel.org
2024-01-24 16:21:03 -08:00
Andrii Nakryiko
0350f9d99e selftests/bpf: Utilize string values for delegate_xxx mount options
Use both hex-based and string-based way to specify delegate mount
options for BPF FS.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-21-andrii@kernel.org
2024-01-24 16:21:02 -08:00
Andrii Nakryiko
fcb9597ff7 selftests/bpf: Add BPF token-enabled tests
Add a selftest that attempts to conceptually replicate intended BPF
token use cases inside user namespaced container.

Child process is forked. It is then put into its own userns and mountns.
Child creates BPF FS context object. This ensures child userns is
captured as the owning userns for this instance of BPF FS. Given that
setting delegation mount options is a privileged operation, we ensure that
the child cannot set them.

This context is passed back to privileged parent process through Unix
socket, where parent sets up delegation options, creates, and mounts it
as a detached mount. This mount FD is passed back to the child to be
used for BPF token creation, which allows otherwise privileged BPF
operations to succeed inside userns.

We validate that all of token-enabled privileged commands (BPF_BTF_LOAD,
BPF_MAP_CREATE, and BPF_PROG_LOAD) work as intended. They should only
succeed inside the userns if a) BPF token is provided with proper
allowed sets of commands and types; and b) namespaced CAP_BPF and other
privileges are set. Lacking a) or b) should lead to -EPERM failures.
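
The a)/b) rule above can be pictured as a simple predicate. This is only
an illustrative sketch of the userns case: struct token_sketch,
token_op_allowed() and the one-bit-per-command layout are assumptions made
for the example and do not mirror the kernel's actual data structures.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a BPF token: one bit per allowed command. */
struct token_sketch {
	uint64_t allowed_cmds;
};

/*
 * Illustrative only: return 0 when (a) a token is present and allows cmd,
 * and (b) the caller holds the namespaced capability; otherwise -EPERM.
 */
int token_op_allowed(const struct token_sketch *tok, unsigned int cmd,
		     bool has_ns_cap_bpf)
{
	if (!tok || cmd >= 64 || !(tok->allowed_cmds & (UINT64_C(1) << cmd)))
		return -EPERM;	/* condition a) is missing */
	if (!has_ns_cap_bpf)
		return -EPERM;	/* condition b) is missing */
	return 0;
}
```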

Based on suggested workflow by Christian Brauner ([0]).

  [0] https://lore.kernel.org/bpf/20230704-hochverdient-lehne-eeb9eeef785e@brauner/

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-17-andrii@kernel.org
2024-01-24 16:21:02 -08:00
Andrii Nakryiko
caf8f28e03 bpf: Add BPF token support to BPF_PROG_LOAD command
Add basic support of BPF token to BPF_PROG_LOAD. BPF_F_TOKEN_FD flag
should be set in prog_flags field when providing prog_token_fd.

Wire through a set of allowed BPF program types and attach types,
derived from BPF FS at BPF token creation time. Then make sure we
perform bpf_token_capable() checks everywhere where it's relevant.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-7-andrii@kernel.org
2024-01-24 16:21:01 -08:00
Andrii Nakryiko
a177fc2bf6 bpf: Add BPF token support to BPF_MAP_CREATE command
Allow providing token_fd for BPF_MAP_CREATE command to allow controlled
BPF map creation from unprivileged process through delegated BPF token.
New BPF_F_TOKEN_FD flag is added to specify together with BPF token FD
for BPF_MAP_CREATE command.

Wire through a set of allowed BPF map types to BPF token, derived from
BPF FS at BPF token creation time. This, in combination with allowed_cmds,
makes it possible to create a narrowly focused BPF token (controlled by a
privileged agent) with a restrictive set of BPF maps that an application
can attempt to create.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-5-andrii@kernel.org
2024-01-24 16:21:01 -08:00
Jakub Kicinski
0879020a78 selftests: netdevsim: fix the udp_tunnel_nic test
This test is missing a whole bunch of checks for interface
renaming and one ifup. Presumably it was only used on a system
with renaming disabled and NetworkManager running.

Fixes: 91f430b2c49d ("selftests: net: add a test for UDP tunnel info infra")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240123060529.1033912-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-24 15:11:10 -08:00
Jakub Kicinski
0719b5338a selftests: net: fix rps_default_mask with >32 CPUs
If there are more than 32 CPUs, the bitmask will start to contain
commas, leading to:

./rps_default_mask.sh: line 36: [: 00000000,00000000: integer expression expected

Remove the commas; bash doesn't interpret leading zeroes as octal,
so that should be good enough. Switch to bash, since Simon reports that
not all shells support this type of substitution.
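
The fix itself lives in a bash script. Purely to illustrate the idea of
ignoring the comma separators before testing the mask, here is a small C
sketch. rps_mask_is_zero() is a hypothetical helper, and it checks the
digits directly rather than converting the string to an integer as the
script does.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: true when every hex digit of the mask is '0'.
 * Comma separators (and a trailing newline) are skipped.
 */
bool rps_mask_is_zero(const char *mask)
{
	for (; *mask; mask++) {
		if (*mask == ',' || *mask == '\n')
			continue;
		if (*mask != '0')
			return false;
	}
	return true;
}
```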

Fixes: c12e0d5f267d ("self-tests: introduce self-tests for RPS default mask")
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240122195815.638997-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-24 13:55:19 -08:00
Martin KaFai Lau
ce6f6cffae selftests/bpf: Wait for the netstamp_needed_key static key to be turned on
After the previous patch that sped up the test (by avoiding neigh
discovery in IPv6), the BPF CI occasionally hits this error:

rcv tstamp unexpected pkt rcv tstamp: actual 0 == expected 0

The test complains that the cmsg returned from recvmsg() does not
have the rcv timestamp. Whether skb->tstamp is set is
controlled by the kernel static key "netstamp_needed_key". The static
key is enabled whenever there is at least one sk with SOCK_TIMESTAMP
set.

The test_redirect_dtime test does use setsockopt() to turn on
SOCK_TIMESTAMP for the reading sk. In the kernel,
net_enable_timestamp() enables the "netstamp_needed_key" with a delay
when CONFIG_JUMP_LABEL is set. This potential delay is the likely reason
for packets occasionally missing the rcv timestamp.

This patch creates UDP sockets with SOCK_TIMESTAMP set.
It sends and receives packets until a received packet
has a rcv timestamp. It currently retries at most 5 times with 1s
in between. This should be enough to wait for the "netstamp_needed_key".
It then holds on to the socket and only closes it at the end of the test.
This guarantees that the test has the "netstamp_needed_key" key turned
on from the beginning.
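
The bounded retry described above (at most 5 attempts with a pause in
between) could be sketched as follows. wait_until_ready() and
probe_after_n() are hypothetical names; the toy probe merely stands in
for "send a packet and check whether it carries a rcv timestamp".

```c
#include <stdbool.h>
#include <unistd.h>

#define MAX_TRIES 5

/*
 * Call probe() up to MAX_TRIES times, sleeping delay_sec between failed
 * attempts. Returns the 1-based attempt that succeeded, or -1 if none did.
 */
int wait_until_ready(bool (*probe)(void *), void *ctx, unsigned int delay_sec)
{
	for (int i = 1; i <= MAX_TRIES; i++) {
		if (probe(ctx))
			return i;
		if (i < MAX_TRIES)
			sleep(delay_sec);
	}
	return -1;
}

/* Toy probe: fails *ctx times, then succeeds. */
bool probe_after_n(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return false;
	}
	return true;
}
```

Keeping the attempt count bounded means a kernel that never enables the
key makes the wait fail after a few seconds instead of hanging.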

To simplify the udp sockets setup, they are sending/receiving packets
in the same netns (ns_dst is used) and communicate over the "lo" dev.
Hence, the patch enables the "lo" dev in the ns_dst.

Fixes: c803475fd8dd ("bpf: selftests: test skb->tstamp in redirect_neigh")
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240120060518.3604920-2-martin.lau@linux.dev
2024-01-24 10:03:46 -08:00
Martin KaFai Lau
177f1d083a selftests/bpf: Fix the flaky tc_redirect_dtime test
BPF CI has been reporting the tc_redirect_dtime test failing
from time to time:

test_inet_dtime:PASS:setns src 0 nsec
(network_helpers.c:253: errno: No route to host) Failed to connect to server
close_netns:PASS:setns 0 nsec
test_inet_dtime:FAIL:connect_to_fd unexpected connect_to_fd: actual -1 < expected 0
test_tcp_clear_dtime:PASS:tcp ip6 clear dtime ingress_fwdns_p100 0 nsec

The connect_to_fd failure (EHOSTUNREACH) is from the
test_tcp_clear_dtime() test and it is the very first IPv6 traffic
after setting up all the links, addresses, and routes.

The symptom is that this first connect() is always slow. In my setup, it
could take ~3s.

After some tracing and tcpdump, most of the time is spent in
the neighbor solicitation in the "ns_fwd" namespace, while
"ns_src" and "ns_dst" are fine.

I forced the kernel to drop the neighbor solicitation messages.
I can then reproduce EHOSTUNREACH. What actually happens could be:
- the neighbor advertisement came back a little slow.
- the "ns_fwd" namespace concluded a neighbor discovery failure
  and triggered the ndisc_error_report() => ip6_link_failure() =>
  icmpv6_send(skb, ICMPV6_DEST_UNREACH, ICMPV6_ADDR_UNREACH, 0)
- the client's connect() reports EHOSTUNREACH after receiving
  the ICMPV6_DEST_UNREACH message.

The neigh tables of both the "ns_src" and "ns_dst" namespaces have already
been manually populated, but not that of the "ns_fwd" namespace. This patch
fixes it by manually populating the neigh table in the "ns_fwd"
namespace as well.

Although the namespace configuration part existed before
the tc_redirect_dtime test, the patch is still Fixes-tagged to the commit
that added the tc_redirect_dtime test, since it is the only test
hitting the problem so far.

Fixes: c803475fd8dd ("bpf: selftests: test skb->tstamp in redirect_neigh")
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240120060518.3604920-1-martin.lau@linux.dev
2024-01-24 10:03:46 -08:00
Tiezhu Yang
0b50478fd8 selftests/bpf: Skip callback tests if jit is disabled in test_verifier
If CONFIG_BPF_JIT_ALWAYS_ON is not set and bpf_jit_enable is 0, there
exist 6 failed tests.

  [root@linux bpf]# echo 0 > /proc/sys/net/core/bpf_jit_enable
  [root@linux bpf]# echo 0 > /proc/sys/kernel/unprivileged_bpf_disabled
  [root@linux bpf]# ./test_verifier | grep FAIL
  #106/p inline simple bpf_loop call FAIL
  #107/p don't inline bpf_loop call, flags non-zero FAIL
  #108/p don't inline bpf_loop call, callback non-constant FAIL
  #109/p bpf_loop_inline and a dead func FAIL
  #110/p bpf_loop_inline stack locations for loop vars FAIL
  #111/p inline bpf_loop call in a big program FAIL
  Summary: 768 PASSED, 15 SKIPPED, 6 FAILED

The test log shows that callbacks are not allowed in non-JITed programs
because the interpreter doesn't support them yet; thus these tests should
be skipped if JIT is disabled.

Add an explicit flag F_NEEDS_JIT_ENABLED to those tests in
bpf_loop_inline.c to mark that they require JIT to be enabled, then check
the flag and jit_disabled at the beginning of do_test_single() to handle
this case.
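
The skip decision amounts to a flag test combined with the JIT state. A
minimal sketch, assuming a made-up bit value for F_NEEDS_JIT_ENABLED and
a hypothetical struct test_sketch in place of the real test descriptor:

```c
#include <stdbool.h>

#define F_NEEDS_JIT_ENABLED (1u << 2)	/* bit value assumed for this sketch */

/* Hypothetical, reduced test descriptor. */
struct test_sketch {
	const char *descr;
	unsigned int flags;
};

/* True when the test requires JIT but JIT is currently disabled. */
bool should_skip_for_jit(const struct test_sketch *t, bool jit_disabled)
{
	return (t->flags & F_NEEDS_JIT_ENABLED) && jit_disabled;
}
```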

With this patch:

  [root@linux bpf]# echo 0 > /proc/sys/net/core/bpf_jit_enable
  [root@linux bpf]# echo 0 > /proc/sys/kernel/unprivileged_bpf_disabled
  [root@linux bpf]# ./test_verifier | grep FAIL
  Summary: 768 PASSED, 21 SKIPPED, 0 FAILED

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240123090351.2207-3-yangtiezhu@loongson.cn
2024-01-23 20:27:41 -08:00
Tiezhu Yang
15b4f88dcc selftests/bpf: Move is_jit_enabled() into testing_helpers
Currently, is_jit_enabled() is only used in test_progs; move it into
testing_helpers so that it can be used in test_verifier. While at it,
remove the second argument "0" of open(), as Hou Tao suggested.
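
For illustration, a helper of this kind could read the first character of
the sysctl file and treat anything other than '0' as enabled. The sketch
below is an assumption about the general shape, not a copy of the
selftest helper; jit_enabled_from() takes the path as a parameter only so
that it can be exercised against an ordinary file.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper: report "enabled" when the first character of the
 * file at path is not '0'. An unreadable file counts as disabled.
 */
bool jit_enabled_from(const char *path)
{
	bool enabled = false;
	char c;
	int fd = open(path, O_RDONLY);	/* no mode argument without O_CREAT */

	if (fd < 0)
		return false;
	if (read(fd, &c, 1) == 1)
		enabled = (c != '0');
	close(fd);
	return enabled;
}
```

On a live system the path would be /proc/sys/net/core/bpf_jit_enable, the
same file written to in the test_verifier transcript.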

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Hou Tao <houtao1@huawei.com>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/20240123090351.2207-2-yangtiezhu@loongson.cn
2024-01-23 20:26:17 -08:00
Jakub Kicinski
04fe7c5029 selftests: fill in some missing configs for net
We are missing a lot of config options from net selftests,
it seems:

tun/tap:     CONFIG_TUN, CONFIG_MACVLAN, CONFIG_MACVTAP
fib_tests:   CONFIG_NET_SCH_FQ_CODEL
l2tp:        CONFIG_L2TP, CONFIG_L2TP_V3, CONFIG_L2TP_IP, CONFIG_L2TP_ETH
sctp-vrf:    CONFIG_INET_DIAG
txtimestamp: CONFIG_NET_CLS_U32
vxlan_mdb:   CONFIG_BRIDGE_VLAN_FILTERING
gre_gso:     CONFIG_NET_IPGRE_DEMUX, CONFIG_IP_GRE, CONFIG_IPV6_GRE
srv6_end_dt*_l3vpn:   CONFIG_IPV6_SEG6_LWTUNNEL
ip_local_port_range:  CONFIG_MPTCP
fib_test:    CONFIG_NET_CLS_BASIC
rtnetlink:   CONFIG_MACSEC, CONFIG_NET_SCH_HTB, CONFIG_XFRM_INTERFACE
             CONFIG_NET_IPGRE, CONFIG_BONDING
fib_nexthops: CONFIG_MPLS, CONFIG_MPLS_ROUTING
vxlan_mdb:   CONFIG_NET_ACT_GACT
tls:         CONFIG_TLS, CONFIG_CRYPTO_CHACHA20POLY1305
psample:     CONFIG_PSAMPLE
fcnal:       CONFIG_TCP_MD5SIG

Try to add them in a semi-alphabetical order.

Fixes: 62199e3f1658 ("selftests: net: Add VXLAN MDB test")
Fixes: c12e0d5f267d ("self-tests: introduce self-tests for RPS default mask")
Fixes: 122db5e3634b ("selftests/net: add MPTCP coverage for IP_LOCAL_PORT_RANGE")
Link: https://lore.kernel.org/r/20240122203528.672004-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-23 17:22:58 -08:00
Kui-Feng Lee
0253e0590e selftests/bpf: test case for register_bpf_struct_ops().
Create a new struct_ops type called bpf_testmod_ops within the bpf_testmod
module. When a struct_ops object is registered, the bpf_testmod module will
invoke test_2 from the module.

Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240119225005.668602-15-thinker.li@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-01-23 17:12:52 -08:00
Jiri Olsa
b789648668 selftests/bpf: Add fill_link_info test for perf event
Adding fill_link_info test for perf event and testing we
get its values back through the bpf_link_info interface.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240119110505.400573-7-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23 16:05:28 -08:00
Jiri Olsa
d741797084 selftests/bpf: Add cookies check for perf_event fill_link_info test
Now that we get cookies for perf_event probes, adding tests
for cookie for kprobe/uprobe/tracepoint.

The perf_event test needs to be added completely and is coming
in a following change.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240119110505.400573-6-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23 16:05:27 -08:00
Jiri Olsa
59a89706c4 selftests/bpf: Add cookies check for kprobe_multi fill_link_info test
Adding cookies check for kprobe_multi fill_link_info test,
plus tests for invalid values related to cookies.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240119110505.400573-5-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23 16:05:27 -08:00
Jose E. Marchesi
bbc094b305 bpf: Use r constraint instead of p constraint in selftests
Some of the BPF selftests use the "p" constraint in inline assembly
snippets, for input operands for MOV (rN = rM) instructions.

This is mainly done via the __imm_ptr macro defined in
tools/testing/selftests/bpf/progs/bpf_misc.h:

  #define __imm_ptr(name) [name]"p"(&name)

Example:

  int consume_first_item_only(void *ctx)
  {
        struct bpf_iter_num iter;
        asm volatile (
                /* create iterator */
                "r1 = %[iter];"
                [...]
                :
                : __imm_ptr(iter)
                : CLOBBERS);
        [...]
  }

The "p" constraint is a tricky one.  It is documented in the GCC manual
section "Simple Constraints":

  An operand that is a valid memory address is allowed.  This is for
  ``load address'' and ``push address'' instructions.

  p in the constraint must be accompanied by address_operand as the
  predicate in the match_operand.  This predicate interprets the mode
  specified in the match_operand as the mode of the memory reference for
  which the address would be valid.

There are two problems:

1. It is questionable whether that constraint was ever intended to be
   used in inline assembly templates, because its behavior really
   depends on compiler internals.  A "memory address" is not the same
   as a "memory operand" or a "memory reference" (constraint "m"), and
   in fact its usage in the template above results in an error in both
   x86_64-linux-gnu and bpf-unknown-none:

     foo.c: In function ‘bar’:
     foo.c:6:3: error: invalid 'asm': invalid expression as operand
        6 |   asm volatile ("r1 = %[jorl]" : : [jorl]"p"(&jorl));
          |   ^~~

   I would assume the same happens with aarch64, riscv, and most/all
   other targets in GCC, that do not accept operands of the form A + B
   that are not wrapped either in a const or in a memory reference.

   To avoid that error, the usage of the "p" constraint in internal GCC
   instruction templates is supposed to be complemented by the 'a'
   modifier, like in:

     asm volatile ("r1 = %a[jorl]" : : [jorl]"p"(&jorl));

   Internally documented (in GCC's final.cc) as:

     %aN means expect operand N to be a memory address
        (not a memory reference!) and print a reference
        to that address.

   That works because when the modifier 'a' is found, GCC prints an
   "operand address", which is not the same as an "operand".

   But...

2. Even if we used the internal 'a' modifier (we shouldn't) the 'rN =
   rM' instruction really requires a register argument.  In cases
   involving automatics, like in the examples above, we easily end up with:

     bar:
        #APP
            r1 = r10-4
        #NO_APP

   In other cases we could conceivably also end up with a 64-bit label that
   may overflow the 32-bit immediate operand of `rN = imm32'
   instructions:

        r1 = foo

   All of which is clearly wrong.

clang happens to do "the right thing" in the current usage of __imm_ptr
in the BPF tests, because even with -O2 it seems to "reload" the
fp-relative address of the automatic to a register like in:

  bar:
	r1 = r10
	r1 += -4
	#APP
	r1 = r1
	#NO_APP

Which is what GCC would generate with -O0.  Whether this is by chance
or by design, the compiler shouldn't be expected to do that reload
driven by the "p" constraint.

This patch changes the usage of the "p" constraint in the BPF
selftests macros to use the "r" constraint instead.  If a register is
what is required, we should let the compiler know.
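
A host-side sketch of the resulting macro shape, assuming a GCC- or
clang-compatible compiler. The asm template is deliberately empty so the
snippet builds on any target; read_through_asm() is a hypothetical
function that only shows the "r" constraint being applied to the address
of an automatic variable.

```c
/* Same shape as the selftests macro, but with "r" instead of "p". */
#define __imm_ptr(name) [name]"r"(&name)

int read_through_asm(void)
{
	int value = 42;

	/*
	 * Empty template: no instruction is emitted, but the "r" constraint
	 * still requires the compiler to materialize &value in a general
	 * register. The "memory" clobber tells the compiler that the asm
	 * may access memory.
	 */
	__asm__ volatile("" : : __imm_ptr(value) : "memory");
	return value;
}
```

In the BPF selftests the template would contain an instruction such as
"r1 = %[iter];", which is why a register operand is what the template
needs.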

Previous discussion in bpf@vger:
https://lore.kernel.org/bpf/87h6p5ebpb.fsf@oracle.com/T/#ef0df83d6975c34dff20bf0dd52e078f5b8ca2767

Tested in bpf-next master.
No regressions.

Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240123181309.19853-1-jose.marchesi@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-23 15:55:46 -08:00