IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Add --log-size to be able to customize log buffer sent to bpf() syscall
for BPF program verification logging.
Add --log-fixed to enforce BPF_LOG_FIXED behavior for BPF verifier log.
This is useful in unlikely event that beginning of truncated verifier
log is more important than the end of it (which with rotating verifier
log behavior is the default now).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230406234205.323208-6-andrii@kernel.org
This basically prevents any forward compatibility. And we either way
just return -EINVAL, which would otherwise be returned from bpf()
syscall anyways.
Similarly, drop enforcement of non-NULL log_buf when log_level > 0. This
won't be true anymore soon.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Lorenz Bauer <lmb@isovalent.com>
Link: https://lore.kernel.org/bpf/20230406234205.323208-5-andrii@kernel.org
Currently, if user-supplied log buffer to collect BPF verifier log turns
out to be too small to contain full log, bpf() syscall returns -ENOSPC,
fails BPF program verification/load, and preserves first N-1 bytes of
the verifier log (where N is the size of user-supplied buffer).
This is problematic in a bunch of common scenarios, especially when
working with real-world BPF programs that tend to be pretty complex as
far as verification goes and require big log buffers. Typically, it's
when debugging tricky cases at log level 2 (verbose). Also, when BPF program
is successfully validated, log level 2 is the only way to actually see
verifier state progression and all the important details.
Even with log level 1, it's possible to get -ENOSPC even if the final
verifier log fits in log buffer, if there is a code path that's deep
enough to fill up entire log, even if normally it would be reset later
on (there is a logic to chop off successfully validated portions of BPF
verifier log).
In short, it's not always possible to pre-size log buffer. Also, what's
worse, in practice, the end of the log most often is way more important
than the beginning, but verifier stops emitting log as soon as initial
log buffer is filled up.
This patch switches BPF verifier log behavior to effectively behave as
rotating log. That is, if user-supplied log buffer turns out to be too
short, verifier will keep overwriting previously written log,
effectively treating user's log buffer as a ring buffer. -ENOSPC is
still going to be returned at the end, to notify user that log contents
was truncated, but the important last N bytes of the log would be
returned, which might be all that user really needs. This consistent
-ENOSPC behavior, regardless of rotating or fixed log behavior, allows
to prevent backwards compatibility breakage. The only user-visible
change is which portion of verifier log user ends up seeing *if buffer
is too small*. Given contents of verifier log itself is not an ABI,
there is no breakage due to this behavior change. Specialized tools that
rely on specific contents of verifier log in -ENOSPC scenario are
expected to be easily adapted to accommodate old and new behaviors.
Importantly, though, to preserve good user experience and not require
every user-space application to adopt to this new behavior, before
exiting to user-space verifier will rotate log (in place) to make it
start at the very beginning of user buffer as a continuous
zero-terminated string. The contents will be a chopped off N-1 last
bytes of full verifier log, of course.
Given beginning of log is sometimes important as well, we add
BPF_LOG_FIXED (which equals 8) flag to force old behavior, which allows
tools like veristat to request first part of verifier log, if necessary.
BPF_LOG_FIXED flag is also a simple and straightforward way to check if
BPF verifier supports rotating behavior.
On the implementation side, conceptually, it's all simple. We maintain
64-bit logical start and end positions. If we need to truncate the log,
start position will be adjusted accordingly to lag end position by
N bytes. We then use those logical positions to calculate their matching
actual positions in user buffer and handle wrap around the end of the
buffer properly. Finally, right before returning from bpf_check(), we
rotate user log buffer contents in-place as necessary, to make log
contents contiguous. See comments in relevant functions for details.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Lorenz Bauer <lmb@isovalent.com>
Link: https://lore.kernel.org/bpf/20230406234205.323208-4-andrii@kernel.org
It's not clear why we have 128 as minimum size, but it makes testing
harder and seems unnecessary, as we carefully handle truncation
scenarios and use proper snprintf variants. So remove this limitation
and just enforce positive length for log buffer.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Lorenz Bauer <lmb@isovalent.com>
Link: https://lore.kernel.org/bpf/20230406234205.323208-3-andrii@kernel.org
kernel/bpf/verifier.c file is large and growing larger all the time. So
it's good to start splitting off more or less self-contained parts into
separate files to keep source code size (somewhat) somewhat under
control.
This patch is a one step in this direction, moving some of BPF verifier log
routines into a separate kernel/bpf/log.c. Right now it's most low-level
and isolated routines to append data to log, reset log to previous
position, etc. Eventually we could probably move verifier state
printing logic here as well, but this patch doesn't attempt to do that
yet.
Subsequent patches will add more logic to verifier log management, so
having basics in a separate file will make sure verifier.c doesn't grow
more with new changes.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Lorenz Bauer <lmb@isovalent.com>
Link: https://lore.kernel.org/bpf/20230406234205.323208-2-andrii@kernel.org
There is an extra whitespace in the SPDX tag, before the license name,
in the script for generating man pages for the bpf() syscall and the
helpers. It has caused problems in Debian packaging, in the tool that
autodetects licenses. Let's clean it up.
Fixes: 5cb62b7598f2 ("bpf, docs: Use SPDX license identifier in bpf_doc.py")
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230411144747.66734-1-quentin@isovalent.com
When trying to add a name to the hashmap, an error code of EEXIST is
returned and we continue as names are possibly duplicated in the sys
file.
If the last name in the file is a duplicate, we will continue to the
next iteration of the while loop, and exit the loop with a value of err
set to EEXIST and enter the error label with err set, which causes the
test to fail when it should not.
This change reset err to 0 before continue-ing into the next iteration,
this way, if there is no more data to read from the file we iterate
through, err will be set to 0.
Behaviour prior to this change:
```
test_kprobe_multi_bench_attach:FAIL:get_syms unexpected error: -17
(errno 2)
All error logs:
test_kprobe_multi_bench_attach:FAIL:get_syms unexpected error: -17
(errno 2)
Summary: 0/1 PASSED, 0 SKIPPED, 1 FAILED
```
After this change:
```
Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED
```
Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230408022919.54601-1-chantr4@gmail.com
The following example forces veristat to loop indefinitely:
$ cat two-ok
file_name,prog_name,verdict,total_states
file-a,a,success,12
file-b,b,success,67
$ cat add-failure
file_name,prog_name,verdict,total_states
file-a,a,success,12
file-b,b,success,67
file-b,c,failure,32
$ veristat -C two-ok add-failure
<does not return>
The loop is caused by handle_comparison_mode() not checking if `base`
variable points to `fallback_stats` prior advancing joined results
using `base`.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230407154125.896927-1-eddyz87@gmail.com
After commit d6e6286a12e7 ("libbpf: disassociate section handler on explicit
bpf_program__set_type() call"), bpf_program__set_type() will force cleanup
the program's SEC() definition, this commit fixed the test helper but missed
the bpftool, which leads to bpftool prog autoattach broken as follows:
$ bpftool prog load spi-xfer-r1v1.o /sys/fs/bpf/test autoattach
Program spi_xfer_r1v1 does not support autoattach, falling back to pinning
This patch fix bpftool to set program type only if it differs.
Fixes: d6e6286a12e7 ("libbpf: disassociate section handler on explicit bpf_program__set_type() call")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230407081427.2621590-1-weiyongjun@huaweicloud.com
perf_event with type=PERF_TYPE_RAW and config=0x1b00 turned out to be not
reliable in ensuring LBR is active. Thus, test_progs:get_branch_snapshot is
not reliable in some systems. Replace it with PERF_COUNT_HW_CPU_CYCLES
event, which gives more consistent results.
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230407190130.2093736-1-song@kernel.org
BPF helpers that take an ARG_PTR_TO_UNINIT_MEM must ensure that all of
the memory is set, including beyond the end of the string.
Signed-off-by: Barret Rhoden <brho@google.com>
Link: https://lore.kernel.org/r/20230407001808.1622968-1-brho@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add various tests for code pattern '<const> <cond_op> <non_const>' to
exercise the previous verifier patch.
The following are veristat changed number of processed insns stat
comparing the previous patch vs. this patch:
File Program Insns (A) Insns (B) Insns (DIFF)
----------------------------------------------------- ---------------------------------------------------- --------- --------- -------------
test_seg6_loop.bpf.linked3.o __add_egr_x 12423 12314 -109 (-0.88%)
Only one program is affected with minor change.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230406164510.1047757-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Currently, the verifier does not handle '<const> <cond_op> <non_const>' well.
For example,
...
10: (79) r1 = *(u64 *)(r10 -16) ; R1_w=scalar() R10=fp0
11: (b7) r2 = 0 ; R2_w=0
12: (2d) if r2 > r1 goto pc+2
13: (b7) r0 = 0
14: (95) exit
15: (65) if r1 s> 0x1 goto pc+3
16: (0f) r0 += r1
...
At insn 12, verifier decides both true and false branch are possible, but
actually only false branch is possible.
Currently, the verifier already supports patterns '<non_const> <cond_op> <const>.
Add support for patterns '<const> <cond_op> <non_const>' in a similar way.
Also fix selftest 'verifier_bounds_mix_sign_unsign/bounds checks mixing signed and unsigned, variant 10'
due to this change.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230406164505.1046801-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add various tests for code pattern '<non-const> NE/EQ <const>' implemented
in the previous verifier patch. Without the verifier patch, these new
tests will fail.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230406164500.1045715-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Currently, for BPF_JEQ/BPF_JNE insn, verifier determines
whether the branch is taken or not only if both operands
are constants. Therefore, for the following code snippet,
0: (85) call bpf_ktime_get_ns#5 ; R0_w=scalar()
1: (a5) if r0 < 0x3 goto pc+2 ; R0_w=scalar(umin=3)
2: (b7) r2 = 2 ; R2_w=2
3: (1d) if r0 == r2 goto pc+2 6
At insn 3, since r0 is not a constant, verifier assumes both branch
can be taken which may lead inproper verification failure.
Add comparing umin/umax value and the constant. If the umin value
is greater than the constant, or umax value is smaller than the constant,
for JEQ the branch must be not-taken, and for JNE the branch must be taken.
The jmp32 mode JEQ/JNE branch taken checking is also handled similarly.
The following lists the veristat result w.r.t. changed number
of processes insns during verification:
File Program Insns (A) Insns (B) Insns (DIFF)
----------------------------------------------------- ---------------------------------------------------- --------- --------- ---------------
test_cls_redirect.bpf.linked3.o cls_redirect 64980 73472 +8492 (+13.07%)
test_seg6_loop.bpf.linked3.o __add_egr_x 12425 12423 -2 (-0.02%)
test_tcp_hdr_options.bpf.linked3.o estab 2634 2558 -76 (-2.89%)
test_parse_tcp_hdr_opt.bpf.linked3.o xdp_ingress_v6 1421 1420 -1 (-0.07%)
test_parse_tcp_hdr_opt_dynptr.bpf.linked3.o xdp_ingress_v6 1238 1237 -1 (-0.08%)
test_tc_dtime.bpf.linked3.o egress_fwdns_prio100 414 411 -3 (-0.72%)
Mostly a small improvement but test_cls_redirect.bpf.linked3.o has a 13% regression.
I checked with verifier log and found it this is due to pruning.
For some JEQ/JNE branches impacted by this patch,
one branch is explored and the other has state equivalence and
pruned.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230406164455.1045294-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Kal Conley says:
====================
This patchset includes the test with the bugfix as requested here:
https://lore.kernel.org/all/f1a32d5a-03e7-fce1-f5a5-6095f365f0a9@linux.dev/
Patch #1 (the bugfix) is identical to the previous submission except
that I improved the commit message slightly.
Magnus: I improved the test code a little different than you asked
since I thought this was a little simpler than having a separate
function for now. Hopefully, you can live with this :-).
====================
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Add unaligned descriptor test for frame size of 4001. Using an odd frame
size ensures that the end of the UMEM is not near a page boundary. This
allows testing descriptors that staddle the end of the UMEM but not a
page.
This test used to fail without the previous commit ("xsk: Fix unaligned
descriptor validation").
Signed-off-by: Kal Conley <kal.conley@dectris.com>
Link: https://lore.kernel.org/r/20230405235920.7305-3-kal.conley@dectris.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Make sure unaligned descriptors that straddle the end of the UMEM are
considered invalid. Currently, descriptor validation is broken for
zero-copy mode which only checks descriptors at page granularity.
For example, descriptors in zero-copy mode that overrun the end of the
UMEM but not a page boundary are (incorrectly) considered valid. The
UMEM boundary check needs to happen before the page boundary and
contiguity checks in xp_desc_crosses_non_contig_pg(). Do this check in
xp_unaligned_validate_desc() instead like xp_check_unaligned() already
does.
Fixes: 2b43470add8c ("xsk: Introduce AF_XDP buffer allocation API")
Signed-off-by: Kal Conley <kal.conley@dectris.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20230405235920.7305-2-kal.conley@dectris.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Functions for searching module kallsyms should have non-empty
definitions only if CONFIG_MODULES=y and CONFIG_KALLSYMS=y. Until now,
only CONFIG_MODULES check was used for many of these, which may have
caused complilation errors on some configs.
This patch moves all relevant functions under the correct configs.
Fixes: bd5314f8dd2d ("kallsyms, bpf: Move find_kallsyms_symbol_value out of internal header")
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/oe-kbuild-all/202303181535.RFDCnz3E-lkp@intel.com/
Link: https://lore.kernel.org/r/20230330102001.2183693-1-vmalik@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Quentin Monnet says:
====================
This set contains some improvements for bpftool's "visual" program dump
option, which produces the control flow graph in a DOT format. The main
objective is to add support for inline annotations on such graphs, so that
we can have the C source code for the program showing up alongside the
instructions, when available. The last commits also make it possible to
display the line numbers or the bare opcodes in the graph, as supported by
regular program dumps.
v3:
- Fixed formatting of DOT graph: escape spaces, and remove indent that
would cause some unwanted spaces to show up in the resulting graph.
- Don't print line information if the record is empty.
- Add '<' and ' ' to the list of escaped characters for generting the
DOT graph.
- Truncate long file paths, use shorter field names ("line", "col") for
code location information in the graph, add missing separator space.
- Add a commit to return an error if JSON output and CFG are both
required.
- Add a drive-by, clean up commit for bash completion (avoid unnecessary
calls to _bpftool_once_attr()).
v2: Replace fputc(..., stdout) with putchar(...) in dotlabel_puts().
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
In bpftool's bash completion file, function _bpftool_once_attr() is able
to process multiple arguments. There are a few locations where this
function is called multiple times in a row, each time for a single
argument; let's pass all arguments instead to minimize the number of
function calls required for the completion.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20230405132120.59886-8-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add support for displaying opcodes or/and file references (filepath,
line and column numbers) when dumping the control flow graphs of loaded
BPF programs with bpftool.
The filepaths in the records are absolute. To avoid blocks on the graph
to get too wide, we truncate them when they get too long (but we always
keep the entire file name). In the unlikely case where the resulting
file name is ambiguous, it remains possible to get the full path with a
regular dump (no CFG).
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20230405132120.59886-7-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When dumping a program, the keywords "opcodes" (for printing the raw
opcodes), "linum" (for displaying the filename, line number, column
number along with the source code), and "visual" (for generating the
control flow graph for translated programs) are mutually exclusive. But
there's no reason why they should be. Let's make it possible to pass
several of them at once. The "file FILE" option, which makes bpftool
output a binary image to a file, remains incompatible with the others.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20230405132120.59886-6-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
We do not support JSON output for control flow graphs of programs with
bpftool. So far, requiring both the CFG and JSON output would result in
producing a null JSON object. It makes more sense to raise an error
directly when parsing command line arguments and options, so that users
know they won't get any output they might expect.
If JSON is required for the graph, we leave it to Graphviz instead:
# bpftool prog dump xlated <REF> visual | dot -Tjson
Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20230405132120.59886-5-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
We support dumping the control flow graph of loaded programs to the DOT
format with bpftool, but so far this feature wouldn't display the source
code lines available through BTF along with the eBPF bytecode. Let's add
support for these annotations, to make it easier to read the graph.
In prog.c, we move the call to dump_xlated_cfg() in order to pass and
use the full struct dump_data, instead of creating a minimal one in
draw_bb_node().
We pass the pointer to this struct down to dump_xlated_for_graph() in
xlated_dumper.c, where most of the logics is added. We deal with BTF
mostly like we do for plain or JSON output, except that we cannot use a
"nr_skip" value to skip a given number of linfo records (we don't
process the BPF instructions linearly, and apart from the root of the
graph we don't know how many records we should skip, so we just store
the last linfo and make sure the new one we find is different before
printing it).
When printing the source instructions to the label of a DOT graph node,
there are a few subtleties to address. We want some special newline
markers, and there are some characters that we must escape. To deal with
them, we introduce a new dedicated function btf_dump_linfo_dotlabel() in
btf_dumper.c. We'll reuse this function in a later commit to format the
filepath, line, and column references as well.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20230405132120.59886-4-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When dumping the control flow graphs for programs using the 16-byte long
load instruction, we need to skip the second part of this instruction
when looking for the next instruction to process. Otherwise, we end up
printing "BUG_ld_00" from the kernel disassembler in the CFG.
Fixes: efcef17a6d65 ("tools: bpftool: generate .dot graph from CFG information")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20230405132120.59886-3-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The documentation states that when line_info is available when dumping a
program, the source line will be displayed "by default". There is no
notion of "default" here: the line is always displayed if available,
there is no way currently to turn it off.
In the next sentence, the documentation states that if "linum" is used
on the command line, the relevant filename, line, and column will be
displayed "on top of the source line". This is incorrect, as they are
currently displayed on the right side of the source line (or on top of
the eBPF instruction, not the source).
This commit fixes the documentation to address these points.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20230405132120.59886-2-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
In some cases the loopback latency might be large enough, causing
the assertion on invocations to be run before ingress prog getting
executed. The assertion would fail and the test would flake.
This can be reliably reproduced by arbitrarily increasing the
loopback latency (thanks to [1]):
tc qdisc add dev lo root handle 1: htb default 12
tc class add dev lo parent 1:1 classid 1:12 htb rate 20kbps ceil 20kbps
tc qdisc add dev lo parent 1:12 netem delay 100ms
Fix this by waiting on the receive end, instead of instantly
returning to the assert. The call to read() will wait for the
default SO_RCVTIMEO timeout of 3 seconds provided by
start_server().
[1] https://gist.github.com/kstevens715/4598301
Reported-by: Martin KaFai Lau <martin.lau@linux.dev>
Link: https://lore.kernel.org/bpf/9c5c8b7e-1d89-a3af-5400-14fde81f4429@linux.dev/
Fixes: 3573f384014f ("selftests/bpf: Test CGROUP_STORAGE behavior on shared egress + ingress")
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Link: https://lore.kernel.org/r/20230405193354.1956209-1-zhuyifei@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Fix flaky STATS_RX_DROPPED test. The receiver calls getsockopt after
receiving the last (valid) packet which is not the final packet sent in
the test (valid and invalid packets are sent in alternating fashion with
the final packet being invalid). Since the last packet may or may not
have been dropped already, both outcomes must be allowed.
This issue could also be fixed by making sure the last packet sent is
valid. This alternative is left as an exercise to the reader (or the
benevolent maintainers of this file).
This problem was quite visible on certain setups. On one machine this
failure was observed 50% of the time.
Also, remove a redundant assignment of pkt_stream->nb_pkts. This field
is already initialized by __pkt_stream_alloc.
Fixes: 27e934bec35b ("selftests: xsk: make stat tests not spin on getsockopt")
Signed-off-by: Kal Conley <kal.conley@dectris.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20230403120400.31018-1-kal.conley@dectris.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
This change fixes flakiness in the BIDIRECTIONAL test:
# [is_pkt_valid] expected length [60], got length [90]
not ok 1 FAIL: SKB BUSY-POLL BIDIRECTIONAL
When IPv6 is enabled, the interface will periodically send MLDv1 and
MLDv2 packets. These packets can cause the BIDIRECTIONAL test to fail
since it uses VETH0 for RX.
For other tests, this was not a problem since they only receive on VETH1
and IPv6 was already disabled on VETH0.
Fixes: a89052572ebb ("selftests/bpf: Xsk selftests framework")
Signed-off-by: Kal Conley <kal.conley@dectris.com>
Link: https://lore.kernel.org/r/20230405082905.6303-1-kal.conley@dectris.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Kal Conley says:
====================
This patchset fixes a minor bug in xskxceiver.c then adds a test case
for valid packets at the end of the UMEM.
====================
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Add test case to testapp_invalid_desc for valid packets at the end of
the UMEM.
Signed-off-by: Kal Conley <kal.conley@dectris.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20230403145047.33065-3-kal.conley@dectris.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Avoid UMEM_SIZE macro in testapp_invalid_desc which is incorrect when
the frame size is not XSK_UMEM__DEFAULT_FRAME_SIZE. Also remove the
macro since it's no longer being used.
Fixes: 909f0e28207c ("selftests: xsk: Add tests for 2K frame size")
Signed-off-by: Kal Conley <kal.conley@dectris.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20230403145047.33065-2-kal.conley@dectris.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Alexei Starovoitov says:
====================
From: Alexei Starovoitov <ast@kernel.org>
The patch set is addressing a fallout from
commit 6fcd486b3a0a ("bpf: Refactor RCU enforcement in the verifier.")
It was too aggressive with PTR_UNTRUSTED marks.
Patches 1-6 are cleanup and adding verifier smartness to address real
use cases in bpf programs that broke with too aggressive PTR_UNTRUSTED.
The partial revert is done in patch 7 anyway.
====================
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
The commit 6fcd486b3a0a ("bpf: Refactor RCU enforcement in the verifier.")
broke several tracing bpf programs. Even in clang compiled kernels there are
many fields that are not marked with __rcu that are safe to read and pass into
helpers, but the verifier doesn't know that they're safe. Aggressively marking
them as PTR_UNTRUSTED was premature.
Fixes: 6fcd486b3a0a ("bpf: Refactor RCU enforcement in the verifier.")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20230404045029.82870-8-alexei.starovoitov@gmail.com
check_reg_type() unconditionally disallows PTR_TO_BTF_ID | PTR_MAYBE_NULL.
It's problematic for helpers that allow ARG_PTR_TO_BTF_ID_OR_NULL like
bpf_sk_storage_get(). Allow passing PTR_TO_BTF_ID | PTR_MAYBE_NULL into such
helpers. That technically includes bpf_kptr_xchg() helper, but in practice:
bpf_kptr_xchg(..., bpf_cpumask_create());
is still disallowed because bpf_cpumask_create() returns ref counted pointer
with ref_obj_id > 0.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20230404045029.82870-6-alexei.starovoitov@gmail.com
bpf_[sk|inode|task|cgrp]_storage_[get|delete]() and bpf_get_socket_cookie() helpers
perform run-time check that sk|inode|task|cgrp pointer != NULL.
Teach verifier about this fact and allow bpf programs to pass
PTR_TO_BTF_ID | PTR_MAYBE_NULL into such helpers.
It will be used in the subsequent patch that will do
bpf_sk_storage_get(.., skb->sk, ...);
Even when 'skb' pointer is trusted the 'sk' pointer may be NULL.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20230404045029.82870-5-alexei.starovoitov@gmail.com
btf_nested_type_is_trusted() tries to find a struct member at corresponding offset.
It works for flat structures and falls apart in more complex structs with nested structs.
The offset->member search is already performed by btf_struct_walk() including nested structs.
Reuse this work and pass {field name, field btf id} into btf_nested_type_is_trusted()
instead of offset to make BTF_TYPE_SAFE*() logic more robust.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20230404045029.82870-4-alexei.starovoitov@gmail.com
Remove duplicated if (atype == BPF_READ) btf_struct_access() from
btf_struct_access() callback and invoke it only for writes. This is
possible to do because currently btf_struct_access() custom callback
always delegates to generic btf_struct_access() helper for BPF_READ
accesses.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20230404045029.82870-2-alexei.starovoitov@gmail.com
bpf_testmod.ko sometimes fails to build from a clean checkout:
BTF [M] linux/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.ko
/bin/sh: 1: linux-build//tools/build/resolve_btfids/resolve_btfids: not found
The reason is that RESOLVE_BTFIDS may not yet be built. Fix by adding a
dependency.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230403172935.1553022-1-iii@linux.ibm.com
bpf_obj_drop_impl has a void return type. In check_kfunc_call, the "else
if" which sets insn_aux->kptr_struct_meta for bpf_obj_drop_impl is
surrounded by a larger if statement which checks btf_type_is_ptr. As a
result:
* The bpf_obj_drop_impl-specific code will never execute
* The btf_struct_meta input to bpf_obj_drop is always NULL
* __bpf_obj_drop_impl will always see a NULL btf_record when called
from BPF program, and won't call bpf_obj_free_fields
* program-allocated kptrs which have fields that should be cleaned up
by bpf_obj_free_fields may instead leak resources
This patch adds a btf_type_is_void branch to the larger if and moves
special handling for bpf_obj_drop_impl there, fixing the issue.
Fixes: ac9f06050a35 ("bpf: Introduce bpf_obj_drop")
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230403200027.2271029-1-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add docs on extended 64-bit immediate instructions, including six instructions
previously undocumented. Include a brief description of maps and variables,
as used by those instructions.
V1 -> V2: rebased on top of latest master
V2 -> V3: addressed comments from Alexei
V3 -> V4: addressed comments from David Vernet
Signed-off-by: Dave Thaler <dthaler@microsoft.com>
Link: https://lore.kernel.org/r/20230326054946.2331-1-dthaler1968@googlemail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
If the value size in a bloom filter is a multiple of 4, then the jhash2()
function is used to compute hashes. The length parameter of this function
equals to the number of 32-bit words in input. Compute it in the hot path
instead of pre-computing it, as this is translated to one extra shift to
divide the length by four vs. one extra memory load of a pre-computed length.
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Link: https://lore.kernel.org/r/20230402114340.3441-1-aspsk@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>