IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
The newly added helper start_test() can be used in test_dctcp_fallback()
too, to replace start_server_str() and connect_to_fd_opts(). In that
way, two network_helper_opts srv_opts and cli_opts are used instead of
the previously shared opts.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/792ca3bb013fa06e618176da02d75e4f79a76733.1717054461.git.tanggeliang@kylinos.cn
This patch uses connect_to_fd_opts() instead of using connect_fd_to_fd()
and settcpca() in do_test() in prog_tests/bpf_tcp_ca.c to accept a struct
network_helper_opts argument.
Then define a dctcp dedicated post_socket_cb callback stg_post_socket_cb(),
invoking both settcpca() and bpf_map_update_elem() in it, and set it in
test_dctcp(). For passing map_fd into stg_post_socket_cb() callback, a new
member map_fd is added in struct cb_opts.
Add another "const struct network_helper_opts *cli_opts" to do_test() to
separate it from the server "opts".
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/876ec90430865bc468e3b7f6fb2648420b075548.1717054461.git.tanggeliang@kylinos.cn
Similarly to `bpf_program`, support `bpf_map` automatic attachment in
`bpf_object__attach_skeleton`. Currently only struct_ops maps could be
attached.
On bpftool side, code-generate links in skeleton struct for struct_ops maps.
Similarly to `bpf_program_skeleton`, set links in `bpf_map_skeleton`.
On libbpf side, extend `bpf_map` with new `autoattach` field to support
enabling or disabling autoattach functionality, introducing
getter/setter for this field.
`bpf_object__(attach|detach)_skeleton` is extended with
attaching/detaching struct_ops maps logic.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240605175135.117127-1-yatsenko@meta.com
Now that all libbpf/bpftool code switched to btf_field_iter, remove
btf_type_visit_type_ids() and btf_type_visit_str_offs() callback-based
helpers as not needed anymore.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240605001629.4061937-6-andrii@kernel.org
Switch bpftool's code which is using libbpf-internal
btf_type_visit_type_ids() helper to new btf_field_iter functionality.
This makes bpftool code simpler, but also unblocks removing libbpf's
btf_type_visit_type_ids() helper completely.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Quentin Monnet <qmo@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240605001629.4061937-5-andrii@kernel.org
Use new BTF field iterator logic to replace all the callback-based
visitor calls. There is still a .BTF.ext callback-based visitor APIs
that should be converted, which will happens as a follow up.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240605001629.4061937-4-andrii@kernel.org
Switch all BPF linker code dealing with iterating BTF type ID and string
offset fields to new btf_field_iter facilities.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240605001629.4061937-3-andrii@kernel.org
Implement iterator-based type ID and string offset BTF field iterator.
This is used extensively in BTF-handling code and BPF linker code for
various sanity checks, rewriting IDs/offsets, etc. Currently this is
implemented as visitor pattern calling custom callbacks, which makes the
logic (especially in simple cases) unnecessarily obscure and harder to
follow.
Having equivalent functionality using iterator pattern makes for simpler
to understand and maintain code. As we add more code for BTF processing
logic in libbpf, it's best to switch to iterator pattern before adding
more callback-based code.
The idea for iterator-based implementation is to record offsets of
necessary fields within fixed btf_type parts (which should be iterated
just once), and, for kinds that have multiple members (based on vlen
field), record where in each member necessary fields are located.
Generic iteration code then just keeps track of last offset that was
returned and handles N members correctly. Return type is just u32
pointer, where NULL is returned when all relevant fields were already
iterated.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240605001629.4061937-2-andrii@kernel.org
I hit the following failure when running selftests with
internal backported upstream kernel:
test_ksyms:PASS:kallsyms_fopen 0 nsec
test_ksyms:FAIL:ksym_find symbol 'bpf_link_fops' not found
#123 ksyms:FAIL
In /proc/kallsyms, we have
$ cat /proc/kallsyms | grep bpf_link_fops
ffffffff829f0cb0 d bpf_link_fops.llvm.12608678492448798416
The CONFIG_LTO_CLANG_THIN is enabled in the kernel which is responsible
for bpf_link_fops.llvm.12608678492448798416 symbol name.
In prog_tests/ksyms.c we have
kallsyms_find("bpf_link_fops", &link_fops_addr)
and kallsyms_find() compares "bpf_link_fops" with symbols
in /proc/kallsyms in order to find the entry. With
bpf_link_fops.llvm.<hash> in /proc/kallsyms, the kallsyms_find()
failed.
To fix the issue, in kallsyms_find(), if a symbol has suffix
.llvm.<hash>, that suffix will be ignored for comparison.
This fixed the test failure.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240604180034.1356016-1-yonghong.song@linux.dev
bpf_cookie and find_vma are flaky in nested VMs, which is used by some CI
systems. It turns out these failures are caused by unreliable perf event
in nested VM. Fix these by:
1. Use PERF_COUNT_SW_CPU_CLOCK in find_vma;
2. Increase sample_freq in bpf_cookie.
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240604070700.3032142-1-song@kernel.org
Kui-Feng Lee says:
====================
Enable BPF programs to declare arrays of kptr, bpf_rb_root, and bpf_list_head.
Some types, such as type kptr, bpf_rb_root, and bpf_list_head, are
treated in a special way. Previously, these types could not be the
type of a field in a struct type that is used as the type of a global
variable. They could not be the type of a field in a struct type that
is used as the type of a field in the value type of a map either. They
could not even be the type of array elements. This means that they can
only be the type of global variables or of direct fields in the value
type of a map.
The patch set aims to enable the use of these specific types in arrays
and struct fields, providing flexibility. It examines the types of
global variables or the value types of maps, such as arrays and struct
types, recursively to identify these special types and generate field
information for them.
For example,
...
struct task_struct __kptr *ptr[3];
...
it will create 3 instances of "struct btf_field" in the "btf_record" of
the data section.
[...,
btf_field(offset=0x100, type=BPF_KPTR_REF),
btf_field(offset=0x108, type=BPF_KPTR_REF),
btf_field(offset=0x110, type=BPF_KPTR_REF),
...
]
It creates a record of each of three elements. These three records are
almost identical except their offsets.
Another example is
...
struct A {
...
struct task_struct __kptr *task;
struct bpf_rb_root root;
...
}
struct A foo[2];
it will create 4 records.
[...,
btf_field(offset=0x7100, type=BPF_KPTR_REF),
btf_field(offset=0x7108, type=BPF_RB_ROOT:),
btf_field(offset=0x7200, type=BPF_KPTR_REF),
btf_field(offset=0x7208, type=BPF_RB_ROOT:),
...
]
Assuming that the size of an element/struct A is 0x100 and "foo"
starts at 0x7000, it includes two kptr records at 0x7100 and 0x7200,
and two rbtree root records at 0x7108 and 0x7208.
All these field information will be flatten, for struct types, and
repeated, for arrays.
---
Changes from v6:
- Return BPF_KPTR_REF from btf_get_field_type() only if var_type is a
struct type.
- Pass btf and type to btf_get_field_type().
Changes from v5:
- Ensure field->offset values of kptrs are advanced correctly from
one nested struct/or array to another.
Changes from v4:
- Return -E2BIG for i == MAX_RESOLVE_DEPTH.
Changes from v3:
- Refactor the common code of btf_find_struct_field() and
btf_find_datasec_var().
- Limit the number of levels looking into a struct types.
Changes from v2:
- Support fields in nested struct type.
- Remove nelems and duplicate field information with offset
adjustments for arrays.
Changes from v1:
- Move the check of element alignment out of btf_field_cmp() to
btf_record_find().
- Change the order of the previous patch 4 "bpf:
check_map_kptr_access() compute the offset from the reg state" as
the patch 7 now.
- Reject BPF_RB_NODE and BPF_LIST_NODE with nelems > 1.
- Rephrase the commit log of the patch "bpf: check_map_access() with
the knowledge of arrays" to clarify the alignment on elements.
v6: https://lore.kernel.org/all/20240520204018.884515-1-thinker.li@gmail.com/
v5: https://lore.kernel.org/all/20240510011312.1488046-1-thinker.li@gmail.com/
v4: https://lore.kernel.org/all/20240508063218.2806447-1-thinker.li@gmail.com/
v3: https://lore.kernel.org/all/20240501204729.484085-1-thinker.li@gmail.com/
v2: https://lore.kernel.org/all/20240412210814.603377-1-thinker.li@gmail.com/
v1: https://lore.kernel.org/bpf/20240410004150.2917641-1-thinker.li@gmail.com/
Kui-Feng Lee (9):
bpf: Remove unnecessary checks on the offset of btf_field.
bpf: Remove unnecessary call to btf_field_type_size().
bpf: refactor btf_find_struct_field() and btf_find_datasec_var().
bpf: create repeated fields for arrays.
bpf: look into the types of the fields of a struct type recursively.
bpf: limit the number of levels of a nested struct type.
selftests/bpf: Test kptr arrays and kptrs in nested struct fields.
selftests/bpf: Test global bpf_rb_root arrays and fields in nested
struct types.
selftests/bpf: Test global bpf_list_head arrays.
kernel/bpf/btf.c | 310 ++++++++++++------
kernel/bpf/verifier.c | 4 +-
.../selftests/bpf/prog_tests/cpumask.c | 5 +
.../selftests/bpf/prog_tests/linked_list.c | 12 +
.../testing/selftests/bpf/prog_tests/rbtree.c | 47 +++
.../selftests/bpf/progs/cpumask_success.c | 171 ++++++++++
.../testing/selftests/bpf/progs/linked_list.c | 42 +++
tools/testing/selftests/bpf/progs/rbtree.c | 77 +++++
8 files changed, 558 insertions(+), 110 deletions(-)
====================
Link: https://lore.kernel.org/r/20240523174202.461236-1-thinker.li@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Make sure global arrays of bpf_list_heads and fields of bpf_list_heads in
nested struct types work correctly.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-10-thinker.li@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Make sure global arrays of bpf_rb_root and fields of bpf_rb_root in nested
struct types work correctly.
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-9-thinker.li@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Make sure that BPF programs can declare global kptr arrays and kptr fields
in struct types that is the type of a global variable or the type of a
nested descendant field in a global variable.
An array with only one element is special case, that it treats the element
like a non-array kptr field. Nested arrays are also tested to ensure they
are handled properly.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-8-thinker.li@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Limit the number of levels looking into struct types to avoid running out
of stack space.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-7-thinker.li@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The verifier has field information for specific special types, such as
kptr, rbtree root, and list head. These types are handled
differently. However, we did not previously examine the types of fields of
a struct type variable. Field information records were not generated for
the kptrs, rbtree roots, and linked_list heads that are not located at the
outermost struct type of a variable.
For example,
struct A {
struct task_struct __kptr * task;
};
struct B {
struct A mem_a;
}
struct B var_b;
It did not examine "struct A" so as not to generate field information for
the kptr in "struct A" for "var_b".
This patch enables BPF programs to define fields of these special types in
a struct type other than the direct type of a variable or in a struct type
that is the type of a field in the value type of a map.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-6-thinker.li@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The verifier uses field information for certain special types, such as
kptr, rbtree root, and list head. These types are treated
differently. However, we did not previously support these types in
arrays. This update examines arrays and duplicates field information the
same number of times as the length of the array if the element type is one
of the special types.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-5-thinker.li@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Move common code of the two functions to btf_find_field_one().
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-4-thinker.li@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
field->size has been initialized by bpf_parse_fields() with the value
returned by btf_field_type_size(). Use it instead of calling
btf_field_type_size() again.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-3-thinker.li@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
reg_find_field_offset() always return a btf_field with a matching offset
value. Checking the offset of the returned btf_field is unnecessary.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-2-thinker.li@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
bpf_map_lookup_elem is invoked in bpf_prog3() already, no need to invoke
it again. This patch drops it.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/ea8458462b876ee445173e3effb535fd126137ed.1716446893.git.tanggeliang@kylinos.cn
The value of recv in msg_loop may be negative, like EWOULDBLOCK, so it's
necessary to check if it is positive before accumulating it to bytes_recvd.
Fixes: 16962b2404ac ("bpf: sockmap, add selftests")
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/5172563f7c7b2a2e953cef02e89fc34664a7b190.1716446893.git.tanggeliang@kylinos.cn
The array size of map_fd[] is 9, not 8. This patch changes it as a more
general form: ARRAY_SIZE(map_fd).
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/0972529ee01ebf8a8fd2b310bdec90831c94be77.1716446893.git.tanggeliang@kylinos.cn
The program fds can be got by using bpf_program__fd(progs[]), then
prog_fd becomes useless. This patch drops it.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/9a6335e4d8dbab23c0d8906074457ceddd61e74b.1716446893.git.tanggeliang@kylinos.cn
bpf_program__attach_sockmap() needs to take a parameter of type bpf_program
instead of an fd, so tx_prog_fd becomes useless. This patch uses a pointer
tx_prog to point to an item in progs[] array.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/23b37f932c547dd1ebfe154bbc0b0e957be21ee6.1716446893.git.tanggeliang@kylinos.cn
Switch attachments to bpf_link using bpf_program__attach_sockmap() instead
of bpf_prog_attach().
This patch adds a new array progs[] to replace prog_fd[] array, set in
populate_progs() for each program in bpf object.
And another new array links[] to save the attached bpf_link. It is
initalized as NULL in populate_progs, set as the return valuses of
bpf_program__attach_sockmap(), and detached by bpf_link__detach().
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/32cf8376a810e2e9c719f8e4cfb97132ed2d1f9c.1716446893.git.tanggeliang@kylinos.cn
There's already a definition of i in run_options() at the beginning, no
need to define a new one in "if (tx_prog_fd > 0)" block.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/8d690682330a59361562bca75d6903253d16f312.1716446893.git.tanggeliang@kylinos.cn
The values of tx_prog_fd in run_options() should not be 0, so set it as -1
in else branch, and test it using "if (tx_prog_fd > 0)" condition, not
"if (tx_prog_fd)" or "if (tx_prog_fd >= 0)".
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/08b20ffc544324d40939efeae93800772a91a58e.1716446893.git.tanggeliang@kylinos.cn
make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_bpf.o
Add the missing invocation of the MODULE_DESCRIPTION() macro.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240531-md-lib-test_bpf-v1-1-868e4bd2f9ed@quicinc.com
Correct typo in bpftool profiler and change all instances of 'MATRICS' to
'METRICS' in the profiler.bpf.c file.
Signed-off-by: Swan Beaujard <beaujardswan@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/bpf/20240602225812.81171-1-beaujardswan@gmail.com
'libcap' is unused since commit b1c2768a82b9 ("bpf: selftests: Remove libcap
usage from test_verifier"). Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240602234112.225107-4-linux@treblig.org
'key_t' is unused in a couple of files since the original commit 60dd49ea6539
("selftests/bpf: Add test for bpf array map iterators"). Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240602234112.225107-3-linux@treblig.org
'scale_test_def' is unused since commit 3762a39ce85f ("selftests/bpf: Split out
bpf_verif_scale selftests into multiple tests"). Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240602234112.225107-2-linux@treblig.org
Zba extension is very useful for generating addresses that index into array
of basic data types. This patch introduces sh2add and sh3add helpers for
RV32 and RV64 respectively, to accelerate addressing for array of unsigned
long data.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20240524075543.4050464-3-xiao.w.wang@intel.com
Make sure to preserve and/or enforce FD_CLOEXEC flag on duped FDs.
Use dup3() with O_CLOEXEC flag for that.
Without this fix libbpf effectively clears FD_CLOEXEC flag on each of BPF
map/prog FD, which is definitely not the right or expected behavior.
Reported-by: Lennart Poettering <lennart@poettering.net>
Fixes: bc308d011ab8 ("libbpf: call dup2() syscall directly")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20240529223239.504241-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Kui-Feng Lee says:
====================
The subsystems managing struct_ops objects may need to detach a
struct_ops object due to errors or other reasons. It would be useful
to notify user space programs so that error recovery or logging can be
carried out.
This patch set enables the detach feature for struct_ops links and
send an event to epoll when a link is detached. Subsystems could call
link->ops->detach() to detach a link and notify user space programs
through epoll.
The signatures of callback functions in "struct bpf_struct_ops" have
been changed as well to pass an extra link argument to
subsystems. Subsystems could detach the links received from reg() and
update() callbacks if there is. This also provides a way that
subsystems can distinguish registrations for an object that has been
registered multiple times for several links.
However, bpf struct_ops maps without BPF_F_LINK have no any link.
Subsystems will receive NULL link pointer for this case.
---
Changes from v6:
- Fix the missing header at patch 5.
- Move RCU_INIT_POINTER() back to its original position.
Changes from v5:
- Change the commit title of the patch for bpftool.
Changes from v4:
- Change error code for bpf_struct_ops_map_link_update()
- Always return 0 for bpf_struct_ops_map_link_detach()
- Hold update_mutex in bpf_struct_ops_link_create()
- Add a separated instance of file_operations for links supporting
poll.
- Fix bpftool for bpf_link_fops_poll.
Changes from v3:
- Add a comment to explain why holding update_mutex is not necessary
in bpf_struct_ops_link_create()
- Use rcu_access_pointer() in bpf_struct_ops_map_link_poll().
Changes from v2:
- Rephrased commit logs and comments.
- Addressed some mistakes from patch splitting.
- Replace mutex with spinlock in bpf_testmod.c to address lockdep
Splat and simplify the implementation.
- Fix an argument passing to rcu_dereference_protected().
Changes from v1:
- Pass a link to reg, unreg, and update callbacks.
- Provide a function to detach a link from underlying subsystems.
- Add a kfunc to mimic detachments from subsystems, and provide a
flexible way to control when to do detachments.
- Add two tests to detach a link from the subsystem after the refcount
of the link drops to zero.
v6: https://lore.kernel.org/bpf/20240524223036.318800-1-thinker.li@gmail.com/
v5: https://lore.kernel.org/all/20240523230848.2022072-1-thinker.li@gmail.com/
v4: https://lore.kernel.org/all/20240521225121.770930-1-thinker.li@gmail.com/
v3: https://lore.kernel.org/all/20240510002942.1253354-1-thinker.li@gmail.com/
v2: https://lore.kernel.org/all/20240507055600.2382627-1-thinker.li@gmail.com/
v1: https://lore.kernel.org/all/20240429213609.487820-1-thinker.li@gmail.com/
====================
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
To support epoll, a new instance of file_operations, bpf_link_fops_poll,
has been added for links that support epoll. The pid_iter.bpf.c checks
f_ops for links and other BPF objects. The check should fail for struct_ops
links without this patch.
Acked-by: Quentin Monnet <qmo@kernel.org>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240530065946.979330-9-thinker.li@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Verify whether a user space program is informed through epoll with EPOLLHUP
when a struct_ops object is detached.
The BPF code in selftests/bpf/progs/struct_ops_module.c has become
complex. Therefore, struct_ops_detach.c has been added to segregate the BPF
code for detachment tests from the BPF code for other tests based on the
recommendation of Andrii Nakryiko.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240530065946.979330-6-thinker.li@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Add epoll support to bpf struct_ops links to trigger EPOLLHUP event upon
detachment.
This patch implements the "poll" of the "struct file_operations" for BPF
links and introduces a new "poll" operator in the "struct bpf_link_ops". By
implementing "poll" of "struct bpf_link_ops" for the links of struct_ops,
the file descriptor of a struct_ops link can be added to an epoll file
descriptor to receive EPOLLHUP events.
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240530065946.979330-4-thinker.li@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Implement the detach callback in bpf_link_ops for struct_ops so that user
programs can detach a struct_ops link. The subsystems that struct_ops
objects are registered to can also use this callback to detach the links
being passed to them.
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240530065946.979330-3-thinker.li@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Pass an additional pointer of bpf_struct_ops_link to callback function reg,
unreg, and update provided by subsystems defined in bpf_struct_ops. A
bpf_struct_ops_map can be registered for multiple links. Passing a pointer
of bpf_struct_ops_link helps subsystems to distinguish them.
This pointer will be used in the later patches to let the subsystem
initiate a detachment on a link that was registered to it previously.
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240530065946.979330-2-thinker.li@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Yafang Shao says:
====================
bpf: Add a generic bits iterator
Three new kfuncs, namely bpf_iter_bits_{new,next,destroy}, have been
added for the new bpf_iter_bits functionality. These kfuncs enable the
iteration of the bits from a given address and a given number of bits.
- bpf_iter_bits_new
Initialize a new bits iterator for a given memory area. Due to the
limitation of bpf memalloc, the max number of bits to be iterated
over is (4096 * 8).
- bpf_iter_bits_next
Get the next bit in a bpf_iter_bits
- bpf_iter_bits_destroy
Destroy a bpf_iter_bits
The bits iterator can be used in any context and on any address.
Changes:
- v7->v8:
Refine the interface to avoid dealing with endianness (Andrii)
- v6->v7:
Fix endianness error for non-long-aligned data (Andrii)
- v5->v6:
Add positive tests (Andrii)
- v4->v5:
Simplify test cases (Andrii)
- v3->v4:
- Fix endianness error on s390x (Andrii)
- zero-initialize kit->bits_copy and zero out nr_bits (Andrii)
- v2->v3:
Optimization for u64/u32 mask (Andrii)
- v1->v2:
Simplify the CPU number verification code to avoid the failure on s390x
(Eduard)
- bpf: Add bpf_iter_cpumask
https://lwn.net/Articles/961104/
- bpf: Add new bpf helper bpf_for_each_cpu
https://lwn.net/Articles/939939/
====================
Link: https://lore.kernel.org/r/20240517023034.48138-1-laoar.shao@gmail.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Add test cases for the bits iter:
- Positive cases
- Bit mask representing a single word (8-byte unit)
- Bit mask representing data spanning more than one word
- The index of the set bit
- Nagative cases
- bpf_iter_bits_destroy() is required after calling
bpf_iter_bits_new()
- bpf_iter_bits_destroy() can only destroy an initialized iter
- bpf_iter_bits_next() must use an initialized iter
- Bit mask representing zero words
- Bit mask representing fewer words than expected
- Case for ENOMEM
- Case for NULL pointer
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240517023034.48138-3-laoar.shao@gmail.com
Add three new kfuncs for the bits iterator:
- bpf_iter_bits_new
Initialize a new bits iterator for a given memory area. Due to the
limitation of bpf memalloc, the max number of words (8-byte units) that
can be iterated over is limited to (4096 / 8).
- bpf_iter_bits_next
Get the next bit in a bpf_iter_bits
- bpf_iter_bits_destroy
Destroy a bpf_iter_bits
The bits iterator facilitates the iteration of the bits of a memory area,
such as cpumask. It can be used in any context and on any address.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240517023034.48138-2-laoar.shao@gmail.com