linux

iv/linux

Author	SHA1	Message	Date
Geliang Tang	cd984b2ed6	selftests/bpf: Use start_test in test_dctcp in bpf_tcp_ca The "if (sk_stg_map)" block in do_test() is only used by test_dctcp(), it makes sense to move it from do_test() into test_dctcp(). Then do_test() can be used by other tests except test_dctcp(). Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/9938916627b9810c877e5c03a621bc0ba5acf5c5.1717054461.git.tanggeliang@kylinos.cn	2024-06-06 23:04:05 +02:00
Geliang Tang	224eeb5598	selftests/bpf: Use start_test in test_dctcp_fallback in bpf_tcp_ca The newly added helper start_test() can be used in test_dctcp_fallback() too, to replace start_server_str() and connect_to_fd_opts(). In that way, two network_helper_opts srv_opts and cli_opts are used instead of the previously shared opts. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/792ca3bb013fa06e618176da02d75e4f79a76733.1717054461.git.tanggeliang@kylinos.cn	2024-06-06 23:04:05 +02:00
Geliang Tang	fee97d0c9a	selftests/bpf: Add start_test helper in bpf_tcp_ca For moving the "if (sk_stg_map)" block out of do_test(), extract the code before this block as a new function start_test(). It creates server-side and client-side sockets and returns them to the caller. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/48f2921ff9be958f5d3d28fe6bb7269a61cafa9f.1717054461.git.tanggeliang@kylinos.cn	2024-06-06 23:04:05 +02:00
Geliang Tang	9abdfd8a21	selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca This patch uses connect_to_fd_opts() instead of using connect_fd_to_fd() and settcpca() in do_test() in prog_tests/bpf_tcp_ca.c to accept a struct network_helper_opts argument. Then define a dctcp dedicated post_socket_cb callback stg_post_socket_cb(), invoking both settcpca() and bpf_map_update_elem() in it, and set it in test_dctcp(). For passing map_fd into stg_post_socket_cb() callback, a new member map_fd is added in struct cb_opts. Add another "const struct network_helper_opts *cli_opts" to do_test() to separate it from the server "opts". Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/876ec90430865bc468e3b7f6fb2648420b075548.1717054461.git.tanggeliang@kylinos.cn	2024-06-06 23:04:05 +02:00
Mykyta Yatsenko	08ac454e25	libbpf: Auto-attach struct_ops BPF maps in BPF skeleton Similarly to `bpf_program`, support `bpf_map` automatic attachment in `bpf_object__attach_skeleton`. Currently only struct_ops maps could be attached. On bpftool side, code-generate links in skeleton struct for struct_ops maps. Similarly to `bpf_program_skeleton`, set links in `bpf_map_skeleton`. On libbpf side, extend `bpf_map` with new `autoattach` field to support enabling or disabling autoattach functionality, introducing getter/setter for this field. `bpf_object__(attach\|detach)_skeleton` is extended with attaching/detaching struct_ops maps logic. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240605175135.117127-1-yatsenko@meta.com	2024-06-06 10:06:05 -07:00
Alan Maguire	b24862bac7	selftests/bpf: Add btf_field_iter selftests The added selftests verify that for every BTF kind we iterate correctly over constituent strings and ids. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240605153314.3727466-1-alan.maguire@oracle.com	2024-06-06 15:56:30 +02:00
Yonghong Song	7015843afc	selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT Alexei reported that send_signal test may fail with nested CONFIG_PARAVIRT configs. In this particular case, the base VM is AMD with 166 cpus, and I run selftests with regular qemu on top of that and indeed send_signal test failed. I also tried with an Intel box with 80 cpus and there is no issue. The main qemu command line includes: -enable-kvm -smp 16 -cpu host The failure log looks like: $ ./test_progs -t send_signal [ 48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225] [ 48.503622] Modules linked in: bpf_testmod(O) [ 48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G O 6.9.0-08561-g2c1713a8f1c9-dirty #69 [ 48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 [ 48.511635] RIP: 0010:handle_softirqs+0x71/0x290 [ 48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb [ 48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246 [ 48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0 [ 48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000 [ 48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000 [ 48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280 [ 48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 48.528525] FS: 00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000 [ 48.531600] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0 [ 48.537538] Call Trace: [ 48.537538] <IRQ> [ 48.537538] ? watchdog_timer_fn+0x1cd/0x250 [ 48.539590] ? lockup_detector_update_enable+0x50/0x50 [ 48.539590] ? __hrtimer_run_queues+0xff/0x280 [ 48.542520] ? hrtimer_interrupt+0x103/0x230 [ 48.544524] ? 
__sysvec_apic_timer_interrupt+0x4f/0x140 [ 48.545522] ? sysvec_apic_timer_interrupt+0x3a/0x90 [ 48.547612] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.547612] ? handle_softirqs+0x71/0x290 [ 48.547612] irq_exit_rcu+0x63/0x80 [ 48.551585] sysvec_apic_timer_interrupt+0x75/0x90 [ 48.552521] </IRQ> [ 48.553529] <TASK> [ 48.553529] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260 [ 48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74 [ 48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282 [ 48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620 [ 48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00 [ 48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000 [ 48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000 [ 48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00 [ 48.575614] ? finish_task_switch.isra.0+0x8d/0x260 [ 48.576523] __schedule+0x364/0xac0 [ 48.577535] schedule+0x2e/0x110 [ 48.578555] pipe_read+0x301/0x400 [ 48.579589] ? destroy_sched_domains_rcu+0x30/0x30 [ 48.579589] vfs_read+0x2b3/0x2f0 [ 48.579589] ksys_read+0x8b/0xc0 [ 48.583590] do_syscall_64+0x3d/0xc0 [ 48.583590] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 48.586525] RIP: 0033:0x7f2f28703fa1 [ 48.587592] Code: [...] 
00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 [ 48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1 [ 48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006 [ 48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000 [ 48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0 [ 48.605527] </TASK> In the test, two processes are communicating through pipe. Further debugging with strace found that the above splat is triggered as read() syscall could not receive the data even if the corresponding write() syscall in another process successfully wrote data into the pipe. The failed subtest is "send_signal_perf". The corresponding perf event has sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every overflow event will trigger a call to the BPF program. So I suspect this may overwhelm the system. So I increased the sample_period to 100,000 and the test passed. The sample_period 10,000 still has the test failed. In other parts of selftest, e.g., [1], sample_freq is used instead. So I decided to use sample_freq = 1,000 since the test can pass as well. [1] https://lore.kernel.org/bpf/20240604070700.3032142-1-song@kernel.org/ Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240605201203.2603846-1-yonghong.song@linux.dev	2024-06-06 15:49:13 +02:00
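The switch from a fixed sample period to a sampling frequency described in the send_signal fix above could look roughly like the following sketch. The helper name `init_cpu_clock_attr` is hypothetical and is not taken from the selftest; only the `perf_event_attr` fields and constants come from the perf UAPI header.

```c
#include <string.h>
#include <linux/perf_event.h>

/*
 * Hypothetical helper (not from the selftest): describe a software
 * CPU-clock event that is sampled at a target frequency instead of
 * on every overflow.
 */
static void init_cpu_clock_attr(struct perf_event_attr *attr,
				unsigned long long freq_hz)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_CPU_CLOCK;
	attr->freq = 1;              /* read the period/freq union as sample_freq */
	attr->sample_freq = freq_hz; /* e.g. 1000, as chosen in the commit */
}
```

With `freq` set, the kernel adjusts the effective period to approach the requested rate, so the BPF program is no longer invoked on every overflow event as it was with `sample_period = 1`.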
Andrii Nakryiko	0720887044	libbpf: Remove callback-based type/string BTF field visitor helpers Now that all libbpf/bpftool code switched to btf_field_iter, remove btf_type_visit_type_ids() and btf_type_visit_str_offs() callback-based helpers as not needed anymore. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240605001629.4061937-6-andrii@kernel.org	2024-06-05 16:54:45 +02:00
Andrii Nakryiko	e1a8630291	bpftool: Use BTF field iterator in btfgen Switch bpftool's code which is using libbpf-internal btf_type_visit_type_ids() helper to new btf_field_iter functionality. This makes bpftool code simpler, but also unblocks removing libbpf's btf_type_visit_type_ids() helper completely. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Alan Maguire <alan.maguire@oracle.com> Reviewed-by: Quentin Monnet <qmo@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240605001629.4061937-5-andrii@kernel.org	2024-06-05 16:54:41 +02:00
Andrii Nakryiko	c264112369	libbpf: Make use of BTF field iterator in BTF handling code Use new BTF field iterator logic to replace all the callback-based visitor calls. There are still .BTF.ext callback-based visitor APIs that should be converted, which will happen as a follow-up. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240605001629.4061937-4-andrii@kernel.org	2024-06-05 16:54:37 +02:00
Andrii Nakryiko	2bce2c1cb2	libbpf: Make use of BTF field iterator in BPF linker code Switch all BPF linker code dealing with iterating BTF type ID and string offset fields to new btf_field_iter facilities. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240605001629.4061937-3-andrii@kernel.org	2024-06-05 16:54:32 +02:00
Andrii Nakryiko	68153bb2ff	libbpf: Add BTF field iterator Implement iterator-based type ID and string offset BTF field iterator. This is used extensively in BTF-handling code and BPF linker code for various sanity checks, rewriting IDs/offsets, etc. Currently this is implemented as visitor pattern calling custom callbacks, which makes the logic (especially in simple cases) unnecessarily obscure and harder to follow. Having equivalent functionality using iterator pattern makes for simpler to understand and maintain code. As we add more code for BTF processing logic in libbpf, it's best to switch to iterator pattern before adding more callback-based code. The idea for iterator-based implementation is to record offsets of necessary fields within fixed btf_type parts (which should be iterated just once), and, for kinds that have multiple members (based on vlen field), record where in each member necessary fields are located. Generic iteration code then just keeps track of last offset that was returned and handles N members correctly. Return type is just u32 pointer, where NULL is returned when all relevant fields were already iterated. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240605001629.4061937-2-andrii@kernel.org	2024-06-05 16:54:26 +02:00
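The iterator idea in the commit above (record where the relevant field sits in the fixed part and in each member, then hand out one u32 pointer at a time until NULL) can be sketched on a toy layout. Everything named `toy_*` below is hypothetical and deliberately simpler than the real `struct btf_type` and libbpf's btf_field_iter; it only illustrates the pattern.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_MEMBERS 4

/* Toy stand-ins for a type record; NOT the real BTF layout. */
struct toy_member {
	uint32_t name_off;
	uint32_t type_id;
};

struct toy_type {
	uint32_t name_off;
	uint32_t vlen;		/* number of valid entries in members[] */
	uint32_t type_id;
	struct toy_member members[TOY_MAX_MEMBERS];
};

enum toy_iter_kind { TOY_ITER_IDS, TOY_ITER_STRS };

struct toy_field_iter {
	uint8_t *fixed;		/* start of the fixed part */
	uint8_t *members;	/* start of the first member */
	size_t fixed_off;	/* field offset inside the fixed part */
	size_t member_off;	/* field offset inside each member */
	size_t member_size;
	uint32_t vlen;
	uint32_t next;		/* 0 = fixed field, 1..vlen = members */
};

static void toy_field_iter_init(struct toy_field_iter *it,
				struct toy_type *t, enum toy_iter_kind kind)
{
	it->fixed = (uint8_t *)t;
	it->members = (uint8_t *)t->members;
	it->member_size = sizeof(struct toy_member);
	it->vlen = t->vlen;
	it->next = 0;
	if (kind == TOY_ITER_IDS) {
		it->fixed_off = offsetof(struct toy_type, type_id);
		it->member_off = offsetof(struct toy_member, type_id);
	} else {
		it->fixed_off = offsetof(struct toy_type, name_off);
		it->member_off = offsetof(struct toy_member, name_off);
	}
}

/* Return a pointer to the next field, or NULL once all were visited. */
static uint32_t *toy_field_iter_next(struct toy_field_iter *it)
{
	uint32_t idx = it->next;

	if (idx > it->vlen)
		return NULL;
	it->next++;
	if (idx == 0)
		return (uint32_t *)(it->fixed + it->fixed_off);
	return (uint32_t *)(it->members + (idx - 1) * it->member_size +
			    it->member_off);
}
```

A caller would loop with `while ((p = toy_field_iter_next(&it)))` and read or rewrite `*p` in place, which is the kind of straight-line code the commit message contrasts with callback-based visitors.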
Yonghong Song	898ac74c5b	selftests/bpf: Ignore .llvm.<hash> suffix in kallsyms_find() I hit the following failure when running selftests with internal backported upstream kernel: test_ksyms:PASS:kallsyms_fopen 0 nsec test_ksyms:FAIL:ksym_find symbol 'bpf_link_fops' not found #123 ksyms:FAIL In /proc/kallsyms, we have $ cat /proc/kallsyms \| grep bpf_link_fops ffffffff829f0cb0 d bpf_link_fops.llvm.12608678492448798416 The CONFIG_LTO_CLANG_THIN is enabled in the kernel which is responsible for bpf_link_fops.llvm.12608678492448798416 symbol name. In prog_tests/ksyms.c we have kallsyms_find("bpf_link_fops", &link_fops_addr) and kallsyms_find() compares "bpf_link_fops" with symbols in /proc/kallsyms in order to find the entry. With bpf_link_fops.llvm.<hash> in /proc/kallsyms, the kallsyms_find() failed. To fix the issue, in kallsyms_find(), if a symbol has suffix .llvm.<hash>, that suffix will be ignored for comparison. This fixed the test failure. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240604180034.1356016-1-yonghong.song@linux.dev	2024-06-04 12:49:44 -07:00
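The comparison rule described above, ignoring a trailing `.llvm.<hash>` suffix when matching a name from /proc/kallsyms, can be sketched as a small standalone function. `ksym_name_matches` is a hypothetical name; the actual change inside kallsyms_find() may be structured differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: compare the wanted symbol against a name read
 * from /proc/kallsyms, ignoring everything from ".llvm." onward.
 */
static bool ksym_name_matches(const char *wanted, const char *kallsyms_name)
{
	const char *suffix = strstr(kallsyms_name, ".llvm.");
	size_t len = suffix ? (size_t)(suffix - kallsyms_name)
			    : strlen(kallsyms_name);

	return strlen(wanted) == len &&
	       strncmp(wanted, kallsyms_name, len) == 0;
}
```

Comparing lengths as well as bytes keeps a prefix such as `bpf_link` from matching `bpf_link_fops.llvm.<hash>`.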
Song Liu	61ce0ea759	selftests/bpf: Fix bpf_cookie and find_vma in nested VM bpf_cookie and find_vma are flaky in nested VMs, which are used by some CI systems. It turns out these failures are caused by unreliable perf events in nested VMs. Fix these by: 1. Use PERF_COUNT_SW_CPU_CLOCK in find_vma; 2. Increase sample_freq in bpf_cookie. Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240604070700.3032142-1-song@kernel.org	2024-06-04 11:17:54 -07:00
Alexei Starovoitov	49df0019f3	Merge branch 'enable-bpf-programs-to-declare-arrays-of-kptr-bpf_rb_root-and-bpf_list_head' Kui-Feng Lee says: ==================== Enable BPF programs to declare arrays of kptr, bpf_rb_root, and bpf_list_head. Some types, such as type kptr, bpf_rb_root, and bpf_list_head, are treated in a special way. Previously, these types could not be the type of a field in a struct type that is used as the type of a global variable. They could not be the type of a field in a struct type that is used as the type of a field in the value type of a map either. They could not even be the type of array elements. This means that they can only be the type of global variables or of direct fields in the value type of a map. The patch set aims to enable the use of these specific types in arrays and struct fields, providing flexibility. It examines the types of global variables or the value types of maps, such as arrays and struct types, recursively to identify these special types and generate field information for them. For example, ... struct task_struct __kptr ptr[3]; ... it will create 3 instances of "struct btf_field" in the "btf_record" of the data section. [..., btf_field(offset=0x100, type=BPF_KPTR_REF), btf_field(offset=0x108, type=BPF_KPTR_REF), btf_field(offset=0x110, type=BPF_KPTR_REF), ... ] It creates a record of each of three elements. These three records are almost identical except their offsets. Another example is ... struct A { ... struct task_struct __kptr task; struct bpf_rb_root root; ... } struct A foo[2]; it will create 4 records. [..., btf_field(offset=0x7100, type=BPF_KPTR_REF), btf_field(offset=0x7108, type=BPF_RB_ROOT:), btf_field(offset=0x7200, type=BPF_KPTR_REF), btf_field(offset=0x7208, type=BPF_RB_ROOT:), ... ] Assuming that the size of an element/struct A is 0x100 and "foo" starts at 0x7000, it includes two kptr records at 0x7100 and 0x7200, and two rbtree root records at 0x7108 and 0x7208. 
All these field information will be flatten, for struct types, and repeated, for arrays. --- Changes from v6: - Return BPF_KPTR_REF from btf_get_field_type() only if var_type is a struct type. - Pass btf and type to btf_get_field_type(). Changes from v5: - Ensure field->offset values of kptrs are advanced correctly from one nested struct/or array to another. Changes from v4: - Return -E2BIG for i == MAX_RESOLVE_DEPTH. Changes from v3: - Refactor the common code of btf_find_struct_field() and btf_find_datasec_var(). - Limit the number of levels looking into a struct types. Changes from v2: - Support fields in nested struct type. - Remove nelems and duplicate field information with offset adjustments for arrays. Changes from v1: - Move the check of element alignment out of btf_field_cmp() to btf_record_find(). - Change the order of the previous patch 4 "bpf: check_map_kptr_access() compute the offset from the reg state" as the patch 7 now. - Reject BPF_RB_NODE and BPF_LIST_NODE with nelems > 1. - Rephrase the commit log of the patch "bpf: check_map_access() with the knowledge of arrays" to clarify the alignment on elements. v6: https://lore.kernel.org/all/20240520204018.884515-1-thinker.li@gmail.com/ v5: https://lore.kernel.org/all/20240510011312.1488046-1-thinker.li@gmail.com/ v4: https://lore.kernel.org/all/20240508063218.2806447-1-thinker.li@gmail.com/ v3: https://lore.kernel.org/all/20240501204729.484085-1-thinker.li@gmail.com/ v2: https://lore.kernel.org/all/20240412210814.603377-1-thinker.li@gmail.com/ v1: https://lore.kernel.org/bpf/20240410004150.2917641-1-thinker.li@gmail.com/ Kui-Feng Lee (9): bpf: Remove unnecessary checks on the offset of btf_field. bpf: Remove unnecessary call to btf_field_type_size(). bpf: refactor btf_find_struct_field() and btf_find_datasec_var(). bpf: create repeated fields for arrays. bpf: look into the types of the fields of a struct type recursively. bpf: limit the number of levels of a nested struct type. 
selftests/bpf: Test kptr arrays and kptrs in nested struct fields. selftests/bpf: Test global bpf_rb_root arrays and fields in nested struct types. selftests/bpf: Test global bpf_list_head arrays. kernel/bpf/btf.c \| 310 ++++++++++++------ kernel/bpf/verifier.c \| 4 +- .../selftests/bpf/prog_tests/cpumask.c \| 5 + .../selftests/bpf/prog_tests/linked_list.c \| 12 + .../testing/selftests/bpf/prog_tests/rbtree.c \| 47 +++ .../selftests/bpf/progs/cpumask_success.c \| 171 ++++++++++ .../testing/selftests/bpf/progs/linked_list.c \| 42 +++ tools/testing/selftests/bpf/progs/rbtree.c \| 77 +++++ 8 files changed, 558 insertions(+), 110 deletions(-) ==================== Link: https://lore.kernel.org/r/20240523174202.461236-1-thinker.li@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-06-03 20:52:43 -07:00
Kui-Feng Lee	43d50ffb1f	selftests/bpf: Test global bpf_list_head arrays. Make sure global arrays of bpf_list_heads and fields of bpf_list_heads in nested struct types work correctly. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240523174202.461236-10-thinker.li@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-06-03 20:52:43 -07:00
Kui-Feng Lee	d55c765a9b	selftests/bpf: Test global bpf_rb_root arrays and fields in nested struct types. Make sure global arrays of bpf_rb_root and fields of bpf_rb_root in nested struct types work correctly. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240523174202.461236-9-thinker.li@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-06-03 20:52:42 -07:00
Kui-Feng Lee	c4c6c3b785	selftests/bpf: Test kptr arrays and kptrs in nested struct fields. Make sure that BPF programs can declare global kptr arrays and kptr fields in struct types that are the type of a global variable or the type of a nested descendant field in a global variable. An array with only one element is a special case: its element is treated like a non-array kptr field. Nested arrays are also tested to ensure they are handled properly. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240523174202.461236-8-thinker.li@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-06-03 20:52:42 -07:00
Kui-Feng Lee	f19caf57d8	bpf: limit the number of levels of a nested struct type. Limit the number of levels looking into struct types to avoid running out of stack space. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240523174202.461236-7-thinker.li@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-06-03 20:52:42 -07:00
Kui-Feng Lee	64e8ee8148	bpf: look into the types of the fields of a struct type recursively. The verifier has field information for specific special types, such as kptr, rbtree root, and list head. These types are handled differently. However, we did not previously examine the types of fields of a struct type variable. Field information records were not generated for the kptrs, rbtree roots, and linked_list heads that are not located at the outermost struct type of a variable. For example, struct A { struct task_struct __kptr * task; }; struct B { struct A mem_a; } struct B var_b; It did not examine "struct A" so as not to generate field information for the kptr in "struct A" for "var_b". This patch enables BPF programs to define fields of these special types in a struct type other than the direct type of a variable or in a struct type that is the type of a field in the value type of a map. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240523174202.461236-6-thinker.li@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-06-03 20:52:42 -07:00
Kui-Feng Lee	994796c025	bpf: create repeated fields for arrays. The verifier uses field information for certain special types, such as kptr, rbtree root, and list head. These types are treated differently. However, we did not previously support these types in arrays. This update examines arrays and duplicates field information the same number of times as the length of the array if the element type is one of the special types. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240523174202.461236-5-thinker.li@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-06-03 20:52:42 -07:00
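Duplicating field information once per array element, as the commit above describes, amounts to copying each record and shifting its offset by a multiple of the element size. The sketch below uses hypothetical `toy_*` types and is not the kernel's implementation in kernel/bpf/btf.c; the capacity check and return convention are assumptions made for the example.

```c
#include <stdint.h>

enum toy_field_type { TOY_KPTR_REF, TOY_RB_ROOT, TOY_LIST_HEAD };

struct toy_field {
	uint32_t offset;
	enum toy_field_type type;
};

/*
 * fields[0..cnt) describe the special fields of the first array element.
 * Append copies for elements 1..nelems-1, each shifted by elem_size.
 * Returns the total number of records, or -1 if they do not fit in cap.
 */
static int toy_repeat_fields(struct toy_field *fields, int cnt, int cap,
			     uint32_t nelems, uint32_t elem_size)
{
	uint32_t i;
	int j, total = cnt;

	if (nelems == 0 || (uint64_t)cnt * nelems > (uint64_t)cap)
		return -1;

	for (i = 1; i < nelems; i++) {
		for (j = 0; j < cnt; j++) {
			fields[total] = fields[j];
			fields[total].offset += i * elem_size;
			total++;
		}
	}
	return total;
}
```

For a three-element array of 8-byte kptrs whose first element sits at offset 0x100, this produces records at 0x100, 0x108 and 0x110, all with the same type.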
Kui-Feng Lee	a7db0d4f87	bpf: refactor btf_find_struct_field() and btf_find_datasec_var(). Move common code of the two functions to btf_find_field_one(). Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240523174202.461236-4-thinker.li@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-06-03 20:52:42 -07:00
Kui-Feng Lee	482f713379	bpf: Remove unnecessary call to btf_field_type_size(). field->size has been initialized by bpf_parse_fields() with the value returned by btf_field_type_size(). Use it instead of calling btf_field_type_size() again. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240523174202.461236-3-thinker.li@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-06-03 20:52:42 -07:00
Kui-Feng Lee	c95a3be45a	bpf: Remove unnecessary checks on the offset of btf_field. reg_find_field_offset() always returns a btf_field with a matching offset value. Checking the offset of the returned btf_field is unnecessary. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240523174202.461236-2-thinker.li@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-06-03 20:52:42 -07:00
Geliang Tang	49784c7979	selftests/bpf: Drop duplicate bpf_map_lookup_elem in test_sockmap bpf_map_lookup_elem is invoked in bpf_prog3() already, no need to invoke it again. This patch drops it. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/ea8458462b876ee445173e3effb535fd126137ed.1716446893.git.tanggeliang@kylinos.cn	2024-06-03 19:32:55 +02:00
Geliang Tang	de1b5ea789	selftests/bpf: Check length of recv in test_sockmap The value of recv in msg_loop may be negative, like EWOULDBLOCK, so it's necessary to check if it is positive before accumulating it to bytes_recvd. Fixes: 16962b2404ac ("bpf: sockmap, add selftests") Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/5172563f7c7b2a2e953cef02e89fc34664a7b190.1716446893.git.tanggeliang@kylinos.cn	2024-06-03 19:32:55 +02:00
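The guard described above can be reduced to a few lines: only a positive return value from recv() is a byte count, so only that case may be added to the running total. `account_recv` is a hypothetical helper written for illustration, not code from test_sockmap.

```c
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper: fold one recv() return value into a running
 * byte total. Error returns (-1) and end-of-stream (0) leave the
 * total unchanged.
 */
static size_t account_recv(size_t bytes_recvd, ssize_t recv_ret)
{
	if (recv_ret > 0)
		bytes_recvd += (size_t)recv_ret;
	return bytes_recvd;
}
```

Without the check, a -1 return (for example when errno is EWOULDBLOCK) would be added to the counter and shrink or wrap it.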
Geliang Tang	dcb681b659	selftests/bpf: Fix size of map_fd in test_sockmap The array size of map_fd[] is 9, not 8. This patch changes it as a more general form: ARRAY_SIZE(map_fd). Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/0972529ee01ebf8a8fd2b310bdec90831c94be77.1716446893.git.tanggeliang@kylinos.cn	2024-06-03 19:32:54 +02:00
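The reason for preferring ARRAY_SIZE() over a literal bound is that the loop limit then follows the array declaration. A minimal sketch, assuming a locally defined ARRAY_SIZE macro and a hypothetical `reset_map_fds` helper (the selftests take the macro from shared headers, and the real loop body differs):

```c
#include <stddef.h>

/* Local definition for the sketch only. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

static int map_fd[9];

/* Hypothetical helper: visit every slot, however many there are. */
static void reset_map_fds(void)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(map_fd); i++)
		map_fd[i] = -1;
}
```

A hard-coded bound of 8 would silently skip the ninth entry here.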
Geliang Tang	467a0c79b5	selftests/bpf: Drop prog_fd array in test_sockmap The program fds can be obtained with bpf_program__fd(progs[]), so prog_fd becomes useless. This patch drops it. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/9a6335e4d8dbab23c0d8906074457ceddd61e74b.1716446893.git.tanggeliang@kylinos.cn	2024-06-03 19:32:54 +02:00
Geliang Tang	24bb90a426	selftests/bpf: Replace tx_prog_fd with tx_prog in test_sockmap bpf_program__attach_sockmap() needs to take a parameter of type bpf_program instead of an fd, so tx_prog_fd becomes useless. This patch uses a pointer tx_prog to point to an item in progs[] array. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/23b37f932c547dd1ebfe154bbc0b0e957be21ee6.1716446893.git.tanggeliang@kylinos.cn	2024-06-03 19:32:54 +02:00
Geliang Tang	3f32a115f6	selftests/bpf: Use bpf_link attachments in test_sockmap Switch attachments to bpf_link using bpf_program__attach_sockmap() instead of bpf_prog_attach(). This patch adds a new array progs[] to replace prog_fd[] array, set in populate_progs() for each program in bpf object. And another new array links[] to save the attached bpf_link. It is initialized as NULL in populate_progs(), set to the return values of bpf_program__attach_sockmap(), and detached by bpf_link__detach(). Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/32cf8376a810e2e9c719f8e4cfb97132ed2d1f9c.1716446893.git.tanggeliang@kylinos.cn	2024-06-03 19:32:54 +02:00
Geliang Tang	a9f0ea1759	selftests/bpf: Drop duplicate definition of i in test_sockmap There's already a definition of i in run_options() at the beginning, no need to define a new one in "if (tx_prog_fd > 0)" block. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/8d690682330a59361562bca75d6903253d16f312.1716446893.git.tanggeliang@kylinos.cn	2024-06-03 19:32:54 +02:00
Geliang Tang	d95ba15b97	selftests/bpf: Fix tx_prog_fd values in test_sockmap The values of tx_prog_fd in run_options() should not be 0, so set it as -1 in else branch, and test it using "if (tx_prog_fd > 0)" condition, not "if (tx_prog_fd)" or "if (tx_prog_fd >= 0)". Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/08b20ffc544324d40939efeae93800772a91a58e.1716446893.git.tanggeliang@kylinos.cn	2024-06-03 19:32:54 +02:00
Jeff Johnson	ec1249d327	test_bpf: Add missing MODULE_DESCRIPTION() make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_bpf.o Add the missing invocation of the MODULE_DESCRIPTION() macro. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240531-md-lib-test_bpf-v1-1-868e4bd2f9ed@quicinc.com	2024-06-03 17:03:41 +02:00
Swan Beaujard	ce5249b91e	bpftool: Fix typo in MAX_NUM_METRICS macro name Correct typo in bpftool profiler and change all instances of 'MATRICS' to 'METRICS' in the profiler.bpf.c file. Signed-off-by: Swan Beaujard <beaujardswan@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/bpf/20240602225812.81171-1-beaujardswan@gmail.com	2024-06-03 16:58:27 +02:00
Dr. David Alan Gilbert	a450d36b05	selftests/bpf: Remove unused struct 'libcap' 'libcap' is unused since commit b1c2768a82b9 ("bpf: selftests: Remove libcap usage from test_verifier"). Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240602234112.225107-4-linux@treblig.org	2024-06-03 16:53:06 +02:00
Dr. David Alan Gilbert	3f67639d8e	selftests/bpf: Remove unused 'key_t' structs 'key_t' is unused in a couple of files since the original commit 60dd49ea6539 ("selftests/bpf: Add test for bpf array map iterators"). Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240602234112.225107-3-linux@treblig.org	2024-06-03 16:52:57 +02:00
Dr. David Alan Gilbert	dfa7c9ffa6	selftests/bpf: Remove unused struct 'scale_test_def' 'scale_test_def' is unused since commit 3762a39ce85f ("selftests/bpf: Split out bpf_verif_scale selftests into multiple tests"). Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240602234112.225107-2-linux@treblig.org	2024-06-03 16:52:42 +02:00
Xiao Wang	96a27ee76f	riscv, bpf: Introduce shift add helper with Zba optimization Zba extension is very useful for generating addresses that index into array of basic data types. This patch introduces sh2add and sh3add helpers for RV32 and RV64 respectively, to accelerate addressing for array of unsigned long data. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/bpf/20240524075543.4050464-3-xiao.w.wang@intel.com	2024-06-03 16:45:23 +02:00
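For readers unfamiliar with Zba, the arithmetic performed by the two instructions named above can be modelled in plain C. This is a model of the instruction semantics only (rd = rs2 + (rs1 << n)), not the JIT emit helpers added by the patch; the function names simply mirror the mnemonics.

```c
#include <stdint.h>

/* Model of Zba sh2add: rd = rs2 + (rs1 << 2), i.e. index 4-byte elements. */
static uint64_t sh2add(uint64_t rs1, uint64_t rs2)
{
	return rs2 + (rs1 << 2);
}

/* Model of Zba sh3add: rd = rs2 + (rs1 << 3), i.e. index 8-byte elements. */
static uint64_t sh3add(uint64_t rs1, uint64_t rs2)
{
	return rs2 + (rs1 << 3);
}
```

Since `unsigned long` is 4 bytes on RV32 and 8 bytes on RV64, the address of element `i` in such an array is `sh2add(i, base)` or `sh3add(i, base)` respectively, which replaces a separate shift and add.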
Andrii Nakryiko	531876c800	libbpf: keep FD_CLOEXEC flag when dup()'ing FD Make sure to preserve and/or enforce FD_CLOEXEC flag on duped FDs. Use dup3() with O_CLOEXEC flag for that. Without this fix libbpf effectively clears FD_CLOEXEC flag on each of BPF map/prog FD, which is definitely not the right or expected behavior. Reported-by: Lennart Poettering <lennart@poettering.net> Fixes: bc308d011ab8 ("libbpf: call dup2() syscall directly") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20240529223239.504241-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-31 20:35:55 -07:00
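The behavior the commit above describes can be observed from user space with a short sketch: `dup2()` leaves the new descriptor without FD_CLOEXEC, while `dup3()` with O_CLOEXEC sets it. The helper names `dup_onto_cloexec` and `fd_is_cloexec` are hypothetical and are not libbpf functions; this is an illustration under those assumptions, not the libbpf change itself.

```c
#define _GNU_SOURCE	/* dup3() is Linux-specific; glibc declares it under _GNU_SOURCE */
#include <fcntl.h>
#include <unistd.h>

/*
 * Duplicate oldfd onto newfd and set FD_CLOEXEC on newfd in one step.
 * Returns newfd on success, -1 on error (including oldfd == newfd,
 * which dup3() rejects with EINVAL).
 */
static int dup_onto_cloexec(int oldfd, int newfd)
{
	return dup3(oldfd, newfd, O_CLOEXEC);
}

/* Returns 1 if FD_CLOEXEC is set on fd, 0 if it is clear, -1 on error. */
static int fd_is_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

Calling `fd_is_cloexec()` on a descriptor after a plain `dup2()` onto it should report 0 even if the flag was set before, which is the clearing effect the commit message refers to.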
Martin KaFai Lau	3f8fde3195	Merge branch 'Notify user space when a struct_ops object is detached/unregistered' Kui-Feng Lee says: ==================== The subsystems managing struct_ops objects may need to detach a struct_ops object due to errors or other reasons. It would be useful to notify user space programs so that error recovery or logging can be carried out. This patch set enables the detach feature for struct_ops links and send an event to epoll when a link is detached. Subsystems could call link->ops->detach() to detach a link and notify user space programs through epoll. The signatures of callback functions in "struct bpf_struct_ops" have been changed as well to pass an extra link argument to subsystems. Subsystems could detach the links received from reg() and update() callbacks if there is. This also provides a way that subsystems can distinguish registrations for an object that has been registered multiple times for several links. However, bpf struct_ops maps without BPF_F_LINK have no any link. Subsystems will receive NULL link pointer for this case. --- Changes from v6: - Fix the missing header at patch 5. - Move RCU_INIT_POINTER() back to its original position. Changes from v5: - Change the commit title of the patch for bpftool. Changes from v4: - Change error code for bpf_struct_ops_map_link_update() - Always return 0 for bpf_struct_ops_map_link_detach() - Hold update_mutex in bpf_struct_ops_link_create() - Add a separated instance of file_operations for links supporting poll. - Fix bpftool for bpf_link_fops_poll. Changes from v3: - Add a comment to explain why holding update_mutex is not necessary in bpf_struct_ops_link_create() - Use rcu_access_pointer() in bpf_struct_ops_map_link_poll(). Changes from v2: - Rephrased commit logs and comments. - Addressed some mistakes from patch splitting. - Replace mutex with spinlock in bpf_testmod.c to address lockdep Splat and simplify the implementation. 
- Fix an argument passing to rcu_dereference_protected(). Changes from v1: - Pass a link to reg, unreg, and update callbacks. - Provide a function to detach a link from underlying subsystems. - Add a kfunc to mimic detachments from subsystems, and provide a flexible way to control when to do detachments. - Add two tests to detach a link from the subsystem after the refcount of the link drops to zero. v6: https://lore.kernel.org/bpf/20240524223036.318800-1-thinker.li@gmail.com/ v5: https://lore.kernel.org/all/20240523230848.2022072-1-thinker.li@gmail.com/ v4: https://lore.kernel.org/all/20240521225121.770930-1-thinker.li@gmail.com/ v3: https://lore.kernel.org/all/20240510002942.1253354-1-thinker.li@gmail.com/ v2: https://lore.kernel.org/all/20240507055600.2382627-1-thinker.li@gmail.com/ v1: https://lore.kernel.org/all/20240429213609.487820-1-thinker.li@gmail.com/ ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-30 15:34:14 -07:00
Kui-Feng Lee	d14c1fac0c	bpftool: Change pid_iter.bpf.c to comply with the change of bpf_link_fops. To support epoll, a new instance of file_operations, bpf_link_fops_poll, has been added for links that support epoll. The pid_iter.bpf.c checks f_ops for links and other BPF objects. The check should fail for struct_ops links without this patch. Acked-by: Quentin Monnet <qmo@kernel.org> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240530065946.979330-9-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-30 15:34:14 -07:00
Kui-Feng Lee	1a4b858b6a	selftests/bpf: test struct_ops with epoll Verify whether a user space program is informed through epoll with EPOLLHUP when a struct_ops object is detached. The BPF code in selftests/bpf/progs/struct_ops_module.c has become complex. Therefore, struct_ops_detach.c has been added to segregate the BPF code for detachment tests from the BPF code for other tests based on the recommendation of Andrii Nakryiko. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240530065946.979330-6-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-30 15:34:14 -07:00
Kui-Feng Lee	67c3e8353f	bpf: export bpf_link_inc_not_zero. bpf_link_inc_not_zero() will be used by kernel modules. We will use it in bpf_testmod.c later. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240530065946.979330-5-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-30 15:34:13 -07:00
Kui-Feng Lee	1adddc97aa	bpf: support epoll from bpf struct_ops links. Add epoll support to bpf struct_ops links to trigger EPOLLHUP event upon detachment. This patch implements the "poll" of the "struct file_operations" for BPF links and introduces a new "poll" operator in the "struct bpf_link_ops". By implementing "poll" of "struct bpf_link_ops" for the links of struct_ops, the file descriptor of a struct_ops link can be added to an epoll file descriptor to receive EPOLLHUP events. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240530065946.979330-4-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-30 15:34:13 -07:00
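From the user-space side, waiting for the EPOLLHUP event described in the commit above might look like the following sketch. It uses only the standard epoll calls; `wait_for_hup` is a hypothetical helper name, and the descriptor passed in is assumed to be the file descriptor of a struct_ops link (for example one obtained through libbpf's `bpf_link__fd()`).

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms milliseconds for EPOLLHUP on fd.
 * Returns 1 if EPOLLHUP was reported, 0 on timeout or any other event,
 * and -1 on error.
 */
static int wait_for_hup(int fd, int timeout_ms)
{
	/* EPOLLHUP is always reported by epoll; naming it documents intent. */
	struct epoll_event ev = { .events = EPOLLHUP, .data.fd = fd };
	struct epoll_event out;
	int epfd, n, ret;

	epfd = epoll_create1(EPOLL_CLOEXEC);
	if (epfd < 0)
		return -1;

	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
		close(epfd);
		return -1;
	}

	n = epoll_wait(epfd, &out, 1, timeout_ms);
	if (n < 0)
		ret = -1;
	else if (n == 1 && (out.events & EPOLLHUP))
		ret = 1;
	else
		ret = 0;

	close(epfd);
	return ret;
}
```

The same function works on any pollable descriptor, such as the read end of a pipe whose write end has been closed, which makes it easy to try without loading a BPF object.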
Kui-Feng Lee	6fb2544ea1	bpf: enable detaching links of struct_ops objects. Implement the detach callback in bpf_link_ops for struct_ops so that user programs can detach a struct_ops link. The subsystems that struct_ops objects are registered to can also use this callback to detach the links being passed to them. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240530065946.979330-3-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-30 15:34:13 -07:00
Kui-Feng Lee	73287fe228	bpf: pass bpf_struct_ops_link to callbacks in bpf_struct_ops. Pass an additional pointer of bpf_struct_ops_link to callback function reg, unreg, and update provided by subsystems defined in bpf_struct_ops. A bpf_struct_ops_map can be registered for multiple links. Passing a pointer of bpf_struct_ops_link helps subsystems to distinguish them. This pointer will be used in the later patches to let the subsystem initiate a detachment on a link that was registered to it previously. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240530065946.979330-2-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-30 15:34:13 -07:00
Jakub Sitnicki	46253c4ae9	selftests/bpf: use section names understood by libbpf in test_sockmap libbpf can deduce program type and attach type from the ELF section name. We don't need to pass it out-of-band if we switch to libbpf convention [1]. [1] https://docs.kernel.org/bpf/libbpf/program_types.html Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240522080936.2475833-1-jakub@cloudflare.com	2024-05-30 14:42:17 -07:00
Andrii Nakryiko	f088cabffc	Merge branch 'bpf-add-a-generic-bits-iterator' Yafang Shao says: ==================== bpf: Add a generic bits iterator Three new kfuncs, namely bpf_iter_bits_{new,next,destroy}, have been added for the new bpf_iter_bits functionality. These kfuncs enable the iteration of the bits from a given address and a given number of bits. - bpf_iter_bits_new Initialize a new bits iterator for a given memory area. Due to the limitation of bpf memalloc, the max number of bits to be iterated over is (4096 * 8). - bpf_iter_bits_next Get the next bit in a bpf_iter_bits - bpf_iter_bits_destroy Destroy a bpf_iter_bits The bits iterator can be used in any context and on any address. Changes: - v7->v8: Refine the interface to avoid dealing with endianness (Andrii) - v6->v7: Fix endianness error for non-long-aligned data (Andrii) - v5->v6: Add positive tests (Andrii) - v4->v5: Simplify test cases (Andrii) - v3->v4: - Fix endianness error on s390x (Andrii) - zero-initialize kit->bits_copy and zero out nr_bits (Andrii) - v2->v3: Optimization for u64/u32 mask (Andrii) - v1->v2: Simplify the CPU number verification code to avoid the failure on s390x (Eduard) - bpf: Add bpf_iter_cpumask https://lwn.net/Articles/961104/ - bpf: Add new bpf helper bpf_for_each_cpu https://lwn.net/Articles/939939/ ==================== Link: https://lore.kernel.org/r/20240517023034.48138-1-laoar.shao@gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-05-29 16:01:48 -07:00
Yafang Shao	6ba7acdb93	selftests/bpf: Add selftest for bits iter Add test cases for the bits iter: - Positive cases - Bit mask representing a single word (8-byte unit) - Bit mask representing data spanning more than one word - The index of the set bit - Negative cases - bpf_iter_bits_destroy() is required after calling bpf_iter_bits_new() - bpf_iter_bits_destroy() can only destroy an initialized iter - bpf_iter_bits_next() must use an initialized iter - Bit mask representing zero words - Bit mask representing fewer words than expected - Case for ENOMEM - Case for NULL pointer Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240517023034.48138-3-laoar.shao@gmail.com	2024-05-29 16:01:48 -07:00
Yafang Shao	4665415975	bpf: Add bits iterator Add three new kfuncs for the bits iterator: - bpf_iter_bits_new Initialize a new bits iterator for a given memory area. Due to the limitation of bpf memalloc, the max number of words (8-byte units) that can be iterated over is limited to (4096 / 8). - bpf_iter_bits_next Get the next bit in a bpf_iter_bits - bpf_iter_bits_destroy Destroy a bpf_iter_bits The bits iterator facilitates the iteration of the bits of a memory area, such as cpumask. It can be used in any context and on any address. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240517023034.48138-2-laoar.shao@gmail.com	2024-05-29 16:01:47 -07:00
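The new/next/destroy pattern in the commit above can be illustrated with a small user-space model in C. This is only a sketch of the iteration semantics (walk a memory area word by word and yield the index of each set bit); the `bits_iter_model_*` names and the struct layout are hypothetical, and the real kfunc signatures and error codes are not reproduced here. The word limit mirrors the (4096 / 8) figure stated in the commit message.

```c
#include <stdint.h>

/* Limit stated in the commit message: at most (4096 / 8) 8-byte words. */
#define MODEL_MAX_WORDS (4096 / 8)

/* Hypothetical user-space model; not the kernel's struct bpf_iter_bits. */
struct bits_iter_model {
	const uint64_t *words;	/* memory area being scanned */
	unsigned int nr_bits;	/* total number of bits in the area */
	unsigned int pos;	/* next bit index to examine */
};

/* Start iterating over nr_words 8-byte words. Returns 0, or -1 if rejected. */
static int bits_iter_model_new(struct bits_iter_model *it,
			       const uint64_t *words, unsigned int nr_words)
{
	if (!words || nr_words == 0 || nr_words > MODEL_MAX_WORDS)
		return -1;
	it->words = words;
	it->nr_bits = nr_words * 64;
	it->pos = 0;
	return 0;
}

/* Returns the index of the next set bit, or -1 once the area is exhausted. */
static int bits_iter_model_next(struct bits_iter_model *it)
{
	while (it->pos < it->nr_bits) {
		unsigned int bit = it->pos++;

		if (it->words[bit / 64] & (UINT64_C(1) << (bit % 64)))
			return (int)bit;
	}
	return -1;
}

/* Nothing is allocated in this model, so destroy only resets the state. */
static void bits_iter_model_destroy(struct bits_iter_model *it)
{
	it->words = 0;
	it->nr_bits = 0;
	it->pos = 0;
}
```

With a two-word mask whose first word is 0x5 and whose second word has only its top bit set, successive `bits_iter_model_next()` calls yield 0, 2, and 127 before returning -1.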


1279127 Commits