linux

iv/linux

Author	SHA1	Message	Date
Tobias Klauser	f5be22c64b	bpf: Fix bpf_skc_lookup comment wrt. return type The function no longer returns 'unsigned long' as of commit edbf8c01de5a ("bpf: add skc_lookup_tcp helper"). Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220617152121.29617-1-tklauser@distanz.ch	2022-06-17 18:36:45 +02:00
Joanne Koong	dc368e1c65	bpf: Fix non-static bpf_func_proto struct definitions This patch does two things: 1) Marks the dynptr bpf_func_proto structs that were added in [1] as static, as pointed out by the kernel test robot in [2]. 2) There are some bpf_func_proto structs marked as extern which can instead be statically defined. [1] https://lore.kernel.org/bpf/20220523210712.3641569-1-joannelkoong@gmail.com/ [2] https://lore.kernel.org/bpf/62ab89f2.Pko7sI08RAKdF8R6%25lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220616225407.1878436-1-joannelkoong@gmail.com	2022-06-17 16:00:51 +02:00
Andrii Nakryiko	08c79c9cd6	selftests/bpf: Don't force lld on non-x86 architectures LLVM's lld linker doesn't have a universal architecture support (e.g., it definitely doesn't work on s390x), so be safe and force lld for urandom_read and liburandom_read.so only on x86 architectures. This should fix s390x CI runs. Fixes: 3e6fe5ce4d48 ("libbpf: Fix internal USDT address translation logic for shared libraries") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220617045512.1339795-1-andrii@kernel.org	2022-06-17 10:16:01 +02:00
Alexei Starovoitov	4429bdc408	Merge branch 'New BPF helpers to accelerate synproxy' Maxim Mikityanskiy says: ==================== The first patch of this series is a documentation fix. The second patch allows BPF helpers to accept memory regions of fixed size without doing runtime size checks. The two next patches add new functionality that allows XDP to accelerate iptables synproxy. v1 of this series [1] used to include a patch that exposed conntrack lookup to BPF using stable helpers. It was superseded by series [2] by Kumar Kartikeya Dwivedi, which implements this functionality using unstable helpers. The third patch adds new helpers to issue and check SYN cookies without binding to a socket, which is useful in the synproxy scenario. The fourth patch adds a selftest, which includes an XDP program and a userspace control application. The XDP program uses socketless SYN cookie helpers and queries conntrack status instead of socket status. The userspace control application allows to tune parameters of the XDP program. This program also serves as a minimal example of usage of the new functionality. The last two patches expose the new helpers to TC BPF and extend the selftest. The draft of the new functionality was presented on Netdev 0x15 [3]. v2 changes: Split into two series, submitted bugfixes to bpf, dropped the conntrack patches, implemented the timestamp cookie in BPF using bpf_loop, dropped the timestamp cookie patch. v3 changes: Moved some patches from bpf to bpf-next, dropped the patch that changed error codes, split the new helpers into IPv4/IPv6, added verifier functionality to accept memory regions of fixed size. v4 changes: Converted the selftest to the test_progs runner. Replaced some deprecated functions in xdp_synproxy userspace helper. v5 changes: Fixed a bug in the selftest. Added questionable functionality to support new helpers in TC BPF, added selftests for it. v6 changes: Wrap the new helpers themselves into #ifdef CONFIG_SYN_COOKIES, replaced fclose with pclose and fixed the MSS for IPv6 in the selftest. v7 changes: Fixed the off-by-one error in indices, changed the section name to "xdp", added missing kernel config options to vmtest in CI. v8 changes: Properly rebased, dropped the first patch (the same change was applied by someone else), updated the cover letter. v9 changes: Fixed selftests for no_alu32. v10 changes: Selftests for s390x were blacklisted due to lack of support of kfunc, rebased the series, split selftests to separate commits, created ARG_PTR_TO_FIXED_SIZE_MEM and packed arg_size, addressed the rest of comments. [1]: https://lore.kernel.org/bpf/20211020095815.GJ28644@breakpoint.cc/t/ [2]: https://lore.kernel.org/bpf/20220114163953.1455836-1-memxor@gmail.com/ [3]: https://netdevconf.info/0x15/session.html?Accelerating-synproxy-with-XDP ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-16 21:20:30 -07:00
Maxim Mikityanskiy	784d5dc0ef	selftests/bpf: Add selftests for raw syncookie helpers in TC mode This commit extends selftests for the new BPF helpers bpf_tcp_raw_{gen,check}_syncookie_ipv{4,6} to also test the TC BPF functionality added in the previous commit. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20220615134847.3753567-7-maximmi@nvidia.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-16 21:20:30 -07:00
Maxim Mikityanskiy	9a4cf07386	bpf: Allow the new syncookie helpers to work with SKBs This commit allows the new BPF helpers to work in SKB context (in TC BPF programs): bpf_tcp_raw_{gen,check}_syncookie_ipv{4,6}. Using these helpers in TC BPF programs is not recommended, because it's unlikely that the BPF program will provide any substantional speedup compared to regular SYN cookies or synproxy, after the SKB is already created. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20220615134847.3753567-6-maximmi@nvidia.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-16 21:20:30 -07:00
Maxim Mikityanskiy	fb5cd0ce70	selftests/bpf: Add selftests for raw syncookie helpers This commit adds selftests for the new BPF helpers: bpf_tcp_raw_{gen,check}_syncookie_ipv{4,6}. xdp_synproxy_kern.c is a BPF program that generates SYN cookies on allowed TCP ports and sends SYNACKs to clients, accelerating synproxy iptables module. xdp_synproxy.c is a userspace control application that allows to configure the following options in runtime: list of allowed ports, MSS, window scale, TTL. A selftest is added to prog_tests that leverages the above programs to test the functionality of the new helpers. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20220615134847.3753567-5-maximmi@nvidia.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-16 21:20:30 -07:00
Maxim Mikityanskiy	33bf988504	bpf: Add helpers to issue and check SYN cookies in XDP The new helpers bpf_tcp_raw_{gen,check}_syncookie_ipv{4,6} allow an XDP program to generate SYN cookies in response to TCP SYN packets and to check those cookies upon receiving the first ACK packet (the final packet of the TCP handshake). Unlike bpf_tcp_{gen,check}_syncookie these new helpers don't need a listening socket on the local machine, which allows to use them together with synproxy to accelerate SYN cookie generation. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20220615134847.3753567-4-maximmi@nvidia.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-16 21:20:30 -07:00
Maxim Mikityanskiy	508362ac66	bpf: Allow helpers to accept pointers with a fixed size Before this commit, the BPF verifier required ARG_PTR_TO_MEM arguments to be followed by ARG_CONST_SIZE holding the size of the memory region. The helpers had to check that size in runtime. There are cases where the size expected by a helper is a compile-time constant. Checking it in runtime is an unnecessary overhead and waste of BPF registers. This commit allows helpers to accept pointers to memory without the corresponding ARG_CONST_SIZE, given that they define the memory region size in struct bpf_func_proto and use ARG_PTR_TO_FIXED_SIZE_MEM type. arg_size is unionized with arg_btf_id to reduce the kernel image size, and it's valid because they are used by different argument types. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20220615134847.3753567-3-maximmi@nvidia.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-16 21:20:29 -07:00
Maxim Mikityanskiy	ac80287a6a	bpf: Fix documentation of th_len in bpf_tcp_{gen,check}_syncookie bpf_tcp_gen_syncookie expects the full length of the TCP header (with all options), and bpf_tcp_check_syncookie accepts lengths bigger than sizeof(struct tcphdr). Fix the documentation that says these lengths should be exactly sizeof(struct tcphdr). While at it, fix a typo in the name of struct ipv6hdr. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20220615134847.3753567-2-maximmi@nvidia.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-16 21:20:29 -07:00
Alexei Starovoitov	88bf185813	Merge branch 'sleepable uprobe support' Delyan Kratunov says: ==================== This series implements support for sleepable uprobe programs. Key work is in patches 2 and 3, the rest is plumbing and tests. The main observation is that the only obstacle in the way of sleepable uprobe programs is not the uprobe infrastructure, which already runs in a user-like context, but the rcu usage around bpf_prog_array. Details are in patch 2 but the tl;dr is that we chain trace_tasks and normal rcu grace periods when releasing to array to accommodate users of either rcu type. This introduces latency for non-sleepable users (kprobe, tp) but that's deemed acceptable, given recent benchmarks by Andrii [1]. We're a couple of orders of magnitude under the rate of bpf_prog_array churn that would raise flags (~1MM/s per Paul). [1]: https://lore.kernel.org/bpf/CAEf4BzbpjN6ca7D9KOTiFPOoBYkciYvTz0UJNp5c-_3ptm=Mrg@mail.gmail.com/ v3 -> v4: * Fix kdoc and inline issues * Rebase v2 -> v3: * Inline uprobe_call_bpf into trace_uprobe.c, it's just a bpf_prog_run_array_sleepable call now. * Do not disable preemption for uprobe non-sleepable programs. * Add acks. v1 -> v2: * Fix lockdep annotations in bpf_prog_run_array_sleepable * Chain rcu grace periods only for perf_event-attached programs. This limits the additional latency on the free path to use cases where we know it won't be a problem. * Add tests calling helpers only available in sleepable programs. * Remove kprobe.s support from libbpf. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-16 19:27:37 -07:00
Delyan Kratunov	cb3f4a4a46	selftests/bpf: add tests for sleepable (uk)probes Add tests that ensure sleepable uprobe programs work correctly. Add tests that ensure sleepable kprobe programs cannot attach. Also add tests that attach both sleepable and non-sleepable uprobe programs to the same location (i.e. same bpf_prog_array). Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Delyan Kratunov <delyank@fb.com> Link: https://lore.kernel.org/r/c744e5bb7a5c0703f05444dc41f2522ba3579a48.1655248076.git.delyank@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-16 19:27:30 -07:00
Delyan Kratunov	c4cac71fc8	libbpf: add support for sleepable uprobe programs Add section mappings for u(ret)probe.s programs. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Delyan Kratunov <delyank@fb.com> Link: https://lore.kernel.org/r/aedbc3b74f3523f00010a7b0df8f3388cca59f16.1655248076.git.delyank@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-16 19:27:30 -07:00
Delyan Kratunov	64ad7556c7	bpf: allow sleepable uprobe programs to attach uprobe and kprobe programs have the same program type, KPROBE, which is currently not allowed to load sleepable programs. To avoid adding a new UPROBE type, instead allow sleepable KPROBE programs to load and defer the is-it-actually-a-uprobe-program check to attachment time, where there's already validation of the corresponding perf_event. A corollary of this patch is that you can now load a sleepable kprobe program but cannot attach it. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Delyan Kratunov <delyank@fb.com> Link: https://lore.kernel.org/r/fcd44a7cd204f372f6bb03ef794e829adeaef299.1655248076.git.delyank@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-16 19:27:29 -07:00
Delyan Kratunov	8c7dcb84e3	bpf: implement sleepable uprobes by chaining gps uprobes work by raising a trap, setting a task flag from within the interrupt handler, and processing the actual work for the uprobe on the way back to userspace. As a result, uprobe handlers already execute in a might_fault/_sleep context. The primary obstacle to sleepable bpf uprobe programs is therefore on the bpf side. Namely, the bpf_prog_array attached to the uprobe is protected by normal rcu. In order for uprobe bpf programs to become sleepable, it has to be protected by the tasks_trace rcu flavor instead (and kfree() called after a corresponding grace period). Therefore, the free path for bpf_prog_array now chains a tasks_trace and normal grace periods one after the other. Users who iterate under tasks_trace read section would be safe, as would users who iterate under normal read sections (from non-sleepable locations). The downside is that the tasks_trace latency affects all perf_event-attached bpf programs (and not just uprobe ones). This is deemed safe given the possible attach rates for kprobe/uprobe/tp programs. Separately, non-sleepable programs need access to dynamically sized rcu-protected maps, so bpf_run_prog_array_sleepables now conditionally takes an rcu read section, in addition to the overarching tasks_trace section. Signed-off-by: Delyan Kratunov <delyank@fb.com> Link: https://lore.kernel.org/r/ce844d62a2fd0443b08c5ab02e95bc7149f9aeb1.1655248076.git.delyank@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-16 19:27:29 -07:00
Delyan Kratunov	d687f621c5	bpf: move bpf_prog to bpf.h In order to add a version of bpf_prog_run_array which accesses the bpf_prog->aux member, bpf_prog needs to be more than a forward declaration inside bpf.h. Given that filter.h already includes bpf.h, this merely reorders the type declarations for filter.h users. bpf.h users now have access to bpf_prog internals. Signed-off-by: Delyan Kratunov <delyank@fb.com> Link: https://lore.kernel.org/r/3ed7824e3948f22d84583649ccac0ff0d38b6b58.1655248076.git.delyank@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-16 19:27:29 -07:00
Andrii Nakryiko	3e6fe5ce4d	libbpf: Fix internal USDT address translation logic for shared libraries Perform the same virtual address to file offset translation that libbpf is doing for executable ELF binaries also for shared libraries. Currently libbpf is making a simplifying and sometimes wrong assumption that for shared libraries relative virtual addresses inside ELF are always equal to file offsets. Unfortunately, this is not always the case with LLVM's lld linker, which now by default generates quite more complicated ELF segments layout. E.g., for liburandom_read.so from selftests/bpf, here's an excerpt from readelf output listing ELF segments (a.k.a. program headers): Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align PHDR 0x000040 0x0000000000000040 0x0000000000000040 0x0001f8 0x0001f8 R 0x8 LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x0005e4 0x0005e4 R 0x1000 LOAD 0x0005f0 0x00000000000015f0 0x00000000000015f0 0x000160 0x000160 R E 0x1000 LOAD 0x000750 0x0000000000002750 0x0000000000002750 0x000210 0x000210 RW 0x1000 LOAD 0x000960 0x0000000000003960 0x0000000000003960 0x000028 0x000029 RW 0x1000 Compare that to what is generated by GNU ld (or LLVM lld's with extra -znoseparate-code argument which disables this cleverness in the name of file size reduction): Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x000550 0x000550 R 0x1000 LOAD 0x001000 0x0000000000001000 0x0000000000001000 0x000131 0x000131 R E 0x1000 LOAD 0x002000 0x0000000000002000 0x0000000000002000 0x0000ac 0x0000ac R 0x1000 LOAD 0x002dc0 0x0000000000003dc0 0x0000000000003dc0 0x000262 0x000268 RW 0x1000 You can see from the first example above that for executable (Flg == "R E") PT_LOAD segment (LOAD #2), Offset doesn't match VirtAddr columns. And it does in the second case (GNU ld output). This is important because all the addresses, including USDT specs, operate in a virtual address space, while kernel is expecting file offsets when performing uprobe attach. So such mismatches have to be properly taken care of and compensated by libbpf, which is what this patch is fixing. Also patch clarifies few function and variable names, as well as updates comments to reflect this important distinction (virtaddr vs file offset) and to ephasize that shared libraries are not all that different from executables in this regard. This patch also changes selftests/bpf Makefile to force urand_read and liburand_read.so to be built with Clang and LLVM's lld (and explicitly request this ELF file size optimization through -znoseparate-code linker parameter) to validate libbpf logic and ensure regressions don't happen in the future. I've bundled these selftests changes together with libbpf changes to keep the above description tied with both libbpf and selftests changes. Fixes: 74cc6311cec9 ("libbpf: Add USDT notes parsing and resolution logic") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220616055543.3285835-1-andrii@kernel.org	2022-06-17 01:20:10 +02:00
Zhengchao Shao	de5bb43826	samples/bpf: Check detach prog exist or not in xdp_fwd Before detach the prog, we should check detach prog exist or not. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20220606005425.261967-1-shaozhengchao@huawei.com	2022-06-15 17:26:20 -07:00
Yonghong Song	3831cd1f9f	selftests/bpf: Avoid skipping certain subtests Commit 704c91e59fe0 ('selftests/bpf: Test "bpftool gen min_core_btf"') added a test test_core_btfgen to test core relocation with btf generated with 'bpftool gen min_core_btf'. Currently, among 76 subtests, 25 are skipped. ... #46/69 core_reloc_btfgen/enumval:OK #46/70 core_reloc_btfgen/enumval___diff:OK #46/71 core_reloc_btfgen/enumval___val3_missing:OK #46/72 core_reloc_btfgen/enumval___err_missing:SKIP #46/73 core_reloc_btfgen/enum64val:OK #46/74 core_reloc_btfgen/enum64val___diff:OK #46/75 core_reloc_btfgen/enum64val___val3_missing:OK #46/76 core_reloc_btfgen/enum64val___err_missing:SKIP ... #46 core_reloc_btfgen:SKIP Summary: 1/51 PASSED, 25 SKIPPED, 0 FAILED Alexei found that in the above core_reloc_btfgen/enum64val___err_missing should not be skipped. Currently, the core_reloc tests have some negative tests. In Commit 704c91e59fe0, for core_reloc_btfgen, all negative tests are skipped with the following condition if (!test_case->btf_src_file \|\| test_case->fails) { test__skip(); continue; } This is too conservative. Negative tests do not fail mkstemp() and run_btfgen() should not be skipped. There are a few negative tests indeed failing run_btfgen() and this patch added 'run_btfgen_fails' to mark these tests so that they can be skipped for btfgen tests. With this, we have ... #46/69 core_reloc_btfgen/enumval:OK #46/70 core_reloc_btfgen/enumval___diff:OK #46/71 core_reloc_btfgen/enumval___val3_missing:OK #46/72 core_reloc_btfgen/enumval___err_missing:OK #46/73 core_reloc_btfgen/enum64val:OK #46/74 core_reloc_btfgen/enum64val___diff:OK #46/75 core_reloc_btfgen/enum64val___val3_missing:OK #46/76 core_reloc_btfgen/enum64val___err_missing:OK ... Summary: 1/62 PASSED, 14 SKIPPED, 0 FAILED Totally 14 subtests are skipped instead of 25. Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220614055526.628299-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-14 16:18:29 -07:00
Yonghong Song	96752e1ec0	selftests/bpf: Fix test_varlen verification failure with latest llvm With latest llvm15, test_varlen failed with the following verifier log: 17: (85) call bpf_probe_read_kernel_str#115 ; R0_w=scalar(smin=-4095,smax=256) 18: (bf) r1 = r0 ; R0_w=scalar(id=1,smin=-4095,smax=256) R1_w=scalar(id=1,smin=-4095,smax=256) 19: (67) r1 <<= 32 ; R1_w=scalar(smax=1099511627776,umax=18446744069414584320,var_off=(0x0; 0xffffffff00000000),s32_min=0,s32_max=0,u32_max=) 20: (bf) r2 = r1 ; R1_w=scalar(id=2,smax=1099511627776,umax=18446744069414584320,var_off=(0x0; 0xffffffff00000000),s32_min=0,s32_max=0,u32) 21: (c7) r2 s>>= 32 ; R2=scalar(smin=-2147483648,smax=256) ; if (len >= 0) { 22: (c5) if r2 s< 0x0 goto pc+7 ; R2=scalar(umax=256,var_off=(0x0; 0x1ff)) ; payload4_len1 = len; 23: (18) r2 = 0xffffc90000167418 ; R2_w=map_value(off=1048,ks=4,vs=1572,imm=0) 25: (63) (u32 )(r2 +0) = r0 ; R0=scalar(id=1,smin=-4095,smax=256) R2_w=map_value(off=1048,ks=4,vs=1572,imm=0) 26: (77) r1 >>= 32 ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) ; payload += len; 27: (18) r6 = 0xffffc90000167424 ; R6_w=map_value(off=1060,ks=4,vs=1572,imm=0) 29: (0f) r6 += r1 ; R1_w=Pscalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R6_w=map_value(off=1060,ks=4,vs=1572,umax=4294967295,var_off=(0) ; len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in2[0]); 30: (bf) r1 = r6 ; R1_w=map_value(off=1060,ks=4,vs=1572,umax=4294967295,var_off=(0x0; 0xffffffff)) R6_w=map_value(off=1060,ks=4,vs=1572,um) 31: (b7) r2 = 256 ; R2_w=256 32: (18) r3 = 0xffffc90000164100 ; R3_w=map_value(off=256,ks=4,vs=1056,imm=0) 34: (85) call bpf_probe_read_kernel_str#115 R1 unbounded memory access, make sure to bounds check any such access processed 27 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1 -- END PROG LOAD LOG -- libbpf: failed to load program 'handler32_signed' The failure is due to 20: (bf) r2 = r1 ; R1_w=scalar(id=2,smax=1099511627776,umax=18446744069414584320,var_off=(0x0; 0xffffffff00000000),s32_min=0,s32_max=0,u32) 21: (c7) r2 s>>= 32 ; R2=scalar(smin=-2147483648,smax=256) 22: (c5) if r2 s< 0x0 goto pc+7 ; R2=scalar(umax=256,var_off=(0x0; 0x1ff)) 26: (77) r1 >>= 32 ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) 29: (0f) r6 += r1 ; R1_w=Pscalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R6_w=map_value(off=1060,ks=4,vs=1572,umax=4294967295,var_off=(0) where r1 has conservative value range compared to r2 and r1 is used later. In llvm, commit [1] triggered the above code generation and caused verification failure. It may take a while for llvm to address this issue. In the main time, let us change the variable 'len' type to 'long' and adjust condition properly. Tested with llvm14 and latest llvm15, both worked fine. [1] https://reviews.llvm.org/D126647 Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220613233449.2860753-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-14 16:17:07 -07:00
Quentin Monnet	93270357da	bpftool: Do not check return value from libbpf_set_strict_mode() The function always returns 0, so we don't need to check whether the return value is 0 or not. This change was first introduced in commit a777e18f1bcd ("bpftool: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK"), but later reverted to restore the unconditional rlimit bump in bpftool. Let's re-add it. Co-developed-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220610112648.29695-3-quentin@isovalent.com	2022-06-14 22:18:56 +02:00
Quentin Monnet	6b4384ff10	Revert "bpftool: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK" This reverts commit a777e18f1bcd32528ff5dfd10a6629b655b05eb8. In commit a777e18f1bcd ("bpftool: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK"), we removed the rlimit bump in bpftool, because the kernel has switched to memcg-based memory accounting. Thanks to the LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK, we attempted to keep compatibility with other systems and ask libbpf to raise the limit for us if necessary. How do we know if memcg-based accounting is supported? There is a probe in libbpf to check this. But this probe currently relies on the availability of a given BPF helper, bpf_ktime_get_coarse_ns(), which landed in the same kernel version as the memory accounting change. This works in the generic case, but it may fail, for example, if the helper function has been backported to an older kernel. This has been observed for Google Cloud's Container-Optimized OS (COS), where the helper is available but rlimit is still in use. The probe succeeds, the rlimit is not raised, and probing features with bpftool, for example, fails. A patch was submitted [0] to update this probe in libbpf, based on what the cilium/ebpf Go library does [1]. It would lower the soft rlimit to 0, attempt to load a BPF object, and reset the rlimit. But it may induce some hard-to-debug flakiness if another process starts, or the current application is killed, while the rlimit is reduced, and the approach was discarded. As a workaround to ensure that the rlimit bump does not depend on the availability of a given helper, we restore the unconditional rlimit bump in bpftool for now. [0] https://lore.kernel.org/bpf/20220609143614.97837-1-quentin@isovalent.com/ [1] https://github.com/cilium/ebpf/blob/v0.9.0/rlimit/rlimit.go#L39 Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Yafang Shao <laoar.shao@gmail.com> Cc: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20220610112648.29695-2-quentin@isovalent.com	2022-06-14 22:18:06 +02:00
YueHaibing	fc386ba721	bpf, arm: Remove unused function emit_a32_alu_r() Since commit b18bea2a45b1 ("ARM: net: bpf: improve 64-bit ALU implementation") this is unused anymore, so can remove it. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220611040904.8976-1-yuehaibing@huawei.com	2022-06-14 22:04:16 +02:00
Yonghong Song	c49a44b39b	libbpf: Fix an unsigned < 0 bug Andrii reported a bug with the following information: 2859 if (enum64_placeholder_id == 0) { 2860 enum64_placeholder_id = btf__add_int(btf, "enum64_placeholder", 1, 0); >>> CID 394804: Control flow issues (NO_EFFECT) >>> This less-than-zero comparison of an unsigned value is never true. "enum64_placeholder_id < 0U". 2861 if (enum64_placeholder_id < 0) 2862 return enum64_placeholder_id; 2863 ... Here enum64_placeholder_id declared as '__u32' so enum64_placeholder_id < 0 is always false. Declare enum64_placeholder_id as 'int' in order to capture the potential error properly. Fixes: f2a625889bb8 ("libbpf: Add enum64 sanitization") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220613054314.1251905-1-yhs@fb.com	2022-06-14 17:01:54 +02:00
Hongyi Lu	6dbdc9f353	bpf: Fix spelling in bpf_verifier.h Minor spelling fix spotted in bpf_verifier.h. Spelling is no big deal, but it is still an improvement when reading through the code. Signed-off-by: Hongyi Lu <jwnhy0@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220613211633.58647-1-jwnhy0@gmail.com	2022-06-14 16:56:25 +02:00
Alexei Starovoitov	d5e9aeda81	Merge branch 'Optimize performance of update hash-map when free is zero' Feng zhou says: ==================== From: Feng Zhou <zhoufeng.zf@bytedance.com> We encountered bad case on big system with 96 CPUs that alloc_htab_elem() would last for 1ms. The reason is that after the prealloc hashtab has no free elems, when trying to update, it will still grab spin_locks of all cpus. If there are multiple update users, the competition is very serious. 0001: Use head->first to check whether the free list is empty or not before taking the lock. 0002: Add benchmark to reproduce this worst case. Changelog: v5->v6: Addressed comments from Alexei Starovoitov. - Adjust the commit log. some details in here: https://lore.kernel.org/all/20220608021050.47279-1-zhoufeng.zf@bytedance.com/ v4->v5: Addressed comments from Alexei Starovoitov. - Use head->first. - Use cpu+max_entries. some details in here: https://lore.kernel.org/bpf/20220601084149.13097-1-zhoufeng.zf@bytedance.com/ v3->v4: Addressed comments from Daniel Borkmann. - Use READ_ONCE/WRITE_ONCE. some details in here: https://lore.kernel.org/all/20220530091340.53443-1-zhoufeng.zf@bytedance.com/ v2->v3: Addressed comments from Alexei Starovoitov, Andrii Nakryiko. - Adjust the way the benchmark is tested. - Adjust the code format. some details in here: https://lore.kernel.org/all/20220524075306.32306-1-zhoufeng.zf@bytedance.com/T/ v1->v2: Addressed comments from Alexei Starovoitov. - add a benchmark to reproduce the issue. - Adjust the code format that avoid adding indent. some details in here: https://lore.kernel.org/all/877ac441-045b-1844-6938-fcaee5eee7f2@bytedance.com/T/ ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-11 14:25:35 -07:00
Feng Zhou	89eda98428	selftest/bpf/benchs: Add bpf_map benchmark Add benchmark for hash_map to reproduce the worst case that non-stop update when map's free is zero. Just like this: ./run_bench_bpf_hashmap_full_update.sh Setting up benchmark 'bpf-hashmap-ful-update'... Benchmark 'bpf-hashmap-ful-update' started. 1:hash_map_full_perf 555830 events per sec ... Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com> Link: https://lore.kernel.org/r/20220610023308.93798-3-zhoufeng.zf@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-11 14:25:35 -07:00
Feng Zhou	54a9c3a42d	bpf: avoid grabbing spin_locks of all cpus when no free elems This patch use head->first in pcpu_freelist_head to check freelist having free or not. If having, grab spin_lock, or check next cpu's freelist. Before patch: hash_map performance ./map_perf_test 1 0:hash_map_perf pre-alloc 1043397 events per sec ... The average of the test results is around 1050000 events per sec. hash_map the worst: no free ./run_bench_bpf_hashmap_full_update.sh Setting up benchmark 'bpf-hashmap-ful-update'... Benchmark 'bpf-hashmap-ful-update' started. 1:hash_map_full_perf 15687 events per sec ... The average of the test results is around 16000 events per sec. ftrace trace: 0) \| htab_map_update_elem() { 0) \| __pcpu_freelist_pop() { 0) \| _raw_spin_lock() 0) \| _raw_spin_unlock() 0) \| ... 0) + 25.188 us \| } 0) + 28.439 us \| } The test machine is 16C, trying to get spin_lock 17 times, in addition to 16c, there is an extralist. after patch: hash_map performance ./map_perf_test 1 0:hash_map_perf pre-alloc 1053298 events per sec ... The average of the test results is around 1050000 events per sec. hash_map worst: no free ./run_bench_bpf_hashmap_full_update.sh Setting up benchmark 'bpf-hashmap-ful-update'... Benchmark 'bpf-hashmap-ful-update' started. 1:hash_map_full_perf 555830 events per sec ... The average of the test results is around 550000 events per sec. ftrace trace: 0) \| htab_map_update_elem() { 0) \| alloc_htab_elem() { 0) 0.586 us \| __pcpu_freelist_pop(); 0) 0.945 us \| } 0) 8.669 us \| } It can be seen that after adding this patch, the map performance is almost not degraded, and when free=0, first check head->first instead of directly acquiring spin_lock. Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com> Link: https://lore.kernel.org/r/20220610023308.93798-2-zhoufeng.zf@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-11 14:25:35 -07:00
Andrii Nakryiko	fe92833524	libbpf: Fix uprobe symbol file offset calculation logic Fix libbpf's bpf_program__attach_uprobe() logic of determining function's file offset (which is what kernel is actually expecting) when attaching uprobe/uretprobe by function name. Previously calculation was determining virtual address offset relative to base load address, which (offset) is not always the same as file offset (though very frequently it is which is why this went unnoticed for a while). Fixes: 433966e3ae04 ("libbpf: Support function name-based attach uprobes") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Riham Selim <rihams@fb.com> Cc: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20220606220143.3796908-1-andrii@kernel.org	2022-06-09 14:09:41 +02:00
Kosuke Fujimoto	492f99e419	bpf, docs: Fix typo "BFP_ALU" to "BPF_ALU" "BFP" should be "BPF" Signed-off-by: Kosuke Fujimoto <fujimotokosuke0@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220609083937.245749-1-fujimotoksouke0@gmail.com	2022-06-09 14:06:41 +02:00
Shahab Vahedi	0b817059a8	bpftool: Fix bootstrapping during a cross compilation This change adjusts the Makefile to use "HOSTAR" as the archive tool to keep the sanity of the build process for the bootstrap part in check. For the rationale, please continue reading. When cross compiling bpftool with buildroot, it leads to an invocation like: $ AR="/path/to/buildroot/host/bin/arc-linux-gcc-ar" \ CC="/path/to/buildroot/host/bin/arc-linux-gcc" \ ... make Which in return fails while building the bootstrap section: ----------------------------------8<---------------------------------- make: Entering directory '/src/bpftool-v6.7.0/src' ... libbfd: [ on ] ... disassembler-four-args: [ on ] ... zlib: [ on ] ... libcap: [ OFF ] ... clang-bpf-co-re: [ on ] <-- triggers bootstrap . . . LINK /src/bpftool-v6.7.0/src/bootstrap/bpftool /usr/bin/ld: /src/bpftool-v6.7.0/src/bootstrap/libbpf/libbpf.a: error adding symbols: archive has no index; run ranlib to add one collect2: error: ld returned 1 exit status make: * [Makefile:211: /src/bpftool-v6.7.0/src/bootstrap/bpftool] Error 1 make: * Waiting for unfinished jobs.... AR /src/bpftool-v6.7.0/src/libbpf/libbpf.a make[1]: Leaving directory '/src/bpftool-v6.7.0/libbpf/src' make: Leaving directory '/src/bpftool-v6.7.0/src' ---------------------------------->8---------------------------------- This occurs because setting "AR" confuses the build process for the bootstrap section and it calls "arc-linux-gcc-ar" to create and index "libbpf.a" instead of the host "ar". Signed-off-by: Shahab Vahedi <shahab@synopsys.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org> Link: https://lore.kernel.org/bpf/8d297f0c-cfd0-ef6f-3970-6dddb3d9a87a@synopsys.com	2022-06-09 14:01:21 +02:00
Alexei Starovoitov	d352bd889b	Merge branch 'bpf: Add 64bit enum value support' Yonghong Song says: ==================== Currently, btf only supports upto 32bit enum value with BTF_KIND_ENUM. But in kernel, some enum has 64bit values, e.g., in uapi bpf.h, we have enum { BPF_F_INDEX_MASK = 0xffffffffULL, BPF_F_CURRENT_CPU = BPF_F_INDEX_MASK, BPF_F_CTXLEN_MASK = (0xfffffULL << 32), }; With BTF_KIND_ENUM, the value for BPF_F_CTXLEN_MASK will be encoded as 0 which is incorrect. To solve this problem, BTF_KIND_ENUM64 is proposed in this patch set to support enum 64bit values. Also, since sometimes there is a need to generate C code from btf, e.g., vmlinux.h, btf kflag support is also added for BTF_KIND_ENUM and BTF_KIND_ENUM64 to indicate signedness, helping proper value printout. Changelog: v4 -> v5: - skip newly-added enum64 C test if clang version <= 14. v3 -> v4: - rename btf_type_is_any_enum() to btf_is_any_enum() to favor consistency in libbpf. - fix sign extension issue in btf_dump_get_enum_value(). - fix BPF_CORE_FIELD_SIGNED signedness issue in bpf_core_calc_field_relo(). v2 -> v3: - Implement separate btf_equal_enum()/btf_equal_enum64() and btf_compat_enum()/btf_compat_enum64(). - Add a new enum64 placeholder type dynamicly for enum64 sanitization. - For bpftool output and unit selftest, printed out signed/unsigned encoding as well. - fix some issues with BTF_KIND_ENUM is doc and clarified sign extension rules for enum values. v1 -> v2: - Changed kflag default from signed to unsigned - Fixed sanitization issue - Broke down libbpf related patches for easier review - Added more tests - More code refactorization - Corresponding llvm patch (to support enum64) is also updated ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:44 -07:00
Yonghong Song	61dbd59829	docs/bpf: Update documentation for BTF_KIND_ENUM64 support Add BTF_KIND_ENUM64 documentation in btf.rst. Also fixed a typo for section number for BTF_KIND_TYPE_TAG from 2.2.17 to 2.2.18, and fixed a type size issue for BTF_KIND_ENUM. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062724.3728215-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:44 -07:00
Yonghong Song	f4db3dd528	selftests/bpf: Add a test for enum64 value relocations Add a test for enum64 value relocations. The test will be skipped if clang version is 14 or lower since enum64 is only supported from version 15. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062718.3726307-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:44 -07:00
Yonghong Song	adc26d134e	selftests/bpf: Test BTF_KIND_ENUM64 for deduplication Add a few unit tests for BTF_KIND_ENUM64 deduplication. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062713.3725409-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:44 -07:00
Yonghong Song	3b5325186d	selftests/bpf: Add BTF_KIND_ENUM64 unit tests Add unit tests for basic BTF_KIND_ENUM64 encoding. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062708.3724845-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:44 -07:00
Yonghong Song	2b7301457f	selftests/bpf: Test new enum kflag and enum64 API functions Add tests to use the new enum kflag and enum64 API functions in selftest btf_write. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062703.3724287-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:43 -07:00
Yonghong Song	d932815a43	selftests/bpf: Fix selftests failure The kflag is supported now for BTF_KIND_ENUM. So remove the test which tests verifier failure due to existence of kflag. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062657.3723737-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:43 -07:00
Yonghong Song	58a53978fd	bpftool: Add btf enum64 support Add BTF_KIND_ENUM64 support. For example, the following enum is defined in uapi bpf.h. $ cat core.c enum A { BPF_F_INDEX_MASK = 0xffffffffULL, BPF_F_CURRENT_CPU = BPF_F_INDEX_MASK, BPF_F_CTXLEN_MASK = (0xfffffULL << 32), } g; Compiled with clang -target bpf -O2 -g -c core.c Using bpftool to dump types and generate format C file: $ bpftool btf dump file core.o ... [1] ENUM64 'A' encoding=UNSIGNED size=8 vlen=3 'BPF_F_INDEX_MASK' val=4294967295ULL 'BPF_F_CURRENT_CPU' val=4294967295ULL 'BPF_F_CTXLEN_MASK' val=4503595332403200ULL $ bpftool btf dump file core.o format c ... enum A { BPF_F_INDEX_MASK = 4294967295ULL, BPF_F_CURRENT_CPU = 4294967295ULL, BPF_F_CTXLEN_MASK = 4503595332403200ULL, }; ... Note that for raw btf output, the encoding (UNSIGNED or SIGNED) is printed out as well. The 64bit value is also represented properly in BTF and C dump. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062652.3722649-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:43 -07:00
Yonghong Song	23b2a3a8f6	libbpf: Add enum64 relocation support The enum64 relocation support is added. The bpf local type could be either enum or enum64 and the remote type could be either enum or enum64 too. The all combinations of local enum/enum64 and remote enum/enum64 are supported. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062647.3721719-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:43 -07:00
Yonghong Song	6ec7d79be2	libbpf: Add enum64 support for bpf linking Add BTF_KIND_ENUM64 support for bpf linking, which is very similar to BTF_KIND_ENUM. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062642.3721494-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:43 -07:00
Yonghong Song	f2a625889b	libbpf: Add enum64 sanitization When old kernel does not support enum64 but user space btf contains non-zero enum kflag or enum64, libbpf needs to do proper sanitization so modified btf can be accepted by the kernel. Sanitization for enum kflag can be achieved by clearing the kflag bit. For enum64, the type is replaced with an union of integer member types and the integer member size must be smaller than enum64 size. If such an integer type cannot be found, a new type is created and used for union members. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062636.3721375-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:43 -07:00
Yonghong Song	d90ec262b3	libbpf: Add enum64 support for btf_dump Add enum64 btf dumping support. For long long and unsigned long long dump, suffixes 'LL' and 'ULL' are added to avoid compilation errors in some cases. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062631.3720526-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:43 -07:00
Yonghong Song	2ef2026349	libbpf: Add enum64 deduplication support Add enum64 deduplication support. BTF_KIND_ENUM64 handling is very similar to BTF_KIND_ENUM. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062626.3720166-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:43 -07:00
Yonghong Song	dffbbdc2d9	libbpf: Add enum64 parsing and new enum64 public API Add enum64 parsing support and two new enum64 public APIs: btf__add_enum64 btf__add_enum64_value Also add support of signedness for BTF_KIND_ENUM. The BTF_KIND_ENUM API signatures are not changed. The signedness will be changed from unsigned to signed if btf__add_enum_value() finds any negative values. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062621.3719391-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:43 -07:00
Yonghong Song	8479aa7522	libbpf: Refactor btf__add_enum() for future code sharing Refactor btf__add_enum() function to create a separate function btf_add_enum_common() so later the common function can be used to add enum64 btf type. There is no functionality change for this patch. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062615.3718063-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:42 -07:00
Yonghong Song	b58b2b3a31	libbpf: Fix an error in 64bit relocation value computation Currently, the 64bit relocation value in the instruction is computed as follows: __u64 imm = insn[0].imm + ((__u64)insn[1].imm << 32) Suppose insn[0].imm = -1 (0xffffffff) and insn[1].imm = 1. With the above computation, insn[0].imm will first sign-extend to 64bit -1 (0xffffffffFFFFFFFF) and then add 0x1FFFFFFFF, producing incorrect value 0xFFFFFFFF. The correct value should be 0x1FFFFFFFF. Changing insn[0].imm to __u32 first will prevent 64bit sign extension and fix the issue. Merging high and low 32bit values also changed from '+' to '\|' to be consistent with other similar occurences in kernel and libbpf. Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062610.3717378-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:42 -07:00
Yonghong Song	776281652d	libbpf: Permit 64bit relocation value Currently, the libbpf limits the relocation value to be 32bit since all current relocations have such a limit. But with BTF_KIND_ENUM64 support, the enum value could be 64bit. So let us permit 64bit relocation value in libbpf. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062605.3716779-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:42 -07:00
Yonghong Song	6089fb325c	bpf: Add btf enum64 support Currently, BTF only supports upto 32bit enum value with BTF_KIND_ENUM. But in kernel, some enum indeed has 64bit values, e.g., in uapi bpf.h, we have enum { BPF_F_INDEX_MASK = 0xffffffffULL, BPF_F_CURRENT_CPU = BPF_F_INDEX_MASK, BPF_F_CTXLEN_MASK = (0xfffffULL << 32), }; In this case, BTF_KIND_ENUM will encode the value of BPF_F_CTXLEN_MASK as 0, which certainly is incorrect. This patch added a new btf kind, BTF_KIND_ENUM64, which permits 64bit value to cover the above use case. The BTF_KIND_ENUM64 has the following three fields followed by the common type: struct bpf_enum64 { __u32 nume_off; __u32 val_lo32; __u32 val_hi32; }; Currently, btf type section has an alignment of 4 as all element types are u32. Representing the value with __u64 will introduce a pad for bpf_enum64 and may also introduce misalignment for the 64bit value. Hence, two members of val_hi32 and val_lo32 are chosen to avoid these issues. The kflag is also introduced for BTF_KIND_ENUM and BTF_KIND_ENUM64 to indicate whether the value is signed or unsigned. The kflag intends to provide consistent output of BTF C fortmat with the original source code. For example, the original BTF_KIND_ENUM bit value is 0xffffffff. The format C has two choices, printing out 0xffffffff or -1 and current libbpf prints out as unsigned value. But if the signedness is preserved in btf, the value can be printed the same as the original source code. The kflag value 0 means unsigned values, which is consistent to the default by libbpf and should also cover most cases as well. The new BTF_KIND_ENUM64 is intended to support the enum value represented as 64bit value. But it can represent all BTF_KIND_ENUM values as well. The compiler ([1]) and pahole will generate BTF_KIND_ENUM64 only if the value has to be represented with 64 bits. In addition, a static inline function btf_kind_core_compat() is introduced which will be used later when libbpf relo_core.c changed. Here the kernel shares the same relo_core.c with libbpf. [1] https://reviews.llvm.org/D124641 Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062600.3716578-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 10:20:42 -07:00
Hangbin Liu	02f4afebf8	selftests/bpf: Add drv mode testing for xdping As subject, we only test SKB mode for xdping at present. Now add DRV mode for xdping. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20220602032507.464453-1-liuhangbin@gmail.com	2022-06-03 14:53:34 -07:00

1 2 3 4 5 ...

1102887 Commits