linux

iv/linux

Author	SHA1	Message	Date
Rong Tao	b74344cbed	docs/bpf: Update btf selftests program and add link Commit c64779e24e88("selftests/bpf: Merge most of test_btf into test_progs") renamed the BTF selftest from 'test_btf.c' to 'prog_tests/btf.c'. Signed-off-by: Rong Tao <rongtao@cestc.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/tencent_1FA6904156E8E599CAE4ABDBE80F22830106@qq.com	2022-11-25 00:00:15 +01:00
Alexei Starovoitov	c6b0337f01	bpf: Don't mark arguments to fentry/fexit programs as trusted. The PTR_TRUSTED flag should only be applied to pointers where the verifier can guarantee that such pointers are valid. The fentry/fexit/fmod_ret programs are not in this category. Only arguments of SEC("tp_btf") and SEC("iter") programs are trusted (which have BPF_TRACE_RAW_TP and BPF_TRACE_ITER attach_type correspondingly) This bug was masked because convert_ctx_accesses() was converting trusted loads into BPF_PROBE_MEM loads. Fix it as well. The loads from trusted pointers don't need exception handling. Fixes: 3f00c5239344 ("bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221124215314.55890-1-alexei.starovoitov@gmail.com	2022-11-24 23:47:09 +01:00
Alexei Starovoitov	6099754a14	Merge branch 'bpf: Add bpf_rcu_read_lock() support' Yonghong Song says: ==================== Currently, without rcu attribute info in BTF, the verifier treats rcu tagged pointer as a normal pointer. This might be a problem for sleepable program where rcu_read_lock()/unlock() is not available. For example, for a sleepable fentry program, if rcu protected memory access is interleaved with a sleepable helper/kfunc, it is possible the memory access after the sleepable helper/kfunc might be invalid since the object might have been freed then. Even without a sleepable helper/kfunc, without rcu_read_lock() protection, it is possible that the rcu protected object might be release in the middle of bpf program execution which may cause incorrect result. To prevent above cases, enable btf_type_tag("rcu") attributes, introduce new bpf_rcu_read_lock/unlock() kfuncs and add verifier support. In the rest of patch set, Patch 1 enabled btf_type_tag for __rcu attribute. Patche 2 added might_sleep in bpf_func_proto. Patch 3 added new bpf_rcu_read_lock/unlock() kfuncs and verifier support. Patch 4 added some tests for these two new kfuncs. Changelogs: v9 -> v10: . if no rcu tag support in vmlinux btf, using bpf_rcu_read_lock/unlock() will cause verification error. . at bpf_rcu_read_unlock(), invalidate rcu ptr to PTR_UNTRUSTED instead of SCALAR_VALUE. . a few other comment changes and other minor changes. v8 -> v9: . remove sleepable prog check for ld_abs/ind checking in rcu read lock region. . fix a test failure with gcc-compiled kernel. . a couple of other minor fixes. v7 -> v8: . add might_sleep in bpf_func_proto so we can easily identify whether a helper is sleepable or not. . do not enforce rcu rules for sleepable, e.g., rcu dereference must be in a bpf_rcu_read_lock region. This is to keep old code working fine. . Mark 'b' in 'b = a->b' (b is tagged with __rcu) as MEM_RCU only if 'b = a->b' in rcu read region and 'a' is trusted. This adds safety guarantee for 'b' inside the rcu read region. v6 -> v7: . rebase on top of bpf-next. . remove the patch which enables sleepable program using cgrp_local_storage map. This is orthogonal to this patch set and will be addressed separately. . mark the rcu pointer dereference result as UNTRUSTED if inside a bpf_rcu_read_lock() region. v5 -> v6: . fix selftest prog miss_unlock which tested nested locking. . add comments in selftest prog cgrp_succ to explain how to handle nested memory access after rcu memory load. v4 -> v5: . add new test to aarch64 deny list. v3 -> v4: . fix selftest failures when built with gcc. gcc doesn't support btf_type_tag yet and some tests relies on that. skip these tests if vmlinux BTF does not have btf_type_tag("rcu"). v2 -> v3: . went back to MEM_RCU approach with invalidate rcu ptr registers at bpf_rcu_read_unlock() place. . remove KF_RCU_LOCK/UNLOCK flag and compare btf_id at verification time instead. v1 -> v2: . use kfunc instead of helper for bpf_rcu_read_lock/unlock. . not use MEM_RCU bpf_type_flag, instead use active_rcu_lock in reg state to identify rcu ptr's. . Add more self tests. . add new test to s390x deny list. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-24 12:54:34 -08:00
Yonghong Song	48671232fc	selftests/bpf: Add tests for bpf_rcu_read_lock() Add a few positive/negative tests to test bpf_rcu_read_lock() and its corresponding verifier support. The new test will fail on s390x and aarch64, so an entry is added to each of their respective deny lists. Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221124053222.2374650-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-24 12:54:34 -08:00
Yonghong Song	9bb00b2895	bpf: Add kfunc bpf_rcu_read_lock/unlock() Add two kfunc's bpf_rcu_read_lock() and bpf_rcu_read_unlock(). These two kfunc's can be used for all program types. The following is an example about how rcu pointer are used w.r.t. bpf_rcu_read_lock()/bpf_rcu_read_unlock(). struct task_struct { ... struct task_struct last_wakee; struct task_struct __rcu real_parent; ... }; Let us say prog does 'task = bpf_get_current_task_btf()' to get a 'task' pointer. The basic rules are: - 'real_parent = task->real_parent' should be inside bpf_rcu_read_lock region. This is to simulate rcu_dereference() operation. The 'real_parent' is marked as MEM_RCU only if (1). task->real_parent is inside bpf_rcu_read_lock region, and (2). task is a trusted ptr. So MEM_RCU marked ptr can be 'trusted' inside the bpf_rcu_read_lock region. - 'last_wakee = real_parent->last_wakee' should be inside bpf_rcu_read_lock region since it tries to access rcu protected memory. - the ptr 'last_wakee' will be marked as PTR_UNTRUSTED since in general it is not clear whether the object pointed by 'last_wakee' is valid or not even inside bpf_rcu_read_lock region. The verifier will reset all rcu pointer register states to untrusted at bpf_rcu_read_unlock() kfunc call site, so any such rcu pointer won't be trusted any more outside the bpf_rcu_read_lock() region. The current implementation does not support nested rcu read lock region in the prog. Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221124053217.2373910-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-24 12:54:13 -08:00
Yonghong Song	01685c5bdd	bpf: Introduce might_sleep field in bpf_func_proto Introduce bpf_func_proto->might_sleep to indicate a particular helper might sleep. This will make later check whether a helper might be sleepable or not easier. Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221124053211.2373553-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-24 12:27:13 -08:00
Yonghong Song	5a0f663f01	compiler_types: Define __rcu as __attribute__((btf_type_tag("rcu"))) Currently, without rcu attribute info in BTF, the verifier treats rcu tagged pointer as a normal pointer. This might be a problem for sleepable program where rcu_read_lock()/unlock() is not available. For example, for a sleepable fentry program, if rcu protected memory access is interleaved with a sleepable helper/kfunc, it is possible the memory access after the sleepable helper/kfunc might be invalid since the object might have been freed then. To prevent such cases, introducing rcu tagging for memory accesses in verifier can help to reject such programs. To enable rcu tagging in BTF, during kernel compilation, define __rcu as attribute btf_type_tag("rcu") so __rcu information can be preserved in dwarf and btf, and later can be used for bpf prog verification. Acked-by: KP Singh <kpsingh@kernel.org> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221124053206.2373141-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-24 12:27:13 -08:00
David Vernet	f471748b7f	selftests/bpf: Add selftests for bpf_task_from_pid() Add some selftest testcases that validate the expected behavior of the bpf_task_from_pid() kfunc that was added in the prior patch. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20221122145300.251210-3-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-23 17:45:30 -08:00
David Vernet	3f0e6f2b41	bpf: Add bpf_task_from_pid() kfunc Callers can currently store tasks as kptrs using bpf_task_acquire(), bpf_task_kptr_get(), and bpf_task_release(). These are useful if a caller already has a struct task_struct , but there may be some callers who only have a pid, and want to look up the associated struct task_struct from that to e.g. find task->comm. This patch therefore adds a new bpf_task_from_pid() kfunc which allows BPF programs to get a struct task_struct * kptr from a pid. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20221122145300.251210-2-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-23 17:45:23 -08:00
Stanislav Fomichev	5bad3587b7	bpf: Unify and simplify btf_func_proto_check error handling Replace 'err = x; break;' with 'return x;'. Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221124002838.2700179-1-sdf@google.com	2022-11-24 01:43:22 +01:00
Ji Rongfeng	72b43bde38	bpf: Update bpf_{g,s}etsockopt() documentation * append missing optnames to the end * simplify bpf_getsockopt()'s doc Signed-off-by: Ji Rongfeng <SikoJobs@outlook.com> Link: https://lore.kernel.org/r/DU0P192MB15479B86200B1216EC90E162D6099@DU0P192MB1547.EURP192.PROD.OUTLOOK.COM Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-11-23 16:33:59 -08:00
Donald Hunter	539886a32a	docs/bpf: Fix sphinx warnings in BPF map docs Fix duplicate C declaration warnings when using sphinx >= 3.1. Reported-by: Akira Yokosawa <akiyks@gmail.com> Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Akira Yokosawa <akiyks@gmail.com> Link: https://lore.kernel.org/bpf/ed4dac84-1b12-5c58-e4de-93ab9ac67c09@gmail.com Link: https://lore.kernel.org/bpf/20221122143933.91321-1-donald.hunter@gmail.com	2022-11-24 01:05:04 +01:00
Stanislav Fomichev	8e898aaa73	selftests/bpf: Add reproducer for decl_tag in func_proto argument It should trigger a WARN_ON_ONCE in btf_type_id_size: RIP: 0010:btf_type_id_size+0x8bd/0x940 kernel/bpf/btf.c:1952 btf_func_proto_check kernel/bpf/btf.c:4506 [inline] btf_check_all_types kernel/bpf/btf.c:4734 [inline] btf_parse_type_sec+0x1175/0x1980 kernel/bpf/btf.c:4763 btf_parse kernel/bpf/btf.c:5042 [inline] btf_new_fd+0x65a/0xb00 kernel/bpf/btf.c:6709 bpf_btf_load+0x6f/0x90 kernel/bpf/syscall.c:4342 __sys_bpf+0x50a/0x6c0 kernel/bpf/syscall.c:5034 __do_sys_bpf kernel/bpf/syscall.c:5093 [inline] __se_sys_bpf kernel/bpf/syscall.c:5091 [inline] __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5091 do_syscall_64+0x54/0x70 arch/x86/entry/common.c:48 Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20221123035422.872531-1-sdf@google.com	2022-11-24 00:50:06 +01:00
Stanislav Fomichev	f17472d459	bpf: Prevent decl_tag from being referenced in func_proto arg Syzkaller managed to hit another decl_tag issue: btf_func_proto_check kernel/bpf/btf.c:4506 [inline] btf_check_all_types kernel/bpf/btf.c:4734 [inline] btf_parse_type_sec+0x1175/0x1980 kernel/bpf/btf.c:4763 btf_parse kernel/bpf/btf.c:5042 [inline] btf_new_fd+0x65a/0xb00 kernel/bpf/btf.c:6709 bpf_btf_load+0x6f/0x90 kernel/bpf/syscall.c:4342 __sys_bpf+0x50a/0x6c0 kernel/bpf/syscall.c:5034 __do_sys_bpf kernel/bpf/syscall.c:5093 [inline] __se_sys_bpf kernel/bpf/syscall.c:5091 [inline] __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5091 do_syscall_64+0x54/0x70 arch/x86/entry/common.c:48 This seems similar to commit ea68376c8bed ("bpf: prevent decl_tag from being referenced in func_proto") but for the argument. Reported-by: syzbot+8dd0551dda6020944c5d@syzkaller.appspotmail.com Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20221123035422.872531-2-sdf@google.com	2022-11-24 00:48:50 +01:00
Donald Hunter	264c21867a	docs/bpf: Document BPF_MAP_TYPE_BLOOM_FILTER Add documentation for BPF_MAP_TYPE_BLOOM_FILTER including kernel BPF helper usage, userspace usage and examples. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Joanne Koong <joannelkoong@gmail.com> Link: https://lore.kernel.org/bpf/20221123141151.54556-1-donald.hunter@gmail.com	2022-11-23 22:47:32 +01:00
Maryam Tahhan	c645eee4d3	docs/bpf: Fix sphinx warnings for devmap Sphinx version >=3.1 warns about duplicate function declarations in the DEVMAP documentation. This is because the function name is the same for kernel and user space BPF progs but the parameters and return types they take is what differs. This patch moves from using the ``c:function::`` directive to using the ``code-block:: c`` directive. The patches also fix the indentation for the text associated with the "new" code block delcarations. The missing support of c:namespace-push:: and c:namespace-pop:: directives by helper scripts for kernel documentation prevents using the ``c:function::`` directive with proper namespacing. Signed-off-by: Maryam Tahhan <mtahhan@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221123092321.88558-3-mtahhan@redhat.com	2022-11-23 22:40:27 +01:00
Maryam Tahhan	3685b0dc0d	docs/bpf: Fix sphinx warnings for cpumap Sphinx version >=3.1 warns about duplicate function declarations in the CPUMAP documentation. This is because the function name is the same for kernel and user space BPF progs but the parameters and return types they take is what differs. This patch moves from using the ``c:function::`` directive to using the ``code-block:: c`` directive. The patches also fix the indentation for the text associated with the "new" code block delcarations. The missing support of c:namespace-push:: and c:namespace-pop:: directives by helper scripts for kernel documentation prevents using the ``c:function::`` directive with proper namespacing. Signed-off-by: Maryam Tahhan <mtahhan@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221123092321.88558-2-mtahhan@redhat.com	2022-11-23 22:38:53 +01:00
Donald Hunter	c742cb7c3e	docs/bpf: Add table of BPF program types to libbpf docs Extend the libbpf documentation with a table of program types, attach points and ELF section names. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20221121121734.98329-1-donald.hunter@gmail.com	2022-11-23 13:31:20 -08:00
Yonghong Song	beb3d47d1d	bpf: Fix a BTF_ID_LIST bug with CONFIG_DEBUG_INFO_BTF not set With CONFIG_DEBUG_INFO_BTF not set, we hit the following compilation error, /.../kernel/bpf/verifier.c:8196:23: error: array index 6 is past the end of the array (that has type 'u32[5]' (aka 'unsigned int[5]')) [-Werror,-Warray-bounds] if (meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx]) ^ ~~~~~~~~~~~~~~~~~~~~~~~ /.../kernel/bpf/verifier.c:8174:1: note: array 'special_kfunc_list' declared here BTF_ID_LIST(special_kfunc_list) ^ /.../include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST' #define BTF_ID_LIST(name) static u32 __maybe_unused name[5]; ^ /.../kernel/bpf/verifier.c:8443:19: error: array index 5 is past the end of the array (that has type 'u32[5]' (aka 'unsigned int[5]')) [-Werror,-Warray-bounds] btf_id == special_kfunc_list[KF_bpf_list_pop_back]; ^ ~~~~~~~~~~~~~~~~~~~~ /.../kernel/bpf/verifier.c:8174:1: note: array 'special_kfunc_list' declared here BTF_ID_LIST(special_kfunc_list) ^ /.../include/linux/btf_ids.h:207:27: note: expanded from macro 'BTF_ID_LIST' #define BTF_ID_LIST(name) static u32 __maybe_unused name[5]; ... Fix the problem by increase the size of BTF_ID_LIST to 16 to avoid compilation error and also prevent potentially unintended issue due to out-of-bound access. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221123155759.2669749-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-23 13:20:33 -08:00
Stanislav Fomichev	8ac88eece8	selftests/bpf: Mount debugfs in setns_by_fd Jiri reports broken test_progs after recent commit 68f8e3d4b916 ("selftests/bpf: Make sure zero-len skbs aren't redirectable"). Apparently we don't remount debugfs when we switch back networking namespace. Let's explicitly mount /sys/kernel/debug. 0: https://lore.kernel.org/bpf/63b85917-a2ea-8e35-620c-808560910819@meta.com/T/#ma66ca9c92e99eee0a25e40f422489b26ee0171c1 Fixes: a30338840fa5 ("selftests/bpf: Move open_netns() and close_netns() into network_helpers.c") Reported-by: Jiri Olsa <olsajiri@gmail.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20221123200829.2226254-1-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-23 12:38:16 -08:00
David Vernet	2fcc6081a7	bpf: Don't use idx variable when registering kfunc dtors In commit fda01efc6160 ("bpf: Enable cgroups to be used as kptrs"), I added an 'int idx' variable to kfunc_init() which was meant to dynamically set the index of the btf id entries of the 'generic_dtor_ids' array. This was done to make the code slightly less brittle as the struct cgroup * kptr kfuncs such as bpf_cgroup_aquire() are compiled out if CONFIG_CGROUPS is not defined. This, however, causes an lkp build warning: >> kernel/bpf/helpers.c:2005:40: warning: multiple unsequenced modifications to 'idx' [-Wunsequenced] .btf_id = generic_dtor_ids[idx++], Fix the warning by just hard-coding the indices. Fixes: fda01efc6160 ("bpf: Enable cgroups to be used as kptrs") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: David Vernet <void@manifault.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221123135253.637525-1-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-23 11:58:34 -08:00
Alexei Starovoitov	8a2162a922	Merge branch 'Support storing struct cgroup * objects as kptrs' David Vernet says: ==================== In [0], we added support for storing struct task_struct * objects as kptrs. This patch set extends this effort to also include storing struct cgroup * object as kptrs. As with tasks, there are many possible use cases for storing cgroups in maps. During tracing, for example, it could be useful to query cgroup statistics such as PSI averages, or tracking which tasks are migrating to and from the cgroup. [0]: https://lore.kernel.org/all/20221120051004.3605026-1-void@manifault.com/ ==================== Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-22 14:46:54 -08:00
David Vernet	227a89cf50	selftests/bpf: Add selftests for bpf_cgroup_ancestor() kfunc bpf_cgroup_ancestor() allows BPF programs to access the ancestor of a struct cgroup *. This patch adds selftests that validate its expected behavior. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20221122055458.173143-5-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-22 14:45:41 -08:00
David Vernet	5ca7867078	bpf: Add bpf_cgroup_ancestor() kfunc struct cgroup * objects have a variably sized struct cgroup *ancestors[] field which stores pointers to their ancestor cgroups. If using a cgroup as a kptr, it can be useful to access these ancestors, but doing so requires variable offset accesses for PTR_TO_BTF_ID, which is currently unsupported. This is a very useful field to access for cgroup kptrs, as programs may wish to walk their ancestor cgroups when determining e.g. their proportional cpu.weight. So as to enable this functionality with cgroup kptrs before var_off is supported for PTR_TO_BTF_ID, this patch adds a bpf_cgroup_ancestor() kfunc which accesses the cgroup node on behalf of the caller, and acquires a reference on it. Once var_off is supported for PTR_TO_BTF_ID, and fields inside a struct can be marked as trusted so they retain the PTR_TRUSTED modifier when walked, this can be removed. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20221122055458.173143-4-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-22 14:45:41 -08:00
David Vernet	f583ddf15e	selftests/bpf: Add cgroup kfunc / kptr selftests This patch adds a selftest suite to validate the cgroup kfuncs that were added in the prior patch. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20221122055458.173143-3-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-22 14:45:41 -08:00
David Vernet	fda01efc61	bpf: Enable cgroups to be used as kptrs Now that tasks can be used as kfuncs, and the PTR_TRUSTED flag is available for us to easily add basic acquire / get / release kfuncs, we can do the same for cgroups. This patch set adds the following kfuncs which enable using cgroups as kptrs: struct cgroup bpf_cgroup_acquire(struct cgroup cgrp); struct cgroup bpf_cgroup_kptr_get(struct cgroup cgrpp); void bpf_cgroup_release(struct cgroup cgrp); A follow-on patch will add a selftest suite which validates these kfuncs. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20221122055458.173143-2-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-22 14:45:41 -08:00
Alexei Starovoitov	dc79f035b2	selftests/bpf: Workaround for llvm nop-4 bug Currently LLVM fails to recognize .data.* as data section and defaults to .text section. Later BPF backend tries to emit 4-byte NOP instruction which doesn't exist in BPF ISA and aborts. The fix for LLVM is pending: https://reviews.llvm.org/D138477 While waiting for the fix lets workaround the linked_list test case by using .bss.* prefix which is properly recognized by LLVM as BSS section. Fix libbpf to support .bss. prefix and adjust tests. Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-22 13:34:08 -08:00
Alexei Starovoitov	0b2971a270	Revert "selftests/bpf: Temporarily disable linked list tests" This reverts commit 0a2f85a1be4328d29aefa54684d10c23a3298fef. Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-22 13:33:56 -08:00
Stanislav Fomichev	68f8e3d4b9	selftests/bpf: Make sure zero-len skbs aren't redirectable LWT_XMIT to test L3 case, TC to test L2 case. v2: - s/veth_ifindex/ipip_ifindex/ in two places (Martin) - add comment about which condition triggers the rejection (Martin) Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20221121180340.1983627-2-sdf@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-11-21 12:47:04 -08:00
Stanislav Fomichev	114039b342	bpf: Move skb->len == 0 checks into __bpf_redirect To avoid potentially breaking existing users. Both mac/no-mac cases have to be amended; mac_header >= network_header is not enough (verified with a new test, see next patch). Fixes: fd1894224407 ("bpf: Don't redirect packets with invalid pkt_len") Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20221121180340.1983627-1-sdf@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-11-21 12:47:04 -08:00
Hou Tao	8589e92675	selftests/bpf: Add test for cgroup iterator on a dead cgroup The test closes both iterator link fd and cgroup fd, and removes the cgroup file to make a dead cgroup before reading from cgroup iterator. It also uses kern_sync_rcu() and usleep() to wait for the release of start cgroup. If the start cgroup is not pinned by cgroup iterator, reading from iterator fd will trigger use-after-free. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/bpf/20221121073440.1828292-4-houtao@huaweicloud.com	2022-11-21 17:40:42 +01:00
Hou Tao	2a42461a88	selftests/bpf: Add cgroup helper remove_cgroup() Add remove_cgroup() to remove a cgroup which doesn't have any children or live processes. It will be used by the following patch to test cgroup iterator on a dead cgroup. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20221121073440.1828292-3-houtao@huaweicloud.com	2022-11-21 17:40:42 +01:00
Hou Tao	1a5160d4d8	bpf: Pin the start cgroup in cgroup_iter_seq_init() bpf_iter_attach_cgroup() has already acquired an extra reference for the start cgroup, but the reference may be released if the iterator link fd is closed after the creation of iterator fd, and it may lead to user-after-free problem when reading the iterator fd. An alternative fix is pinning iterator link when opening iterator, but it will make iterator link being still visible after the close of iterator link fd and the behavior is different with other link types, so just fixing it by acquiring another reference for the start cgroup. Fixes: d4ccaf58a847 ("bpf: Introduce cgroup iter") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20221121073440.1828292-2-houtao@huaweicloud.com	2022-11-21 17:40:42 +01:00
Kees Cook	ceb35b666d	bpf/verifier: Use kmalloc_size_roundup() to match ksize() usage Most allocation sites in the kernel want an explicitly sized allocation (and not "more"), and that dynamic runtime analysis tools (e.g. KASAN, UBSAN_BOUNDS, FORTIFY_SOURCE, etc) are looking for precise bounds checking (i.e. not something that is rounded up). A tiny handful of allocations were doing an implicit alloc/realloc loop that actually depended on ksize(), and didn't actually always call realloc. This has created a long series of bugs and problems over many years related to the runtime bounds checking, so these callers are finally being adjusted to _not_ depend on the ksize() side-effect, by doing one of several things: - tracking the allocation size precisely and just never calling ksize() at all [1]. - always calling realloc and not using ksize() at all. (This solution ends up actually be a subset of the next solution.) - using kmalloc_size_roundup() to explicitly round up the desired allocation size immediately [2]. The bpf/verifier case is this another of this latter case, and is the last outstanding case to be fixed in the kernel. Because some of the dynamic bounds checking depends on the size being an _argument_ to an allocator function (i.e. see the __alloc_size attribute), the ksize() users are rare, and it could waste local variables, it was been deemed better to explicitly separate the rounding up from the allocation itself [3]. Round up allocations with kmalloc_size_roundup() so that the verifier's use of ksize() is always accurate. [1] e.g.: https://git.kernel.org/linus/712f210a457d https://git.kernel.org/linus/72c08d9f4c72 [2] e.g.: https://git.kernel.org/netdev/net-next/c/12d6c1d3a2ad https://git.kernel.org/netdev/net-next/c/ab3f7828c979 https://git.kernel.org/netdev/net-next/c/d6dd508080a3 [3] https://lore.kernel.org/lkml/0ea1fc165a6c6117f982f4f135093e69cb884930.camel@redhat.com/ Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20221118183409.give.387-kees@kernel.org	2022-11-21 15:07:04 +01:00
Alexei Starovoitov	35ffb1d9bf	Merge branch 'clean-up bpftool from legacy support' Sahid Orentino Ferdjaoui says: ==================== As part of commit 93b8952d223a ("libbpf: deprecate legacy BPF map definitions") and commit bd054102a8c7 ("libbpf: enforce strict libbpf 1.0 behaviors") The --legacy option is not relevant anymore. #1 is removing it. #4 is cleaning the code from using libbpf_get_error(). About patches #2 and #3 They are changes discovered while working on this series (credits to Quentin Monnet). #2 is cleaning-up usage of an unnecessary PTR_ERR(NULL), finally #3 is fixing an invalid value passed to strerror(). v1 -> v2: - Addressed review comments from Yonghong Song on patch #4 - Added a patch #5 that removes unwanted function noticed by Yonghong Song v2 -> v3 - Addressed review comments from Andrii Nakryiko on patch #2, #3, #4 * clean-up usage of libbpf_get_error() (#2, #3) * fix possible return of an uninitialized local variable err * fix returned errors using errno v3 -> v4 - Addressed review comments from Quentin Monnet * fix line moved from patch #2 to patch #3 * fix missing returned errors using errno * fix some returned values to errno instead of -1 ==================== Reviewed-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-20 16:18:25 -08:00
Sahid Orentino Ferdjaoui	52df1a8aab	bpftool: remove function free_btf_vmlinux() The function contains a single btf__free() call which can be inlined. Credits to Yonghong Song. Signed-off-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@industrialdiscipline.com> Acked-by: Yonghong Song <yhs@fb.com> Suggested-by: Yonghong Song <yhs@fb.com> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20221120112515.38165-6-sahid.ferdjaoui@industrialdiscipline.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-20 16:17:46 -08:00
Sahid Orentino Ferdjaoui	d1313e0127	bpftool: clean-up usage of libbpf_get_error() bpftool is now totally compliant with libbpf 1.0 mode and is not expected to be compiled with pre-1.0, let's clean-up the usage of libbpf_get_error(). The changes stay aligned with returned errors always negative. - In tools/bpf/bpftool/btf.c This fixes an uninitialized local variable `err` in function do_dump() because it may now be returned without having been set. - This also removes the checks on NULL pointers before calling btf__free() because that function already does the check. Signed-off-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@industrialdiscipline.com> Link: https://lore.kernel.org/r/20221120112515.38165-5-sahid.ferdjaoui@industrialdiscipline.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-20 16:17:46 -08:00
Sahid Orentino Ferdjaoui	d2973ffd25	bpftool: fix error message when function can't register struct_ops It is expected that errno be passed to strerror(). This also cleans this part of code from using libbpf_get_error(). Signed-off-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@industrialdiscipline.com> Acked-by: Yonghong Song <yhs@fb.com> Suggested-by: Quentin Monnet <quentin@isovalent.com> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20221120112515.38165-4-sahid.ferdjaoui@industrialdiscipline.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-20 16:17:46 -08:00
Sahid Orentino Ferdjaoui	989f285159	bpftool: replace return value PTR_ERR(NULL) with 0 There is no reasons to keep PTR_ERR() when kern_btf=NULL, let's just return 0. This also cleans this part of code from using libbpf_get_error(). Signed-off-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@industrialdiscipline.com> Acked-by: Yonghong Song <yhs@fb.com> Suggested-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20221120112515.38165-3-sahid.ferdjaoui@industrialdiscipline.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-20 16:17:46 -08:00
Sahid Orentino Ferdjaoui	9b81075534	bpftool: remove support of --legacy option for bpftool Following: commit bd054102a8c7 ("libbpf: enforce strict libbpf 1.0 behaviors") commit 93b8952d223a ("libbpf: deprecate legacy BPF map definitions") The --legacy option is no longer relevant as libbpf no longer supports it. libbpf_set_strict_mode() is a no-op operation. Signed-off-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@industrialdiscipline.com> Acked-by: Yonghong Song <yhs@fb.com> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20221120112515.38165-2-sahid.ferdjaoui@industrialdiscipline.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-20 16:17:46 -08:00
Alexei Starovoitov	99429b224f	Merge branch 'bpf: Implement two type cast kfuncs' Yonghong Song says: ==================== Currenty, a non-tracing bpf program typically has a single 'context' argument with predefined uapi struct type. Following these uapi struct, user is able to access other fields defined in uapi header. Inside the kernel, the user-seen 'context' argument is replaced with 'kernel context' (or 'kctx' in short) which can access more information than what uapi header provides. To access other info not in uapi header, people typically do two things: (1). extend uapi to access more fields rooted from 'context'. (2). use bpf_probe_read_kernl() helper to read particular field based on kctx. Using (1) needs uapi change and using (2) makes code more complex since direct memory access is not allowed. There are already a few instances trying to access more information from kctx: . trying to access some fields from perf_event kctx ([1]). . trying to access some fields from xdp kctx ([2]). This patch set tried to allow direct memory access for kctx fields by introducing bpf_cast_to_kern_ctx() kfunc. Martin mentioned a use case like type casting below: #define skb_shinfo(SKB) ((struct skb_shared_info )(skb_end_pointer(SKB))) basically a 'unsigned char " casted to 'struct skb_shared_info *'. This patch set tries to support such a use case as well with bpf_rdonly_cast(). For the patch series, Patch 1 added support for a kfunc available to all prog types. Patch 2 added bpf_cast_to_kern_ctx() kfunc. Patch 3 added bpf_rdonly_cast() kfunc. Patch 4 added a few positive and negative tests. [1] https://lore.kernel.org/bpf/ad15b398-9069-4a0e-48cb-4bb651ec3088@meta.com/ [2] https://lore.kernel.org/bpf/20221109215242.1279993-1-john.fastabend@gmail.com/ Changelog: v3 -> v4: - remove unnecessary bpf_ctx_convert.t error checking - add and use meta.ret_btf_id instead of meta.arg_constant.value for bpf_cast_to_kern_ctx(). - add PTR_TRUSTED to the return PTR_TO_BTF_ID type for bpf_cast_to_kern_ctx(). v2 -> v3: - rebase on top of bpf-next (for merging conflicts) - add the selftest to s390x deny list rfcv1 -> v2: - break original one kfunc into two. - add missing error checks and error logs. - adapt to the new conventions in https://lore.kernel.org/all/20221118015614.2013203-1-memxor@gmail.com/ for example, with __ign and __k suffix. - added support in fixup_kfunc_call() to replace kfunc calls with a single mov. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-20 15:45:48 -08:00
Yonghong Song	58d84bee58	bpf: Add type cast unit tests Three tests are added. One is from John Fastabend ({1]) which tests tracing style access for xdp program from the kernel ctx. Another is a tc test to test both kernel ctx tracing style access and explicit non-ctx type cast. The third one is for negative tests including two tests, a tp_bpf test where the bpf_rdonly_cast() returns a untrusted ptr which cannot be used as helper argument, and a tracepoint test where the kernel ctx is a u64. Also added the test to DENYLIST.s390x since s390 does not currently support calling kernel functions in JIT mode. [1] https://lore.kernel.org/bpf/20221109215242.1279993-1-john.fastabend@gmail.com/ Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221120195442.3114844-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-20 15:45:48 -08:00
Yonghong Song	a35b9af4ec	bpf: Add a kfunc for generic type cast Implement bpf_rdonly_cast() which tries to cast the object to a specified type. This tries to support use case like below: #define skb_shinfo(SKB) ((struct skb_shared_info )(skb_end_pointer(SKB))) where skb_end_pointer(SKB) is a 'unsigned char ' and needs to be casted to 'struct skb_shared_info '. The signature of bpf_rdonly_cast() looks like void bpf_rdonly_cast(void *obj, __u32 btf_id) The function returns the same 'obj' but with PTR_TO_BTF_ID with btf_id. The verifier will ensure btf_id being a struct type. Since the supported type cast may not reflect what the 'obj' represents, the returned btf_id is marked as PTR_UNTRUSTED, so the return value and subsequent pointer chasing cannot be used as helper/kfunc arguments. Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221120195437.3114585-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-20 15:45:26 -08:00
Yonghong Song	fd264ca020	bpf: Add a kfunc to type cast from bpf uapi ctx to kernel ctx Implement bpf_cast_to_kern_ctx() kfunc which does a type cast of a uapi ctx object to the corresponding kernel ctx. Previously if users want to access some data available in kctx but not in uapi ctx, bpf_probe_read_kernel() helper is needed. The introduction of bpf_cast_to_kern_ctx() allows direct memory access which makes code simpler and easier to understand. Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221120195432.3113982-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-20 15:43:37 -08:00
Yonghong Song	cfe1456440	bpf: Add support for kfunc set with common btf_ids Later on, we will introduce kfuncs bpf_cast_to_kern_ctx() and bpf_rdonly_cast() which apply to all program types. Currently kfunc set only supports individual prog types. This patch added support for kfunc applying to all program types. Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221120195426.3113828-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-20 15:43:37 -08:00
Kumar Kartikeya Dwivedi	e181d3f143	bpf: Disallow bpf_obj_new_impl call when bpf_mem_alloc_init fails In the unlikely event that bpf_global_ma is not correctly initialized, instead of checking the boolean everytime bpf_obj_new_impl is called, simply check it while loading the program and return an error if bpf_global_ma_set is false. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20221120212610.2361700-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-20 15:38:28 -08:00
Alexei Starovoitov	efc1970d68	Merge branch 'Support storing struct task_struct objects as kptrs' David Vernet says: ==================== Now that BPF supports adding new kernel functions with kfuncs, and storing kernel objects in maps with kptrs, we can add a set of kfuncs which allow struct task_struct objects to be stored in maps as referenced kptrs. The possible use cases for doing this are plentiful. During tracing, for example, it would be useful to be able to collect some tasks that performed a certain operation, and then periodically summarize who they are, which cgroup they're in, how much CPU time they've utilized, etc. Doing this now would require storing the tasks' pids along with some relevant data to be exported to user space, and later associating the pids to tasks in other event handlers where the data is recorded. Another useful by-product of this is that it allows a program to pin a task in a BPF program, and by proxy therefore also e.g. pin its task local storage. In order to support this, we'll need to expand KF_TRUSTED_ARGS to support receiving trusted, non-refcounted pointers. It currently only supports either PTR_TO_CTX pointers, or refcounted pointers. What this means in terms of the implementation is that check_kfunc_args() would have to also check for the PTR_TRUSTED or MEM_ALLOC type modifiers when determining if a trusted KF_ARG_PTR_TO_ALLOC_BTF_ID or KF_ARG_PTR_TO_BTF_ID pointer requires a refcount. Note that PTR_UNTRUSTED is insufficient for this purpose, as it does not cover all of the possible types of potentially unsafe pointers. For example, a pointer obtained from walking a struct is not PTR_UNTRUSTED. To account for this and enable us to expand KF_TRUSTED_ARGS to include allow-listed arguments such as those passed by the kernel to tracepoints and struct_ops callbacks, this patch set also introduces a new PTR_TRUSTED type flag modifier which records if a pointer was obtained passed from the kernel in a trusted context. Currently, both PTR_TRUSTED and MEM_ALLOC are used to imply that a pointer is trusted. Longer term, PTR_TRUSTED should be the sole source of truth for whether a pointer is trusted. This requires us to set PTR_TRUSTED when appropriate (e.g. when setting MEM_ALLOC), and unset it when appropriate (e.g. when setting PTR_UNTRUSTED). We don't do that in this patch, as we need to do more clean up before this can be done in a clear and well-defined manner. In closing, this patch set: 1. Adds the new PTR_TRUSTED register type modifier flag, and updates the verifier and existing selftests accordingly. Also expands KF_TRUSTED_ARGS to also include trusted pointers that were not obtained from walking structs. 2. Adds a new set of kfuncs that allows struct task_struct* objects to be used as kptrs. 3. Adds a new selftest suite to validate these new task kfuncs. --- Changelog: v8 -> v9: - Moved check for release register back to where we check for !PTR_TO_BTF_ID \|\| socket. Change the verifier log message to reflect really what's being tested (the presence of unsafe modifiers) (Alexei) - Fix verifier_test error tests to reflect above changes - Remove unneeded parens around bitwise operator checks (Alexei) - Move updates to reg_type_str() which allow multiple type modifiers to be present in the prefix string, to a separate patch (Alexei) - Increase TYPE_STR_BUF_LEN size to 128 to reflect larger prefix size in reg_type_str(). v7 -> v8: - Rebased onto Kumar's latest patch set which, adds a new MEM_ALLOC reg type modifier for bpf_obj_new() calls. - Added comments to bpf_task_kptr_get() describing some of the subtle races we're protecting against (Alexei and John) - Slightly rework process_kf_arg_ptr_to_btf_id(), and add a new reg_has_unsafe_modifiers() function which validates that a register containing a kfunc release arg doesn't have unsafe modifiers. Note that this is slightly different than the check for KF_TRUSTED_ARGS. An alternative here would be to treat KF_RELEASE as implicitly requiring KF_TRUSTED_ARGS. - Export inline bpf_type_has_unsafe_modifiers() function from bpf_verifier.h so that it can be used from bpf_tcp_ca.c. Eventually this function should likely be changed to bpf_type_is_trusted(), once PTR_TRUSTED is the real source of truth. v6 -> v7: - Removed the PTR_WALKED type modifier, and instead define a new PTR_TRUSTED type modifier which is set on registers containing pointers passed from trusted contexts (i.e. as tracepoint or struct_ops callback args) (Alexei) - Remove the new KF_OWNED_ARGS kfunc flag. This can be accomplished by defining a new type that wraps an existing type, such as with struct nf_conn___init (Alexei) - Add a test_task_current_acquire_release testcase which verifies we can acquire a task struct returned from bpf_get_current_task_btf(). - Make bpf_task_acquire() no longer return NULL, as it can no longer be called with a NULL task. - Removed unnecessary is_test_kfunc_task() checks from failure testcases. v5 -> v6: - Add a new KF_OWNED_ARGS kfunc flag which may be used by kfuncs to express that they require trusted, refcounted args (Kumar) - Rename PTR_NESTED -> PTR_WALKED in the verifier (Kumar) - Convert reg_type_str() prefixes to use snprintf() instead of strncpy() (Kumar) - Add PTR_TO_BTF_ID \| PTR_WALKED to missing struct btf_reg_type instances -- specifically btf_id_sock_common_types, and percpu_btf_ptr_types. - Add a missing PTR_TO_BTF_ID \| PTR_WALKED switch case entry in check_func_arg_reg_off(), which is required when validating helper calls (Kumar) - Update reg_type_mismatch_ok() to check base types for the registers (i.e. to accommodate type modifiers). Additionally, add a lengthy comment that explains why this is being done (Kumar) - Update convert_ctx_accesses() to also issue probe reads for PTR_TO_BTF_ID \| PTR_WALKED (Kumar) - Update selftests to expect new prefix reg type strings. - Rename task_kfunc_acquire_trusted_nested testcase to task_kfunc_acquire_trusted_walked, and fix a comment (Kumar) - Remove KF_TRUSTED_ARGS from bpf_task_release(), which already includes KF_RELEASE (Kumar) - Add bpf-next in patch subject lines (Kumar) v4 -> v5: - Fix an improperly formatted patch title. v3 -> v4: - Remove an unnecessary check from my repository that I forgot to remove after debugging something. v2 -> v3: - Make bpf_task_acquire() check for NULL, and include KF_RET_NULL (Martin) - Include new PTR_NESTED register modifier type flag which specifies whether a pointer was obtained from walking a struct. Use this to expand the meaning of KF_TRUSTED_ARGS to include trusted pointers that were passed from the kernel (Kumar) - Add more selftests to the task_kfunc selftest suite which verify that you cannot pass a walked pointer to bpf_task_acquire(). - Update bpf_task_acquire() to also specify KF_TRUSTED_ARGS. v1 -> v2: - Rename tracing_btf_ids to generic_kfunc_btf_ids, and add the new kfuncs to that list instead of making a separate btf id list (Alexei). - Don't run the new selftest suite on s390x, which doesn't appear to support invoking kfuncs. - Add a missing __diag_ignore block for -Wmissing-prototypes (lkp@intel.com). - Fix formatting on some of the SPDX-License-Identifier tags. - Clarified the function header comment a bit on bpf_task_kptr_get(). ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-20 09:16:21 -08:00
David Vernet	fe147956fc	bpf/selftests: Add selftests for new task kfuncs A previous change added a series of kfuncs for storing struct task_struct objects as referenced kptrs. This patch adds a new task_kfunc test suite for validating their expected behavior. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20221120051004.3605026-5-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-20 09:16:21 -08:00
David Vernet	90660309b0	bpf: Add kfuncs for storing struct task_struct * as a kptr Now that BPF supports adding new kernel functions with kfuncs, and storing kernel objects in maps with kptrs, we can add a set of kfuncs which allow struct task_struct objects to be stored in maps as referenced kptrs. The possible use cases for doing this are plentiful. During tracing, for example, it would be useful to be able to collect some tasks that performed a certain operation, and then periodically summarize who they are, which cgroup they're in, how much CPU time they've utilized, etc. In order to enable this, this patch adds three new kfuncs: struct task_struct bpf_task_acquire(struct task_struct p); struct task_struct bpf_task_kptr_get(struct task_struct pp); void bpf_task_release(struct task_struct p); A follow-on patch will add selftests validating these kfuncs. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20221120051004.3605026-4-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-20 09:16:21 -08:00
David Vernet	3f00c52393	bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs Kfuncs currently support specifying the KF_TRUSTED_ARGS flag to signal to the verifier that it should enforce that a BPF program passes it a "safe", trusted pointer. Currently, "safe" means that the pointer is either PTR_TO_CTX, or is refcounted. There may be cases, however, where the kernel passes a BPF program a safe / trusted pointer to an object that the BPF program wishes to use as a kptr, but because the object does not yet have a ref_obj_id from the perspective of the verifier, the program would be unable to pass it to a KF_ACQUIRE \| KF_TRUSTED_ARGS kfunc. The solution is to expand the set of pointers that are considered trusted according to KF_TRUSTED_ARGS, so that programs can invoke kfuncs with these pointers without getting rejected by the verifier. There is already a PTR_UNTRUSTED flag that is set in some scenarios, such as when a BPF program reads a kptr directly from a map without performing a bpf_kptr_xchg() call. These pointers of course can and should be rejected by the verifier. Unfortunately, however, PTR_UNTRUSTED does not cover all the cases for safety that need to be addressed to adequately protect kfuncs. Specifically, pointers obtained by a BPF program "walking" a struct are _not_ considered PTR_UNTRUSTED according to BPF. For example, say that we were to add a kfunc called bpf_task_acquire(), with KF_ACQUIRE \| KF_TRUSTED_ARGS, to acquire a struct task_struct . If we only used PTR_UNTRUSTED to signal that a task was unsafe to pass to a kfunc, the verifier would mistakenly allow the following unsafe BPF program to be loaded: SEC("tp_btf/task_newtask") int BPF_PROG(unsafe_acquire_task, struct task_struct task, u64 clone_flags) { struct task_struct acquired, nested; nested = task->last_wakee; /* Would not be rejected by the verifier. / acquired = bpf_task_acquire(nested); if (!acquired) return 0; bpf_task_release(acquired); return 0; } To address this, this patch defines a new type flag called PTR_TRUSTED which tracks whether a PTR_TO_BTF_ID pointer is safe to pass to a KF_TRUSTED_ARGS kfunc or a BPF helper function. PTR_TRUSTED pointers are passed directly from the kernel as a tracepoint or struct_ops callback argument. Any nested pointer that is obtained from walking a PTR_TRUSTED pointer is no longer PTR_TRUSTED. From the example above, the struct task_struct task argument is PTR_TRUSTED, but the 'nested' pointer obtained from 'task->last_wakee' is not PTR_TRUSTED. A subsequent patch will add kfuncs for storing a task kfunc as a kptr, and then another patch will add selftests to validate. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20221120051004.3605026-3-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-20 09:16:21 -08:00

1 2 3 4 5 ...

1138473 Commits