682 Commits

Author SHA1 Message Date
Jon Doron
6a4ab8869d libbpf: Fix the case of running as non-root with capabilities
When running rootless with special capabilities like:
FOWNER / DAC_OVERRIDE / DAC_READ_SEARCH

The "access" API will not make the proper check if there is really
access to a file or not.

>From the access man page:
"
The check is done using the calling process's real UID and GID, rather
than the effective IDs as is done when actually attempting an operation
(e.g., open(2)) on the file.  Similarly, for the root user, the check
uses the set of permitted capabilities  rather than the set of effective
capabilities; ***and for non-root users, the check uses an empty set of
capabilities.***
"

What that means is that for non-root user the access API will not do the
proper validation if the process really has permission to a file or not.

To resolve this this patch replaces all the access API calls with
faccessat with AT_EACCESS flag.

Signed-off-by: Jon Doron <jond@wiz.io>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220925070431.1313680-1-arilou@gmail.com
2022-09-26 21:38:32 -07:00
Wang Yufen
e588c116df libbpf: Add pathname_concat() helper
Move snprintf and len check to common helper pathname_concat() to make the
code simpler.

Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1663828124-10437-1-git-send-email-wangyufen@huawei.com
2022-09-23 14:18:03 -07:00
David Vernet
b66ccae01f bpf: Add libbpf logic for user-space ring buffer
Now that all of the logic is in place in the kernel to support user-space
produced ring buffers, we can add the user-space logic to libbpf. This
patch therefore adds the following public symbols to libbpf:

struct user_ring_buffer *
user_ring_buffer__new(int map_fd,
		      const struct user_ring_buffer_opts *opts);
void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size);
void *user_ring_buffer__reserve_blocking(struct user_ring_buffer *rb,
                                         __u32 size, int timeout_ms);
void user_ring_buffer__submit(struct user_ring_buffer *rb, void *sample);
void user_ring_buffer__discard(struct user_ring_buffer *rb,
void user_ring_buffer__free(struct user_ring_buffer *rb);

A user-space producer must first create a struct user_ring_buffer * object
with user_ring_buffer__new(), and can then reserve samples in the
ring buffer using one of the following two symbols:

void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size);
void *user_ring_buffer__reserve_blocking(struct user_ring_buffer *rb,
                                         __u32 size, int timeout_ms);

With user_ring_buffer__reserve(), a pointer to a 'size' region of the ring
buffer will be returned if sufficient space is available in the buffer.
user_ring_buffer__reserve_blocking() provides similar semantics, but will
block for up to 'timeout_ms' in epoll_wait if there is insufficient space
in the buffer. This function has the guarantee from the kernel that it will
receive at least one event-notification per invocation to
bpf_ringbuf_drain(), provided that at least one sample is drained, and the
BPF program did not pass the BPF_RB_NO_WAKEUP flag to bpf_ringbuf_drain().

Once a sample is reserved, it must either be committed to the ring buffer
with user_ring_buffer__submit(), or discarded with
user_ring_buffer__discard().

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220920000100.477320-4-void@manifault.com
2022-09-21 16:25:03 -07:00
David Vernet
583c1f4201 bpf: Define new BPF_MAP_TYPE_USER_RINGBUF map type
We want to support a ringbuf map type where samples are published from
user-space, to be consumed by BPF programs. BPF currently supports a
kernel -> user-space circular ring buffer via the BPF_MAP_TYPE_RINGBUF
map type.  We'll need to define a new map type for user-space -> kernel,
as none of the helpers exported for BPF_MAP_TYPE_RINGBUF will apply
to a user-space producer ring buffer, and we'll want to add one or
more helper functions that would not apply for a kernel-producer
ring buffer.

This patch therefore adds a new BPF_MAP_TYPE_USER_RINGBUF map type
definition. The map type is useless in its current form, as there is no
way to access or use it for anything until we one or more BPF helpers. A
follow-on patch will therefore add a new helper function that allows BPF
programs to run callbacks on samples that are published to the ring
buffer.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220920000100.477320-2-void@manifault.com
2022-09-21 16:24:17 -07:00
Andrii Nakryiko
749c202cb6 libbpf: Fix crash if SEC("freplace") programs don't have attach_prog_fd set
Fix SIGSEGV caused by libbpf trying to find attach type in vmlinux BTF
for freplace programs. It's wrong to search in vmlinux BTF and libbpf
doesn't even mark vmlinux BTF as required for freplace programs. So
trying to search anything in obj->vmlinux_btf might cause NULL
dereference if nothing else in BPF object requires vmlinux BTF.

Instead, error out if freplace (EXT) program doesn't specify
attach_prog_fd during at the load time.

Fixes: 91abb4a6d79d ("libbpf: Support attachment of BPF tracing programs to kernel modules")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220909193053.577111-3-andrii@kernel.org
2022-09-16 22:39:37 +02:00
Andrii Nakryiko
abf84b64e3 libbpf: Clean up deprecated and legacy aliases
Remove three missed deprecated APIs that were aliased to new APIs:
bpf_object__unload, bpf_prog_attach_xattr and btf__load.

Also move legacy API libbpf_find_kernel_btf (aliased to
btf__load_vmlinux_btf) into libbpf_legacy.h.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/bpf/20220816001929.369487-4-andrii@kernel.org
2022-08-17 22:42:56 +02:00
Andrii Nakryiko
813847a314 libbpf: Streamline bpf_attr and perf_event_attr initialization
Make sure that entire libbpf code base is initializing bpf_attr and
perf_event_attr with memset(0). Also for bpf_attr make sure we
clear and pass to kernel only relevant parts of bpf_attr. bpf_attr is
a huge union of independent sub-command attributes, so there is no need
to clear and pass entire union bpf_attr, which over time grows quite
a lot and for most commands this growth is completely irrelevant.

Few cases where we were relying on compiler initialization of BPF UAPI
structs (like bpf_prog_info, bpf_map_info, etc) with `= {};` were
switched to memset(0) pattern for future-proofing.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/bpf/20220816001929.369487-3-andrii@kernel.org
2022-08-17 22:42:10 +02:00
Andrii Nakryiko
d4e6d684f3 libbpf: Fix potential NULL dereference when parsing ELF
Fix if condition filtering empty ELF sections to prevent NULL
dereference.

Fixes: 47ea7417b074 ("libbpf: Skip empty sections in bpf_object__init_global_data_maps")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/bpf/20220816001929.369487-2-andrii@kernel.org
2022-08-17 22:42:10 +02:00
Hao Luo
43cb8cbadf libbpf: Allows disabling auto attach
Adds libbpf APIs for disabling auto-attach for individual functions.
This is motivated by the use case of cgroup iter [1]. Some iter
types require their parameters to be non-zero, therefore applying
auto-attach on them will fail. With these two new APIs, users who
want to use auto-attach and these types of iters can disable
auto-attach on the program and perform manual attach.

[1] https://lore.kernel.org/bpf/CAEf4BzZ+a2uDo_t6kGBziqdz--m2gh2_EUwkGLDtMd65uwxUjA@mail.gmail.com/

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220816234012.910255-1-haoluo@google.com
2022-08-17 09:40:47 -07:00
Hangbin Liu
1f235777c3 libbpf: Making bpf_prog_load() ignore name if kernel doesn't support
Similar with commit 10b62d6a38f7 ("libbpf: Add names for auxiliary maps"),
let's make bpf_prog_load() also ignore name if kernel doesn't support
program name.

To achieve this, we need to call sys_bpf_prog_load() directly in
probe_kern_prog_name() to avoid circular dependency. sys_bpf_prog_load()
also need to be exported in the libbpf_internal.h file.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220813000936.6464-1-liuhangbin@gmail.com
2022-08-15 14:58:20 -07:00
Hangbin Liu
10b62d6a38 libbpf: Add names for auxiliary maps
The bpftool self-created maps can appear in final map show output due to
deferred removal in kernel. These maps don't have a name, which would make
users confused about where it comes from.

With a libbpf_ prefix name, users could know who created these maps.
It also could make some tests (like test_offload.py, which skip base maps
without names as a workaround) filter them out.

Kernel adds bpf prog/map name support in the same merge
commit fadad670a8ab ("Merge branch 'bpf-extend-info'"). So we can also use
kernel_supports(NULL, FEAT_PROG_NAME) to check if kernel supports map name.

As discussed [1], Let's make bpf_map_create accept non-null
name string, and silently ignore the name if kernel doesn't support.

  [1] https://lore.kernel.org/bpf/CAEf4BzYL1TQwo1231s83pjTdFPk9XWWhfZC5=KzkU-VO0k=0Ug@mail.gmail.com/

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220811034020.529685-1-liuhangbin@gmail.com
2022-08-11 15:12:38 -07:00
Andrii Nakryiko
d7c5802faf libbpf: preserve errno across pr_warn/pr_info/pr_debug
As suggested in [0], make sure that libbpf_print saves and restored
errno and as such guaranteed that no matter what actual print callback
user installs, macros like pr_warn/pr_info/pr_debug are completely
transparent as far as errno goes.

While libbpf code is pretty careful about not clobbering important errno
values accidentally with pr_warn(), it's a trivial change to make sure
that pr_warn can be used anywhere without a risk of clobbering errno.

No functional changes, just future proofing.

  [0] https://github.com/libbpf/libbpf/pull/536

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Müller <deso@posteo.net>
Link: https://lore.kernel.org/r/20220810183425.1998735-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-10 11:47:29 -07:00
Hengqi Chen
9e32084ef1 libbpf: Do not require executable permission for shared libraries
Currently, resolve_full_path() requires executable permission for both
programs and shared libraries. This causes failures on distos like Debian
since the shared libraries are not installed executable and Linux is not
requiring shared libraries to have executable permissions. Let's remove
executable permission check for shared libraries.

Reported-by: Goro Fuji <goro@fastly.com>
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220806102021.3867130-1-hengqi.chen@gmail.com
2022-08-08 15:07:40 -07:00
Andrii Nakryiko
e19db6762c libbpf: Reject legacy 'maps' ELF section
Add explicit error message if BPF object file is still using legacy BPF
map definitions in SEC("maps"). Before this change, if BPF object file
is still using legacy map definition user will see a bit confusing:

  libbpf: elf: skipping unrecognized data section(4) maps
  libbpf: prog 'handler': bad map relo against 'server_map' in section 'maps'

Now libbpf will be explicit about rejecting "maps" ELF section:

  libbpf: elf: legacy map definitions in 'maps' section are not supported by libbpf v1.0+

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220803214202.23750-1-andrii@kernel.org
2022-08-08 15:44:11 +02:00
James Hilliard
47ea7417b0 libbpf: Skip empty sections in bpf_object__init_global_data_maps
The GNU assembler generates an empty .bss section. This is a well
established behavior in GAS that happens in all supported targets.

The LLVM assembler doesn't generate an empty .bss section.

bpftool chokes on the empty .bss section.

Additionally in bpf_object__elf_collect the sec_desc->data is not
initialized when a section is not recognized. In this case, this
happens with .comment.

So we must check that sec_desc->data is initialized before checking
if the size is 0.

Signed-off-by: James Hilliard <james.hilliard1@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20220731232649.4668-1-james.hilliard1@gmail.com
2022-08-04 14:39:07 -07:00
Daniel Müller
64893e83f9 libbpf: Support PPC in arch_specific_syscall_pfx
Commit 708ac5bea0ce ("libbpf: add ksyscall/kretsyscall sections support
for syscall kprobes") added the arch_specific_syscall_pfx() function,
which returns a string representing the architecture in use. As it turns
out this function is currently not aware of Power PC, where NULL is
returned. That's being flagged by the libbpf CI system, which builds for
ppc64le and the compiler sees a NULL pointer being passed in to a %s
format string.
With this change we add representations for two more architectures, for
Power PC and Power PC 64, and also adjust the string format logic to
handle NULL pointers gracefully, in an attempt to prevent similar issues
with other architectures in the future.

Fixes: 708ac5bea0ce ("libbpf: add ksyscall/kretsyscall sections support for syscall kprobes")
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220728222345.3125975-1-deso@posteo.net
2022-07-28 16:11:18 -07:00
Andrii Nakryiko
597fbc4682 libbpf: make RINGBUF map size adjustments more eagerly
Make libbpf adjust RINGBUF map size (rounding it up to closest power-of-2
of page_size) more eagerly: during open phase when initializing the map
and on explicit calls to bpf_map__set_max_entries().

Such approach allows user to check actual size of BPF ringbuf even
before it's created in the kernel, but also it prevents various edge
case scenarios where BPF ringbuf size can get out of sync with what it
would be in kernel. One of them (reported in [0]) is during an attempt
to pin/reuse BPF ringbuf.

Move adjust_ringbuf_sz() helper closer to its first actual use. The
implementation of the helper is unchanged.

Also make detection of whether bpf_object is already loaded more robust
by checking obj->loaded explicitly, given that map->fd can be < 0 even
if bpf_object is already loaded due to ability to disable map creation
with bpf_map__set_autocreate(map, false).

  [0] Closes: https://github.com/libbpf/libbpf/pull/530

Fixes: 0087a681fa8c ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220715230952.2219271-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19 10:01:20 -07:00
Andrii Nakryiko
a1ac9fd6c6 libbpf: fallback to tracefs mount point if debugfs is not mounted
Teach libbpf to fallback to tracefs mount point (/sys/kernel/tracing) if
debugfs (/sys/kernel/debug/tracing) isn't mounted.

Acked-by: Yonghong Song <yhs@fb.com>
Suggested-by: Connor O'Brien <connoro@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220715185736.898848-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19 09:54:28 -07:00
Andrii Nakryiko
708ac5bea0 libbpf: add ksyscall/kretsyscall sections support for syscall kprobes
Add SEC("ksyscall")/SEC("ksyscall/<syscall_name>") and corresponding
kretsyscall variants (for return kprobes) to allow users to kprobe
syscall functions in kernel. These special sections allow to ignore
complexities and differences between kernel versions and host
architectures when it comes to syscall wrapper and corresponding
__<arch>_sys_<syscall> vs __se_sys_<syscall> differences, depending on
whether host kernel has CONFIG_ARCH_HAS_SYSCALL_WRAPPER (though libbpf
itself doesn't rely on /proc/config.gz for detecting this, see
BPF_KSYSCALL patch for how it's done internally).

Combined with the use of BPF_KSYSCALL() macro, this allows to just
specify intended syscall name and expected input arguments and leave
dealing with all the variations to libbpf.

In addition to SEC("ksyscall+") and SEC("kretsyscall+") add
bpf_program__attach_ksyscall() API which allows to specify syscall name
at runtime and provide associated BPF cookie value.

At the moment SEC("ksyscall") and bpf_program__attach_ksyscall() do not
handle all the calling convention quirks for mmap(), clone() and compat
syscalls. It also only attaches to "native" syscall interfaces. If host
system supports compat syscalls or defines 32-bit syscalls in 64-bit
kernel, such syscall interfaces won't be attached to by libbpf.

These limitations may or may not change in the future. Therefore it is
recommended to use SEC("kprobe") for these syscalls or if working with
compat and 32-bit interfaces is required.

Tested-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220714070755.3235561-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19 09:33:18 -07:00
Andrii Nakryiko
6f5d467d55 libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALL
Improve BPF_KPROBE_SYSCALL (and rename it to shorter BPF_KSYSCALL to
match libbpf's SEC("ksyscall") section name, added in next patch) to use
__kconfig variable to determine how to properly fetch syscall arguments.

Instead of relying on hard-coded knowledge of whether kernel's
architecture uses syscall wrapper or not (which only reflects the latest
kernel versions, but is not necessarily true for older kernels and won't
necessarily hold for later kernel versions on some particular host
architecture), determine this at runtime by attempting to create
perf_event (with fallback to kprobe event creation through tracefs on
legacy kernels, just like kprobe attachment code is doing) for kernel
function that would correspond to bpf() syscall on a system that has
CONFIG_ARCH_HAS_SYSCALL_WRAPPER set (e.g., for x86-64 it would try
'__x64_sys_bpf').

If host kernel uses syscall wrapper, syscall kernel function's first
argument is a pointer to struct pt_regs that then contains syscall
arguments. In such case we need to use bpf_probe_read_kernel() to fetch
actual arguments (which we do through BPF_CORE_READ() macro) from inner
pt_regs.

But if the kernel doesn't use syscall wrapper approach, input
arguments can be read from struct pt_regs directly with no probe reading.

All this feature detection is done without requiring /proc/config.gz
existence and parsing, and BPF-side helper code uses newly added
LINUX_HAS_SYSCALL_WRAPPER virtual __kconfig extern to keep in sync with
user-side feature detection of libbpf.

BPF_KSYSCALL() macro can be used both with SEC("kprobe") programs that
define syscall function explicitly (e.g., SEC("kprobe/__x64_sys_bpf"))
and SEC("ksyscall") program added in the next patch (which are the same
kprobe program with added benefit of libbpf determining correct kernel
function name automatically).

Kretprobe and kretsyscall (added in next patch) programs don't need
BPF_KSYSCALL as they don't provide access to input arguments. Normal
BPF_KRETPROBE is completely sufficient and is recommended.

Tested-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220714070755.3235561-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19 09:33:18 -07:00
Andrii Nakryiko
55d00c37eb libbpf: generalize virtual __kconfig externs and use it for USDT
Libbpf supports single virtual __kconfig extern currently: LINUX_KERNEL_VERSION.
LINUX_KERNEL_VERSION isn't coming from /proc/kconfig.gz and is intead
customly filled out by libbpf.

This patch generalizes this approach to support more such virtual
__kconfig externs. One such extern added in this patch is
LINUX_HAS_BPF_COOKIE which is used for BPF-side USDT supporting code in
usdt.bpf.h instead of using CO-RE-based enum detection approach for
detecting bpf_get_attach_cookie() BPF helper. This allows to remove
otherwise not needed CO-RE dependency and keeps user-space and BPF-side
parts of libbpf's USDT support strictly in sync in terms of their
feature detection.

We'll use similar approach for syscall wrapper detection for
BPF_KSYSCALL() BPF-side macro in follow up patch.

Generally, currently libbpf reserves CONFIG_ prefix for Kconfig values
and LINUX_ for virtual libbpf-backed externs. In the future we might
extend the set of prefixes that are supported. This can be done without
any breaking changes, as currently any __kconfig extern with
unrecognized name is rejected.

For LINUX_xxx externs we support the normal "weak rule": if libbpf
doesn't recognize given LINUX_xxx extern but such extern is marked as
__weak, it is not rejected and defaults to zero.  This follows
CONFIG_xxx handling logic and will allow BPF applications to
opportunistically use newer libbpf virtual externs without breaking on
older libbpf versions unnecessarily.

Tested-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220714070755.3235561-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19 09:33:17 -07:00
Jon Doron
9ff5efdeb0 libbpf: perfbuf: Add API to get the ring buffer
Add support for writing a custom event reader, by exposing the ring
buffer.

With the new API perf_buffer__buffer() you will get access to the
raw mmaped()'ed per-cpu underlying memory of the ring buffer.

This region contains both the perf buffer data and header
(struct perf_event_mmap_page), which manages the ring buffer
state (head/tail positions, when accessing the head/tail position
it's important to take into consideration SMP).
With this type of low level access one can implement different types of
consumers here are few simple examples where this API helps with:

1. perf_event_read_simple is allocating using malloc, perhaps you want
   to handle the wrap-around in some other way.
2. Since perf buf is per-cpu then the order of the events is not
   guarnteed, for example:
   Given 3 events where each event has a timestamp t0 < t1 < t2,
   and the events are spread on more than 1 CPU, then we can end
   up with the following state in the ring buf:
   CPU[0] => [t0, t2]
   CPU[1] => [t1]
   When you consume the events from CPU[0], you could know there is
   a t1 missing, (assuming there are no drops, and your event data
   contains a sequential index).
   So now one can simply do the following, for CPU[0], you can store
   the address of t0 and t2 in an array (without moving the tail, so
   there data is not perished) then move on the CPU[1] and set the
   address of t1 in the same array.
   So you end up with something like:
   void **arr[] = [&t0, &t1, &t2], now you can consume it orderely
   and move the tails as you process in order.
3. Assuming there are multiple CPUs and we want to start draining the
   messages from them, then we can "pick" with which one to start with
   according to the remaining free space in the ring buffer.

Signed-off-by: Jon Doron <jond@wiz.io>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220715181122.149224-1-arilou@gmail.com
2022-07-15 12:53:22 -07:00
Anquan Wu
bf3f003785 libbpf: Fix the name of a reused map
BPF map name is limited to BPF_OBJ_NAME_LEN.
A map name is defined as being longer than BPF_OBJ_NAME_LEN,
it will be truncated to BPF_OBJ_NAME_LEN when a userspace program
calls libbpf to create the map. A pinned map also generates a path
in the /sys. If the previous program wanted to reuse the map,
it can not get bpf_map by name, because the name of the map is only
partially the same as the name which get from pinned path.

The syscall information below show that map name "process_pinned_map"
is truncated to "process_pinned_".

    bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/process_pinned_map",
    bpf_fd=0, file_flags=0}, 144) = -1 ENOENT (No such file or directory)

    bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4,
    value_size=4,max_entries=1024, map_flags=0, inner_map_fd=0,
    map_name="process_pinned_",map_ifindex=0, btf_fd=3, btf_key_type_id=6,
    btf_value_type_id=10,btf_vmlinux_value_type_id=0}, 72) = 4

This patch check that if the name of pinned map are the same as the
actual name for the first (BPF_OBJ_NAME_LEN - 1),
bpf map still uses the name which is included in bpf object.

Fixes: 26736eb9a483 ("tools: libbpf: allow map reuse")
Signed-off-by: Anquan Wu <leiqi96@hotmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/OSZP286MB1725CEA1C95C5CB8E7CCC53FB8869@OSZP286MB1725.JPNP286.PROD.OUTLOOK.COM
2022-07-13 22:18:37 -07:00
Hengqi Chen
8ed2f5a6f3 libbpf: Error out when binary_path is NULL for uprobe and USDT
binary_path is a required non-null parameter for bpf_program__attach_usdt
and bpf_program__attach_uprobe_opts. Check it against NULL to prevent
coredump on strchr.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220712025745.2703995-1-hengqi.chen@gmail.com
2022-07-13 12:02:59 -07:00
Chuang Wang
2655144fb4 libbpf: Cleanup the legacy uprobe_event on failed add/attach_event()
A potential scenario, when an error is returned after
add_uprobe_event_legacy() in perf_event_uprobe_open_legacy(), or
bpf_program__attach_perf_event_opts() in
bpf_program__attach_uprobe_opts() returns an error, the uprobe_event
that was previously created is not cleaned.

So, with this patch, when an error is returned, fix this by adding
remove_uprobe_event_legacy()

Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220629151848.65587-4-nashuiliang@gmail.com
2022-07-05 21:23:03 -07:00
Chuang Wang
5666fc997c libbpf: Fix wrong variable used in perf_event_uprobe_open_legacy()
Use "type" as opposed to "err" in pr_warn() after
determine_uprobe_perf_type_legacy() returns an error.

Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220629151848.65587-3-nashuiliang@gmail.com
2022-07-05 21:21:44 -07:00
Chuang Wang
8094029330 libbpf: Cleanup the legacy kprobe_event on failed add/attach_event()
Before the 0bc11ed5ab60 commit ("kprobes: Allow kprobes coexist with
livepatch"), in a scenario where livepatch and kprobe coexist on the
same function entry, the creation of kprobe_event using
add_kprobe_event_legacy() will be successful, at the same time as a
trace event (e.g. /debugfs/tracing/events/kprobe/XXX) will exist, but
perf_event_open() will return an error because both livepatch and kprobe
use FTRACE_OPS_FL_IPMODIFY. As follows:

1) add a livepatch

$ insmod livepatch-XXX.ko

2) add a kprobe using tracefs API (i.e. add_kprobe_event_legacy)

$ echo 'p:mykprobe XXX' > /sys/kernel/debug/tracing/kprobe_events

3) enable this kprobe (i.e. sys_perf_event_open)

This will return an error, -EBUSY.

On Andrii Nakryiko's comment, few error paths in
bpf_program__attach_kprobe_opts() that should need to call
remove_kprobe_event_legacy().

With this patch, whenever an error is returned after
add_kprobe_event_legacy() or bpf_program__attach_perf_event_opts(), this
ensures that the created kprobe_event is cleaned.

Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
Signed-off-by: Jingren Zhou <zhoujingren@didiglobal.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220629151848.65587-2-nashuiliang@gmail.com
2022-07-05 21:21:41 -07:00
Daniel Müller
ec6209c8d4 bpf, libbpf: Add type match support
This patch adds support for the proposed type match relation to
relo_core where it is shared between userspace and kernel. It plumbs
through both kernel-side and libbpf-side support.

The matching relation is defined as follows (copy from source):
- modifiers and typedefs are stripped (and, hence, effectively ignored)
- generally speaking types need to be of same kind (struct vs. struct, union
  vs. union, etc.)
  - exceptions are struct/union behind a pointer which could also match a
    forward declaration of a struct or union, respectively, and enum vs.
    enum64 (see below)
Then, depending on type:
- integers:
  - match if size and signedness match
- arrays & pointers:
  - target types are recursively matched
- structs & unions:
  - local members need to exist in target with the same name
  - for each member we recursively check match unless it is already behind a
    pointer, in which case we only check matching names and compatible kind
- enums:
  - local variants have to have a match in target by symbolic name (but not
    numeric value)
  - size has to match (but enum may match enum64 and vice versa)
- function pointers:
  - number and position of arguments in local type has to match target
  - for each argument and the return value we recursively check match

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220628160127.607834-5-deso@posteo.net
2022-07-05 21:14:25 -07:00
Stanislav Fomichev
bffcf34878 libbpf: add lsm_cgoup_sock type
lsm_cgroup/ is the prefix for BPF_LSM_CGROUP.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220628174314.1216643-9-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-29 13:21:52 -07:00
Andrii Nakryiko
bd054102a8 libbpf: enforce strict libbpf 1.0 behaviors
Remove support for legacy features and behaviors that previously had to
be disabled by calling libbpf_set_strict_mode():
  - legacy BPF map definitions are not supported now;
  - RLIMIT_MEMLOCK auto-setting, if necessary, is always on (but see
    libbpf_set_memlock_rlim());
  - program name is used for program pinning (instead of section name);
  - cleaned up error returning logic;
  - entry BPF programs should have SEC() always.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-15-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:33 -07:00
Andrii Nakryiko
450b167fb9 libbpf: clean up SEC() handling
Get rid of sloppy prefix logic and remove deprecated xdp_{devmap,cpumap}
sections.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-13-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:33 -07:00
Andrii Nakryiko
cf90a20db8 libbpf: remove internal multi-instance prog support
Clean up internals that had to deal with the possibility of
multi-instance bpf_programs. Libbpf 1.0 doesn't support this, so all
this is not necessary now and can be simplified.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-12-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:33 -07:00
Andrii Nakryiko
b4bda502df libbpf: remove multi-instance and custom private data APIs
Remove all the public APIs that are related to creating multi-instance
bpf_programs through custom preprocessing callback and generally working
with them.

Also remove all the bpf_{object,map,program}__[set_]priv() APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-10-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:32 -07:00
Andrii Nakryiko
146bf811f5 libbpf: remove most other deprecated high-level APIs
Remove a bunch of high-level bpf_object/bpf_map/bpf_program related
APIs. All the APIs related to private per-object/map/prog state,
program preprocessing callback, and generally everything multi-instance
related is removed in a separate patch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-9-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:32 -07:00
Andrii Nakryiko
9a590538ba libbpf: remove prog_info_linear APIs
Remove prog_info_linear-related APIs previously used by perf.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:32 -07:00
Andrii Nakryiko
22dd7a58b2 libbpf: clean up perfbuf APIs
Remove deprecated perfbuf APIs and clean up opts structs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-7-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:32 -07:00
Andrii Nakryiko
aaf6886d9b libbpf: remove deprecated BTF APIs
Get rid of deprecated BTF-related APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:32 -07:00
Andrii Nakryiko
765a34130e libbpf: remove deprecated low-level APIs
Drop low-level APIs as well as high-level (and very confusingly named)
BPF object loading bpf_prog_load_xattr() and bpf_prog_load_deprecated()
APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:32 -07:00
Daniel Müller
fd75733da2 bpf: Merge "types_are_compat" logic into relo_core.c
BPF type compatibility checks (bpf_core_types_are_compat()) are
currently duplicated between kernel and user space. That's a historical
artifact more than intentional doing and can lead to subtle bugs where
one implementation is adjusted but another is forgotten.

That happened with the enum64 work, for example, where the libbpf side
was changed (commit 23b2a3a8f63a ("libbpf: Add enum64 relocation
support")) to use the btf_kind_core_compat() helper function but the
kernel side was not (commit 6089fb325cf7 ("bpf: Add btf enum64
support")).

This patch addresses both the duplication issue, by merging both
implementations and moving them into relo_core.c, and fixes the alluded
to kind check (by giving preference to libbpf's already adjusted logic).

For discussion of the topic, please refer to:
https://lore.kernel.org/bpf/CAADnVQKbWR7oarBdewgOBZUPzryhRYvEbkhyPJQHHuxq=0K1gw@mail.gmail.com/T/#mcc99f4a33ad9a322afaf1b9276fb1f0b7add9665

Changelog:
v1 -> v2:
- limited libbpf recursion limit to 32
- changed name to __bpf_core_types_are_compat
- included warning previously present in libbpf version
- merged kernel and user space changes into a single patch

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220623182934.2582827-1-deso@posteo.net
2022-06-24 14:15:37 -07:00
Delyan Kratunov
c4cac71fc8 libbpf: add support for sleepable uprobe programs
Add section mappings for u(ret)probe.s programs.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Delyan Kratunov <delyank@fb.com>
Link: https://lore.kernel.org/r/aedbc3b74f3523f00010a7b0df8f3388cca59f16.1655248076.git.delyank@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-16 19:27:30 -07:00
Yonghong Song
c49a44b39b libbpf: Fix an unsigned < 0 bug
Andrii reported a bug with the following information:

  2859 	if (enum64_placeholder_id == 0) {
  2860 		enum64_placeholder_id = btf__add_int(btf, "enum64_placeholder", 1, 0);
  >>>     CID 394804:  Control flow issues  (NO_EFFECT)
  >>>     This less-than-zero comparison of an unsigned value is never true. "enum64_placeholder_id < 0U".
  2861 		if (enum64_placeholder_id < 0)
  2862 			return enum64_placeholder_id;
  2863    	...

Here enum64_placeholder_id declared as '__u32' so enum64_placeholder_id < 0
is always false. Declare enum64_placeholder_id as 'int' in order to capture
the potential error properly.

Fixes: f2a625889bb8 ("libbpf: Add enum64 sanitization")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220613054314.1251905-1-yhs@fb.com
2022-06-14 17:01:54 +02:00
Andrii Nakryiko
fe92833524 libbpf: Fix uprobe symbol file offset calculation logic
Fix libbpf's bpf_program__attach_uprobe() logic of determining
function's *file offset* (which is what kernel is actually expecting)
when attaching uprobe/uretprobe by function name. Previously calculation
was determining virtual address offset relative to base load address,
which (offset) is not always the same as file offset (though very
frequently it is which is why this went unnoticed for a while).

Fixes: 433966e3ae04 ("libbpf: Support function name-based attach uprobes")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Riham Selim <rihams@fb.com>
Cc: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20220606220143.3796908-1-andrii@kernel.org
2022-06-09 14:09:41 +02:00
Yonghong Song
23b2a3a8f6 libbpf: Add enum64 relocation support
The enum64 relocation support is added. The bpf local type
could be either enum or enum64 and the remote type could be
either enum or enum64 too. The all combinations of local enum/enum64
and remote enum/enum64 are supported.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062647.3721719-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07 10:20:43 -07:00
Yonghong Song
f2a625889b libbpf: Add enum64 sanitization
When old kernel does not support enum64 but user space btf
contains non-zero enum kflag or enum64, libbpf needs to
do proper sanitization so modified btf can be accepted
by the kernel.

Sanitization for enum kflag can be achieved by clearing
the kflag bit. For enum64, the type is replaced with an
union of integer member types and the integer member size
must be smaller than enum64 size. If such an integer
type cannot be found, a new type is created and used
for union members.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062636.3721375-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07 10:20:43 -07:00
Yuze Chi
611edf1bac libbpf: Fix is_pow_of_2
Move the correct definition from linker.c into libbpf_internal.h.

Fixes: 0087a681fa8c ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary")
Reported-by: Yuze Chi <chiyuze@google.com>
Signed-off-by: Yuze Chi <chiyuze@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220603055156.2830463-1-irogers@google.com
2022-06-03 14:53:33 -07:00
Daniel Müller
ba5d1b5802 libbpf: Introduce libbpf_bpf_link_type_str
This change introduces a new function, libbpf_bpf_link_type_str, to the
public libbpf API. The function allows users to get a string
representation for a bpf_link_type enum variant.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-11-deso@posteo.net
2022-06-02 16:26:33 -07:00
Daniel Müller
ccde5760ba libbpf: Introduce libbpf_bpf_attach_type_str
This change introduces a new function, libbpf_bpf_attach_type_str, to
the public libbpf API. The function allows users to get a string
representation for a bpf_attach_type variant.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-8-deso@posteo.net
2022-06-02 16:26:26 -07:00
Daniel Müller
3e6dc0207b libbpf: Introduce libbpf_bpf_map_type_str
This change introduces a new function, libbpf_bpf_map_type_str, to the
public libbpf API. The function allows users to get a string
representation for a bpf_map_type enum variant.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-5-deso@posteo.net
2022-06-02 16:26:20 -07:00
Daniel Müller
d18616e7aa libbpf: Introduce libbpf_bpf_prog_type_str
This change introduces a new function, libbpf_bpf_prog_type_str, to the
public libbpf API. The function allows users to get a string
representation for a bpf_prog_type variant.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-2-deso@posteo.net
2022-06-02 16:26:10 -07:00
Julia Lawall
bb412cf1d7 libbpf: Fix typo in comment
Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Müller <deso@posteo.net>
Link: https://lore.kernel.org/bpf/20220521111145.81697-71-Julia.Lawall@inria.fr
2022-05-23 11:24:50 -07:00