Commit Graph

1105517 Commits

Author SHA1 Message Date
Andrii Nakryiko
b8a195dc29 libbpf: add bpf_core_type_matches() helper macro
This patch finalizes support for the proposed type match relation in libbpf by
adding the bpf_core_type_matches() macro, which emits a TYPE_MATCH relocation.

Clang support for this relocation was added in [0].

  [0] https://reviews.llvm.org/D126838
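
A minimal usage sketch (the local type and section are illustrative;
bpf_core_type_matches() takes a type and evaluates to non-zero when the
target kernel's BTF contains a matching type):

  /* local CO-RE view of a kernel type; the ___local suffix is ignored
   * when matching against the target's 'struct sk_buff' */
  struct sk_buff___local {
  	unsigned int len;
  } __attribute__((preserve_access_index));

  SEC("tp_btf/kfree_skb")
  int handle(void *ctx)
  {
  	if (bpf_core_type_matches(struct sk_buff___local)) {
  		/* TYPE_MATCH relocation resolved to true at load time */
  	}
  	return 0;
  }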

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220628160127.607834-7-deso@posteo.net
2022-07-05 21:15:19 -07:00
Daniel Müller
ec6209c8d4 bpf, libbpf: Add type match support
This patch adds support for the proposed type match relation to
relo_core, where it is shared between userspace and kernel, and plumbs
through both kernel-side and libbpf-side support.

The matching relation is defined as follows (copy from source):
- modifiers and typedefs are stripped (and, hence, effectively ignored)
- generally speaking types need to be of same kind (struct vs. struct, union
  vs. union, etc.)
  - exceptions are struct/union behind a pointer which could also match a
    forward declaration of a struct or union, respectively, and enum vs.
    enum64 (see below)
Then, depending on type:
- integers:
  - match if size and signedness match
- arrays & pointers:
  - target types are recursively matched
- structs & unions:
  - local members need to exist in target with the same name
  - for each member we recursively check match unless it is already behind a
    pointer, in which case we only check matching names and compatible kind
- enums:
  - local variants have to have a match in target by symbolic name (but not
    numeric value)
  - size has to match (but enum may match enum64 and vice versa)
- function pointers:
  - number and position of arguments in local type has to match target
  - for each argument and the return value we recursively check match
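
As an illustration (hypothetical types), under the rules above a local
definition such as:

  struct foo___local {
  	int counter;        /* integer: size and signedness must match */
  	struct bar *next;   /* struct behind a pointer: a forward-declared
  	                     * 'struct bar;' in the target is sufficient */
  };

would match a target 'struct foo' even if the target carries additional
members, as long as each local member has a matching counterpart.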

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220628160127.607834-5-deso@posteo.net
2022-07-05 21:14:25 -07:00
Daniel Müller
633e7ceb2c bpftool: Honor BPF_CORE_TYPE_MATCHES relocation
bpftool needs to know about the newly introduced BPF_CORE_TYPE_MATCHES
relocation for its 'gen min_core_btf' command to work properly in the
presence of this relocation.
Specifically, we need to make sure to mark types and fields so that they
are present in the minimized BTF for "type match" checks to work out.
However, contrary to the existing btfgen_record_field_relo, we need to
rely on the BTF -- and not the spec -- to find fields. With this change
we handle this new variant correctly. The functionality will be tested
with follow-on changes to BPF selftests, which already run against a
minimized BTF created with bpftool.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220628160127.607834-3-deso@posteo.net
2022-07-05 20:26:18 -07:00
Daniel Müller
3c660a5d86 bpf: Introduce TYPE_MATCH related constants/macros
In order to provide type match support we require a new type of
relocation which, in turn, requires toolchain support. Recent LLVM/Clang
versions, for example, support a new value for the last argument to the
__builtin_preserve_type_info builtin.
With this change we introduce the necessary constants into relevant
header files, mirroring what the compiler may support.
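
For reference, the libbpf-side constant looks roughly like this (as found
in bpf_core_read.h):

  enum bpf_type_info_kind {
  	BPF_TYPE_EXISTS  = 0,   /* type existence in target kernel */
  	BPF_TYPE_SIZE    = 1,   /* type size in target kernel */
  	BPF_TYPE_MATCHES = 2,   /* type match in target kernel */
  };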

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220628160127.607834-2-deso@posteo.net
2022-07-05 20:24:12 -07:00
Magnus Karlsson
cfb5a2dbf1 bpf, samples: Remove AF_XDP samples
Remove the AF_XDP samples from samples/bpf/ as they depend on
the AF_XDP support in libbpf. This support has been removed in the
1.0 release, so these samples can no longer be compiled. Please use
libxdp instead; it is backwards compatible with the AF_XDP support
that was offered in libbpf. New samples can be found in the various
xdp-project repositories connected to libxdp and through a web search.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/20220630093717.8664-1-magnus.karlsson@gmail.com
2022-07-05 11:58:44 +02:00
Quentin Monnet
990a6194f7 bpftool: Rename "bpftool feature list" into "... feature list_builtins"
To make it more explicit that the features listed with "bpftool feature
list" are known to bpftool but not necessarily available on the system
(as opposed to the probed features), rename the "feature list" command
to "feature list_builtins".

Note that "bpftool feature list" still works as before given that we
recognise arguments from their prefixes; but the real name of the
subcommand, in particular as displayed in the man page or the
interactive help, will now include "_builtins".

Since we update the bash completion accordingly, let's also take this
chance to redirect error output to /dev/null in the completion script,
to avoid displaying unexpected error messages when users attempt to
tab-complete.

Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220701093805.16920-1-quentin@isovalent.com
2022-07-05 11:53:54 +02:00
Tobias Klauser
2064a132c0 bpf: Omit superfluous address family check in __bpf_skc_lookup
family is only set to either AF_INET or AF_INET6 based on len. In all
other cases we return early. Thus the check against AF_UNSPEC can be
omitted.
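
A paraphrase of the relevant control flow (not the exact kernel source):

  if (len == sizeof(tuple->ipv4))
  	family = AF_INET;
  else if (len == sizeof(tuple->ipv6))
  	family = AF_INET6;
  else
  	return NULL;
  /* past this point family is AF_INET or AF_INET6 by construction,
   * so a 'family != AF_UNSPEC' test is always true and can go */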

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220630082618.15649-1-tklauser@distanz.ch
2022-07-05 11:51:30 +02:00
Stanislav Fomichev
b0d93b4464 selftests/bpf: Skip lsm_cgroup when we don't have trampolines
With arch_prepare_bpf_trampoline removed on x86:

  [...]
  #98/1    lsm_cgroup/functional:SKIP
  #98      lsm_cgroup:SKIP
  Summary: 1/0 PASSED, 1 SKIPPED, 0 FAILED

Fixes: dca85aac88 ("selftests/bpf: lsm_cgroup functional test")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/bpf/20220630224203.512815-1-sdf@google.com
2022-07-01 15:13:36 +02:00
Yafang Shao
7a255ae772 bpftool: Show also the name of type BPF_OBJ_LINK
For example, /sys/fs/bpf/maps.debug is a BPF link. When you run `bpftool map show`
to show it:

Before:

  $ bpftool map show pinned /sys/fs/bpf/maps.debug
  Error: incorrect object type: unknown

After:

  $ bpftool map show pinned /sys/fs/bpf/maps.debug
  Error: incorrect object type: link

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220629154832.56986-5-laoar.shao@gmail.com
2022-06-30 23:48:13 +02:00
Maciej Fijalkowski
39e940d4ab selftests/xsk: Destroy BPF resources only when ctx refcount drops to 0
Currently, xsk_socket__delete frees BPF resources regardless of the ctx
refcount. Xdpxceiver has a test to verify that underlying BPF resources
are not wiped out after closing an XSK socket that was bound to an
interface with other active sockets. From the perspective of the
library's xsk part, this also means that the internal xsk context is
shared and its refcount is bumped accordingly.

After a switch to loading XDP prog based on previously opened XSK
socket, mentioned xdpxceiver test fails with:

  not ok 16 [xdpxceiver.c:swap_xsk_resources:1334]: ERROR: 9/"Bad file descriptor

which means that in swap_xsk_resources(), xsk_socket__delete() released
the xskmap, which in turn caused xsk_socket__update_xskmap() to fail.

To fix this, when deleting a socket, decrement the ctx refcount before
releasing BPF resources, and do so only when the refcount has dropped to
0, meaning there are no more active sockets for this ctx and the BPF
resources can be freed safely.
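
A sketch of the resulting teardown order (helper name hypothetical):

  void xsk_socket__delete(struct xsk_socket *xsk)
  {
  	struct xsk_ctx *ctx = xsk->ctx;

  	if (--ctx->refcount == 0) {
  		/* last socket on this ctx: now it is safe to release
  		 * the XDP prog and the xskmap */
  		release_bpf_resources(ctx);   /* hypothetical helper */
  	}
  	/* per-socket teardown continues either way */
  }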

Fixes: 2f6324a393 ("libbpf: Support shared umems between queues and devices")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20220629143458.934337-5-maciej.fijalkowski@intel.com
2022-06-30 22:50:10 +02:00
Maciej Fijalkowski
6d4c767c03 selftests/xsk: Verify correctness of XDP prog attach point
To prevent the case we had previously where, for TEST_MODE_SKB, the XDP
prog was attached in native mode, call bpf_xdp_query() after loading the
prog and make sure that attach_mode is as expected.
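
A sketch of such a check (error helper hypothetical; the expected constant
depends on the test mode):

  LIBBPF_OPTS(bpf_xdp_query_opts, opts);

  if (bpf_xdp_query(ifindex, 0, &opts))
  	exit_with_error(errno);               /* hypothetical helper */
  if (opts.attach_mode != XDP_ATTACHED_SKB)  /* or XDP_ATTACHED_DRV */
  	exit_with_error(EINVAL);              /* wrong attach point */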

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20220629143458.934337-4-maciej.fijalkowski@intel.com
2022-06-30 22:49:10 +02:00
Maciej Fijalkowski
61333008d0 selftests/xsk: Introduce XDP prog load based on existing AF_XDP socket
Currently, xsk_setup_xdp_prog() uses an anonymous xsk_socket struct,
which means that during the xsk_create_bpf_link() call,
xsk->config.xdp_flags is always 0. This in turn means that from
xdpxceiver it is impossible to use xdpgeneric attachment, so since
commit 3b22523bca ("selftests, xsk: Fix bpf_res cleanup test") we have
not been testing SKB mode at all.

To fix this, introduce a function called xsk_setup_xdp_prog_xsk() that
loads the XDP prog based on an existing xsk_socket, so that the xsk
context's refcount is correctly bumped and flags from the application
side are respected. Use this from the xdpxceiver side so we get coverage
of both generic and native XDP program attach points.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20220629143458.934337-3-maciej.fijalkowski@intel.com
2022-06-30 22:49:05 +02:00
Maciej Fijalkowski
24d2e5d9da selftests/xsk: Avoid bpf_link probe for existing xsk
Currently a bpf_link probe is done for each call of xsk_socket__create().
For cases where an xsk context was previously created and the current
socket creation reuses it, has_bpf_link will be overwritten even though
it has already been initialized.

Optimize this by moving the query into xsk_create_ctx() so that when
xsk_get_ctx() finds a ctx, no further bpf_link probes are needed.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20220629143458.934337-2-maciej.fijalkowski@intel.com
2022-06-30 22:48:57 +02:00
Quentin Monnet
6d304871e3 bpftool: Use feature list in bash completion
Now that bpftool is able to produce a list of known program, map, and
attach types, let's use as much of this as we can in the bash completion
file, so that we don't have to expand the list each time a new type is
added to the kernel.

Also update the relevant test script to remove some checks that are no
longer needed.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Müller <deso@posteo.net>
Link: https://lore.kernel.org/bpf/20220629203637.138944-3-quentin@isovalent.com
2022-06-30 16:17:06 +02:00
Quentin Monnet
27b3f70553 bpftool: Add feature list (prog/map/link/attach types, helpers)
Add a "bpftool feature list" subcommand to list BPF "features".
Contrarily to "bpftool feature probe", this is not about the features
available on the system. Instead, it lists all features known to bpftool
from compilation time; in other words, all program, map, attach, link
types known to the libbpf version in use, and all helpers found in the
UAPI BPF header.

The first use case for this feature is bash completion: running the
command provides a list of types that can be used to produce the list of
candidate map types, for example.

Now that bpftool uses "standard" names provided by libbpf for the
program, map, link, and attach types, having the ability to list these
types and helpers could also be useful in scripts to loop over existing
items.

Sample output:

    # bpftool feature list prog_types | grep -vw unspec | head -n 6
    socket_filter
    kprobe
    sched_cls
    sched_act
    tracepoint
    xdp

    # bpftool -p feature list map_types | jq '.[1]'
    "hash"

    # bpftool feature list attach_types | grep '^cgroup_'
    cgroup_inet_ingress
    cgroup_inet_egress
    [...]
    cgroup_inet_sock_release

    # bpftool feature list helpers | grep -vw bpf_unspec | wc -l
    207

The "unspec" types and helpers are not filtered out by bpftool, so as to
remain closer to the enums, and to preserve the indices in the JSON
arrays (e.g. "hash" at index 1 == BPF_MAP_TYPE_HASH in map types list).

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Müller <deso@posteo.net>
Link: https://lore.kernel.org/bpf/20220629203637.138944-2-quentin@isovalent.com
2022-06-30 16:17:03 +02:00
Tobias Klauser
b0cbd6154a bpftool: Remove attach_type_name forward declaration
The attach_type_name definition was removed in commit 1ba5ad36e0
("bpftool: Use libbpf_bpf_attach_type_str"). Remove its forward
declaration in main.h as well.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220630093638.25916-1-tklauser@distanz.ch
2022-06-30 16:11:20 +02:00
Quentin Monnet
f0cf642c56 bpftool: Probe for memcg-based accounting before bumping rlimit
Bpftool used to bump the memlock rlimit to make sure to be able to load
BPF objects. After the kernel has switched to memcg-based memory
accounting [0] in 5.11, bpftool has relied on libbpf to probe the system
for memcg-based accounting support and for raising the rlimit if
necessary [1]. But this was later reverted, because the probe would
sometimes fail, resulting in bpftool not being able to load all required
objects [2].

Here we add a more efficient probe, in bpftool itself. We first lower
the rlimit to 0, then we attempt to load a BPF object (and finally reset
the rlimit): if the load succeeds, then memcg-based memory accounting is
supported.
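
A sketch of the probe logic (simplified; the real change lives in bpftool):

  struct rlimit rlim_init, rlim_zero = {};
  struct bpf_insn insns[] = {
  	{ .code = BPF_ALU64 | BPF_MOV | BPF_K,
  	  .dst_reg = BPF_REG_0 },              /* r0 = 0 */
  	{ .code = BPF_JMP | BPF_EXIT },        /* exit   */
  };
  int fd;

  getrlimit(RLIMIT_MEMLOCK, &rlim_init);
  /* with the limit at 0, the load only succeeds if memory is charged
   * to the memory cgroup instead of counted against RLIMIT_MEMLOCK */
  setrlimit(RLIMIT_MEMLOCK, &rlim_zero);
  fd = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL",
  		   insns, sizeof(insns) / sizeof(insns[0]), NULL);
  setrlimit(RLIMIT_MEMLOCK, &rlim_init);
  if (fd >= 0)
  	close(fd);   /* load succeeded: memcg-based accounting */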

This approach was earlier proposed for the probe in libbpf itself [3],
but given that the library may be used in multithreaded applications,
the probe could have undesirable consequences if one thread attempts to
lock kernel memory while memlock rlimit is at 0. Since bpftool is
single-threaded and the rlimit is process-based, this is fine to do in
bpftool itself.

This probe was inspired by the similar one from the cilium/ebpf Go
library [4].

  [0] commit 97306be45f ("Merge branch 'switch to memcg-based memory accounting'")
  [1] commit a777e18f1b ("bpftool: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK")
  [2] commit 6b4384ff10 ("Revert "bpftool: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK"")
  [3] https://lore.kernel.org/bpf/20220609143614.97837-1-quentin@isovalent.com/t/#u
  [4] https://github.com/cilium/ebpf/blob/v0.9.0/rlimit/rlimit.go#L39

Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/bpf/20220629111351.47699-1-quentin@isovalent.com
2022-06-29 23:33:02 +02:00
Alexei Starovoitov
d17b557e5e Merge branch 'bpf: cgroup_sock lsm flavor'
Stanislav Fomichev says:

====================

This series implements a new lsm flavor for attaching per-cgroup programs to
existing lsm hooks. The cgroup is taken from 'current', unless the
first argument of the hook is 'struct socket', in which case the
cgroup association is taken from the socket. The attachment
looks like a regular per-cgroup attachment: we add a new BPF_LSM_CGROUP
attach type which, together with attach_btf_id, signals per-cgroup lsm.
Behind the scenes, we allocate a trampoline shim program and
attach it to the lsm hook. This program looks up the cgroup from
current/socket and runs the cgroup's effective prog array. The rest of
the per-cgroup BPF stays the same: hierarchy, local storage, retval
conventions (return 1 == success).

Current limitations:
* haven't considered sleepable bpf; can be extended later on
* not sure the verifier does the right thing with null checks;
  see latest selftest for details
* total of 10 (global) per-cgroup LSM attach points

v11:
- Martin: address selftest memory & fd leaks
- Martin: address moving into root (instead have another temp leaf cgroup)
- Martin: move tools/include/uapi/linux/bpf.h change from libbpf patch
  into 'sync tools' patch

v10:
- Martin: reword commit message, drop outdated items
- Martin: remove rcu_real_lock from __cgroup_bpf_run_lsm_current
- Martin: remove CONFIG_BPF_LSM from cgroup_bpf_release
- Martin: fix leaking shim reference in bpf_cgroup_link_release
- Martin: WARN_ON_ONCE for bpf_trampoline_lookup in bpf_trampoline_unlink_cgroup_shim
- Martin: sync tools/include/linux/btf_ids.h
- Martin: move progs/flags closer to the places where they are used in __cgroup_bpf_query
- Martin: remove sk_clone_security & sctp_bind_connect from bpf_lsm_locked_sockopt_hooks
- Martin: try to determine vmlinux btf_id in bpftool
- Martin: update tools header in a separate commit
- Quentin: do libbpf_find_kernel_btf from the ops that need it
- lkp@intel.com: another build failure

v9:
The major change since the last version is the switch to bpf_setsockopt to
change the socket state instead of letting the progs poke the socket
directly. This, in turn, highlights the challenge that we need to care
about whether the socket is locked or not when we call bpf_setsockopt.
(With my original example selftest, the hooks run early enough in the
init phase for this not to matter.)

For now, I've added two btf id lists:
* hooks where we know the socket is locked and it's safe to call bpf_setsockopt
* hooks where we know the socket is _not_ locked, but the hook works on
  the socket that's not yet exposed to userspace so it should be safe
  (for this mode, special new set of bpf_{s,g}etsockopt helpers
   is added; they don't have sock_owned_by_me check)

Going forward, for the rest of the hooks, this might be a good motivation
to expand lsm cgroup to support sleeping bpf and allow the callers to
lock/unlock sockets or have a new bpf_setsockopt variant that does the
locking.

- ifdef around cleanup in cgroup_bpf_release
- Andrii: a few nits in libbpf patches
- Martin: remove unused btf_id_set_index
- Martin: bring back refcnt for cgroup_atype
- Martin: make __cgroup_bpf_query a bit more readable
- Martin: expose dst_prog->aux->attach_btf as attach_btf_obj_id as well
- Martin: reorg check_return_code path for BPF_LSM_CGROUP
- Martin: return directly from check_helper_call (instead of goto err)
- Martin: add note to new warning in check_return_code, print only for void hooks
- Martin: remove confusing shim reuse
- Martin: use bpf_{s,g}etsockopt instead of poking into socket data
- Martin: use CONFIG_CGROUP_BPF in bpf_prog_alloc_no_stats/bpf_prog_free_deferred

v8:
- CI: fix compile issue
- CI: fix broken bpf_cookie
- Yonghong: remove __bpf_trampoline_unlink_prog comment
- Yonghong: move cgroup_atype around to fill the gap
- Yonghong: make bpf_lsm_find_cgroup_shim void
- Yonghong: rename regs to args
- Yonghong: remove if(current) check
- Martin: move refcnt into bpf_link
- Martin: move shim management to bpf_link ops
- Martin: use cgroup_atype for shim only
- Martin: go back to arrays for managing cgroup_atype(s)
- Martin: export bpf_obj_id(aux->attach_btf)
- Andrii: reorder SEC_DEF("lsm_cgroup+")
- Andrii: OPTS_SET instead of OPTS_HAS
- Andrii: rename attach_btf_func_id
- Andrii: move into 1.0 map

v7:
- there were a lot of comments last time, hope I didn't forget anything,
  some of the bigger ones:
  - Martin: use/extend BTF_SOCK_TYPE_SOCKET
  - Martin: expose bpf_set_retval
  - Martin: reject 'return 0' at the verifier for 'void' hooks
  - Martin: prog_query returns all BPF_LSM_CGROUP, prog_info
    returns attach_btf_func_id
  - Andrii: split libbpf changes
  - Andrii: add field access test to test_progs, not test_verifier (still
    using asm though)
- things that I haven't addressed, stating them here explicitly, let
  me know if some of these are still problematic:
  1. Andrii: exposing only link-based api: seems like the changes
     to support non-link-based ones are minimal, couple of lines,
     so seems like it worth having it?
  2. Alexei: applying cgroup_atype for all cgroup hooks, not only
     cgroup lsm: looks a bit harder to apply everywhere that I
     originally thought; with lsm cgroup, we have a shim_prog pointer where
     we store cgroup_atype; for non-lsm programs, we don't have a
     trace program where to store it, so we still need some kind
     of global table to map from "static" hook to "dynamic" slot.
     So I'm dropping this "can be easily extended" clause from the
     description for now. I have converted this whole machinery
     to an RCU-managed list to remove synchronize_rcu().
- also note that I had to introduce new bpf_shim_tramp_link and
  moved refcnt there; we need something to manage new bpf_tramp_link

v6:
- remove active count & stats for shim program (Martin KaFai Lau)
- remove NULL/error check for btf_vmlinux (Martin)
- don't check cgroup_atype in bpf_cgroup_lsm_shim_release (Martin)
- use old_prog (instead of passed one) in __cgroup_bpf_detach (Martin)
- make sure attach_btf_id is the same in __cgroup_bpf_replace (Martin)
- enable cgroup local storage and test it (Martin)
- properly implement prog query and add bpftool & tests (Martin)
- prohibit non-shared cgroup storage mode for BPF_LSM_CGROUP (Martin)

v5:
- __cgroup_bpf_run_lsm_socket remove NULL sock/sk checks (Martin KaFai Lau)
- __cgroup_bpf_run_lsm_{socket,current} s/prog/shim_prog/ (Martin)
- make sure bpf_lsm_find_cgroup_shim works for hooks without args (Martin)
- __cgroup_bpf_attach make sure attach_btf_id is the same when replacing (Martin)
- call bpf_cgroup_lsm_shim_release only for LSM_CGROUP (Martin)
- drop BPF_LSM_CGROUP from bpf_attach_type_to_tramp (Martin)
- drop jited check from cgroup_shim_find (Martin)
- new patch to convert cgroup_bpf to hlist_node (Jakub Sitnicki)
- new shim flavor for 'struct sock' + list of exceptions (Martin)

v4:
- fix build when jit is on but syscall is off

v3:
- add BPF_LSM_CGROUP to bpftool
- use simple int instead of refcnt_t (to avoid use-after-free
  false positive)

v2:
- addressed build bot failures
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-29 13:21:53 -07:00
Stanislav Fomichev
dca85aac88 selftests/bpf: lsm_cgroup functional test
Functional test that exercises the following:

1. apply default sk_priority policy
2. permit TX-only AF_PACKET socket
3. cgroup attach/detach/replace
4. reusing trampoline shim

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/r/20220628174314.1216643-12-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-29 13:21:52 -07:00
Stanislav Fomichev
596f5fb2ea bpftool: implement cgroup tree for BPF_LSM_CGROUP
$ bpftool --nomount prog loadall $KDIR/tools/testing/selftests/bpf/lsm_cgroup.o /sys/fs/bpf/x
$ bpftool cgroup attach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_alloc
$ bpftool cgroup attach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_bind
$ bpftool cgroup attach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_clone
$ bpftool cgroup attach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_post_create
$ bpftool cgroup tree
CgroupPath
ID       AttachType      AttachFlags     Name
/sys/fs/cgroup
6        lsm_cgroup                      socket_post_create bpf_lsm_socket_post_create
8        lsm_cgroup                      socket_bind     bpf_lsm_socket_bind
10       lsm_cgroup                      socket_alloc    bpf_lsm_sk_alloc_security
11       lsm_cgroup                      socket_clone    bpf_lsm_inet_csk_clone

$ bpftool cgroup detach /sys/fs/cgroup lsm_cgroup pinned /sys/fs/bpf/x/socket_post_create
$ bpftool cgroup tree
CgroupPath
ID       AttachType      AttachFlags     Name
/sys/fs/cgroup
8        lsm_cgroup                      socket_bind     bpf_lsm_socket_bind
10       lsm_cgroup                      socket_alloc    bpf_lsm_sk_alloc_security
11       lsm_cgroup                      socket_clone    bpf_lsm_inet_csk_clone

Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220628174314.1216643-11-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-29 13:21:52 -07:00
Stanislav Fomichev
a4b2f3cf69 libbpf: implement bpf_prog_query_opts
Implement bpf_prog_query_opts as a more extendable version of
bpf_prog_query. Expose the new prog_attach_flags and attach_btf_func_id
fields as well (usage sketch below):

* prog_attach_flags holds per-program attach flags; relevant only for
  lsm cgroup programs, which might have different attach_flags
  per attach_btf_id
* attach_btf_func_id is a new field exposed for prog_query which
  specifies the real btf function id for lsm cgroup attachments
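
A usage sketch (array sizes arbitrary):

  __u32 prog_ids[16], attach_flags[16];
  LIBBPF_OPTS(bpf_prog_query_opts, opts,
  	.prog_ids = prog_ids,
  	.prog_cnt = 16,
  	.prog_attach_flags = attach_flags,
  );

  if (!bpf_prog_query_opts(cgroup_fd, BPF_LSM_CGROUP, &opts)) {
  	/* opts.prog_cnt holds the number of attached programs;
  	 * attach_flags[i] holds the flags of the i-th program */
  }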

Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220628174314.1216643-10-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-29 13:21:52 -07:00
Stanislav Fomichev
bffcf34878 libbpf: add lsm_cgroup_sock type
lsm_cgroup/ is the prefix for BPF_LSM_CGROUP.
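
For example, a program using the new section prefix might look like this
(hook and body illustrative):

  SEC("lsm_cgroup/socket_bind")
  int BPF_PROG(bind_hook, struct socket *sock, struct sockaddr *address,
  	     int addrlen)
  {
  	return 1;   /* cgroup retval convention: 1 == allow */
  }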

Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220628174314.1216643-9-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-29 13:21:52 -07:00
Stanislav Fomichev
3b34bcb946 tools/bpf: Sync btf_ids.h to tools
btf_ids.h has been slowly getting out of sync; let's update it.

resolve_btfids usage has been updated to match the header changes.

Also bring new parts of tools/include/uapi/linux/bpf.h.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220628174314.1216643-8-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-29 13:21:52 -07:00
Stanislav Fomichev
9113d7e48e bpf: expose bpf_{g,s}etsockopt to lsm cgroup
I don't see how to make it nice without introducing btf id lists
for the hooks where these helpers are allowed. Some LSM hooks
work on locked sockets, some trigger early and
don't grab any locks, so have two lists for now:

1. LSM hooks which trigger under socket lock - a minority of the hooks,
   but the ideal case for us; we can expose existing BTF-based helpers
2. LSM hooks which trigger without socket lock, but which trigger
   early in the socket creation path where it should be safe to
   do setsockopt without any locks (see the sketch below)
3. The rest are prohibited. I'm thinking that this use-case might
   be a good gateway to sleeping lsm cgroup hooks in the future.
   We can either expose lock/unlock operations (and add tracking
   to the verifier) or have another set of bpf_setsockopt
   wrappers that grab the locks and might sleep.
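
A sketch of a hook from category 2 using the helper (hook choice and
values illustrative):

  SEC("lsm_cgroup/socket_post_create")
  int BPF_PROG(post_create, struct socket *sock, int family, int type,
  	     int protocol, int kern)
  {
  	int prio = 123;

  	/* socket_post_create runs without the socket lock, but before
  	 * the socket is exposed to userspace, so setsockopt is safe */
  	bpf_setsockopt(sock->sk, SOL_SOCKET, SO_PRIORITY,
  		       &prio, sizeof(prio));
  	return 1;
  }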

Reviewed-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220628174314.1216643-7-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-29 13:21:52 -07:00
Stanislav Fomichev
b79c9fc955 bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP
We have two options:
1. Treat all BPF_LSM_CGROUP the same, regardless of attach_btf_id
2. Treat BPF_LSM_CGROUP+attach_btf_id as a separate hook point

I was doing (2) in the original patch, but switching to (1) here:

* bpf_prog_query returns all attached BPF_LSM_CGROUP programs
regardless of attach_btf_id
* attach_btf_id is exported via bpf_prog_info

Reviewed-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220628174314.1216643-6-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-29 13:21:52 -07:00
Stanislav Fomichev
c0e19f2c9a bpf: minimize number of allocated lsm slots per program
The previous patch adds a 1:1 mapping between all 211 LSM hooks
and the bpf_cgroup program array. Instead of reserving a slot per
possible hook, reserve 10 slots per cgroup for lsm programs.
Those slots are dynamically allocated on demand and reclaimed.

struct cgroup_bpf {
	struct bpf_prog_array *    effective[33];        /*     0   264 */
	/* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */
	struct hlist_head          progs[33];            /*   264   264 */
	/* --- cacheline 8 boundary (512 bytes) was 16 bytes ago --- */
	u8                         flags[33];            /*   528    33 */

	/* XXX 7 bytes hole, try to pack */

	struct list_head           storages;             /*   568    16 */
	/* --- cacheline 9 boundary (576 bytes) was 8 bytes ago --- */
	struct bpf_prog_array *    inactive;             /*   584     8 */
	struct percpu_ref          refcnt;               /*   592    16 */
	struct work_struct         release_work;         /*   608    72 */

	/* size: 680, cachelines: 11, members: 7 */
	/* sum members: 673, holes: 1, sum holes: 7 */
	/* last cacheline: 40 bytes */
};

Reviewed-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220628174314.1216643-5-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-29 13:21:52 -07:00
Stanislav Fomichev
69fd337a97 bpf: per-cgroup lsm flavor
Allow attaching to lsm hooks in the cgroup context.

Attaching to per-cgroup LSM works exactly like attaching
to other per-cgroup hooks. A new BPF_LSM_CGROUP attach type is added
to trigger the new mode; the actual lsm hook we attach to is
signaled via the existing attach_btf_id.

For the hooks that have 'struct socket' or 'struct sock' as their first
argument, we use the cgroup associated with that socket. For the rest,
we use the 'current' cgroup (this is all on the default hierarchy == v2
only). Note that for some hooks that work on 'struct sock' we still
take the cgroup from 'current' because some of them work on a socket
that hasn't been properly initialized yet.

Behind the scenes, we allocate a shim program that is attached
to the trampoline and runs cgroup effective BPF programs array.
This shim has some rudimentary ref counting and can be shared
between several programs attaching to the same lsm hook from
different cgroups.

Note that this patch bloats the cgroup size because we add 211
cgroup_bpf_attach_type(s) for simplicity's sake. This will be
addressed in the subsequent patch.

Also note that we only add the non-sleepable flavor for now. To enable
sleepable use-cases, bpf_prog_run_array_cg has to grab trace rcu,
shim programs have to be freed via trace rcu, cgroup_bpf.effective
should also be trace-rcu-managed, plus maybe some other changes that
I'm not aware of.

Reviewed-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220628174314.1216643-4-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-29 13:21:51 -07:00
Stanislav Fomichev
00442143a2 bpf: convert cgroup_bpf.progs to hlist
This lets us reclaim some space to be used by new cgroup lsm slots.

Before:
struct cgroup_bpf {
	struct bpf_prog_array *    effective[23];        /*     0   184 */
	/* --- cacheline 2 boundary (128 bytes) was 56 bytes ago --- */
	struct list_head           progs[23];            /*   184   368 */
	/* --- cacheline 8 boundary (512 bytes) was 40 bytes ago --- */
	u32                        flags[23];            /*   552    92 */

	/* XXX 4 bytes hole, try to pack */

	/* --- cacheline 10 boundary (640 bytes) was 8 bytes ago --- */
	struct list_head           storages;             /*   648    16 */
	struct bpf_prog_array *    inactive;             /*   664     8 */
	struct percpu_ref          refcnt;               /*   672    16 */
	struct work_struct         release_work;         /*   688    32 */

	/* size: 720, cachelines: 12, members: 7 */
	/* sum members: 716, holes: 1, sum holes: 4 */
	/* last cacheline: 16 bytes */
};

After:
struct cgroup_bpf {
	struct bpf_prog_array *    effective[23];        /*     0   184 */
	/* --- cacheline 2 boundary (128 bytes) was 56 bytes ago --- */
	struct hlist_head          progs[23];            /*   184   184 */
	/* --- cacheline 5 boundary (320 bytes) was 48 bytes ago --- */
	u8                         flags[23];            /*   368    23 */

	/* XXX 1 byte hole, try to pack */

	/* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
	struct list_head           storages;             /*   392    16 */
	struct bpf_prog_array *    inactive;             /*   408     8 */
	struct percpu_ref          refcnt;               /*   416    16 */
	struct work_struct         release_work;         /*   432    72 */

	/* size: 504, cachelines: 8, members: 7 */
	/* sum members: 503, holes: 1, sum holes: 1 */
	/* last cacheline: 56 bytes */
};

Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220628174314.1216643-3-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-29 13:21:51 -07:00
Stanislav Fomichev
af3f413400 bpf: add bpf_func_t and trampoline helpers
I'll be adding lsm cgroup specific helpers that grab
trampoline mutex.

No functional changes.

Reviewed-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220628174314.1216643-2-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-29 13:21:51 -07:00
Alexei Starovoitov
c5c7358e4c Merge branch 'libbpf: remove deprecated APIs'
Andrii Nakryiko says:

====================

This patch set removes all the deprecated APIs in preparation for the 1.0
release. It also makes libbpf_set_strict_mode() a no-op (but keeps it so that
pre-1.0 applications stay buildable and dynamically linkable against libbpf
1.0 if they were already libbpf-1.0 ready) and starts enforcing all the
behaviors that were previously opt-in through libbpf_set_strict_mode().

xsk.{c,h} parts that are now properly provided by libxdp ([0]) are still used
by xdpxceiver.c in selftests/bpf, so I've moved xsk.{c,h} with barely any
changes under selftests/bpf.

Other than that, apart from removing all the LIBBPF_DEPRECATED-marked APIs,
there is a bunch of internal clean-ups this allows. I've also "restored" the
libbpf.map inheritance chain while removing all the deprecated APIs. I think
that's the right way to do this, as applications using libbpf as a shared
library but not relying on any deprecated APIs (i.e., "good citizens" that
prepared for the libbpf 1.0 release ahead of time to minimize disruption)
should be able to link against both 0.x and 1.x versions. Either way, it
doesn't seem to break anything and preserves the history of when each
"surviving" API was added.

  [0] https://github.com/xdp-project/xdp-tools/tree/master/lib/libxdp

v1->v2:
  - rebase on latest bpf-next now that Jiri's perf patches landed;
  - fix xsk.o dependency in Makefile to ensure libbpf headers are installed
    reliably.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:34 -07:00
Andrii Nakryiko
ab9a5a05dc libbpf: fix up few libbpf.map problems
Seems like we missed adding 2 APIs to libbpf.map and another API was
misspelled. Fix both in libbpf.map.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-16-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:33 -07:00
Andrii Nakryiko
bd054102a8 libbpf: enforce strict libbpf 1.0 behaviors
Remove support for legacy features and behaviors that previously had to
be disabled by calling libbpf_set_strict_mode():
  - legacy BPF map definitions are not supported now;
  - RLIMIT_MEMLOCK auto-setting, if necessary, is always on (but see
    libbpf_set_memlock_rlim());
  - program name is used for program pinning (instead of section name);
  - cleaned up error returning logic;
  - entry BPF programs should have SEC() always.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-15-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:33 -07:00
Andrii Nakryiko
31e4272197 selftests/bpf: remove last tests with legacy BPF map definitions
Libbpf 1.0 stops supporting legacy-style BPF map definitions. Selftests
have been migrated away from using legacy BPF map definitions except for
two selftests, to make sure that the legacy functionality still worked
in pre-1.0 libbpf. Now it's time to let those tests go as libbpf 1.0 is
imminent.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-14-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:33 -07:00
Andrii Nakryiko
450b167fb9 libbpf: clean up SEC() handling
Get rid of sloppy prefix logic and remove deprecated xdp_{devmap,cpumap}
sections.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-13-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:33 -07:00
Andrii Nakryiko
cf90a20db8 libbpf: remove internal multi-instance prog support
Clean up internals that had to deal with the possibility of
multi-instance bpf_programs. Libbpf 1.0 doesn't support this, so all
this is not necessary now and can be simplified.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-12-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:33 -07:00
Andrii Nakryiko
a11113a2dc libbpf: cleanup LIBBPF_DEPRECATED_SINCE supporting macros for v0.x
Keep the LIBBPF_DEPRECATED_SINCE macro "framework" for future
deprecations, but clean up 0.x related helper macros.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-11-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:33 -07:00
Andrii Nakryiko
b4bda502df libbpf: remove multi-instance and custom private data APIs
Remove all the public APIs that are related to creating multi-instance
bpf_programs through custom preprocessing callback and generally working
with them.

Also remove all the bpf_{object,map,program}__[set_]priv() APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-10-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:32 -07:00
Andrii Nakryiko
146bf811f5 libbpf: remove most other deprecated high-level APIs
Remove a bunch of high-level bpf_object/bpf_map/bpf_program related
APIs. All the APIs related to private per-object/map/prog state, the
program preprocessing callback, and generally everything multi-instance
related are removed in a separate patch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-9-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:32 -07:00
Andrii Nakryiko
9a590538ba libbpf: remove prog_info_linear APIs
Remove prog_info_linear-related APIs previously used by perf.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:32 -07:00
Andrii Nakryiko
22dd7a58b2 libbpf: clean up perfbuf APIs
Remove deprecated perfbuf APIs and clean up opts structs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-7-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:32 -07:00
Andrii Nakryiko
aaf6886d9b libbpf: remove deprecated BTF APIs
Get rid of deprecated BTF-related APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:32 -07:00
Andrii Nakryiko
d320fad217 libbpf: remove deprecated probing APIs
Get rid of deprecated feature-probing APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:32 -07:00
Andrii Nakryiko
53e6af3a76 libbpf: remove deprecated XDP APIs
Get rid of deprecated bpf_set_link*() and bpf_get_link*() APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:32 -07:00
Andrii Nakryiko
765a34130e libbpf: remove deprecated low-level APIs
Drop low-level APIs as well as high-level (and very confusingly named)
BPF object loading bpf_prog_load_xattr() and bpf_prog_load_deprecated()
APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:32 -07:00
Andrii Nakryiko
f366006342 libbpf: move xsk.{c,h} into selftests/bpf
Remove deprecated xsk APIs from libbpf. But given that we have selftests
relying on them, move those files (with minimal adjustments to make them
compilable) under selftests/bpf.

We also drop all the removed APIs from libbpf.map, while overall
keeping the version inheritance chain, as most APIs are backwards
compatible so there is no need to reassign them as LIBBPF_1.0.0 versions.

Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-28 13:13:32 -07:00
John Fastabend
697fb80a53 bpf: Fix sockmap calling sleepable function in teardown path
syzbot reproduced the bug ...

 BUG: sleeping function called from invalid context at kernel/workqueue.c:3010

... with the following stack trace fragment ...

 start_flush_work kernel/workqueue.c:3010 [inline]
 __flush_work+0x109/0xb10 kernel/workqueue.c:3074
 __cancel_work_timer+0x3f9/0x570 kernel/workqueue.c:3162
 sk_psock_stop+0x4cb/0x630 net/core/skmsg.c:802
 sock_map_destroy+0x333/0x760 net/core/sock_map.c:1581
 inet_csk_destroy_sock+0x196/0x440 net/ipv4/inet_connection_sock.c:1130
 __tcp_close+0xd5b/0x12b0 net/ipv4/tcp.c:2897
 tcp_close+0x29/0xc0 net/ipv4/tcp.c:2909

... introduced by d8616ee2af. A quick trace of the code path makes the
bug obvious:

   inet_csk_destroy_sock(sk)
     sk_prot->destroy(sk);      <--- sock_map_destroy
        sk_psock_stop(, true);   <--- true so cancel workqueue
          cancel_work_sync()     <--- splat, because *_bh_disable()

We cannot call cancel_work_sync() from inside the destroy path. So mark
the sk_psock_stop call to skip this cancel_work_sync(). This avoids
the BUG, but means we may run sk_psock_backlog after or during the
destroy op. We zapped the ingress_skb queue in sk_psock_stop (safe to
do with local_bh_disable), so it is empty and the sk_psock_backlog work
item will not find any pkts to process there. However, because we are
not going to wait for it or clear its ->state, it's possible it kicks off
or is already running. This should be 'safe' up until the psock drops its
refcnt to psock->sk. The sock_put() that drops this reference is only
done at psock destroy time from sk_psock_destroy(). That is done through
a workqueue once sk_psock_drop() is called and the psock refcnt reaches
0. And importantly, sk_psock_destroy() does a cancel_work_sync(), so the
trivial fix works.
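
The fix is essentially a one-line flip in the destroy path (a sketch, not
the verbatim diff):

  /* net/core/sock_map.c, sock_map_destroy() */
  - sk_psock_stop(psock, true);   /* true => cancel_work_sync(), may sleep */
  + sk_psock_stop(psock, false);  /* skip the synchronous cancel here */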

I've had hit-or-miss luck reproducing this, catching it once or twice with
the provided reproducer when running with many runners. However, syzkaller
is very good at reproducing, so I'm relying on syzkaller to verify the fix.

Fixes: d8616ee2af ("bpf, sockmap: Fix sk->sk_forward_alloc warn_on in sk_stream_kill_queues")
Reported-by: syzbot+140186ceba0c496183bc@syzkaller.appspotmail.com
Suggested-by: Hillf Danton <hdanton@sina.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Yufen <wangyufen@huawei.com>
Link: https://lore.kernel.org/bpf/20220628035803.317876-1-john.fastabend@gmail.com
2022-06-28 09:30:03 +02:00
Daniel Müller
fd75733da2 bpf: Merge "types_are_compat" logic into relo_core.c
BPF type compatibility checks (bpf_core_types_are_compat()) are
currently duplicated between kernel and user space. That's a historical
artifact more than an intentional choice and can lead to subtle bugs
where one implementation is adjusted but the other is forgotten.

That happened with the enum64 work, for example, where the libbpf side
was changed (commit 23b2a3a8f6 ("libbpf: Add enum64 relocation
support")) to use the btf_kind_core_compat() helper function but the
kernel side was not (commit 6089fb325c ("bpf: Add btf enum64
support")).

This patch addresses both issues: it merges the two implementations,
moving them into relo_core.c, and fixes the aforementioned kind check
(by giving preference to libbpf's already adjusted logic).

For discussion of the topic, please refer to:
https://lore.kernel.org/bpf/CAADnVQKbWR7oarBdewgOBZUPzryhRYvEbkhyPJQHHuxq=0K1gw@mail.gmail.com/T/#mcc99f4a33ad9a322afaf1b9276fb1f0b7add9665

Changelog:
v1 -> v2:
- limited libbpf recursion limit to 32
- changed name to __bpf_core_types_are_compat
- included warning previously present in libbpf version
- merged kernel and user space changes into a single patch

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220623182934.2582827-1-deso@posteo.net
2022-06-24 14:15:37 -07:00
Shahab Vahedi
2f6d1e0f8f bpf, docs: Fix the code formatting in instruction-set
A minor typo fix to include "| BPF_LD" in the preceding
code phrase:

``BPF_IND`` | BPF_LD --> ``BPF_IND | BPF_LD``

Signed-off-by: Shahab Vahedi <shahab@synopsys.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/b6120b31-3d1d-bf2d-2f2a-aa768d91257b@synopsys.com
2022-06-24 13:58:32 -07:00
Andrii Nakryiko
780d3d5a24 Merge branch 'perf tools: Fix prologue generation'
Jiri Olsa says:

====================

hi,
sending change we discussed some time ago [1] to get rid of
some deprecated functions we use in perf prologue code.

Despite the gloomy discussion I think the final code does
not look that bad ;-)

This patchset removes following libbpf functions from perf:
  bpf_program__set_prep
  bpf_program__nth_fd
  struct bpf_prog_prep_result

v5 changes:
  - squashed patches together so we don't break bisection [Arnaldo]

v4 changes:
  - fix typo [Andrii]

v3 changes:
  - removed R0/R1 zero init in libbpf_prog_prepare_load_fn,
    because it's not needed [Andrii]
  - rebased/post on top of bpf-next/master which now has
    all the needed perf/core changes

v2 changes:
  - use fallback section prog handler, so we don't need to
    use section prefix [Andrii]
  - realloc prog->insns array in bpf_program__set_insns [Andrii]
  - squash patch 1 from previous version with
    bpf_program__set_insns change [Daniel]
  - patch 3 already merged [Arnaldo]
  - added more comments

thanks,
jirka

[1] https://lore.kernel.org/bpf/CAEf4BzaiBO3_617kkXZdYJ8hS8YF--ZLgapNbgeeEJ-pY0H88g@mail.gmail.com/
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-06-24 13:36:23 -07:00
Jiri Olsa
b168852eb8 perf tools: Rework prologue generation code
Some functions we use for bpf prologue generation are going to be
deprecated. This change reworks the current code not to use them.

We need to replace following functions/struct:
   bpf_program__set_prep
   bpf_program__nth_fd
   struct bpf_prog_prep_result

Currently we use bpf_program__set_prep to hook a perf callback before the
program is loaded and to provide new instructions with the prologue.

We replace this functionality by taking the instructions for a specific
program, attaching the prologue to them, and loading such new ebpf
programs with the prologue via separate bpf_prog_load calls (outside the
libbpf load machinery).

Before we can take and use the program's instructions, we need libbpf to
actually load it. This way we get the final shape of its instructions
with all relocations and verifier adjustments.

There's one glitch though: the perf kprobe program already assumes
generated prologue code with proper values in the argument registers,
so loading such a program directly will fail in the verifier.

That's where the fallback pre-load handler fits in: it prepends the
initialization code to the program. Once such a program is loaded,
we take its instructions, cut off the initialization code, and prepend
the prologue.
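
Schematically (buffer and counts hypothetical, not the actual perf code):

  /* take the loaded insns, drop the pre-load init stub,
   * prepend the real prologue, and load manually */
  const struct bpf_insn *orig = bpf_program__insns(prog);
  size_t orig_cnt = bpf_program__insn_cnt(prog);
  struct bpf_insn buf[BPF_MAXINSNS];
  size_t new_cnt = prologue_cnt + (orig_cnt - init_cnt);

  memcpy(buf, prologue, prologue_cnt * sizeof(*buf));
  memcpy(buf + prologue_cnt, orig + init_cnt,
         (orig_cnt - init_cnt) * sizeof(*buf));
  fd = bpf_prog_load(BPF_PROG_TYPE_KPROBE, NULL, "GPL",
  		   buf, new_cnt, NULL);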

I know.. sorry ;-)

To have access to the program when loading, this patch adds support for
registering a 'fallback' section handler to take care of perf kprobe
programs. The fallback means that it handles any section definition
besides the ones that libbpf handles itself.

The handler serves two purposes:
  - allows perf programs to have special arguments in the section name
  - allows perf to use a pre-load callback where we can attach init
    code (zeroing all argument registers) to each perf program

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/bpf/20220616202214.70359-2-jolsa@kernel.org
2022-06-24 13:36:23 -07:00