[ Upstream commit 658ac06801315b739774a15796ff06913ef5cad5 ]
Fix the following error when building bpftool:
CLANG profiler.bpf.o
CLANG pid_iter.bpf.o
skeleton/profiler.bpf.c:18:21: error: invalid application of 'sizeof' to an incomplete type 'struct bpf_perf_event_value'
__uint(value_size, sizeof(struct bpf_perf_event_value));
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tools/bpf/bpftool/bootstrap/libbpf/include/bpf/bpf_helpers.h:13:39: note: expanded from macro '__uint'
tools/bpf/bpftool/bootstrap/libbpf/include/bpf/bpf_helper_defs.h:7:8: note: forward declaration of 'struct bpf_perf_event_value'
struct bpf_perf_event_value;
^
struct bpf_perf_event_value is used in the kernel only when
CONFIG_BPF_EVENTS is enabled, so it has no BTF entry when that option
is disabled.
Define struct bpf_perf_event_value___local with the
`preserve_access_index` attribute inside the pid_iter BPF prog to
allow compiling on any configs. It is a full mirror of a UAPI
structure, so is compatible both with and w/o CO-RE.
bpf_perf_event_read_value() requires a pointer of the original type,
so a cast is needed.
Fixes: 47c09d6a9f67 ("bpftool: Introduce "prog profile" command")
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230707095425.168126-5-quentin@isovalent.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 44ba7b30e84fb40da2295e85a6d209e199fdc977 ]
In order to allow the BPF program in bpftool's pid_iter.bpf.c to compile
correctly on hosts where vmlinux.h does not define
BPF_LINK_TYPE_PERF_EVENT (running kernel versions lower than 5.15, for
example), define and use a local copy of the enum value. This requires
LLVM 12 or newer to build the BPF program.
Fixes: cbdaf71f7e65 ("bpftool: Add bpf_cookie to link output")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230707095425.168126-4-quentin@isovalent.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 67a43462ee2405c94e985a747bdcb8e3a0d66203 ]
When building bpftool with !CONFIG_PERF_EVENTS:
skeleton/pid_iter.bpf.c:47:14: error: incomplete definition of type 'struct bpf_perf_link'
perf_link = container_of(link, struct bpf_perf_link, link);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tools/bpf/bpftool/bootstrap/libbpf/include/bpf/bpf_helpers.h:74:22: note: expanded from macro 'container_of'
((type *)(__mptr - offsetof(type, member))); \
^~~~~~~~~~~~~~~~~~~~~~
tools/bpf/bpftool/bootstrap/libbpf/include/bpf/bpf_helpers.h:68:60: note: expanded from macro 'offsetof'
#define offsetof(TYPE, MEMBER) ((unsigned long)&((TYPE *)0)->MEMBER)
~~~~~~~~~~~^
skeleton/pid_iter.bpf.c:44:9: note: forward declaration of 'struct bpf_perf_link'
struct bpf_perf_link *perf_link;
^
&bpf_perf_link is being defined and used only under the ifdef.
Define struct bpf_perf_link___local with the `preserve_access_index`
attribute inside the pid_iter BPF prog to allow compiling on any
configs. CO-RE will substitute it with the real struct bpf_perf_link
accesses later on.
container_of() uses offsetof(), which does the necessary CO-RE
relocation if the field is specified with `preserve_access_index` - as
is the case for struct bpf_perf_link___local.
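The container_of()/offsetof() step can be sketched on the host as follows. This is only an illustration under stated assumptions: struct bpf_link_stub, LOCAL_CORE, container_of_sketch() and perf_link_from_link() are hypothetical names, the mirror declares just two fields, and the attribute is compiled away unless the compiler defines __bpf__ (as clang does for BPF targets), so no CO-RE relocation happens in a host build.

```c
#include <stddef.h>

/* Outside a BPF build the attribute is meaningless, so compile it away. */
#ifdef __bpf__
#define LOCAL_CORE __attribute__((preserve_access_index))
#else
#define LOCAL_CORE
#endif

/* Hypothetical stand-in for the kernel's struct bpf_link. */
struct bpf_link_stub {
	unsigned int id;
};

/* Local mirror: only the fields the program touches are declared. */
struct bpf_perf_link___local {
	struct bpf_link_stub link;
	void *perf_file;
} LOCAL_CORE;

/* Simplified container_of(): step back from the member to its parent. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct bpf_perf_link___local *
perf_link_from_link(struct bpf_link_stub *link)
{
	return container_of_sketch(link, struct bpf_perf_link___local, link);
}
```

In a real BPF build the offsetof() inside the macro is what gets relocated against the running kernel's layout; here it is a plain compile-time constant.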
Fixes: cbdaf71f7e65 ("bpftool: Add bpf_cookie to link output")
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230707095425.168126-3-quentin@isovalent.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4cbeeb0dc02f8ac7b975b2ab0080ace53d43d62a ]
When CONFIG_PERF_EVENTS is not set, struct perf_event remains empty.
However, the structure is being used by bpftool indirectly via BTF.
This leads to:
skeleton/pid_iter.bpf.c:49:30: error: no member named 'bpf_cookie' in 'struct perf_event'
return BPF_CORE_READ(event, bpf_cookie);
~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~
...
skeleton/pid_iter.bpf.c:49:9: error: returning 'void' from a function with incompatible result type '__u64' (aka 'unsigned long long')
return BPF_CORE_READ(event, bpf_cookie);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tools and samples can't use any CONFIG_ definitions, so the fields
used there should always be present.
Define struct perf_event___local with the `preserve_access_index`
attribute inside the pid_iter BPF prog to allow compiling on any
configs. CO-RE will substitute it with the real struct perf_event
accesses later on.
Fixes: cbdaf71f7e65 ("bpftool: Add bpf_cookie to link output")
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230707095425.168126-2-quentin@isovalent.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit edd75c802855271c8610f58a2fc9e54aefc49ce5 upstream.
Building BPF selftests with custom HOSTCFLAGS yields an error:
# make HOSTCFLAGS="-O2"
[...]
HOSTCC ./tools/testing/selftests/bpf/tools/build/resolve_btfids/main.o
main.c:73:10: fatal error: linux/rbtree.h: No such file or directory
73 | #include <linux/rbtree.h>
| ^~~~~~~~~~~~~~~~
The reason is that tools/bpf/resolve_btfids/Makefile passes header
include paths by extending HOSTCFLAGS which is overridden by setting
HOSTCFLAGS in the make command (because of Makefile rules [1]).
This patch fixes the above problem by passing the include paths via
`HOSTCFLAGS_resolve_btfids` which is used by tools/build/Build.include
and can be combined with overriding HOSTCFLAGS.
[1] https://www.gnu.org/software/make/manual/html_node/Overriding.html
Fixes: 56a2df7615fa ("tools/resolve_btfids: Compile resolve_btfids as host program")
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230530123352.1308488-1-vmalik@redhat.com
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2531ba0e4ae67d6d0219400af27805fe52cd28e8 upstream.
Thorsten reported build issue with command line that defined extra
HOSTCFLAGS that were not passed into 'prepare' targets, but were
used to build resolve_btfids objects.
This results in a build failure when these objects are linked together:
/usr/bin/ld: /build.../tools/bpf/resolve_btfids//libbpf/libbpf.a(libbpf-in.o):
relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a PIE \
object; recompile with -fPIE
Fix this by passing HOSTCFLAGS in EXTRA_CFLAGS as part of the
HOST_OVERRIDES variable for prepare targets.
[1] https://lore.kernel.org/bpf/f7922132-6645-6316-5675-0ece4197bfff@leemhuis.info/
Fixes: 56a2df7615fa ("tools/resolve_btfids: Compile resolve_btfids as host program")
Reported-by: Thorsten Leemhuis <linux@leemhuis.info>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/bpf/20230209143735.4112845-1-jolsa@kernel.org
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 56a2df7615fa050cc67b89245b2a482849077939 upstream.
Compile resolve_btfids as a host program so that we can avoid
the cross-compile issues reported by Nathan.
Also we no longer need HOST_OVERRIDES for BINARY target,
just for 'prepare' targets.
Fixes: 13e07691a16f ("tools/resolve_btfids: Alter how HOSTCC is forced")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/bpf/20230202112839.1131892-1-jolsa@kernel.org
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13e07691a16ff31b209fbfce25c01ff296b05e45 upstream.
HOSTCC is always wanted when building. Setting CC to HOSTCC happens
after tools/scripts/Makefile.include is included, meaning flags are
set assuming say CC is gcc, but then it can be later set to HOSTCC
which may be clang. tools/scripts/Makefile.include is needed for host
set up and common macros in objtool's Makefile. Rather than override
CC to HOSTCC, just pass CC as HOSTCC to Makefile.build, the libsubcmd
builds and the linkage step. This means the Makefiles don't see things
like CC changing and tool flag determination, and similar, work
properly.
Also, clear the passed subdir as otherwise an outer build may break by
inadvertently passing an inappropriate value.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230124064324.672022-2-irogers@google.com
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af03299d8536d62b49c7f3cb929349eb2d66bcd5 upstream.
Previously tools/lib/subcmd was added to the include path, switch to
installing the headers and then including from that directory. This
avoids dependencies on headers internal to tools/lib/subcmd. Add the
missing subcmd directory to the affected #include.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230124064324.672022-1-irogers@google.com
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0e43662e61f2569500ab83b8188c065603530785 upstream.
When libelf is not installed in the standard location, the current
build configuration cannot locate it.
Use pkg-config to help locate libelf in such cases.
Signed-off-by: Shen Jiamin <shen_jiamin@comp.nus.edu.sg>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20221215044703.400139-1-shen_jiamin@comp.nus.edu.sg
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 04cb8453a91c7c22f60ddadb6cef0d19abb33bb5 ]
On aarch64, "bpftool feature" reports an incorrect BPF JIT limit:
$ sudo /sbin/bpftool feature
Scanning system configuration...
bpf() syscall restricted to privileged users
JIT compiler is enabled
JIT compiler hardening is disabled
JIT compiler kallsyms exports are enabled for root
skipping kernel config, can't open file: No such file or directory
Global memory limit for JIT compiler for unprivileged users is -201326592 bytes
This is because /proc/sys/net/core/bpf_jit_limit reports
$ sudo cat /proc/sys/net/core/bpf_jit_limit
68169519595520
...and an int is assumed in read_procfs(). Change read_procfs()
to return a long to avoid negative value reporting.
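The truncation can be reproduced with a small sketch. parse_sysctl_long() and truncate_to_int() are hypothetical helpers, not bpftool's read_procfs(); the sketch assumes a 64-bit long (LP64) and a compiler that wraps on narrowing conversion, as gcc and clang do.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a decimal sysctl value into a long. Returns -1 on malformed or
 * out-of-range input. Hypothetical helper, not bpftool's read_procfs().
 */
static long parse_sysctl_long(const char *text)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(text, &end, 10);
	if (end == text || errno == ERANGE)
		return -1;
	return val;
}

/* The old behaviour: squeeze the parsed value into an int. */
static int truncate_to_int(long val)
{
	return (int)val;	/* keeps only the low 32 bits on common ABIs */
}
```

Feeding the aarch64 limit 68169519595520 through truncate_to_int() gives the -201326592 seen in the report, while keeping the value as a long preserves it.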
Fixes: 7a4522bbef0c ("tools: bpftool: add probes for /proc/ eBPF parameters")
Reported-by: Nicky Veitch <nicky.veitch@oracle.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20230512113134.58996-1-alan.maguire@oracle.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 67cf52cdb6c8fa6365d29106555dacf95c9fd374 ]
When dumping the control flow graphs for programs using the 16-byte long
load instruction, we need to skip the second part of this instruction
when looking for the next instruction to process. Otherwise, we end up
printing "BUG_ld_00" from the kernel disassembler in the CFG.
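A minimal sketch of the skipping logic, assuming a simplified instruction layout: struct insn_slot and count_logical_insns() are hypothetical stand-ins rather than bpftool's CFG code, and 0x18 is the BPF_LD | BPF_IMM | BPF_DW opcode of the 16-byte load.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode of the 16-byte load: BPF_LD | BPF_IMM | BPF_DW == 0x18. */
#define LD_IMM64_OPCODE 0x18

/* Minimal stand-in for struct bpf_insn: 8 bytes per slot. */
struct insn_slot {
	uint8_t code;
	uint8_t regs;
	int16_t off;
	int32_t imm;
};

/*
 * Count logical instructions in a buffer of 'len' slots. A 16-byte load
 * occupies two slots, and the second one must not be decoded on its own.
 */
static size_t count_logical_insns(const struct insn_slot *insns, size_t len)
{
	size_t i = 0, count = 0;

	while (i < len) {
		/* Step over both halves of a wide load, one slot otherwise. */
		i += (insns[i].code == LD_IMM64_OPCODE) ? 2 : 1;
		count++;
	}
	return count;
}
```

Without the two-slot step, the second half of the load (opcode 0) would be handed to the disassembler as if it were an instruction of its own.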
Fixes: efcef17a6d65 ("tools: bpftool: generate .dot graph from CFG information")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20230405132120.59886-3-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c679bbd611c08b0559ffae079330bc4e5574696a ]
RFC8259 ("The JavaScript Object Notation (JSON) Data Interchange
Format") only specifies \", \\, \/, \b, \f, \n, \r, and \t as valid
two-character escape sequences. This does not include \', which is not
required in JSON because it exclusively uses double quotes as string
separators.
Solidus (/) may be escaped, but does not have to. Only reverse
solidus (\), double quotes ("), and the control characters have to be
escaped. Therefore, with this fix, bpftool correctly supports all valid
two-character escape sequences (but still does not support characters
that require multi-character escape sequences).
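The escaping rules above can be sketched as a small writer. json_escape() is a hypothetical helper, not the JSON writer bpftool copied from iproute2; like the description above, it handles only two-character escapes and rejects control characters that would need a \uXXXX sequence.

```c
#include <stddef.h>

/*
 * Copy 'src' into 'dst' (capacity 'cap'), escaping with two-character
 * sequences only. Returns the output length, or -1 if 'dst' is too small
 * or 'src' holds a control character that would need a \uXXXX escape.
 */
static int json_escape(const char *src, char *dst, size_t cap)
{
	size_t n = 0;

	if (cap == 0)
		return -1;

	for (; *src; src++) {
		char esc = 0;

		switch (*src) {
		case '"':  esc = '"';  break;
		case '\\': esc = '\\'; break;
		case '\b': esc = 'b';  break;
		case '\f': esc = 'f';  break;
		case '\n': esc = 'n';  break;
		case '\r': esc = 'r';  break;
		case '\t': esc = 't';  break;
		default:
			if ((unsigned char)*src < 0x20)
				return -1;
			break;
		}
		/* Always leave room for the terminating NUL. */
		if (n + (esc ? 2 : 1) >= cap)
			return -1;
		if (esc) {
			dst[n++] = '\\';
			dst[n++] = esc;
		} else {
			dst[n++] = *src;	/* includes ' and /, left as-is */
		}
	}
	dst[n] = '\0';
	return (int)n;
}
```

The single quote and the solidus fall through to the default branch and are copied unchanged, which is the behaviour a strict parser such as Python's json module expects.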
Without this fix, attempting to load a JSON file generated by bpftool
using Python 3.10.6's default json.load() may fail with the error
"Invalid \escape" if the file contains the invalid escaped single
quote (\').
Fixes: b66e907cfee2 ("tools: bpftool: copy JSON writer from iproute2 repository")
Signed-off-by: Luis Gerhorst <gerhorst@cs.fau.de>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20230227150853.16863-1-gerhorst@cs.fau.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 878625e1c7a10dfbb1fdaaaae2c4d2a58fbce627 ]
When the clang toolchain has stack protection enabled in order to be
consistent with gcc - which just happens to be the case on Gentoo -
the bpftool build fails:
[...]
clang \
-I. \
-I/tmp/portage/dev-util/bpftool-6.0.12/work/linux-6.0/tools/include/uapi/ \
-I/tmp/portage/dev-util/bpftool-6.0.12/work/linux-6.0/tools/bpf/bpftool/bootstrap/libbpf/include \
-g -O2 -Wall -target bpf -c skeleton/pid_iter.bpf.c -o pid_iter.bpf.o
clang \
-I. \
-I/tmp/portage/dev-util/bpftool-6.0.12/work/linux-6.0/tools/include/uapi/ \
-I/tmp/portage/dev-util/bpftool-6.0.12/work/linux-6.0/tools/bpf/bpftool/bootstrap/libbpf/include \
-g -O2 -Wall -target bpf -c skeleton/profiler.bpf.c -o profiler.bpf.o
skeleton/profiler.bpf.c:40:14: error: A call to built-in function '__stack_chk_fail' is not supported.
int BPF_PROG(fentry_XXX)
^
skeleton/profiler.bpf.c:94:14: error: A call to built-in function '__stack_chk_fail' is not supported.
int BPF_PROG(fexit_XXX)
^
2 errors generated.
[...]
Since stack-protector makes no sense for the BPF bits, just unconditionally
disable it.
Bug: https://bugs.gentoo.org/890638
Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/74cd9d2e-6052-312a-241e-2b514a75c92c@applied-asynchrony.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 377c16fa3f3c60d21e4b05314c8be034ce37f2eb ]
The number of online CPUs may not equal the number of possible CPUs.
"bpftool prog profile" cannot create a PMU event on a CPU that is
possible but not online.
$ dmidecode -s system-product-name
PowerEdge R620
$ cat /sys/devices/system/cpu/possible
0-47
$ cat /sys/devices/system/cpu/online
0-31
Disable cpu dynamically:
$ echo 0 > /sys/devices/system/cpu/cpuX/online
If one cpu is offline, perf_event_open will return ENODEV.
To fix this issue:
* check value returned and skip offline cpu.
* close pmu_fd immediately on error path, avoid fd leaking.
Fixes: 47c09d6a9f67 ("bpftool: Introduce "prog profile" command")
Signed-off-by: Tonghao Zhang <tong@infragraf.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20230202131701.29519-1-tong@infragraf.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fa55ef14ef4fe06198c0ce811b603aec24134bc2 ]
strdup() allocates memory for path. We need to release the memory in the
following error path. Add free() to avoid memory leak.
Fixes: 8f184732b60b ("bpftool: Switch to libbpf's hashmap for pinned paths of BPF objects")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20221206071906.806384-1-linmq006@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
When using bpftool to pin {PROG, MAP, LINK} without FILE, a
segmentation fault occurs. The reason is that the missing
FILE causes strlen to dereference a NULL pointer.
The corresponding stacktrace is shown below:
do_pin
do_pin_any
do_pin_fd
mount_bpffs_for_pin
strlen(name) <- NULL pointer dereference
Fix it by adding validation to the common process.
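The kind of validation meant here can be sketched as below. pin_path_len() is a hypothetical helper, not the check added to bpftool; argc and argv are assumed to be the arguments left after the object handle has been consumed.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the length of the pin path in argv[0], or -1 when FILE is missing.
 * Hypothetical helper; argc/argv are the arguments left after the handle.
 */
static int pin_path_len(int argc, char **argv)
{
	if (argc < 1 || argv == NULL || argv[0] == NULL)
		return -1;	/* reject before strlen() can see NULL */
	return (int)strlen(argv[0]);
}
```

Doing the check once in the common path covers prog, map and link pinning alike.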
Fixes: 75a1e792c335 ("tools: bpftool: Allow all prog/map handles for pinning objects")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20221102084034.3342995-1-pulehui@huaweicloud.com
Merge in the left-over fixes before the net-next pull-request.
Conflicts:
drivers/net/ethernet/mediatek/mtk_ppe.c
ae3ed15da588 ("net: ethernet: mtk_eth_soc: fix state in __mtk_foe_entry_clear")
9d8cb4c096ab ("net: ethernet: mtk_eth_soc: add foe_entry_size to mtk_eth_soc")
https://lore.kernel.org/all/6cb6893b-4921-a068-4c30-1109795110bb@tessares.net/
kernel/bpf/helpers.c
8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF")
5679ff2f138f ("bpf: Move bpf_loop and bpf_for_each_map_elem under CAP_BPF")
8a67f2de9b1d ("bpf: expose bpf_strtol and bpf_strtoul to all program types")
https://lore.kernel.org/all/20221003201957.13149-1-daniel@iogearbox.net/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
strerror() expects a positive errno, but the variable err is never
positive when an error occurs. As a result, bpftool prints "unknown
error" far too often; even a simple "file does not exist" error does
not get an accurate message.
This patch fixes all "strerror(err)" patterns in bpftool.
Note that in btf.c#L823, hashmap__append() is an internal function of
libbpf and does not change errno, so that call site is handled slightly
differently.
Some libbpf_get_error() calls are kept for return values.
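The sign problem can be illustrated with a tiny wrapper. err_to_str() is a hypothetical helper that simply negates the code before the lookup; the patch itself may take a different route at individual call sites, for instance by reading errno.

```c
#include <errno.h>
#include <string.h>

/*
 * Describe a libbpf-style return code. Such codes are negative errno
 * values, while strerror() wants the positive number.
 */
static const char *err_to_str(int err)
{
	return strerror(err < 0 ? -err : err);
}
```

With this, err_to_str(-ENOENT) yields the same text as strerror(ENOENT) instead of an "unknown error" message.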
Changes since v1: https://lore.kernel.org/bpf/SY4P282MB1084B61CD8671DFA395AA8579D539@SY4P282MB1084.AUSP282.PROD.OUTLOOK.COM/
Check directly for NULL values instead of calling libbpf_get_error().
Signed-off-by: Tianyi Liu <i.pear@outlook.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/SY4P282MB1084AD9CD84A920F08DF83E29D549@SY4P282MB1084.AUSP282.PROD.OUTLOOK.COM
After commit 9b190f185d2f ("tools/bpftool: switch map event_pipe to
libbpf's perf_buffer"), struct event_ring_info is not used any more and
can be removed as well.
Signed-off-by: Yuan Can <yuancan@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220928090440.79637-3-yuancan@huawei.com
After commit 2828d0d75b73 ("bpftool: Switch to libbpf's hashmap for
programs/maps in BTF listing"), struct btf_attach_point is not used
anymore and can be removed as well.
Signed-off-by: Yuan Can <yuancan@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220928090440.79637-2-yuancan@huawei.com
Show the tid or pid of an iterator when it was created with a tid or
pid argument. For example, the command `bpftool link list` may list
the following lines.
1: iter prog 2 target_name bpf_map
2: iter prog 3 target_name bpf_prog
33: iter prog 225 target_name task_file tid 1644
pids test_progs(1644)
Link 33 is a task_file iterator with tid 1644. For now, only the task,
task_file and task_vma targets accept a tid or pid, which filters out
tasks other than those belonging to a process (pid) or a thread (tid).
Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/bpf/20220926184957.208194-6-kuifeng@fb.com
We want to support a ringbuf map type where samples are published from
user-space, to be consumed by BPF programs. BPF currently supports a
kernel -> user-space circular ring buffer via the BPF_MAP_TYPE_RINGBUF
map type. We'll need to define a new map type for user-space -> kernel,
as none of the helpers exported for BPF_MAP_TYPE_RINGBUF will apply
to a user-space producer ring buffer, and we'll want to add one or
more helper functions that would not apply for a kernel-producer
ring buffer.
This patch therefore adds a new BPF_MAP_TYPE_USER_RINGBUF map type
definition. The map type is useless in its current form, as there is no
way to access or use it for anything until we add one or more BPF helpers. A
follow-on patch will therefore add a new helper function that allows BPF
programs to run callbacks on samples that are published to the ring
buffer.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220920000100.477320-2-void@manifault.com
When the root cgroup has multiple progs attached with the multi flag and
a sub-cgroup attaches a prog with the override flag, bpftool displays
incorrect attach flags for the sub-cgroup's effective progs:
$ bpftool cgroup tree /sys/fs/cgroup effective
CgroupPath
ID       AttachType      AttachFlags     Name
/sys/fs/cgroup
    6        cgroup_sysctl   multi           sysctl_tcp_mem
    13       cgroup_sysctl   multi           sysctl_tcp_mem
/sys/fs/cgroup/cg1
    20       cgroup_sysctl   override        sysctl_tcp_mem
    6        cgroup_sysctl   override        sysctl_tcp_mem <- wrong
    13       cgroup_sysctl   override        sysctl_tcp_mem <- wrong
/sys/fs/cgroup/cg1/cg2
    20       cgroup_sysctl                   sysctl_tcp_mem
    6        cgroup_sysctl                   sysctl_tcp_mem
    13       cgroup_sysctl                   sysctl_tcp_mem
Attach flags are only valid for progs attached at this cgroup level,
not for effective progs. When querying with the EFFECTIVE flag,
exporting attach flags does not make sense. So let's remove the
AttachFlags field and the associated logic. After this patch, the
above effective cgroup tree will show as below:
$ bpftool cgroup tree /sys/fs/cgroup effective
CgroupPath
ID       AttachType      Name
/sys/fs/cgroup
    6        cgroup_sysctl   sysctl_tcp_mem
    13       cgroup_sysctl   sysctl_tcp_mem
/sys/fs/cgroup/cg1
    20       cgroup_sysctl   sysctl_tcp_mem
    6        cgroup_sysctl   sysctl_tcp_mem
    13       cgroup_sysctl   sysctl_tcp_mem
/sys/fs/cgroup/cg1/cg2
    20       cgroup_sysctl   sysctl_tcp_mem
    6        cgroup_sysctl   sysctl_tcp_mem
    13       cgroup_sysctl   sysctl_tcp_mem
Fixes: b79c9fc9551b ("bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP")
Fixes: a98bf57391a2 ("tools: bpftool: add support for reporting the effective cgroup progs")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/r/20220921104604.2340580-3-pulehui@huaweicloud.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-09-05
The following pull-request contains BPF updates for your *net-next* tree.
We've added 106 non-merge commits during the last 18 day(s) which contain
a total of 159 files changed, 5225 insertions(+), 1358 deletions(-).
There are two small merge conflicts, resolve them as follows:
1) tools/testing/selftests/bpf/DENYLIST.s390x
Commit 27e23836ce22 ("selftests/bpf: Add lru_bug to s390x deny list") in
bpf tree was needed to get BPF CI green on s390x, but it conflicted with
newly added tests on bpf-next. Resolve by adding both hunks, result:
[...]
lru_bug # prog 'printk': failed to auto-attach: -524
setget_sockopt # attach unexpected error: -524 (trampoline)
cb_refs # expected error message unexpected error: -524 (trampoline)
cgroup_hierarchical_stats # JIT does not support calling kernel function (kfunc)
htab_update # failed to attach: ERROR: strerror_r(-524)=22 (trampoline)
[...]
2) net/core/filter.c
Commit 1227c1771dd2 ("net: Fix data-races around sysctl_[rw]mem_(max|default).")
from net tree conflicts with commit 29003875bd5b ("bpf: Change bpf_setsockopt(SOL_SOCKET)
to reuse sk_setsockopt()") from bpf-next tree. Take the code as it is from
bpf-next tree, result:
[...]
if (getopt) {
if (optname == SO_BINDTODEVICE)
return -EINVAL;
return sk_getsockopt(sk, SOL_SOCKET, optname,
KERNEL_SOCKPTR(optval),
KERNEL_SOCKPTR(optlen));
}
return sk_setsockopt(sk, SOL_SOCKET, optname,
KERNEL_SOCKPTR(optval), *optlen);
[...]
The main changes are:
1) Add any-context BPF specific memory allocator which is useful in particular for BPF
tracing with bonus of performance equal to full prealloc, from Alexei Starovoitov.
2) Big batch to remove duplicated code from bpf_{get,set}sockopt() helpers as an effort
to reuse the existing core socket code as much as possible, from Martin KaFai Lau.
3) Extend BPF flow dissector for BPF programs to just augment the in-kernel dissector
with custom logic. In other words, allow for partial replacement, from Shmulik Ladkani.
4) Add a new cgroup iterator to BPF with different traversal options, from Hao Luo.
5) Support for BPF to collect hierarchical cgroup statistics efficiently through BPF
integration with the rstat framework, from Yosry Ahmed.
6) Support bpf_{g,s}et_retval() under more BPF cgroup hooks, from Stanislav Fomichev.
7) BPF hash table and local storages fixes under fully preemptible kernel, from Hou Tao.
8) Add various improvements to BPF selftests and libbpf for compilation with gcc BPF
backend, from James Hilliard.
9) Fix verifier helper permissions and reference state management for synchronous
callbacks, from Kumar Kartikeya Dwivedi.
10) Add support for BPF selftest's xskxceiver to also be used against real devices that
support MAC loopback, from Maciej Fijalkowski.
11) Various fixes to the bpf-helpers(7) man page generation script, from Quentin Monnet.
12) Document BPF verifier's tnum_in(tnum_range(), ...) gotchas, from Shung-Hsi Yu.
13) Various minor misc improvements all over the place.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (106 commits)
bpf: Optimize rcu_barrier usage between hash map and bpf_mem_alloc.
bpf: Remove usage of kmem_cache from bpf_mem_cache.
bpf: Remove prealloc-only restriction for sleepable bpf programs.
bpf: Prepare bpf_mem_alloc to be used by sleepable bpf programs.
bpf: Remove tracing program restriction on map types
bpf: Convert percpu hash map to per-cpu bpf_mem_alloc.
bpf: Add percpu allocation support to bpf_mem_alloc.
bpf: Batch call_rcu callbacks instead of SLAB_TYPESAFE_BY_RCU.
bpf: Adjust low/high watermarks in bpf_mem_cache
bpf: Optimize call_rcu in non-preallocated hash map.
bpf: Optimize element count in non-preallocated hash map.
bpf: Relax the requirement to use preallocated hash maps in tracing progs.
samples/bpf: Reduce syscall overhead in map_perf_test.
selftests/bpf: Improve test coverage of test_maps
bpf: Convert hash map to bpf_mem_alloc.
bpf: Introduce any context BPF specific memory allocator.
selftest/bpf: Add test for bpf_getsockopt()
bpf: Change bpf_getsockopt(SOL_IPV6) to reuse do_ipv6_getsockopt()
bpf: Change bpf_getsockopt(SOL_IP) to reuse do_ip_getsockopt()
bpf: Change bpf_getsockopt(SOL_TCP) to reuse do_tcp_getsockopt()
...
====================
Link: https://lore.kernel.org/r/20220905161136.9150-1-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Support dumping info of a cgroup_iter link. This includes
showing the cgroup's id and the order for walking the cgroup
hierarchy. Example output is as follows:
> bpftool link show
1: iter prog 2 target_name bpf_map
2: iter prog 3 target_name bpf_prog
3: iter prog 12 target_name cgroup cgroup_id 72 order self_only
> bpftool -p link show
[{
"id": 1,
"type": "iter",
"prog_id": 2,
"target_name": "bpf_map"
},{
"id": 2,
"type": "iter",
"prog_id": 3,
"target_name": "bpf_prog"
},{
"id": 3,
"type": "iter",
"prog_id": 12,
"target_name": "cgroup",
"cgroup_id": 72,
"order": "self_only"
}
]
Signed-off-by: Hao Luo <haoluo@google.com>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220829231828.1016835-1-haoluo@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@linux.dev>
When `data` points to a boolean value, casting it to `int *` is problematic
and could lead to a wrong value being passed to `jsonw_bool`. Change the
cast to `bool *` instead.
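A minimal sketch of the difference, assuming data points at a single bool inside a larger object: read_bool() is a hypothetical helper, not the btf_dumper code itself. Reading through `int *` would also pull in the neighbouring bytes, whereas the `bool *` read touches only the flag.

```c
#include <stdbool.h>

/* Read a boolean through an untyped pointer, touching exactly one bool. */
static bool read_bool(const void *data)
{
	return *(const bool *)data;
}
```

If the bytes after the flag are non-zero, an int-sized read could report true for a flag that is actually false; the bool-sized read is unaffected by them.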
Fixes: b12d6ec09730 ("bpf: btf: add btf print functionality")
Signed-off-by: Lam Thai <lamthai@arista.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20220824225859.9038-1-lamthai@arista.com
Andrii Nakryiko says:
====================
bpf-next 2022-08-17
We've added 45 non-merge commits during the last 14 day(s) which contain
a total of 61 files changed, 986 insertions(+), 372 deletions(-).
The main changes are:
1) New bpf_ktime_get_tai_ns() BPF helper to access CLOCK_TAI, from Kurt
Kanzenbach and Jesper Dangaard Brouer.
2) Few clean ups and improvements for libbpf 1.0, from Andrii Nakryiko.
3) Expose crash_kexec() as kfunc for BPF programs, from Artem Savkov.
4) Add ability to define sleepable-only kfuncs, from Benjamin Tissoires.
5) Teach libbpf's bpf_prog_load() and bpf_map_create() to gracefully handle
unsupported names on old kernels, from Hangbin Liu.
6) Allow opting out from auto-attaching BPF programs by libbpf's BPF skeleton,
from Hao Luo.
7) Relax libbpf's requirement for shared libs to be marked executable, from
Hengqi Chen.
8) Improve bpf_iter internals handling of error returns, from Hao Luo.
9) Few accommodations in libbpf to support GCC-BPF quirks, from James Hilliard.
10) Fix BPF verifier logic around tracking dynptr ref_obj_id, from Joanne Koong.
11) bpftool improvements to handle full BPF program names better, from Manu
Bretelle.
12) bpftool fixes around libcap use, from Quentin Monnet.
13) BPF map internals clean ups and improvements around memory allocations,
from Yafang Shao.
14) Allow to use cgroup_get_from_file() on cgroupv1, allowing BPF cgroup
iterator to work on cgroupv1, from Yosry Ahmed.
15) BPF verifier internal clean ups, from Dave Marchevsky and Joanne Koong.
16) Various fixes and clean ups for selftests/bpf and vmtest.sh, from Daniel
Xu, Artem Savkov, Joanne Koong, Andrii Nakryiko, Shibin Koikkara Reeny.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (45 commits)
selftests/bpf: Few fixes for selftests/bpf built in release mode
libbpf: Clean up deprecated and legacy aliases
libbpf: Streamline bpf_attr and perf_event_attr initialization
libbpf: Fix potential NULL dereference when parsing ELF
selftests/bpf: Tests libbpf autoattach APIs
libbpf: Allows disabling auto attach
selftests/bpf: Fix attach point for non-x86 arches in test_progs/lsm
libbpf: Making bpf_prog_load() ignore name if kernel doesn't support
selftests/bpf: Update CI kconfig
selftests/bpf: Add connmark read test
selftests/bpf: Add existing connection bpf_*_ct_lookup() test
bpftool: Clear errno after libcap's checks
bpf: Clear up confusion in bpf_skb_adjust_room()'s documentation
bpftool: Fix a typo in a comment
libbpf: Add names for auxiliary maps
bpf: Use bpf_map_area_alloc consistently on bpf map creation
bpf: Make __GFP_NOWARN consistent in bpf map creation
bpf: Use bpf_map_area_free instread of kvfree
bpf: Remove unneeded memset in queue_stack_map creation
libbpf: preserve errno across pr_warn/pr_info/pr_debug
...
====================
Link: https://lore.kernel.org/r/20220817215656.1180215-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When bpftool is linked against libcap, the library runs a "constructor"
function to compute the number of capabilities of the running kernel
[0], at the beginning of the execution of the program. As part of this,
it performs multiple calls to prctl(). Some of these may fail, and set
errno to a non-zero value:
# strace -e prctl ./bpftool version
prctl(PR_CAPBSET_READ, CAP_MAC_OVERRIDE) = 1
prctl(PR_CAPBSET_READ, 0x30 /* CAP_??? */) = -1 EINVAL (Invalid argument)
prctl(PR_CAPBSET_READ, CAP_CHECKPOINT_RESTORE) = 1
prctl(PR_CAPBSET_READ, 0x2c /* CAP_??? */) = -1 EINVAL (Invalid argument)
prctl(PR_CAPBSET_READ, 0x2a /* CAP_??? */) = -1 EINVAL (Invalid argument)
prctl(PR_CAPBSET_READ, 0x29 /* CAP_??? */) = -1 EINVAL (Invalid argument)
** fprintf added at the top of main(): we have errno == 1
./bpftool v7.0.0
using libbpf v1.0
features: libbfd, libbpf_strict, skeletons
+++ exited with 0 +++
This has been addressed in libcap 2.63 [1], but until this version is
available everywhere, we can fix it on bpftool side.
Let's clean errno at the beginning of the main() function, to make sure
that these checks do not interfere with the batch mode, where we error
out if errno is set after a bpftool command.
[0] https://git.kernel.org/pub/scm/libs/libcap/libcap.git/tree/libcap/cap_alloc.c?h=libcap-2.65#n20
[1] https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=f25a1b7e69f7b33e6afb58b3e38f3450b7d2d9a0
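The effect of a stale errno on a batch-style check can be illustrated with a minimal sketch. All function names below are hypothetical stand-ins and are not taken from bpftool or libcap; the sketch only assumes, as described above, that a constructor may leave errno set and that a later check treats a non-zero errno as a failure.

```c
#include <errno.h>

/* Hypothetical stand-in for a library constructor that leaves errno set,
 * e.g. after a failed prctl(PR_CAPBSET_READ, ...) probe. */
static void noisy_constructor(void)
{
	errno = EINVAL;
}

/* Hypothetical command that succeeds without touching errno. */
static int run_command(void)
{
	return 0;
}

/*
 * Mimics a batch-mode step that reports an error if errno is non-zero
 * after the command ran. When clear_first is set, the stale errno is
 * reset before the command, which is the idea behind the fix.
 */
static int batch_step(int clear_first)
{
	noisy_constructor();
	if (clear_first)
		errno = 0;
	(void)run_command();
	return errno ? -1 : 0;
}
```

Under these assumptions, `batch_step(0)` reports a spurious failure while `batch_step(1)` succeeds, even though the command itself is identical in both cases.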
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220815162205.45043-1-quentin@isovalent.com
Commit 6e8ccb4f624a7 ("tools/bpf: properly account for libbfd variations")
sets the linking flags depending on which flavor of the libbfd feature was
detected.
However, the flavors other than plain libbfd cannot be detected, as they are
not in the feature list.
Complete the list of features to detect by adding libbfd-liberty and
libbfd-liberty-z.
Committer notes:
Adjust conflict with:
1e1613f64cc8a09d ("tools bpftool: Don't display disassembler-four-args feature test")
600b7b26c07a070d ("tools bpftool: Fix compilation error with new binutils")
Fixes: 6e8ccb4f624a73c5 ("tools/bpf: properly account for libbfd variations")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andres Freund <andres@anarazel.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: llvm@lists.linux.dev
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220719170555.2576993-2-roberto.sassu@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
- Introduce 'perf lock contention' subtool, using new lock contention
tracepoints and using BPF for in kernel aggregation and then userspace
processing using the perf tooling infrastructure for resolving symbols, target
specification, etc.
Since the new lock contention tracepoints don't provide lock names, get up to
8 stack traces and display the first non-lock function symbol name as a caller:
$ perf lock report -F acquired,contended,avg_wait,wait_total
Name                 acquired  contended   avg wait  total wait
update_blocked_a...        40         40    3.61 us   144.45 us
kernfs_fop_open+...         5          5    3.64 us    18.18 us
_nohz_idle_balance          3          3    2.65 us     7.95 us
tick_do_update_j...         1          1    6.04 us     6.04 us
ep_scan_ready_list          1          1    3.93 us     3.93 us
Supports the usual 'perf record' + 'perf report' workflow as well as a
BCC/bpftrace like mode where you start the tool and then press control+C to get
results:
$ sudo perf lock contention -b
^C
contended  total wait   max wait   avg wait  type      caller
       42   192.67 us   13.64 us    4.59 us  spinlock  queue_work_on+0x20
       23    85.54 us   10.28 us    3.72 us  spinlock  worker_thread+0x14a
        6    13.92 us    6.51 us    2.32 us  mutex     kernfs_iop_permission+0x30
        3    11.59 us   10.04 us    3.86 us  mutex     kernfs_dop_revalidate+0x3c
        1     7.52 us    7.52 us    7.52 us  spinlock  kthread+0x115
        1     7.24 us    7.24 us    7.24 us  rwlock:W  sys_epoll_wait+0x148
        2     7.08 us    3.99 us    3.54 us  spinlock  delayed_work_timer_fn+0x1b
        1     6.41 us    6.41 us    6.41 us  spinlock  idle_balance+0xa06
        2     2.50 us    1.83 us    1.25 us  mutex     kernfs_iop_lookup+0x2f
        1     1.71 us    1.71 us    1.71 us  mutex     kernfs_iop_getattr+0x2c
...
- Add new 'perf kwork' tool to trace time properties of kernel work (such as
softirq, and workqueue), uses eBPF skeletons to collect info in kernel space,
aggregating data that then gets processed by the userspace tool, e.g.:
# perf kwork report
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
----------------------------------------------------------------------------------------------------
nvme0q5:130 | 004 | 1.101 ms | 49 | 0.051 ms | 26035.056403 s | 26035.056455 s |
amdgpu:162 | 002 | 0.176 ms | 9 | 0.046 ms | 26035.268020 s | 26035.268066 s |
nvme0q24:149 | 023 | 0.161 ms | 55 | 0.009 ms | 26035.655280 s | 26035.655288 s |
nvme0q20:145 | 019 | 0.090 ms | 33 | 0.014 ms | 26035.939018 s | 26035.939032 s |
nvme0q31:156 | 030 | 0.075 ms | 21 | 0.010 ms | 26035.052237 s | 26035.052247 s |
nvme0q8:133 | 007 | 0.062 ms | 12 | 0.021 ms | 26035.416840 s | 26035.416861 s |
nvme0q6:131 | 005 | 0.054 ms | 22 | 0.010 ms | 26035.199919 s | 26035.199929 s |
nvme0q19:144 | 018 | 0.052 ms | 14 | 0.010 ms | 26035.110615 s | 26035.110625 s |
nvme0q7:132 | 006 | 0.049 ms | 13 | 0.007 ms | 26035.125180 s | 26035.125187 s |
nvme0q18:143 | 017 | 0.033 ms | 14 | 0.007 ms | 26035.169698 s | 26035.169705 s |
nvme0q17:142 | 016 | 0.013 ms | 1 | 0.013 ms | 26035.565147 s | 26035.565160 s |
enp5s0-rx-0:164 | 006 | 0.004 ms | 4 | 0.002 ms | 26035.928882 s | 26035.928884 s |
enp5s0-tx-0:166 | 008 | 0.003 ms | 3 | 0.002 ms | 26035.870923 s | 26035.870925 s |
--------------------------------------------------------------------------------------------------------
See commit log messages for more examples with extra options to limit the events time window, etc.
- Add support for new AMD IBS (Instruction Based Sampling) features:
With the DataSrc extensions, the source of data can be decoded among:
- Local L3 or other L1/L2 in CCX.
- A peer cache in a near CCX.
- Data returned from DRAM.
- A peer cache in a far CCX.
- DRAM address map with "long latency" bit set.
- Data returned from MMIO/Config/PCI/APIC.
- Extension Memory (S-Link, GenZ, etc - identified by the CS target
and/or address map at DF's choice).
- Peer Agent Memory.
- Support hardware tracing with Intel PT on guest machines, combining the
traces with the ones in the host machine.
- Add a "-m" option to 'perf buildid-list' to show kernel and modules
build-ids, to display all of the information needed to do external
symbolization of kernel stack traces, such as those collected by
bpf_get_stackid().
- Add arch TSC frequency information to perf.data file headers.
- Handle changes in the binutils disassembler function signatures in
perf, bpftool and bpf_jit_disasm (Acked by the bpftool maintainer).
- Fix building the perf perl binding with the newest gcc in distros such
as fedora rawhide, where some new warnings were breaking the build as
perf uses -Werror.
- Add 'perf test' entry for branch stack sampling.
- Add ARM SPE system wide 'perf test' entry.
- Add user space counter reading tests to 'perf test'.
- Build with python3 by default, if available.
- Add python converter script for the vendor JSON event files.
- Update vendor JSON files for alderlake, bonnell, broadwell, broadwellde,
broadwellx, cascadelakex, elkhartlake, goldmont, goldmontplus, haswell,
haswellx, icelake, icelakex, ivybridge, ivytown, jaketown, knightslanding,
nehalemep, nehalemex, sandybridge, sapphirerapids, silvermont, skylake,
skylakex, snowridgex, tigerlake, westmereep-dp, westmereep-sp and westmereex.
- Add vendor JSON File for Intel meteorlake.
- Add Arm Cortex-A78C and X1C JSON vendor event files.
- Add workaround to symbol address reading from ELF files without phdr,
falling back to the previous equation.
- Convert legacy map definition to BTF-defined in the perf BPF script test.
- Rework prologue generation code to stop using libbpf deprecated APIs.
- Add default hybrid events for 'perf stat' on x86.
- Add topdown metrics in the default 'perf stat' on the hybrid machines (big/little cores).
- Prefer sampled CPU when exporting JSON in 'perf data convert'
- Fix ('perf stat CSV output linter') and ("Check branch stack sampling") 'perf test' entries on s390.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'perf-tools-for-v6.0-2022-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo:
- Introduce 'perf lock contention' subtool, using new lock contention
tracepoints and using BPF for in kernel aggregation and then
userspace processing using the perf tooling infrastructure for
resolving symbols, target specification, etc.
Since the new lock contention tracepoints don't provide lock names,
get up to 8 stack traces and display the first non-lock function
symbol name as a caller:
$ perf lock report -F acquired,contended,avg_wait,wait_total
Name                 acquired  contended   avg wait  total wait
update_blocked_a...        40         40    3.61 us   144.45 us
kernfs_fop_open+...         5          5    3.64 us    18.18 us
_nohz_idle_balance          3          3    2.65 us     7.95 us
tick_do_update_j...         1          1    6.04 us     6.04 us
ep_scan_ready_list          1          1    3.93 us     3.93 us
Supports the usual 'perf record' + 'perf report' workflow as well as
a BCC/bpftrace like mode where you start the tool and then press
control+C to get results:
$ sudo perf lock contention -b
^C
contended  total wait   max wait   avg wait  type      caller
       42   192.67 us   13.64 us    4.59 us  spinlock  queue_work_on+0x20
       23    85.54 us   10.28 us    3.72 us  spinlock  worker_thread+0x14a
        6    13.92 us    6.51 us    2.32 us  mutex     kernfs_iop_permission+0x30
        3    11.59 us   10.04 us    3.86 us  mutex     kernfs_dop_revalidate+0x3c
        1     7.52 us    7.52 us    7.52 us  spinlock  kthread+0x115
        1     7.24 us    7.24 us    7.24 us  rwlock:W  sys_epoll_wait+0x148
        2     7.08 us    3.99 us    3.54 us  spinlock  delayed_work_timer_fn+0x1b
        1     6.41 us    6.41 us    6.41 us  spinlock  idle_balance+0xa06
        2     2.50 us    1.83 us    1.25 us  mutex     kernfs_iop_lookup+0x2f
        1     1.71 us    1.71 us    1.71 us  mutex     kernfs_iop_getattr+0x2c
...
- Add new 'perf kwork' tool to trace time properties of kernel work
(such as softirq, and workqueue), uses eBPF skeletons to collect info
in kernel space, aggregating data that then gets processed by the
userspace tool, e.g.:
# perf kwork report
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
----------------------------------------------------------------------------------------------------
nvme0q5:130 | 004 | 1.101 ms | 49 | 0.051 ms | 26035.056403 s | 26035.056455 s |
amdgpu:162 | 002 | 0.176 ms | 9 | 0.046 ms | 26035.268020 s | 26035.268066 s |
nvme0q24:149 | 023 | 0.161 ms | 55 | 0.009 ms | 26035.655280 s | 26035.655288 s |
nvme0q20:145 | 019 | 0.090 ms | 33 | 0.014 ms | 26035.939018 s | 26035.939032 s |
nvme0q31:156 | 030 | 0.075 ms | 21 | 0.010 ms | 26035.052237 s | 26035.052247 s |
nvme0q8:133 | 007 | 0.062 ms | 12 | 0.021 ms | 26035.416840 s | 26035.416861 s |
nvme0q6:131 | 005 | 0.054 ms | 22 | 0.010 ms | 26035.199919 s | 26035.199929 s |
nvme0q19:144 | 018 | 0.052 ms | 14 | 0.010 ms | 26035.110615 s | 26035.110625 s |
nvme0q7:132 | 006 | 0.049 ms | 13 | 0.007 ms | 26035.125180 s | 26035.125187 s |
nvme0q18:143 | 017 | 0.033 ms | 14 | 0.007 ms | 26035.169698 s | 26035.169705 s |
nvme0q17:142 | 016 | 0.013 ms | 1 | 0.013 ms | 26035.565147 s | 26035.565160 s |
enp5s0-rx-0:164 | 006 | 0.004 ms | 4 | 0.002 ms | 26035.928882 s | 26035.928884 s |
enp5s0-tx-0:166 | 008 | 0.003 ms | 3 | 0.002 ms | 26035.870923 s | 26035.870925 s |
--------------------------------------------------------------------------------------------------------
See commit log messages for more examples with extra options to limit
the events time window, etc.
- Add support for new AMD IBS (Instruction Based Sampling) features:
With the DataSrc extensions, the source of data can be decoded among:
- Local L3 or other L1/L2 in CCX.
- A peer cache in a near CCX.
- Data returned from DRAM.
- A peer cache in a far CCX.
- DRAM address map with "long latency" bit set.
- Data returned from MMIO/Config/PCI/APIC.
- Extension Memory (S-Link, GenZ, etc - identified by the CS target
and/or address map at DF's choice).
- Peer Agent Memory.
- Support hardware tracing with Intel PT on guest machines, combining
the traces with the ones in the host machine.
- Add a "-m" option to 'perf buildid-list' to show kernel and modules
build-ids, to display all of the information needed to do external
symbolization of kernel stack traces, such as those collected by
bpf_get_stackid().
- Add arch TSC frequency information to perf.data file headers.
- Handle changes in the binutils disassembler function signatures in
perf, bpftool and bpf_jit_disasm (Acked by the bpftool maintainer).
- Fix building the perf perl binding with the newest gcc in distros
such as fedora rawhide, where some new warnings were breaking the
build as perf uses -Werror.
- Add 'perf test' entry for branch stack sampling.
- Add ARM SPE system wide 'perf test' entry.
- Add user space counter reading tests to 'perf test'.
- Build with python3 by default, if available.
- Add python converter script for the vendor JSON event files.
- Update vendor JSON files for most Intel cores.
- Add vendor JSON File for Intel meteorlake.
- Add Arm Cortex-A78C and X1C JSON vendor event files.
- Add workaround to symbol address reading from ELF files without phdr,
falling back to the previous equation.
- Convert legacy map definition to BTF-defined in the perf BPF script
test.
- Rework prologue generation code to stop using libbpf deprecated APIs.
- Add default hybrid events for 'perf stat' on x86.
- Add topdown metrics in the default 'perf stat' on the hybrid machines
(big/little cores).
- Prefer sampled CPU when exporting JSON in 'perf data convert'
- Fix ('perf stat CSV output linter') and ("Check branch stack
sampling") 'perf test' entries on s390.
* tag 'perf-tools-for-v6.0-2022-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (169 commits)
perf stat: Refactor __run_perf_stat() common code
perf lock: Print the number of lost entries for BPF
perf lock: Add --map-nr-entries option
perf lock: Introduce struct lock_contention
perf scripting python: Do not build fail on deprecation warnings
genelf: Use HAVE_LIBCRYPTO_SUPPORT, not the never defined HAVE_LIBCRYPTO
perf build: Suppress openssl v3 deprecation warnings in libcrypto feature test
perf parse-events: Break out tracepoint and printing
perf parse-events: Don't #define YY_EXTRA_TYPE
tools bpftool: Don't display disassembler-four-args feature test
tools bpftool: Fix compilation error with new binutils
tools bpf_jit_disasm: Don't display disassembler-four-args feature test
tools bpf_jit_disasm: Fix compilation error with new binutils
tools perf: Fix compilation error with new binutils
tools include: add dis-asm-compat.h to handle version differences
tools build: Don't display disassembler-four-args feature test
tools build: Add feature test for init_disassemble_info API changes
perf test: Add ARM SPE system wide test
perf tools: Rework prologue generation code
perf bpf: Convert legacy map definition to BTF-defined
...
bpftool was limiting the length of names to BPF_OBJ_NAME_LEN in prog_parse
fds.
Since commit b662000aff84 ("bpftool: Adding support for BTF program names")
we can get the full program name from BTF.
This patch removes the restriction of name length when running `bpftool
prog show name ${name}`.
Test:
Tested against some internal program names that were longer than
`BPF_OBJ_NAME_LEN`; here is a redacted example of what was run to test.
# previous behaviour
$ sudo bpftool prog show name some_long_program_name
Error: can't parse name
# with the patch
$ sudo ./bpftool prog show name some_long_program_name
123456789: tracing name some_long_program_name tag taghexa gpl ....
...
...
...
# too long
$ sudo ./bpftool prog show name $(python3 -c 'print("A"*128)')
Error: can't parse name
# not too long but no match
$ sudo ./bpftool prog show name $(python3 -c 'print("A"*127)')
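The length behaviour seen in the test above (127 characters accepted, 128 rejected) could be expressed as a simple bound check. This is only a sketch: the constant name and the helper are hypothetical, and the limit of 128 bytes including the terminating NUL is inferred from the test output rather than from the bpftool source.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed buffer size, inferred from the test: 127 fits, 128 does not. */
#define SKETCH_MAX_NAME_LEN 128

/* Hypothetical helper: true if name, plus its terminating NUL, fits in
 * SKETCH_MAX_NAME_LEN bytes. */
static bool name_is_parsable(const char *name)
{
	return strlen(name) < SKETCH_MAX_NAME_LEN;
}
```

A name that passes this check may still match no program, which corresponds to the "not too long but no match" case.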
Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220801132409.4147849-1-chantr4@gmail.com
binutils changed the signature of init_disassemble_info(), which now causes
compilation to fail for tools/bpf/bpftool/jit_disasm.c, e.g. on debian
unstable.
Relevant binutils commit:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=60a3da00bd5407f07
Wire up the feature test and switch to init_disassemble_info_compat(),
which were introduced in prior commits, fixing the compilation failure.
I verified that bpftool can still disassemble bpf programs, both with an
old and new dis-asm.h API. There are no output changes for plain and json
formats. When comparing the output from old binutils (2.35)
to new binutils with the patch (upstream snapshot) there are a few output
differences, but they are unrelated to this patch. An example hunk is:
2f: pop %r14
31: pop %r13
33: pop %rbx
- 34: leaveq
- 35: retq
+ 34: leave
+ 35: ret
Signed-off-by: Andres Freund <andres@anarazel.de>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Ben Hutchings <benh@debian.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220622181918.ykrs5rsnmx3og4sv@alap3.anarazel.de
Link: https://lore.kernel.org/r/20220801013834.156015-8-andres@anarazel.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
binutils changed the signature of init_disassemble_info(), which now causes
compilation to fail for tools/bpf/bpf_jit_disasm.c, e.g. on debian
unstable.
Relevant binutils commit:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=60a3da00bd5407f07
Wire up the feature test and switch to init_disassemble_info_compat(),
which were introduced in prior commits, fixing the compilation failure.
I verified that bpf_jit_disasm can still disassemble bpf programs, both
with the old and new dis-asm.h API. With old binutils there's no change in
output before/after this patch. When comparing the output from old
binutils (2.35) to new binutils with the patch (upstream snapshot) there
are a few output differences, but they are unrelated to this patch. An
example hunk is:
f4: mov %r14,%rsi
f7: mov %r15,%rdx
fa: mov $0x2a,%ecx
- ff: callq 0xffffffffea8c4988
+ ff: call 0xffffffffea8c4988
104: test %rax,%rax
107: jge 0x0000000000000110
109: xor %eax,%eax
- 10b: jmpq 0x0000000000000073
+ 10b: jmp 0x0000000000000073
110: cmp $0x16,%rax
However, I had to use an older kernel to generate the bpf_jit_enabled =
2 output, as that has been broken since 5.18 / 1022a5498f6f745c ("bpf,
x86_64: Use bpf_jit_binary_pack_alloc").
https://lore.kernel.org/20220703030210.pmjft7qc2eajzi6c@alap3.anarazel.de
Signed-off-by: Andres Freund <andres@anarazel.de>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Ben Hutchings <benh@debian.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220622181918.ykrs5rsnmx3og4sv@alap3.anarazel.de
Link: https://lore.kernel.org/r/20220801013834.156015-6-andres@anarazel.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A skeleton generated by bpftool previously contained a return followed
by an expression in OBJ_NAME__detach(), which has return type void. This
did no harm, since the bpf_object__detach_skeleton() called there returns
void itself, but it led to a warning when compiling with e.g.
-pedantic.
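As an illustration of the pattern, here is a minimal sketch with hypothetical stand-in names (the generated skeleton uses names derived from the object, and calls bpf_object__detach_skeleton()). ISO C does not allow a return statement with an expression in a function returning void, which is why -pedantic complains about the old form.

```c
/* Hypothetical stand-in for the skeleton state. */
struct sketch_skeleton {
	int attached;
};

/* Hypothetical stand-in for the void-returning detach routine. */
static void sketch_detach_skeleton(struct sketch_skeleton *s)
{
	s->attached = 0;
}

/*
 * Old form (warns with -pedantic):
 *     return sketch_detach_skeleton(s);
 * New form: call the void function as a plain statement.
 */
static void sketch_obj__detach(struct sketch_skeleton *s)
{
	sketch_detach_skeleton(s);
}
```

Both forms behave identically at run time; only the second is valid ISO C.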
Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220726133203.514087-1-jthinz@mailbox.tu-berlin.de
Use the ARRAY_SIZE macro and make the code more compact.
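For readers unfamiliar with the idiom, a minimal sketch follows. The macro definition is the common textbook form (the kernel tools tree ships its own variant), and the table and lookup function are hypothetical examples, not the code touched by this patch.

```c
#include <stddef.h>
#include <string.h>

/* Common definition: number of elements in a true array (not a pointer). */
#define SKETCH_ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical table used only for illustration. */
static const char * const sketch_names[] = { "hash", "array", "prog_array" };

/* Returns the index of name in sketch_names, or -1 if it is absent. */
static int sketch_find(const char *name)
{
	for (size_t i = 0; i < SKETCH_ARRAY_SIZE(sketch_names); i++)
		if (strcmp(sketch_names[i], name) == 0)
			return (int)i;
	return -1;
}
```

The loop bound follows the table automatically when entries are added, which is what makes the code more compact than a hand-written sizeof division at each use.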
Signed-off-by: Rongguang Wei <weirongguang@kylinos.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220726093045.3374026-1-clementwei90@163.com
A flag is a 4-byte symbol that may follow a BTF ID in a set8. This is
used in the kernel to tag kfuncs in BTF sets with certain flags. Add
support to adjust the sorting code so that it passes size as 8 bytes
for 8-byte BTF sets.
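The idea of sorting with an 8-byte element size can be sketched with qsort(). The struct layout below (a 4-byte ID followed by 4-byte flags) is assumed from the description above, and the type and function names are hypothetical rather than those used in the actual tool.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout of one set8 entry: a BTF ID followed by its flags. */
struct sketch_id_flags {
	uint32_t id;
	uint32_t flags;
};

/* Compares only the leading 4-byte ID of each element. */
static int sketch_cmp_id(const void *pa, const void *pb)
{
	const uint32_t *a = pa, *b = pb;

	return (*a > *b) - (*a < *b);
}

/*
 * Sorts cnt entries by ID. entry_size is 4 for plain ID sets and 8 for
 * sets whose IDs are each followed by flags, so that the flags move
 * together with their ID.
 */
static void sketch_sort_ids(void *base, size_t cnt, size_t entry_size)
{
	qsort(base, cnt, entry_size, sketch_cmp_id);
}
```

Passing the wrong element size for an 8-byte set would treat flags as IDs and separate them from their owners, which is why the size has to follow the set type.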
Cc: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220721134245.2450-3-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
tools/runqslower uses bpftool for vmlinux.h, skeleton, and static linking
only. So we can use the lightweight bootstrap version of bpftool to handle
these, and it will be faster.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220714024612.944071-3-pulehui@huawei.com
The feature test to detect the availability of zlib in bpftool's
Makefile does not bring much. The library is not optional: it may or may
not be required along libbfd for disassembling instructions, but in any
case it is necessary to build feature.o or even libbpf, on which bpftool
depends.
If we remove the feature test, we lose the nicely formatted error
message, but we get a compiler error about "zlib.h: No such file or
directory", which is equally informative. Let's get rid of the test.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220705200456.285943-1-quentin@isovalent.com
bpftool needs to know about the newly introduced BPF_CORE_TYPE_MATCHES
relocation for its 'gen min_core_btf' command to work properly in the
presence of this relocation.
Specifically, we need to make sure to mark types and fields so that they
are present in the minimized BTF for "type match" checks to work out.
However, contrary to the existing btfgen_record_field_relo, we need to
rely on the BTF -- and not the spec -- to find fields. With this change
we handle this new variant correctly. The functionality will be tested
with follow on changes to BPF selftests, which already run against a
minimized BTF created with bpftool.
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220628160127.607834-3-deso@posteo.net
To make it more explicit that the features listed with "bpftool feature
list" are known to bpftool, but not necessarily available on the system
(as opposed to the probed features), rename the "feature list" command
into "feature list_builtins".
Note that "bpftool feature list" still works as before given that we
recognise arguments from their prefixes; but the real name of the
subcommand, in particular as displayed in the man page or the
interactive help, will now include "_builtins".
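Prefix-based argument recognition of this kind can be sketched as follows. The helper name is hypothetical and the code is not copied from bpftool; it only illustrates why "list" keeps selecting a subcommand whose full name is "list_builtins".

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if arg is a non-empty prefix of cmd. */
static bool sketch_matches_prefix(const char *arg, const char *cmd)
{
	size_t len = strlen(arg);

	return len > 0 && len <= strlen(cmd) && strncmp(arg, cmd, len) == 0;
}
```

With this rule, "list", "list_b" and "list_builtins" all select the same subcommand. How ties between several subcommands sharing a prefix are resolved is outside the scope of this sketch.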
Since we update the bash completion accordingly, let's also take this
chance to redirect error output to /dev/null in the completion script,
to avoid displaying unexpected error messages when users attempt to
tab-complete.
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220701093805.16920-1-quentin@isovalent.com
For example, /sys/fs/bpf/maps.debug is a BPF link. When you run `bpftool map show`
to show it:
Before:
$ bpftool map show pinned /sys/fs/bpf/maps.debug
Error: incorrect object type: unknown
After:
$ bpftool map show pinned /sys/fs/bpf/maps.debug
Error: incorrect object type: link
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220629154832.56986-5-laoar.shao@gmail.com
Now that bpftool is able to produce a list of known program, map, attach
types, let's use as much of this as we can in the bash completion file,
so that we don't have to expand the list each time a new type is added
to the kernel.
Also update the relevant test script to remove some checks that are no
longer needed.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Müller <deso@posteo.net>
Link: https://lore.kernel.org/bpf/20220629203637.138944-3-quentin@isovalent.com
Add a "bpftool feature list" subcommand to list BPF "features".
Contrary to "bpftool feature probe", this is not about the features
available on the system. Instead, it lists all features known to bpftool
from compilation time; in other words, all program, map, attach, link
types known to the libbpf version in use, and all helpers found in the
UAPI BPF header.
The first use case for this feature is bash completion: running the
command provides a list of types that can be used to produce the list of
candidate map types, for example.
Now that bpftool uses "standard" names provided by libbpf for the
program, map, link, and attach types, having the ability to list these
types and helpers could also be useful in scripts to loop over existing
items.
Sample output:
# bpftool feature list prog_types | grep -vw unspec | head -n 6
socket_filter
kprobe
sched_cls
sched_act
tracepoint
xdp
# bpftool -p feature list map_types | jq '.[1]'
"hash"
# bpftool feature list attach_types | grep '^cgroup_'
cgroup_inet_ingress
cgroup_inet_egress
[...]
cgroup_inet_sock_release
# bpftool feature list helpers | grep -vw bpf_unspec | wc -l
207
The "unspec" types and helpers are not filtered out by bpftool, so as to
remain closer to the enums, and to preserve the indices in the JSON
arrays (e.g. "hash" at index 1 == BPF_MAP_TYPE_HASH in map types list).
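Why keeping "unspec" matters for the indices can be shown with a small sketch. The enum below is a hypothetical, illustrative subset whose first values mirror the UAPI map types (unspec = 0, hash = 1, array = 2); it is not the list that bpftool or libbpf actually uses.

```c
#include <string.h>

/* Illustrative subset; the real enum lives in the UAPI BPF header. */
enum sketch_map_type {
	SKETCH_MAP_TYPE_UNSPEC = 0,
	SKETCH_MAP_TYPE_HASH = 1,
	SKETCH_MAP_TYPE_ARRAY = 2,
	SKETCH_MAP_TYPE_COUNT
};

/* Names indexed by enum value, with "unspec" kept at index 0. */
static const char * const sketch_map_names[SKETCH_MAP_TYPE_COUNT] = {
	[SKETCH_MAP_TYPE_UNSPEC] = "unspec",
	[SKETCH_MAP_TYPE_HASH] = "hash",
	[SKETCH_MAP_TYPE_ARRAY] = "array",
};

/* Returns the enum value for name, or -1 if the name is unknown. */
static int sketch_map_type_from_name(const char *name)
{
	for (int i = 0; i < SKETCH_MAP_TYPE_COUNT; i++)
		if (strcmp(sketch_map_names[i], name) == 0)
			return i;
	return -1;
}
```

If "unspec" were dropped from the output, every position would shift by one and an array index could no longer be used directly as the enum value.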
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Müller <deso@posteo.net>
Link: https://lore.kernel.org/bpf/20220629203637.138944-2-quentin@isovalent.com