Introduce the xdp_set_features_flag() utility routine to dynamically
update xdp_features according to the dynamic hw configuration set via
ethtool (e.g. changing the number of hw rx/tx queues).
Add xdp_clear_features_flag() to clear all xdp_features flags.
Reviewed-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix the get_mask utility routine to take into account possible gaps
in the elements list.
Fixes: be5bea1cc0 ("net: add basic C code generators for Netlink")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Properly manage the render-max property for the flags definition type by
introducing a mask value set to (last_element << 1) - 1,
instead of adding a max value set to last_element + 1.
Fixes: be5bea1cc0 ("net: add basic C code generators for Netlink")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
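The two mask fixes above can be illustrated with a minimal sketch in plain C. It is not the generator code itself; the helper names mask_from_last_flag() and mask_from_bits() are hypothetical and exist only for this example. The first helper applies the (last_element << 1) - 1 formula to the value of the last flag, the second ORs individual bit positions together so that gaps in the list are respected.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: mask covering every bit up to and including the
 * last flag value, i.e. (last_element << 1) - 1. */
static uint64_t mask_from_last_flag(uint64_t last_flag)
{
	return (last_flag << 1) - 1;
}

/* Hypothetical helper: mask built from explicit bit positions, so
 * positions missing from the list (gaps) stay cleared. */
static uint64_t mask_from_bits(const unsigned int *bit_positions, size_t count)
{
	uint64_t mask = 0;
	size_t i;

	for (i = 0; i < count; i++)
		mask |= UINT64_C(1) << bit_positions[i];
	return mask;
}
```

With flags 1, 2 and 4 the first helper yields 7, whereas last_element + 1 would give 5, which is not a usable mask. For bit positions 0, 1 and 3 the second helper yields 0xb and leaves bit 2 clear.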
Add a new selftest, local_kptr_stash, which uses bpf_kptr_xchg to stash
a bpf_obj_new-allocated object in a map. Test the following scenarios:
* Stash two rb_nodes in an arraymap, don't unstash them, rely on map
free to destruct them
* Stash two rb_nodes in an arraymap, unstash the second one in a
separate program, rely on map free to destruct first
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230310230743.2320707-4-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The following build error can be seen:
progs/test_deny_namespace.c:22:19: error: call to undeclared function 'BIT_LL'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
__u64 cap_mask = BIT_LL(CAP_SYS_ADMIN);
The struct kernel_cap_struct no longer exists in the kernel as well.
Adjust bpf prog to fix both issues.
Fixes: f122a08b19 ("capability: just use a 'u64' instead of a 'u32[2]' array")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
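As an illustration only, a capability bit can be expressed with a plain 64-bit shift instead of a BIT_LL() macro. The sketch below assumes the post-f122a08b19 layout of one bit per capability in a single u64; the helper names cap_bit() and cap_is_set() are hypothetical, and the actual change to the bpf prog may look different.

```c
#include <stdint.h>

#ifndef CAP_SYS_ADMIN
#define CAP_SYS_ADMIN 21 /* value from the uapi linux/capability.h header */
#endif

/* Hypothetical helper: bit for one capability in a single 64-bit word. */
static uint64_t cap_bit(unsigned int cap)
{
	return UINT64_C(1) << cap;
}

/* Hypothetical helper: non-zero if 'cap' is present in 'caps'. */
static int cap_is_set(uint64_t caps, unsigned int cap)
{
	return (caps & cap_bit(cap)) != 0;
}
```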
The commit 11e456cae9 ("selftests/bpf: Fix compilation errors: Assign a value to a constant")
fixed the issue cleanly in bpf-next.
This is an alternative fix in bpf tree to avoid merge conflict between bpf and bpf-next.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This patch tests how many kmallocs are needed to create and free
a batch of UDP sockets, where each socket has a 64-byte bpf storage.
It also measures how fast the UDP sockets can be created.
The results are from my qemu setup.
Before bpf_mem_cache_alloc/free:
./bench -p 1 local-storage-create
Setting up benchmark 'local-storage-create'...
Benchmark 'local-storage-create' started.
Iter 0 ( 73.193us): creates 213.552k/s (213.552k/prod), 3.09 kmallocs/create
Iter 1 (-20.724us): creates 211.908k/s (211.908k/prod), 3.09 kmallocs/create
Iter 2 ( 9.280us): creates 212.574k/s (212.574k/prod), 3.12 kmallocs/create
Iter 3 ( 11.039us): creates 213.209k/s (213.209k/prod), 3.12 kmallocs/create
Iter 4 (-11.411us): creates 213.351k/s (213.351k/prod), 3.12 kmallocs/create
Iter 5 ( -7.915us): creates 214.754k/s (214.754k/prod), 3.12 kmallocs/create
Iter 6 ( 11.317us): creates 210.942k/s (210.942k/prod), 3.12 kmallocs/create
Summary: creates 212.789 ± 1.310k/s (212.789k/prod), 3.12 kmallocs/create
After bpf_mem_cache_alloc/free:
./bench -p 1 local-storage-create
Setting up benchmark 'local-storage-create'...
Benchmark 'local-storage-create' started.
Iter 0 ( 68.265us): creates 243.984k/s (243.984k/prod), 1.04 kmallocs/create
Iter 1 ( 30.357us): creates 238.424k/s (238.424k/prod), 1.04 kmallocs/create
Iter 2 (-18.712us): creates 232.963k/s (232.963k/prod), 1.04 kmallocs/create
Iter 3 (-15.885us): creates 238.879k/s (238.879k/prod), 1.04 kmallocs/create
Iter 4 ( 5.590us): creates 237.490k/s (237.490k/prod), 1.04 kmallocs/create
Iter 5 ( 8.577us): creates 237.521k/s (237.521k/prod), 1.04 kmallocs/create
Iter 6 ( -6.263us): creates 238.508k/s (238.508k/prod), 1.04 kmallocs/create
Summary: creates 237.298 ± 2.198k/s (237.298k/prod), 1.04 kmallocs/create
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20230308065936.1550103-18-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This patch tweaks the socket_bind bpf prog to test the
local_storage->smap == NULL case in the bpf_local_storage_free()
code path. The idea is to create the local_storage with
the sk_storage_map's selem first. Then add the sk_storage_map2's selem
and then delete the earlier sk_storage_map's selem.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20230308065936.1550103-17-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The send_signal tracepoint tests are non-deterministically failing in
CI. The test works as follows:
1. Two pairs of file descriptors are created using the pipe() function.
One pair is used to communicate between a parent process -> child
process, and the other for the reverse direction.
2. A child is fork()'ed. The child process registers a signal handler,
notifies its parent that the signal handler is registered, and then
waits for its parent to have enabled a BPF program that sends a
signal.
3. The parent opens and loads a BPF skeleton with programs that send
signals to the child process. The different programs are triggered by
different perf events (either NMI or normal perf), or by regular
tracepoints. The signal is delivered to the child whenever the child
triggers the program.
4. The child's signal handler is invoked, which sets a flag saying that
the signal handler was reached. The child then signals to the parent
that it received the signal, and the test ends.
The perf testcases (send_signal_perf{_thread} and
send_signal_nmi{_thread}) work 100% of the time, but the tracepoint
testcases fail non-deterministically because the tracepoint is not
always being fired for the child.
There are two tracepoint programs registered in the test:
'tracepoint/sched/sched_switch', and
'tracepoint/syscalls/sys_enter_nanosleep'. The child never intentionally
blocks, nor sleeps, so neither tracepoint is guaranteed to be triggered.
To fix this, we can have the child trigger the nanosleep program with a
usleep().
Before this patch, the test would fail locally every 2-3 runs. Now, it
doesn't fail after more than 1000 runs.
Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230310061909.1420887-1-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
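The handshake described in steps 1-4 can be sketched in user space as follows. This is a simplified illustration, not the selftest: there is no BPF program here, so the parent calls kill() directly as a stand-in for the program that would send the signal, and the names run_signal_handshake(), child_main() and on_sigusr1() are made up for the example. The child sleeps with usleep() in a loop, mirroring the fix of making the child sleep explicitly.

```c
#define _DEFAULT_SOURCE 1 /* for usleep() under strict -std= modes */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t signal_seen;

static void on_sigusr1(int signo)
{
	(void)signo;
	signal_seen = 1;
}

static void child_main(int from_parent, int to_parent)
{
	char byte = 'r';
	int tries;

	/* Register the handler, then tell the parent we are ready. */
	if (signal(SIGUSR1, on_sigusr1) == SIG_ERR)
		_exit(1);
	if (write(to_parent, &byte, 1) != 1)
		_exit(1);
	/* Wait until the parent reports that the signal source is armed. */
	if (read(from_parent, &byte, 1) != 1)
		_exit(1);
	/* Sleep explicitly (bounded to about 5s) until the signal arrives. */
	for (tries = 0; tries < 5000 && !signal_seen; tries++)
		usleep(1000);
	byte = signal_seen ? 'y' : 'n';
	if (write(to_parent, &byte, 1) != 1)
		_exit(1);
	_exit(0);
}

/* Returns 0 if the child reported that its handler ran, -1 otherwise. */
static int run_signal_handshake(void)
{
	int parent_to_child[2], child_to_parent[2];
	char byte = 0;
	int ok = -1;
	pid_t pid;

	if (pipe(parent_to_child) || pipe(child_to_parent))
		return -1;
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		close(parent_to_child[1]);
		close(child_to_parent[0]);
		child_main(parent_to_child[0], child_to_parent[1]);
	}
	close(parent_to_child[0]);
	close(child_to_parent[1]);
	/* kill() stands in for the BPF program that sends the signal. */
	if (read(child_to_parent[0], &byte, 1) == 1 &&
	    write(parent_to_child[1], "g", 1) == 1 &&
	    kill(pid, SIGUSR1) == 0 &&
	    read(child_to_parent[0], &byte, 1) == 1 && byte == 'y')
		ok = 0;
	close(parent_to_child[1]);
	close(child_to_parent[0]);
	waitpid(pid, NULL, 0);
	return ok;
}
```

In the real test the signal is only delivered when the child hits the traced event, which is why a child that never sleeps may never receive it.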
This reverts commit 6d0c4b11e743 ("libbpf: Poison strlcpy()").
It added the pragma poison directive to libbpf_internal.h to protect
against accidental usage of strlcpy but ended up breaking the build for
toolchains based on libcs which provide the strlcpy() declaration from
string.h (e.g. uClibc-ng). The include order which causes the issue is:
string.h,
from libbpf_common.h:12,
from libbpf.h:20,
from libbpf_internal.h:26,
from strset.c:9:
Fixes: 6d0c4b11e7 ("libbpf: Poison strlcpy()")
Signed-off-by: Jesus Sanchez-Palencia <jesussanp@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230309004836.2808610-1-jesussanp@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
- Add Adrian Hunter to MAINTAINERS as a perf tools reviewer.
- Sync various tools/ copies of kernel headers with the kernel sources, this
time trying to avoid first merging with upstream to then update but instead
copy from upstream so that a merge is avoided and the end result after merging
this pull request is the one expected, tools/perf/check-headers.sh (mostly)
happy, less warnings while building tools/perf/.
- Fix counting when initial delay configured by setting
perf_attr.enable_on_exec when starting workloads from the perf command line.
- Don't avoid emitting a PERF_RECORD_MMAP2 in 'perf inject --buildid-all' when
that record comes with a build-id, otherwise we end up not being able to
resolve symbols.
- Don't use comma as the CSV output separator in the "stat+csv_output" test, as
comma can appear on some tests as a modifier for an event, use @ instead,
ditto for the JSON linter test.
- The offcpu test was looking for some bits being set on
task_struct->prev_state without masking other bits not important for this
specific 'perf test', fix it.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZApKjQAKCRCyPKLppCJ+
JzdfAQDRnwDCxhb4cvx7lVR32L1XMIFW6qLWRBJWoxC2SJi6lgD/SoQgKswkxrJv
XnBP7jEaIsh3M3ak82MxLKbjSAEvnwk=
=jup7
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-fixes-for-v6.3-1-2023-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Add Adrian Hunter to MAINTAINERS as a perf tools reviewer
- Sync various tools/ copies of kernel headers with the kernel sources,
this time trying to avoid first merging with upstream to then update
but instead copy from upstream so that a merge is avoided and the end
result after merging this pull request is the one expected,
tools/perf/check-headers.sh (mostly) happy, less warnings while
building tools/perf/
- Fix counting when initial delay configured by setting
perf_attr.enable_on_exec when starting workloads from the perf
command line
- Don't avoid emitting a PERF_RECORD_MMAP2 in 'perf inject
--buildid-all' when that record comes with a build-id, otherwise we
end up not being able to resolve symbols
- Don't use comma as the CSV output separator in the "stat+csv_output"
test, as comma can appear on some tests as a modifier for an event,
use @ instead, ditto for the JSON linter test
- The offcpu test was looking for some bits being set on
task_struct->prev_state without masking other bits not important for
this specific 'perf test', fix it
* tag 'perf-tools-fixes-for-v6.3-1-2023-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf tools: Add Adrian Hunter to MAINTAINERS as a reviewer
tools headers UAPI: Sync linux/perf_event.h with the kernel sources
tools headers x86 cpufeatures: Sync with the kernel sources
tools include UAPI: Sync linux/vhost.h with the kernel sources
tools arch x86: Sync the msr-index.h copy with the kernel sources
tools headers kvm: Sync uapi/{asm/linux} kvm.h headers with the kernel sources
tools include UAPI: Synchronize linux/fcntl.h with the kernel sources
tools headers: Synchronize {linux,vdso}/bits.h with the kernel sources
tools headers UAPI: Sync linux/prctl.h with the kernel sources
tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench'
perf stat: Fix counting when initial delay configured
tools headers svm: Sync svm headers with the kernel sources
perf test: Avoid counting commas in json linter
perf tests stat+csv_output: Switch CSV separator to @
perf inject: Fix --buildid-all not to eat up MMAP2
tools arch x86: Sync the msr-index.h copy with the kernel sources
perf test: Fix offcpu test prev_state check
We recently added -Wuninitialized, but it's not enough to catch various
silly mistakes or omissions. Let's go all the way to -Wall, just like we
do for user-space code.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230309054015.4068562-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Once we enable -Wall for BPF sources, compiler will complain about lots
of unused variables, variables that are set but never read, etc.
Fix all these issues first before enabling -Wall in Makefile.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230309054015.4068562-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a __sink(expr) macro that forces the compiler to believe that the
passed-in expression is both read and written. It uses a simple embedded
asm for this. This is useful in a lot of tests where we assign a value to
some variable to trigger some action, but later don't read the variable,
causing the compiler to complain (if the corresponding compiler warnings
are turned on, which we'll do in the next patch).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230309054015.4068562-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
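A minimal sketch of such a macro is shown below. It assumes a GCC or Clang toolchain and is an approximation for illustration; the definition in the selftests header may differ in detail. The "+g" constraint marks the operand as both input and output of an empty asm statement, and the function trigger_action() is a hypothetical caller.

```c
/* Assumed shape of the macro: an empty asm statement that claims to
 * both read and write 'expr', so the compiler treats it as used. */
#define __sink(expr) __asm__ volatile("" : "+g"(expr))

/* Hypothetical caller: 'scratch' is assigned only for its side effect. */
static int trigger_action(int input)
{
	int scratch;

	scratch = input * 2;
	/* Without the sink, -Wall could report 'scratch' as set but not used. */
	__sink(scratch);
	return input;
}
```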
Florian Westphal says:
====================
Netfilter updates for net-next
1. nf_tables 'brouting' support, from Sriram Yagnaraman.
2. Update bridge netfilter and ovs conntrack helpers to handle
IPv6 Jumbo packets properly, i.e. fetch the packet length
from hop-by-hop extension header, from Xin Long.
This comes with a BIG TCP test case, added to
tools/testing/selftests/net/.
3. Fix spelling and indentation in conntrack, from Jeremy Sowden.
* 'main' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: nat: fix indentation of function arguments
netfilter: conntrack: fix typo
selftests: add a selftest for big tcp
netfilter: use nf_ip6_check_hbh_len in nf_ct_skb_network_trim
netfilter: move br_nf_check_hbh_len to utils
netfilter: bridge: move pskb_trim_rcsum out of br_nf_check_hbh_len
netfilter: bridge: check len before accessing more nh data
netfilter: bridge: call pskb_may_pull in br_nf_check_hbh_len
netfilter: bridge: introduce broute meta statement
====================
Link: https://lore.kernel.org/r/20230308193033.13965-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
With latest llvm17, selftest fexit_bpf2bpf/func_replace_return_code
has the following verification failure:
0: R1=ctx(off=0,imm=0) R10=fp0
; int connect_v4_prog(struct bpf_sock_addr *ctx)
0: (bf) r7 = r1 ; R1=ctx(off=0,imm=0) R7_w=ctx(off=0,imm=0)
1: (b4) w6 = 0 ; R6_w=0
; memset(&tuple.ipv4.saddr, 0, sizeof(tuple.ipv4.saddr));
...
; return do_bind(ctx) ? 1 : 0;
179: (bf) r1 = r7 ; R1=ctx(off=0,imm=0) R7=ctx(off=0,imm=0)
180: (85) call pc+147
Func#3 is global and valid. Skipping.
181: R0_w=scalar()
181: (bc) w6 = w0 ; R0_w=scalar() R6_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
182: (05) goto pc-129
; }
54: (bc) w0 = w6 ; R0_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R6_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
55: (95) exit
At program exit the register R0 has value (0x0; 0xffffffff) should have been in (0x0; 0x1)
processed 281 insns (limit 1000000) max_states_per_insn 1 total_states 26 peak_states 26 mark_read 13
-- END PROG LOAD LOG --
libbpf: prog 'connect_v4_prog': failed to load: -22
The corresponding source code:
    __attribute__ ((noinline))
    int do_bind(struct bpf_sock_addr *ctx)
    {
        struct sockaddr_in sa = {};

        sa.sin_family = AF_INET;
        sa.sin_port = bpf_htons(0);
        sa.sin_addr.s_addr = bpf_htonl(SRC_REWRITE_IP4);
        if (bpf_bind(ctx, (struct sockaddr *)&sa, sizeof(sa)) != 0)
            return 0;
        return 1;
    }
    ...
    SEC("cgroup/connect4")
    int connect_v4_prog(struct bpf_sock_addr *ctx)
    {
        ...
        return do_bind(ctx) ? 1 : 0;
    }
Insn 180 is a call to 'do_bind'. The call's return value is also the return value
of the program. Since do_bind() returns 0/1, it is legitimate for the compiler to
optimize 'return do_bind(ctx) ? 1 : 0' to 'return do_bind(ctx)'. However, this
optimization breaks the verifier, as the return value of 'do_bind()' is marked as any
scalar, which violates the requirement that the prog return value be 0/1.
There are two ways to fix this problem: (1) changing 'return 1' in do_bind() to
e.g. 'return 10' so the compiler has to do 'do_bind(ctx) ? 1 : 0', or (2)
as suggested by Andrii, marking do_bind() with the __weak attribute so the compiler
cannot make any assumption about the do_bind() return value.
This patch adopts the __weak approach, which is simpler and more resistant
to potential compiler optimizations.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230310012410.2920570-1-yhs@fb.com
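The effect of the weak attribute can be sketched in plain user-space C, assuming a GCC or Clang toolchain. The function names do_bind_sketch() and connect_prog_sketch() are hypothetical stand-ins and take an int instead of a BPF context. Because a weak definition may be replaced at link time, the compiler is not expected to assume that the callee returns only 0 or 1, so the '? 1 : 0' normalization should stay in the caller.

```c
/* Weak and noinline: another definition could replace this one at link
 * time, so its return range should not be assumed by the caller. */
__attribute__((weak, noinline))
int do_bind_sketch(int bind_ok)
{
	if (!bind_ok)
		return 0;
	return 1;
}

/* Stand-in for the program entry point, which must return 0 or 1. */
int connect_prog_sketch(int bind_ok)
{
	return do_bind_sketch(bind_ok) ? 1 : 0;
}
```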
There is a report that the fib_lookup test is flaky when running in parallel.
The failure looks like a symptom of slowness or delay. An example:
Testing IPv6 stale neigh
set_lookup_params:PASS:inet_pton(IPV6_IFACE_ADDR) 0 nsec
test_fib_lookup:PASS:bpf_prog_test_run_opts 0 nsec
test_fib_lookup:FAIL:fib_lookup_ret unexpected fib_lookup_ret: actual 0 != expected 7
test_fib_lookup:FAIL:dmac not match unexpected dmac not match: actual 1 != expected 0
dmac expected 11:11:11:11:11:11 actual 00:00:00:00:00:00
[ Note that the "fib_lookup_ret unexpected fib_lookup_ret actual 0 ..."
is reversed in terms of expected and actual value. Fixing in this
patch also. ]
One possibility is that the stale neigh entry under test was marked dead by the
gc (in neigh_periodic_work). The default gc_stale_time sysctl is 60s.
This patch increases it to 15 mins.
It also:
- fixes the reversed arg (actual vs expected) in one of the
ASSERT_EQ test
- removes the nodad command arg when adding v4 neigh entry which
currently has a warning.
Fixes: 168de02335 ("selftests/bpf: Add bpf_fib_lookup test")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230309060244.3242491-1-martin.lau@linux.dev
-----BEGIN PGP SIGNATURE-----
iQJSBAABCAA8FiEEoEVH9lhNrxiMPSyI7MXwXhnZSjYFAmQJ8eweHGJlbmphbWlu
LnRpc3NvaXJlc0ByZWRoYXQuY29tAAoJEOzF8F4Z2Uo2ff4P/j4zp6J7wfstWL5g
Ma3u3RqRpM0HKw0tO5PeigLYGout40oW8xAH7n8ERu2o45yQAd5ZbgXVey25FTSd
QEVd7zwN/ADMMTTujGQAfzpE6O7eaALVDgtgjOcNS8uRLeyqcDSCgBRaB8sNwLy7
ZhHU5OWKvRCiiwTQyG7gvY9+cTre6wNjdKR2Ei+xra5IS78gp3OZ1NJOjT3iHDxe
Zxo4kpkEaBJmVNYbC41sZfkzuZ84SfKXUC14V3BBiXmvnYU6x3WmuXxFvCgBEjgz
agmKugHrdABa+oooONdKztHSOfa5saeYy11FO+q8txIEZqSiodr1anmk2U77Yu9O
f4E8sQQszHMWyaqac5+dwUCaupgmKtPZOoMRjbGfGjwLObYkhxnJu6AamWYZl7DG
E6AaO+ZV0SQcl89GpJ4+SiXSbSxUopYljzkUnvrnrOqPe4AkdWVTuCexBgGkOKqa
DDQb+OYcI5N2aFMTOx8dkmZ6MPU7Mtot7UPTC1rv5Cgi8xFCH215dLauskDyatmt
XQw5+9hzb1q3ZFFw6E/IhBkRcNLeAga1lsSxeqIGKkCHeEAOuCaO9ev454LD4oKk
7nTqyKKAx+Roaw3dwCyU+U2v12B+PoYBq6BGnJtm/EnF0VuJRi8kV8J9uMXoo30j
v4Fo/IsHUlgIiuJo556tmjcTetUm
=XVs0
-----END PGP SIGNATURE-----
Merge tag 'for-linus-2023030901' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Benjamin Tissoires:
- fix potential out of bound write of zeroes in HID core with a
specially crafted uhid device (Lee Jones)
- fix potential use-after-free in work function in intel-ish-hid (Reka
Norman)
- selftests config fixes (Benjamin Tissoires)
- few device small fixes and support
* tag 'for-linus-2023030901' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: intel-ish-hid: ipc: Fix potential use-after-free in work function
HID: logitech-hidpp: Add support for Logitech MX Master 3S mouse
HID: cp2112: Fix driver not registering GPIO IRQ chip as threaded
selftest: hid: fix hid_bpf not set in config
HID: uhid: Over-ride the default maximum data buffer value with our own
HID: core: Provide new max_buffer_size attribute to over-ride the default
Lorenzo points out that the generic CLI is broken for the netdev
family. When I added the support for documentation of enums
(and sparse enums) the client script was not updated.
It expects the values in enum to be a list of names,
now it can also be a dict (YAML object).
Reported-by: Lorenzo Bianconi <lorenzo@kernel.org>
Fixes: e4b48ed460 ("tools: ynl: add a completely generic client")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Implement a trivial iterator returning same specified integer value
N times as part of bpf_testmod kernel module. Add selftests to validate
everything works end to end.
We also reuse these tests as "verification-only" tests to validate that
kernel prints the state of custom kernel module-defined iterator correctly:
fp-16=iter_testmod_seq(ref_id=1,state=drained,depth=0)
"testmod_seq" part is an iterator type, and is coming from module's BTF
data dynamically at runtime.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230308184121.1165081-9-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add number iterator (bpf_iter_num_{new,next,destroy}()) tests,
validating the correct handling of various corner and common cases
*at runtime*.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230308184121.1165081-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add various tests for open-coded iterators. Some of them exercise
various possible coding patterns in C, some go down to low-level
assembly for more control over various conditions, especially invalid
ones.
We also make use of bpf_for(), bpf_for_each(), bpf_repeat() macros in
some of these tests.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230308184121.1165081-7-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add bpf_for_each(), bpf_for(), and bpf_repeat() macros that make writing
open-coded iterator-based loops much more convenient and natural. These
macros use the cleanup attribute to ensure proper destruction of the
iterator and, thanks to that, provide ergonomics that are
very close to the C language's for() construct. A typical loop would look like:
    int i;
    int arr[N];

    bpf_for(i, 0, N) {
        /* verifier will know that i >= 0 && i < N, so could be used to
         * directly access array elements with no extra checks
         */
        arr[i] = i;
    }
bpf_repeat() is very similar, but it doesn't expose iteration number and
is meant as a simple "repeat action N times" loop:
bpf_repeat(N) { /* whatever, N times */ }
Note that `break` and `continue` statements inside the {} block work as
expected.
bpf_for_each() is a generalization over any kind of BPF open-coded
iterator, allowing a for-each-like approach instead of calling the
low-level bpf_iter_<type>_{new,next,destroy}() APIs explicitly. E.g.:
    struct cgroup *cg;

    bpf_for_each(cgroup, cg, some, input, args) {
        /* do something with each cg */
    }
would call (not-yet-implemented) bpf_iter_cgroup_{new,next,destroy}()
functions to form a loop over cgroups, where `some, input, args` are
passed verbatim into constructor as
bpf_iter_cgroup_new(&it, some, input, args).
As a first demonstration, add pyperf variant based on the bpf_for() loop.
Also clean up a few tests that either included bpf_misc.h header
unnecessarily from the user-space, which is unsupported, or included it
before any common types are defined (and thus leading to unnecessary
compilation warnings, potentially).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230308184121.1165081-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
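The cleanup-attribute technique behind these macros can be sketched in plain user-space C, assuming a GCC or Clang toolchain. This is not the bpf_for() definition: range_for(), struct range_iter_sketch and range_iter_destroy() are hypothetical names, and the destructor only counts its calls so the behaviour can be observed. The iterator lives in the for-statement's init clause, so its destructor runs when the loop is left, including through break.

```c
/* Hypothetical iterator state for the sketch. */
struct range_iter_sketch {
	int next; /* next value to hand out */
	int end;  /* exclusive upper bound */
};

static int range_iter_destroy_calls;

/* Destructor invoked automatically through the cleanup attribute. */
static void range_iter_destroy(struct range_iter_sketch *it)
{
	(void)it;
	range_iter_destroy_calls++;
}

/* The iterator is scoped to the loop; advancing happens in the loop
 * condition, so 'continue' moves on to the next value as usual. */
#define range_for(i, start, end)					\
	for (struct range_iter_sketch range_it__			\
		     __attribute__((cleanup(range_iter_destroy))) =	\
		     { (start), (end) };				\
	     range_it__.next < range_it__.end &&			\
	     ((i) = range_it__.next++, 1);)

/* Sum the values in [start, end) that come before 'stop_at'. */
static int sum_below(int start, int end, int stop_at)
{
	int i, sum = 0;

	range_for(i, start, end) {
		if (i == stop_at)
			break; /* the destructor still runs */
		sum += i;
	}
	return sum;
}
```

For example, sum_below(0, 10, 4) adds 0 through 3 and leaves the loop early, and the destroy counter still increases by one for that call.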
Implement the first open-coded iterator type over a range of integers.
Its public API consists of:
- bpf_iter_num_new() constructor, which accepts [start, end) range
(that is, start is inclusive, end is exclusive).
- bpf_iter_num_next() which will keep returning read-only pointer to int
until the range is exhausted, at which point NULL will be returned.
If bpf_iter_num_next() keeps being called after this, NULL will be
persistently returned.
- bpf_iter_num_destroy() destructor, which needs to be called at some
point to clean up iterator state. BPF verifier enforces that iterator
destructor is called at some point before BPF program exits.
Note that `start = end = X` is a valid combination to set up an empty
iterator. bpf_iter_num_new() will return 0 (success) for any such
combination.
If bpf_iter_num_new() detects an invalid combination of input arguments, it
returns an error and resets the iterator state to, effectively, an empty
iterator, so any subsequent call to bpf_iter_num_next() will keep returning NULL.
BPF verifier has no knowledge that returned integers are in the
[start, end) value range, as both `start` and `end` are not statically
known and enforced: they are runtime values.
While the implementation is pretty trivial, some care needs to be taken
to avoid overflows and underflows. Subsequent selftests will validate
correctness of [start, end) semantics, especially around extremes
(INT_MIN and INT_MAX).
Similarly to bpf_loop(), we enforce that no more than BPF_MAX_LOOPS can
be specified.
bpf_iter_num_{new,next,destroy}() is a logical evolution from bounded
BPF loops and bpf_loop() helper and is the basis for implementing
ergonomic BPF loops with no statically known or verified bounds.
Subsequent patches implement bpf_for() macro, demonstrating how this can
be wrapped into something that works and feels like a normal for() loop
in C language.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230308184121.1165081-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
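The [start, end) semantics described above can be sketched in user-space C. This is an illustration under stated assumptions, not the kernel implementation: the names num_iter_new(), num_iter_next(), num_iter_destroy() and num_iter_sum() are hypothetical, SKETCH_MAX_LOOPS is an assumed cap standing in for BPF_MAX_LOOPS, and the error codes are choices made for the sketch. The position is kept in a 64-bit field so that ranges touching INT_MIN or INT_MAX cannot overflow.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_LOOPS (8 * 1024 * 1024) /* assumed cap for the sketch */

struct num_iter_sketch {
	int64_t next; /* next value to hand out, widened to avoid overflow */
	int64_t end;  /* exclusive upper bound */
	int cur;      /* storage behind the pointer returned by next() */
};

static int num_iter_new(struct num_iter_sketch *it, int start, int end)
{
	/* Start from an empty iterator so error paths are safe to iterate. */
	it->next = it->end = 0;
	it->cur = 0;
	if (start > end)
		return -EINVAL;
	if ((int64_t)end - (int64_t)start > SKETCH_MAX_LOOPS)
		return -E2BIG;
	it->next = start;
	it->end = end;
	return 0;
}

static const int *num_iter_next(struct num_iter_sketch *it)
{
	if (it->next >= it->end)
		return NULL; /* exhausted: keeps returning NULL */
	it->cur = (int)it->next++;
	return &it->cur;
}

static void num_iter_destroy(struct num_iter_sketch *it)
{
	it->next = it->end = 0;
}

/* Usage helper: sum of all values produced for [start, end). */
static long long num_iter_sum(int start, int end)
{
	struct num_iter_sketch it;
	const int *v;
	long long sum = 0;

	num_iter_new(&it, start, end);
	while ((v = num_iter_next(&it)))
		sum += *v;
	num_iter_destroy(&it);
	return sum;
}
```

An empty range such as [3, 3) succeeds and yields nothing, while an inverted range such as [5, 2) reports an error and also yields nothing.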
Commit 62622dab0a ("ima: return IMA digest value only when IMA_COLLECTED
flag is set") caused bpf_ima_inode_hash() to refuse to give non-fresh
digests. IMA test #3 assumed the old behavior, in which bpf_ima_inode_hash()
also returned non-fresh digests.
Correct the test by accepting both cases. If the samples returned are 1,
assume that the commit above is applied and that the returned digest is
fresh. If the samples returned are 2, assume that the commit above is not
applied, and check both the non-fresh and fresh digest.
Fixes: 62622dab0a ("ima: return IMA digest value only when IMA_COLLECTED flag is set")
Reported-by: David Vernet <void@manifault.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Matt Bobrowski <mattbobrowski@google.com>
Link: https://lore.kernel.org/bpf/20230308103713.1681200-1-roberto.sassu@huaweicloud.com
This test runs on the client-router-server topo, and monitors the traffic
on the RX devices of router and server while sending BIG TCP packets with
netperf from client to server. Meanwhile, it changes 'tso' on the TX devs
and 'gro' on the RX devs. Then it checks if any BIG TCP packets appear
on the RX devs with 'ip/ip6tables -m length ! --length 0:65535' for each
case.
Note that we also add tc action ct in link1 ingress to cover the ipv6
jumbo packets process in nf_ct_skb_network_trim() of nf_conntrack_ovs.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Verify that clone3 can be called successfully with CLONE_NEWTIME in
flags.
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Parsing of USDT arguments is architecture-specific; on arm it is
relatively easy since registers used are r[0-10], fp, ip, sp, lr,
pc. Format is slightly different compared to aarch64; forms are
- "size @ [ reg, #offset ]" for dereferences, for example
"-8 @ [ sp, #76 ]" ; " -4 @ [ sp ]"
- "size @ reg" for register values; for example
"-4@r0"
- "size @ #value" for raw values; for example
"-8@#1"
Add support for parsing USDT arguments for ARM architecture.
To test the above changes QEMU's virt[1] board with cortex-a15
CPU was used. libbpf-bootstrap's usdt example[2] was modified to attach
to a test program with DTRACE_PROBE1/2/3/4... probes to test different
combinations.
[1] https://www.qemu.org/docs/master/system/arm/virt.html
[2] https://github.com/libbpf/libbpf-bootstrap/blob/master/examples/c/usdt.bpf.c
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230307120440.25941-3-puranjay12@gmail.com
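The three argument forms listed above can be recognized with a small sscanf()-based parser. The sketch below is an illustration, not the libbpf code: struct usdt_arg_sketch, the USDT_ARG_* enum values and parse_arm_usdt_arg() are hypothetical names, and the register name is accepted as any short run of lowercase letters and digits rather than being checked against the ARM register list.

```c
#include <stdio.h>
#include <string.h>

enum usdt_arg_kind { USDT_ARG_CONST, USDT_ARG_REG, USDT_ARG_REG_DEREF };

struct usdt_arg_sketch {
	enum usdt_arg_kind kind;
	int size;     /* the "size" part before '@', sign kept as written */
	char reg[16]; /* register name, empty for constants */
	long value;   /* offset for dereferences, value for constants */
};

/* Returns 0 on success, -1 if 'str' matches none of the three forms. */
static int parse_arm_usdt_arg(const char *str, struct usdt_arg_sketch *arg)
{
	char reg[16] = "";
	int size = 0, off = 0, len;
	long val = 0;

	memset(arg, 0, sizeof(*arg));

	/* "size @ [ reg, #offset ]"; %n is only stored on a full match */
	len = -1;
	if (sscanf(str, " %d @ [ %15[a-z0-9] , #%d ] %n",
		   &size, reg, &off, &len) == 3 && len >= 0 && !str[len]) {
		arg->kind = USDT_ARG_REG_DEREF;
		arg->value = off;
		goto found_reg;
	}
	/* "size @ [ reg ]" */
	len = -1;
	if (sscanf(str, " %d @ [ %15[a-z0-9] ] %n",
		   &size, reg, &len) == 2 && len >= 0 && !str[len]) {
		arg->kind = USDT_ARG_REG_DEREF;
		goto found_reg;
	}
	/* "size @ reg" */
	len = -1;
	if (sscanf(str, " %d @ %15[a-z0-9] %n",
		   &size, reg, &len) == 2 && len >= 0 && !str[len]) {
		arg->kind = USDT_ARG_REG;
		goto found_reg;
	}
	/* "size @ #value" */
	len = -1;
	if (sscanf(str, " %d @ #%ld %n",
		   &size, &val, &len) == 2 && len >= 0 && !str[len]) {
		arg->kind = USDT_ARG_CONST;
		arg->size = size;
		arg->value = val;
		return 0;
	}
	return -1;

found_reg:
	arg->size = size;
	strcpy(arg->reg, reg);
	return 0;
}
```

Checking that %n was stored and that the whole string was consumed is what rejects partial matches, since sscanf() alone reports only how many conversions succeeded.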
The parse_usdt_arg() function is defined differently for each
architecture but the last part of the function is repeated
verbatim for each architecture.
Refactor parse_usdt_arg() to fill the arg_sz and then do the repeated
post-processing in parse_usdt_spec().
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230307120440.25941-2-puranjay12@gmail.com
Coverity reported a potential underflow of the offset variable used in
the find_cd() function. Switch to using a signed 64 bit integer for the
representation of offset to make sure we can never underflow.
Fixes: 1eebcb6063 ("libbpf: Implement basic zip archive parsing support")
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230307215504.837321-1-deso@posteo.net
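The reason a signed 64-bit offset helps can be seen in a backward search loop. The sketch below is a simplified illustration and not the find_cd() code: find_last_marker() is a hypothetical helper that scans a buffer from the end for a 4-byte marker. With an unsigned offset, `size - 4` would wrap around for buffers shorter than the marker and `offset >= 0` would always be true; with int64_t both cases behave as intended.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: offset of the last occurrence of a 4-byte marker
 * in data[0..size), or -1 if it is absent or the buffer is too small. */
static int64_t find_last_marker(const void *data, size_t size,
				const void *marker)
{
	const unsigned char *bytes = data;
	int64_t offset;

	/* Signed arithmetic: for size < 4 the start offset is negative and
	 * the loop body never runs, instead of wrapping to a huge value. */
	for (offset = (int64_t)size - 4; offset >= 0; offset--) {
		if (memcmp(bytes + offset, marker, 4) == 0)
			return offset;
	}
	return -1;
}
```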
I was intending to make all the Netlink Spec code BSD-3-Clause
to ease the adoption but it appears that:
- I fumbled the uAPI and used "GPL WITH uAPI note" there
- it gives people pause as they expect GPL in the kernel
As suggested by Chuck, re-license under a dual license. This gives us the
benefit of full BSD freedom while fulfilling the broad "kernel is under GPL"
expectations.
Link: https://lore.kernel.org/all/20230304120108.05dd44c5@kernel.org/
Link: https://lore.kernel.org/r/20230306200457.3903854-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZAZsBwAKCRDbK58LschI
g3W1AQCQnO6pqqX5Q2aYDAZPlZRtV2TRLjuqrQE0dHW/XLAbBgD/bgsAmiKhPSCG
2mTt6izpTQVlZB0e8KcDIvbYd9CE3Qc=
=EjJQ
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2023-03-06
We've added 85 non-merge commits during the last 13 day(s) which contain
a total of 131 files changed, 7102 insertions(+), 1792 deletions(-).
The main changes are:
1) Add skb and XDP typed dynptrs which allow BPF programs for more
ergonomic and less brittle iteration through data and variable-sized
accesses, from Joanne Koong.
2) Bigger batch of BPF verifier improvements to prepare for upcoming BPF
open-coded iterators allowing for less restrictive looping capabilities,
from Andrii Nakryiko.
3) Rework RCU enforcement in the verifier, add kptr_rcu and enforce BPF
programs to NULL-check before passing such pointers into kfunc,
from Alexei Starovoitov.
4) Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and in
local storage maps, from Kumar Kartikeya Dwivedi.
5) Add BPF verifier support for ST instructions in convert_ctx_access()
which will help new -mcpu=v4 clang flag to start emitting them,
from Eduard Zingerman.
6) Make uprobe attachment Android APK aware by supporting attachment
to functions inside ELF objects contained in APKs via function names,
from Daniel Müller.
7) Add a new BPF_F_TIMER_ABS flag for the bpf_timer_start() helper
to start the timer with absolute expiration value instead of relative
one, from Tero Kristo.
8) Add a new kfunc bpf_cgroup_from_id() to look up cgroups via id,
from Tejun Heo.
9) Extend libbpf to support users manually attaching kprobes/uprobes
in the legacy/perf/link mode, from Menglong Dong.
10) Implement workarounds in the mips BPF JIT for DADDI/R4000,
from Jiaxun Yang.
11) Enable mixing bpf2bpf and tailcalls for the loongarch BPF JIT,
from Hengqi Chen.
12) Extend BPF instruction set doc with describing the encoding of BPF
instructions in terms of how bytes are stored under big/little endian,
from Jose E. Marchesi.
13) Follow-up to enable kfunc support for riscv BPF JIT, from Pu Lehui.
14) Fix bpf_xdp_query() backwards compatibility on old kernels,
from Yonghong Song.
15) Fix BPF selftest cross compilation with CLANG_CROSS_FLAGS,
from Florent Revest.
16) Improve bpf_cpumask_ma to only allocate one bpf_mem_cache,
from Hou Tao.
17) Fix BPF verifier's check_subprogs to not unnecessarily mark
a subprogram with has_tail_call, from Ilya Leoshkevich.
18) Fix arm syscall regs spec in libbpf's bpf_tracing.h, from Puranjay Mohan.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits)
selftests/bpf: Add test for legacy/perf kprobe/uprobe attach mode
selftests/bpf: Split test_attach_probe into multi subtests
libbpf: Add support to set kprobe/uprobe attach mode
tools/resolve_btfids: Add /libsubcmd to .gitignore
bpf: add support for fixed-size memory pointer returns for kfuncs
bpf: generalize dynptr_get_spi to be usable for iters
bpf: mark PTR_TO_MEM as non-null register type
bpf: move kfunc_call_arg_meta higher in the file
bpf: ensure that r0 is marked scratched after any function call
bpf: fix visit_insn()'s detection of BPF_FUNC_timer_set_callback helper
bpf: clean up visit_insn()'s instruction processing
selftests/bpf: adjust log_fixup's buffer size for proper truncation
bpf: honor env->test_state_freq flag in is_state_visited()
selftests/bpf: enhance align selftest's expected log matching
bpf: improve regsafe() checks for PTR_TO_{MEM,BUF,TP_BUFFER}
bpf: improve stack slot state printing
selftests/bpf: Disassembler tests for verifier.c:convert_ctx_access()
selftests/bpf: test if pointer type is tracked for BPF_ST_MEM
bpf: allow ctx writes using BPF_ST_MEM instruction
bpf: Use separate RCU callbacks for freeing selem
...
====================
Link: https://lore.kernel.org/r/20230307004346.27578-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2023-03-06
We've added 8 non-merge commits during the last 7 day(s) which contain
a total of 9 files changed, 64 insertions(+), 18 deletions(-).
The main changes are:
1) Fix BTF resolver for DATASEC sections when a VAR points at a modifier,
that is, keep resolving such instances instead of bailing out,
from Lorenz Bauer.
2) Fix BPF test framework with regards to xdp_frame info misplacement
in the "live packet" code, from Alexander Lobakin.
3) Fix an infinite loop in BPF sockmap code for TCP/UDP/AF_UNIX,
from Liu Jian.
4) Fix a build error for riscv BPF JIT under PERF_EVENTS=n,
from Randy Dunlap.
5) Several BPF doc fixes with either broken links or external instead
of internal doc links, from Bagas Sanjaya.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: check that modifier resolves after pointer
btf: fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR
bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES
bpf, doc: Link to submitting-patches.rst for general patch submission info
bpf, doc: Do not link to docs.kernel.org for kselftest link
bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser()
riscv, bpf: Fix patch_text implicit declaration
bpf, docs: Fix link to BTF doc
====================
Link: https://lore.kernel.org/r/20230306215944.11981-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Bring back the Python scripts that were initially added with
TEST_GEN_FILES but now with TEST_FILES to avoid having them deleted
when doing a clean. Also fix the way the architecture is being
determined as they should also be installed when ARCH=x86_64 is
provided explicitly. Then also append extra files to TEST_FILES and
TEST_PROGS with += so they don't get discarded.
Fixes: ba2d788aa8 ("selftests: amd-pstate: Trigger tbench benchmark and test cpus")
Fixes: a49fb7218e ("selftests: amd-pstate: Don't delete source files via Makefile")
Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
To pick up the changes in:
09519ec3b1 ("perf: Add perf_event_attr::config3")
The patches for the tooling side will come later.
This addresses this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h'
diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/lkml/ZAZLYmDjWjSItWOq@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a regression test that ensures that a VAR pointing at a
modifier which follows a PTR (or STRUCT or ARRAY) is resolved
correctly by the datasec validator.
Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Link: https://lore.kernel.org/r/20230306112138.155352-3-lmb@isovalent.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
&xdp_buff and &xdp_frame are bound in a way that
xdp_buff->data_hard_start == xdp_frame
It's always the case and e.g. xdp_convert_buff_to_frame() relies on
this.
IOW, the following:
for (u32 i = 0; i < 0xdead; i++) {
	xdpf = xdp_convert_buff_to_frame(&xdp);
	xdp_convert_frame_to_buff(xdpf, &xdp);
}
shouldn't ever modify @xdpf's contents or the pointer itself.
However, "live packet" code wrongly treats &xdp_frame as part of its
context placed *before* the data_hard_start. With such flow,
data_hard_start is sizeof(*xdpf) off to the right and no longer points
to the XDP frame.
Instead of replacing `sizeof(ctx)` with `offsetof(ctx, xdpf)` in several
places and praying that there are no more miscalcs left somewhere in the
code, unionize ::frm with ::data in a flex array, so that both start
pointing to the actual data_hard_start and the XDP frame actually starts
being a part of it, i.e. a part of the headroom, not the context.
A nice side effect is that the maximum frame size for this mode gets
increased by 40 bytes, as xdp_buff::frame_sz includes everything from
data_hard_start (-> includes xdpf already) to the end of XDP/skb shared
info.
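The layout problem can be illustrated with a small user-space model. This is only a sketch: the names toy_frame, toy_ctx, head_old and head_new are hypothetical stand-ins rather than the kernel's definitions, and a fixed-size byte array replaces the flexible array so that the example stays within standard C11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the kernel's struct definitions. */
struct toy_frame {
	void *data;
	uint16_t len;
	uint16_t headroom;
};

struct toy_ctx {
	void *data;
	void *data_end;
};

/* Old layout: the frame is treated as context placed before the data. */
struct head_old {
	struct toy_ctx ctx;
	struct toy_frame frm;
	uint8_t data[];
};

/* New layout: frame and data share one starting address. */
struct head_new {
	struct toy_ctx ctx;
	union {
		struct toy_frame frame;
		/* Fixed size used as a portable stand-in for a flex array. */
		uint8_t data[sizeof(struct toy_frame)];
	};
};

/* Compile-time check of the invariant described in the commit message. */
static_assert(offsetof(struct head_new, frame) == offsetof(struct head_new, data),
	      "frame and data must start at the same offset");

/* Distance between the frame and the start of the data area (old layout). */
static inline size_t hard_start_gap_old(void)
{
	return offsetof(struct head_old, data) - offsetof(struct head_old, frm);
}

/* Same distance for the unionized layout. */
static inline size_t hard_start_gap_new(void)
{
	return offsetof(struct head_new, data) - offsetof(struct head_new, frame);
}
```

In the old layout the data area begins one whole frame structure after the frame itself, which is the "off to the right" effect described above. In the unionized layout the gap is zero, so the frame becomes part of the headroom.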
Also update %MAX_PKT_SIZE accordingly in the selftests code. Leave it
hardcoded for 64 bit && 4k pages, it can be made more flexible later on.
Minor: align `&head->data` with how `head->frm` is assigned for
consistency.
Minor #2: rename 'frm' to 'frame' in &xdp_page_head while at it for
clarity.
(was found while testing XDP traffic generator on ice, which calls
xdp_convert_frame_to_buff() for each XDP frame)
Fixes: b530e9e106 ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20230224163607.2994755-1-aleksander.lobakin@intel.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
To pick the changes from:
8415a74852 ("x86/cpu, kvm: Add support for CPUID_80000021_EAX")
This only causes these perf files to be rebuilt:
CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o
And addresses these perf build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/disabled-features.h' differs from latest version at 'arch/x86/include/asm/disabled-features.h'
diff -u tools/arch/x86/include/asm/disabled-features.h arch/x86/include/asm/disabled-features.h
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/required-features.h' differs from latest version at 'arch/x86/include/asm/required-features.h'
diff -u tools/arch/x86/include/asm/required-features.h arch/x86/include/asm/required-features.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZAYlS2XTJ5hRtss7@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In order to adapt to older kernels, split the "attach_probe" test
into multiple subtests:
manual // manual attach tests for kprobe/uprobe
auto // auto-attach tests for kprobe and uprobe
kprobe-sleepable // kprobe sleepable test
uprobe-lib // uprobe tests for library function by name
uprobe-sleepable // uprobe sleepable test
uprobe-ref_ctr // uprobe ref_ctr test
Since a sleepable kprobe needs the BPF_F_SLEEPABLE flag set before loading,
move it to a standalone skel file, so that a kernel which does not
support it does not make the whole load fail.
This way, only the supported part of the subtests needs to be enabled on
older kernels.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Biao Jiang <benbjiang@tencent.com>
Link: https://lore.kernel.org/bpf/20230306064833.7932-3-imagedong@tencent.com
By default, libbpf attaches a kprobe/uprobe BPF program in the
latest mode that is supported by the kernel. This patch adds support
for letting users manually attach a kprobe/uprobe in legacy or perf mode.
There are 3 modes supported by the kernel for attaching a kprobe/uprobe:
LEGACY: create perf event in legacy way and don't use bpf_link
PERF: create perf event with perf_event_open() and don't use bpf_link
LINK: create perf event with perf_event_open() and use bpf_link
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Biao Jiang <benbjiang@tencent.com>
Link: https://lore.kernel.org/bpf/20230113093427.1666466-1-imagedong@tencent.com/
Link: https://lore.kernel.org/bpf/20230306064833.7932-2-imagedong@tencent.com
Users now can manually choose the mode with
bpf_program__attach_uprobe_opts()/bpf_program__attach_kprobe_opts().
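The choice between the attach modes can be sketched as a small selection routine. This is a toy model, not libbpf's code: every toy_* identifier is hypothetical, and the capability flags (including the assumption that the legacy mode is always available) are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; not libbpf's actual identifiers. */
enum toy_attach_mode {
	TOY_MODE_DEFAULT = 0,	/* let the library pick the newest mode */
	TOY_MODE_LEGACY,	/* legacy perf event, no bpf_link */
	TOY_MODE_PERF,		/* perf_event_open(), no bpf_link */
	TOY_MODE_LINK,		/* perf_event_open() plus bpf_link */
};

/* Hypothetical description of what the running kernel supports. */
struct toy_kernel_caps {
	bool perf_probe;	/* probes can be created via perf_event_open() */
	bool bpf_link;		/* perf-event based bpf_link is available */
};

static inline bool toy_mode_supported(enum toy_attach_mode mode,
				      const struct toy_kernel_caps *caps)
{
	switch (mode) {
	case TOY_MODE_LEGACY:
		return true;	/* assumption of this model */
	case TOY_MODE_PERF:
		return caps->perf_probe;
	case TOY_MODE_LINK:
		return caps->perf_probe && caps->bpf_link;
	default:
		return false;
	}
}

/* Returns the mode to use, or -1 if an explicitly requested mode is unavailable. */
static inline int toy_pick_attach_mode(enum toy_attach_mode requested,
				       const struct toy_kernel_caps *caps)
{
	assert(caps != NULL);

	/* An explicit request is honoured or rejected, never silently changed. */
	if (requested != TOY_MODE_DEFAULT)
		return toy_mode_supported(requested, caps) ? (int)requested : -1;

	/* Default behaviour: newest supported mode first. */
	if (toy_mode_supported(TOY_MODE_LINK, caps))
		return TOY_MODE_LINK;
	if (toy_mode_supported(TOY_MODE_PERF, caps))
		return TOY_MODE_PERF;
	return TOY_MODE_LEGACY;
}
```

Failing an explicit request instead of silently falling back is a design choice of this sketch. It keeps a test that asks for the legacy or perf mode from accidentally exercising a different code path.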
Add libsubcmd to .gitignore, otherwise after compiling the kernel it
would result in the following:
# bpf-next...bpf-next/master
?? tools/bpf/resolve_btfids/libsubcmd/
Signed-off-by: Rong Tao <rongtao@cestc.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/tencent_F13D670D5D7AA9C4BD868D3220921AAC090A@qq.com
Adjust log_fixup's expected buffer length to fix the test. It's pretty
finicky in its length expectation, but it doesn't break often. So just
adjust the length to work on current kernel and with follow up iterator
changes as well.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230302235015.2044271-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Allow searching for expected register state in all the verifier log
output that's related to specified instruction number.
See added comment for an example of possible situation that is happening
due to a simple enhancement done in the next patch, which fixes handling
of env->test_state_freq flag in state checkpointing logic.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230302235015.2044271-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Function verifier.c:convert_ctx_access() applies some rewrites to BPF
instructions that read or write BPF program context. This commit adds
machinery to allow test cases that inspect BPF program after these
rewrites are applied.
An example of a test case:
{
	// Shorthand for field offset and size specification
	N(CGROUP_SOCKOPT, struct bpf_sockopt, retval),
	// Pattern generated for field read
	.read  = "$dst = *(u64 *)($ctx + bpf_sockopt_kern::current_task);"
		 "$dst = *(u64 *)($dst + task_struct::bpf_ctx);"
		 "$dst = *(u32 *)($dst + bpf_cg_run_ctx::retval);",
	// Pattern generated for field write
	.write = "*(u64 *)($ctx + bpf_sockopt_kern::tmp_reg) = r9;"
		 "r9 = *(u64 *)($ctx + bpf_sockopt_kern::current_task);"
		 "r9 = *(u64 *)(r9 + task_struct::bpf_ctx);"
		 "*(u32 *)(r9 + bpf_cg_run_ctx::retval) = $src;"
		 "r9 = *(u64 *)($ctx + bpf_sockopt_kern::tmp_reg);",
},
For each test case, up to three programs are created:
- One that uses BPF_LDX_MEM to read the context field.
- One that uses BPF_STX_MEM to write to the context field.
- One that uses BPF_ST_MEM to write to the context field.
The disassembly of each program is compared with the pattern specified
in the test case.
Kernel code for disassembly is reused (as is in the bpftool).
To keep Makefile changes to the minimum, symbolic links to
`kernel/bpf/disasm.c` and `kernel/bpf/disasm.h` are added.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230304011247.566040-4-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Check that verifier tracks pointer types for BPF_ST_MEM instructions
and reports error if pointer types do not match for different
execution branches.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230304011247.566040-3-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Lift verifier restriction to use BPF_ST_MEM instructions to write to
context data structures. This requires the following changes:
- verifier.c:do_check() for BPF_ST updated to:
- no longer forbid writes to registers of type PTR_TO_CTX;
- track dst_reg type in the env->insn_aux_data[...].ptr_type field
(same way it is done for BPF_STX and BPF_LDX instructions).
- verifier.c:convert_ctx_access() and various callbacks invoked by
it are updated to handle BPF_ST instruction alongside BPF_STX.
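For readers unfamiliar with the two store forms, the difference can be sketched in a few lines. BPF_ST stores an immediate, while BPF_STX stores the contents of a source register, so any code that rewrites context stores has to cope with both. The opcode values below follow the BPF instruction set encoding; the toy_* names and helper functions are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Instruction class, mode and size values from the BPF instruction set. */
#define TOY_BPF_LDX	0x01
#define TOY_BPF_ST	0x02
#define TOY_BPF_STX	0x03
#define TOY_BPF_MEM	0x60
#define TOY_BPF_W	0x00

/* Hypothetical local copy of the 64-bit instruction layout. */
struct toy_insn {
	uint8_t code;
	uint8_t dst_reg:4;
	uint8_t src_reg:4;
	int16_t off;
	int32_t imm;
};

/* *(u32 *)(dst + off) = imm: the value comes from the immediate field. */
static inline struct toy_insn toy_st_mem_w(uint8_t dst, int16_t off, int32_t imm)
{
	struct toy_insn insn = {
		.code = TOY_BPF_ST | TOY_BPF_MEM | TOY_BPF_W,
		.dst_reg = dst,
		.src_reg = 0,
		.off = off,
		.imm = imm,
	};

	assert(dst <= 10);	/* BPF has registers r0..r10 */
	return insn;
}

/* *(u32 *)(dst + off) = src: the value comes from a register. */
static inline struct toy_insn toy_stx_mem_w(uint8_t dst, uint8_t src, int16_t off)
{
	struct toy_insn insn = {
		.code = TOY_BPF_STX | TOY_BPF_MEM | TOY_BPF_W,
		.dst_reg = dst,
		.src_reg = src,
		.off = off,
		.imm = 0,
	};

	assert(dst <= 10 && src <= 10);
	return insn;
}

/* The instruction class lives in the low three bits of the opcode. */
static inline int toy_is_store(const struct toy_insn *insn)
{
	uint8_t cls = insn->code & 0x07;

	return cls == TOY_BPF_ST || cls == TOY_BPF_STX;
}
```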
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230304011247.566040-2-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
To pick up the changes in:
e7862eda30 ("x86/cpu: Support AMD Automatic IBRS")
0125acda7d ("x86/bugs: Reset speculation control settings on init")
38aaf921e9 ("perf/x86: Add Meteor Lake support")
5b6fac3fa4 ("x86/resctrl: Detect and configure Slow Memory Bandwidth Allocation")
dc2a3e8579 ("x86/resctrl: Add interface to read mbm_total_bytes_config")
Addressing these tools/perf build warnings:
diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
That makes the beautification scripts pick up some new entries:
$ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
$ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
$ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
$ diff -u before after
--- before 2023-03-03 18:26:51.766923522 -0300
+++ after 2023-03-03 18:27:09.987415481 -0300
@@ -267,9 +267,11 @@
[0xc000010e - x86_64_specific_MSRs_offset] = "AMD64_LBR_SELECT",
[0xc000010f - x86_64_specific_MSRs_offset] = "AMD_DBG_EXTN_CFG",
[0xc0000200 - x86_64_specific_MSRs_offset] = "IA32_MBA_BW_BASE",
+ [0xc0000280 - x86_64_specific_MSRs_offset] = "IA32_SMBA_BW_BASE",
[0xc0000300 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS",
[0xc0000301 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_CTL",
[0xc0000302 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS_CLR",
+ [0xc0000400 - x86_64_specific_MSRs_offset] = "IA32_EVT_CFG_BASE",
};
#define x86_AMD_V_KVM_MSRs_offset 0xc0010000
$
Now one can trace systemwide asking to see backtraces to where that MSR
is being read/written, see this example with a previous update:
# perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
^C#
If we use -v (verbose mode) we can see what it does behind the scenes:
# perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
Using CPUID AuthenticAMD-25-21-0
0x6a0
0x6a8
New filter for msr:read_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
0x6a0
0x6a8
New filter for msr:write_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
mmap size 528384B
^C#
Example with a frequent msr:
# perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
Using CPUID AuthenticAMD-25-21-0
0x48
New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
0x48
New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
mmap size 528384B
Looking at the vmlinux_path (8 entries long)
symsrc__init: build id mismatch for vmlinux.
Using /proc/kcore for kernel data
Using /proc/kallsyms for symbols
0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
do_trace_write_msr ([kernel.kallsyms])
do_trace_write_msr ([kernel.kallsyms])
__switch_to_xtra ([kernel.kallsyms])
__switch_to ([kernel.kallsyms])
__schedule ([kernel.kallsyms])
schedule ([kernel.kallsyms])
futex_wait_queue_me ([kernel.kallsyms])
futex_wait ([kernel.kallsyms])
do_futex ([kernel.kallsyms])
__x64_sys_futex ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
__futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so)
0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
do_trace_write_msr ([kernel.kallsyms])
do_trace_write_msr ([kernel.kallsyms])
__switch_to_xtra ([kernel.kallsyms])
__switch_to ([kernel.kallsyms])
__schedule ([kernel.kallsyms])
schedule_idle ([kernel.kallsyms])
do_idle ([kernel.kallsyms])
cpu_startup_entry ([kernel.kallsyms])
secondary_startup_64_no_verify ([kernel.kallsyms])
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Babu Moger <babu.moger@amd.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Breno Leitao <leitao@debian.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nikunj A Dadhania <nikunj@amd.com>
Link: https://lore.kernel.org/lkml/ZAJoaZ41+rU5H0vL@kernel.org
[ I had published the perf-tools branch before with the sync with ]
[ 8c29f01654 ("x86/sev: Add SEV-SNP guest feature negotiation support") ]
[ I removed it from this new sync ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the changes in:
89b0e7de34 ("KVM: arm64: nv: Introduce nested virtualization VCPU feature")
14329b825f ("KVM: x86/pmu: Introduce masked events to the pmu event filter")
6213b701a9 ("KVM: x86: Replace 0-length arrays with flexible arrays")
3fd49805d1 ("KVM: s390: Extend MEM_OP ioctl by storage key checked cmpxchg")
That doesn't change functionality in tools/perf, as no new ioctl is added
for the 'perf trace' scripts to harvest.
This addresses these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h
Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h'
diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h
Cc: Aaron Lewis <aaronlewis@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/lkml/ZAJlg7%2FfWDVGX0F3@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the changes in:
6fd7353829 ("mm/memfd: add F_SEAL_EXEC")
That doesn't add or change any perf tools functionality, only addresses
these build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/linux/fcntl.h' differs from latest version at 'include/uapi/linux/fcntl.h'
diff -u tools/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the changes in this cset:
cbdb1f163a ("vdso/bits.h: Add BIT_ULL() for the sake of consistency")
That just causes perf to rebuild, the macro included doesn't clash with
anything in tools/{perf,objtool,bpf}.
This addresses these perf build warnings:
Warning: Kernel ABI header at 'tools/include/linux/bits.h' differs from latest version at 'include/linux/bits.h'
diff -u tools/include/linux/bits.h include/linux/bits.h
Warning: Kernel ABI header at 'tools/include/vdso/bits.h' differs from latest version at 'include/vdso/bits.h'
diff -u tools/include/vdso/bits.h include/vdso/bits.h
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We also continue with SYM_TYPED_FUNC_START() in util/include/linux/linkage.h
and with an exception in tools/perf/check_headers.sh's diff check to ignore
the include cfi_types.h line when checking if the kernel original files drifted
from the copies we carry.
This is to get the changes from:
69d4c0d321 ("entry, kasan, x86: Disallow overriding mem*() functions")
That addresses these perf tools build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S'
diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/ZAH%2FjsioJXGIOrkf@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
bpf_rcu_read_lock/unlock() are only available in clang compiled kernels. Lack
of such key mechanism makes it impossible for sleepable bpf programs to use RCU
pointers.
Allow bpf_rcu_read_lock/unlock() in GCC compiled kernels (though GCC doesn't
support btf_type_tag yet) and allowlist certain field dereferences in important
data structures like task_struct, cgroup, socket that are used by sleepable
programs either as RCU pointer or full trusted pointer (which is valid outside
of RCU CS). Use BTF_TYPE_SAFE_RCU and BTF_TYPE_SAFE_TRUSTED macros for such
tagging. They will be removed once GCC supports btf_type_tag.
With that refactor check_ptr_to_btf_access(). Make it strict in enforcing
PTR_TRUSTED and PTR_UNTRUSTED while deprecating old PTR_TO_BTF_ID without
modifier flags. There is a chance that this strict enforcement might break
existing programs (especially on GCC compiled kernels), but this cleanup has to
start sooner rather than later. Note PTR_TO_CTX access still yields old deprecated
PTR_TO_BTF_ID. Once it's converted to strict PTR_TRUSTED or PTR_UNTRUSTED the
kfuncs and helpers will be able to default to KF_TRUSTED_ARGS. KF_RCU will
remain as a weaker version of KF_TRUSTED_ARGS where obj refcnt could be 0.
Adjust rcu_read_lock selftest to run on gcc and clang compiled kernels.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20230303041446.3630-7-alexei.starovoitov@gmail.com
Adjust cgroup kfunc test to dereference RCU protected cgroup pointer
as PTR_TRUSTED and pass into KF_TRUSTED_ARGS kfunc.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20230303041446.3630-6-alexei.starovoitov@gmail.com
The lifetime of certain kernel structures like 'struct cgroup' is protected by RCU.
Hence it's safe to dereference them directly from __kptr tagged pointers in bpf maps.
The resulting pointer is MEM_RCU and can be passed to kfuncs that expect KF_RCU.
Dereference of other kptr-s returns PTR_UNTRUSTED.
For example:
struct map_value {
	struct cgroup __kptr *cgrp;
};
SEC("tp_btf/cgroup_mkdir")
int BPF_PROG(test_cgrp_get_ancestors, struct cgroup *cgrp_arg, const char *path)
{
	struct cgroup *cg, *cg2;
	cg = bpf_cgroup_acquire(cgrp_arg); // cg is PTR_TRUSTED and ref_obj_id > 0
	bpf_kptr_xchg(&v->cgrp, cg);
	cg2 = v->cgrp; // This is new feature introduced by this patch.
	// cg2 is PTR_MAYBE_NULL | MEM_RCU.
	// When cg2 != NULL, it's a valid cgroup, but its percpu_ref could be zero
	if (cg2)
		bpf_cgroup_ancestor(cg2, level); // safe to do.
}
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20230303041446.3630-4-alexei.starovoitov@gmail.com
__kptr was meant to store PTR_UNTRUSTED kernel pointers inside bpf maps.
The concept felt useful, but didn't get much traction,
since bpf_rdonly_cast() was added soon after and bpf programs received
a simpler way to access PTR_UNTRUSTED kernel pointers
without going through restrictive __kptr usage.
Rename __kptr_ref -> __kptr and __kptr -> __kptr_untrusted to indicate
its intended usage.
The main goal of __kptr_untrusted was to read/write such pointers
directly while bpf_kptr_xchg was a mechanism to access refcnted
kernel pointers. The next patch will allow RCU protected __kptr access
with direct read. At that point __kptr_untrusted will be deprecated.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20230303041446.3630-2-alexei.starovoitov@gmail.com
Pretty much all families use value: 1 or reserve as unspec
the first entry in attribute set and the first operation.
Make this the default. Update documentation (the doc for
values of operations just refers back to doc for attrs
so updating only attrs).
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
To avoid having to repeat the entire definition of an attribute
(including the value) use the Attr object from the original set.
In fact this is already the documented expectation.
Fixes: be5bea1cc0 ("net: add basic C code generators for Netlink")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add test for the absolute BPF timer under the existing timer tests. This
will run the timer two times with 1us expiration time, and then re-arm
the timer at ~35s in the future. At the end, it is verified that the
absolute timer expired exactly two times.
Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
Link: https://lore.kernel.org/r/20230302114614.2985072-3-tero.kristo@linux.intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a new flag BPF_F_TIMER_ABS that can be passed to bpf_timer_start()
to start an absolute value timer instead of the default relative value.
This makes the timer expire at an exact point in time, instead of a time
with latencies induced by both the BPF and timer subsystems.
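The difference between the two modes comes down to how the expiration time is derived from the value passed in. The following is a user-space sketch under stated assumptions: only the flag value mirrors the UAPI definition, while toy_timer_expiry() and its error convention are hypothetical and are not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the UAPI flag value; everything else here is a toy model. */
#define TOY_BPF_F_TIMER_ABS	(1ULL << 0)

/*
 * Compute when a timer would fire.
 * Relative mode: nsecs is a delay added to the current time.
 * Absolute mode: nsecs already is the expiration time on the timer's clock.
 * Returns 0 on success, -1 if unknown flag bits are set (model behaviour).
 */
static inline int toy_timer_expiry(uint64_t now_ns, uint64_t nsecs,
				   uint64_t flags, uint64_t *expiry_ns)
{
	assert(expiry_ns != NULL);

	if (flags & ~TOY_BPF_F_TIMER_ABS)
		return -1;

	if (flags & TOY_BPF_F_TIMER_ABS)
		*expiry_ns = nsecs;
	else
		*expiry_ns = now_ns + nsecs;
	return 0;
}
```

In relative mode any delay between reading the clock and arming the timer shifts the expiration. In absolute mode the target time is fixed by the caller, which is the property the commit message refers to.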
Suggested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com>
Link: https://lore.kernel.org/r/20230302114614.2985072-2-tero.kristo@linux.intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Per C99 standard [0], Section 6.7.8, Paragraph 10:
If an object that has automatic storage duration is not initialized
explicitly, its value is indeterminate.
And in the same document, in appendix "J.2 Undefined behavior":
The behavior is undefined in the following circumstances:
[...]
The value of an object with automatic storage duration is used while
it is indeterminate (6.2.4, 6.7.8, 6.8).
This means that use of an uninitialized stack variable is undefined
behavior, and therefore that clang can choose to do a variety of scary
things, such as not generating bytecode for "bunch of useful code" in
the below example:
void some_func()
{
	int i;
	if (!i)
		return;
	// bunch of useful code
}
To add insult to injury, if some_func above is a helper function for
some BPF program, clang can choose to not generate an "exit" insn,
causing verifier to fail with "last insn is not an exit or jmp". Going
from that verification failure to the root cause of uninitialized use
is certain to be frustrating.
This patch adds -Wuninitialized to the cflags for selftest BPF progs and
fixes up existing instances of uninitialized use.
[0]: https://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf
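The kind of fix-up this implies is usually a one-line explicit initializer. A minimal sketch, using a hypothetical helper count_set() that is not taken from the selftests:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the non-zero entries of an array. */
static inline int count_set(const int *vals, int n)
{
	/*
	 * Without "= 0" the first count++ would read an indeterminate
	 * value, which is the undefined behavior described above.
	 */
	int count = 0;

	assert(vals != NULL && n >= 0);

	for (int i = 0; i < n; i++) {
		if (vals[i])
			count++;
	}
	return count;
}
```

Building such code with -Wuninitialized lets the compiler report the missing initializer at build time, instead of leaving the problem to surface later as a verifier failure.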
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Cc: David Vernet <void@manifault.com>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230303005500.1614874-1-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When creating counters with initial delay configured, the enable_on_exec
field is not set. So we need to enable the counters later. The problem
is, when a workload is specified the target__none() is true. So we also
need to check stat_config.initial_delay.
In this change, we add a new field 'initial_delay' for struct target
which could be shared by other subcommands. And define
target__enable_on_exec() which returns whether enable_on_exec should be
set in normal cases.
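The decision described above can be modelled as follows. This is a sketch, not the perf source: struct toy_target is a hypothetical, trimmed-down stand-in for struct target, and the helper bodies are assumptions derived from the description in this commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for perf's struct target. */
struct toy_target {
	const char *pid;	/* attach to an existing process */
	const char *cpu_list;	/* monitor specific CPUs */
	bool system_wide;	/* monitor the whole system */
	int initial_delay;	/* delay before enabling events; 0 means none */
};

/* True when perf forks the workload itself instead of attaching. */
static inline bool toy_target_none(const struct toy_target *t)
{
	return !t->pid && !t->cpu_list && !t->system_wide;
}

/*
 * enable_on_exec only makes sense for a forked workload with no delay.
 * With a delay configured, counters must be enabled manually later on.
 */
static inline bool toy_target_enable_on_exec(const struct toy_target *t)
{
	assert(t != NULL);
	return toy_target_none(t) && t->initial_delay == 0;
}
```

Keeping the delay inside the target structure means that one predicate answers both questions: whether to set enable_on_exec when creating counters, and whether to enable them manually afterwards.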
Before this fix the event is not counted:
$ ./perf stat -e instructions -D 100 sleep 2
Events disabled
Events enabled
Performance counter stats for 'sleep 2':
<not counted> instructions
1.901661124 seconds time elapsed
0.001602000 seconds user
0.000000000 seconds sys
After fix it works:
$ ./perf stat -e instructions -D 100 sleep 2
Events disabled
Events enabled
Performance counter stats for 'sleep 2':
404,214 instructions
1.901743475 seconds time elapsed
0.001617000 seconds user
0.000000000 seconds sys
Fixes: c587e77e10 ("perf stat: Do not delay the workload with --delay")
Signed-off-by: Changbin Du <changbin.du@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hui Wang <hw.huiwang@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230302031146.2801588-2-changbin.du@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes in:
8c29f01654 ("x86/sev: Add SEV-SNP guest feature negotiation support")
That triggers:
CC /tmp/build/perf-tools/arch/x86/util/kvm-stat.o
CC /tmp/build/perf-tools/util/header.o
LD /tmp/build/perf-tools/arch/x86/util/perf-in.o
LD /tmp/build/perf-tools/arch/x86/perf-in.o
LD /tmp/build/perf-tools/arch/perf-in.o
LD /tmp/build/perf-tools/util/perf-in.o
LD /tmp/build/perf-tools/perf-in.o
LINK /tmp/build/perf-tools/perf
But this time causes no changes in tooling results, as the introduced
SVM_VMGEXIT_TERM_REQUEST exit reason wasn't added to SVM_EXIT_REASONS,
that is used in kvm-stat.c.
And addresses this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/svm.h' differs from latest version at 'arch/x86/include/uapi/asm/svm.h'
diff -u tools/arch/x86/include/uapi/asm/svm.h arch/x86/include/uapi/asm/svm.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nikunj A Dadhania <nikunj@amd.com>
Link: http://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
- Shrink 'struct instruction', to improve objtool performance & memory
footprint.
- Other maximum memory usage reductions - this makes the build both faster,
and fixes kernel build OOM failures on allyesconfig and similar configs
when they try to build the final (large) vmlinux.o.
- Fix ORC unwinding when a kprobe (INT3) is set on a stack-modifying
single-byte instruction (PUSH/POP or LEAVE). This requires the
extension of the ORC metadata structure with a 'signal' field.
- Misc fixes & cleanups.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmQAVp8RHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1gV6A//YbWb4nNxYbRFBd1O3FnFfy4efrDQ4btI
hwkL6f7jka9RnIpIEatJvaLdNvyN5tuPCC/+B5eVnvFdd1JcBUmj5D+zYFt6H6qt
BG4M6TNHFkP1kOJVfFGn8UPRfoMz2oMiEqilpsc1Yuf7b3ldMJtGUoHaeZC9pyqe
RUisKNw4WHZp2G/gTBUWxW17xpWY3Awgch/w4HCu8wMnR+uEC44i0UCBfnAadl36
ar66PfhMJcQIv0XkK9wu43g7+HFnjpxHOx35JW3lRot0xRnwl/JcsmaX5iPkh0gt
HV8eLH80J0homeMZDY7vWIKJxGeLkIdfjO5gxwTdnFc9rQw3GwHp1B7WTS6J3Vwe
gM00kyaGly3CvkKMiz5QQBfViWCjE25nYS8X0i9Oz6Gk58IkRPGByaDTKRjNrDJB
BwH9DE9xb3dPVZRv/PejkTdggQWo+FDTrL8ulHIjUFK11M7VubwkskecNHkfpAOE
TRy5iLjMocF8u7hdyec6Mma2K6qEndC2Rw9ZMPQ7TeieMsBcl63cSRgSJLFfdRhr
/5c6Hr2SNQKU8xu+3j49GyBwFvp4CwCa+GPs9/o+l0uCvuKNIn9B788cm4TjxLJ9
C3PRzE6B/CaLhYvlC5k5cNM+I4YpoMU/mvSvY6HcC0Duj2nSAWS2VV60MVMDpqVX
8nK4xnla2tM=
=bpPY
-----END PGP SIGNATURE-----
Merge tag 'objtool-core-2023-03-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool updates from Ingo Molnar:
- Shrink 'struct instruction', to improve objtool performance & memory
footprint
- Other maximum memory usage reductions - this makes the build both
faster, and fixes kernel build OOM failures on allyesconfig and
similar configs when they try to build the final (large) vmlinux.o
- Fix ORC unwinding when a kprobe (INT3) is set on a stack-modifying
single-byte instruction (PUSH/POP or LEAVE). This requires the
extension of the ORC metadata structure with a 'signal' field
- Misc fixes & cleanups
* tag 'objtool-core-2023-03-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
objtool: Fix ORC 'signal' propagation
objtool: Remove instruction::list
x86: Fix FILL_RETURN_BUFFER
objtool: Fix overlapping alternatives
objtool: Union instruction::{call_dest,jump_table}
objtool: Remove instruction::reloc
objtool: Shrink instruction::{type,visited}
objtool: Make instruction::alts a single-linked list
objtool: Make instruction::stack_ops a single-linked list
objtool: Change arch_decode_instruction() signature
x86/entry: Fix unwinding from kprobe on PUSH/POP instruction
x86/unwind/orc: Add 'signal' field to ORC metadata
objtool: Optimize layout of struct special_alt
objtool: Optimize layout of struct symbol
objtool: Allocate multiple structures with calloc()
objtool: Make struct check_options static
objtool: Make struct entries[] static and const
objtool: Fix HOSTCC flag usage
objtool: Properly support make V=1
objtool: Install libsubcmd in build
...
This change adds support for attaching uprobes to shared objects located
in APKs, which is relevant for Android systems where various libraries
may reside in APKs. To make that happen, we extend the syntax for the
"binary path" argument to attach to with that supported by various
Android tools:
<archive>!/<binary-in-archive>
For example:
/system/app/test-app/test-app.apk!/lib/arm64-v8a/libc++_shared.so
APKs need to be specified via full path, i.e., we do not attempt to
resolve mere file names by searching system directories.
We cannot currently test this functionality end-to-end in an automated
fashion, because it relies on an Android system being present, but there
is no support for that in CI. I have tested the functionality manually,
by creating a libbpf program containing a uretprobe, attaching it to a
function inside a shared object inside an APK, and verifying the sanity
of the returned values.
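The "<archive>!/<binary-in-archive>" syntax described above can be illustrated with a small sketch. The helper name split_archive_path() and its buffer-based interface are assumptions made for illustration; they are not libbpf's actual API. The sketch only shows one plausible way to split such a path at the first "!/" separator.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libbpf): split
 * "<archive>!/<binary-in-archive>" at the first "!/" separator.
 * On success, copies the archive path and the in-archive path into the
 * caller's buffers and returns 0. Returns -1 when there is no separator
 * or when a buffer is too small.
 */
int split_archive_path(const char *path,
		       char *archive, size_t archive_sz,
		       char *member, size_t member_sz)
{
	const char *sep = strstr(path, "!/");
	size_t archive_len, member_len;

	if (!sep)
		return -1;

	archive_len = (size_t)(sep - path);
	member_len = strlen(sep + 2);
	if (archive_len + 1 > archive_sz || member_len + 1 > member_sz)
		return -1;

	memcpy(archive, path, archive_len);
	archive[archive_len] = '\0';
	/* Copy the in-archive path including its terminating NUL. */
	memcpy(member, sep + 2, member_len + 1);
	return 0;
}
```

With the example path from above, the archive part would be "/system/app/test-app/test-app.apk" and the in-archive part "lib/arm64-v8a/libc++_shared.so". A plain path without "!/" is reported as "not an archive path" so the caller can fall back to treating it as a regular binary.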
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230301212308.1839139-4-deso@posteo.net
This change splits the elf_find_func_offset() function in two:
elf_find_func_offset(), which now accepts an already opened Elf object
instead of a path to a file that is to be opened, as well as
elf_find_func_offset_from_file(), which opens a binary based on a
path and then invokes elf_find_func_offset() on the Elf object. Having
this split in responsibilities will allow us to call
elf_find_func_offset() from other code paths on Elf objects that did not
necessarily come from a file on disk.
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230301212308.1839139-3-deso@posteo.net
This change implements support for reading zip archives, including
opening an archive, finding an entry based on its path and name in it,
and closing it.
The code was copied from https://github.com/iovisor/bcc/pull/4440, which
implements similar functionality for bcc. The author confirmed that he
is fine with this usage and the corresponding relicensing. I adjusted it
to adhere to libbpf coding standards.
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Michał Gregorczyk <michalgr@meta.com>
Link: https://lore.kernel.org/bpf/20230301212308.1839139-2-deso@posteo.net
Extend __flag attribute by allowing to specify one of the following:
* BPF_F_STRICT_ALIGNMENT
* BPF_F_ANY_ALIGNMENT
* BPF_F_TEST_RND_HI32
* BPF_F_TEST_STATE_FREQ
* BPF_F_SLEEPABLE
* BPF_F_XDP_HAS_FRAGS
* Some numeric value
Extend __msg attribute by allowing to specify multiple expected messages.
All messages are expected to be present in the verifier log in the
order of application.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230301175417.3146070-2-eddyz87@gmail.com
[ Eduard: added commit message, formatting, comments ]
If target is bpf, there is no __loongarch__ definition, __BITS_PER_LONG
defaults to 32, __NR_nanosleep is not defined:
#if defined(__ARCH_WANT_TIME32_SYSCALLS) || __BITS_PER_LONG != 32
#define __NR_nanosleep 101
__SC_3264(__NR_nanosleep, sys_nanosleep_time32, sys_nanosleep)
#endif
Work around this problem, by explicitly setting __BITS_PER_LONG to
__loongarch_grlen which is defined by compiler as 64 for LA64.
This is similar to commit 36e70b9b06 ("selftests, bpf: Fix broken
riscv build").
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1677585781-21628-1-git-send-email-yangtiezhu@loongson.cn
Firstly, ensure programs successfully load when using all of the
supported maps. Then, extend existing tests to test more cases at
runtime. We are currently testing both the synchronous freeing of items
and asynchronous destruction when map is freed, but the code needs to be
adjusted a bit to be able to also accommodate support for percpu maps.
We now do a delete on the item (and update for array maps which has a
similar effect for kptrs) to perform a synchronous free of the kptr, and
test destruction both for the synchronous and asynchronous deletion.
Next time the program runs, it should observe the refcount as 1 since
all existing references should have been released by then. By running
the program after both possible paths freeing kptrs, we establish that
they correctly release resources. Next, we augment the existing test to
also test the same code path shared by all local storage maps using a
task local storage map.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230225154010.391965-4-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Test skb and xdp dynptr functionality in the following ways:
1) progs/test_cls_redirect_dynptr.c
* Rewrite "progs/test_cls_redirect.c" test to use dynptrs to parse
skb data
* This is a great example of how dynptrs can be used to simplify a
lot of the parsing logic for non-statically known values.
When measuring the user + system time between the original version
vs. using dynptrs, and averaging the time for 10 runs (using
"time ./test_progs -t cls_redirect"):
original version: 0.092 sec
with dynptrs: 0.078 sec
2) progs/test_xdp_dynptr.c
* Rewrite "progs/test_xdp.c" test to use dynptrs to parse xdp data
When measuring the user + system time between the original version
vs. using dynptrs, and averaging the time for 10 runs (using
"time ./test_progs -t xdp_attach"):
original version: 0.118 sec
with dynptrs: 0.094 sec
3) progs/test_l4lb_noinline_dynptr.c
* Rewrite "progs/test_l4lb_noinline.c" test to use dynptrs to parse
skb data
When measuring the user + system time between the original version
vs. using dynptrs, and averaging the time for 10 runs (using
"time ./test_progs -t l4lb_all"):
original version: 0.062 sec
with dynptrs: 0.081 sec
For number of processed verifier instructions:
original version: 6268 insns
with dynptrs: 2588 insns
4) progs/test_parse_tcp_hdr_opt_dynptr.c
* Add sample code for parsing tcp hdr opt lookup using dynptrs.
This logic is lifted from a real-world use case of packet parsing
in katran [0], a layer 4 load balancer. The original version
"progs/test_parse_tcp_hdr_opt.c" (not using dynptrs) is included
here as well, for comparison.
When measuring the user + system time between the original version
vs. using dynptrs, and averaging the time for 10 runs (using
"time ./test_progs -t parse_tcp_hdr_opt"):
original version: 0.031 sec
with dynptrs: 0.045 sec
5) progs/dynptr_success.c
* Add test case "test_skb_readonly" for testing attempts at writes
on a prog type with read-only skb ctx.
* Add "test_dynptr_skb_data" for testing that bpf_dynptr_data isn't
supported for skb progs.
6) progs/dynptr_fail.c
* Add test cases "skb_invalid_data_slice{1,2,3,4}" and
"xdp_invalid_data_slice{1,2}" for testing that helpers that modify the
underlying packet buffer automatically invalidate the associated
data slice.
* Add test cases "skb_invalid_ctx" and "xdp_invalid_ctx" for testing
that prog types that do not support bpf_dynptr_from_skb/xdp don't
have access to the API.
* Add test case "dynptr_slice_var_len{1,2}" for testing that
variable-sized len can't be passed in to bpf_dynptr_slice
* Add test case "skb_invalid_slice_write" for testing that writes to a
read-only data slice are rejected by the verifier.
* Add test case "data_slice_out_of_bounds_skb" for testing that
writes to an area outside the slice are rejected.
* Add test case "invalid_slice_rdwr_rdonly" for testing that prog
types that don't allow writes to packet data don't accept any calls
to bpf_dynptr_slice_rdwr.
[0] https://github.com/facebookincubator/katran/blob/main/katran/lib/bpf/pckt_parsing.h
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230301154953.641654-11-joannelkoong@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Back in 2008 we extended the capability bits from 32 to 64, and we did
it by extending the single 32-bit capability word from one word to an
array of two words. It was then obfuscated by hiding the "2" behind two
macro expansions, with the reasoning being that maybe it gets extended
further some day.
That reasoning may have been valid at the time, but the last thing we
want to do is to extend the capability set any more. And the array of
values not only causes source code oddities (with loops to deal with
it), but also results in worse code generation. It's a lose-lose
situation.
So just change the 'u32[2]' into a 'u64' and be done with it.
We still have to deal with the fact that the user space interface is
designed around an array of these 32-bit values, but that was the case
before too, since the array layouts were different (ie user space
doesn't use an array of 32-bit values for individual capability masks,
but an array of 32-bit slices of multiple masks).
So that marshalling of data is actually simplified too, even if it does
remain somewhat obscure and odd.
This was all triggered by my reaction to the new "cap_isidentical()"
introduced recently. By just using a saner data structure, it went from
unsigned __capi;
CAP_FOR_EACH_U32(__capi) {
if (a.cap[__capi] != b.cap[__capi])
return false;
}
return true;
to just being
return a.val == b.val;
instead. Which is rather more obvious both to humans and to compilers.
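To make the before/after comparison concrete, here is a user-space sketch of both layouts. The type names cap_old and cap_new and the helper cap_from_slices() are illustrative assumptions, not the kernel's definitions; the sketch only shows why a single 64-bit value makes comparison trivial and how two 32-bit slices of one mask could be combined.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative types only; they mirror the commit text, not kernel headers. */
struct cap_old { uint32_t cap[2]; };
struct cap_new { uint64_t val; };

/* Old layout: every operation needs a loop over the 32-bit words. */
bool cap_old_isidentical(struct cap_old a, struct cap_old b)
{
	for (unsigned int i = 0; i < 2; i++) {
		if (a.cap[i] != b.cap[i])
			return false;
	}
	return true;
}

/* New layout: a single 64-bit comparison. */
bool cap_new_isidentical(struct cap_new a, struct cap_new b)
{
	return a.val == b.val;
}

/* Combine the low and high 32-bit slices of one mask into a single u64. */
struct cap_new cap_from_slices(uint32_t low, uint32_t high)
{
	struct cap_new c = { .val = ((uint64_t)high << 32) | low };

	return c;
}
```

The conversion helper reflects the point made above about the user-space interface: the 32-bit slices still exist at the boundary, but they only need to be combined once on the way in.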
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Paul Moore <paul@paul-moore.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Two new kfuncs are added, bpf_dynptr_slice and bpf_dynptr_slice_rdwr.
The user must pass in a buffer to store the contents of the data slice
if a direct pointer to the data cannot be obtained.
For skb and xdp type dynptrs, these two APIs are the only way to obtain
a data slice. However, for other types of dynptrs, there is no
difference between bpf_dynptr_slice(_rdwr) and bpf_dynptr_data.
For skb type dynptrs, the data is copied into the user provided buffer
if any of the data is not in the linear portion of the skb. For xdp type
dynptrs, the data is copied into the user provided buffer if the data is
between xdp frags.
If the skb is cloned and a call to bpf_dynptr_slice_rdwr is made, then
the skb will be uncloned (see bpf_unclone_prologue()).
Please note that any bpf_dynptr_write() automatically invalidates any prior
data slices of the skb dynptr. This is because the skb may be cloned or
may need to pull its paged buffer into the head. As such, any
bpf_dynptr_write() will automatically have its prior data slices
invalidated, even if the write is to data in the skb head of an uncloned
skb. Please note as well that any other helper calls that change the
underlying packet buffer (eg bpf_skb_pull_data()) invalidates any data
slices of the skb dynptr as well, for the same reasons.
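The "direct pointer if possible, otherwise copy into the caller's buffer" contract can be modeled in plain user-space C. This is only a toy model under stated assumptions: struct toy_pkt and toy_slice() are hypothetical names, the packet has a single linear head and a single fragment, and nothing here is the kernel implementation of bpf_dynptr_slice().

```c
#include <stddef.h>
#include <string.h>

/* Toy packet: a linear head followed by one non-linear fragment. */
struct toy_pkt {
	const unsigned char *head;
	size_t head_len;
	const unsigned char *frag;
	size_t frag_len;
};

/*
 * Model of the slice contract: return a direct pointer when the requested
 * range lies entirely in the linear head; otherwise copy the range into the
 * caller's buffer (which must hold at least len bytes) and return that
 * buffer. Returns NULL when the range is out of bounds.
 */
const void *toy_slice(const struct toy_pkt *pkt, size_t off, size_t len,
		      void *buf)
{
	size_t total = pkt->head_len + pkt->frag_len;
	unsigned char *out = buf;
	size_t from_head;

	if (off > total || len > total - off)
		return NULL;

	if (off + len <= pkt->head_len)
		return pkt->head + off;

	/* Range touches the fragment: assemble it in the caller's buffer. */
	from_head = off < pkt->head_len ? pkt->head_len - off : 0;
	if (from_head)
		memcpy(out, pkt->head + off, from_head);
	memcpy(out + from_head,
	       pkt->frag + (off + from_head - pkt->head_len),
	       len - from_head);
	return buf;
}
```

A caller therefore cannot assume the returned pointer aliases the packet. In this model, a returned pointer equal to buf means the data was copied, which is also why writes through a copied slice would not reach the packet without a separate write-back step.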
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Link: https://lore.kernel.org/r/20230301154953.641654-10-joannelkoong@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add xdp dynptrs, which are dynptrs whose underlying pointer points
to a xdp_buff. The dynptr acts on xdp data. xdp dynptrs have two main
benefits. One is that they allow operations on sizes that are not
statically known at compile-time (eg variable-sized accesses).
Another is that parsing the packet data through dynptrs (instead of
through direct access of xdp->data and xdp->data_end) can be more
ergonomic and less brittle (eg does not need manual if checking for
being within bounds of data_end).
For reads and writes on the dynptr, this includes reading/writing
from/to and across fragments. Data slices through the bpf_dynptr_data
API are not supported; instead bpf_dynptr_slice() and
bpf_dynptr_slice_rdwr() should be used.
For examples of how xdp dynptrs can be used, please see the attached
selftests.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Link: https://lore.kernel.org/r/20230301154953.641654-9-joannelkoong@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add skb dynptrs, which are dynptrs whose underlying pointer points
to a skb. The dynptr acts on skb data. skb dynptrs have two main
benefits. One is that they allow operations on sizes that are not
statically known at compile-time (eg variable-sized accesses).
Another is that parsing the packet data through dynptrs (instead of
through direct access of skb->data and skb->data_end) can be more
ergonomic and less brittle (eg does not need manual if checking for
being within bounds of data_end).
For bpf prog types that don't support writes on skb data, the dynptr is
read-only (bpf_dynptr_write() will return an error).
For reads and writes through the bpf_dynptr_read() and bpf_dynptr_write()
interfaces, reading and writing from/to data in the head as well as from/to
non-linear paged buffers is supported. Data slices through the
bpf_dynptr_data API are not supported; instead bpf_dynptr_slice() and
bpf_dynptr_slice_rdwr() (added in subsequent commit) should be used.
For examples of how skb dynptrs can be used, please see the attached
selftests.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Link: https://lore.kernel.org/r/20230301154953.641654-8-joannelkoong@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
1, Make -mstrict-align configurable;
2, Add kernel relocation and KASLR support;
3, Add single kernel image implementation for kdump;
4, Add hardware breakpoints/watchpoints support;
5, Add kprobes/kretprobes/kprobes_on_ftrace support;
6, Add LoongArch support for some selftests.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmP+9H0WHGNoZW5odWFj
YWlAa2VybmVsLm9yZwAKCRAChivD8uImerz+D/98MjkLXM4qtgfAxuBKpVdEVA4U
bzO19UlpqWlwTJbwrhf0GYsRrAis37PTVJG4eNORJairJ/oTkMtEEBPhwq0D9Whc
URDEh+VrjzFztLsu2OlvzOA9gE7lpg+xAx2LKflP7ixlOELOWeercDLW3octp5/J
CJDE8wPaw9tJrMHFWuiVybs03yZmY3YFV55JdWL9hY8Ryy4DY5997mruOfzjvHpl
EfDgQM2zCn2JSQwaD+Kl3MHxHyRx07Tj2wnZAh9ptaGeptK/yplc7nqRwhe7BevS
QwClhJNPICcOi+evZ7cDUY0PTL4evpw2KRnF1N4zw+58RhZECjVrCEJNdf6L1scj
muptQngWKrE/TJvn4way3cJr44stSCtT71elPhn629S23my/CauMmFqCqKpYOPOf
pxwzzCaqDcaZKwMu96qBkZS76tIrhoNeNFntj+C9RS+8ezY3+o144S3vF1A6A9Zb
M4gwa2NiQuLqnCUwKK6dZkLQVX2NMIMViUkYNKdUStxNWx/K7fFmXcl0ycAFpGYp
8Q95LLH34jUrpSgqMSCmcylsPvNiN1QnuXFnw8Tu+zDthp5dOzio60tORLPM1ZUq
gobPeGjeTQInq4eMCf2B5HH8fOMVtJyj6H4K9G1M6HUMg64UtcBp6BvEbwPxTxNN
sIOFUjDfDnBiIXWF4w==
=SzL5
-----END PGP SIGNATURE-----
Merge tag 'loongarch-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Make -mstrict-align configurable
- Add kernel relocation and KASLR support
- Add single kernel image implementation for kdump
- Add hardware breakpoints/watchpoints support
- Add kprobes/kretprobes/kprobes_on_ftrace support
- Add LoongArch support for some selftests.
* tag 'loongarch-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (23 commits)
selftests/ftrace: Add LoongArch kprobe args string tests support
selftests/seccomp: Add LoongArch selftesting support
tools: Add LoongArch build infrastructure
samples/kprobes: Add LoongArch support
LoongArch: Mark some assembler symbols as non-kprobe-able
LoongArch: Add kprobes on ftrace support
LoongArch: Add kretprobes support
LoongArch: Add kprobes support
LoongArch: Simulate branch and PC* instructions
LoongArch: ptrace: Add hardware single step support
LoongArch: ptrace: Add function argument access API
LoongArch: ptrace: Expose hardware breakpoints to debuggers
LoongArch: Add hardware breakpoints/watchpoints support
LoongArch: kdump: Add crashkernel=YM handling
LoongArch: kdump: Add single kernel image implementation
LoongArch: Add support for kernel address space layout randomization (KASLR)
LoongArch: Add support for kernel relocation
LoongArch: Add la_abs macro implementation
LoongArch: Add JUMP_VIRT_ADDR macro implementation to avoid using la.abs
LoongArch: Use la.pcrel instead of la.abs when it's trivially possible
...
The test_local_dnat_portonly() function initiates the client-side as
soon as it sets the listening side to the background. This could lead to
a race condition where the server may not be ready to listen. To ensure
that the server-side is up and running before initiating the
client-side, a delay is introduced to the test_local_dnat_portonly()
function.
Before the fix:
# ./nft_nat.sh
PASS: netns routing/connectivity: ns0-rthlYrBU can reach ns1-rthlYrBU and ns2-rthlYrBU
PASS: ping to ns1-rthlYrBU was ip NATted to ns2-rthlYrBU
PASS: ping to ns1-rthlYrBU OK after ip nat output chain flush
PASS: ipv6 ping to ns1-rthlYrBU was ip6 NATted to ns2-rthlYrBU
2023/02/27 04:11:03 socat[6055] E connect(5, AF=2 10.0.1.99:2000, 16): Connection refused
ERROR: inet port rewrite
After the fix:
# ./nft_nat.sh
PASS: netns routing/connectivity: ns0-9sPJV6JJ can reach ns1-9sPJV6JJ and ns2-9sPJV6JJ
PASS: ping to ns1-9sPJV6JJ was ip NATted to ns2-9sPJV6JJ
PASS: ping to ns1-9sPJV6JJ OK after ip nat output chain flush
PASS: ipv6 ping to ns1-9sPJV6JJ was ip6 NATted to ns2-9sPJV6JJ
PASS: inet port rewrite without l3 address
Fixes: 282e5f8fe9 ("netfilter: nat: really support inet nat without l3 address")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Commit 04d58f1b26a4 ("libbpf: add API to get XDP/XSK supported features")
added feature_flags to struct bpf_xdp_query_opts. If a user uses
bpf_xdp_query_opts with feature_flags member, the bpf_xdp_query()
will check whether 'netdev' family exists or not in the kernel.
If it does not exist, the bpf_xdp_query() will return -ENOENT.
But 'netdev' family does not exist in old kernels as it is
introduced in the same patch set as Commit 04d58f1b26.
So old kernel with newer libbpf won't work properly with
bpf_xdp_query() api call.
To fix this issue, if the return value of
libbpf_netlink_resolve_genl_family_id() is -ENOENT, bpf_xdp_query()
will just return 0, skipping the rest of xdp feature query.
This preserves backward compatibility.
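The fallback described above can be sketched as a small, self-contained model. All names here (resolve_family_fn, resolve_missing, resolve_present, query_features) are hypothetical stand-ins, not libbpf functions; the callback replaces the real generic netlink family lookup so that the control flow can be shown in isolation.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical lookup callback: returns 0 and fills *id, or a negative errno. */
typedef int (*resolve_family_fn)(const char *name, uint16_t *id);

/* Stand-in for an old kernel without the 'netdev' family. */
int resolve_missing(const char *name, uint16_t *id)
{
	(void)name;
	(void)id;
	return -ENOENT;
}

/* Stand-in for a kernel that knows the 'netdev' family. */
int resolve_present(const char *name, uint16_t *id)
{
	if (strcmp(name, "netdev") != 0)
		return -ENOENT;
	*id = 42; /* arbitrary illustrative id */
	return 0;
}

/*
 * Sketch of the fallback: a missing family means "no feature data", not an
 * error. Other failures are still propagated. *have_features tells the
 * caller whether the feature query could run at all.
 */
int query_features(resolve_family_fn resolve, int *have_features)
{
	uint16_t id;
	int err;

	*have_features = 0;
	err = resolve("netdev", &id);
	if (err == -ENOENT)
		return 0;	/* old kernel: skip the feature query */
	if (err < 0)
		return err;

	/* A real implementation would send the netlink request for id here. */
	*have_features = 1;
	return 0;
}
```

Treating only -ENOENT as "feature not available" keeps genuine failures visible to the caller while letting the rest of the query succeed on kernels that predate the 'netdev' family.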
Fixes: 04d58f1b26 ("libbpf: add API to get XDP/XSK supported features")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230227224943.1153459-1-yhs@fb.com
Current release - regressions:
- phy: multiple fixes for EEE rework
- wifi: wext: warn about usage only once
- wifi: ath11k: allow system suspend to survive ath11k
Current release - new code bugs:
- mlx5: Fix memory leak in IPsec RoCE creation
- ibmvnic: assign XPS map to correct queue index
Previous releases - regressions:
- netfilter: ip6t_rpfilter: Fix regression with VRF interfaces
- netfilter: ctnetlink: make event listener tracking global
- nf_tables: allow to fetch set elements when table has an owner
- mlx5:
- fix skb leak while fifo resync and push
- fix possible ptp queue fifo use-after-free
Previous releases - always broken:
- sched: fix action bind logic
- ptp: vclock: use mutex to fix "sleep on atomic" bug if driver
also uses a mutex
- netfilter: conntrack: fix rmmod double-free race
- netfilter: xt_length: use skb len to match in length_mt6,
avoid issues with BIG TCP
Misc:
- ice: remove unnecessary CONFIG_ICE_GNSS
- mlx5e: remove hairpin write debugfs files
- sched: act_api: move TCA_EXT_WARN_MSG to the correct hierarchy
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmP9JgYACgkQMUZtbf5S
IrsIRRAApy4Hjb8z5z3k4HOM2lA3b/3OWD301I5YtoU3FC4L938yETAPFYUGbWrX
rKN4YOTNh2Fvkgbni7vz9hbC84C6i86Q9+u7dT1U+kCk3kbyQPFZlEDj5fY0I8zK
1xweCRrC1CcG74S2M5UO3UnWz1ypWQpTnHfWZqq0Duh1j9Xc+MHjHC2IKrGnzM6U
1/ODk9FrtsWC+KGJlWwiV+yJMYUA4nCKIS/NrmdRlBa7eoP0oC1xkA8g0kz3/P3S
O+xMyhExcZbMYY5VMkiGBZ5l8Ve3t6lHcMXq7jWlSCOeXd4Ut6zzojHlGZjzlCy9
RQQJzva2wlltqB9rECUQixpZbVS6ubf5++zvACOKONhSIEdpWjZW9K/qsV8igbfM
Xx0hsG1jCBt/xssRw2UBsq73vjNf1AkdksvqJgcggAvBJU8cV3MxRRB4/9lyPdmB
NNFqehwCeE3aU0FSBKoxZVYpfg+8J/XhwKT63Cc2d1ENetsWk/LxvkYm24aokpW+
nn+jUH9AYk3rFlBVQG1xsCwU4VlGk/yZgRwRMYFBqPkAGcXLZOnqdoSviBPN3yN0
Habs1hxToMt3QBgLJcMVn8CYdWCJgnZpxs8Mfo+PGoWKHzQ9kXBdyYyIZm1GyesD
BN/2QN38yMGXRALd2NXS2Va4ygX7KptB7+HsitdkzKCqcp1Ao+I=
=Ko4p
-----END PGP SIGNATURE-----
Merge tag 'net-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireless and netfilter.
The notable fixes here are the EEE fix which restores boot for many
embedded platforms (real and QEMU); WiFi warning suppression and the
ICE Kconfig cleanup.
Current release - regressions:
- phy: multiple fixes for EEE rework
- wifi: wext: warn about usage only once
- wifi: ath11k: allow system suspend to survive ath11k
Current release - new code bugs:
- mlx5: Fix memory leak in IPsec RoCE creation
- ibmvnic: assign XPS map to correct queue index
Previous releases - regressions:
- netfilter: ip6t_rpfilter: Fix regression with VRF interfaces
- netfilter: ctnetlink: make event listener tracking global
- nf_tables: allow to fetch set elements when table has an owner
- mlx5:
- fix skb leak while fifo resync and push
- fix possible ptp queue fifo use-after-free
Previous releases - always broken:
- sched: fix action bind logic
- ptp: vclock: use mutex to fix "sleep on atomic" bug if driver also
uses a mutex
- netfilter: conntrack: fix rmmod double-free race
- netfilter: xt_length: use skb len to match in length_mt6, avoid
issues with BIG TCP
Misc:
- ice: remove unnecessary CONFIG_ICE_GNSS
- mlx5e: remove hairpin write debugfs files
- sched: act_api: move TCA_EXT_WARN_MSG to the correct hierarchy"
* tag 'net-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (53 commits)
tcp: tcp_check_req() can be called from process context
net: phy: c45: fix network interface initialization failures on xtensa, arm:cubieboard
xen-netback: remove unused variables pending_idx and index
net/sched: act_api: move TCA_EXT_WARN_MSG to the correct hierarchy
net: dsa: ocelot_ext: remove unnecessary phylink.h include
net: mscc: ocelot: fix duplicate driver name error
net: dsa: felix: fix internal MDIO controller resource length
net: dsa: seville: ignore mscc-miim read errors from Lynx PCS
net/sched: act_sample: fix action bind logic
net/sched: act_mpls: fix action bind logic
net/sched: act_pedit: fix action bind logic
wifi: wext: warn about usage only once
wifi: mt76: usb: fix use-after-free in mt76u_free_rx_queue
qede: avoid uninitialized entries in coal_entry array
nfc: fix memory leak of se_io context in nfc_genl_se_io
ice: remove unnecessary CONFIG_ICE_GNSS
net/sched: cls_api: Move call to tcf_exts_miss_cookie_base_destroy()
ibmvnic: Assign XPS map to correct queue index
docs: net: fix inaccuracies in msg_zerocopy.rst
tools: net: add __pycache__ to gitignore
...
The syscall register definitions for ARM in bpf_tracing.h don't define
the fifth parameter for the syscalls. Because of this, some KPROBES-based
selftests fail to compile for the ARM architecture.
Define the fifth parameter that is passed in the R4 register (uregs[4]).
Fixes: 3a95c42d65 ("libbpf: Define arm syscall regs spec in bpf_tracing.h")
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230223095346.10129-1-puranjay12@gmail.com
Commit bc292ab00f6c ("mm: introduce vma->vm_flags wrapper functions")
turns vm_flags into a const field.
The bpf_find_vma test added in commit f108662b27c9 ("selftests/bpf: Add
tests for bpf_find_vma") assigns to that now-const field in the
find_vma_fail1.c program, which is rejected by the compiler and therefore
never exercises the BPF verifier. It is better for the test to write to
'unsigned long vm_start' instead of 'const vm_flags_t vm_flags'.
$ make -C tools/testing/selftests/bpf/ -j8
...
progs/find_vma_fail1.c:16:16: error: cannot assign to non-static data
member 'vm_flags' with const-qualified type 'const vm_flags_t' (aka
'const unsigned long')
vma->vm_flags |= 0x55;
~~~~~~~~~~~~~ ^
../tools/testing/selftests/bpf/tools/include/vmlinux.h:1898:20:
note: non-static data member 'vm_flags' declared const here
const vm_flags_t vm_flags;
~~~~~~~~~~~~~~~~~~^~~~~~~~
Signed-off-by: Rong Tao <rongtao@cestc.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/tencent_CB281722B3C1BD504C16CDE586CACC2BE706@qq.com
RFC8259 ("The JavaScript Object Notation (JSON) Data Interchange
Format") only specifies \", \\, \/, \b, \f, \n, \r, and \t as valid
two-character escape sequences. This does not include \', which is not
required in JSON because it exclusively uses double quotes as string
separators.
Solidus (/) may be escaped, but does not have to. Only reverse
solidus (\), double quotes ("), and the control characters have to be
escaped. Therefore, with this fix, bpftool correctly supports all valid
two-character escape sequences (but still does not support characters
that require multi-character escape sequences).
Without this fix, attempting to load a JSON file generated by bpftool
using Python 3.10.6's default json.load() may fail with the error
"Invalid \escape" if the file contains the invalid escaped single
quote (\').
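The escaping rules above can be sketched as a small string escaper. This is an illustrative sketch, not bpftool's JSON writer: the function name json_escape() and its buffer interface are assumptions, and the \u00XX fallback for the remaining control characters goes beyond what this commit describes.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Escape src as the contents of a JSON string into dst (without the
 * surrounding quotes). Returns the number of bytes written, excluding the
 * terminating NUL, or -1 if dst is too small.
 */
int json_escape(const char *src, char *dst, size_t dst_sz)
{
	size_t n = 0;

	if (dst_sz == 0)
		return -1;

	for (; *src; src++) {
		unsigned char c = (unsigned char)*src;
		const char *esc = NULL;
		char tmp[7];
		int len;

		switch (c) {
		case '"':  esc = "\\\""; break;
		case '\\': esc = "\\\\"; break;
		case '\b': esc = "\\b";  break;
		case '\f': esc = "\\f";  break;
		case '\n': esc = "\\n";  break;
		case '\r': esc = "\\r";  break;
		case '\t': esc = "\\t";  break;
		default:   break; /* '/' and '\'' pass through unescaped */
		}

		if (esc)
			len = snprintf(tmp, sizeof(tmp), "%s", esc);
		else if (c < 0x20)
			/* Assumption: other control characters use \u00XX. */
			len = snprintf(tmp, sizeof(tmp), "\\u%04x", (unsigned int)c);
		else
			len = snprintf(tmp, sizeof(tmp), "%c", c);

		if (n + (size_t)len + 1 > dst_sz)
			return -1;
		memcpy(dst + n, tmp, (size_t)len);
		n += (size_t)len;
	}
	dst[n] = '\0';
	return (int)n;
}
```

Note that the single quote is deliberately left alone: emitting \' would produce output that strict JSON parsers reject, which is the failure mode described above.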
Fixes: b66e907cfe ("tools: bpftool: copy JSON writer from iproute2 repository")
Signed-off-by: Luis Gerhorst <gerhorst@cs.fau.de>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20230227150853.16863-1-gerhorst@cs.fau.de
After commit 80d7da1cac ("asm-generic: Drop getrlimit and setrlimit
syscalls from default list"), new architectures no longer need to include
getrlimit and setrlimit, as they are superseded by prlimit64.
In order to maintain compatibility for the new architectures, such as
LoongArch which does not define __NR_getrlimit, it is better to use
__NR_prlimit64 instead of __NR_getrlimit in user_ringbuf test to fix
the following build error:
TEST-OBJ [test_progs] user_ringbuf.test.o
tools/testing/selftests/bpf/prog_tests/user_ringbuf.c: In function 'kick_kernel_cb':
tools/testing/selftests/bpf/prog_tests/user_ringbuf.c:593:17: error: '__NR_getrlimit' undeclared (first use in this function)
593 | syscall(__NR_getrlimit);
| ^~~~~~~~~~~~~~
tools/testing/selftests/bpf/prog_tests/user_ringbuf.c:593:17: note: each undeclared identifier is reported only once for each function it appears in
make: *** [Makefile:573: tools/testing/selftests/bpf/user_ringbuf.test.o] Error 1
make: Leaving directory 'tools/testing/selftests/bpf'
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1677235015-21717-4-git-send-email-yangtiezhu@loongson.cn
device feature provisioning in ifcvf, mlx5
new SolidNET driver
support for zoned block device in virtio blk
numa support in virtio pmem
VIRTIO_F_RING_RESET support in vhost-net
more debugfs entries in mlx5
resume support in vdpa
completion batching in virtio blk
cleanup of dma api use in vdpa
now simulating more features in vdpa-sim
documentation, features, fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmP0D98PHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpV6IH/iecRgLMWWjp3n31IFdu31f/J4HpF7dczVjK
qtV98eJ1N2pkgeJkdCfmB5XszfvFBeAurrS7++FTHiJhrRfR3Z+2ml/Qtvh5DEyP
qxz6wOw6VVsi/txdUxM1wsxLeEmmzkmFdAmPM+FXeIjhWj76GOgy/4A3eaj6TgzV
W8ShsBve/UZ5qMOC3XbIscvdOrudHJ18tH90Tiz3NZfH1fAs5E4uWbU6Mrz9DJVr
canGvf4kAI9z8qram5HSgzPIXRJEYiF4q/eiStdtiiME8gL1mHLRZDNP1I1LeCAb
q6Q6RCRKi3Ek+LGdH6u+nR1Swu03N2b/g+vgKtv30kJo06oZVzw=
=EasV
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
- device feature provisioning in ifcvf, mlx5
- new SolidNET driver
- support for zoned block device in virtio blk
- numa support in virtio pmem
- VIRTIO_F_RING_RESET support in vhost-net
- more debugfs entries in mlx5
- resume support in vdpa
- completion batching in virtio blk
- cleanup of dma api use in vdpa
- now simulating more features in vdpa-sim
- documentation, features, fixes all over the place
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (64 commits)
vdpa/mlx5: support device features provisioning
vdpa/mlx5: make MTU/STATUS presence conditional on feature bits
vdpa: validate device feature provisioning against supported class
vdpa: validate provisioned device features against specified attribute
vdpa: conditionally read STATUS in config space
vdpa: fix improper error message when adding vdpa dev
vdpa/mlx5: Initialize CVQ iotlb spinlock
vdpa/mlx5: Don't clear mr struct on destroy MR
vdpa/mlx5: Directly assign memory key
tools/virtio: enable to build with retpoline
vringh: fix a typo in comments for vringh_kiov
vhost-vdpa: print warning when vhost_vdpa_alloc_domain fails
scsi: virtio_scsi: fix handling of kmalloc failure
vdpa: Fix a couple of spelling mistakes in some messages
vhost-net: support VIRTIO_F_RING_RESET
vhost-scsi: convert sysfs snprintf and sprintf to sysfs_emit
vdpa: mlx5: support per virtqueue dma device
vdpa: set dma mask for vDPA device
virtio-vdpa: support per vq dma device
vdpa: introduce get_vq_dma_device()
...
- Provide a virtual cache topology to the guest to avoid
inconsistencies with migration on heterogeneous systems. Non-secure
software has no practical need to traverse the caches by set/way in
the first place.
- Add support for taking stage-2 access faults in parallel. This was an
accidental omission in the original parallel faults implementation,
but should provide a marginal improvement to machines w/o FEAT_HAFDBS
(such as hardware from the fruit company).
- A preamble to adding support for nested virtualization to KVM,
including vEL2 register state, rudimentary nested exception handling
and masking unsupported features for nested guests.
- Fixes to the PSCI relay that avoid an unexpected host SVE trap when
resuming a CPU when running pKVM.
- VGIC maintenance interrupt support for the AIC
- Improvements to the arch timer emulation, primarily aimed at reducing
the trap overhead of running nested.
- Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the
interest of CI systems.
- Avoid VM-wide stop-the-world operations when a vCPU accesses its own
redistributor.
- Serialize when toggling CPACR_EL1.SMEN to avoid unexpected exceptions
in the host.
- Aesthetic and comment/kerneldoc fixes
- Drop the vestiges of the old Columbia mailing list and add [Oliver]
as co-maintainer
This also drags in arm64's 'for-next/sme2' branch, because both it and
the PSCI relay changes touch the EL2 initialization code.
RISC-V:
- Fix wrong usage of PGDIR_SIZE instead of PUD_SIZE
- Correctly place the guest in S-mode after redirecting a trap to the guest
- Redirect illegal instruction traps to guest
- SBI PMU support for guest
s390:
- Two patches sorting out confusion between virtual and physical
addresses, which currently are the same on s390.
- A new ioctl that performs cmpxchg on guest memory
- A few fixes
x86:
- Change tdp_mmu to a read-only parameter
- Separate TDP and shadow MMU page fault paths
- Enable Hyper-V invariant TSC control
- Fix a variety of APICv and AVIC bugs, some of them real-world,
some of them affecting behavior that is architecturally legal but
unlikely to happen in practice
- Mark APIC timer as expired if it's in one-shot mode and the count
underflows while the vCPU task was being migrated
- Advertise support for Intel's new fast REP string features
- Fix a double-shootdown issue in the emergency reboot code
- Ensure GIF=1 and disable SVM during an emergency reboot, i.e. give SVM
similar treatment to VMX
- Update Xen's TSC info CPUID sub-leaves as appropriate
- Add support for Hyper-V's extended hypercalls, where "support" at this
point is just forwarding the hypercalls to userspace
- Clean up the kvm->lock vs. kvm->srcu sequences when updating the PMU and
MSR filters
- One-off fixes and cleanups
- Fix and clean up the range-based TLB flushing code, used when KVM is
running on Hyper-V
- Add support for filtering PMU events using a mask. If userspace
wants to restrict heavily what events the guest can use, it can now
do so without needing an absurd number of filter entries
- Clean up KVM's handling of "PMU MSRs to save", especially when vPMU
support is disabled
- Add PEBS support for Intel Sapphire Rapids
- Fix a mostly benign overflow bug in SEV's send|receive_update_data()
- Move several SVM-specific flags into vcpu_svm
x86 Intel:
- Handle NMI VM-Exits before leaving the noinstr region
- A few trivial cleanups in the VM-Enter flows
- Stop enabling VMFUNC for L1 purely to document that KVM doesn't support
EPTP switching (or any other VM function) for L1
- Fix a crash when using eVMCS's enlightened MSR bitmaps
Generic:
- Clean up the hardware enable and initialization flow, which was
scattered around multiple arch-specific hooks. Instead, just
let the arch code call into generic code. Both x86 and ARM should
benefit from not having to fight common KVM code's notion of how
to do initialization.
- Account allocations in generic kvm_arch_alloc_vm()
- Fix a memory leak if coalesced MMIO unregistration fails
selftests:
- On x86, cache the CPU vendor (AMD vs. Intel) and use the info to emit
the correct hypercall instruction instead of relying on KVM to patch
in VMMCALL
- Use TAP interface for kvm_binary_stats_test and tsc_msrs_test
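The mask-based PMU event filtering mentioned in the x86 list above can be illustrated with a minimal sketch. The struct layout and function names here are hypothetical and are not KVM's actual UAPI; the sketch only assumes that an entry covers every unit mask whose masked bits equal a required value, which is why one entry can stand in for many exact-match entries.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical filter entry: not KVM's actual UAPI layout. */
struct masked_filter_entry {
	uint16_t event_select; /* event this entry applies to */
	uint8_t  mask;         /* unit-mask bits that are compared */
	uint8_t  match;        /* required value of the compared bits */
};

/* An entry covers every unit mask whose masked bits equal 'match'. */
static bool entry_matches(const struct masked_filter_entry *e,
			  uint16_t event_select, uint8_t unit_mask)
{
	return e->event_select == event_select &&
	       (unit_mask & e->mask) == e->match;
}

/* Allow-list semantics: the event is allowed if any entry covers it. */
static bool event_allowed(const struct masked_filter_entry *entries, size_t n,
			  uint16_t event_select, uint8_t unit_mask)
{
	for (size_t i = 0; i < n; i++)
		if (entry_matches(&entries[i], event_select, unit_mask))
			return true;
	return false;
}
```

With mask and match both set to 0, a single entry covers all 256 unit masks of an event, where an exact-match filter would need one entry per unit mask.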
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM:
- Provide a virtual cache topology to the guest to avoid
inconsistencies with migration on heterogeneous systems. Non-secure
software has no practical need to traverse the caches by set/way in
the first place
- Add support for taking stage-2 access faults in parallel. This was
an accidental omission in the original parallel faults
implementation, but should provide a marginal improvement to
machines w/o FEAT_HAFDBS (such as hardware from the fruit company)
- A preamble to adding support for nested virtualization to KVM,
including vEL2 register state, rudimentary nested exception
handling and masking unsupported features for nested guests
- Fixes to the PSCI relay that avoid an unexpected host SVE trap when
resuming a CPU when running pKVM
- VGIC maintenance interrupt support for the AIC
- Improvements to the arch timer emulation, primarily aimed at
reducing the trap overhead of running nested
- Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the
interest of CI systems
- Avoid VM-wide stop-the-world operations when a vCPU accesses its
own redistributor
- Serialize when toggling CPACR_EL1.SMEN to avoid unexpected
exceptions in the host
- Aesthetic and comment/kerneldoc fixes
- Drop the vestiges of the old Columbia mailing list and add [Oliver]
as co-maintainer
RISC-V:
- Fix wrong usage of PGDIR_SIZE instead of PUD_SIZE
- Correctly place the guest in S-mode after redirecting a trap to the
guest
- Redirect illegal instruction traps to guest
- SBI PMU support for guest
s390:
- Sort out confusion between virtual and physical addresses, which
currently are the same on s390
- A new ioctl that performs cmpxchg on guest memory
- A few fixes
x86:
- Change tdp_mmu to a read-only parameter
- Separate TDP and shadow MMU page fault paths
- Enable Hyper-V invariant TSC control
- Fix a variety of APICv and AVIC bugs, some of them real-world, some
of them affecting behavior that is architecturally legal but unlikely
to happen in practice
- Mark APIC timer as expired if it's in one-shot mode and the count
underflows while the vCPU task was being migrated
- Advertise support for Intel's new fast REP string features
- Fix a double-shootdown issue in the emergency reboot code
- Ensure GIF=1 and disable SVM during an emergency reboot, i.e. give
SVM similar treatment to VMX
- Update Xen's TSC info CPUID sub-leaves as appropriate
- Add support for Hyper-V's extended hypercalls, where "support" at
this point is just forwarding the hypercalls to userspace
- Clean up the kvm->lock vs. kvm->srcu sequences when updating the
PMU and MSR filters
- One-off fixes and cleanups
- Fix and clean up the range-based TLB flushing code, used when KVM is
running on Hyper-V
- Add support for filtering PMU events using a mask. If userspace
wants to restrict heavily what events the guest can use, it can now
do so without needing an absurd number of filter entries
- Clean up KVM's handling of "PMU MSRs to save", especially when vPMU
support is disabled
- Add PEBS support for Intel Sapphire Rapids
- Fix a mostly benign overflow bug in SEV's
send|receive_update_data()
- Move several SVM-specific flags into vcpu_svm
x86 Intel:
- Handle NMI VM-Exits before leaving the noinstr region
- A few trivial cleanups in the VM-Enter flows
- Stop enabling VMFUNC for L1 purely to document that KVM doesn't
support EPTP switching (or any other VM function) for L1
- Fix a crash when using eVMCS's enlightened MSR bitmaps
Generic:
- Clean up the hardware enable and initialization flow, which was
scattered around multiple arch-specific hooks. Instead, just let
the arch code call into generic code. Both x86 and ARM should
benefit from not having to fight common KVM code's notion of how to
do initialization
- Account allocations in generic kvm_arch_alloc_vm()
- Fix a memory leak if coalesced MMIO unregistration fails
selftests:
- On x86, cache the CPU vendor (AMD vs. Intel) and use the info to
emit the correct hypercall instruction instead of relying on KVM to
patch in VMMCALL
- Use TAP interface for kvm_binary_stats_test and tsc_msrs_test"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (325 commits)
KVM: SVM: hyper-v: placate modpost section mismatch error
KVM: x86/mmu: Make tdp_mmu_allowed static
KVM: arm64: nv: Use reg_to_encoding() to get sysreg ID
KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes
KVM: arm64: nv: Filter out unsupported features from ID regs
KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2
KVM: arm64: nv: Allow a sysreg to be hidden from userspace only
KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor
KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2
KVM: arm64: nv: Handle SMCs taken from virtual EL2
KVM: arm64: nv: Handle trapped ERET from virtual EL2
KVM: arm64: nv: Inject HVC exceptions to the virtual EL2
KVM: arm64: nv: Support virtual EL2 exceptions
KVM: arm64: nv: Handle HCR_EL2.NV system register traps
KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state
KVM: arm64: nv: Add EL2 system registers to vcpu context
KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x
KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set
KVM: arm64: nv: Introduce nested virtualization VCPU feature
KVM: arm64: Use the S2 MMU context to iterate over S2 table
...
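The s390 item in the message above mentions an ioctl that performs cmpxchg on guest memory. The ioctl itself and its argument layout are not shown in this log, so the following is only a sketch of the compare-and-exchange semantics, using C11 atomics on an ordinary 64-bit word; the function name is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Compare-and-exchange on a 64-bit word: store 'new_val' only if the word
 * still holds '*expected'. On failure, '*expected' is updated to the value
 * actually found, so the caller can decide whether to retry.
 */
static bool word_cmpxchg(_Atomic uint64_t *word, uint64_t *expected,
			 uint64_t new_val)
{
	return atomic_compare_exchange_strong(word, expected, new_val);
}
```

The point of the primitive is that the check and the store happen as one atomic step, so a concurrent writer cannot slip in between them.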
- Support for configuring secure boot with user-defined keys on PowerVM LPARs.
- Simplify the replay of soft-masked IRQs by making it non-recursive.
- Add support for KCSAN on 64-bit Book3S.
- Improvements to the API & code which interacts with RTAS (pseries firmware).
- Change 32-bit powermac to assign PCI bus numbers per domain by default.
- Some improvements to the 32-bit BPF JIT.
- Various other small features and fixes.
Thanks to: Anders Roxell, Andrew Donnellan, Andrew Jeffery, Benjamin Gray, Christophe
Leroy, Frederic Barrat, Ganesh Goudar, Geoff Levand, Greg Kroah-Hartman, Jan-Benedict
Glaw, Josh Poimboeuf, Kajol Jain, Laurent Dufour, Mahesh Salgaonkar, Mathieu Desnoyers,
Mimi Zohar, Murphy Zhou, Nathan Chancellor, Nathan Lynch, Nayna Jain, Nicholas Piggin,
Pali Rohár, Petr Mladek, Rohan McLure, Russell Currey, Sachin Sant, Sathvika Vasireddy,
Sourabh Jain, Stefan Berger, Stephen Rothwell, Sudhakar Kuppusamy.
Merge tag 'powerpc-6.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Support for configuring secure boot with user-defined keys on PowerVM
LPARs
- Simplify the replay of soft-masked IRQs by making it non-recursive
- Add support for KCSAN on 64-bit Book3S
- Improvements to the API & code which interacts with RTAS (pseries
firmware)
- Change 32-bit powermac to assign PCI bus numbers per domain by
default
- Some improvements to the 32-bit BPF JIT
- Various other small features and fixes
Thanks to Anders Roxell, Andrew Donnellan, Andrew Jeffery, Benjamin
Gray, Christophe Leroy, Frederic Barrat, Ganesh Goudar, Geoff Levand,
Greg Kroah-Hartman, Jan-Benedict Glaw, Josh Poimboeuf, Kajol Jain,
Laurent Dufour, Mahesh Salgaonkar, Mathieu Desnoyers, Mimi Zohar, Murphy
Zhou, Nathan Chancellor, Nathan Lynch, Nayna Jain, Nicholas Piggin, Pali
Rohár, Petr Mladek, Rohan McLure, Russell Currey, Sachin Sant, Sathvika
Vasireddy, Sourabh Jain, Stefan Berger, Stephen Rothwell, and Sudhakar
Kuppusamy.
* tag 'powerpc-6.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (114 commits)
powerpc/pseries: Avoid hcall in plpks_is_available() on non-pseries
powerpc: dts: turris1x.dts: Set lower priority for CPLD syscon-reboot
powerpc/e500: Add missing prototype for 'relocate_init'
powerpc/64: Fix unannotated intra-function call warning
powerpc/epapr: Don't use wrteei on non booke
powerpc: Pass correct CPU reference to assembler
powerpc/mm: Rearrange if-else block to avoid clang warning
powerpc/nohash: Fix build with llvm-as
powerpc/nohash: Fix build error with binutils >= 2.38
powerpc/pseries: Fix endianness issue when parsing PLPKS secvar flags
macintosh: windfarm: Use unsigned type for 1-bit bitfields
powerpc/kexec_file: print error string on usable memory property update failure
powerpc/machdep: warn when machine_is() used too early
powerpc/64: Replace -mcpu=e500mc64 by -mcpu=e5500
powerpc/eeh: Set channel state after notifying the drivers
selftests/powerpc: Fix incorrect kernel headers search path
powerpc/rtas: arch-wide function token lookup conversions
powerpc/rtas: introduce rtas_function_token() API
powerpc/pseries/lpar: convert to papr_sysparm API
powerpc/pseries/hv-24x7: convert to papr_sysparm API
...
- CXL RAM region enumeration: instantiate 'struct cxl_region' objects
for platform firmware created memory regions
- CXL RAM region provisioning: complement the existing PMEM region
creation support with RAM region support
- "Soft Reservation" policy change: Online (memory hot-add)
soft-reserved memory (EFI_MEMORY_SP) by default, but still allow for
setting aside such memory for dedicated access via device-dax.
- CXL Events and Interrupts: Take over CXL event handling from
platform-firmware (ACPI calls this CXL Memory Error Reporting) and
export CXL Events via Linux Trace Events.
- Convey CXL _OSC results to drivers: Similar to PCI, let the CXL
subsystem interrogate the result of CXL _OSC negotiation.
- Emulate CXL DVSEC Range Registers as "decoders": Allow for
first-generation devices that pre-date the definition of the CXL HDM
Decoder Capability to translate the CXL DVSEC Range Registers into
'struct cxl_decoder' objects.
- Set timestamp: Per spec, set the device timestamp in case of hotplug,
or if platform-firmware failed to set it.
- General fixups: linux-next build issues, non-urgent fixes for
pre-production hardware, unit test fixes, spelling and debug message
improvements.
Merge tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull Compute Express Link (CXL) updates from Dan Williams:
"To date Linux has been dependent on platform-firmware to map CXL RAM
regions and handle events / errors from devices. With this update we
can now parse / update the CXL memory layout, and report events /
errors from devices. This is a precursor for the CXL subsystem to
handle the end-to-end "RAS" flow for CXL memory. i.e. the flow that
for DDR-attached-DRAM is handled by the EDAC driver where it maps
system physical address events to a field-replaceable-unit (FRU /
endpoint device). In general, CXL has the potential to standardize
what has historically been a pile of memory-controller-specific error
handling logic.
Another change of note is the default policy for handling RAM-backed
device-dax instances. Previously the default access mode was "device",
mmap(2) a device special file to access memory. The new default is
"kmem" where the address range is assigned to the core-mm via
add_memory_driver_managed(). This saves typical users from wondering
why their platform memory is not visible via free(1) and stuck behind
a device-file. At the same time it allows expert users to deploy
policy to, for example, get dedicated access to high performance
memory, or hide low performance memory from general purpose kernel
allocations. This affects not only CXL, but also systems with
high-bandwidth-memory that platform-firmware tags with the
EFI_MEMORY_SP (special purpose) designation.
Summary:
- CXL RAM region enumeration: instantiate 'struct cxl_region' objects
for platform firmware created memory regions
- CXL RAM region provisioning: complement the existing PMEM region
creation support with RAM region support
- "Soft Reservation" policy change: Online (memory hot-add)
soft-reserved memory (EFI_MEMORY_SP) by default, but still allow
for setting aside such memory for dedicated access via device-dax.
- CXL Events and Interrupts: Take over CXL event handling from
platform-firmware (ACPI calls this CXL Memory Error Reporting) and
export CXL Events via Linux Trace Events.
- Convey CXL _OSC results to drivers: Similar to PCI, let the CXL
subsystem interrogate the result of CXL _OSC negotiation.
- Emulate CXL DVSEC Range Registers as "decoders": Allow for
first-generation devices that pre-date the definition of the CXL
HDM Decoder Capability to translate the CXL DVSEC Range Registers
into 'struct cxl_decoder' objects.
- Set timestamp: Per spec, set the device timestamp in case of
hotplug, or if platform-firmware failed to set it.
- General fixups: linux-next build issues, non-urgent fixes for
pre-production hardware, unit test fixes, spelling and debug
message improvements"
* tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (66 commits)
dax/kmem: Fix leak of memory-hotplug resources
cxl/mem: Add kdoc param for event log driver state
cxl/trace: Add serial number to trace points
cxl/trace: Add host output to trace points
cxl/trace: Standardize device information output
cxl/pci: Remove locked check for dvsec_range_allowed()
cxl/hdm: Add emulation when HDM decoders are not committed
cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders
cxl/hdm: Emulate HDM decoder from DVSEC range registers
cxl/pci: Refactor cxl_hdm_decode_init()
cxl/port: Export cxl_dvsec_rr_decode() to cxl_port
cxl/pci: Break out range register decoding from cxl_hdm_decode_init()
cxl: add RAS status unmasking for CXL
cxl: remove unnecessary calling of pci_enable_pcie_error_reporting()
dax/hmem: build hmem device support as module if possible
dax: cxl: add CXL_REGION dependency
cxl: avoid returning uninitialized error code
cxl/pmem: Fix nvdimm registration races
cxl/mem: Fix UAPI command comment
cxl/uapi: Tag commands from cxl_query_cmd()
...
BPF is now supported on LoongArch, so add selftest support in
seccomp_bpf.c.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
We will add tools support for LoongArch (bpf, perf, objtool, etc.), so add
build infrastructure and common headers in preparation.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The current mptcp test is run in the init netns. If the user or the default
system config has disabled mptcp, the test will fail. Let's run the mptcp
test in a dedicated netns to avoid mptcp settings that differ from the
kernel default.
Suggested-by: Martin KaFai Lau <martin.lau@linux.dev>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/r/20230224061343.506571-3-liuhangbin@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
A lot of tests define a SYS() macro to run shell commands via system() and
jump to a goto label on failure. Let's move this macro to test_progs.h and
add a configurable "goto_label" as the first arg.
Suggested-by: Martin KaFai Lau <martin.lau@linux.dev>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20230224061343.506571-2-liuhangbin@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
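A macro of the kind described in the commit above could look roughly like the following sketch. It assumes plain system() plus a message on stderr for failure reporting; the macro in test_progs.h may report failures differently, and run_two_commands() is a hypothetical caller added only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Sketch of a SYS()-style helper: format a shell command, run it with
 * system(), and jump to the caller-chosen label if it fails.
 * The real selftest macro may differ (e.g. in how it reports failures).
 */
#define SYS(goto_label, fmt, ...)					\
	do {								\
		char cmd_[1024];					\
		snprintf(cmd_, sizeof(cmd_), fmt, ##__VA_ARGS__);	\
		if (system(cmd_) != 0) {				\
			fprintf(stderr, "command failed: %s\n", cmd_);	\
			goto goto_label;				\
		}							\
	} while (0)

/* Hypothetical caller: returns 0 if both commands succeed, -1 otherwise. */
static int run_two_commands(const char *first, const char *second)
{
	SYS(fail, "%s", first);
	SYS(fail, "%s", second);
	return 0;
fail:
	return -1;
}
```

Taking the label as the first argument lets each test keep its own cleanup label name instead of every file hard-coding one inside a private copy of the macro.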
Some polishing and small fixes for iommufd:
- Remove IOMMU_CAP_INTR_REMAP, instead rely on the interrupt subsystem
- Use GFP_KERNEL_ACCOUNT inside the iommu_domains
- Support VFIO_NOIOMMU mode with iommufd
- Various typos
- A list corruption bug if HWPTs are used for attach
Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd updates from Jason Gunthorpe:
"Some polishing and small fixes for iommufd:
- Remove IOMMU_CAP_INTR_REMAP, instead rely on the interrupt
subsystem
- Use GFP_KERNEL_ACCOUNT inside the iommu_domains
- Support VFIO_NOIOMMU mode with iommufd
- Various typos
- A list corruption bug if HWPTs are used for attach"
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
iommufd: Do not add the same hwpt to the ioas->hwpt_list twice
iommufd: Make sure to zero vfio_iommu_type1_info before copying to user
vfio: Support VFIO_NOIOMMU with iommufd
iommufd: Add three missing structures in ucmd_buffer
selftests: iommu: Fix test_cmd_destroy_access() call in user_copy
iommu: Remove IOMMU_CAP_INTR_REMAP
irq/s390: Add arch_is_isolated_msi() for s390
iommu/x86: Replace IOMMU_CAP_INTR_REMAP with IRQ_DOMAIN_FLAG_ISOLATED_MSI
genirq/msi: Rename IRQ_DOMAIN_MSI_REMAP to IRQ_DOMAIN_ISOLATED_MSI
genirq/irqdomain: Remove unused irq_domain_check_msi_remap() code
iommufd: Convert to msi_device_has_isolated_msi()
vfio/type1: Convert to iommu_group_has_isolated_msi()
iommu: Add iommu_group_has_isolated_msi()
genirq/msi: Add msi_device_has_isolated_msi()
Here is the large set of driver changes for char/misc drivers and other
smaller driver subsystems that flow through this git tree.
Included in here are:
- New IIO drivers and features and improvements in that subsystem
- New hwtracing drivers and additions to that subsystem
- lots of interconnect changes and new drivers as that subsystem seems
under very active development recently. This required also merging
in the icc subsystem changes through this tree.
- FPGA driver updates
- counter subsystem and driver updates
- MHI driver updates
- nvmem driver updates
- documentation updates
- Other smaller driver updates and fixes, full details in the shortlog
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'char-misc-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc and other driver subsystem updates from Greg KH:
"Here is the large set of driver changes for char/misc drivers and
other smaller driver subsystems that flow through this git tree.
Included in here are:
- New IIO drivers and features and improvements in that subsystem
- New hwtracing drivers and additions to that subsystem
- lots of interconnect changes and new drivers as that subsystem
seems under very active development recently. This required also
merging in the icc subsystem changes through this tree.
- FPGA driver updates
- counter subsystem and driver updates
- MHI driver updates
- nvmem driver updates
- documentation updates
- Other smaller driver updates and fixes, full details in the
shortlog
All of these have been in linux-next for a while with no reported
problems"
* tag 'char-misc-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (223 commits)
scripts/tags.sh: fix incompatibility with PCRE2
firmware: coreboot: Remove GOOGLE_COREBOOT_TABLE_ACPI/OF Kconfig entries
mei: lower the log level for non-fatal failed messages
mei: bus: disallow driver match while dismantling device
misc: vmw_balloon: fix memory leak with using debugfs_lookup()
nvmem: stm32: fix OPTEE dependency
dt-bindings: nvmem: qfprom: add IPQ8074 compatible
nvmem: qcom-spmi-sdam: register at device init time
nvmem: rave-sp-eeprm: fix kernel-doc bad line warning
nvmem: stm32: detect bsec pta presence for STM32MP15x
nvmem: stm32: add OP-TEE support for STM32MP13x
nvmem: core: use nvmem_add_one_cell() in nvmem_add_cells_from_of()
nvmem: core: add nvmem_add_one_cell()
nvmem: core: drop the removal of the cells in nvmem_add_cells()
nvmem: core: move struct nvmem_cell_info to nvmem-provider.h
nvmem: core: add an index parameter to the cell
of: property: add #nvmem-cell-cells property
of: property: make #.*-cells optional for simple props
of: base: add of_parse_phandle_with_optional_args()
net: add helper eth_addr_add()
...
Python will generate its customary cache when running ynl scripts:
?? tools/net/ynl/lib/__pycache__/
Reported-by: Chuck Lever III <chuck.lever@oracle.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
traceback.print_exception() seems tricky to call, we're missing
some argument, so re-raise instead.
Reported-by: Chuck Lever III <chuck.lever@oracle.com>
Fixes: 3aacf82813 ("tools: ynl: add an object hierarchy to represent parsed spec")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Chuck ran into an issue with a single-element attr-set which
only has an attr with value of 0. The search for max attr in
a struct records attrs with value larger than 0 only (max_val
is set to 0 at the start). Adjust the comparison; alternatively,
max_val could be init'ed to -1. Somehow picking the last attr
of a value seems like a good idea in general.
Reported-by: Chuck Lever III <chuck.lever@oracle.com>
Fixes: be5bea1cc0 ("net: add basic C code generators for Netlink")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
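The comparison change described in the commit above can be sketched as follows. The generator itself is a Python script, so this is only a C rendering of the idea; the struct and function names are hypothetical, and attribute values are assumed to be non-negative.

```c
#include <stddef.h>

/* Hypothetical attribute record; the real generator works on parsed spec objects. */
struct attr {
	const char *name;
	int value;
};

/*
 * Return the attribute with the largest value, or NULL for an empty set.
 * Using '>=' (with max_val starting at 0) means an attribute whose value
 * is 0 is still recorded, and among equal values the last one wins.
 */
static const struct attr *find_max_attr(const struct attr *attrs, size_t n)
{
	const struct attr *max = NULL;
	int max_val = 0;

	for (size_t i = 0; i < n; i++) {
		if (attrs[i].value >= max_val) {
			max_val = attrs[i].value;
			max = &attrs[i];
		}
	}
	return max;
}
```

With a strict '>' comparison, a set whose only attribute has value 0 would never update max and the search would come back empty, which is the case reported in the commit.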
Fix a repeated copy/paste typo.
Fixes: d3d854fd6a ("netdev-genl: create a simple family for netdev stuff")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
F_SEAL_EXEC") which permits the setting of the memfd execute bit at
memfd creation time, with the option of sealing the state of the X bit.
- Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
thread-safe for pmd unshare") which addresses a rare race condition
related to PMD unsharing.
- Several folioification patch series from Matthew Wilcox, Vishal
Moola, Sidhartha Kumar and Lorenzo Stoakes
- Johannes Weiner has a series ("mm: push down lock_page_memcg()") which
performs some memcg maintenance and cleanup work.
- SeongJae Park has added DAMOS filtering to DAMON, with the series
"mm/damon/core: implement damos filter". These filters provide users
with finer-grained control over DAMOS's actions. SeongJae has also done
some DAMON cleanup work.
- Kairui Song adds a series ("Clean up and fixes for swap").
- Vernon Yang contributed the series "Clean up and refinement for maple
tree".
- Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It
adds to MGLRU an LRU of memcgs, to improve the scalability of global
reclaim.
- David Hildenbrand has added some userfaultfd cleanup work in the
series "mm: uffd-wp + change_protection() cleanups".
- Christoph Hellwig has removed the generic_writepages() library
function in the series "remove generic_writepages".
- Baolin Wang has performed some maintenance on the compaction code in
his series "Some small improvements for compaction".
- Sidhartha Kumar is doing some maintenance work on struct page in his
series "Get rid of tail page fields".
- David Hildenbrand contributed some cleanup, bugfixing and
generalization of pte management and of pte debugging in his series "mm:
support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap
PTEs".
- Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
flag in the series "Discard __GFP_ATOMIC".
- Sergey Senozhatsky has improved zsmalloc's memory utilization with his
series "zsmalloc: make zspage chain size configurable".
- Joey Gouly has added prctl() support for prohibiting the creation of
writeable+executable mappings. The previous BPF-based approach had
shortcomings. See "mm: In-kernel support for memory-deny-write-execute
(MDWE)".
- Waiman Long did some kmemleak cleanup and bugfixing in the series
"mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".
- T.J. Alumbaugh has contributed some MGLRU cleanup work in his series
"mm: multi-gen LRU: improve".
- Jiaqi Yan has provided some enhancements to our memory error
statistics reporting, mainly by presenting the statistics on a per-node
basis. See the series "Introduce per NUMA node memory error
statistics".
- Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
regression in compaction via his series "Fix excessive CPU usage during
compaction".
- Christoph Hellwig does some vmalloc maintenance work in the series
"cleanup vfree and vunmap".
- Christoph Hellwig has removed block_device_operations.rw_page() in the
series "remove ->rw_page".
- We get some maple_tree improvements and cleanups in Liam Howlett's
series "VMA tree type safety and remove __vma_adjust()".
- Suren Baghdasaryan has done some work on the maintainability of our
vm_flags handling in the series "introduce vm_flags modifier functions".
- Some pagemap cleanup and generalization work in Mike Rapoport's series
"mm, arch: add generic implementation of pfn_valid() for FLATMEM" and
"fixups for generic implementation of pfn_valid()"
- Baoquan He has done some work to make /proc/vmallocinfo and
/proc/kcore better represent the real state of things in his series
"mm/vmalloc.c: allow vread() to read out vm_map_ram areas".
- Jason Gunthorpe rationalized the GUP system's interface to the rest of
the kernel in the series "Simplify the external interface for GUP".
- SeongJae Park wishes to migrate people from DAMON's debugfs interface
over to its sysfs interface. To support this, we'll temporarily be
printing warnings when people use the debugfs interface. See the series
"mm/damon: deprecate DAMON debugfs interface".
- Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
and clean-ups" series.
- Huang Ying has provided a dramatic reduction in migration's TLB flush
IPI rates with the series "migrate_pages(): batch TLB flushing".
- Arnd Bergmann has some objtool fixups in "objtool warning fixes".
Merge tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Daniel Verkamp has contributed a memfd series ("mm/memfd: add
F_SEAL_EXEC") which permits the setting of the memfd execute bit at
memfd creation time, with the option of sealing the state of the X
bit.
- Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
thread-safe for pmd unshare") which addresses a rare race condition
related to PMD unsharing.
- Several folioification patch series from Matthew Wilcox, Vishal
Moola, Sidhartha Kumar and Lorenzo Stoakes.
- Johannes Weiner has a series ("mm: push down lock_page_memcg()")
which performs some memcg maintenance and cleanup work.
- SeongJae Park has added DAMOS filtering to DAMON, with the series
"mm/damon/core: implement damos filter".
These filters provide users with finer-grained control over DAMOS's
actions. SeongJae has also done some DAMON cleanup work.
- Kairui Song adds a series ("Clean up and fixes for swap").
- Vernon Yang contributed the series "Clean up and refinement for maple
tree".
- Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It
adds to MGLRU an LRU of memcgs, to improve the scalability of global
reclaim.
- David Hildenbrand has added some userfaultfd cleanup work in the
series "mm: uffd-wp + change_protection() cleanups".
- Christoph Hellwig has removed the generic_writepages() library
function in the series "remove generic_writepages".
- Baolin Wang has performed some maintenance on the compaction code in
his series "Some small improvements for compaction".
- Sidhartha Kumar is doing some maintenance work on struct page in his
series "Get rid of tail page fields".
- David Hildenbrand contributed some cleanup, bugfixing and
generalization of pte management and of pte debugging in his series
"mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with
swap PTEs".
- Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
flag in the series "Discard __GFP_ATOMIC".
- Sergey Senozhatsky has improved zsmalloc's memory utilization with
his series "zsmalloc: make zspage chain size configurable".
- Joey Gouly has added prctl() support for prohibiting the creation of
writeable+executable mappings.
The previous BPF-based approach had shortcomings. See "mm: In-kernel
support for memory-deny-write-execute (MDWE)".
- Waiman Long did some kmemleak cleanup and bugfixing in the series
"mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".
- T.J. Alumbaugh has contributed some MGLRU cleanup work in his series
"mm: multi-gen LRU: improve".
- Jiaqi Yan has provided some enhancements to our memory error
statistics reporting, mainly by presenting the statistics on a
per-node basis. See the series "Introduce per NUMA node memory error
statistics".
- Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
regression in compaction via his series "Fix excessive CPU usage
during compaction".
- Christoph Hellwig does some vmalloc maintenance work in the series
"cleanup vfree and vunmap".
- Christoph Hellwig has removed block_device_operations.rw_page() in
this series "remove ->rw_page".
- We get some maple_tree improvements and cleanups in Liam Howlett's
series "VMA tree type safety and remove __vma_adjust()".
- Suren Baghdasaryan has done some work on the maintainability of our
vm_flags handling in the series "introduce vm_flags modifier
functions".
- Some pagemap cleanup and generalization work in Mike Rapoport's
series "mm, arch: add generic implementation of pfn_valid() for
FLATMEM" and "fixups for generic implementation of pfn_valid()".
- Baoquan He has done some work to make /proc/vmallocinfo and
/proc/kcore better represent the real state of things in his series
"mm/vmalloc.c: allow vread() to read out vm_map_ram areas".
- Jason Gunthorpe rationalized the GUP system's interface to the rest
of the kernel in the series "Simplify the external interface for
GUP".
- SeongJae Park wishes to migrate people from DAMON's debugfs interface
over to its sysfs interface. To support this, we'll temporarily be
printing warnings when people use the debugfs interface. See the
series "mm/damon: deprecate DAMON debugfs interface".
- Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
and clean-ups" series.
- Huang Ying has provided a dramatic reduction in migration's TLB flush
IPI rates with the series "migrate_pages(): batch TLB flushing".
- Arnd Bergmann has some objtool fixups in "objtool warning fixes".
* tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits)
include/linux/migrate.h: remove unneeded externs
mm/memory_hotplug: cleanup return value handing in do_migrate_range()
mm/uffd: fix comment in handling pte markers
mm: change to return bool for isolate_movable_page()
mm: hugetlb: change to return bool for isolate_hugetlb()
mm: change to return bool for isolate_lru_page()
mm: change to return bool for folio_isolate_lru()
objtool: add UACCESS exceptions for __tsan_volatile_read/write
kmsan: disable ftrace in kmsan core code
kasan: mark addr_has_metadata __always_inline
mm: memcontrol: rename memcg_kmem_enabled()
sh: initialize max_mapnr
m68k/nommu: add missing definition of ARCH_PFN_OFFSET
mm: percpu: fix incorrect size in pcpu_obj_full_size()
maple_tree: reduce stack usage with gcc-9 and earlier
mm: page_alloc: call panic() when memoryless node allocation fails
mm: multi-gen LRU: avoid futile retries
migrate_pages: move THP/hugetlb migration support check to simplify code
migrate_pages: batch flushing TLB
migrate_pages: share more code between _unmap and _move
...
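The memfd change listed in this pull ("mm/memfd: add F_SEAL_EXEC") can be illustrated from userspace. The following is a minimal sketch, not code from the series: the helper name memfd_exec_is_sealed() is hypothetical, and the fallback values for MFD_NOEXEC_SEAL and F_SEAL_EXEC are assumed to match the Linux 6.3 uapi headers for builds against older userspace headers. It creates a memfd with the execute bit cleared and sealed, then reads the seals back.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallbacks for older userspace headers (assumed to match Linux 6.3 uapi). */
#ifndef MFD_NOEXEC_SEAL
#define MFD_NOEXEC_SEAL 0x0008U
#endif
#ifndef F_SEAL_EXEC
#define F_SEAL_EXEC 0x0020
#endif

/*
 * Hypothetical helper: create a non-executable memfd whose X bit state is
 * sealed, then query its seals.
 * Returns 1 if F_SEAL_EXEC is reported, 0 if the running kernel does not
 * know the flag (or the seal is absent), -1 on any other error.
 */
int memfd_exec_is_sealed(void)
{
	int fd, seals;

	fd = memfd_create("demo", MFD_CLOEXEC | MFD_NOEXEC_SEAL);
	if (fd < 0)
		return (errno == EINVAL || errno == ENOSYS) ? 0 : -1;

	seals = fcntl(fd, F_GET_SEALS);
	close(fd);
	if (seals < 0)
		return -1;

	return (seals & F_SEAL_EXEC) ? 1 : 0;
}
```

On kernels that predate this series the helper is expected to return 0, because memfd_create() rejects unknown flags with EINVAL.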
- Fix ftrace2bconf.sh tool for checking event enable status correctly.
- Add CONFIG_BOOT_CONFIG_FORCE to apply bootconfig without 'bootconfig'
boot parameter.
- Enable CONFIG_BOOT_CONFIG_FORCE by default if a bootconfig is embedded
in the kernel.
- Increase max number of nodes of bootconfig to 8192.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmP1VUUACgkQ2/sHvwUr
PxuExQgAslUeGrdn8nAA2qsModVHrXwLl1Xa6797Xzh/xCoIOAQ5AaUkGOlBBpCi
0UGsiqo5pLfrJ7q1HCTiD4kNpDcK6Kw9UbjClMS2nSf58hK98upUAng+4VlTH3dZ
difzua1Y0PohBDsLZpV5Ex/K9ZHiPhm44pqkaA+q0gHBfa5AmFuRUD3icEdiHmFu
B3GX0qdIMeFmUhxt0jmfvsu1Xq8fjF3Lsz/xCeOHcNJYyxzmdttxHYY8pLTWOIoL
xGL2MmwIYzLRW3/r9E71JNCLgykUWZSBbYhcJ7lIAJadFNbNBFJ0+v5uiyxbZEib
Xv5UAyTKSIeZIyH0fUZ/4Ufa8sw5Nw==
=0Nnb
-----END PGP SIGNATURE-----
Merge tag 'bootconfig-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull bootconfig updates from Masami Hiramatsu:
- Fix ftrace2bconf.sh tool for checking event enable status correctly
- Add CONFIG_BOOT_CONFIG_FORCE to apply bootconfig without 'bootconfig'
boot parameter
- Enable CONFIG_BOOT_CONFIG_FORCE by default if a bootconfig is
embedded in the kernel
- Increase max number of nodes of bootconfig to 8192
* tag 'bootconfig-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
bootconfig: Increase max nodes of bootconfig from 1024 to 8192 for DCC support
bootconfig: Default BOOT_CONFIG_FORCE to y if BOOT_CONFIG_EMBED
Allow forcing unconditional bootconfig processing
tools/bootconfig: fix single & used for logical condition
- Skip negative return code check for snprintf in eprobe.
- Add recursive call test cases for kprobe unit test
- Add 'char' type to probe events to show the value as a character instead of a number.
- Update kselftest kprobe-event testcase to ignore '__pfx_' symbols.
- Fix kselftest to check filter on eprobe event correctly.
- Add filter on eprobe to the README file in tracefs.
- Fix optprobes to correctly check whether an optprobe is being unoptimized when optimizing another kprobe.
- Fix optprobe to correctly check whether an optprobe is being unoptimized when fetching the original instruction.
- Fix optprobe to free 'forcibly unoptimized' optprobe correctly.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmP0JdYACgkQ2/sHvwUr
Pxt6sQf/TD9Kwqx3XG1tnLPev6yt2nuggUippHwWUFHlJtMyUaLV8aKFqByyEe+j
tCQvrFIIJq242xg0Jac/MAf2exlWG9jsmVZPmvC1YzepOAbjXu2eBkIS7LsbeHjF
JJypNnEceffWCpNoD6nlvR0xWXenqRbZJwdsGqo3u+fXnzTurEMY2GU2xOyv39tv
S1uNLPANJxdMb/2iUsUE3hMbe82dqr8zPcApqWFtTBB6QPHI3B2SjuQHpQxwbTPl
bzAl0yQkLSQXprVzT7xJ4xLnzbl1ljgJBci5aX8BFF+VD9oYkypdfYVczBH5VsP9
E3eT9T9lRf4Q99EqxNy5uw7NqQXGQg==
=CMPb
-----END PGP SIGNATURE-----
Merge tag 'probes-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull kprobes updates from Masami Hiramatsu:
- Skip negative return code check for snprintf in eprobe
- Add recursive call test cases for kprobe unit test
- Add 'char' type to probe events to show the value as a character
instead of a number
- Update kselftest kprobe-event testcase to ignore '__pfx_' symbols
- Fix kselftest to check filter on eprobe event correctly
- Add filter on eprobe to the README file in tracefs
- Fix optprobes to correctly check whether an optprobe is being
unoptimized when optimizing another kprobe
- Fix optprobe to correctly check whether an optprobe is being
unoptimized when fetching the original instruction
- Fix optprobe to free 'forcibly unoptimized' optprobe correctly
* tag 'probes-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/eprobe: no need to check for negative ret value for snprintf
test_kprobes: Add recursed kprobe test case
tracing/probe: add a char type to show the character value of traced arguments
selftests/ftrace: Fix probepoint testcase to ignore __pfx_* symbols
selftests/ftrace: Fix eprobe syntax test case to check filter support
tracing/eprobe: Fix to add filter on eprobe description in README file
x86/kprobes: Fix arch_check_optimized_kprobe check within optimized_kprobe range
x86/kprobes: Fix __recover_optprobed_insn check optimizing logic
kprobes: Fix to handle forcibly unoptimized kprobes on freeing_list
- 'perf lock contention' improvements:
- Add -o/--lock-owner option:
$ sudo ./perf lock contention -abo -- ./perf bench sched pipe
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes
Total time: 4.766 [sec]
4.766540 usecs/op
209795 ops/sec
contended total wait max wait avg wait pid owner
403 565.32 us 26.81 us 1.40 us -1 Unknown
4 27.99 us 8.57 us 7.00 us 1583145 sched-pipe
1 8.25 us 8.25 us 8.25 us 1583144 sched-pipe
1 2.03 us 2.03 us 2.03 us 5068 chrome
The owner is unknown in most cases. Filtering for mutex locks only makes it
more likely that the owners are found.
- The -S/--callstack-filter option limits the display to entries having the given
string in the callstack:
$ sudo ./perf lock contention -abv -S net sleep 1
...
contended total wait max wait avg wait type caller
5 70.20 us 16.13 us 14.04 us spinlock __dev_queue_xmit+0xb6d
0xffffffffa5dd1c60 _raw_spin_lock+0x30
0xffffffffa5b8f6ed __dev_queue_xmit+0xb6d
0xffffffffa5cd8267 ip6_finish_output2+0x2c7
0xffffffffa5cdac14 ip6_finish_output+0x1d4
0xffffffffa5cdb477 ip6_xmit+0x457
0xffffffffa5d1fd17 inet6_csk_xmit+0xd7
0xffffffffa5c5f4aa __tcp_transmit_skb+0x54a
0xffffffffa5c6467d tcp_keepalive_timer+0x2fd
Please note that to have the -b option (BPF) working above one has to build
with BUILD_BPF_SKEL=1.
- Add more 'perf test' entries to test these new features.
- Add Ian Rogers to MAINTAINERS as a perf tools reviewer.
- Add support for the retire latency feature (pipeline stall of an instruction
compared to the previous one, in cycles) present on some Intel processors.
- Add 'perf c2c' report option to show false sharing with adjacent cachelines, to
be used in machines with cacheline prefetching, where accesses to a cacheline
bring the next one too.
- Skip 'perf test bpf' when the required kernel-debuginfo package isn't installed.
perf script:
- Add 'cgroup' field for 'perf script' output:
$ perf record --all-cgroups -- true
$ perf script -F comm,pid,cgroup
true 337112 /user.slice/user-657345.slice/user@657345.service/...
true 337112 /user.slice/user-657345.slice/user@657345.service/...
true 337112 /user.slice/user-657345.slice/user@657345.service/...
true 337112 /user.slice/user-657345.slice/user@657345.service/...
- Add support for showing branch speculation information in 'perf
script' and in the 'perf report' raw dump (-D).
perf record:
- Fix 'perf record' segfault with --overwrite and --max-size.
Intel PT:
- Add support for synthesizing "cycle" events from Intel PT traces as we
support "instruction" events when Intel PT CYC packets are available. This
enables much more accurate profiles than when using the regular 'perf record -e
cycles' (the default) when the workload lasts for very short periods (<10ms).
- .plt symbol handling improvements, better handling IBT (in the past
MPX) done in the context of decoding Intel PT processor traces, IFUNC
symbols on x86_64, static executables, understanding .plt.got symbols on
x86_64.
- Add a 'perf test' to test symbol resolution, part of the .plt
improvements series, this tests things like symbol size in contexts
where only the symbol start is available (kallsyms), etc.
- Better handle auxtrace/Intel PT data when using pipe mode (perf record sleep 1|perf report).
- Fix symbol lookup with kcore when multiple segments match stext,
which caused the symbol resolution to just show DSOs as unknown.
ARM:
- Timestamp improvements for ARM64 systems with ETMv4 (Embedded Trace
Macrocell v4).
- Ensure ARM64 CoreSight timestamps don't go backwards.
- Document that ARM64 SPE (Statistical Profiling Extension) is used with 'perf c2c/mem'.
- Add raw decoding for ARM64 SPEv1.2 previous branch address.
- Update neoverse-n2-v2 ARM vendor events (JSON tables): topdown L1, TLB,
cache, branch, PE utilization and instruction mix metrics.
- Update decoder code for OpenCSD version 1.4, on ARM64 systems.
- Fix command line auto-complete of CPU events on aarch64.
perf test/bench:
- Switch basic BPF filtering test to use syscall tracepoint to avoid the
variable number of probes inserted when using the previous probe point
(do_epoll_wait) that happens on different CPU architectures.
- Fix DWARF unwind test by adding non-inline to expected function in a
backtrace.
- Use 'grep -c' where the longer form 'grep | wc -l' was being used.
- Add getpid and execve benchmarks to 'perf bench syscall'.
Miscellaneous:
- Avoid d3-flame-graph package dependency in 'perf script flamegraph',
making this feature more generally available.
- Add JSON metric events to present CPI stall cycles in Power10.
- Assorted improvements/refactorings on the JSON metrics parsing code.
Build:
- Fix 'perf probe' and 'perf test' when libtraceevent isn't linked, as
several tests use tracepoints, those should be skipped.
- More fallout fixes for the removal of tools/lib/traceevent/.
- Fix build error when linking with libpfm.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCY/YzGgAKCRCyPKLppCJ+
J98CAP4/GD3E86Dk+S+w5FmPEHuBKootuZ3pHOqCnXLiyKFZqgEAs9TWOg9KVKGh
io9cLluMjzfRwQrND8cpn3VfXxWvVAQ=
=L1qh
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-for-v6.3-1-2023-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo:
"Miscellaneous:
- Add Ian Rogers to MAINTAINERS as a perf tools reviewer.
- Add support for the retire latency feature (pipeline stall of an
instruction compared to the previous one, in cycles) present on
some Intel processors.
- Add 'perf c2c' report option to show false sharing with adjacent
cachelines, to be used in machines with cacheline prefetching,
where accesses to a cacheline bring the next one too.
- Skip 'perf test bpf' when the required kernel-debuginfo package
isn't installed.
- Avoid d3-flame-graph package dependency in 'perf script flamegraph',
making this feature more generally available.
- Add JSON metric events to present CPI stall cycles in Power10.
- Assorted improvements/refactorings on the JSON metrics parsing
code.
perf lock contention:
- Add -o/--lock-owner option:
$ sudo ./perf lock contention -abo -- ./perf bench sched pipe
# Running 'sched/pipe' benchmark:
# Executed 1000000 pipe operations between two processes
Total time: 4.766 [sec]
4.766540 usecs/op
209795 ops/sec
contended total wait max wait avg wait pid owner
403 565.32 us 26.81 us 1.40 us -1 Unknown
4 27.99 us 8.57 us 7.00 us 1583145 sched-pipe
1 8.25 us 8.25 us 8.25 us 1583144 sched-pipe
1 2.03 us 2.03 us 2.03 us 5068 chrome
The owner is unknown in most cases. Filtering for mutex locks only
makes it more likely that the owners are found.
- The -S/--callstack-filter option limits the display to entries having
the given string in the callstack:
$ sudo ./perf lock contention -abv -S net sleep 1
...
contended total wait max wait avg wait type caller
5 70.20 us 16.13 us 14.04 us spinlock __dev_queue_xmit+0xb6d
0xffffffffa5dd1c60 _raw_spin_lock+0x30
0xffffffffa5b8f6ed __dev_queue_xmit+0xb6d
0xffffffffa5cd8267 ip6_finish_output2+0x2c7
0xffffffffa5cdac14 ip6_finish_output+0x1d4
0xffffffffa5cdb477 ip6_xmit+0x457
0xffffffffa5d1fd17 inet6_csk_xmit+0xd7
0xffffffffa5c5f4aa __tcp_transmit_skb+0x54a
0xffffffffa5c6467d tcp_keepalive_timer+0x2fd
Please note that to have the -b option (BPF) working above one has
to build with BUILD_BPF_SKEL=1.
- Add more 'perf test' entries to test these new features.
perf script:
- Add 'cgroup' field for 'perf script' output:
$ perf record --all-cgroups -- true
$ perf script -F comm,pid,cgroup
true 337112 /user.slice/user-657345.slice/user@657345.service/...
true 337112 /user.slice/user-657345.slice/user@657345.service/...
true 337112 /user.slice/user-657345.slice/user@657345.service/...
true 337112 /user.slice/user-657345.slice/user@657345.service/...
- Add support for showing branch speculation information in 'perf
script' and in the 'perf report' raw dump (-D).
perf record:
- Fix 'perf record' segfault with --overwrite and --max-size.
perf test/bench:
- Switch basic BPF filtering test to use syscall tracepoint to avoid
the variable number of probes inserted when using the previous
probe point (do_epoll_wait) that happens on different CPU
architectures.
- Fix DWARF unwind test by adding non-inline to expected function in
a backtrace.
- Use 'grep -c' where the longer form 'grep | wc -l' was being used.
- Add getpid and execve benchmarks to 'perf bench syscall'.
Intel PT:
- Add support for synthesizing "cycle" events from Intel PT traces as
we support "instruction" events when Intel PT CYC packets are
available. This enables much more accurate profiles than when using
the regular 'perf record -e cycles' (the default) when the workload
lasts for very short periods (<10ms).
- .plt symbol handling improvements, better handling IBT (in the past
MPX) done in the context of decoding Intel PT processor traces,
IFUNC symbols on x86_64, static executables, understanding .plt.got
symbols on x86_64.
- Add a 'perf test' to test symbol resolution, part of the .plt
improvements series, this tests things like symbol size in contexts
where only the symbol start is available (kallsyms), etc.
- Better handle auxtrace/Intel PT data when using pipe mode (perf
record sleep 1|perf report).
- Fix symbol lookup with kcore when multiple segments match stext,
which caused the symbol resolution to just show DSOs as unknown.
ARM:
- Timestamp improvements for ARM64 systems with ETMv4 (Embedded Trace
Macrocell v4).
- Ensure ARM64 CoreSight timestamps don't go backwards.
- Document that ARM64 SPE (Statistical Profiling Extension) is used
with 'perf c2c/mem'.
- Add raw decoding for ARM64 SPEv1.2 previous branch address.
- Update neoverse-n2-v2 ARM vendor events (JSON tables): topdown L1,
TLB, cache, branch, PE utilization and instruction mix metrics.
- Update decoder code for OpenCSD version 1.4, on ARM64 systems.
- Fix command line auto-complete of CPU events on aarch64.
Build:
- Fix 'perf probe' and 'perf test' when libtraceevent isn't linked,
as several tests use tracepoints, those should be skipped.
- More fallout fixes for the removal of tools/lib/traceevent/.
- Fix build error when linking with libpfm"
* tag 'perf-tools-for-v6.3-1-2023-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (114 commits)
perf tests stat_all_metrics: Change true workload to sleep workload for system wide check
perf vendor events power10: Add JSON metric events to present CPI stall cycles in powerpc
perf intel-pt: Synthesize cycle events
perf c2c: Add report option to show false sharing in adjacent cachelines
perf record: Fix segfault with --overwrite and --max-size
perf stat: Avoid merging/aggregating metric counts twice
perf tools: Fix perf tool build error in util/pfm.c
perf tools: Fix auto-complete on aarch64
perf lock contention: Support old rw_semaphore type
perf lock contention: Add -o/--lock-owner option
perf lock contention: Fix to save callstack for the default modified
perf test bpf: Skip test if kernel-debuginfo is not present
perf probe: Update the exit error codes in function try_to_find_probe_trace_event
perf script: Fix missing Retire Latency fields option documentation
perf event x86: Add retire_lat when synthesizing PERF_SAMPLE_WEIGHT_STRUCT
perf test x86: Support the retire_lat (Retire Latency) sample_type check
perf test bpf: Check for libtraceevent support
perf script: Support Retire Latency
perf report: Support Retire Latency
perf lock contention: Support filters for different aggregation
...
- Add function names as a way to filter function addresses
- Add sample module to test ftrace ops and dynamic trampolines
- Allow stack traces to be passed from beginning event to end event for
synthetic events. This will allow seeing the stack trace of when a task is
scheduled out and recorded when it gets scheduled back in.
- Add trace event helper __get_buf() to use as a temporary buffer when printing
out trace event output.
- Add kernel command line to create trace instances on boot up.
- Add enabling of events to instances created at boot up.
- Add trace_array_puts() to write into instances.
- Allow boot instances to take a snapshot at the end of boot up.
- Allow live patch modules to include trace events
- Minor fixes and clean ups
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCY/PaaBQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qh5iAPoD0LKZzD33rhO5Ec4hoexE0DkqycP3
dvmOMbCBL8GkxwEA+d2gLz/EquSFm166hc4D79Sn3geCqvkwmy8vQWVjIQc=
=M82D
-----END PGP SIGNATURE-----
Merge tag 'trace-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:
- Add function names as a way to filter function addresses
- Add sample module to test ftrace ops and dynamic trampolines
- Allow stack traces to be passed from beginning event to end event for
synthetic events. This will allow seeing the stack trace of when a
task is scheduled out and recorded when it gets scheduled back in.
- Add trace event helper __get_buf() to use as a temporary buffer when
printing out trace event output.
- Add kernel command line to create trace instances on boot up.
- Add enabling of events to instances created at boot up.
- Add trace_array_puts() to write into instances.
- Allow boot instances to take a snapshot at the end of boot up.
- Allow live patch modules to include trace events
- Minor fixes and clean ups
* tag 'trace-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (31 commits)
tracing: Remove unnecessary NULL assignment
tracepoint: Allow livepatch module add trace event
tracing: Always use canonical ftrace path
tracing/histogram: Fix stacktrace histogram Documententation
tracing/histogram: Fix stacktrace key
tracing/histogram: Fix a few problems with stacktrace variable printing
tracing: Add BUILD_BUG() to make sure stacktrace fits in strings
tracing/histogram: Don't use strlen to find length of stacktrace variables
tracing: Allow boot instances to have snapshot buffers
tracing: Add trace_array_puts() to write into instance
tracing: Add enabling of events to boot instances
tracing: Add creation of instances at boot command line
tracing: Fix trace_event_raw_event_synth() if else statement
samples: ftrace: Make some global variables static
ftrace: sample: avoid open-coded 64-bit division
samples: ftrace: Include the nospec-branch.h only for x86
tracing: Acquire buffer from temparary trace sequence
tracing/histogram: Wrap remaining shell snippets in code blocks
tracing/osnoise: No need for schedule_hrtimeout range
bpf/tracing: Use stage6 of tracing to not duplicate macros
...
- Use total duration to calculate average in rtla osnoise_hist
- Use 2 digit precision for displaying average
- Print an intuitive auto analysis of timerlat results
- Add auto analysis to timerlat top
- Add hwnoise, which is the same as osnoise but focuses on hardware
- Small clean ups
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCY/PB/hQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qo6mAP9Ul7TSaiQ56H0yy5GCwokOBbj2JnkY
N2NCtCv8AEFDDgD/ZgWWLNHglDWfD9V/aAPI5zWGoep3DfnOL5bCWhT/Agg=
=eM96
-----END PGP SIGNATURE-----
Merge tag 'trace-tools-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing tools updates from Steven Rostedt:
- Use total duration to calculate average in rtla osnoise_hist
- Use 2 digit precision for displaying average
- Print an intuitive auto analysis of timerlat results
- Add auto analysis to timerlat top
- Add hwnoise, which is the same as osnoise but focuses on hardware
- Small clean ups
* tag 'trace-tools-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
Documentation/rtla: Add hwnoise man page
rtla: Add hwnoise tool
Documentation/rtla: Add timerlat-top auto-analysis options
rtla/timerlat: Add auto-analysis support to timerlat top
rtla/timerlat: Add auto-analysis core
tools/tracing/rtla: osnoise_hist: display average with two-digit precision
tools/tracing/rtla: osnoise_hist: use total duration for average calculation
tools/rv: Remove unneeded semicolon
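The two osnoise_hist changes listed in this pull (average computed from the total duration, displayed with two-digit precision) amount to simple arithmetic. The following is a minimal sketch under stated assumptions: format_average() and its parameters are hypothetical names chosen for illustration, not rtla's actual code.

```c
#include <stdio.h>

/*
 * Hypothetical helper: compute the average as total duration divided by
 * the number of samples, and format it with two decimal digits.
 * An empty histogram (count == 0) is reported as 0.00.
 * Returns the value returned by snprintf().
 */
int format_average(unsigned long long total_duration,
		   unsigned long long count, char *buf, size_t len)
{
	double avg = count ? (double)total_duration / (double)count : 0.0;

	return snprintf(buf, len, "%.2f", avg);
}
```

For example, a total duration of 10 over 4 samples is formatted as 2.50 rather than being truncated to 2.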
- Fix three instances where the tty is not given back to the console on exit,
forcing the user to do a "reset" to get the console back.
- Fix the console monitor to not hang when too much data is given by the ssh
output.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCY/OmkhQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qlfKAP9ijinbEXt+fuuhjB0HqmERelUBKH9g
9HiQl+yzKh2LiQEAgnOKK8j3MO1VUOlmVso38+Kc3Tp1jEr0KbooTqKiPAU=
=aoI2
-----END PGP SIGNATURE-----
Merge tag 'ktest-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest
Pull ktest updates from Steven Rostedt:
- Fix three instances where the tty is not given back to the console on
exit, forcing the user to do a "reset" to get the console back.
- Fix the console monitor to not hang when too much data is given by
the ssh output.
* tag 'ktest-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest: Restore stty setting at first in dodie
ktest.pl: Add RUN_TIMEOUT option with default unlimited
ktest.pl: Give back console on Ctrt^C on monitor
ktest.pl: Fix missing "end_monitor" when machine check fails
This KUnit update for Linux 6.3-rc1 consists of cleanups, new features,
and documentation updates:
-- adds Function Redirection API to isolate the code being tested from
other parts of the kernel. functionredirection.rst has the details.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmP1c3AACgkQCwJExA0N
Qxxbwg//TK0YlpQhoO2AgqSp3F8QlXeFKNdm5rHjBBVMYOQOl6rEB+4uznm2AOD9
PZmQfAI+bcxMflSMDEBHEwbh6gLyZJKrsMsxuH2k/LQeWHAbuxHVq+/K4kqzhuhi
QA4ZFKFqnHy+U7jCOGdMtrg9oyg7Glz00fq5pX2iz3FWsE/JpuDZ559RoB9zT9Pu
VnZ+k42Svxkdmf8fXhSCH7C66k9fKkcQm7IGyVbnsWqmldCHpQ6kIjJVTeQSng4j
tXkcys37I/d3/Ffz63rke7+WmJrQviL/gg3PqDmEEVxeX8T3GBT01uONTk+TqyWd
GKudu1lfvuyylFMDoR/5gXr2hr5OJJTGjTfEtwWq7xM0NSiIFHS3/uEYZlE9g3+U
z2/DKMWOHrzJ2G78dfi5fokFdMfGnz2hBCZa9czSxIbjafxLhjSgnt112mDvkJsZ
leeVTB9x6g0b+VYwPKYa9gOmFQyZDGTTsJVT9iaAnhEvlxIRoqxZxzW/jFKgHV/r
ZNRg/kcPfe7m6H15PEblFIuLC4LT/LtDxD8XvkKt42XnG2fuAPS20Jkv6/XB9Ew6
3H1Su27TXIksUD/Z/ZPP9mBno7rwOLrZUa4QNzXqi6q2sbdXP5apg96cPDU0gvI5
sq4zwLgHVuIQ8dfX/hgmqZ8VEcvSFDMINoS+SYGvKjxoTzvd+Sw=
=PloE
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-kunit-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull KUnit update from Shuah Khan:
- add Function Redirection API to isolate the code being tested from
other parts of the kernel.
Documentation/dev-tools/kunit/api/functionredirection.rst has the
details.
* tag 'linux-kselftest-kunit-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: Add printf attribute to fail_current_test_impl
lib/hashtable_test.c: add test for the hashtable structure
Documentation: Add Function Redirection API docs
kunit: Expose 'static stub' API to redirect functions
kunit: Add "hooks" to call into KUnit when it's built as a module
kunit: kunit.py extract handlers
tools/testing/kunit/kunit.py: remove redundant double check
o Add s390 support.
o Add support for the ARM Thumb1 instruction set.
o Fix O_* flags definitions for open() and fcntl().
o Make errno a weak symbol instead of a static variable.
o Export environ as a weak symbol.
o Export _auxv as a weak symbol for auxiliary vector retrieval.
o Implement getauxval() and getpagesize().
o Further improve self tests, including permitting userland testing
of the nolibc library.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmPh1DITHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jJTqD/9FPv58m1ZJWP8j8EMF9p6Pd2GuYJ/F
t0tSf8Qmv0tTLqtPzZtu5E5b5bTvsgxQkQJUGLtUBf5l0AsyQt5ve5EUlzGgBHAP
8opwLEzCPUMhjq6ZsHJrmLIPwrH1reVYiAV2uIdBxLHLjGF8QLdYgqIGtguRBIHT
o9HS9RAyPxvMmV8OZqhp+NLjcEzKGloUBdcnDLURQ8Wy12vSQnALl9w1OKiN40rz
dlmXcysn8TboRWZS/DJqr/Xsg5W8ZMIfxrlopgR+FwrqutwH2ZDKgnc5ixm9YxFF
CJCM2QZO8d8UtAxllJRH3NApTCHJh6c257w4awEU97hgkHfhw0tHgRs6sOz6ho0g
O5OeOTAv0NkNNt5jGHXI4s0iQwVU/Ek6m3N8RC2GGzuMXGDcKvbFzGB4T8m8AhYL
MnyaQvuq8SWhE84c+gQgxagZ5cdm8r2hDgnSrlI7P19W5SCsQq7MNSo1WyHQ7uss
sMyxomvCC3y4pMgHcJHWwxtjR/BKjN1wtgCHCvTFcE8k98ti/ycKS6X/zQbGie/1
j20AgP0Cli2MVq+vocInvn0Gf4Ce0xxu5kB0NM8RMX+uiYNB0cJR4lIyWxt0680U
M2Ya6AnfO8Sn57BptTp+QaqZidx9IJJzrAY4QBsdzXIsyJ2kKTK8BVNIaWMQ96nB
twKV/fU0HWWcJQ==
=S+cL
-----END PGP SIGNATURE-----
Merge tag 'nolibc.2023.02.06a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull nolibc updates from Paul McKenney:
- Add s390 support
- Add support for the ARM Thumb1 instruction set
- Fix O_* flags definitions for open() and fcntl()
- Make errno a weak symbol instead of a static variable
- Export environ as a weak symbol
- Export _auxv as a weak symbol for auxiliary vector retrieval
- Implement getauxval() and getpagesize()
- Further improve self tests, including permitting userland testing of
the nolibc library
* tag 'nolibc.2023.02.06a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (28 commits)
selftests/nolibc: Add a "run-user" target to test the program in user land
selftests/nolibc: Support "x86_64" for arch name
selftests/nolibc: Add `getpagesize(2)` selftest
nolibc/sys: Implement `getpagesize(2)` function
nolibc/stdlib: Implement `getauxval(3)` function
tools/nolibc: add auxiliary vector retrieval for s390
tools/nolibc: add auxiliary vector retrieval for mips
tools/nolibc: add auxiliary vector retrieval for riscv
tools/nolibc: add auxiliary vector retrieval for arm
tools/nolibc: add auxiliary vector retrieval for arm64
tools/nolibc: add auxiliary vector retrieval for x86_64
tools/nolibc: add auxiliary vector retrieval for i386
tools/nolibc: export environ as a weak symbol on s390
tools/nolibc: export environ as a weak symbol on riscv
tools/nolibc: export environ as a weak symbol on mips
tools/nolibc: export environ as a weak symbol on arm
tools/nolibc: export environ as a weak symbol on arm64
tools/nolibc: export environ as a weak symbol on i386
tools/nolibc: export environ as a weak symbol on x86_64
tools/nolibc: make errno a weak symbol instead of a static one
...
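The getauxval() and getpagesize() additions listed in this pull relate through the auxiliary vector: the kernel passes the page size to each process as the AT_PAGESZ entry. The following is a minimal sketch that assumes a regular libc (for example glibc) rather than nolibc itself; the helper names page_size_from_auxv() and page_size_agrees() are hypothetical.

```c
#include <sys/auxv.h>
#include <unistd.h>

/* Page size as reported by the kernel through the auxiliary vector. */
unsigned long page_size_from_auxv(void)
{
	return getauxval(AT_PAGESZ);
}

/* Returns 1 if the auxiliary vector and sysconf() report the same value. */
int page_size_agrees(void)
{
	return page_size_from_auxv() == (unsigned long)sysconf(_SC_PAGESIZE);
}
```

A caller could use page_size_from_auxv() wherever it would otherwise call getpagesize(); the comparison helper is only there to show that both routes lead to the same number.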
Documentation updates.
Add read-modify-write sequences, which means that stronger primitives
more consistently result in stronger ordering, while still remaining in
the envelope of the hardware that supports Linux.
Address, data, and control dependencies used to ignore data that was
stored in temporaries. This update extends these dependency chains to
include unmarked intra-thread stores and loads. Note that these unmarked
stores and loads should not be concurrently accessed from multiple
threads, and doing so will cause LKMM to flag such accesses as data races.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmPtYNQTHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jHMaEACfruLJV/hao7R1et0CLasWL6LMenq6
MbzFiOWDMBXxTMUOdAEeM5JQiIsHr8XZbs80hX1OEQb9VPG61HMy/8jqTYtfbGGt
3EykqAKQ8my1/7wEPSfrO/icvPf/czuT1GYYNQi+PGnlrBKUHPkqfuDpPz5E6p/+
hIojbtXcFLIdB42sBw5JSG3itX6lTlmJFZEfmkYCIlgBQxGTlbK7Bpagml+7zGTD
mQT824OKiPJ6aerUuBzCUURq45JvNd9jE39Gc5KV63pxR2hOcsCZz3jYA1ZQcKeX
UP+ZowKC3WH6iLhxmhnsdAIlaeQRcOvU46B0PHdwIKhV1CVLZR4qINPFIPJ2u4Oo
kwsdG8hBHnNnapPMnhmk8DOZRz1SX2Q9ZR35ZOtOOFWw41ZRkGp3fE6JlKaF0pEt
3SlZ98wkxpV5YEQ67clpVGCPdg0yMWNnos4D1Yw82mpI2DH5NF60R5x6Gb/B2QyX
fp/0SpkXwc4PbLY7sHYWH1MF+bRFkJOeDw2XesMMT+Cjn20fqtR0HGFO/rPHeDqQ
qqamNFQVkP7Y/BWzxR27iH9xFqI8a8BKI18/IYbfQZ+eNwULOCCXqdQpuRaTKPaM
4h6Ebtx/j3oXR0TYtb84mWwaKNO17fo8zMH4tn1Jk+K4OrcxCop5m29fkX1Fjqqf
BMpxir7tN4DK7Q==
=uGmQ
-----END PGP SIGNATURE-----
Merge tag 'lkmm.2023.02.15a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull LKMM (Linux Kernel Memory Model) updates from Paul McKenney:
"Documentation updates.
Add read-modify-write sequences, which means that stronger primitives
more consistently result in stronger ordering, while still remaining
in the envelope of the hardware that supports Linux.
Address, data, and control dependencies used to ignore data that was
stored in temporaries. This update extends these dependency chains to
include unmarked intra-thread stores and loads. Note that these
unmarked stores and loads should not be concurrently accessed from
multiple threads, and doing so will cause LKMM to flag such accesses
as data races"
* tag 'lkmm.2023.02.15a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
tools: memory-model: Make plain accesses carry dependencies
Documentation: Fixed a typo in atomic_t.txt
tools: memory-model: Add rmw-sequences to the LKMM
locking/memory-barriers.txt: Improve documentation for writel() example
Now that CONFIG_HID_BPF is not automatically implied by HID, we need
to set it properly in the selftests config.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Commas may appear in events like:
cpu/INT_MISC.RECOVERY_CYCLES,cmask=1,edge/
which causes the count of commas to see more items than expected. Switch
to counting the entries in the dictionary, which is 1 more than the
number of commas.
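A minimal sketch of the difference, assuming each output line is a flat JSON object. The sample line and its field names are made up for illustration and are not taken from the perf test:

```python
import json

# Hypothetical JSON output line; note that the event name itself contains commas.
line = (
    '{"counter-value": "10.0", '
    '"event": "cpu/INT_MISC.RECOVERY_CYCLES,cmask=1,edge/", '
    '"unit": ""}'
)


def fields_by_commas(text: str) -> int:
    """Naive field count: number of commas plus one."""
    return text.count(",") + 1


def fields_by_entries(text: str) -> int:
    """Robust field count: parse the object and count its keys."""
    return len(json.loads(text))


print(fields_by_commas(line), fields_by_entries(line))
```

The comma count is inflated by the two commas inside the event name, while the dictionary size reflects the actual number of fields.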
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Claire Jensen <cjense@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20230223071818.329671-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commas may appear in events like:
cpu/INT_MISC.RECOVERY_CYCLES,cmask=1,edge/
which causes the commachecker to see more fields than expected. Use @ as
the CSV separator to avoid this.
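The effect of the separator choice can be sketched as follows. The record layout (value, unit, event name) is an assumption made for illustration only:

```python
# Hypothetical three-field record; the last field is an event name with commas.
fields = ["10", "", "cpu/INT_MISC.RECOVERY_CYCLES,cmask=1,edge/"]


def count_fields(record: str, sep: str) -> int:
    """Count fields by counting separators, as a simple checker might."""
    return record.count(sep) + 1


comma_record = ",".join(fields)
at_record = "@".join(fields)

print(count_fields(comma_record, ","))  # over-counts: commas inside the event name
print(count_fields(at_record, "@"))     # matches the real number of fields
```

Any separator works as long as it cannot occur inside a field; '@' is simply one that does not appear in event names like the one above.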
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Claire Jensen <cjense@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20230223071818.329671-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When MMAP2 has the PERF_RECORD_MISC_MMAP_BUILD_ID flag, the record
already carries the build-id info. perf inject therefore marks the DSO
as hit, so that the same DSO is skipped if a later record happens to
lack the build-id.
However, it failed to copy the MMAP2 record itself, so samples for
those regions could not be symbolized.
For example, the following generates 249 MMAP2 events.
$ perf record --buildid-mmap -o- true | perf report --stat -i- | grep MMAP2
MMAP2 events: 249 (86.8%)
Adding perf inject should not change the number of events like this
$ perf record --buildid-mmap -o- true | perf inject -b | \
> perf report --stat -i- | grep MMAP2
MMAP2 events: 249 (86.5%)
But when --buildid-all is used, it eats most of the MMAP2 events.
$ perf record --buildid-mmap -o- true | perf inject -b --buildid-all | \
> perf report --stat -i- | grep MMAP2
MMAP2 events: 1 ( 2.5%)
With this patch, it shows the original number now.
$ perf record --buildid-mmap -o- true | perf inject -b --buildid-all | \
> perf report --stat -i- | grep MMAP2
MMAP2 events: 249 (86.5%)
Committer testing:
Before:
$ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b | perf report --stat -i- | grep MMAP2
MMAP2 events: 58 (36.2%)
$ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf report --stat -i- | grep MMAP2
MMAP2 events: 58 (36.2%)
$ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b --buildid-all | perf report --stat -i- | grep MMAP2
MMAP2 events: 2 ( 1.9%)
$
After:
$ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b | perf report --stat -i- | grep MMAP2
MMAP2 events: 58 (29.3%)
$ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf report --stat -i- | grep MMAP2
MMAP2 events: 58 (34.3%)
$ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b --buildid-all | perf report --stat -i- | grep MMAP2
MMAP2 events: 58 (38.4%)
$
Fixes: f7fc0d1c91 ("perf inject: Do not inject BUILD_ID record if MMAP2 has it")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230223070155.54251-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add tests to check whether the total fib info length is calculated
correctly in the route notify process.
Signed-off-by: Lu Wei <luwei32@huawei.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230222083629.335683-3-luwei32@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
There have been some recently reported ORC unwinder warnings like:
WARNING: can't access registers at entry_SYSCALL_64_after_hwframe+0x63/0xcd
WARNING: stack going in the wrong direction? at __sys_setsockopt+0x2c6/0x5b0 net/socket.c:2271
And a KASAN warning:
BUG: KASAN: stack-out-of-bounds in unwind_next_frame (arch/x86/include/asm/ptrace.h:136 arch/x86/kernel/unwind_orc.c:455)
It turns out the 'signal' bit isn't getting propagated from the unwind
hints to the ORC entries, making the unwinder confused at times.
Fixes: ffb1b4a410 ("x86/unwind/orc: Add 'signal' field to ORC metadata")
Reported-by: kernel test robot <oliver.sang@intel.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/97eef9db60cd86d376a9a40d49d77bb67a8f6526.1676579666.git.jpoimboe@kernel.org
Things like ALTERNATIVE_{2,3}() generate multiple alternatives at the
same place. objtool would override the first orig_alt_group with the
second (or third), failing to check the CFI among all the different
variants.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build only
Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run
Link: https://lore.kernel.org/r/20230208172245.711471461@infradead.org
In preparation for changing struct instruction around a bit, avoid
passing its members by pointer and instead pass the whole thing.
A cleanup in its own right too.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build only
Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run
Link: https://lore.kernel.org/r/20230208172245.291087549@infradead.org
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
1) Fix broken listing of set elements when table has an owner.
2) Fix conntrack refcount leak in ctnetlink with related conntrack
entries, from Hangyu Hua.
3) Fix use-after-free/double-free in ctnetlink conntrack insert path,
from Florian Westphal.
4) Fix ip6t_rpfilter with VRF, from Phil Sutter.
5) Fix use-after-free in ebtables reported by syzbot, also from Florian.
6) Use skb->len in xt_length to deal with IPv6 jumbo packets,
from Xin Long.
7) Fix NETLINK_LISTEN_ALL_NSID with ctnetlink, from Florian Westphal.
8) Fix memleak in {ip_,ip6_,arp_}tables in ENOMEM error case,
from Pavel Tikhomirov.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: x_tables: fix percpu counter block leak on error path when creating new netns
netfilter: ctnetlink: make event listener tracking global
netfilter: xt_length: use skb len to match in length_mt6
netfilter: ebtables: fix table blob use-after-free
netfilter: ip6t_rpfilter: Fix regression with VRF interfaces
netfilter: conntrack: fix rmmod double-free race
netfilter: ctnetlink: fix possible refcount leak in ctnetlink_create_conntrack()
netfilter: nf_tables: allow to fetch set elements when table has an owner
====================
Link: https://lore.kernel.org/r/20230222092137.88637-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The kernel's flow dissector continues to parse the packet when
the (optional) IPv6 flow label is empty, even when instructed
to stop (via BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL). Do
the same in our reference BPF reimplementation.
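The stop condition described above can be sketched as a small predicate. This is an illustrative model, not the selftest code: the function name is hypothetical, and the flag value is assumed to mirror the uapi definition.

```python
# Assumed to mirror the kernel uapi value; treat it as illustrative.
BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL = 1 << 1


def should_stop_at_flow_label(flags: int, flow_label: int) -> bool:
    """Stop dissecting only if asked to AND the packet carries a flow label.

    An empty (zero) flow label means parsing continues even when the
    stop flag is set.
    """
    stop_requested = bool(flags & BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL)
    return stop_requested and flow_label != 0
```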
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230221180518.2139026-1-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
I cross-compile my BPF selftests with the following command:
CLANG_CROSS_FLAGS="--target=aarch64-linux-gnu --sysroot=/sysroot/" \
make LLVM=1 CC=clang CROSS_COMPILE=aarch64-linux-gnu- SRCARCH=arm64
(Note the use of CLANG_CROSS_FLAGS to specify a custom sysroot instead
of letting clang use gcc's default sysroot)
However, CLANG_CROSS_FLAGS gets propagated to host tools builds (libbpf
and bpftool) and because they reference it directly in their Makefiles,
they end up cross-compiling host objects which results in linking
errors.
This patch ensures that CLANG_CROSS_FLAGS is reset if CROSS_COMPILE
isn't set (for example when reaching a BPF host tool build).
Signed-off-by: Florent Revest <revest@chromium.org>
Link: https://lore.kernel.org/r/20230217151832.27784-1-revest@chromium.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The following three uapi headers:
tools/arch/arm64/include/uapi/asm/bpf_perf_event.h
tools/arch/s390/include/uapi/asm/bpf_perf_event.h
tools/arch/s390/include/uapi/asm/ptrace.h
were introduced in commit 618e165b2a ("selftests/bpf: sync kernel headers
and introduce arch support in Makefile"). They are no longer used after
commit 720f228e8d ("bpf: fix broken BPF selftest build"), so remove them.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/r/1676533861-27508-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-----BEGIN PGP SIGNATURE-----
iIYEABYIAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCY/O9MBAcbWljQGRpZ2lr
b2QubmV0AAoJEOXj0OiMgvbSFbgA/RPjQO0J/todz9qJMhbx4QhZizKK8F8hM9Yl
rwhOZWmjAP4wTxOsnrjdR9UusYAr818j01D7ncpp9bM4e2ZNj1wEDw==
=Xe3q
-----END PGP SIGNATURE-----
Merge tag 'landlock-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock updates from Mickaël Salaün:
"This improves documentation, and makes some tests more flexible to be
able to run on systems without overlayfs or with Yama restrictions"
* tag 'landlock-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
MAINTAINERS: Update Landlock repository
selftests/landlock: Test ptrace as much as possible with Yama
selftests/landlock: Skip overlayfs tests when not supported
landlock: Explain file descriptor access rights
Three testcases to make sure that stack reads from uninitialized
locations are accepted by verifier when executed in privileged mode:
- read from a fixed offset;
- read from a variable offset;
- passing a pointer to stack to a helper converts
STACK_INVALID to STACK_MISC.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230219200427.606541-3-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This commit updates the following functions to allow reads from
uninitialized stack locations when env->allow_uninit_stack option is
enabled:
- check_stack_read_fixed_off()
- check_stack_range_initialized(), called from:
- check_stack_read_var_off()
- check_helper_mem_access()
This change allows relaxing the logic in stacksafe() to treat STACK_MISC
and STACK_INVALID in the same way, making the following stack slot
configurations equivalent:
| Cached state | Current state |
| stack slot | stack slot |
|------------------+------------------|
| STACK_INVALID or | STACK_INVALID or |
| STACK_MISC | STACK_SPILL or |
| | STACK_MISC or |
| | STACK_ZERO or |
| | STACK_DYNPTR |
This leads to significant verification speed gains (see below).
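The slot equivalence in the table above can be modeled roughly as follows. This is a simplified sketch, not the verifier's stacksafe() code: the enum, the function name, and the fallback comparison are all assumptions made for illustration.

```python
from enum import Enum, auto


class Slot(Enum):
    STACK_INVALID = auto()
    STACK_SPILL = auto()
    STACK_MISC = auto()
    STACK_ZERO = auto()
    STACK_DYNPTR = auto()


def cached_slot_covers(cached: Slot, current: Slot, allow_uninit_stack: bool) -> bool:
    """Return True if the cached slot makes the current slot safe to prune."""
    # The explored state did not rely on this slot, so anything matches.
    if cached is Slot.STACK_INVALID:
        return True
    # With uninitialized reads permitted, STACK_MISC is treated the same way.
    if allow_uninit_stack and cached is Slot.STACK_MISC:
        return True
    # Placeholder for the remaining per-type checks done by the real verifier.
    return cached is current
```

Because more cached states now match the current state, more paths can be pruned, which is where the reduction in explored states comes from.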
The idea was suggested by Andrii Nakryiko [1] and the initial patch was
created by Alexei Starovoitov [2].
Currently env->allow_uninit_stack is enabled for programs loaded
by users with the CAP_PERFMON or CAP_SYS_ADMIN capabilities.
A number of test cases from verifier/*.c were expecting uninitialized
stack access to be an error. These test cases were updated to execute
in unprivileged mode (thus preserving the tests).
The test progs/test_global_func10.c expected "invalid indirect read
from stack" error message because of the access to uninitialized
memory region. This error is no longer possible in privileged mode.
The test is updated to provoke an error "invalid indirect access to
stack" because of access to invalid stack address (such error is not
verified by progs/test_global_func*.c series of tests).
The following tests had to be removed because these can't be made
unprivileged:
- verifier/sock.c:
- "sk_storage_get(map, skb->sk, &stack_value, 1): partially init
stack_value"
BPF_PROG_TYPE_SCHED_CLS programs are not executed in unprivileged mode.
- verifier/var_off.c:
- "indirect variable-offset stack access, max_off+size > max_initialized"
- "indirect variable-offset stack access, uninitialized"
These tests verify that access to uninitialized stack values is
detected when stack offset is not a constant. However, variable
stack access is prohibited in unprivileged mode, thus these tests
are no longer valid.
* * *
Here is a veristat log comparing this patch with current master on a
set of selftest binaries listed in tools/testing/selftests/bpf/veristat.cfg
and cilium BPF binaries (see [3]):
$ ./veristat -e file,prog,states -C -f 'states_pct<-30' master.log current.log
File Program States (A) States (B) States (DIFF)
-------------------------- -------------------------- ---------- ---------- ----------------
bpf_host.o tail_handle_ipv6_from_host 349 244 -105 (-30.09%)
bpf_host.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%)
bpf_lxc.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%)
bpf_sock.o cil_sock4_connect 70 48 -22 (-31.43%)
bpf_sock.o cil_sock4_sendmsg 68 46 -22 (-32.35%)
bpf_xdp.o tail_handle_nat_fwd_ipv4 1554 803 -751 (-48.33%)
bpf_xdp.o tail_lb_ipv4 6457 2473 -3984 (-61.70%)
bpf_xdp.o tail_lb_ipv6 7249 3908 -3341 (-46.09%)
pyperf600_bpf_loop.bpf.o on_event 287 145 -142 (-49.48%)
strobemeta.bpf.o on_event 15915 4772 -11143 (-70.02%)
strobemeta_nounroll2.bpf.o on_event 17087 3820 -13267 (-77.64%)
xdp_synproxy_kern.bpf.o syncookie_tc 21271 6635 -14636 (-68.81%)
xdp_synproxy_kern.bpf.o syncookie_xdp 23122 6024 -17098 (-73.95%)
-------------------------- -------------------------- ---------- ---------- ----------------
Note: I limited selection by states_pct<-30%.
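As a worked check of the "States (DIFF)" column and the states_pct filter, the arithmetic can be reproduced as below. The helper names are hypothetical and the output format is only assumed to match the table above:

```python
def states_pct(states_a: int, states_b: int) -> float:
    """Relative change in the number of states, in percent."""
    return 100.0 * (states_b - states_a) / states_a


def states_diff(states_a: int, states_b: int) -> str:
    """Format the absolute and relative change like the DIFF column."""
    diff = states_b - states_a
    return f"{diff:+d} ({states_pct(states_a, states_b):+.2f}%)"


def passes_filter(states_a: int, states_b: int, threshold: float = -30.0) -> bool:
    """Mimic the 'states_pct<-30' selection."""
    return states_pct(states_a, states_b) < threshold


print(states_diff(349, 244))    # tail_handle_ipv6_from_host row
print(states_diff(6457, 2473))  # tail_lb_ipv4 row
```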
Inspection of differences in pyperf600_bpf_loop behavior shows that
the following patch for the test removes almost all differences:
--- a/tools/testing/selftests/bpf/progs/pyperf.h
+++ b/tools/testing/selftests/bpf/progs/pyperf.h
@@ -266,8 +266,8 @@ int __on_event(struct bpf_raw_tracepoint_args *ctx)
     }
     if (event->pthread_match || !pidData->use_tls) {
-        void* frame_ptr;
-        FrameData frame;
+        void* frame_ptr = 0;
+        FrameData frame = {};
         Symbol sym = {};
         int cur_cpu = bpf_get_smp_processor_id();
W/o this patch the difference comes from the following pattern
(for different variables):
static bool get_frame_data(... FrameData *frame ...)
{
    ...
    bpf_probe_read_user(&frame->f_code, ...);
    if (!frame->f_code)
        return false;
    ...
    bpf_probe_read_user(&frame->co_name, ...);
    if (frame->co_name)
        ...;
}

int __on_event(struct bpf_raw_tracepoint_args *ctx)
{
    FrameData frame;
    ...
    get_frame_data(... &frame ...) // indirectly via a bpf_loop & callback
    ...
}

SEC("raw_tracepoint/kfree_skb")
int on_event(struct bpf_raw_tracepoint_args* ctx)
{
    ...
    ret |= __on_event(ctx);
    ret |= __on_event(ctx);
    ...
}
With regard to the value `frame->co_name`, the following is important:
- Because of the conditional `if (!frame->f_code)`, each call to
__on_event() produces two states, one with `frame->co_name` marked
as STACK_MISC, another with it as is (and marked STACK_INVALID on the
first call).
- The call to bpf_probe_read_user() does not mark the stack slots
corresponding to `&frame->co_name` as REG_LIVE_WRITTEN, but it marks
these slots as STACK_MISC. This happens because of the following loop
in check_helper_call():
for (i = 0; i < meta.access_size; i++) {
    err = check_mem_access(env, insn_idx, meta.regno, i, BPF_B,
                           BPF_WRITE, -1, false);
    if (err)
        return err;
}
Note the size of the write: it is a one-byte write for each byte
touched by a helper. The BPF_B write does not lead to write marks
for the target stack slot.
- This means that, without this patch, when the second __on_event() call
is verified, `if (frame->co_name)` will propagate read marks first to a
stack slot with STACK_MISC marks and second to a stack slot with
STACK_INVALID marks, and these states would be considered different.
[1] https://lore.kernel.org/bpf/CAEf4BzY3e+ZuC6HUa8dCiUovQRg2SzEk7M-dSkqNZyn=xEmnPA@mail.gmail.com/
[2] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
[3] git@github.com:anakryiko/cilium.git
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Co-developed-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230219200427.606541-2-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJSBAABCAA8FiEEoEVH9lhNrxiMPSyI7MXwXhnZSjYFAmP15aoeHGJlbmphbWlu
LnRpc3NvaXJlc0ByZWRoYXQuY29tAAoJEOzF8F4Z2Uo2nwkP/Rcr5lZYKQ59ezoJ
PwHx/0wO1Qpgd4fRwD5Mvxynmfoq20A6FZFpmtjUjNcP3dm3X2UJtXE3HkDECCFP
5daqTFOswFiPFtcMlIl+SHxNgIIlTodbqx9MAFj/7n5aihB54JpTHWsgbENj35Y1
RLHYDi0+wj69y9ctOkqKGWHp8Uf220RWrD7zZf7AJAc5cwot1kM00RSy9dSAJ0vB
riZdCQqYwXbf4I1uFzthS6AdIIWcpmpZyYFnsF7F2xQADqCNXr1MTmG0uC+Ey/J/
0PZXjkyMO/pNwxarRiXmBKnsJJlJajxRXGtpgKYZu8AnIaXuNu/z987Y1pZ/kYis
OQSKib2bGWzT2MycqiuVVrOChIMvqmk2aPRvg73jojpSySYTKn1Jp/XHyXS8qkkQ
HJ/u6VPpe42GdfOGM7V9ig+80z/5D5u1XJECoPpxyKHWQ8S7ZlczQjVT+a8nUuBV
hPTiqAYE9NT6SQJ0b5z2uhBdGRzvAbCZzDCgvjE87zsRmLzFk/fzMdMPZrlADKMJ
3qu7ey2GYOBNfnDcJmPu5HmK/A9BaPZiZAMakqjGpezZbe+LGBNskXTRwPpNC+Dh
11pna0ns+vlxeT7nonO0JsYvsKWy0pPcBvhyUEHWsmHgyagRLIsvg01ezB/Ivu0O
xYj3UPVEGTiwHBt9xnaZcIjRSPSZ
=cBvC
-----END PGP SIGNATURE-----
Merge tag 'for-linus-2023022201' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID updates from Benjamin Tissoires:
- HID-BPF infrastructure: this makes it possible to start using HID-BPF.
Note that the mechanism to ship HID-BPF programs through the kernel
tree is not implemented yet (but is planned).
This should be a no-op for 99% of users. Also we are gaining
kselftests for the HID tree (Benjamin Tissoires)
- Some UAF fixes in workers when using uhid (Pietro Borrello & Benjamin
Tissoires)
- Constify hid_ll_driver (Thomas Weißschuh)
- Allow more custom IIO sensors through HID (Philipp Jungkamp)
- Logitech HID++ fixes for scroll wheel, protocol and debug (Bastien
Nocera)
- Some new device support: Steam Deck (Vicki Pfau), UClogic (José
Expósito), Logitech G923 Xbox Edition steering wheel (Walt Holman),
EVision keyboards (Philippe Valembois)
- other assorted code cleanups and fixes
* tag 'for-linus-2023022201' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (99 commits)
HID: mcp-2221: prevent UAF in delayed work
hid: bigben_probe(): validate report count
HID: asus: use spinlock to safely schedule workers
HID: asus: use spinlock to protect concurrent accesses
HID: bigben: use spinlock to safely schedule workers
HID: bigben_worker() remove unneeded check on report_field
HID: bigben: use spinlock to protect concurrent accesses
HID: logitech-hidpp: Add myself to authors
HID: logitech-hidpp: Retry commands when device is busy
HID: logitech-hidpp: Add more debug statements
HID: Add support for Logitech G923 Xbox Edition steering wheel
HID: logitech-hidpp: Add Signature M650
HID: logitech-hidpp: Remove HIDPP_QUIRK_NO_HIDINPUT quirk
HID: logitech-hidpp: Don't restart communication if not necessary
HID: logitech-hidpp: Add constants for HID++ 2.0 error codes
Revert "HID: logitech-hidpp: add a module parameter to keep firmware gestures"
HID: logitech-hidpp: Hard-code HID++ 1.0 fast scroll support
HID: i2c-hid: goodix: Add mainboard-vddio-supply
dt-bindings: HID: i2c-hid: goodix: Add mainboard-vddio-supply
HID: i2c-hid: goodix: Stop tying the reset line to the regulator
...
To pick up the changes from these csets:
8c29f01654 ("x86/sev: Add SEV-SNP guest feature negotiation support")
That causes no changes to tooling:
$ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
$ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
$ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
$ diff -u before after
$
Just silences this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Nikunj A Dadhania <nikunj@amd.com>
Link: https://lore.kernel.org/lkml/Y%2FZrNvtcijPWagCp@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The majority of works in this cycle are about ASoC spread over trees.
Most of them are for new devices and cleanups / refactoring works,
and not much significant changes are seen in the core side.
Below are some highlights:
ASoC:
- Continued refactoring to move into common helper functions
- Lots of DT schema conversions and stylistic nits
- Continued work on building out the new SOF IPC4 scheme
- Continued work for Intel AVS
- New drivers for Awinc AT88395, Infineon PEB2466, Iron Device
SMA1303, Mediatek MT8188, Realtek RT712, Renesas IDT821034,
Samsung/Tesla FSD SoC I2S, and TI TAS5720A-Q1
ALSA:
- A few cleanups to make the remove callbacks return void
- FireWire refactoring and enhancements
- PCM kselftest enhancements
-----BEGIN PGP SIGNATURE-----
iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmPw+kkOHHRpd2FpQHN1
c2UuZGUACgkQLtJE4w1nLE9X3RAAkxjjtk+BRF+tvS6VYQhezTOE7frSqpxB+ZHm
KjdQClfpbPqYVD/pUEnz+N68bmOZKK8Ihif+LaaW+8NJJa/1kivQWNCQLCvm7L71
x7TRkOYvrzlx+Fhpf6JacOM8VaBkRrfd+cK6pQSv8b72ZTWorfenkaC9OMdL2NEY
YI/sH5zZd6dDoKfQ+WPsplOSCog3KKgAGvn4qEQKxADsyOjsu3rpgijcgDmVc9XT
y2RMAEPID68TtAtcNhesurLEKZ+4mEDvALQjAsxxb99lfAFDlDBezEO4/dl2v9Db
yebsEnM+W5z3dVl13Aok9XtVCxrhy7n+v5z060ZEoTxIEJK7YVCWx8XCVL1KSgNV
31MEVDgf7PrsYAWr54yNF2lmwJh5YchZQ28ngZRHmQ7jMpVbO6ypyIzf77fEQSam
SiCG7hurSCB38LUb7fg1WsjSRupRamoPDhRG9q7C36ePdeYRkBqOJsSmfABjN/Cb
v0fixm45PtZpWoZUpLAzNEtkQA665Sf2SoAnAY+kCPllYuNXXHdEomokppffXHbO
Xbq/wcehpOJKR9vqWhsBuVz34UbGyuM1SBLrNXj+sr24Xv6Uy4E5GcJ75rO1E3TR
gTGTIM/DtOwTGKyceQ30Gnl9M2wKeP9/qEhkH60XgyzitGp9iAvrIvcU1ODVlfgN
ZSBzjOk=
=9s9c
-----END PGP SIGNATURE-----
Merge tag 'sound-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"The majority of works in this cycle are about ASoC spread over trees.
Most of them are for new devices and cleanups / refactoring works, and
not much significant changes are seen in the core side.
Below are some highlights:
ASoC:
- Continued refactoring to move into common helper functions
- Lots of DT schema conversions and stylistic nits
- Continued work on building out the new SOF IPC4 scheme
- Continued work for Intel AVS
- New drivers for Awinc AT88395, Infineon PEB2466, Iron Device
SMA1303, Mediatek MT8188, Realtek RT712, Renesas IDT821034,
Samsung/Tesla FSD SoC I2S, and TI TAS5720A-Q1
ALSA:
- A few cleanups to make the remove callbacks return void
- FireWire refactoring and enhancements
- PCM kselftest enhancements"
* tag 'sound-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (398 commits)
ALSA: hda/hdmi: Register with vga_switcheroo on Dual GPU Macbooks
ASoC: soc-ac97: Return correct error codes
ASoC: soc-dapm.h: fixup warning struct snd_pcm_substream not declared
ASoC: cs35l45: Remove separate namespace for tables
ASoC: cs35l45: Remove separate tables module
ASoC: soc-ac97: Convert to agnostic GPIO API
ASoC: dt-bindings: renesas,rsnd.yaml: drop "dmas/dma-names" from "rcar_sound,ssi"
ALSA: hda: cs35l41: Enable Amp High Pass Filter
ALSA: hda: cs35l41: Ensure firmware/tuning pairs are always loaded
ALSA: hda: cs35l41: Correct error condition handling
ASoC: codecs: wcd934x: Use min macro for comparison and assignment
ASoC: Intel: Skylake: Fix struct definition
ASoC: tlv320adcx140: extend list of supported samplerates
ASoC: imx-pcm-rpmsg: Remove unused variable
SoC: rt5682s: Disable jack detection interrupt during suspend
ASoC: SOF: Intel: hda-dsp: Set streaming flag for d0i3
ASoC: SOF: Intel: Enable d0i3 work for ipc4
ASoC: SOF: ipc4: Wake up dsp core before sending ipc msg
ASoC: SOF: Intel: hda-dsp: use set_pm_gate according to ipc version
ASoC: SOF: Introduce a new set_pm_gate() IPC PM op
...
On Fedora 36, the 'perf record' offcpu profiling tests are failing. This
is because the BPF program checks whether the prev task's state is S or D,
but the state actually has more bits set. Check only the 8 least
significant bits for the purpose of offcpu profiling.
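A small sketch of why masking matters, assuming the S and D states occupy the two lowest bits. EXTRA_FLAG is a made-up higher bit used only for illustration, and the function names are hypothetical:

```python
TASK_INTERRUPTIBLE = 0x0001    # "S"
TASK_UNINTERRUPTIBLE = 0x0002  # "D"
EXTRA_FLAG = 0x2000            # hypothetical higher state bit, illustration only


def is_sleep_exact(state: int) -> bool:
    """Old-style check: compares the whole state word."""
    return state in (TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE)


def is_sleep_masked(state: int) -> bool:
    """New-style check: only looks at the 8 least significant bits."""
    return (state & 0xFF) in (TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE)


state = TASK_INTERRUPTIBLE | EXTRA_FLAG
print(is_sleep_exact(state), is_sleep_masked(state))
```

With an extra bit set, the exact comparison misses the sleeping task, while the masked comparison still recognizes it.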
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230218162724.1292657-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Initial support of HID-BPF (Benjamin Tissoires)
The history is a little long for this series, as it was intended to be
sent for v6.2. However some last minute issues forced us to postpone it
to v6.3.
Conflicts:
* drivers/hid/i2c-hid/Kconfig:
commit bf7660dab3 ("HID: stop drivers from selecting CONFIG_HID")
conflicts with commit 2afac81dd1 ("HID: fix I2C_HID not selected
when I2C_HID_OF_ELAN is")
the resolution is simple enough: just drop the "default" and "select"
lines, as the new commit from Arnd does
Core
----
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used
to describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols
---------
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control the local port range
on a socket-by-socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP
path manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF
---
- Add an rbtree data structure following the "next-gen data structure"
precedent set by the recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key
to better support decap on GRE tunnel devices not operating
in collect metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk
and bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols
by livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter
---------
- Remove the CLUSTERIP target. It has been marked as obsolete
for years, and we still have WARN splats wrt. races of
the out-of-band /proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to
the existing 'delete' commands, but do not return an error if
the referenced object (set, chain, rule...) did not exist.
Driver API
----------
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into multiple
files, drop some of the unnecessarily granular locks and factor out
common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless Extensions
for Wi-Fi 7 devices at all. Everyone should switch to using nl80211
interface instead.
- Improve the CAN bit timing configuration. Use extack to return error
messages directly to user space, update the SJW handling, including
the definition of a new default value that will benefit CAN-FD
controllers, by increasing their oscillator tolerance.
New hardware / drivers
----------------------
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers
-------
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- enetc: support XDP_REDIRECT for XDP non-linear buffers
- enetc: improve reconfig, avoid link flap and waiting for idle
- enetc: support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q, 8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the vsc7512 internal copper PHYs
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth coexistence
- Mobile:
- rmnet: support TX aggregation.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmP1VIYACgkQMUZtbf5S
IrvsChAApz0rNL/sPKxXTEfxZ1tN7D3sYxYKQPomxvl5BV+MvicrLddJy3KmzEFK
nnJNO3nuRNuH422JQ/ylZ4mGX1opa6+5QJb0UINImXUI7Fm8HHBIuPGkv7d5CheZ
7JexFqjPJXUy9nPyh1Rra+IA9AcRd2U7jeGEZR38wb99bHJQj5Bzdk20WArEB0el
n44aqg49LXH71bSeXRz77x5SjkwVtYiccQxLcnmTbjLU2xVraLvI2J+wAhHnVXWW
9lrU1+V4Ex2Xcd1xR0L0cHeK+meP1TrPRAeF+JDpVI3a/zJiE7cZjfHdG/jH5xWl
leZJqghVozrZQNtewWWO7XhUFhMDgFu3W/1vNLjSHPZEqaz1JpM67J1+ql6s63l4
LMWoXbcYZz+SL9ZRCoPkbGue/5fKSHv8/Jl9Sh58+eTS+c/zgN8uFGRNFXLX1+EP
n8uvt985PxMd6x1+dHumhOUzxnY4Sfi1vjitSunTsNFQ3Cmp4SO0IfBVJWfLUCuC
xz5hbJGJJbSpvUsO+HWyCg83E5OWghRE/Onpt2jsQSZCrO9HDg4FRTEf3WAMgaqc
edb5KfbRZPTJQM08gWdluXzSk1nw3FNP2tXW4XlgUrEbjb+fOk0V9dQg2gyYTxQ1
Nhvn8ZQPi6/GMMELHAIPGmmW1allyOGiAzGlQsv8EmL+OFM6WDI=
=xXhC
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used to
describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on
boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols:
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF:
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
better support decap on GRE tunnel devices not operating in collect
metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk and
bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols by
livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter:
- Remove the CLUSTERIP target. It has been marked as obsolete for
years, and we still have WARN splats wrt races of the out-of-band
/proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to the
existing 'delete' commands, but do not return an error if the
referenced object (set, chain, rule...) did not exist.
Driver API:
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019), a modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99), allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into
multiple files, drop some of the unnecessarily granular locks and
factor out common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless
Extensions for Wi-Fi 7 devices at all. Everyone should switch to
using nl80211 interface instead.
- Improve the CAN bit timing configuration. Use extack to return
error messages directly to user space, update the SJW handling,
including the definition of a new default value that will benefit
CAN-FD controllers, by increasing their oscillator tolerance.
New hardware / drivers:
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers:
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- support XDP_REDIRECT for XDP non-linear buffers
- improve reconfig, avoid link flap and waiting for idle
- support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q,
8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the vsc7512 internal copper PHYs
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth coexistence
- Mobile:
- rmnet: support TX aggregation"
* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
page_pool: add a comment explaining the fragment counter usage
net: ethtool: fix __ethtool_dev_mm_supported() implementation
ethtool: pse-pd: Fix double word in comments
xsk: add linux/vmalloc.h to xsk.c
sefltests: netdevsim: wait for devlink instance after netns removal
selftest: fib_tests: Always cleanup before exit
net/mlx5e: Align IPsec ASO result memory to be as required by hardware
net/mlx5e: TC, Set CT miss to the specific ct action instance
net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
net/mlx5: Refactor tc miss handling to a single function
net/mlx5: Kconfig: Make tc offload depend on tc skb extension
net/sched: flower: Support hardware miss to tc action
net/sched: flower: Move filter handle initialization earlier
net/sched: cls_api: Support hardware miss to tc action
net/sched: Rename user cookie and act cookie
sfc: fix builds without CONFIG_RTC_LIB
sfc: clean up some inconsistent indentings
net/mlx4_en: Introduce flexible array to silence overflow warning
net: lan966x: Fix possible deadlock inside PTP
net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
...
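As an illustration of the IP_LOCAL_PORT_RANGE socket option listed under "Protocols" in the networking pull above, the following is a minimal sketch rather than code from the series. It assumes the encoding documented for this option (a 32-bit value with the lower bound in the low 16 bits and the upper bound in the high 16 bits, both inclusive) and the option number 51 from the Linux uAPI headers. The helper names `pack_port_range` and `try_set_local_port_range` are hypothetical, and the bounds 40000-40100 are arbitrary.

```python
import socket
import struct

# Option number from the Linux uAPI header (linux/in.h). Used as a fallback
# because the Python socket module may not export this constant.
IP_LOCAL_PORT_RANGE = getattr(socket, "IP_LOCAL_PORT_RANGE", 51)


def pack_port_range(lo: int, hi: int) -> int:
    """Pack inclusive bounds: low 16 bits = lower bound, high 16 bits = upper bound."""
    for bound in (lo, hi):
        if not 0 <= bound <= 0xFFFF:
            raise ValueError("port bounds must fit in 16 bits")
    return (hi << 16) | lo


def try_set_local_port_range(sock: socket.socket, lo: int, hi: int) -> bool:
    """Apply the range to one socket; return False if the kernel rejects it."""
    # Pass packed bytes: the value can exceed the signed int range that
    # setsockopt() accepts for plain integers.
    value = struct.pack("=I", pack_port_range(lo, hi))
    try:
        sock.setsockopt(socket.IPPROTO_IP, IP_LOCAL_PORT_RANGE, value)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            if try_set_local_port_range(s, 40000, 40100):
                s.bind(("127.0.0.1", 0))  # port 0: let the kernel choose
                print("bound to port", s.getsockname()[1])
            else:
                print("IP_LOCAL_PORT_RANGE not supported here")
    except OSError as exc:
        print("socket setup failed:", exc)
```

On kernels without the option the helper returns False instead of raising, so callers can fall back to the system-wide net.ipv4.ip_local_port_range setting.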
Highlights:
- AMD PMC: Improvements to aid s2idle debugging
- Dell WMI-DDV: hwmon support
- INT3472 camera sensor power-management: Improve privacy LED support
- Intel VSEC: Base TPMI (Topology Aware Register and PM Capsule Interface) support
- Mellanox: SN5600 and Nvidia L1 switch support
- Microsoft Surface Support: Various cleanups + code improvements
- tools/intel-speed-select: Various improvements
- Miscellaneous other cleanups / fixes
The following is an automated git shortlog grouped by driver:
Add include/linux/platform_data/x86 to MAINTAINERS:
- Add include/linux/platform_data/x86 to MAINTAINERS
Documentation/ABI:
- Add new attribute for mlxreg-io sysfs interfaces
Fix header inclusion in linux/platform_data/x86/soc.h:
- Fix header inclusion in linux/platform_data/x86/soc.h
HID:
- surface-hid: Use target-ID enum instead of hard-coding values
MAINTAINERS:
- dell-wmi-sysman: drop Divya Bharathi
- Add entry for TPMI driver
Merge tag 'ib-leds-led_get-v6.3' into HEAD:
- Merge tag 'ib-leds-led_get-v6.3' into HEAD
acerhdf:
- Drop empty platform remove function
apple_gmux:
- Drop no longer used ACPI_VIDEO Kconfig dependency
dell-ddv:
- Prefer asynchronous probing
- Add hwmon support
- Add "force" module param
- Replace EIO with ENOMSG
- Return error if buffer is empty
- Add support for interface version 3
dell-smo8800:
- Use min_t() for comparison and assignment
dell-wmi-sysman:
- Make kobj_type structure constant
hp-wmi:
- Ignore Win-Lock key events
int1092:
- Switch to use acpi_evaluate_dsm_typed()
int3472/discrete:
- add LEDS_CLASS dependency
- Drop unnecessary obj->type == string check
- Get the polarity from the _DSM entry
- Move GPIO request to skl_int3472_register_clock()
- Create a LED class device for the privacy LED
- Refactor GPIO to sensor mapping
intel:
- punit_ipc: Drop empty platform remove function
- oaktrail: Drop empty platform remove function
intel/pmc:
- Switch to use acpi_evaluate_dsm_typed()
leds:
- led-class: Add generic [devm_]led_get()
- led-class: Add __devm_led_get() helper
- led-class: Add led_module_get() helper
- led-class: Add missing put_device() to led_put()
media:
- v4l2-core: Make the v4l2-core code enable/disable the privacy LED if present
nvidia-wmi-ec-backlight:
- Add force module parameter
platform:
- mellanox: mlx-platform: Move bus shift assignment out of the loop
- mellanox: mlx-platform: Add mux selection register to regmap
- mellanox: Extend all systems with I2C notification callback
- mellanox: Split logic in init and exit flow
- mellanox: Split initialization procedure
- mellanox: Introduce support of new Nvidia L1 switch
- mellanox: Introduce support for next-generation 800GB/s switch
- mellanox: Cosmetic changes - rename to more common name
- mellanox: Change "reset_pwr_converter_fail" attribute
- mellanox: Introduce support for rack manager switch
platform/mellanox:
- mlxreg-hotplug: Allow more flexible hotplug events configuration
platform/surface:
- Switch to use acpi_evaluate_dsm_typed()
- aggregator: Rename top-level request functions to avoid ambiguities
- aggregator_registry: Fix target-ID of base-hub
- aggregator: Enforce use of target-ID enum in device ID macros
- dtx: Use target-ID enum instead of hard-coding values
- aggregator_tabletsw: Use target-ID enum instead of hard-coding values
- aggregator_hub: Use target-ID enum instead of hard-coding values
- aggregator: Add target and source IDs to command trace events
- aggregator: Improve documentation and handling of message target and source IDs
platform/x86/amd:
- pmc: Add line break for readability
- pmc: differentiate STB/SMU messaging prints
- pmc: Write dummy postcode into the STB DRAM
- pmc: Add num_samples message id support to STB
platform/x86/amd/pmf:
- Add depends on CONFIG_POWER_SUPPLY
platform/x86/intel:
- Intel TPMI enumeration driver
platform/x86/intel/tpmi:
- ADD tpmi external interface for tpmi feature drivers
- Process CPU package mapping
platform/x86/intel/vsec:
- Use mutex for ida_alloc() and ida_free()
- Support private data
- Enhance and Export intel_vsec_add_aux()
- Add TPMI ID
platform_data/mlxreg:
- Add field with mapped resource address
think-lmi:
- Make kobj_type structure constant
- Use min_t() for comparison and assignment
tools/power/x86/intel-speed-select:
- v1.14 release
- Adjust uncore max/min frequency
- Add Emerald Rapid quirk
- Fix display of uncore min frequency
- turbo-freq auto mode with SMT off
- cpufreq reads on offline CPUs
- Use null-terminated string
- Remove duplicate dup()
- Handle open() failure case
- Remove unused non_block flag
- Remove wrong check in set_isst_id()
x86/platform/uv:
- Make kobj_type structure constant
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmPzRpgUHGhkZWdvZWRl
QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9wYPwf+I6PP0XBg8MrivLc2DHklVojUU0aX
/M0LbCP8gxCDdyisV8swC3e848riaTchYlUGASPZu0ieas1U7KsDvghkiittNvlI
U+0h7TbkOQNymM8oE0oauflH4W5KwCXGrLsJWVkGk0lhJd6WmjXkjWLkruaXazLd
kc5fq0QyzRVzhhCtocQ7qhIgXSZyKYx433VqbDR7/SUi5F2wkC9JbGY02maKWaK3
4lQaoyMKLjGlDr9YVv+UHTwLoXwP0mW/fjlsZ3Xz5lz6WfihQzPuOrl/10mRj0Ez
eP9dlF1Dipee4BYS2FM5dtk5xPpqdVqRlQUX2qKzyDNTSx5wdtJnv8j/cg==
=VoXq
-----END PGP SIGNATURE-----
Merge tag 'platform-drivers-x86-v6.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver updates from Hans de Goede:
- AMD PMC: Improvements to aid s2idle debugging
- Dell WMI-DDV: hwmon support
- INT3472 camera sensor power-management: Improve privacy LED support
- Intel VSEC: Base TPMI (Topology Aware Register and PM Capsule
Interface) support
- Mellanox: SN5600 and Nvidia L1 switch support
- Microsoft Surface Support: Various cleanups + code improvements
- tools/intel-speed-select: Various improvements
- Miscellaneous other cleanups / fixes
* tag 'platform-drivers-x86-v6.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (80 commits)
platform/x86: nvidia-wmi-ec-backlight: Add force module parameter
platform/x86/amd/pmf: Add depends on CONFIG_POWER_SUPPLY
platform/x86: dell-ddv: Prefer asynchronous probing
platform/x86: dell-ddv: Add hwmon support
Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces
platform: mellanox: mlx-platform: Move bus shift assignment out of the loop
platform: mellanox: mlx-platform: Add mux selection register to regmap
platform_data/mlxreg: Add field with mapped resource address
platform/mellanox: mlxreg-hotplug: Allow more flexible hotplug events configuration
platform: mellanox: Extend all systems with I2C notification callback
platform: mellanox: Split logic in init and exit flow
platform: mellanox: Split initialization procedure
platform: mellanox: Introduce support of new Nvidia L1 switch
platform: mellanox: Introduce support for next-generation 800GB/s switch
platform: mellanox: Cosmetic changes - rename to more common name
platform: mellanox: Change "reset_pwr_converter_fail" attribute
platform: mellanox: Introduce support for rack manager switch
MAINTAINERS: dell-wmi-sysman: drop Divya Bharathi
x86/platform/uv: Make kobj_type structure constant
platform/x86: think-lmi: Make kobj_type structure constant
...
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCY/GzaAAKCRCAXGG7T9hj
vhgtAP96ax9EV49/kCST52z9yGfGUA+giq/9Jm6bwHlP3PZXVAD/Wfhfp1HbxzFp
CqXG7veXU+uGVP3lbpbYKNPV9DIOdgQ=
=K+0Q
-----END PGP SIGNATURE-----
Merge tag 'for-linus-6.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- help deprecate the /proc/xen files by making the related information
available via sysfs
- mark the Xen variants of play_dead "noreturn"
- support a shared Xen platform interrupt
- several small cleanups and fixes
* tag 'for-linus-6.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: sysfs: make kobj_type structure constant
x86/Xen: drop leftover VM-assist uses
xen: Replace one-element array with flexible-array member
xen/grant-dma-iommu: Implement a dummy probe_device() callback
xen/pvcalls-back: fix permanently masked event channel
xen: Allow platform PCI interrupt to be shared
x86/xen/time: prefer tsc as clocksource when it is invariant
x86/xen: mark xen_pv_play_dead() as __noreturn
x86/xen: don't let xen_pv_play_dead() return
drivers/xen/hypervisor: Expose Xen SIF flags to userspace
- Remove a superfluous variable from apic_get_tmcct()
- Fix various edge cases in x2APIC MSR emulation
- Mark APIC timer as expired if it's in one-shot mode and the count
underflows while the vCPU task was being migrated
- Reset xAPIC when userspace forces "impossible" x2APIC => xAPIC transition
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmPsB58SHHNlYW5qY0Bn
b29nbGUuY29tAAoJEGCRIgFNDBL5CK0P/1hhxUWokhNJX0skgf8uKhxTf8bLAq5F
xr221M4Ac9YwjJaS0p4PJVSLVJxcVXHsyvanCOQh6AE8q1Ugz+iDLr2gAI+fHbJY
lnczpAj1UhhttaLSOl13/31TaJdE2Ep0/q3+5vf1qQrOJYkElKpiDYbf3M8T5G72
pguUFhKKKeZcCB99Jpr0u0HupiwCZoYWvdx7mvzRhi11bWaUyYIWc9CBETmAb4kN
1UAmov16UrVOFAg/ssde6qPgUsAgB8XwJjta6oIQLeEm70L5ci6g/2Tw0IEwMybR
yLCCST9eATl2U/hPV4KwBzSN1gHCAx4JDp4TKBR8ic+c+Z8CceIZln05fz6rQ8Sz
ljyaRVFhaQZyZpjrZJ0h3kqMG1JT/Q4Hj9dq8RZJ0K73KVuCspxaJDHqp6a2p9D0
dDacDkD3LFIPBdem3hHcpmV2XduaMfQwspObJORarkkQTZZS6erxmPvK/6Quvmbk
UdD+6hvuSQA8rxNKXF+fOBsnK/1xYvzkVis0sxMwthkSDvENdcPbmlD6kHLz52cg
Jt+yw/85oIg7zBgEkG2c8+5bB2hw0SRPQBlW4j29jYUhRwXwHxuovllFS2GU7iIc
fVNtocw5Q9WATp752va4bVjv9XeYBmExn99fd3xvFenTa/ya4+5gNFK8vc9zL++J
x3fDhAPXmQHJ
=ieB+
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-apic-6.3' of https://github.com/kvm-x86/linux into HEAD
KVM x86 APIC changes for 6.3:
- Remove a superfluous variable from apic_get_tmcct()
- Fix various edge cases in x2APIC MSR emulation
- Mark APIC timer as expired if it's in one-shot mode and the count
underflows while the vCPU task was being migrated
- Reset xAPIC when userspace forces "impossible" x2APIC => xAPIC transition
- Support for arm64 SME 2 and 2.1. SME2 introduces a new 512-bit
architectural register (ZT0, for the look-up table feature) that Linux
needs to save/restore.
- Include TPIDR2 in the signal context and add the corresponding
kselftests.
- Perf updates: Arm SPEv1.2 support, HiSilicon uncore PMU updates, ACPI
support to the Marvell DDR and TAD PMU drivers, reset DTM_PMU_CONFIG
(ARM CMN) at probe time.
- Support for DYNAMIC_FTRACE_WITH_CALL_OPS on arm64.
- Permit EFI boot with MMU and caches on. Instead of cleaning the entire
loaded kernel image to the PoC and disabling the MMU and caches before
branching to the kernel bare metal entry point, leave the MMU and
caches enabled and rely on EFI's cacheable 1:1 mapping of all of
system RAM to populate the initial page tables.
- Expose the AArch32 (compat) ELF_HWCAP features to user in an arm64
kernel (the arm32 kernel only defines the values).
- Harden the arm64 shadow call stack pointer handling: stash the shadow
stack pointer in the task struct on interrupt, load it directly from
this structure.
- Signal handling cleanups to remove redundant validation of size
information and avoid reading the same data from userspace twice.
- Refactor the hwcap macros to make use of the automatically generated
ID registers. It should make new hwcaps writing less error prone.
- Further arm64 sysreg conversion and some fixes.
- arm64 kselftest fixes and improvements.
- Pointer authentication cleanups: don't sign leaf functions, unify
asm-arch manipulation.
- Pseudo-NMI code generation optimisations.
- Minor fixes for SME and TPIDR2 handling.
- Miscellaneous updates: ARCH_FORCE_MAX_ORDER is now selectable, replace
strtobool() with kstrtobool() in the cpufeature.c code, apply dynamic
shadow call stack in two passes, intercept pfn changes in set_pte_at()
without the required break-before-make sequence, attempt to dump all
instructions on unhandled kernel faults.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmP0/QsACgkQa9axLQDI
XvG+gA/+JDVEH9wRzAIZvbp9hSuohPc48xgAmIMP1eiVB0/5qeRjYAJwS33H0rXS
BPC2kj9IBy/eQeM9ICg0nFd0zYznSVacITqe6NrqeJ1F+ftS4rrHdfxd+J7kIoCs
V2L8e+BJvmHdhmNV2qMAgJdGlfxfQBA7fv2cy52HKYcouoOh1AUVR/x+yXVXAsCd
qJP3+dlUKccgm/oc5unEC1eZ49u8O+EoasqOyfG6K5udMgzhEX3K6imT9J3hw0WT
UjstYkx5uGS/prUrRCQAX96VCHoZmzEDKtQuHkHvQXEYXsYPF3ldbR2CziNJnHe7
QfSkjJlt8HAtExA+BkwEe9i0MQO/2VF5qsa2e4fA6l7uqGu3LOtS/jJd23C9n9fR
Id8aBMeN6S8+MjqRA9L2uf4t6e4ISEHoG9ZRdc4WOwloxEEiJoIeun+7bHdOSZLj
AFdHFCz4NXiiwC0UP0xPDI2YeCLqt5np7HmnrUqwzRpVO8UUagiJD8TIpcBSjBN9
J68eidenHUW7/SlIeaMKE2lmo8AUEAJs9AorDSugF19/ThJcQdx7vT2UAZjeVB3j
1dbbwajnlDOk/w8PQC4thFp5/MDlfst0htS3WRwa+vgkweE2EAdTU4hUZ8qEP7FQ
smhYtlT1xUSTYDTqoaG/U2OWR6/UU79wP0jgcOsHXTuyYrtPI/Q=
=VmXL
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- Support for arm64 SME 2 and 2.1. SME2 introduces a new 512-bit
architectural register (ZT0, for the look-up table feature) that
Linux needs to save/restore
- Include TPIDR2 in the signal context and add the corresponding
kselftests
- Perf updates: Arm SPEv1.2 support, HiSilicon uncore PMU updates, ACPI
support to the Marvell DDR and TAD PMU drivers, reset DTM_PMU_CONFIG
(ARM CMN) at probe time
- Support for DYNAMIC_FTRACE_WITH_CALL_OPS on arm64
- Permit EFI boot with MMU and caches on. Instead of cleaning the
entire loaded kernel image to the PoC and disabling the MMU and
caches before branching to the kernel bare metal entry point, leave
the MMU and caches enabled and rely on EFI's cacheable 1:1 mapping of
all of system RAM to populate the initial page tables
- Expose the AArch32 (compat) ELF_HWCAP features to user in an arm64
kernel (the arm32 kernel only defines the values)
- Harden the arm64 shadow call stack pointer handling: stash the shadow
stack pointer in the task struct on interrupt, load it directly from
this structure
- Signal handling cleanups to remove redundant validation of size
information and avoid reading the same data from userspace twice
- Refactor the hwcap macros to make use of the automatically generated
ID registers. It should make new hwcaps writing less error prone
- Further arm64 sysreg conversion and some fixes
- arm64 kselftest fixes and improvements
- Pointer authentication cleanups: don't sign leaf functions, unify
asm-arch manipulation
- Pseudo-NMI code generation optimisations
- Minor fixes for SME and TPIDR2 handling
- Miscellaneous updates: ARCH_FORCE_MAX_ORDER is now selectable,
replace strtobool() with kstrtobool() in the cpufeature.c code, apply
dynamic shadow call stack in two passes, intercept pfn changes in
set_pte_at() without the required break-before-make sequence, attempt
to dump all instructions on unhandled kernel faults
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (130 commits)
arm64: fix .idmap.text assertion for large kernels
kselftest/arm64: Don't require FA64 for streaming SVE+ZA tests
kselftest/arm64: Copy whole EXTRA context
arm64: kprobes: Drop ID map text from kprobes blacklist
perf: arm_spe: Print the version of SPE detected
perf: arm_spe: Add support for SPEv1.2 inverted event filtering
perf: Add perf_event_attr::config3
arm64/sme: Fix __finalise_el2 SMEver check
drivers/perf: fsl_imx8_ddr_perf: Remove set-but-not-used variable
arm64/signal: Only read new data when parsing the ZT context
arm64/signal: Only read new data when parsing the ZA context
arm64/signal: Only read new data when parsing the SVE context
arm64/signal: Avoid rereading context frame sizes
arm64/signal: Make interface for restore_fpsimd_context() consistent
arm64/signal: Remove redundant size validation from parse_user_sigframe()
arm64/signal: Don't redundantly verify FPSIMD magic
arm64/cpufeature: Use helper macros to specify hwcaps
arm64/cpufeature: Always use symbolic name for feature value in hwcaps
arm64/sysreg: Initial unsigned annotations for ID registers
arm64/sysreg: Initial annotation of signed ID registers
...
A single & will create a background process and return true, so the grep
command will run even if the file checked in the first condition does not
exist.
Link: https://lore.kernel.org/all/20230112114215.17103-1-antonio.feijoo@suse.com/
Fixes: 1eaad3ac3f ("tools/bootconfig: Use per-group/all enable option in ftrace2bconf script")
Signed-off-by: Antonio Alvarez Feijoo <antonio.feijoo@suse.com>
Cc: stable@vger.kernel.org
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
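The `&` versus `&&` pitfall described in the ftrace2bconf fix above can be reproduced with a small sketch. It drives `sh` from Python so that both forms can be compared programmatically. The test command and the path are hypothetical stand-ins, not the actual lines from the script.

```python
import subprocess

# Hypothetical path that is assumed not to exist on the system.
MISSING = "/nonexistent/hypothetical/enable"


def run_guarded(operator: str, path: str) -> str:
    """Run `[ -f PATH ] <operator> echo ran` in sh and return its stdout."""
    script = f'[ -f "$1" ] {operator} echo ran'
    result = subprocess.run(
        ["sh", "-c", script, "sh", path],  # "$1" inside the script is `path`
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # With a single '&' the test is sent to the background and the second
    # command runs unconditionally; with '&&' it runs only if the test passes.
    print("&  ->", repr(run_guarded("&", MISSING)))
    print("&& ->", repr(run_guarded("&&", MISSING)))
```

With a missing file, the `&` form still runs the second command, while the `&&` form skips it.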
When calling ip6_route_lookup() for the packet arriving on the VRF
interface, the result is always the real (slave) interface. Expect this
when validating the result.
Fixes: acc641ab95 ("netfilter: rpfilter/fib: Populate flowic_l3mdev field")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
- Cache the AMD debug registers in per-CPU variables to avoid MSR writes
where possible, when supporting a debug registers swap feature for
SEV-ES guests
- Add support for AMD's version of eIBRS called Automatic IBRS which is
a set-and-forget control of indirect branch restriction speculation
resources on privilege change
- Add support for a new x86 instruction - LKGS - Load kernel GS which is
part of the FRED infrastructure
- Reset SPEC_CTRL upon init to accommodate use cases like kexec which
rediscover
- Other smaller fixes and cleanups
Merge tag 'x86_cpu_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpuid updates from Borislav Petkov:
- Cache the AMD debug registers in per-CPU variables to avoid MSR
writes where possible, when supporting a debug registers swap feature
for SEV-ES guests
- Add support for AMD's version of eIBRS called Automatic IBRS which is
a set-and-forget control of indirect branch restriction speculation
resources on privilege change
- Add support for a new x86 instruction - LKGS - Load kernel GS which
is part of the FRED infrastructure
- Reset SPEC_CTRL upon init to accommodate use cases like kexec which
rediscover
- Other smaller fixes and cleanups
* tag 'x86_cpu_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/amd: Cache debug register values in percpu variables
KVM: x86: Propagate the AMD Automatic IBRS feature to the guest
x86/cpu: Support AMD Automatic IBRS
x86/cpu, kvm: Add the SMM_CTL MSR not present feature
x86/cpu, kvm: Add the Null Selector Clears Base feature
x86/cpu, kvm: Move X86_FEATURE_LFENCE_RDTSC to its native leaf
x86/cpu, kvm: Add the NO_NESTED_DATA_BP feature
KVM: x86: Move open-coded CPUID leaf 0x80000021 EAX bit propagation code
x86/cpu, kvm: Add support for CPUID_80000021_EAX
x86/gsseg: Add the new <asm/gsseg.h> header to <asm/asm-prototypes.h>
x86/gsseg: Use the LKGS instruction if available for load_gs_index()
x86/gsseg: Move load_gs_index() to its own new header file
x86/gsseg: Make asm_load_gs_index() take an u16
x86/opcode: Add the LKGS instruction to x86-opcode-map
x86/cpufeature: Add the CPU feature bit for LKGS
x86/bugs: Reset speculation control settings on init
x86/cpu: Remove redundant extern x86_read_arch_cap_msr()
- Rework a large bunch of drivers to use the generic thermal trip
structure and use the opportunity to do more cleanups by removing
unused functions from the OF code (Daniel Lezcano).
- Remove core header inclusion from drivers (Daniel Lezcano).
- Fix some locking issues related to the generic thermal trip rework
(Johan Hovold).
- Fix a crash when requesting the critical temperature on tegra, which
is related to the generic trip point work (Jon Hunter).
- Clean up thermal device unregistration code (Viresh Kumar).
- Fix and clean up thermal control core initialization error code
paths (Daniel Lezcano).
- Relocate the trip points handling code into a separate file (Daniel
Lezcano).
- Make the thermal core fail registration of thermal zones and cooling
devices if the thermal class has not been registered (Rafael Wysocki).
- Add trip point initialization helper functions for ACPI-defined trip
points and modify two thermal drivers to use them (Rafael Wysocki,
Daniel Lezcano).
- Make the core thermal control code use sysfs_emit_at() instead of
scnprintf() where applicable (ye xingchen).
- Consolidate code accessing the Intel TCC (Thermal Control Circuitry)
MSRs by introducing library functions for that and making the
TCC-related code in thermal drivers use them (Zhang Rui).
- Enhance the x86_pkg_temp_thermal driver to support dynamic tjmax
changes (Zhang Rui).
- Address an "unsigned expression compared with zero" warning in the
intel_soc_dts_iosf thermal driver (Yang Li).
- Update comments regarding two functions in the Intel Menlow thermal
driver (Deming Wang).
- Use sysfs_emit_at() instead of scnprintf() in the int340x thermal
driver (ye xingchen).
- Make the intel_pch thermal driver support the Wellsburg PCH (Tim
Zimmermann).
- Modify the intel_pch and processor_thermal_device_pci thermal drivers
to use generic trip point tables instead of thermal zone trip point
callbacks (Daniel Lezcano).
- Add a production mode sysfs attribute to the int340x thermal
driver (Srinivas Pandruvada).
- Rework dynamic trip point updates handling and locking in the int340x
thermal driver (Rafael Wysocki).
- Make the int340x thermal driver use a generic trip points table
instead of thermal zone trip point callbacks (Rafael Wysocki, Daniel
Lezcano).
- Clean up and improve the int340x thermal driver (Rafael Wysocki).
- Simplify and clean up the intel_pch thermal driver (Rafael Wysocki).
- Fix the Intel powerclamp thermal driver and make it use the common
idle injection framework (Srinivas Pandruvada).
- Add two module parameters, cpumask and max_idle, to the Intel powerclamp
thermal driver to allow it to affect only a specific subset of CPUs
instead of all of them (Srinivas Pandruvada).
- Make the Intel quark_dts thermal driver use generic trip point
objects instead of its own trip point representation (Daniel
Lezcano).
- Add toctree entry for thermal documents and fix two issues in the
Intel powerclamp driver documentation (Bagas Sanjaya).
- Use strscpy() instead of strncpy() in the thermal core (Xu Panda).
- Fix thermal_sampling_exit() (Vincent Guittot).
- Add Mediatek Low Voltage Thermal Sensor (LVTS) driver (Balsam Chihi).
- Add r8a779g0 RCar support to the rcar_gen3 thermal driver (Geert
Uytterhoeven).
- Fix useless call to set_trips() when resuming in the rcar_gen3
thermal control driver and add interrupt support detection at init
time to it (Niklas Söderlund).
- Fix memory corruption in the hi3660 thermal driver (Yongqin Liu).
- Fix include path for libnl3 in pkg-config file for libthermal (Vibhav
Pant).
- Remove syscfg-based driver for st as the platform is not supported
any more (Alain Volmat).
Merge tag 'thermal-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"The majority of changes here are related to the general switch-over to
using arrays of generic trip point structures registered along with a
thermal zone instead of trip point callbacks (this has been done
mostly by Daniel Lezcano with some help from yours truly on the Intel
drivers front).
Apart from that and the related reorganization of code, there are some
enhancements of the existing driver and a new Mediatek Low Voltage
Thermal Sensor (LVTS) driver. The Intel powerclamp undergoes a major
rework so it will use the generic idle_inject facility for CPU idle
time injection going forward and it will take additional module
parameters for specifying the subset of CPUs to be affected by it
(work done by Srinivas Pandruvada).
Also included are assorted fixes and a whole bunch of cleanups.
Specifics:
- Rework a large bunch of drivers to use the generic thermal trip
structure and use the opportunity to do more cleanups by removing
unused functions from the OF code (Daniel Lezcano)
- Remove core header inclusion from drivers (Daniel Lezcano)
- Fix some locking issues related to the generic thermal trip rework
(Johan Hovold)
- Fix a crash when requesting the critical temperature on tegra,
which is related to the generic trip point work (Jon Hunter)
- Clean up thermal device unregistration code (Viresh Kumar)
- Fix and clean up thermal control core initialization error code
paths (Daniel Lezcano)
- Relocate the trip points handling code into a separate file (Daniel
Lezcano)
- Make the thermal core fail registration of thermal zones and
cooling devices if the thermal class has not been registered
(Rafael Wysocki)
- Add trip point initialization helper functions for ACPI-defined
trip points and modify two thermal drivers to use them (Rafael
Wysocki, Daniel Lezcano)
- Make the core thermal control code use sysfs_emit_at() instead of
scnprintf() where applicable (ye xingchen)
- Consolidate code accessing the Intel TCC (Thermal Control
Circuitry) MSRs by introducing library functions for that and
making the TCC-related code in thermal drivers use them (Zhang Rui)
- Enhance the x86_pkg_temp_thermal driver to support dynamic tjmax
changes (Zhang Rui)
- Address an "unsigned expression compared with zero" warning in the
intel_soc_dts_iosf thermal driver (Yang Li)
- Update comments regarding two functions in the Intel Menlow thermal
driver (Deming Wang)
- Use sysfs_emit_at() instead of scnprintf() in the int340x thermal
driver (ye xingchen)
- Make the intel_pch thermal driver support the Wellsburg PCH (Tim
Zimmermann)
- Modify the intel_pch and processor_thermal_device_pci thermal
drivers to use generic trip point tables instead of thermal zone trip
point callbacks (Daniel Lezcano)
- Add a production mode sysfs attribute to the int340x
thermal driver (Srinivas Pandruvada)
- Rework dynamic trip point updates handling and locking in the
int340x thermal driver (Rafael Wysocki)
- Make the int340x thermal driver use a generic trip points table
instead of thermal zone trip point callbacks (Rafael Wysocki,
Daniel Lezcano)
- Clean up and improve the int340x thermal driver (Rafael Wysocki)
- Simplify and clean up the intel_pch thermal driver (Rafael Wysocki)
- Fix the Intel powerclamp thermal driver and make it use the common
idle injection framework (Srinivas Pandruvada)
- Add two module parameters, cpumask and max_idle, to the Intel
powerclamp thermal driver to allow it to affect only a specific
subset of CPUs instead of all of them (Srinivas Pandruvada)
- Make the Intel quark_dts thermal driver use generic trip point
objects instead of its own trip point representation (Daniel
Lezcano)
- Add toctree entry for thermal documents and fix two issues in the
Intel powerclamp driver documentation (Bagas Sanjaya)
- Use strscpy() instead of strncpy() in the thermal core (Xu
Panda)
- Fix thermal_sampling_exit() (Vincent Guittot)
- Add Mediatek Low Voltage Thermal Sensor (LVTS) driver (Balsam
Chihi)
- Add r8a779g0 RCar support to the rcar_gen3 thermal driver (Geert
Uytterhoeven)
- Fix useless call to set_trips() when resuming in the rcar_gen3
thermal control driver and add interrupt support detection at init
time to it (Niklas Söderlund)
- Fix memory corruption in the hi3660 thermal driver (Yongqin Liu)
- Fix include path for libnl3 in pkg-config file for libthermal
(Vibhav Pant)
- Remove syscfg-based driver for st as the platform is not supported
any more (Alain Volmat)"
* tag 'thermal-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (135 commits)
thermal/drivers/st: Remove syscfg based driver
thermal: Remove core header inclusion from drivers
tools/lib/thermal: Fix include path for libnl3 in pkg-config file.
thermal/drivers/hisi: Drop second sensor hi3660
thermal/drivers/rcar_gen3_thermal: Fix device initialization
thermal/drivers/rcar_gen3_thermal: Create device local ops struct
thermal/drivers/rcar_gen3_thermal: Do not call set_trips() when resuming
thermal/drivers/rcar_gen3: Add support for R-Car V4H
dt-bindings: thermal: rcar-gen3-thermal: Add r8a779g0 support
thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver
dt-bindings: thermal: mediatek: Add LVTS thermal controllers
thermal/drivers/mediatek: Relocate driver to mediatek folder
tools/lib/thermal: Fix thermal_sampling_exit()
Documentation: powerclamp: Fix numbered lists formatting
Documentation: powerclamp: Escape wildcard in cpumask description
Documentation: admin-guide: Add toctree entry for thermal docs
thermal: intel: powerclamp: Add two module parameters
Documentation: admin-guide: Move intel_powerclamp documentation
thermal: core: Use sysfs_emit_at() instead of scnprintf()
thermal: intel: powerclamp: Fix duration module parameter
...
- Add EPP support to the AMD P-state cpufreq driver (Perry Yuan, Wyes
Karny, Arnd Bergmann, Bagas Sanjaya).
- Drop the custom cpufreq driver for loongson1 that is not necessary
any more and the corresponding cpufreq platform device (Keguang
Zhang).
- Remove "select SRCU" from system sleep, cpufreq and OPP Kconfig
entries (Paul E. McKenney).
- Enable thermal cooling for Tegra194 (Yi-Wei Wang).
- Register module device table and add missing compatibles for
cpufreq-qcom-hw (Nícolas F. R. A. Prado, Abel Vesa and Luca Weiss).
- Various dt binding updates for qcom-cpufreq-nvmem and opp-v2-kryo-cpu
(Christian Marangi).
- Make kobj_type structure in the cpufreq core constant (Thomas
Weißschuh).
- Make cpufreq_unregister_driver() return void (Uwe Kleine-König).
- Make the TEO cpuidle governor check CPU utilization in order to refine
idle state selection (Kajetan Puchalski).
- Make Kconfig select the haltpoll cpuidle governor when the haltpoll
cpuidle driver is selected and replace a default_idle() call in that
driver with arch_cpu_idle() to allow MWAIT to be used (Li RongQing).
- Add Emerald Rapids Xeon support to the intel_idle driver (Artem
Bityutskiy).
- Add ARCH_SUSPEND_POSSIBLE dependencies for ARMv4 cpuidle drivers to
avoid randconfig build failures (Arnd Bergmann).
- Make kobj_type structures used in the cpuidle sysfs interface
constant (Thomas Weißschuh).
- Make the cpuidle driver registration code update microsecond values
of idle state parameters in accordance with their nanosecond values
if they are provided (Rafael Wysocki).
- Make the PSCI cpuidle driver prevent topology CPUs from being
suspended on PREEMPT_RT (Krzysztof Kozlowski).
- Document that pm_runtime_force_suspend() cannot be used with
DPM_FLAG_SMART_SUSPEND (Richard Fitzgerald).
- Add EXPORT macros for exporting PM functions from drivers (Richard
Fitzgerald).
- Remove /** from non-kernel-doc comments in hibernation code (Randy
Dunlap).
- Fix possible name leak in powercap_register_zone() (Yang Yingliang).
- Add Meteor Lake and Emerald Rapids support to the intel_rapl power
capping driver (Zhang Rui).
- Modify the idle_inject power capping facility to support 100% idle
injection (Srinivas Pandruvada).
- Fix large time windows handling in the intel_rapl power capping
driver (Zhang Rui).
- Fix memory leaks with using debugfs_lookup() in the generic PM
domains and Energy Model code (Greg Kroah-Hartman).
- Add missing 'cache-unified' property in the example for kryo OPP
bindings (Rob Herring).
- Fix error checking in opp_migrate_dentry() (Qi Zheng).
- Let qcom,opp-fuse-level be a 2-long array for qcom SoCs (Konrad
Dybcio).
- Modify some power management utilities to use the canonical ftrace
path (Ross Zwisler).
- Correct spelling problems for Documentation/power/ as reported by
codespell (Randy Dunlap).
Merge tag 'pm-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add EPP support to the AMD P-state cpufreq driver, add support
for new platforms to the Intel RAPL power capping driver, intel_idle
and the Qualcomm cpufreq driver, enable thermal cooling for Tegra194,
drop the custom cpufreq driver for loongson1 that is not necessary any
more (and the corresponding cpufreq platform device), fix assorted
issues and clean up code.
Specifics:
- Add EPP support to the AMD P-state cpufreq driver (Perry Yuan, Wyes
Karny, Arnd Bergmann, Bagas Sanjaya)
- Drop the custom cpufreq driver for loongson1 that is not necessary
any more and the corresponding cpufreq platform device (Keguang
Zhang)
- Remove "select SRCU" from system sleep, cpufreq and OPP Kconfig
entries (Paul E. McKenney)
- Enable thermal cooling for Tegra194 (Yi-Wei Wang)
- Register module device table and add missing compatibles for
cpufreq-qcom-hw (Nícolas F. R. A. Prado, Abel Vesa and Luca Weiss)
- Various dt binding updates for qcom-cpufreq-nvmem and
opp-v2-kryo-cpu (Christian Marangi)
- Make kobj_type structure in the cpufreq core constant (Thomas
Weißschuh)
- Make cpufreq_unregister_driver() return void (Uwe Kleine-König)
- Make the TEO cpuidle governor check CPU utilization in order to
refine idle state selection (Kajetan Puchalski)
- Make Kconfig select the haltpoll cpuidle governor when the haltpoll
cpuidle driver is selected and replace a default_idle() call in
that driver with arch_cpu_idle() to allow MWAIT to be used (Li
RongQing)
- Add Emerald Rapids Xeon support to the intel_idle driver (Artem
Bityutskiy)
- Add ARCH_SUSPEND_POSSIBLE dependencies for ARMv4 cpuidle drivers to
avoid randconfig build failures (Arnd Bergmann)
- Make kobj_type structures used in the cpuidle sysfs interface
constant (Thomas Weißschuh)
- Make the cpuidle driver registration code update microsecond values
of idle state parameters in accordance with their nanosecond values
if they are provided (Rafael Wysocki)
- Make the PSCI cpuidle driver prevent topology CPUs from being
suspended on PREEMPT_RT (Krzysztof Kozlowski)
- Document that pm_runtime_force_suspend() cannot be used with
DPM_FLAG_SMART_SUSPEND (Richard Fitzgerald)
- Add EXPORT macros for exporting PM functions from drivers (Richard
Fitzgerald)
- Remove /** from non-kernel-doc comments in hibernation code (Randy
Dunlap)
- Fix possible name leak in powercap_register_zone() (Yang Yingliang)
- Add Meteor Lake and Emerald Rapids support to the intel_rapl power
capping driver (Zhang Rui)
- Modify the idle_inject power capping facility to support 100% idle
injection (Srinivas Pandruvada)
- Fix large time windows handling in the intel_rapl power capping
driver (Zhang Rui)
- Fix memory leaks with using debugfs_lookup() in the generic PM
domains and Energy Model code (Greg Kroah-Hartman)
- Add missing 'cache-unified' property in the example for kryo OPP
bindings (Rob Herring)
- Fix error checking in opp_migrate_dentry() (Qi Zheng)
- Let qcom,opp-fuse-level be a 2-long array for qcom SoCs (Konrad
Dybcio)
- Modify some power management utilities to use the canonical ftrace
path (Ross Zwisler)
- Correct spelling problems for Documentation/power/ as reported by
codespell (Randy Dunlap)"
* tag 'pm-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (53 commits)
Documentation: amd-pstate: disambiguate user space sections
cpufreq: amd-pstate: Fix invalid write to MSR_AMD_CPPC_REQ
dt-bindings: opp: opp-v2-kryo-cpu: enlarge opp-supported-hw maximum
dt-bindings: cpufreq: qcom-cpufreq-nvmem: make cpr bindings optional
dt-bindings: cpufreq: qcom-cpufreq-nvmem: specify supported opp tables
PM: Add EXPORT macros for exporting PM functions
cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
MIPS: loongson32: Drop obsolete cpufreq platform device
powercap: intel_rapl: Fix handling for large time window
cpuidle: driver: Update microsecond values of state parameters as needed
cpuidle: sysfs: make kobj_type structures constant
cpuidle: add ARCH_SUSPEND_POSSIBLE dependencies
PM: EM: fix memory leak with using debugfs_lookup()
PM: domains: fix memory leak with using debugfs_lookup()
cpufreq: Make kobj_type structure constant
cpufreq: davinci: Fix clk use after free
cpufreq: amd-pstate: avoid uninitialized variable use
cpufreq: Make cpufreq_unregister_driver() return void
OPP: fix error checking in opp_migrate_dentry()
dt-bindings: cpufreq: cpufreq-qcom-hw: Add SM8550 compatible
...
This pull request contains the following branches:
doc.2023.01.05a: Documentation updates.
fixes.2023.01.23a: Miscellaneous fixes, perhaps most notably:
o Throttling callback invocation based on the number of callbacks
that are now ready to invoke instead of on the total number
of callbacks.
o Several patches that suppress false-positive boot-time
diagnostics, for example, due to lockdep not yet being
initialized.
o Make expedited RCU CPU stall warnings dump stacks of any tasks
that are blocking the stalled grace period. (Normal RCU CPU
stall warnings have done this for many years.)
o Lazy-callback fixes to avoid delays during boot, suspend, and
resume. (Note that lazy callbacks must be explicitly enabled,
so this should not (yet) affect production use cases.)
kvfree.2023.01.03a: Cause kfree_rcu() and friends to take advantage of
polled grace periods, thus reducing memory footprint by almost
two orders of magnitude, admittedly on a microbenchmark.
This series also begins the transition from kfree_rcu(p) to
kfree_rcu_mightsleep(p). This transition was motivated by bugs
where kfree_rcu(p), which can block, was typed instead of the
intended kfree_rcu(p, rh).
srcu.2023.01.03a: SRCU updates, perhaps most notably fixing a bug that
causes SRCU to fail when booted on a system with a non-zero boot
CPU. This surprising situation actually happens for kdump kernels
on the powerpc architecture. It also adds an srcu_down_read()
and srcu_up_read(), which act like srcu_read_lock() and
srcu_read_unlock(), but allow an SRCU read-side critical section
to be handed off from one task to another.
srcu-always.2023.02.02a: Cleans up the now-useless SRCU Kconfig option.
There are a few more commits that are not yet acked or pulled
into maintainer trees, and these will be in a pull request for
a later merge window.
tasks.2023.01.03a: RCU-tasks updates, perhaps most notably these fixes:
o A strange interaction between PID-namespace unshare and the
RCU-tasks grace period that results in a low-probability but
very real hang.
o A race between an RCU tasks rude grace period on a single-CPU
system and CPU-hotplug addition of the second CPU that can result
in a too-short grace period.
o A race between shrinking RCU tasks down to a single callback list
and queuing a new callback to some other CPU, but where that
queuing is delayed for more than an RCU grace period. This can
result in that callback being stranded on the non-boot CPU.
torture.2023.01.05a: Torture-test updates and fixes.
torturescript.2023.01.03a: Torture-test scripting updates and fixes.
stall.2023.01.09a: Provide additional RCU CPU stall-warning information
in kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y, and
restore the full five-minute timeout limit for expedited RCU
CPU stall warnings.
Merge tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU updates from Paul McKenney:
- Documentation updates
- Miscellaneous fixes, perhaps most notably:
- Throttling callback invocation based on the number of callbacks
that are now ready to invoke instead of on the total number of
callbacks
- Several patches that suppress false-positive boot-time
diagnostics, for example, due to lockdep not yet being
initialized
- Make expedited RCU CPU stall warnings dump stacks of any tasks
that are blocking the stalled grace period. (Normal RCU CPU
stall warnings have done this for many years)
- Lazy-callback fixes to avoid delays during boot, suspend, and
resume. (Note that lazy callbacks must be explicitly enabled, so
this should not (yet) affect production use cases)
- Make kfree_rcu() and friends take advantage of polled grace periods,
thus reducing memory footprint by almost two orders of magnitude,
admittedly on a microbenchmark
This also begins the transition from kfree_rcu(p) to
kfree_rcu_mightsleep(p). This transition was motivated by bugs where
kfree_rcu(p), which can block, was typed instead of the intended
kfree_rcu(p, rh)
- SRCU updates, perhaps most notably fixing a bug that causes SRCU to
fail when booted on a system with a non-zero boot CPU. This
surprising situation actually happens for kdump kernels on the
powerpc architecture
This also adds an srcu_down_read() and srcu_up_read(), which act like
srcu_read_lock() and srcu_read_unlock(), but allow an SRCU read-side
critical section to be handed off from one task to another
- Clean up the now-useless SRCU Kconfig option
There are a few more commits that are not yet acked or pulled into
maintainer trees, and these will be in a pull request for a later
merge window
- RCU-tasks updates, perhaps most notably these fixes:
- A strange interaction between PID-namespace unshare and the
RCU-tasks grace period that results in a low-probability but
very real hang
- A race between an RCU tasks rude grace period on a single-CPU
system and CPU-hotplug addition of the second CPU that can
result in a too-short grace period
- A race between shrinking RCU tasks down to a single callback
list and queuing a new callback to some other CPU, but where
that queuing is delayed for more than an RCU grace period. This
can result in that callback being stranded on the non-boot CPU
- Torture-test updates and fixes
- Torture-test scripting updates and fixes
- Provide additional RCU CPU stall-warning information in kernels built
with CONFIG_RCU_CPU_STALL_CPUTIME=y, and restore the full five-minute
timeout limit for expedited RCU CPU stall warnings
* tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits)
rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()
kernel/notifier: Remove CONFIG_SRCU
init: Remove "select SRCU"
fs/quota: Remove "select SRCU"
fs/notify: Remove "select SRCU"
fs/btrfs: Remove "select SRCU"
fs: Remove CONFIG_SRCU
drivers/pci/controller: Remove "select SRCU"
drivers/net: Remove "select SRCU"
drivers/md: Remove "select SRCU"
drivers/hwtracing/stm: Remove "select SRCU"
drivers/dax: Remove "select SRCU"
drivers/base: Remove CONFIG_SRCU
rcu: Disable laziness if lazy-tracking says so
rcu: Track laziness during boot and suspend
rcu: Remove redundant call to rcu_boost_kthread_setaffinity()
rcu: Allow up to five minutes expedited RCU CPU stall-warning timeouts
rcu: Align the output of RCU CPU stall warning messages
rcu: Add RCU stall diagnosis information
sched: Add helper nr_context_switches_cpu()
...
- Some smaller fixes
Merge tag 'x86_vdso_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 vdso updates from Borislav Petkov:
- Add getcpu support for the 32-bit version of the vDSO
- Some smaller fixes
* tag 'x86_vdso_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso: Fix -Wmissing-prototypes warnings
x86/vdso: Fake 32bit VDSO build on 64bit compile for vgetcpu
selftests: Emit a warning if getcpu() is missing on 32bit
x86/vdso: Provide getcpu for x86-32.
x86/cpu: Provide the full setup for getcpu() on x86-32
x86/vdso: Move VDSO image init to vdso2c generated code
tail calls properly which can be static calls too
- Add proper struct alt_instr.flags which controls different aspects of
insn patching behavior
Merge tag 'x86_alternatives_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm alternatives updates from Borislav Petkov:
- Teach the static_call patching infrastructure to handle conditional
tail calls properly which can be static calls too
- Add proper struct alt_instr.flags which controls different aspects of
insn patching behavior
* tag 'x86_alternatives_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/static_call: Add support for Jcc tail-calls
x86/alternatives: Teach text_poke_bp() to patch Jcc.d32 instructions
x86/alternatives: Introduce int3_emulate_jcc()
x86/alternatives: Add alt_instr.flags
When a devlink instance is put into a network namespace and that
namespace gets deleted, the devlink instance is moved back into init_ns.
This is done as a part of cleanup_net() routine. Since cleanup_net()
is called asynchronously from workqueue, there is no guarantee that
the devlink instance move is done after "ip netns del" returns.
So fix this race by making sure that the devlink instance is present
before any other operation.
Reported-by: Amir Tzin <amirtz@nvidia.com>
Fixes: b74c37fd35 ("selftests: netdevsim: add tests for devlink reload with resources")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://lore.kernel.org/r/20230220132336.198597-1-jiri@resnulli.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Usage of `set -e` before executing a command causes immediate exit
on failure, without cleaning up the resources allocated at setup.
This can affect the next tests that use the same resources,
leading to a chain of failures.
A simple fix is to always call the cleanup function when the script exits.
This approach is already used by other existing tests.
Fixes: 1056691b26 ("selftests: fib_tests: Make test results more verbose")
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Link: https://lore.kernel.org/r/20230220110400.26737-2-roxana.nicolescu@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
- Improve the scalability of the CFS bandwidth unthrottling logic
with a large number of CPUs.
- Fix & rework various cpuidle routines, simplify interaction with
the generic scheduler code. Add __cpuidle methods as noinstr to
objtool's noinstr detection and fix boatloads of cpuidle bugs & quirks.
- Add new ABI: introduce MEMBARRIER_CMD_GET_REGISTRATIONS,
to query previously issued registrations.
- Limit scheduler slice duration to the sysctl_sched_latency period,
to improve scheduling granularity with a large number of SCHED_IDLE
tasks.
- Debuggability enhancement on sys_exit(): warn about disabled IRQs,
but also enable them to prevent a cascade of followup problems and
repeat warnings.
- Fix the rescheduling logic in prio_changed_dl().
- Micro-optimize cpufreq and sched-util methods.
- Micro-optimize ttwu_runnable()
- Micro-optimize the idle-scanning in update_numa_stats(),
select_idle_capacity() and steal_cookie_task().
- Update the RSEQ code & self-tests
- Constify various scheduler methods
- Remove unused methods
- Refine __init tags
- Documentation updates
- ... Misc other cleanups, fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Merge tag 'sched-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
- Improve the scalability of the CFS bandwidth unthrottling logic with
a large number of CPUs.
- Fix & rework various cpuidle routines, simplify interaction with the
generic scheduler code. Add __cpuidle methods as noinstr to objtool's
noinstr detection and fix boatloads of cpuidle bugs & quirks.
- Add new ABI: introduce MEMBARRIER_CMD_GET_REGISTRATIONS, to query
previously issued registrations.
- Limit scheduler slice duration to the sysctl_sched_latency period, to
improve scheduling granularity with a large number of SCHED_IDLE
tasks.
- Debuggability enhancement on sys_exit(): warn about disabled IRQs,
but also enable them to prevent a cascade of followup problems and
repeat warnings.
- Fix the rescheduling logic in prio_changed_dl().
- Micro-optimize cpufreq and sched-util methods.
- Micro-optimize ttwu_runnable()
- Micro-optimize the idle-scanning in update_numa_stats(),
select_idle_capacity() and steal_cookie_task().
- Update the RSEQ code & self-tests
- Constify various scheduler methods
- Remove unused methods
- Refine __init tags
- Documentation updates
- Misc other cleanups, fixes
* tag 'sched-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (110 commits)
sched/rt: pick_next_rt_entity(): check list_entry
sched/deadline: Add more reschedule cases to prio_changed_dl()
sched/fair: sanitize vruntime of entity being placed
sched/fair: Remove capacity inversion detection
sched/fair: unlink misfit task from cpu overutilized
objtool: mem*() are not uaccess safe
cpuidle: Fix poll_idle() noinstr annotation
sched/clock: Make local_clock() noinstr
sched/clock/x86: Mark sched_clock() noinstr
x86/pvclock: Improve atomic update of last_value in pvclock_clocksource_read()
x86/atomics: Always inline arch_atomic64*()
cpuidle: tracing, preempt: Squash _rcuidle tracing
cpuidle: tracing: Warn about !rcu_is_watching()
cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG
cpuidle: drivers: firmware: psci: Dont instrument suspend code
KVM: selftests: Fix build of rseq test
exit: Detect and fix irq disabled state in oops
cpuidle, arm64: Fix the ARM64 cpuidle logic
cpuidle: mvebu: Fix duplicate flags assignment
sched/fair: Limit sched slice duration
...
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2023-02-17
We've added 64 non-merge commits during the last 7 day(s) which contain
a total of 158 files changed, 4190 insertions(+), 988 deletions(-).
The main changes are:
1) Add a rbtree data structure following the "next-gen data structure"
precedent set by recently-added linked-list, that is, by using
kfunc + kptr instead of adding a new BPF map type, from Dave Marchevsky.
2) Add a new benchmark for hashmap lookups to BPF selftests,
from Anton Protopopov.
3) Fix bpf_fib_lookup to only return valid neighbors and add an option
to skip the neigh table lookup, from Martin KaFai Lau.
4) Add cgroup.memory=nobpf kernel parameter option to disable BPF memory
accounting for container environments, from Yafang Shao.
5) Batch of ice multi-buffer and driver performance fixes,
from Alexander Lobakin.
6) Fix a bug in determining whether global subprog's argument is
PTR_TO_CTX, which is based on type names which breaks kprobe progs,
from Andrii Nakryiko.
7) Prep work for future -mcpu=v4 LLVM option which includes usage of
BPF_ST insn. Thus improve BPF_ST-related value tracking in verifier,
from Eduard Zingerman.
8) More prep work for later building selftests with Memory Sanitizer
in order to detect usages of undefined memory, from Ilya Leoshkevich.
9) Fix xsk sockets to check IFF_UP earlier to avoid a NULL pointer
dereference via sendmsg(), from Maciej Fijalkowski.
10) Implement BPF trampoline for RV64 JIT compiler, from Pu Lehui.
11) Fix BPF memory allocator in combination with BPF hashtab where it could
corrupt special fields e.g. used in bpf_spin_lock, from Hou Tao.
12) Fix LoongArch BPF JIT to always use 4 instructions for function
address so that instruction sequences don't change between passes,
from Hengqi Chen.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (64 commits)
selftests/bpf: Add bpf_fib_lookup test
bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup
riscv, bpf: Add bpf trampoline support for RV64
riscv, bpf: Add bpf_arch_text_poke support for RV64
riscv, bpf: Factor out emit_call for kernel and bpf context
riscv: Extend patch_text for multiple instructions
Revert "bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES"
selftests/bpf: Add global subprog context passing tests
selftests/bpf: Convert test_global_funcs test to test_loader framework
bpf: Fix global subprog context argument resolution logic
LoongArch, bpf: Use 4 instructions for function address in JIT
bpf: bpf_fib_lookup should not return neigh in NUD_FAILED state
bpf: Disable bh in bpf_test_run for xdp and tc prog
xsk: check IFF_UP earlier in Tx path
Fix typos in selftest/bpf files
selftests/bpf: Use bpf_{btf,link,map,prog}_get_info_by_fd()
samples/bpf: Use bpf_{btf,link,map,prog}_get_info_by_fd()
bpftool: Use bpf_{btf,link,map,prog}_get_info_by_fd()
libbpf: Use bpf_{btf,link,map,prog}_get_info_by_fd()
libbpf: Introduce bpf_{btf,link,map,prog}_get_info_by_fd()
...
====================
Link: https://lore.kernel.org/r/20230217221737.31122-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add build options to bring the build closer to that of the Linux kernel.
This allows for testing that is closer to reality.
Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
Message-Id: <20230202104538.2041879-1-mie@igel.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There are cases where we want to show traced arguments as character
values rather than as decimal, hexadecimal or string values, for debugging
convenience. Add a new type named 'char' to do this, and a new test case
file named 'kprobe_args_char.tc' as a selftest for the char type.
For example:
If the function to be traced is 'void demo_func(char type, char *name);', we
can add a kprobe event as follows to show argument values as we want:
echo 'p:myprobe demo_func $arg1:char +0($arg2):char[5]' > kprobe_events
we will get the following trace log:
... myprobe: (demo_func+0x0/0x29) arg1='A' arg2={'b','p','f','1',''}
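The formatting in the trace line above can be mimicked in userspace with a
small sketch. This is only an illustration of the output shape, not the
kernel's implementation; the helper name format_char_array() is hypothetical,
and it assumes a NUL byte is rendered as an empty pair of quotes, as in the
log line.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not kernel code): render n bytes in the style of
 * the trace line above, e.g. {'b','p','f','1',''}.
 * Returns the number of characters written, or -1 if out is too small.
 */
static int format_char_array(const char *data, size_t n, char *out, size_t outlen)
{
	size_t pos = 0;
	size_t i;

	assert(data != NULL && out != NULL);
	if (outlen < 3)
		return -1;
	out[pos++] = '{';
	for (i = 0; i < n; i++) {
		/* each element needs at most 4 bytes, plus 2 for the final "}" and NUL */
		if (pos + 6 > outlen)
			return -1;
		if (i)
			out[pos++] = ',';
		out[pos++] = '\'';
		if (data[i] != '\0')	/* assumption: NUL is shown as an empty '' */
			out[pos++] = data[i];
		out[pos++] = '\'';
	}
	out[pos++] = '}';
	out[pos] = '\0';
	return (int)pos;
}
```

Calling it on the five bytes of "bpf1" (including the terminating NUL)
produces the same arg2 text as in the log above.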
Link: https://lore.kernel.org/all/20221219110613.367098-1-dolinux.peng@gmail.com/
Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Fix kprobe probepoint testcase to ignore __pfx_* prefix symbols. Those are
introduced by commit b341b20d64 ("x86: Add prefix symbols for function
padding") for identifying PADDING_BYTES of NOPs. Since kprobe events can
not probe these prefix symbols, this testcase has to skip those symbols.
Link: https://lore.kernel.org/all/167309835609.640500.9664678940260305746.stgit@devnote3/
Fixes: b341b20d64 ("x86: Add prefix symbols for function padding")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Fix eprobe syntax test case to check whether the kernel supports the filter
on eprobe for filter syntax test command. Without this fix, this test case
will fail if the kernel supports eprobe but doesn't support the filter on
eprobe.
Link: https://lore.kernel.org/all/167309834742.640500.379128668288448035.stgit@devnote3/
Fixes: 9e14bae7d0 ("selftests/ftrace: Add eprobe syntax error testcase")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
A lot of the tsan helpers are already exempt from the UACCESS warnings,
but some more functions were added that need the same thing:
kernel/kcsan/core.o: warning: objtool: __tsan_volatile_read16+0x0: call to __tsan_unaligned_read16() with UACCESS enabled
kernel/kcsan/core.o: warning: objtool: __tsan_volatile_write16+0x0: call to __tsan_unaligned_write16() with UACCESS enabled
vmlinux.o: warning: objtool: __tsan_unaligned_volatile_read16+0x4: call to __tsan_unaligned_read16() with UACCESS enabled
vmlinux.o: warning: objtool: __tsan_unaligned_volatile_write16+0x4: call to __tsan_unaligned_write16() with UACCESS enabled
As Marco points out, these functions don't even call each other
explicitly but instead gcc (but not clang) notices the functions
being identical and turns one symbol into a direct branch to the
other.
Link: https://lkml.kernel.org/r/20230215130058.3836177-4-arnd@kernel.org
Fixes: 75d75b7a4d ("kcsan: Support distinguishing volatile accesses")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Merge tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull vfs idmapping updates from Christian Brauner:
- Last cycle we introduced the dedicated struct mnt_idmap type for
mount idmapping and the required infrastructure in 256c8aed2b ("fs:
introduce dedicated idmap type for mounts"). As promised in last
cycle's pull request message this converts everything to rely on
struct mnt_idmap.
Currently we still pass around the plain namespace that was attached
to a mount. This is in general pretty convenient but it makes it easy
to conflate namespaces that are relevant on the filesystem with
namespaces that are relevant on the mount level. Especially for
non-vfs developers without detailed knowledge in this area this was a
potential source for bugs.
This finishes the conversion. Instead of passing the plain namespace
around this updates all places that currently take a pointer to a
mnt_userns with a pointer to struct mnt_idmap.
Now that the conversion is done all helpers down to the really
low-level helpers only accept a struct mnt_idmap argument instead of
two namespace arguments.
Conflating mount and other idmappings will now cause the compiler to
complain loudly thus eliminating the possibility of any bugs. This
makes it impossible for filesystem developers to mix up mount and
filesystem idmappings as they are two distinct types and require
distinct helpers that cannot be used interchangeably.
Everything associated with struct mnt_idmap is moved into a single
separate file. With that change no code can poke around in struct
mnt_idmap. It can only be interacted with through dedicated helpers.
That means all filesystems are and all of the vfs is completely
oblivious to the actual implementation of idmappings.
We are now also able to extend struct mnt_idmap as we see fit. For
example, we can decouple it completely from namespaces for users that
don't require or don't want to use them at all. We can also extend
the concept of idmappings so we can cover filesystem specific
requirements.
In combination with the vfs{g,u}id_t work we finished in v6.2 this
makes this feature substantially more robust and thus difficult to
implement wrong by a given filesystem and also protects the vfs.
- Enable idmapped mounts for tmpfs and fulfill a longstanding request.
A long-standing request from users had been to make it possible to
create idmapped mounts for tmpfs. For example, to share the host's
tmpfs mount between multiple sandboxes. This is a prerequisite for
some advanced Kubernetes cases. Systemd also has a range of use-cases
to increase service isolation. And there are more users of this.
However, with all of the other work going on this was way down on the
priority list but luckily someone other than ourselves picked this
up.
As usual the patch is tiny as all the infrastructure work had been
done multiple kernel releases ago. In addition to all the tests that
we already have I requested that Rodrigo add a dedicated tmpfs
testsuite for idmapped mounts to xfstests. It is to be included into
xfstests during the v6.3 development cycle. This should add a slew of
additional tests.
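The point about distinct types can be sketched in plain C. Everything below
is a hypothetical stand-in (the demo_* names and the additive "shift" mapping
are assumptions made for illustration, not the kernel's definitions); it only
shows why a helper that takes one struct pointer type cannot silently be
handed the other.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's types. */
struct demo_mnt_idmap { unsigned int shift; };	/* mount-level mapping */
struct demo_fs_userns { unsigned int shift; };	/* filesystem-level mapping */

typedef struct { unsigned int val; } demo_kuid_t;
typedef struct { unsigned int val; } demo_vfsuid_t;

/*
 * Accepts only a mount idmapping. Passing a struct demo_fs_userns *
 * here draws a compiler diagnostic (incompatible pointer type), even
 * though both structs have the same layout.
 */
static demo_vfsuid_t demo_make_vfsuid(const struct demo_mnt_idmap *idmap,
				      demo_kuid_t kuid)
{
	demo_vfsuid_t v;

	assert(idmap != NULL);
	v.val = kuid.val + idmap->shift;	/* toy mapping for the sketch */
	return v;
}
```

Wrapping the ids themselves in single-member structs (demo_kuid_t,
demo_vfsuid_t) gives the same protection for values as the distinct
struct pointers give for mappings.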
* tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (26 commits)
shmem: support idmapped mounts for tmpfs
fs: move mnt_idmap
fs: port vfs{g,u}id helpers to mnt_idmap
fs: port fs{g,u}id helpers to mnt_idmap
fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap
fs: port i_{g,u}id_{needs_}update() to mnt_idmap
quota: port to mnt_idmap
fs: port privilege checking helpers to mnt_idmap
fs: port inode_owner_or_capable() to mnt_idmap
fs: port inode_init_owner() to mnt_idmap
fs: port acl to mnt_idmap
fs: port xattr to mnt_idmap
fs: port ->permission() to pass mnt_idmap
fs: port ->fileattr_set() to pass mnt_idmap
fs: port ->set_acl() to pass mnt_idmap
fs: port ->get_acl() to pass mnt_idmap
fs: port ->tmpfile() to pass mnt_idmap
fs: port ->rename() to pass mnt_idmap
fs: port ->mknod() to pass mnt_idmap
fs: port ->mkdir() to pass mnt_idmap
...
The do_send_email() will call die before restoring stty if the sendmail
setting is not correct or sendmail is not installed. It is safer to
restore it in the beginning of dodie().
Link: https://lkml.kernel.org/r/167420617635.2988775.13045295332829029437.stgit@devnote3
Cc: John 'Warthog9' Hawley <warthog9@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
There is a disconnect between the run_command function and the
wait_for_input. The wait_for_input has a default timeout of 2 minutes. But
if that happens, the run_command loop will exit out to the waitpid() of
the executing command. This fails in that it no longer monitors the
command, and also, the ssh to the test box can hang when it's finished, as
it's waiting for the pipe it's writing to to flush, but the loop that
reads that pipe has already exited, leaving the command stuck, and the
test hangs.
Instead, make the default "wait_for_input" of the run_command infinite,
and allow the user to override it if they want with a default timeout
option "RUN_TIMEOUT".
This fixes the hang that happens when the pipe is full and the ssh
session never exits.
Cc: stable@vger.kernel.org
Fixes: 6e98d1b441 ("ktest: Add timeout to ssh command")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
When monitoring the console output, stdout is redirected to do
so. If Ctrl^C is hit during this mode, stdout is not given back to the
console, and the user does not see anything they type (no echo).
Add "end_monitor" to the SIGINT interrupt handler to give back the console
on Ctrl^C.
Cc: stable@vger.kernel.org
Fixes: 9f2cdcbbb9 ("ktest: Give console process a dedicated tty")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
In the "reboot" command, it does a check of the machine to see if it is
still alive with a simple "ssh echo" command. If it fails, it will assume
that a normal "ssh reboot" is not possible and force a power cycle.
In this case, the "start_monitor" is executed, but the "end_monitor" is
not, so the screen is not given back to the console. That
is, after the test, a "reset" command needs to be performed, as "echo" is
turned off.
Cc: stable@vger.kernel.org
Fixes: 6474ace999 ("ktest.pl: Powercycle the box on reboot if no connection can be made")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Explicitly check for child netns and main ns independence
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Provide a virtual cache topology to the guest to avoid
inconsistencies with migration on heterogeneous systems. Non secure
software has no practical need to traverse the caches by set/way in
the first place.
- Add support for taking stage-2 access faults in parallel. This was an
accidental omission in the original parallel faults implementation,
but should provide a marginal improvement to machines w/o FEAT_HAFDBS
(such as hardware from the fruit company).
- A preamble to adding support for nested virtualization to KVM,
including vEL2 register state, rudimentary nested exception handling
and masking unsupported features for nested guests.
- Fixes to the PSCI relay that avoid an unexpected host SVE trap when
resuming a CPU when running pKVM.
- VGIC maintenance interrupt support for the AIC
- Improvements to the arch timer emulation, primarily aimed at reducing
the trap overhead of running nested.
- Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the
interest of CI systems.
- Avoid VM-wide stop-the-world operations when a vCPU accesses its own
redistributor.
- Serialize when toggling CPACR_EL1.SMEN to avoid unexpected exceptions
in the host.
- Aesthetic and comment/kerneldoc fixes
- Drop the vestiges of the old Columbia mailing list and add myself as
co-maintainer
This also drags in a couple of branches to avoid conflicts:
- The shared 'kvm-hw-enable-refactor' branch that reworks
initialization, as it conflicted with the virtual cache topology
changes.
- arm64's 'for-next/sme2' branch, as both it and the PSCI relay changes
touched the EL2 initialization code.
Merge tag 'kvmarm-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 6.3
- Provide a virtual cache topology to the guest to avoid
inconsistencies with migration on heterogeneous systems. Non secure
software has no practical need to traverse the caches by set/way in
the first place.
- Add support for taking stage-2 access faults in parallel. This was an
accidental omission in the original parallel faults implementation,
but should provide a marginal improvement to machines w/o FEAT_HAFDBS
(such as hardware from the fruit company).
- A preamble to adding support for nested virtualization to KVM,
including vEL2 register state, rudimentary nested exception handling
and masking unsupported features for nested guests.
- Fixes to the PSCI relay that avoid an unexpected host SVE trap when
resuming a CPU when running pKVM.
- VGIC maintenance interrupt support for the AIC
- Improvements to the arch timer emulation, primarily aimed at reducing
the trap overhead of running nested.
- Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the
interest of CI systems.
- Avoid VM-wide stop-the-world operations when a vCPU accesses its own
redistributor.
- Serialize when toggling CPACR_EL1.SMEN to avoid unexpected exceptions
in the host.
- Aesthetic and comment/kerneldoc fixes
- Drop the vestiges of the old Columbia mailing list and add [Oliver]
as co-maintainer
This also drags in arm64's 'for-next/sme2' branch, because both it and
the PSCI relay changes touch the EL2 initialization code.
Data passed to user-space with a (SOL_UDP, UDP_GRO) cmsg carries an
int (see udp_cmsg_recv), not a u16 value, as strace confirms:
recvmsg(8, {msg_name=...,
msg_iov=[{iov_base="\0\0..."..., iov_len=96000}],
msg_iovlen=1,
msg_control=[{cmsg_len=20, <-- sizeof(cmsghdr) + 4
cmsg_level=SOL_UDP,
cmsg_type=0x68}], <-- UDP_GRO
msg_controllen=24,
msg_flags=0}, 0) = 11200
Interpreting the data as a u16 value won't work on big-endian platforms.
Since it is too late to back out of this API decision [1], fix the test.
[1]: https://lore.kernel.org/netdev/20230131174601.203127-1-jakub@cloudflare.com/
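A minimal sketch of reading the payload as an int, as described above. The
helper names udp_gro_size() and fake_gro_roundtrip() are hypothetical and are
not taken from the selftest; the fallback defines assume the Linux values
(SOL_UDP is 17, UDP_GRO is 0x68 as in the strace output).

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SOL_UDP
#define SOL_UDP 17
#endif
#ifndef UDP_GRO
#define UDP_GRO 104	/* 0x68, the cmsg_type seen in the strace output */
#endif

/* Return the size carried in a (SOL_UDP, UDP_GRO) cmsg, or -1 if absent. */
static int udp_gro_size(struct msghdr *msg)
{
	struct cmsghdr *cm;
	int gso_size = -1;

	for (cm = CMSG_FIRSTHDR(msg); cm; cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_level == SOL_UDP && cm->cmsg_type == UDP_GRO) {
			/* The payload is an int; copying only 2 bytes into a
			 * u16 would pick up the wrong half on big-endian. */
			memcpy(&gso_size, CMSG_DATA(cm), sizeof(gso_size));
			break;
		}
	}
	return gso_size;
}

/* Hypothetical helper: build a control buffer holding one int-sized
 * UDP_GRO cmsg and read it back through udp_gro_size(). */
static int fake_gro_roundtrip(int value)
{
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;	/* forces cmsghdr alignment */
	} u;
	struct msghdr msg;
	struct cmsghdr *cm;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cm = CMSG_FIRSTHDR(&msg);
	assert(cm != NULL);
	cm->cmsg_level = SOL_UDP;
	cm->cmsg_type = UDP_GRO;
	cm->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cm), &value, sizeof(value));

	return udp_gro_size(&msg);
}
```

Using memcpy() into an int keeps the read independent of both alignment and
byte order, which is the property the u16 interpretation lacked.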
Fixes: 3327a9c463 ("selftests: add functionals test for UDP GRO")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The canonical location for the tracefs filesystem is at /sys/kernel/tracing.
But, from Documentation/trace/ftrace.rst:
Before 4.1, all ftrace tracing control files were within the debugfs
file system, which is typically located at /sys/kernel/debug/tracing.
For backward compatibility, when mounting the debugfs file system,
the tracefs file system will be automatically mounted at:
/sys/kernel/debug/tracing
Many comments and Kconfig help messages in the tracing code still refer
to this older debugfs path, so let's update them to avoid confusion.
Link: https://lore.kernel.org/linux-trace-kernel/20230215223350.2658616-2-zwisler@google.com
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Signed-off-by: Ross Zwisler <zwisler@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
This patch tests the bpf_fib_lookup helper when looking up
a neigh in NUD_FAILED and NUD_STALE state. It also adds test
for the new BPF_FIB_LOOKUP_SKIP_NEIGH flag.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230217205515.3583372-2-martin.lau@linux.dev
The bpf_fib_lookup() also looks up the neigh table.
This was done before bpf_redirect_neigh() was added.
In the use case that does not manage the neigh table
and requires bpf_fib_lookup() to look up a fib to
decide if it needs to redirect or not, the bpf prog can
depend only on using bpf_redirect_neigh() to look up the
neigh. It also keeps the neigh entries fresh and connected.
This patch adds a bpf_fib_lookup flag, SKIP_NEIGH, to avoid
the double neigh lookup when the bpf prog always calls
bpf_redirect_neigh() to do the neigh lookup. The params->smac
output is skipped together when SKIP_NEIGH is set because
bpf_redirect_neigh() will figure out the smac also.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230217205515.3583372-1-martin.lau@linux.dev
Testcase stat_all_metrics.sh fails on powerpc:
98: perf all metrics test : FAILED!
Logs with verbose:
[command]# ./perf test 98 -vv
98: perf all metrics test :
--- start ---
test child forked, pid 13262
Testing BRU_STALL_CPI
Testing COMPLETION_STALL_CPI
----
Testing TOTAL_LOCAL_NODE_PUMPS_P23
Metric 'TOTAL_LOCAL_NODE_PUMPS_P23' not printed in:
Error:
Invalid event (hv_24x7/PM_PB_LNS_PUMP23,chip=3/) in per-thread mode, enable system wide with '-a'.
Testing TOTAL_LOCAL_NODE_PUMPS_RETRIES_P01
Metric 'TOTAL_LOCAL_NODE_PUMPS_RETRIES_P01' not printed in:
Error:
Invalid event (hv_24x7/PM_PB_RTY_LNS_PUMP01,chip=3/) in per-thread mode, enable system wide with '-a'.
----
Based on the above logs, some of the hv-24x7 metric events
fail, and the logs suggest running the metric event with the -a option. This
change happened after the commit a4b8cfcabb ("perf stat: Delay
metric parsing"), which delayed the metric parsing phase; now, before the
metric parsing phase, the perf tool identifies whether the target is
system-wide or not. With this change, perf_event_open fails with workload
monitoring for uncore events, as expected.
The perf all metrics test case fails as some of the hv-24x7 metric events
may need a bigger workload with system-wide monitoring to get the data.
Fix this issue by changing the current system-wide check from the "true"
workload to a "sleep 0.01" workload.
Result with the patch changes on powerpc:
98: perf all metrics test : Ok
Fixes: a4b8cfcabb ("perf stat: Delay metric parsing")
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230215093827.124921-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add tests validating that it's possible to pass context arguments into
global subprogs for various types of programs, including the particularly
tricky KPROBE programs (which cover kprobes, uprobes, USDTs, a vast and
important class of programs).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20230216045954.3002473-4-andrii@kernel.org
Convert 17 test_global_funcs subtests into test_loader framework for
easier maintenance and more declarative way to define expected
failures/successes.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20230216045954.3002473-3-andrii@kernel.org
Power10 Performance Monitoring Unit (PMU) provides events to understand
stall cycles of different pipeline stages. These events, along with
completed instructions, provide useful metrics for application tuning.
This patch implements the JSON changes to collect counter statistics to
present the high level CPI stall breakdown metrics. The new metric group is
named "CPI_STALL_RATIO" and it presents these
stall metrics:
- DISPATCHED_CPI ( Dispatch stall cycles per insn )
- ISSUE_STALL_CPI ( Issue stall cycles per insn )
- EXECUTION_STALL_CPI ( Execution stall cycles per insn )
- COMPLETION_STALL_CPI ( Completion stall cycles per insn )
To avoid multiplexing of events, the PM_RUN_INST_CMPL event has been modified
to use PMC5 (performance monitoring counter 5) instead of PMC4. This
change is needed, since the completion stall event is using PMC4.
Usage example:
./perf stat --metric-no-group -M CPI_STALL_RATIO <workload>
Performance counter stats for 'workload':
63,056,817,982 PM_CMPL_STALL # 0.28 COMPLETION_STALL_CPI
1,743,988,038,896 PM_ISSUE_STALL # 7.73 ISSUE_STALL_CPI
225,597,495,030 PM_RUN_INST_CMPL # 6.18 DISPATCHED_CPI
# 37.48 EXECUTION_STALL_CPI
1,393,916,546,654 PM_DISP_STALL_CYC
8,455,376,836,463 PM_EXEC_STALL
"--metric-no-group" is used for forcing PM_RUN_INST_CMPL to be scheduled
in all groups for more accuracy.
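As a worked check of the sample output above, the following sketch assumes
each metric is simply the stall-cycle count divided by PM_RUN_INST_CMPL
(this is inferred from the "stall cycles per insn" descriptions, not copied
from the JSON metric definitions). The helper name stall_cpi() is
hypothetical.

```c
#include <assert.h>

/*
 * Assumed definition: stall cycles divided by completed instructions.
 * Returns 0.0 when no instructions were counted.
 */
static double stall_cpi(unsigned long long stall_cycles,
			unsigned long long run_inst_cmpl)
{
	if (run_inst_cmpl == 0)
		return 0.0;
	return (double)stall_cycles / (double)run_inst_cmpl;
}
```

With the counts from the example run, stall_cpi(63056817982ULL,
225597495030ULL) rounds to 0.28, and the PM_ISSUE_STALL, PM_DISP_STALL_CYC
and PM_EXEC_STALL counts give about 7.73, 6.18 and 37.48 respectively,
matching the printed metric values.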
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230216061240.18067-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There is no good reason why we cannot synthesize "cycle" events from
Intel PT just as we can synthesize "instruction" events, in particular
when CYC packets are available. This enables using PT to get much
more accurate cycle profiles than regular sampling (record -e cycles)
when the work lasts for very short periods (<10 ms). Thus, add support
for this, based off of the existing IPC calculation framework. The new
option to --itrace is "y" (for cYcles), as c was taken for calls. Cycle
and instruction events can be synthesized together, and are by default.
The only real caveat is that CYC packets are only emitted whenever some
other packet is, which in practice is when a branch instruction is
encountered (and not even all branches). Thus, even at no subsampling
(e.g. --itrace=y0ns), it is impossible to get more accuracy than a
single basic block, and all cycles spent executing that block will get
attributed to the branch instruction that ends the packet. Thus, one
cannot know whether the cycles came from e.g. a specific load, a
mispredicted branch, or something else. When subsampling (which is the
default), the cycle events will get smeared out even more, but will
still be generally useful to attribute cycle counts to functions.
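The attribution caveat can be illustrated with a toy model. This is only a sketch of the effect described above, not of the Intel PT decoder: the trace format, the addresses and the per-instruction cycle costs are all made up for illustration.

```python
from collections import Counter


def attribute_cycles(instructions):
    """Attribute cycles to the instruction at which a CYC packet is emitted.

    instructions: list of (ip, cycles, emits_cyc_packet) tuples (hypothetical
    format). Cycles accumulate until an instruction that emits a packet.
    """
    samples = Counter()
    pending = 0
    for ip, cycles, emits_cyc_packet in instructions:
        pending += cycles
        if emits_cyc_packet:
            samples[ip] += pending  # the whole block lands on this ip
            pending = 0
    return samples


trace = [
    (0x1000, 1, False),   # cheap ALU instruction
    (0x1004, 30, False),  # slow load: the real source of the cycles
    (0x1008, 1, True),    # branch ending the basic block
    (0x2000, 2, False),
    (0x2004, 1, True),    # next branch
]
profile = attribute_cycles(trace)
```

In this model the 30-cycle load at 0x1004 never appears in the profile; its cost is reported at the branch at 0x1008.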
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Steinar H. Gunderson <sesse@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220322082452.1429091-1-sesse@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Run a spell checker on files in selftest/bpf and fix the typos found.
Signed-off-by: Taichi Nishimura <awkrail01@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/bpf/20230216085537.519062-1-awkrail01@gmail.com
Use the new type-safe wrappers around bpf_obj_get_info_by_fd().
Fix a prog/map mixup in prog_holds_map().
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230214231221.249277-6-iii@linux.ibm.com
Use the new type-safe wrappers around bpf_obj_get_info_by_fd().
Split the bpf_obj_get_info_by_fd() call in build_btf_type_table() in
two, since knowing the type helps with the Memory Sanitizer.
Improve map_parse_fd_and_info() type safety by using
struct bpf_map_info * instead of void * for info.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20230214231221.249277-4-iii@linux.ibm.com
These are type-safe wrappers around bpf_obj_get_info_by_fd(). They
found one problem in selftests, and are also useful for adding
Memory Sanitizer annotations.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230214231221.249277-2-iii@linux.ibm.com
Merge tag 'net-6.2-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Fixes from the main networking tree only, probably because all
sub-trees have backed off and haven't submitted their changes.
None of the fixes here are particularly scary and no outstanding
regressions. In an ideal world the "current release" sections would be
empty at this stage but that never happens.
Current release - regressions:
- fix unwanted sign extension in netdev_stats_to_stats64()
Current release - new code bugs:
- initialize net->notrefcnt_tracker earlier
- devlink: fix netdev notifier chain corruption
- nfp: make sure mbox accesses in IPsec code are atomic
- ice: fix check for weight and priority of a scheduling node
Previous releases - regressions:
- ice: xsk: fix cleaning of XDP_TX frame, prevent inf loop
- igb: fix I2C bit banging config with external thermal sensor
Previous releases - always broken:
- sched: tcindex: update imperfect hash filters respecting rcu
- mpls: fix stale pointer if allocation fails during device rename
- dccp/tcp: avoid negative sk_forward_alloc by ipv6_pinfo.pktoptions
- remove WARN_ON_ONCE(sk->sk_forward_alloc) from
sk_stream_kill_queues()
- af_key: fix heap information leak
- ipv6: fix socket connection with DSCP (correct interpretation of
the tclass field vs fib rule matching)
- tipc: fix kernel warning when sending SYN message
- vmxnet3: read RSS information from the correct descriptor (eop)"
* tag 'net-6.2-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (35 commits)
devlink: Fix netdev notifier chain corruption
igb: conditionalize I2C bit banging on external thermal sensor support
net: mpls: fix stale pointer if allocation fails during device rename
net/sched: tcindex: search key must be 16 bits
tipc: fix kernel warning when sending SYN message
igb: Fix PPS input and output using 3rd and 4th SDP
net: use a bounce buffer for copying skb->mark
ixgbe: add double of VLAN header when computing the max MTU
i40e: add double of VLAN header when computing the max MTU
ixgbe: allow to increase MTU to 3K with XDP enabled
net: stmmac: Restrict warning on disabling DMA store and fwd mode
net/sched: act_ctinfo: use percpu stats
net: stmmac: fix order of dwmac5 FlexPPS parametrization sequence
ice: fix lost multicast packets in promisc mode
ice: Fix check for weight and priority of a scheduling node
bnxt_en: Fix mqprio and XDP ring checking logic
net: Fix unwanted sign extension in netdev_stats_to_stats64()
net/usb: kalmia: Don't pass act_len in usb_bulk_msg error path
net: openvswitch: fix possible memory leak in ovs_meter_cmd_set()
af_key: Fix heap information leak
...
Many platforms have an adjacent cacheline prefetch feature. When it is
enabled, data in RAM is handled at a granularity of 2 cachelines (2N and
2N+1): if one is fetched into the cache, the other is likely to be fetched
too, which effectively doubles the cacheline size, so false sharing can
also happen between adjacent cachelines.
0Day has captured performance changes related to this [1], and some
commercial software explicitly makes its hot global variables 128 bytes
aligned (2 cache lines) to avoid this kind of extended false sharing.
So add an option "--double-cl" for 'perf c2c report' to show false
sharing in double cache line granularity, which acts just like the
cacheline size is doubled. There is no change to c2c record. The
hardware events of shared cacheline are still per cacheline, and this
option just changes the granularity of how events are grouped and
displayed.
In the 'perf c2c report' output below (will-it-scale's 'pagefault2' case
on old kernel):
----------------------------------------------------------------------
26 31 2 0 0 0 0xffff888103ec6000
----------------------------------------------------------------------
35.48% 50.00% 0.00% 0.00% 0.00% 0x10 0 1 0xffffffff8133148b 1153 66 971 3748 74 [k] get_mem_cgroup_from_mm
6.45% 0.00% 0.00% 0.00% 0.00% 0x10 0 1 0xffffffff813396e4 570 0 1531 879 75 [k] mem_cgroup_charge
25.81% 50.00% 0.00% 0.00% 0.00% 0x54 0 1 0xffffffff81331472 949 70 593 3359 74 [k] get_mem_cgroup_from_mm
19.35% 0.00% 0.00% 0.00% 0.00% 0x54 0 1 0xffffffff81339686 1352 0 1073 1022 74 [k] mem_cgroup_charge
9.68% 0.00% 0.00% 0.00% 0.00% 0x54 0 1 0xffffffff813396d6 1401 0 863 768 74 [k] mem_cgroup_charge
3.23% 0.00% 0.00% 0.00% 0.00% 0x54 0 1 0xffffffff81333106 618 0 804 11 9 [k] uncharge_batch
The offsets 0x10 and 0x54 used to be displayed in 2 groups, and now they are
listed together to give users a hint of extended false sharing.
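The regrouping can be sketched as follows. This is a minimal model of the idea, assuming a 64-byte cacheline; the function name and parameters are illustrative and are not perf's internal API.

```python
from collections import defaultdict


def group_by_cacheline(addresses, line_size=64, double_cl=False):
    """Group addresses by (optionally doubled) cacheline base address."""
    size = line_size * 2 if double_cl else line_size
    groups = defaultdict(list)
    for addr in addresses:
        base = addr & ~(size - 1)          # align down to the line start
        groups[base].append(addr - base)   # offset within the line
    return dict(groups)


base = 0xffff888103ec6000                  # cacheline address from the report
accesses = [base + 0x10, base + 0x54]      # the two offsets shown above

single = group_by_cacheline(accesses)                  # per-cacheline view
double = group_by_cacheline(accesses, double_cl=True)  # --double-cl view
```

With 64-byte lines the two accesses fall into different groups (0x54 lies in the second line), while with the doubled granularity both appear under the same 128-byte aligned address.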
[1]. https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/
Committer notes:
Link: https://lore.kernel.org/r/Y+wvVNWqXb70l4uy@feng-clx
Removed -a, leaving just --double-cl, as this probably is not used so
frequently and perhaps will be even auto-detected if we manage to record
the MSR where this is configured.
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Joe Mario <jmario@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230214075823.246414-1-feng.tang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This selftest is designed for testing the PSP flavor in SRv6 End behavior.
It instantiates a virtual network composed of several nodes: hosts and
SRv6 routers. Each node is realized using a network namespace that is
properly interconnected to others through veth pairs.
The test makes use of the SRv6 End behavior and of the PSP flavor needed
for removing the SRH from the IPv6 header at the penultimate node.
The correct execution of the behavior is verified through reachability
tests carried out between hosts.
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: Paolo Lungaroni <paolo.lungaroni@uniroma2.it>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The rsvp classifier has served us well for about a quarter of a century but
has not been getting much maintenance attention due to lack of known users.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The tcindex classifier has served us well for about a quarter of a century
but has not been getting much TLC due to lack of known users. Most recently
it has become easy prey to syzkaller. For this reason, we are retiring it.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The dsmark qdisc has served us well over the years for diffserv but has not
been getting much attention due to other more popular approaches to do diffserv
services. Most recently it has become a shooting target for syzkaller. For this
reason, we are retiring it.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The ATM qdisc has served us well over the years but has not been getting much
TLC due to lack of known users. Most recently it has become a shooting target
for syzkaller. For this reason, we are retiring it.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
While this amazing qdisc has served us well over the years it has not been
getting any tender love and care and has bitrotted over time.
It has become mostly a shooting target for syzkaller lately.
For this reason, we are retiring it. Goodbye CBQ - we loved you.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Remove temporary files created by 'mirred_egress_to_ingress_tcp' test
in the cleanup() handler. Also, change variable names to avoid clashing
with globals from lib.sh.
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/091649045a017fc00095ecbb75884e5681f7025f.1676368027.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
&xdp_buff and &xdp_frame are bound in a way that
xdp_buff->data_hard_start == xdp_frame
It's always the case and e.g. xdp_convert_buff_to_frame() relies on
this.
IOW, the following:
	for (u32 i = 0; i < 0xdead; i++) {
		xdpf = xdp_convert_buff_to_frame(&xdp);
		xdp_convert_frame_to_buff(xdpf, &xdp);
	}
shouldn't ever modify @xdpf's contents or the pointer itself.
However, "live packet" code wrongly treats &xdp_frame as part of its
context placed *before* the data_hard_start. With such flow,
data_hard_start is sizeof(*xdpf) off to the right and no longer points
to the XDP frame.
Instead of replacing `sizeof(ctx)` with `offsetof(ctx, xdpf)` in several
places and praying that there are no more miscalcs left somewhere in the
code, unionize ::frm with ::data in a flex array, so that both start
pointing to the actual data_hard_start and the XDP frame actually starts
being a part of it, i.e. a part of the headroom, not the context.
A nice side effect is that the maximum frame size for this mode gets
increased by 40 bytes, as xdp_buff::frame_sz includes everything from
data_hard_start (-> includes xdpf already) to the end of XDP/skb shared
info.
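The difference between the two layouts can be modelled with plain offsets. This is a sketch only: FRAME_SIZE is assumed to equal the 40-byte gain quoted above, OTHER_CTX_SIZE and the page address are hypothetical, and frame_sz is simplified to "everything from data_hard_start to the end of the page".

```python
PAGE_SIZE = 4096      # 4k pages, as assumed for MAX_PKT_SIZE
FRAME_SIZE = 40       # assumed sizeof(struct xdp_frame), matching the 40-byte gain
OTHER_CTX_SIZE = 64   # hypothetical size of the remaining context fields


def old_layout(page):
    """Frame treated as context placed before data_hard_start."""
    frame = page + OTHER_CTX_SIZE
    data_hard_start = frame + FRAME_SIZE
    frame_sz = PAGE_SIZE - (data_hard_start - page)
    return frame, data_hard_start, frame_sz


def new_layout(page):
    """Frame unionized with the data array, i.e. part of the headroom."""
    frame = page + OTHER_CTX_SIZE
    data_hard_start = frame
    frame_sz = PAGE_SIZE - (data_hard_start - page)
    return frame, data_hard_start, frame_sz


old = old_layout(0x10000)
new = new_layout(0x10000)
```

In the old model the frame address and data_hard_start differ by FRAME_SIZE, so the invariant xdp_buff->data_hard_start == xdp_frame is violated. In the new model they coincide and frame_sz grows by the same amount.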
Also update %MAX_PKT_SIZE accordingly in the selftests code. Leave it
hardcoded for 64 bit && 4k pages, it can be made more flexible later on.
Minor: align `&head->data` with how `head->frm` is assigned for
consistency.
Minor #2: rename 'frm' to 'frame' in &xdp_page_head while at it for
clarity.
(was found while testing XDP traffic generator on ice, which calls
xdp_convert_frame_to_buff() for each XDP frame)
Fixes: b530e9e106 ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20230215185440.4126672-1-aleksander.lobakin@intel.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Add a new benchmark which measures hashmap lookup operations speed. A user can
control the following parameters of the benchmark:
* key_size (max 1024): the key size to use
* max_entries: the hashmap max entries
* nr_entries: the number of entries to insert/lookup
* nr_loops: the number of loops for the benchmark
* map_flags: the hashmap flags passed to BPF_MAP_CREATE
The BPF program performing the benchmarks calls two nested bpf_loop:
    bpf_loop(nr_loops/nr_entries)
            bpf_loop(nr_entries)
                     bpf_map_lookup()
So the nr_loops determines the number of actual map lookups. All lookups are
successful.
Example (the output is generated on an AMD Ryzen 9 3950X machine):
for nr_entries in `seq 4096 4096 65536`; do echo -n "$((nr_entries*100/65536))% full: "; sudo ./bench -d2 -a bpf-hashmap-lookup --key_size=4 --nr_entries=$nr_entries --max_entries=65536 --nr_loops=1000000 --map_flags=0x40 | grep cpu; done
6% full: cpu01: lookup 50.739M ± 0.018M events/sec (approximated from 32 samples of ~19ms)
12% full: cpu01: lookup 47.751M ± 0.015M events/sec (approximated from 32 samples of ~20ms)
18% full: cpu01: lookup 45.153M ± 0.013M events/sec (approximated from 32 samples of ~22ms)
25% full: cpu01: lookup 43.826M ± 0.014M events/sec (approximated from 32 samples of ~22ms)
31% full: cpu01: lookup 41.971M ± 0.012M events/sec (approximated from 32 samples of ~23ms)
37% full: cpu01: lookup 41.034M ± 0.015M events/sec (approximated from 32 samples of ~24ms)
43% full: cpu01: lookup 39.946M ± 0.012M events/sec (approximated from 32 samples of ~25ms)
50% full: cpu01: lookup 38.256M ± 0.014M events/sec (approximated from 32 samples of ~26ms)
56% full: cpu01: lookup 36.580M ± 0.018M events/sec (approximated from 32 samples of ~27ms)
62% full: cpu01: lookup 36.252M ± 0.012M events/sec (approximated from 32 samples of ~27ms)
68% full: cpu01: lookup 35.200M ± 0.012M events/sec (approximated from 32 samples of ~28ms)
75% full: cpu01: lookup 34.061M ± 0.009M events/sec (approximated from 32 samples of ~29ms)
81% full: cpu01: lookup 34.374M ± 0.010M events/sec (approximated from 32 samples of ~29ms)
87% full: cpu01: lookup 33.244M ± 0.011M events/sec (approximated from 32 samples of ~30ms)
93% full: cpu01: lookup 32.182M ± 0.013M events/sec (approximated from 32 samples of ~31ms)
100% full: cpu01: lookup 31.497M ± 0.016M events/sec (approximated from 32 samples of ~31ms)
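The loop structure and the fill percentages printed by the shell loop can be reproduced with a few lines of arithmetic. This is a sketch: treating nr_loops/nr_entries as an integer division is an assumption, and count_lookups is a hypothetical helper, not part of the benchmark.

```python
def count_lookups(nr_loops, nr_entries):
    """Number of bpf_map_lookup() calls made by the two nested loops."""
    outer = nr_loops // nr_entries  # assumption: integer division
    return outer * nr_entries       # inner loop runs nr_entries times


MAX_ENTRIES = 65536

# Same values as `seq 4096 4096 65536` in the shell loop above.
entries = list(range(4096, MAX_ENTRIES + 1, 4096))

# Shell arithmetic $((nr_entries*100/65536)) truncates, hence "6%" for 6.25%.
fill_percent = [n * 100 // MAX_ENTRIES for n in entries]

lookups = count_lookups(1_000_000, 4096)

# Relative throughput drop between the 6% and 100% full rows of the output.
drop_percent = round((1 - 31.497 / 50.739) * 100, 1)
```

Under the integer-division assumption, a run with nr_loops=1000000 and nr_entries=4096 performs slightly fewer than one million lookups, because the outer loop count is truncated.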
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-8-aspsk@isovalent.com
The bench utility will print
Setting up benchmark '<bench-name>'...
Benchmark '<bench-name>' started.
on startup to stdout. Suppress this output if the --quiet option is given. This
makes it simpler to parse benchmark output by a script.
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-7-aspsk@isovalent.com
The "local-storage-tasks-trace" benchmark has a `--quiet` option. Move it to
the list of common options, so that the main code and other benchmarks can use
(new) env.quiet variable. Patch the run_bench_local_storage_rcu_tasks_trace.sh
helper script accordingly.
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-6-aspsk@isovalent.com
The benchs/bench_bpf_hashmap_full_update.c doesn't set a custom argp,
so it shouldn't include the <argp.h> header.
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-5-aspsk@isovalent.com
To parse the command line, the bench utility uses the argp_parse() function. This
function takes as an argument a parent 'struct argp' structure which defines
common command line options and an array of children 'struct argp' structures
which defines additional command line options for particular benchmarks. This
implementation doesn't allow benchmarks to share option names, e.g., if two
benchmarks want to use, say, the --option option, then only one of them will
succeed (the first one encountered in the array). It would be convenient if
the same option names could be used in different benchmarks (with the same
semantics, e.g., --nr_loops=N).
Fix this by calling the argp_parse() function twice. The first call is the same
as it was before, with all children argps, and helps to find the benchmark name
and to print a combined help message if anything is wrong. Given the name, we
can call argp_parse() a second time, but now the children array points only
to the selected benchmark, so the correct parsers are always called. (If there is
no benchmark-specific list of arguments, then only one call to argp_parse() is made.)
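The two-pass idea can be sketched with Python's argparse as a loose analogue. This is not how argp works internally and not the bench code. The benchmark names "bench-a" and "bench-b" and their defaults are hypothetical, and the first pass here parses only the common options instead of all children.

```python
import argparse


def make_common_parser():
    """Options shared by all benchmarks, plus the benchmark name."""
    p = argparse.ArgumentParser(add_help=False)
    p.add_argument("bench_name")
    p.add_argument("--quiet", action="store_true")
    return p


# Hypothetical benchmarks that both define an option named --nr_loops.
BENCH_OPTIONS = {
    "bench-a": lambda p: p.add_argument("--nr_loops", type=int, default=1),
    "bench-b": lambda p: p.add_argument("--nr_loops", type=int, default=1000),
}


def parse(argv):
    # Pass 1: find the benchmark name, ignoring options we do not know yet.
    common, _unknown = make_common_parser().parse_known_args(argv)
    # Pass 2: parse again with only the selected benchmark's options added,
    # so a shared option name is always handled by the right benchmark.
    full = argparse.ArgumentParser(parents=[make_common_parser()])
    BENCH_OPTIONS[common.bench_name](full)
    return full.parse_args(argv)


args_a = parse(["bench-a", "--nr_loops=5"])
args_b = parse(["bench-b", "--quiet"])
```

Registering both --nr_loops definitions on a single parser would raise a conflict error in argparse, which is the analogue of only the first child parser winning in the single-pass argp setup.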
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-4-aspsk@isovalent.com
The hashmap_report_final callback function defined in the
benchs/bench_bpf_hashmap_full_update.c file should be static.
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-3-aspsk@isovalent.com
To call the bpf_hashmap_full_update benchmark, one should say:
bench bpf-hashmap-ful-update
The patch adds a missing 'l' to the benchmark name.
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-2-aspsk@isovalent.com
The reinitialization of spin-lock in map value after immediate reuse may
corrupt lookup with BPF_F_LOCK flag and result in hard lock-up, so add
one test case to demonstrate the problem.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20230215082132.3856544-3-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
A test case to verify that a variable-offset BPF_ST instruction
preserves STACK_ZERO marks when writing zeros, e.g. in the following
situation:
*(u64*)(r10 - 8) = 0 ; STACK_ZERO marks for fp[-8]
r0 = random(-7, -1) ; some random number in range of [-7, -1]
r0 += r10 ; r0 is now variable offset pointer to stack
*(u8*)(r0) = 0 ; BPF_ST writing zero, STACK_ZERO mark for
; fp[-8] should be preserved.
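The expected behaviour can be modelled per stack byte. This is a deliberately simplified sketch of the rule the test checks, not the verifier's implementation. The mark names are reused from the text; the function and its parameters are hypothetical.

```python
STACK_ZERO, STACK_MISC = "STACK_ZERO", "STACK_MISC"


def write_var_off(marks, min_off, max_off, size, writes_zero):
    """Model a write of `size` bytes at an unknown offset in [min_off, max_off].

    marks maps fp-relative byte offsets (negative numbers) to a mark.
    Every byte that might be touched keeps STACK_ZERO only if the write
    stores zero and the byte was already STACK_ZERO.
    """
    new = dict(marks)
    for off in range(min_off, max_off + size):
        if off not in new:
            continue
        if writes_zero and new[off] == STACK_ZERO:
            continue                # zero over zero: mark is preserved
        new[off] = STACK_MISC       # anything else: contents unknown
    return new


fp8 = {off: STACK_ZERO for off in range(-8, 0)}   # *(u64*)(r10 - 8) = 0
after_zero = write_var_off(fp8, -7, -1, 1, writes_zero=True)
after_nonzero = write_var_off(fp8, -7, -1, 1, writes_zero=False)
```

Writing zero leaves all eight bytes of fp[-8] marked STACK_ZERO, while a non-zero write downgrades every byte in the possible range to STACK_MISC.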
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230214232030.1502829-5-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Check that verifier tracks the value of 'imm' spilled to stack by
BPF_ST_MEM instruction. Cover the following cases:
- write of non-zero constant to stack;
- write of a zero constant to stack.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230214232030.1502829-3-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
For aligned stack writes using the BPF_ST instruction, track stored values
in the same way BPF_STX is handled, e.g. make sure that the following
commands produce similar verifier knowledge:
  fp[-8] = 42;             r1 = 42;
                       fp[-8] = r1;
This covers two cases:
- non-null values written to stack are stored as spill of fake
registers;
- null values written to stack are stored as STACK_ZERO marks.
Previously both cases above used STACK_MISC marks instead.
Some verifier test cases relied on the old logic to obtain STACK_MISC
marks for some stack values. These test cases are updated in the same
commit to avoid failures during bisect.
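The before/after bookkeeping for an aligned BPF_ST write can be summarised in a small model. This is a sketch of the two cases listed above, not verifier code. The function name, the "STACK_SPILL" label and the tuple representation are illustrative assumptions.

```python
def mark_after_aligned_st(imm, new_behaviour=True):
    """Return (mark, known_value) for an aligned 8-byte BPF_ST of `imm`."""
    if not new_behaviour:
        return ("STACK_MISC", None)   # old logic: value is forgotten
    if imm == 0:
        return ("STACK_ZERO", 0)      # null value: zero marks
    return ("STACK_SPILL", imm)       # spill of a fake register holding imm


old_nonzero = mark_after_aligned_st(42, new_behaviour=False)
new_nonzero = mark_after_aligned_st(42)
new_zero = mark_after_aligned_st(0)
```

With the new behaviour the constant 42 survives the store, which is the same knowledge the verifier would have after `r1 = 42; fp[-8] = r1;`.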
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230214232030.1502829-2-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Merge power management utilities and documentation updates for 6.3-rc1:
- Modify some power management utilities to use the canonical ftrace
path (Ross Zwisler).
- Correct spelling problems for Documentation/power/ as reported by
codespell (Randy Dunlap).
* pm-tools:
PM: tools: use canonical ftrace path
* pm-docs:
Documentation: power: correct spelling
Merge tag 'kvm-s390-next-6.3-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
* Two more V!=R patches
* The last part of the cmpxchg patches
* A few fixes
The compiler is optimizing out the majority of unref_ptr reads/writes, so the test
wasn't testing much. For example, one could delete '__kptr' tag from
'struct prog_test_ref_kfunc __kptr *unref_ptr;' and the test would still "pass".
Convert it to volatile stores. Confirmed by comparing bpf asm before/after.
Fixes: 2cbc469a6f ("selftests/bpf: Add C tests for kptr")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230214235051.22938-1-alexei.starovoitov@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
When the BPF selftests are cross-compiled, only a host version of
bpftool is built. This version of bpftool is used on the host-side to
generate various intermediates, e.g., skeletons.
The test runners are also using bpftool, so the Makefile will symlink
bpftool from the selftest/bpf root, where the test runners will look
for the tool:
| $(Q)ln -sf $(if $2,..,.)/tools/build/bpftool/bootstrap/bpftool \
| $(OUTPUT)/$(if $2,$2/)bpftool
There are two problems for cross-compilation builds:
1. There is no native (cross-compilation target) version of bpftool
2. The bootstrap/bpftool is never cross-compiled (by design)
Make sure that a native/cross-compiled version of bpftool is built,
and if CROSS_COMPILE is set, symlink the native/non-bootstrap version.
Acked-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20230214161253.183458-1-bjorn@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
A build error occurs when running make -C tools/testing/selftests/bpf/
on LoongArch:
BINARY test_verifier
In file included from test_verifier.c:27:
tools/include/uapi/linux/bpf_perf_event.h:14:28: error: field 'regs' has incomplete type
14 | bpf_user_pt_regs_t regs;
| ^~~~
make: *** [Makefile:577: tools/testing/selftests/bpf/test_verifier] Error 1
make: Leaving directory 'tools/testing/selftests/bpf'
Add missing uapi header for LoongArch to use the following definition:
typedef struct user_pt_regs bpf_user_pt_regs_t;
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/r/1676458867-22052-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
thermal_sampling_init() subscribes to THERMAL_GENL_SAMPLING_GROUP_NAME group
so thermal_sampling_exit() should unsubscribe from the same group.
Fixes: 47c4b0de08 ("tools/lib/thermal: Add a thermal library")
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20230202102812.453357-1-vincent.guittot@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Merge tag 'kvm-x86-selftests-6.3' of https://github.com/kvm-x86/linux into HEAD
KVM selftests changes for 6.3:
- Cache the CPU vendor (AMD vs. Intel) and use the info to emit the correct
hypercall instruction instead of relying on KVM to patch in VMMCALL
- A variety of one-off cleanups and fixes
Merge tag 'kvm-x86-pmu-6.3' of https://github.com/kvm-x86/linux into HEAD
KVM x86 PMU changes for 6.3:
- Add support for creating masked events for the PMU filter to allow
userspace to heavily restrict what events the guest can use without
needing to create an absurd number of events
- Clean up KVM's handling of "PMU MSRs to save", especially when vPMU
support is disabled
- Add PEBS support for Intel SPR
When --overwrite and --max-size options of perf record are used
together, a segmentation fault occurs. The following is an example:
# perf record -e sched:sched* --overwrite --max-size 1K -a -- sleep 1
[ perf record: Woken up 1 times to write data ]
perf: Segmentation fault
Obtained 12 stack frames.
./perf/perf(+0x197673) [0x55f99710b673]
/lib/x86_64-linux-gnu/libc.so.6(+0x3ef0f) [0x7fa45f3cff0f]
./perf/perf(+0x8eb40) [0x55f997002b40]
./perf/perf(+0x1f6882) [0x55f99716a882]
./perf/perf(+0x794c2) [0x55f996fed4c2]
./perf/perf(+0x7b7c7) [0x55f996fef7c7]
./perf/perf(+0x9074b) [0x55f99700474b]
./perf/perf(+0x12e23c) [0x55f9970a223c]
./perf/perf(+0x12e54a) [0x55f9970a254a]
./perf/perf(+0x7db60) [0x55f996ff1b60]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe6) [0x7fa45f3b2c86]
./perf/perf(+0x7dfe9) [0x55f996ff1fe9]
Segmentation fault (core dumped)
The backtrace of the core file is as follows:
(gdb) bt
#0 record__bytes_written (rec=0x55f99755a200 <record>) at builtin-record.c:234
#1 record__output_max_size_exceeded (rec=0x55f99755a200 <record>) at builtin-record.c:242
#2 record__write (map=0x0, size=12816, bf=0x55f9978da2e0, rec=0x55f99755a200 <record>) at builtin-record.c:263
#3 process_synthesized_event (tool=tool@entry=0x55f99755a200 <record>, event=event@entry=0x55f9978da2e0, sample=sample@entry=0x0, machine=machine@entry=0x55f997893658) at builtin-record.c:618
#4 0x000055f99716a883 in __perf_event__synthesize_id_index (tool=tool@entry=0x55f99755a200 <record>, process=process@entry=0x55f997002aa0 <process_synthesized_event>, evlist=0x55f9978928b0, machine=machine@entry=0x55f997893658,
from=from@entry=0) at util/synthetic-events.c:1895
#5 0x000055f99716a91f in perf_event__synthesize_id_index (tool=tool@entry=0x55f99755a200 <record>, process=process@entry=0x55f997002aa0 <process_synthesized_event>, evlist=<optimized out>, machine=machine@entry=0x55f997893658)
at util/synthetic-events.c:1905
#6 0x000055f996fed4c3 in record__synthesize (tail=tail@entry=true, rec=0x55f99755a200 <record>) at builtin-record.c:1997
#7 0x000055f996fef7c8 in __cmd_record (argc=argc@entry=2, argv=argv@entry=0x7ffc67551260, rec=0x55f99755a200 <record>) at builtin-record.c:2802
#8 0x000055f99700474c in cmd_record (argc=<optimized out>, argv=0x7ffc67551260) at builtin-record.c:4258
#9 0x000055f9970a223d in run_builtin (p=0x55f997564d88 <commands+264>, argc=10, argv=0x7ffc67551260) at perf.c:330
#10 0x000055f9970a254b in handle_internal_command (argc=10, argv=0x7ffc67551260) at perf.c:384
#11 0x000055f996ff1b61 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:428
#12 main (argc=<optimized out>, argv=0x7ffc67551260) at perf.c:562
The reason is that record__bytes_written accesses rec->thread_data after
it has been freed. The call sequence is as follows:
__cmd_record
-> record__free_thread_data
-> zfree(&rec->thread_data) // free rec->thread_data
-> record__synthesize
-> perf_event__synthesize_id_index
-> process_synthesized_event
-> record__write
-> record__bytes_written // access rec->thread_data
We add a member variable "thread_bytes_written" in the struct "record"
to save the data size written by the threads.
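A minimal user-space sketch of the idea, not the actual perf code: the
per-thread totals are mirrored into the parent struct at write time, so
the size check no longer needs the per-thread array. All demo_* names
are hypothetical; only "thread_bytes_written" comes from the text above.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the per-thread state. */
struct demo_thread_data {
    uint64_t bytes_written;
};

/* Hypothetical, simplified stand-in for struct record. */
struct demo_record {
    uint64_t bytes_written;         /* bytes written without a thread */
    uint64_t thread_bytes_written;  /* running total over all threads */
    int nr_threads;
    struct demo_thread_data *thread_data;
};

/* Account a write; thread may be NULL for non-threaded writes. */
void demo_account_write(struct demo_record *rec,
                        struct demo_thread_data *thread, uint64_t size)
{
    if (thread) {
        thread->bytes_written += size;
        rec->thread_bytes_written += size;  /* survives the free below */
    } else {
        rec->bytes_written += size;
    }
}

/* Release the per-thread array; the saved total stays valid. */
void demo_free_thread_data(struct demo_record *rec)
{
    free(rec->thread_data);
    rec->thread_data = NULL;
    rec->nr_threads = 0;
}

/* Safe after demo_free_thread_data(): never touches thread_data. */
uint64_t demo_bytes_written(const struct demo_record *rec)
{
    return rec->bytes_written + rec->thread_bytes_written;
}
```

With this layout a late write, such as the synthesized id index event in
the backtrace, can still be checked against the size limit after the
per-thread data is gone.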
Fixes: 6d57581659 ("perf record: Add support for limit perf output file size")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiwei Sun <jiwei.sun@windriver.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/CAM9d7ci_TRrqBQVQNW8=GwakUr7SsZpYxaaty-S4bxF8zJWyqw@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Cc: stable@vger.kernel.org # v5.18+
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230127135755.79929-22-mathieu.desnoyers@efficios.com
Pick up the CXL DVSEC range register emulation for v6.3, and resolve
conflicts with the cxl_port_probe() split (from for-6.3/cxl-ram-region)
and event handling (from for-6.3/cxl-events).
CXL rev3 spec 8.1.3
RCDs may not have HDM register blocks. Create a fake HDM with information
from the CXL PCIe DVSEC registers. The decoder count will be set to the
HDM count retrieved from the DVSEC cap register.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167640368994.935665.15831225724059704620.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In the case where HDM decoder register block exists but is not programmed
and at the same time the DVSEC range register range is active, populate the
CXL decoder object 'cxl_decoder' with info from DVSEC range registers.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167640368454.935665.13806415120298330717.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Call cxl_dvsec_rr_decode() in the beginning of cxl_port_probe() and
preserve the decoded information in a local
'struct cxl_endpoint_dvsec_info'. This info can be passed to various
functions later on in order to support the HDM decoder emulation.
The invocation of cxl_dvsec_rr_decode() in cxl_hdm_decode_init() is
removed and a pointer to the 'struct cxl_endpoint_dvsec_info' is passed
in.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167640367377.935665.2848747799651019676.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This test depends on <linux/user_events.h> being exported in uapi.
The following commit moved user_events.h out of uapi:
commit 5cfff569ca ("tracing: Move user_events.h temporarily out
of include/uapi")
This test will not compile until user_events.h is added back to uapi.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Clean up prog_tests/dynptr.c by removing the unneeded "expected_err_msg"
in the dynptr_tests struct, which is a remnant from converting the
failure test cases to use the generic verification tester.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Link: https://lore.kernel.org/r/20230214051332.4007131-2-joannelkoong@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Clean up user_ringbuf, cgrp_kfunc, and kfunc_dynptr_param tests to use
the generic verification tester for checking verifier rejections.
The generic verification tester uses btf_decl_tag-based annotations
for verifying that the tests fail with the expected log messages.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Acked-by: David Vernet <void@manifault.com>
Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com>
Link: https://lore.kernel.org/r/20230214051332.4007131-1-joannelkoong@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The hwnoise tool is a special mode for the osnoise top tool.
hwnoise dispatches the osnoise tracer and displays a summary of the noise.
The difference is that it runs the tracer with the OSNOISE_IRQ_DISABLE
option set, thus allowing only hardware-related noise, resulting in
a simplified output. hwnoise has the same features as osnoise.
An example of the tool's output:
# rtla hwnoise -c 1-11 -T 1 -d 10m -q
Hardware-related Noise
duration: 0 00:10:00 | time is in us
CPU  Period  Runtime    Noise  % CPU Aval  Max Noise  Max Single  HW  NMI
  1  #599    599000000    138    99.99997          3           3   4   74
  2  #599    599000000     85    99.99998          3           3   4   75
  3  #599    599000000     86    99.99998          4           3   6   75
  4  #599    599000000     81    99.99998          4           4   2   75
  5  #599    599000000     85    99.99998          2           2   2   75
Link: https://lkml.kernel.org/r/2d6f49a6f3a4f8b51b2c806458b1cff71ad4d014.1675805361.git.bristot@kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
This patch adds selftests exercising the logic changed/added in the
previous patches in the series. A variety of successful and unsuccessful
rbtree usages are validated:
Success:
* Add some nodes, let map_value bpf_rbtree_root destructor clean them
up
* Add some nodes, remove one using the non-owning ref left over by a
successful rbtree_add() call
* Add some nodes, remove one using the non-owning ref returned by
rbtree_first() call
Failure:
* BTF where bpf_rb_root owns bpf_list_node should fail to load
* BTF where node of type X is added to tree containing nodes of type Y
should fail to load
* No calling rbtree api functions in 'less' callback for rbtree_add
* No releasing lock in 'less' callback for rbtree_add
* No removing a node which hasn't been added to any tree
* No adding a node which has already been added to a tree
* No escaping of non-owning references past their lock's
critical section
* No escaping of non-owning references past other invalidation points
(rbtree_remove)
These tests mostly focus on rbtree-specific additions, but some of the
failure cases revalidate scenarios common to both linked_list and rbtree
which are covered in the former's tests. Better to be a bit redundant in
case linked_list and rbtree semantics deviate over time.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230214004017.2534011-8-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Newly-added bpf_rbtree_{remove,first} kfuncs have some special properties
that require handling in the verifier:
* both bpf_rbtree_remove and bpf_rbtree_first return the type containing
the bpf_rb_node field, with the offset set to that field's offset,
instead of a struct bpf_rb_node *
* mark_reg_graph_node helper added in previous patch generalizes
this logic, use it
* bpf_rbtree_remove's node input is a node that's been inserted
in the tree - a non-owning reference.
* bpf_rbtree_remove must invalidate non-owning references in order to
avoid aliasing issue. Use previously-added
invalidate_non_owning_refs helper to mark this function as a
non-owning ref invalidation point.
* Unlike other functions, which convert one of their input arg regs to
non-owning reference, bpf_rbtree_first takes no arguments and just
returns a non-owning reference (possibly null)
* For now verifier logic for this is special-cased instead of
adding new kfunc flag.
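As a rough user-space illustration of "the type containing the
bpf_rb_node field, with the offset set to that field's offset": a
pointer to the embedded node can be turned back into a pointer to the
enclosing struct by subtracting the member offset. The struct layouts
and demo_* names below are hypothetical and only stand in for the opaque
kernel types.

```c
#include <stddef.h>

/* Hypothetical stand-in for the opaque struct bpf_rb_node. */
struct demo_rb_node {
    void *opaque[3];
};

/* Hypothetical program-defined type that embeds the node. */
struct demo_node_data {
    long key;
    struct demo_rb_node node;
};

/* Recover the enclosing struct from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Map a node pointer back to the struct that contains it. */
struct demo_node_data *demo_node_to_data(struct demo_rb_node *n)
{
    return demo_container_of(n, struct demo_node_data, node);
}
```

Because the node pointer carries its containing type and a known offset,
the same subtraction stays well defined wherever the node field sits in
the struct.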
This patch, along with the previous one, completes special verifier
handling for all rbtree API functions added in this series.
With functional verifier handling of rbtree_remove, under current
non-owning reference scheme, a node type with both bpf_{list,rb}_node
fields could cause the verifier to accept programs which remove such
nodes from collections they haven't been added to.
In order to prevent this, this patch adds a check to btf_parse_fields
which rejects structs with both bpf_{list,rb}_node fields. This is a
temporary measure that can be removed after "collection identity"
followup. See comment added in btf_parse_fields. A linked_list BTF test
exercising the new check is added in this patch as well.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230214004017.2534011-6-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This patch adds special BPF_RB_{ROOT,NODE} btf_field_types similar to
BPF_LIST_{HEAD,NODE}, adds the necessary plumbing to detect the new
types, and adds bpf_rb_root_free function for freeing bpf_rb_root in
map_values.
structs bpf_rb_root and bpf_rb_node are opaque types meant to
obscure structs rb_root_cached and rb_node, respectively.
btf_struct_access will prevent BPF programs from touching these special
fields automatically now that they're recognized.
btf_check_and_fixup_fields now groups list_head and rb_root together as
"graph root" fields and {list,rb}_node as "graph node", and does same
ownership cycle checking as before. Note that this function does _not_
prevent ownership type mixups (e.g. rb_root owning list_node) - that's
handled by btf_parse_graph_root.
After this patch, a bpf program can have a struct bpf_rb_root in a
map_value, but not add anything to nor do anything useful with it.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230214004017.2534011-2-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
make run_tests doesn't run the test. Fix Makefile to set TEST_GEN_PROGS
instead of TEST_GEN_FILES to fix the problem.
run_tests runs TEST_GEN_PROGS, TEST_CUSTOM_PROGS, and TEST_PROGS.
TEST_GEN_FILES is for files generated by tests.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Fix the following build error, caused by redefining struct mount_attr, by
removing the duplicate definition from mount_setattr_test.c
gcc -g -isystem .../tools/testing/selftests/../../../usr/include -Wall -O2 -pthread mount_setattr_test.c -o .../tools/testing/selftests/mount_setattr/mount_setattr_test
mount_setattr_test.c:107:8: error: redefinition of ‘struct mount_attr’
107 | struct mount_attr {
| ^~~~~~~~~~
In file included from /usr/include/x86_64-linux-gnu/sys/mount.h:32,
from mount_setattr_test.c:10:
.../usr/include/linux/mount.h:129:8: note: originally defined here
129 | struct mount_attr {
| ^~~~~~~~~~
make: *** [../lib.mk:145: .../tools/testing/selftests/mount_setattr/mount_setattr_test] Error 1
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* kvm-arm64/misc:
: Miscellaneous updates
:
: - Convert CPACR_EL1_TTA to the new, generated system register
: definitions.
:
: - Serialize toggling CPACR_EL1.SMEN to avoid unexpected exceptions when
: accessing SVCR in the host.
:
: - Avoid quiescing the guest if a vCPU accesses its own redistributor's
: SGIs/PPIs, eliminating the need to IPI. Largely an optimization for
: nested virtualization, as the L1 accesses the affected registers
: rather often.
:
: - Conversion to kstrtobool()
:
: - Common definition of INVALID_GPA across architectures
:
: - Enable CONFIG_USERFAULTFD for CI runs of KVM selftests
KVM: arm64: Fix non-kerneldoc comments
KVM: selftests: Enable USERFAULTFD
KVM: selftests: Remove redundant setbuf()
arm64/sysreg: clean up some inconsistent indenting
KVM: MMU: Make the definition of 'INVALID_GPA' common
KVM: arm64: vgic-v3: Use kstrtobool() instead of strtobool()
KVM: arm64: vgic-v3: Limit IPI-ing when accessing GICR_{C,S}ACTIVER0
KVM: arm64: Synchronize SMEN on vcpu schedule out
KVM: arm64: Kill CPACR_EL1_TTA definition
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
This patch introduces non-owning reference semantics to the verifier,
specifically linked_list API kfunc handling. release_on_unlock logic for
refs is refactored - with small functional changes - to implement these
semantics, and bpf_list_push_{front,back} are migrated to use them.
When a list node is pushed to a list, the program still has a pointer to
the node:
n = bpf_obj_new(typeof(*n));
bpf_spin_lock(&l);
bpf_list_push_back(&l, n);
/* n still points to the just-added node */
bpf_spin_unlock(&l);
What the verifier considers n to be after the push, and thus what can be
done with n, are changed by this patch.
Common properties both before/after this patch:
* After push, n is only a valid reference to the node until end of
critical section
* After push, n cannot be pushed to any list
* After push, the program can read the node's fields using n
Before:
* After push, n retains the ref_obj_id which it received on
bpf_obj_new, but the associated bpf_reference_state's
release_on_unlock field is set to true
* release_on_unlock field and associated logic is used to implement
"n is only a valid ref until end of critical section"
* After push, n cannot be written to, the node must be removed from
the list before writing to its fields
* After push, n is marked PTR_UNTRUSTED
After:
* After push, n's ref is released and ref_obj_id set to 0. NON_OWN_REF
type flag is added to reg's type, indicating that it's a non-owning
reference.
* NON_OWN_REF flag and logic is used to implement "n is only a
valid ref until end of critical section"
* n can be written to (except for special fields e.g. bpf_list_node,
timer, ...)
Summary of specific implementation changes to achieve the above:
* release_on_unlock field, ref_set_release_on_unlock helper, and logic
to "release on unlock" based on that field are removed
* The anonymous active_lock struct used by bpf_verifier_state is
pulled out into a named struct bpf_active_lock.
* NON_OWN_REF type flag is introduced along with verifier logic
changes to handle non-owning refs
* Helpers are added to use NON_OWN_REF flag to implement non-owning
ref semantics as described above
* invalidate_non_owning_refs - helper to clobber all non-owning refs
matching a particular bpf_active_lock identity. Replaces
release_on_unlock logic in process_spin_lock.
* ref_set_non_owning - set NON_OWN_REF type flag after doing some
sanity checking
* ref_convert_owning_non_owning - convert owning reference w/
specified ref_obj_id to non-owning references. Set NON_OWN_REF
flag for each reg with that ref_obj_id and 0-out its ref_obj_id
* Update linked_list selftests to account for minor semantic
differences introduced by this patch
* Writes to a release_on_unlock node ref are not allowed, while
writes to non-owning reference pointees are. As a result the
linked_list "write after push" failure tests are no longer scenarios
that should fail.
* The test##missing_lock##op and test##incorrect_lock##op
macro-generated failure tests need to have a valid node argument in
order to have the same error output as before. Otherwise
verification will fail early and the expected error output won't be seen.
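The two helpers described above can be pictured with a toy model of
register state. This is only a sketch under simplifying assumptions: the
demo_* names, the flag value and the "valid" field are hypothetical, and
the lock-identity matching done by the real invalidation helper is left
out.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_NON_OWN_REF 0x1u  /* hypothetical flag value */

/* Hypothetical, heavily simplified register state. */
struct demo_reg {
    uint32_t ref_obj_id;  /* non-zero while the reg owns a reference */
    uint32_t type_flags;
    int valid;            /* 0 once the reg has been clobbered */
};

/* Turn every reg holding ref_obj_id into a non-owning reference. */
void demo_convert_owning_non_owning(struct demo_reg *regs, size_t n,
                                    uint32_t ref_obj_id)
{
    if (ref_obj_id == 0)
        return;  /* 0 means "no reference", nothing to convert */

    for (size_t i = 0; i < n; i++) {
        if (regs[i].ref_obj_id != ref_obj_id)
            continue;
        regs[i].ref_obj_id = 0;
        regs[i].type_flags |= DEMO_NON_OWN_REF;
    }
}

/* Clobber all non-owning refs, e.g. when the critical section ends. */
void demo_invalidate_non_owning_refs(struct demo_reg *regs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!(regs[i].type_flags & DEMO_NON_OWN_REF))
            continue;
        regs[i].type_flags &= ~DEMO_NON_OWN_REF;
        regs[i].valid = 0;
    }
}
```

In this model a push would call the first helper for the pushed node's
ref_obj_id, and an unlock would call the second, which gives "n is only
a valid ref until end of critical section" without a release_on_unlock
field.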
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230212092715.1422619-2-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Building BPF selftests out of srctree fails with:
make: *** No rule to make target '/linux-build//ima_setup.sh', needed by 'ima_setup.sh'. Stop.
The culprit is the rule that defines convenient shorthands like
"make test_progs", which builds $(OUTPUT)/test_progs. These shorthands
make sense only for binaries that are built though; scripts that live
in the source tree do not end up in $(OUTPUT).
Therefore drop $(TEST_PROGS) and $(TEST_PROGS_EXTENDED) from the rule.
The issue has existed for a while, but it became a problem only after commit
d68ae4982c ("selftests/bpf: Install all required files to run selftests"),
which added dependencies on these scripts.
Fixes: 03dcb78460 ("selftests/bpf: Add simple per-test targets to Makefile")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230208231211.283606-1-iii@linux.ibm.com
Fix the following build warning by removing the unnecessary clean target
from the Makefile. lib.mk handles clean.
Makefile:10: warning: overriding recipe for target clean
../lib.mk:124: warning: ignoring old recipe for target clean
In addition, use TEST_GEN_PROGS for generated test executables
and TEST_PROGS for the shell script. Get rid of the all target as lib.mk
handles it.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Rather than trying to guess which implementation of "echo" to run with
support for "-ne" options, use "printf" instead of "echo -ne". It
handles escape characters as a standard feature and it is widespread
among modern shells.
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Fixes: 3297a4df80 ("kselftests: Enable the echo command to print newlines in Makefile")
Fixes: 79c16b1120fe ("selftests: find echo binary to use -ne options")
Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Since commit a1d6cd88c8 ("selftests/ftrace: event_triggers: wait
longer for test_event_enable") introduced the bash-specific "=="
comparison operator, that test fails when run on a
POSIX shell. `checkbashisms` warns about it as below.
possible bashism in ftrace/func_event_triggers.tc line 45 (should be 'b = a'):
if [ "$e" == $val ]; then
This replaces it with "=".
Fixes: a1d6cd88c8 ("selftests/ftrace: event_triggers: wait longer for test_event_enable")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
When testing with a FLAG_DEBUG-enabled client, it emits the following
error messages:
File "/root/tpm2/tpm2.py", line 347, in hex_dump
d = [format(ord(x), '02x') for x in d]
File "/root/tpm2/tpm2.py", line 347, in <listcomp>
d = [format(ord(x), '02x') for x in d]
TypeError: ord() expected string of length 1, but int found
The input of hex_dump() should be packed binary data. Remove the
ord().
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Find the actual echo binary using $(which echo) and use it for
formatted output with -ne. On some systems, the default echo command
doesn't handle the -e option and the output looks like this (arm64
build):
-ne Emit Tests for alsa
-ne Emit Tests for amd-pstate
-ne Emit Tests for arm64
This is for example the case with the KernelCI Docker images
e.g. kernelci/gcc-10:x86-kselftest-kernelci. With the actual echo
binary (e.g. in /bin/echo), the output is formatted as expected (x86
build this time):
Emit Tests for alsa
Emit Tests for amd-pstate
Skipping non-existent dir: arm64
Only the install target is using "echo -ne" so keep the $ECHO variable
local to it.
Reported-by: "kernelci.org bot" <bot@kernelci.org>
Fixes: 3297a4df80 ("kselftests: Enable the echo command to print newlines in Makefile")
Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
There are two spelling mistakes in the test messages. Fix them.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for installed kernel headers rather
than using the kernel headers in include/uapi from the kernel source
tree.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for installed kernel headers rather
than using the kernel headers in include/uapi from the kernel source
tree.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for installed kernel headers rather
than using the kernel headers in include/uapi from the kernel source
tree.
Remove bogus ../../../../include/ from the search path, because
kernel source headers are not needed by those user-space selftests, and
it causes issues because -I paths are searched before -isystem paths,
and conflicts for files appearing both in kernel sources and in uapi
headers with incompatible semantics (e.g. types.h).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for installed kernel headers rather
than using the kernel headers in include/uapi from the kernel source
tree.
Remove bogus ../../../../include/ from the search path, because
kernel source headers are not needed by those user-space selftests, and
it causes issues because -I paths are searched before -isystem paths,
and conflicts for files appearing both in kernel sources and in uapi
headers with incompatible semantics (e.g. types.h).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Mark xen_pv_play_dead() and the related xen_cpu_bringup_again() as
"__noreturn".
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221125063248.30256-3-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
This reverts commit 115d9d77bb.
The pages being freed by memblock_free_late() have already been
initialized, but if they are in the deferred init range, __free_one_page()
might access nearby uninitialized pages when trying to coalesce buddies,
which will cause a crash.
A proper fix will be more involved so revert this change for the time
being.
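The coalescing step described above can be sketched as follows. This is a
simplified model, not the kernel code: the helper names and the
first_deferred_pfn boundary are hypothetical, and the only part taken from
the buddy allocator is the XOR used to locate a block's buddy.

```c
#include <stdbool.h>

/* PFN of the buddy of the 2^order-page block that starts at pfn. */
unsigned long buddy_pfn(unsigned long pfn, unsigned int order)
{
	return pfn ^ (1UL << order);
}

/*
 * Hypothetical model: every PFN at or above first_deferred_pfn still has an
 * uninitialized struct page, except for the page being freed itself.
 * Returns true when coalescing would have to inspect such a page.
 */
bool buddy_is_uninitialized(unsigned long pfn, unsigned int order,
			    unsigned long first_deferred_pfn)
{
	return buddy_pfn(pfn, order) >= first_deferred_pfn;
}
```

With a boundary at PFN 0x1000, freeing the order-0 page 0x1000 makes the
allocator look at its buddy 0x1001, which in this model has not been
initialized yet.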
Merge tag 'fixes-2023-02-11' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock revert from Mike Rapoport:
"Revert 'mm: Always release pages to the buddy allocator in
memblock_free_late()'
The pages being freed by memblock_free_late() have already been
initialized, but if they are in the deferred init range,
__free_one_page() might access nearby uninitialized pages when trying
to coalesce buddies, which will cause a crash.
A proper fix will be more involved so revert this change for the time
being"
* tag 'fixes-2023-02-11' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
Revert "mm: Always release pages to the buddy allocator in memblock_free_late()."
Add a 'signal' field which allows unwind hints to specify whether the
instruction pointer should be taken literally (like for most interrupts
and exceptions) rather than decremented (like for call stack return
addresses) when used to find the next ORC entry.
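A minimal sketch of how such a flag might be consumed, assuming a
hypothetical helper name (orc_lookup_addr) rather than the unwinder's real
interface:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: choose the address used to find the next ORC entry.
 *
 * signal == true:  ip is the interrupted instruction itself, so it is used
 *                  literally.
 * signal == false: ip is a call return address, which may already belong to
 *                  the code following the call, so step back by one byte to
 *                  land inside the call instruction.
 */
uint64_t orc_lookup_addr(uint64_t ip, bool signal)
{
	return signal ? ip : ip - 1;
}
```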
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/d2c5ec4d83a45b513d8fd72fab59f1a8cfa46871.1676068346.git.jpoimboe@kernel.org
For mysterious raisins I listed the new __asan_mem*() functions as
being uaccess safe. This causes objtool failures on KASAN builds
because these functions call out to the actual __mem*() functions,
which are not marked uaccess safe.
Removing them from the list doesn't make the robots unhappy.
Fixes: 69d4c0d321 ("entry, kasan, x86: Disallow overriding mem*() functions")
Reported-by: "Paul E. McKenney" <paulmck@kernel.org>
Bisected-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20230126182302.GA687063@paulmck-ThinkPad-P17-Gen-1
The kernel maintains three markers for the MDB dump:
1. The last bridge device from which the MDB was dumped.
2. The last MDB entry from which the MDB was dumped.
3. The last port-group entry that was dumped.
Add test cases for large scale MDB dump to make sure that all the
configured entries are dumped and that the markers are used correctly.
Specifically, create 2 bridges with 32 ports and add 256 MDB entries of
which all the ports are members. Test that each bridge reports 8192
(256 * 32) permanent entries. Do that with IPv4, IPv6 and L2 MDB
entries.
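The role of the three markers can be illustrated with a toy resumable dump.
This is only a sketch under stated assumptions: struct dump_markers,
mdb_dump_chunk() and the per-call budget are hypothetical and stand in for
the kernel's netlink dump state; the sizes come from the test description
above.

```c
#include <stddef.h>

#define N_BRIDGES 2
#define N_ENTRIES 256
#define N_PORTS   32

/* Hypothetical resume markers, one per nesting level of the dump. */
struct dump_markers {
	size_t bridge;	/* bridge to resume from */
	size_t entry;	/* MDB entry to resume from */
	size_t pg;	/* port-group entry to resume from */
};

/*
 * Emit up to `budget` port-group entries, starting at the markers.
 * Returns the number emitted; the markers are left pointing at the first
 * entry that was not emitted, so the next call continues from there.
 */
size_t mdb_dump_chunk(struct dump_markers *m, size_t budget)
{
	size_t emitted = 0;

	for (; m->bridge < N_BRIDGES; m->bridge++, m->entry = 0) {
		for (; m->entry < N_ENTRIES; m->entry++, m->pg = 0) {
			for (; m->pg < N_PORTS; m->pg++) {
				if (emitted == budget)
					return emitted;	/* message is full */
				emitted++;
			}
		}
	}
	return emitted;
}
```

Calling mdb_dump_chunk() repeatedly until it returns 0 visits every
port-group entry exactly once, regardless of the budget. If a marker were
not reset or not honoured on resume, entries would be skipped or dumped
twice, which is the kind of bug a large scale dump test is meant to catch.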
On my system, MDB dump of the above is contained in about 50 netlink
messages.
Example output:
# ./bridge_mdb.sh
[...]
INFO: # Large scale dump tests
TEST: IPv4 large scale dump tests [ OK ]
TEST: IPv6 large scale dump tests [ OK ]
TEST: L2 large scale dump tests [ OK ]
[...]
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Include the support for enumerating and provisioning ram regions for
v6.3. This also includes a policy change so that ram / volatile
device-dax instances are assigned to the dax_kmem driver by default.
Daniel Borkmann says:
====================
pull-request: bpf-next 2023-02-11
We've added 96 non-merge commits during the last 14 day(s) which contain
a total of 152 files changed, 4884 insertions(+), 962 deletions(-).
There is a minor conflict in drivers/net/ethernet/intel/ice/ice_main.c
between commit 5b246e533d ("ice: split probe into smaller functions")
from the net-next tree and commit 66c0e13ad2 ("drivers: net: turn on
XDP features") from the bpf-next tree. Remove the hunk given ice_cfg_netdev()
is otherwise there a 2nd time, and add XDP features to the existing
ice_cfg_netdev() one:
  [...]
  ice_set_netdev_features(netdev);
  netdev->xdp_features = NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT |
                         NETDEV_XDP_ACT_XSK_ZEROCOPY;
  ice_set_ops(netdev);
  [...]
Stephen's merge conflict mail:
https://lore.kernel.org/bpf/20230207101951.21a114fa@canb.auug.org.au/
The main changes are:
1) Add support for BPF trampoline on s390x, which finally allows removing many
test cases from the BPF CI's DENYLIST.s390x, from Ilya Leoshkevich.
2) Add multi-buffer XDP support to ice driver, from Maciej Fijalkowski.
3) Add capability to export the XDP features supported by the NIC.
Along with that, add an XDP compliance test tool,
from Lorenzo Bianconi & Marek Majtyka.
4) Add __bpf_kfunc tag for marking kernel functions as kfuncs,
from David Vernet.
5) Add deep-dive documentation about the verifier's register
liveness tracking algorithm, from Eduard Zingerman.
6) Fix and follow-up cleanups for resolve_btfids to be compiled
as a host program to avoid cross compile issues,
from Jiri Olsa & Ian Rogers.
7) Batch of fixes to the BPF selftest for xdp_hw_metadata which resulted
from testing on different NICs, from Jesper Dangaard Brouer.
8) Fix libbpf to better detect kernel version code on Debian, from Hao Xiang.
9) Extend libbpf to add an option for when the perf buffer should
wake up, from Jon Doron.
10) Follow-up fix on xdp_metadata selftest to just consume on TX
completion, from Stanislav Fomichev.
11) Extend the kfuncs.rst document with a description of kfunc
lifecycle & stability expectations, from David Vernet.
12) Fix bpftool prog profile to skip attaching to offline CPUs,
from Tonghao Zhang.
====================
Link: https://lore.kernel.org/r/20230211002037.8489-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Take two endpoints attached to the first switch on the first host-bridge
in the cxl_test topology and define a pre-initialized region. This is a
x2 interleave underneath a x1 CXL Window.
$ modprobe cxl_test
$ # cxl list -Ru
{
"region":"region3",
"resource":"0xf010000000",
"size":"512.00 MiB (536.87 MB)",
"interleave_ways":2,
"interleave_granularity":4096,
"decode_state":"commit"
}
Tested-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/167602000547.1924368.11613151863880268868.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The code assumes that everything that comes after nlmsgerr are nlattrs.
When calculating their size, it does not account for the initial
nlmsghdr. This may lead to accessing uninitialized memory.
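The size miscalculation can be sketched as below. This is not the libbpf
code: the demo_* structs are simplified stand-ins that mirror the layout of
struct nlmsghdr and struct nlmsgerr, the function names are hypothetical,
and the sketch assumes that only the header of the failed request is echoed
back inside the error message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct nlmsghdr (16 bytes). */
struct demo_nlmsghdr {
	uint32_t len;	/* total message length, including this header */
	uint16_t type;
	uint16_t flags;
	uint32_t seq;
	uint32_t pid;
};

/* Stand-in for struct nlmsgerr (20 bytes). */
struct demo_nlmsgerr {
	int32_t error;
	struct demo_nlmsghdr msg;	/* header of the request that failed */
};

/* Buggy: subtracts the nlmsgerr payload but forgets the outer header. */
size_t attr_len_buggy(const struct demo_nlmsghdr *nlh)
{
	return nlh->len - sizeof(struct demo_nlmsgerr);
}

/* Fixed: attributes start after the outer header plus the nlmsgerr. */
size_t attr_len_fixed(const struct demo_nlmsghdr *nlh)
{
	size_t attr_off = sizeof(*nlh) + sizeof(struct demo_nlmsgerr);

	assert(nlh->len >= attr_off);	/* reject truncated messages */
	return nlh->len - attr_off;
}
```

For a 44-byte message carrying 8 bytes of attributes, the buggy variant
reports 24 bytes, so a parser trusting it would read 16 bytes past the end
of the message.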
Fixes: bbf48c18ee ("libbpf: add error reporting in XDP")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230210001210.395194-8-iii@linux.ibm.com
To get useful results from the Memory Sanitizer, all code running in a
process needs to be instrumented. When building tests with other
sanitizers, it's not strictly necessary, but is also helpful.
So make sure runqslower and libbpf are compiled with SAN_CFLAGS and
linked with SAN_LDFLAGS.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230210001210.395194-5-iii@linux.ibm.com