linux

iv/linux

Author	SHA1	Message	Date
Magnus Karlsson	64370d7c8a	selftests/xsk: add timeout for Tx thread Add a timeout for the transmission thread. If packets are not completed properly, for some reason, the test harness would previously get stuck forever in a while loop. But with this patch, this timeout will trigger, flag the test as a failure, and continue with the next test. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230914084900.492-3-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-14 09:47:55 -07:00
Magnus Karlsson	2d2712caf4	selftests/xsk: print per packet info in verbose mode Print info about every packet in verbose mode, both for Tx and Rx. This is useful to have when a test fails or to validate that a test is really doing what it was designed to do. Info on what is supposed to be received and sent is also printed for the custom packet streams since they differ from the base line. Here is an example: Tx addr: 37e0 len: 64 options: 0 pkt_nb: 8 Tx addr: 4000 len: 64 options: 0 pkt_nb: 9 Rx: addr: 100 len: 64 options: 0 pkt_nb: 0 valid: 1 Rx: addr: 1100 len: 64 options: 0 pkt_nb: 1 valid: 1 Rx: addr: 2100 len: 64 options: 0 pkt_nb: 4 valid: 1 Rx: addr: 3100 len: 64 options: 0 pkt_nb: 8 valid: 1 Rx: addr: 4100 len: 64 options: 0 pkt_nb: 9 valid: 1 One pointless verbose print statement is also deleted and another one is made clearer. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230914084900.492-2-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-14 09:47:55 -07:00
Quan Tian	558c50cc3b	docs/bpf: update out-of-date doc in BPF flow dissector Commit a5e2151ff9d5 ("net/ipv6: SKB symmetric hash should incorporate transport ports") removed the use of FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL in __skb_get_hash_symmetric(), making the doc out-of-date. Signed-off-by: Quan Tian <qtian@vmware.com> Link: https://lore.kernel.org/r/20230911152353.8280-1-qtian@vmware.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-09-13 14:22:57 -07:00
Alexei Starovoitov	5bbb9e1f08	Merge branch 'bpf-x64-fix-tailcall-infinite-loop' Leon Hwang says: ==================== bpf, x64: Fix tailcall infinite loop This patch series fixes a tailcall infinite loop on x64. From commit ebf7d1f508a73871 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT"), the tailcall on x64 works better than before. From commit e411901c0b775a3a ("bpf: allow for tailcalls in BPF subprograms for x64 JIT"), tailcall is able to run in BPF subprograms on x64. From commit 5b92a28aae4dd0f8 ("bpf: Support attaching tracing BPF program to other BPF programs"), BPF program is able to trace other BPF programs. How about combining them all together? 1. FENTRY/FEXIT on a BPF subprogram. 2. A tailcall runs in the BPF subprogram. 3. The tailcall calls the subprogram's caller. As a result, a tailcall infinite loop comes up. And the loop would halt the machine. As we know, in tail call context, the tail_call_cnt propagates by stack and rax register between BPF subprograms. So do in trampolines. How did I discover the bug? From commit 7f6e4312e15a5c37 ("bpf: Limit caller's stack depth 256 for subprogs with tailcalls"), the total stack size limits to around 8KiB. Then, I write some bpf progs to validate the stack consuming, that are tailcalls running in bpf2bpf and FENTRY/FEXIT tracing on bpf2bpf. At that time, accidently, I made a tailcall loop. And then the loop halted my VM. Without the loop, the bpf progs would consume over 8KiB stack size. But the _stack-overflow_ did not halt my VM. With bpf_printk(), I confirmed that the tailcall count limit did not work expectedly. Next, read the code and fix it. Thank Ilya Leoshkevich, this bug on s390x has been fixed. Hopefully, this bug on arm64 will be fixed in near future. ==================== Link: https://lore.kernel.org/r/20230912150442.2009-1-hffilwlqm@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-12 13:06:12 -07:00
Leon Hwang	e13b5f2f3b	selftests/bpf: Add testcases for tailcall infinite loop fixing Add 4 test cases to confirm the tailcall infinite loop bug has been fixed. Like tailcall_bpf2bpf cases, do fentry/fexit on the bpf2bpf, and then check the final count result. tools/testing/selftests/bpf/test_progs -t tailcalls 226/13 tailcalls/tailcall_bpf2bpf_fentry:OK 226/14 tailcalls/tailcall_bpf2bpf_fexit:OK 226/15 tailcalls/tailcall_bpf2bpf_fentry_fexit:OK 226/16 tailcalls/tailcall_bpf2bpf_fentry_entry:OK 226 tailcalls:OK Summary: 1/16 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Leon Hwang <hffilwlqm@gmail.com> Link: https://lore.kernel.org/r/20230912150442.2009-4-hffilwlqm@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-12 13:06:12 -07:00
Leon Hwang	2b5dcb31a1	bpf, x64: Fix tailcall infinite loop From commit ebf7d1f508a73871 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT"), the tailcall on x64 works better than before. From commit e411901c0b775a3a ("bpf: allow for tailcalls in BPF subprograms for x64 JIT"), tailcall is able to run in BPF subprograms on x64. From commit 5b92a28aae4dd0f8 ("bpf: Support attaching tracing BPF program to other BPF programs"), BPF program is able to trace other BPF programs. How about combining them all together? 1. FENTRY/FEXIT on a BPF subprogram. 2. A tailcall runs in the BPF subprogram. 3. The tailcall calls the subprogram's caller. As a result, a tailcall infinite loop comes up. And the loop would halt the machine. As we know, in tail call context, the tail_call_cnt propagates by stack and rax register between BPF subprograms. So do in trampolines. Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT") Fixes: e411901c0b77 ("bpf: allow for tailcalls in BPF subprograms for x64 JIT") Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Leon Hwang <hffilwlqm@gmail.com> Link: https://lore.kernel.org/r/20230912150442.2009-3-hffilwlqm@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-12 13:06:12 -07:00
Leon Hwang	2bee9770f3	bpf, x64: Comment tail_call_cnt initialisation Without understanding emit_prologue(), it is really hard to figure out where does tail_call_cnt come from, even though searching tail_call_cnt in the whole kernel repo. By adding these comments, it is a little bit easier to understand tail_call_cnt initialisation. Signed-off-by: Leon Hwang <hffilwlqm@gmail.com> Link: https://lore.kernel.org/r/20230912150442.2009-2-hffilwlqm@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-12 13:06:12 -07:00
Leon Hwang	96daa98742	selftests/bpf: Correct map_fd to data_fd in tailcalls Get and check data_fd. It should not check map_fd again. Meanwhile, correct some 'return' to 'goto out'. Thank the suggestion from Maciej in "bpf, x64: Fix tailcall infinite loop"[0] discussions. [0] https://lore.kernel.org/bpf/e496aef8-1f80-0f8e-dcdd-25a8c300319a@gmail.com/T/#m7d3b601066ba66400d436b7e7579b2df4a101033 Fixes: 79d49ba048ec ("bpf, testing: Add various tail call test cases") Fixes: 3b0379111197 ("selftests/bpf: Add tailcall_bpf2bpf tests") Fixes: 5e0b0a4c52d3 ("selftests/bpf: Test tail call counting with bpf2bpf and data on stack") Signed-off-by: Leon Hwang <hffilwlqm@gmail.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20230906154256.95461-1-hffilwlqm@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-09-11 15:28:24 -07:00
Denys Zagorui	ebc8484d0e	bpftool: Fix -Wcast-qual warning This cast was made by purpose for older libbpf where the bpf_object_skeleton field is void * instead of const void * to eliminate a warning (as i understand -Wincompatible-pointer-types-discards-qualifiers) but this cast introduces another warning (-Wcast-qual) for libbpf where data field is const void * It makes sense for bpftool to be in sync with libbpf from kernel sources Signed-off-by: Denys Zagorui <dzagorui@cisco.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20230907090210.968612-1-dzagorui@cisco.com	2023-09-08 17:04:24 -07:00
Andrii Nakryiko	dbbe15859b	Merge branch 'selftests/bpf: Optimize kallsyms cache' Rong Tao says: ==================== We need to optimize the kallsyms cache, including optimizations for the number of symbols limit, and, some test cases add new kernel symbols (such as testmods) and we need to refresh kallsyms (reload or refresh). ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-09-08 17:04:14 -07:00
Rong Tao	a28b1ba259	selftests/bpf: trace_helpers.c: Add a global ksyms initialization mutex As Jirka said [0], we just need to make sure that global ksyms initialization won't race. [0] https://lore.kernel.org/lkml/ZPCbAs3ItjRd8XVh@krava/ Signed-off-by: Rong Tao <rongtao@cestc.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/tencent_5D0A837E219E2CFDCB0495DAD7D5D1204407@qq.com	2023-09-08 16:22:41 -07:00
Rong Tao	c698eaebdf	selftests/bpf: trace_helpers.c: Optimize kallsyms cache Static ksyms often have problems because the number of symbols exceeds the MAX_SYMS limit. Like changing the MAX_SYMS from 300000 to 400000 in commit e76a014334a6("selftests/bpf: Bump and validate MAX_SYMS") solves the problem somewhat, but it's not the perfect way. This commit uses dynamic memory allocation, which completely solves the problem caused by the limitation of the number of kallsyms. At the same time, add APIs: load_kallsyms_local() ksym_search_local() ksym_get_addr_local() free_kallsyms_local() There are used to solve the problem of selftests/bpf updating kallsyms after attach new symbols during testmod testing. Signed-off-by: Rong Tao <rongtao@cestc.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/tencent_C9BDA68F9221F21BE4081566A55D66A9700A@qq.com	2023-09-08 16:22:41 -07:00
Alexei Starovoitov	9bc869253d	Merge branch 'bpf-task_group_seq_get_next-misc-cleanups' Oleg Nesterov says: ==================== bpf: task_group_seq_get_next: misc cleanups Yonghong, I am resending 1-5 of 6 as you suggested with your acks included. The next (final) patch will change this code to use __next_thread when https://lore.kernel.org/all/20230824143142.GA31222@redhat.com/ is merged. Oleg. ==================== Link: https://lore.kernel.org/r/20230905154612.GA24872@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:19 -07:00
Alexei Starovoitov	35897c3c52	Merge branch 'bpf-enable-irq-after-irq_work_raise-completes' Hou Tao says: ==================== bpf: Enable IRQ after irq_work_raise() completes From: Hou Tao <houtao1@huawei.com> Hi, The patchset aims to fix the problem that bpf_mem_alloc() may return NULL unexpectedly when multiple bpf_mem_alloc() are invoked concurrently under process context and there is still free memory available. The problem was found when doing stress test for qp-trie but the same problem also exists for bpf_obj_new() as demonstrated in patch #3. As pointed out by Alexei, the patchset can only fix ENOMEM problem for normal process context and can not fix the problem for irq-disabled context or RT-enabled kernel. Patch #1 fixes the race between unit_alloc() and unit_alloc(). Patch #2 fixes the race between unit_alloc() and unit_free(). And patch #3 adds a selftest for the problem. The major change compared with v1 is using local_irq_{save,restore)() pair to disable and enable preemption instead of preempt_{disable,enable}_notrace pair. The main reason is to prevent potential overhead from __preempt_schedule_notrace(). I also run htab_mem benchmark and hash_map_perf on a 8-CPUs KVM VM to compare the performance between local_irq_{save,restore} and preempt_{disable,enable}_notrace(), but the results are similar as shown below: (1) use preempt_{disable,enable}_notrace() [root@hello bpf]# ./map_perf_test 4 8 0:hash_map_perf kmalloc 652179 events per sec 1:hash_map_perf kmalloc 651880 events per sec 2:hash_map_perf kmalloc 651382 events per sec 3:hash_map_perf kmalloc 650791 events per sec 5:hash_map_perf kmalloc 650140 events per sec 6:hash_map_perf kmalloc 652773 events per sec 7:hash_map_perf kmalloc 652751 events per sec 4:hash_map_perf kmalloc 648199 events per sec [root@hello bpf]# ./benchs/run_bench_htab_mem.sh normal bpf ma ============= overwrite per-prod-op: 110.82 ± 0.02k/s, avg mem: 2.00 ± 0.00MiB, peak mem: 2.73MiB batch_add_batch_del per-prod-op: 89.79 ± 0.75k/s, avg mem: 1.68 ± 0.38MiB, peak mem: 2.73MiB add_del_on_diff_cpu per-prod-op: 17.83 ± 0.07k/s, avg mem: 25.68 ± 2.92MiB, peak mem: 35.10MiB (2) use local_irq_{save,restore} [root@hello bpf]# ./map_perf_test 4 8 0:hash_map_perf kmalloc 656299 events per sec 1:hash_map_perf kmalloc 656397 events per sec 2:hash_map_perf kmalloc 656046 events per sec 3:hash_map_perf kmalloc 655723 events per sec 5:hash_map_perf kmalloc 655221 events per sec 4:hash_map_perf kmalloc 654617 events per sec 6:hash_map_perf kmalloc 650269 events per sec 7:hash_map_perf kmalloc 653665 events per sec [root@hello bpf]# ./benchs/run_bench_htab_mem.sh normal bpf ma ============= overwrite per-prod-op: 116.10 ± 0.02k/s, avg mem: 2.00 ± 0.00MiB, peak mem: 2.74MiB batch_add_batch_del per-prod-op: 88.76 ± 0.61k/s, avg mem: 1.94 ± 0.33MiB, peak mem: 2.74MiB add_del_on_diff_cpu per-prod-op: 18.12 ± 0.08k/s, avg mem: 25.10 ± 2.70MiB, peak mem: 34.78MiB As ususal comments are always welcome. Change Log: v2: * Use local_irq_save to disable preemption instead of using preempt_{disable,enable}_notrace pair to prevent potential overhead v1: https://lore.kernel.org/bpf/20230822133807.3198625-1-houtao@huaweicloud.com/ ==================== Link: https://lore.kernel.org/r/20230901111954.1804721-1-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:19 -07:00
Oleg Nesterov	780aa8dfcb	bpf: task_group_seq_get_next: simplify the "next tid" logic Kill saved_tid. It looks ugly to update tid and then restore the previous value if __task_pid_nr_ns() returns 0. Change this code to update tid and common->pid_visiting once before return. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230905154656.GA24950@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:19 -07:00
Hou Tao	29c11aa808	selftests/bpf: Test preemption between bpf_obj_new() and bpf_obj_drop() The test case creates 4 threads and then pins these 4 threads in CPU 0. These 4 threads will run different bpf program through bpf_prog_test_run_opts() and these bpf program will use bpf_obj_new() and bpf_obj_drop() to allocate and free local kptrs concurrently. Under preemptible kernel, bpf_obj_new() and bpf_obj_drop() may preempt each other, bpf_obj_new() may return NULL and the test will fail before applying these fixes as shown below: test_preempted_bpf_ma_op:PASS:open_and_load 0 nsec test_preempted_bpf_ma_op:PASS:attach 0 nsec test_preempted_bpf_ma_op:PASS:no test prog 0 nsec test_preempted_bpf_ma_op:PASS:no test prog 0 nsec test_preempted_bpf_ma_op:PASS:no test prog 0 nsec test_preempted_bpf_ma_op:PASS:no test prog 0 nsec test_preempted_bpf_ma_op:PASS:pthread_create 0 nsec test_preempted_bpf_ma_op:PASS:pthread_create 0 nsec test_preempted_bpf_ma_op:PASS:pthread_create 0 nsec test_preempted_bpf_ma_op:PASS:pthread_create 0 nsec test_preempted_bpf_ma_op:PASS:run prog err 0 nsec test_preempted_bpf_ma_op:PASS:run prog err 0 nsec test_preempted_bpf_ma_op:PASS:run prog err 0 nsec test_preempted_bpf_ma_op:PASS:run prog err 0 nsec test_preempted_bpf_ma_op:FAIL:ENOMEM unexpected ENOMEM: got TRUE #168 preempted_bpf_ma_op:FAIL Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20230901111954.1804721-4-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:19 -07:00
Oleg Nesterov	0ee9808b0a	bpf: task_group_seq_get_next: kill next_task It only adds the unnecessary confusion and compicates the "retry" code. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230905154654.GA24945@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:19 -07:00
Hou Tao	62cf51cb0e	bpf: Enable IRQ after irq_work_raise() completes in unit_free{_rcu}() Both unit_free() and unit_free_rcu() invoke irq_work_raise() to free freed objects back to slab and the invocation may also be preempted by unit_alloc() and unit_alloc() may return NULL unexpectedly as shown in the following case: task A task B unit_free() // high_watermark = 48 // free_cnt = 49 after free irq_work_raise() // mark irq work as IRQ_WORK_PENDING irq_work_claim() // task B preempts task A unit_alloc() // free_cnt = 48 after alloc // does unit_alloc() 32-times ...... // free_cnt = 16 unit_alloc() // free_cnt = 15 after alloc // irq work is already PENDING, // so just return irq_work_raise() // does unit_alloc() 15-times ...... // free_cnt = 0 unit_alloc() // free_cnt = 0 before alloc return NULL Fix it by enabling IRQ after irq_work_raise() completes. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20230901111954.1804721-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:19 -07:00
Oleg Nesterov	87abbf7a54	bpf: task_group_seq_get_next: fix the skip_if_dup_files check Unless I am notally confused it is wrong. We are going to return or skip next_task so we need to check next_task-files, not task->files. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230905154651.GA24940@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:19 -07:00
Oleg Nesterov	4981921350	bpf: task_group_seq_get_next: cleanup the usage of get/put_task_struct get_pid_task() makes no sense, the code does put_task_struct() soon after. Use find_task_by_pid_ns() instead of find_pid_ns + get_pid_task and kill put_task_struct(), this allows to do get_task_struct() only once before return. While at it, kill the unnecessary "if (!pid)" check in the "if (!*tid)" block, this matches the next usage of find_pid_ns() + get_pid_task() in this function. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230905154649.GA24935@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:19 -07:00
Oleg Nesterov	1a00ef57d9	bpf: task_group_seq_get_next: cleanup the usage of next_thread() 1. find_pid_ns() + get_pid_task() under rcu_read_lock() guarantees that we can safely iterate the task->thread_group list. Even if this task exits right after get_pid_task() (or goto retry) and pid_alive() returns 0. Kill the unnecessary pid_alive() check. 2. next_thread() simply can't return NULL, kill the bogus "if (!next_task)" check. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230905154646.GA24928@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:19 -07:00
Alexei Starovoitov	1e4a6d975e	Merge branch 'bpf-add-support-for-local-percpu-kptr' Yonghong Song says: ==================== bpf: Add support for local percpu kptr Patch set [1] implemented cgroup local storage BPF_MAP_TYPE_CGRP_STORAGE similar to sk/task/inode local storage and old BPF_MAP_TYPE_CGROUP_STORAGE map is marked as deprecated since old BPF_MAP_TYPE_CGROUP_STORAGE map can only work with current cgroup. Similarly, the existing BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE map is a percpu version of BPF_MAP_TYPE_CGROUP_STORAGE and only works with current cgroup. But there is no replacement which can work with arbitrary cgroup. This patch set solved this problem but adding support for local percpu kptr. The map value can have a percpu kptr field which holds a bpf prog allocated percpu data. The below is an example, struct percpu_val_t { ... fields ... } struct map_value_t { struct percpu_val_t __percpu_kptr *percpu_data_ptr; } In the above, 'map_value_t' is the map value type for a BPF_MAP_TYPE_CGRP_STORAGE map. User can access 'percpu_data_ptr' and then read/write percpu data. This covers BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE and more. So BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE map type is marked as deprecated. In additional, local percpu kptr supports the same map type as other kptrs including hash, lru_hash, array, sk/inode/task/cgrp local storage. Currently, percpu data structure does not support non-scalars or special fields (e.g., bpf_spin_lock, bpf_rb_root, etc.). They can be supported in the future if there exist use cases. Please for individual patches for details. [1] https://lore.kernel.org/all/20221026042835.672317-1-yhs@fb.com/ Changelog: v2 -> v3: - fix libbpf_str test failure. v1 -> v2: - does not support special fields in percpu data structure. - rename __percpu attr to __percpu_kptr attr. - rename BPF_KPTR_PERCPU_REF to BPF_KPTR_PERCPU. - better code to handle bpf_{this,per}_cpu_ptr() helpers. - add more negative tests. - fix a bpftool related test failure. ==================== Link: https://lore.kernel.org/r/20230827152729.1995219-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:18 -07:00
Hou Tao	566f6de3ce	bpf: Enable IRQ after irq_work_raise() completes in unit_alloc() When doing stress test for qp-trie, bpf_mem_alloc() returned NULL unexpectedly because all qp-trie operations were initiated from bpf syscalls and there was still available free memory. bpf_obj_new() has the same problem as shown by the following selftest. The failure is due to the preemption. irq_work_raise() will invoke irq_work_claim() first to mark the irq work as pending and then inovke __irq_work_queue_local() to raise an IPI. So when the current task which is invoking irq_work_raise() is preempted by other task, unit_alloc() may return NULL for preemption task as shown below: task A task B unit_alloc() // low_watermark = 32 // free_cnt = 31 after alloc irq_work_raise() // mark irq work as IRQ_WORK_PENDING irq_work_claim() // task B preempts task A unit_alloc() // free_cnt = 30 after alloc // irq work is already PENDING, // so just return irq_work_raise() // does unit_alloc() 30-times ...... unit_alloc() // free_cnt = 0 before alloc return NULL Fix it by enabling IRQ after irq_work_raise() completes. An alternative fix is using preempt_{disable\|enable}_notrace() pair, but it may have extra overhead. Another feasible fix is to only disable preemption or IRQ before invoking irq_work_queue() and enable preemption or IRQ after the invocation completes, but it can't handle the case when c->low_watermark is 1. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20230901111954.1804721-2-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:18 -07:00
Yonghong Song	9bc95a95ab	bpf: Mark BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE deprecated Now 'BPF_MAP_TYPE_CGRP_STORAGE + local percpu ptr' can cover all BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE functionality and more. So mark BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE deprecated. Also make changes in selftests/bpf/test_bpftool_synctypes.py and selftest libbpf_str to fix otherwise test errors. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152837.2003563-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:18 -07:00
Yonghong Song	1bd7931728	selftests/bpf: Add some negative tests Add a few negative tests for common mistakes with using percpu kptr including: - store to percpu kptr. - type mistach in bpf_kptr_xchg arguments. - sleepable prog with untrusted arg for bpf_this_cpu_ptr(). - bpf_percpu_obj_new && bpf_obj_drop, and bpf_obj_new && bpf_percpu_obj_drop - struct with ptr for bpf_percpu_obj_new - struct with special field (e.g., bpf_spin_lock) for bpf_percpu_obj_new Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152832.2002421-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:18 -07:00
Yonghong Song	dfae1eeee9	selftests/bpf: Add tests for cgrp_local_storage with local percpu kptr Add a non-sleepable cgrp_local_storage test with percpu kptr. The test does allocation of percpu data, assigning values to percpu data and retrieval of percpu data. The de-allocation of percpu data is done when the map is freed. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152827.2001784-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:18 -07:00
Yonghong Song	46200d6da5	selftests/bpf: Remove unnecessary direct read of local percpu kptr For the second argument of bpf_kptr_xchg(), if the reg type contains MEM_ALLOC and MEM_PERCPU, which means a percpu allocation, after bpf_kptr_xchg(), the argument is marked as MEM_RCU and MEM_PERCPU if in rcu critical section. This way, re-reading from the map value is not needed. Remove it from the percpu_alloc_array.c selftest. Without previous kernel change, the test will fail like below: 0: R1=ctx(off=0,imm=0) R10=fp0 ; int BPF_PROG(test_array_map_10, int a) 0: (b4) w1 = 0 ; R1_w=0 ; int i, index = 0; 1: (63) (u32 )(r10 -4) = r1 ; R1_w=0 R10=fp0 fp-8=0000???? 2: (bf) r2 = r10 ; R2_w=fp0 R10=fp0 ; 3: (07) r2 += -4 ; R2_w=fp-4 ; e = bpf_map_lookup_elem(&array, &index); 4: (18) r1 = 0xffff88810e771800 ; R1_w=map_ptr(off=0,ks=4,vs=16,imm=0) 6: (85) call bpf_map_lookup_elem#1 ; R0_w=map_value_or_null(id=1,off=0,ks=4,vs=16,imm=0) 7: (bf) r6 = r0 ; R0_w=map_value_or_null(id=1,off=0,ks=4,vs=16,imm=0) R6_w=map_value_or_null(id=1,off=0,ks=4,vs=16,imm=0) ; if (!e) 8: (15) if r6 == 0x0 goto pc+81 ; R6_w=map_value(off=0,ks=4,vs=16,imm=0) ; bpf_rcu_read_lock(); 9: (85) call bpf_rcu_read_lock#87892 ; ; p = e->pc; 10: (bf) r7 = r6 ; R6=map_value(off=0,ks=4,vs=16,imm=0) R7_w=map_value(off=0,ks=4,vs=16,imm=0) 11: (07) r7 += 8 ; R7_w=map_value(off=8,ks=4,vs=16,imm=0) 12: (79) r6 = (u64 )(r6 +8) ; R6_w=percpu_rcu_ptr_or_null_val_t(id=2,off=0,imm=0) ; if (!p) { 13: (55) if r6 != 0x0 goto pc+13 ; R6_w=0 ; p = bpf_percpu_obj_new(struct val_t); 14: (18) r1 = 0x12 ; R1_w=18 16: (b7) r2 = 0 ; R2_w=0 17: (85) call bpf_percpu_obj_new_impl#87883 ; R0_w=percpu_ptr_or_null_val_t(id=4,ref_obj_id=4,off=0,imm=0) refs=4 18: (bf) r6 = r0 ; R0=percpu_ptr_or_null_val_t(id=4,ref_obj_id=4,off=0,imm=0) R6=percpu_ptr_or_null_val_t(id=4,ref_obj_id=4,off=0,imm=0) refs=4 ; if (!p) 19: (15) if r6 == 0x0 goto pc+69 ; R6=percpu_ptr_val_t(ref_obj_id=4,off=0,imm=0) refs=4 ; p1 = bpf_kptr_xchg(&e->pc, p); 20: (bf) r1 = r7 ; R1_w=map_value(off=8,ks=4,vs=16,imm=0) R7=map_value(off=8,ks=4,vs=16,imm=0) refs=4 21: (bf) r2 = r6 ; R2_w=percpu_ptr_val_t(ref_obj_id=4,off=0,imm=0) R6=percpu_ptr_val_t(ref_obj_id=4,off=0,imm=0) refs=4 22: (85) call bpf_kptr_xchg#194 ; R0_w=percpu_ptr_or_null_val_t(id=6,ref_obj_id=6,off=0,imm=0) refs=6 ; if (p1) { 23: (15) if r0 == 0x0 goto pc+3 ; R0_w=percpu_ptr_val_t(ref_obj_id=6,off=0,imm=0) refs=6 ; bpf_percpu_obj_drop(p1); 24: (bf) r1 = r0 ; R0_w=percpu_ptr_val_t(ref_obj_id=6,off=0,imm=0) R1_w=percpu_ptr_val_t(ref_obj_id=6,off=0,imm=0) refs=6 25: (b7) r2 = 0 ; R2_w=0 refs=6 26: (85) call bpf_percpu_obj_drop_impl#87882 ; ; v = bpf_this_cpu_ptr(p); 27: (bf) r1 = r6 ; R1_w=scalar(id=7) R6=scalar(id=7) 28: (85) call bpf_this_cpu_ptr#154 R1 type=scalar expected=percpu_ptr_, percpu_rcu_ptr_, percpu_trusted_ptr_ The R1 which gets its value from R6 is a scalar. But before insn 22, R6 is R6=percpu_ptr_val_t(ref_obj_id=4,off=0,imm=0) Its type is changed to a scalar at insn 22 without previous patch. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152821.2001129-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:18 -07:00
Yonghong Song	5b221ecb3a	bpf: Mark OBJ_RELEASE argument as MEM_RCU when possible In previous selftests/bpf patch, we have p = bpf_percpu_obj_new(struct val_t); if (!p) goto out; p1 = bpf_kptr_xchg(&e->pc, p); if (p1) { /* race condition */ bpf_percpu_obj_drop(p1); } p = e->pc; if (!p) goto out; After bpf_kptr_xchg(), we need to re-read e->pc into 'p'. This is due to that the second argument of bpf_kptr_xchg() is marked OBJ_RELEASE and it will be marked as invalid after the call. So after bpf_kptr_xchg(), 'p' is an unknown scalar, and the bpf program needs to reread from the map value. This patch checks if the 'p' has type MEM_ALLOC and MEM_PERCPU, and if 'p' is RCU protected. If this is the case, 'p' can be marked as MEM_RCU. MEM_ALLOC needs to be removed since 'p' is not an owning reference any more. Such a change makes re-read from the map value unnecessary. Note that re-reading 'e->pc' after bpf_kptr_xchg() might get a different value from 'p' if immediately before 'p = e->pc', another cpu may do another bpf_kptr_xchg() and swap in another value into 'e->pc'. If this is the case, then 'p = e->pc' may get either 'p' or another value, and race condition already exists. So removing direct re-reading seems fine too. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152816.2000760-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:18 -07:00
Yonghong Song	6adf82a439	selftests/bpf: Add tests for array map with local percpu kptr Add non-sleepable and sleepable tests with percpu kptr. For non-sleepable test, four programs are executed in the order of: 1. allocate percpu data. 2. assign values to percpu data. 3. retrieve percpu data. 4. de-allocate percpu data. The sleepable prog tried to exercise all above 4 steps in a single prog. Also for sleepable prog, rcu_read_lock is needed to protect direct percpu ptr access (from map value) and following bpf_this_cpu_ptr() and bpf_per_cpu_ptr() helpers. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152811.2000125-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:18 -07:00
Yonghong Song	968c76cb3d	selftests/bpf: Add bpf_percpu_obj_{new,drop}() macro in bpf_experimental.h The new macro bpf_percpu_obj_{new/drop}() is very similar to bpf_obj_{new,drop}() as they both take a type as the argument. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152805.1999417-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:18 -07:00
Yonghong Song	ed5285a148	libbpf: Add __percpu_kptr macro definition Add __percpu_kptr macro definition in bpf_helpers.h. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152800.1998492-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:18 -07:00
Andrii Nakryiko	3903802bb9	libbpf: Add basic BTF sanity validation Implement a simple and straightforward BTF sanity check when parsing BTF data. Right now it's very basic and just validates that all the string offsets and type IDs are within valid range. For FUNC we also check that it points to FUNC_PROTO kinds. Even with such simple checks it fixes a bunch of crashes found by OSS fuzzer ([0]-[5]) and will allow fuzzer to make further progress. Some other invariants will be checked in follow up patches (like ensuring there is no infinite type loops), but this seems like a good start already. Adding FUNC -> FUNC_PROTO check revealed that one of selftests has a problem with FUNC pointing to VAR instead, so fix it up in the same commit. [0] https://github.com/libbpf/libbpf/issues/482 [1] https://github.com/libbpf/libbpf/issues/483 [2] https://github.com/libbpf/libbpf/issues/485 [3] https://github.com/libbpf/libbpf/issues/613 [4] https://github.com/libbpf/libbpf/issues/618 [5] https://github.com/libbpf/libbpf/issues/619 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Reviewed-by: Song Liu <song@kernel.org> Closes: https://github.com/libbpf/libbpf/issues/617 Link: https://lore.kernel.org/bpf/20230825202152.1813394-1-andrii@kernel.org	2023-09-08 08:42:17 -07:00
Yonghong Song	96fc99d3d5	selftests/bpf: Update error message in negative linked_list test Some error messages are changed due to the addition of percpu kptr support. Fix linked_list test with changed error messages. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152754.1997769-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:17 -07:00
Yonghong Song	01cc55af93	bpf: Add bpf_this_cpu_ptr/bpf_per_cpu_ptr support for allocated percpu obj The bpf helpers bpf_this_cpu_ptr() and bpf_per_cpu_ptr() are re-purposed for allocated percpu objects. For an allocated percpu obj, the reg type is 'PTR_TO_BTF_ID \| MEM_PERCPU \| MEM_RCU'. The return type for these two re-purposed helpera is 'PTR_TO_MEM \| MEM_RCU \| MEM_ALLOC'. The MEM_ALLOC allows that the per-cpu data can be read and written. Since the memory allocator bpf_mem_alloc() returns a ptr to a percpu ptr for percpu data, the first argument of bpf_this_cpu_ptr() and bpf_per_cpu_ptr() is patched with a dereference before passing to the helper func. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152749.1997202-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:17 -07:00
Yonghong Song	36d8bdf75a	bpf: Add alloc/xchg/direct_access support for local percpu kptr Add two new kfunc's, bpf_percpu_obj_new_impl() and bpf_percpu_obj_drop_impl(), to allocate a percpu obj. Two functions are very similar to bpf_obj_new_impl() and bpf_obj_drop_impl(). The major difference is related to percpu handling. bpf_rcu_read_lock() struct val_t __percpu_kptr *v = map_val->percpu_data; ... bpf_rcu_read_unlock() For a percpu data map_val like above 'v', the reg->type is set as PTR_TO_BTF_ID \| MEM_PERCPU \| MEM_RCU if inside rcu critical section. MEM_RCU marking here is similar to NON_OWN_REF as 'v' is not a owning reference. But NON_OWN_REF is trusted and typically inside the spinlock while MEM_RCU is under rcu read lock. RCU is preferred here since percpu data structures mean potential concurrent access into its contents. Also, bpf_percpu_obj_new_impl() is restricted such that no pointers or special fields are allowed. Therefore, the bpf_list_head and bpf_rb_root will not be supported in this patch set to avoid potential memory leak issue due to racing between bpf_obj_free_fields() and another bpf_kptr_xchg() moving an allocated object to bpf_list_head and bpf_rb_root. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152744.1996739-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:17 -07:00
Yonghong Song	55db92f42f	bpf: Add BPF_KPTR_PERCPU as a field type BPF_KPTR_PERCPU represents a percpu field type like below struct val_t { ... fields ... }; struct t { ... struct val_t __percpu_kptr *percpu_data_ptr; ... }; where #define __percpu_kptr __attribute__((btf_type_tag("percpu_kptr"))) While BPF_KPTR_REF points to a trusted kernel object or a trusted local object, BPF_KPTR_PERCPU points to a trusted local percpu object. This patch added basic support for BPF_KPTR_PERCPU related to percpu_kptr field parsing, recording and free operations. BPF_KPTR_PERCPU also supports the same map types as BPF_KPTR_REF does. Note that unlike a local kptr, it is possible that a BPF_KTPR_PERCPU struct may not contain any special fields like other kptr, bpf_spin_lock, bpf_list_head, etc. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152739.1996391-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:17 -07:00
Yonghong Song	41a5db8d81	bpf: Add support for non-fix-size percpu mem allocation This is needed for later percpu mem allocation when the allocation is done by bpf program. For such cases, a global bpf_global_percpu_ma is added where a flexible allocation size is needed. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152734.1995725-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-08 08:42:17 -07:00
Linus Torvalds	73be7fb14e	Including fixes from netfilter and bpf. Current release - regressions: - eth: stmmac: fix failure to probe without MAC interface specified Current release - new code bugs: - docs: netlink: fix missing classic_netlink doc reference Previous releases - regressions: - deal with integer overflows in kmalloc_reserve() - use sk_forward_alloc_get() in sk_get_meminfo() - bpf_sk_storage: fix the missing uncharge in sk_omem_alloc - fib: avoid warn splat in flow dissector after packet mangling - skb_segment: call zero copy functions before using skbuff frags - eth: sfc: check for zero length in EF10 RX prefix Previous releases - always broken: - af_unix: fix msg_controllen test in scm_pidfd_recv() for MSG_CMSG_COMPAT - xsk: fix xsk_build_skb() dereferencing possible ERR_PTR() - netfilter: - nft_exthdr: fix non-linear header modification - xt_u32, xt_sctp: validate user space input - nftables: exthdr: fix 4-byte stack OOB write - nfnetlink_osf: avoid OOB read - one more fix for the garbage collection work from last release - igmp: limit igmpv3_newpack() packet size to IP_MAX_MTU - bpf, sockmap: fix preempt_rt splat when using raw_spin_lock_t - handshake: fix null-deref in handshake_nl_done_doit() - ip: ignore dst hint for multipath routes to ensure packets are hashed across the nexthops - phy: micrel: - correct bit assignments for cable test errata - disable EEE according to the KSZ9477 errata Misc: - docs/bpf: document compile-once-run-everywhere (CO-RE) relocations - Revert "net: macsec: preserve ingress frame ordering", it appears to have been developed against an older kernel, problem doesn't exist upstream Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmT6R6wACgkQMUZtbf5S IrsmTg//TgmRjxSZ0lrPQtJwZR/eN3ZR2oQG3rwnssCx+YgHEGGxQsfT4KHEMacR ZgGDZVTpthUJkkACBPi8ZMoy++RdjEmlCcanfeDkGHoYGtiX1lhkofhLMn1KUHbI rIbP9EdNKxQT0SsBlw/U28pD5jKyqOgL23QobEwmcjLTdMpamb+qIsD6/xNv9tEj Tu4BdCIkhjxnBD622hsE3pFTG7oSn2WM6rf5NT1E43mJ3W8RrMcydSB27J7Oryo9 l3nYMAhz0vQINS2WQ9eCT1/7GI6gg1nDtxFtrnV7ASvxayRBPIUr4kg1vT+Tixsz CZMnwVamEBIYl9agmj7vSji7d5nOUgXPhtWhwWUM2tRoGdeGw3vSi1pgDvRiUCHE PJ4UHv7goa2AgnOlOQCFtRybAu+9nmSGm7V+GkeGLnH7xbFsEa5smQ/+FSPJs8Dn Yf4q5QAhdN8tdnofRlrN/nCssoDF3cfmBsTJ7wo5h71gW+BWhsP58eDCJlXd/r8k +Qnvoe2kw27ktFR1tjsUDZ0AcSmeVARNwmXCOBYZsG4tEek8pLyj008mDvJvdfyn PGPn7Eo5DyaERlHVmPuebHXSyniDEPe2GLTmlHcGiRpGspoUHbB+HRiDAuRLMB9g pkL8RHpNfppnuUXeUoNy3rgEkYwlpTjZX0QHC6N8NQ76ccB6CNM= =YpmE -----END PGP SIGNATURE----- Merge tag 'net-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking updates from Jakub Kicinski: "Including fixes from netfilter and bpf. Current release - regressions: - eth: stmmac: fix failure to probe without MAC interface specified Current release - new code bugs: - docs: netlink: fix missing classic_netlink doc reference Previous releases - regressions: - deal with integer overflows in kmalloc_reserve() - use sk_forward_alloc_get() in sk_get_meminfo() - bpf_sk_storage: fix the missing uncharge in sk_omem_alloc - fib: avoid warn splat in flow dissector after packet mangling - skb_segment: call zero copy functions before using skbuff frags - eth: sfc: check for zero length in EF10 RX prefix Previous releases - always broken: - af_unix: fix msg_controllen test in scm_pidfd_recv() for MSG_CMSG_COMPAT - xsk: fix xsk_build_skb() dereferencing possible ERR_PTR() - netfilter: - nft_exthdr: fix non-linear header modification - xt_u32, xt_sctp: validate user space input - nftables: exthdr: fix 4-byte stack OOB write - nfnetlink_osf: avoid OOB read - one more fix for the garbage collection work from last release - igmp: limit igmpv3_newpack() packet size to IP_MAX_MTU - bpf, sockmap: fix preempt_rt splat when using raw_spin_lock_t - handshake: fix null-deref in handshake_nl_done_doit() - ip: ignore dst hint for multipath routes to ensure packets are hashed across the nexthops - phy: micrel: - correct bit assignments for cable test errata - disable EEE according to the KSZ9477 errata Misc: - docs/bpf: document compile-once-run-everywhere (CO-RE) relocations - Revert "net: macsec: preserve ingress frame ordering", it appears to have been developed against an older kernel, problem doesn't exist upstream" * tag 'net-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (95 commits) net: enetc: distinguish error from valid pointers in enetc_fixup_clear_rss_rfs() Revert "net: team: do not use dynamic lockdep key" net: hns3: remove GSO partial feature bit net: hns3: fix the port information display when sfp is absent net: hns3: fix invalid mutex between tc qdisc and dcb ets command issue net: hns3: fix debugfs concurrency issue between kfree buffer and read net: hns3: fix byte order conversion issue in hclge_dbg_fd_tcam_read() net: hns3: Support query tx timeout threshold by debugfs net: hns3: fix tx timeout issue net: phy: Provide Module 4 KSZ9477 errata (DS80000754C) netfilter: nf_tables: Unbreak audit log reset netfilter: ipset: add the missing IP_SET_HASH_WITH_NET0 macro for ip_set_hash_netportnet.c netfilter: nft_set_rbtree: skip sync GC for new elements in this transaction netfilter: nf_tables: uapi: Describe NFTA_RULE_CHAIN_ID netfilter: nfnetlink_osf: avoid OOB read netfilter: nftables: exthdr: fix 4-byte stack OOB write selftests/bpf: Check bpf_sk_storage has uncharged sk_omem_alloc bpf: bpf_sk_storage: Fix the missing uncharge in sk_omem_alloc bpf: bpf_sk_storage: Fix invalid wait context lockdep report s390/bpf: Pass through tail call counter in trampolines ...	2023-09-07 18:33:07 -07:00
Linus Torvalds	2ab35ce202	Devicetree fixes for v6.6, part 1: - Convert st,stih407-irq-syscfg and Omnivision OV7251 bindings to DT schema - Merge Omnivision OV5695 into OV5693 binding - Fix of_overlay_fdt_apply prototype when !CONFIG_OF_OVERLAY -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmT6BSsACgkQ+vtdtY28 YcPSTQ/9G+hCbWrfY6e+qoHPtZo37qC2kqop5ll+N6clDNaad/BtX1J9BJGPkoPb ySUaHy5z+GbqZUHbKDbExI40h6oNbtPvOUeMvlwOVEJYjzA3OZZefd6gQihC1XsZ nOkjIRfWRkzND0MYb9gb/LGIiKOfC/gTEhPqIhtkyhoMljPLNphbg+Rt04Qwfokf 04/yc6JQNFUcyVU3iHz1PwbIYHo1t1fgATzYBaUQcHUZrmBtNtVKwpCx8LQsDCeX U7Drl02l9SHDZYu1U1kpzxz6Pl6GvAFC7Vda5EUxU6p0AwqUChOJRv16Z2OpK+9f HPsN3osOuW+QS6H5xlzVH4zylcWjEbiL0ifWP5v557+OZqFcpfz2uL5BUjl9qoQ4 WNvbfqnIqXS4gftqUAEoKY71cIQEuRFmnpSTxbjq0VyXDuCyIzNau0gmGXLq60LV lC63BQEOk9ZCvGMuGli1ild7+jSnYU1n73+UV0VUBrwbtvHiphPVVZ5JAmfMzvKH 8rAPqZbf5TY5sONqGj+hbmAnLiiuf0khu21pKoaNMbnjxnA+BiKN9GiP6+EqPyHY oBj3qSPWzq4DRs+ZjfIsmLu1kmoznjL4BTmeK2sFXlVa8XKfZRN6FYp8wNXJWpFo MnHG10jgm5Js6lE+vaQOZ0BFzL7JYRESu1YTPn/NXBA7NY8CTkg= =n+1p -----END PGP SIGNATURE----- Merge tag 'devicetree-fixes-for-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull more devicetree updates from Rob Herring: "A couple of conversions which didn't get picked up by the subsystems and one fix: - Convert st,stih407-irq-syscfg and Omnivision OV7251 bindings to DT schema - Merge Omnivision OV5695 into OV5693 binding - Fix of_overlay_fdt_apply prototype when !CONFIG_OF_OVERLAY" * tag 'devicetree-fixes-for-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: irqchip: convert st,stih407-irq-syscfg to DT schema media: dt-bindings: Convert Omnivision OV7251 to DT schema media: dt-bindings: Merge OV5695 into OV5693 binding of: overlay: Fix of_overlay_fdt_apply prototype when !CONFIG_OF_OVERLAY	2023-09-07 18:16:37 -07:00
Linus Torvalds	8d844b3518	pwm: Changes for v6.6-rc1 This contains various cleanups and fixes across the board. -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEEiOrDCAFJzPfAjcif3SOs138+s6EFAmT55DMZHHRoaWVycnku cmVkaW5nQGdtYWlsLmNvbQAKCRDdI6zXfz6zoZHtEACJJRp36f/VBNHIkdIHlgnY OA1ScjwsLSmavWo6IJtGM/iAawnNBnKyMNymWjjs6240cNlXmIjQBvEX0zkRVsrp kuREZS0o5yS1kaM9QMXpur6HLNqwKxFGIzvUlcN2IB+myCBxfEQ/KlR3u3vRdQyH yB+9dvpAS1iRU957WcmdAtnid1j3mwxbFnNBMPp9iV7iH0Lg1TDSHuf1OCxc8lnP BZhS0zOjLUY8Eyo/pDkI9IIA4lXIg/JH9Ux4n88Ag84UiU/Q12APyT6R5nClK8Cr 0+LUHzjL4ZJbGdUkHyJtfzIGaO0Qy8TTgn7irPQChdIVTehhH5T4Uzl7v0EFudWi qz3BeGnOlFlhQG0WdAwH8pYYeTIVOn5HjjXQunmk36e1b+FNg5baQZ7gInry172b HT9KmDGfFBaDME1mZ4IayCvmjRIuoFWI6GtS/ykPBOTd58CjytMJD+khuTUwkUSd D5KAakc70JYBHQLzsRZExiP5RwxJ8LChmwF4yBfROF0S7KCL84R4pbEMLZ0ko08g bcxyZ8yHoTEpRxg332B1M/A/KFAMPiit0qK3LnuGdP0B73GFBb9mGvgwb6vwzRx1 Vo6Ewr8A3wZAL56rzXPFvabnMblzvHbuNQTMFkJLhA+doI4n3Oq+KW1u7Cf4ygEF qGOFsQsphAHZFrIob8xleg== =kax5 -----END PGP SIGNATURE----- Merge tag 'pwm/for-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm Pull pwm updates from Thierry Reding: "Various cleanups and fixes across the board" * tag 'pwm/for-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (31 commits) pwm: lpc32xx: Remove handling of PWM channels pwm: atmel: Simplify using devm functions dt-bindings: pwm: brcm,kona-pwm: convert to YAML pwm: stmpe: Handle errors when disabling the signal pwm: stm32: Simplify using devm_pwmchip_add() pwm: stm32: Don't modify HW state in .remove() callback pwm: Fix order of freeing resources in pwmchip_remove() pwm: ntxec: Use device_set_of_node_from_dev() pwm: ntxec: Drop a write-only variable from driver data pwm: pxa: Don't reimplement of_device_get_match_data() pwm: lpc18xx-sct: Simplify using devm_clk_get_enabled() pwm: atmel-tcb: Don't track polarity in driver data pwm: atmel-tcb: Unroll atmel_tcb_pwm_set_polarity() into only caller pwm: atmel-tcb: Put per-channel data into driver data pwm: atmel-tcb: Fix resource freeing in error path and remove pwm: atmel-tcb: Harmonize resource allocation order pwm: Drop unused #include <linux/radix-tree.h> pwm: rz-mtu3: Fix build warning 'num_channel_ios' not described pwm: Remove outdated documentation for pwmchip_remove() pwm: atmel: Enable clk when pwm already enabled in bootloader ...	2023-09-07 18:05:58 -07:00
Linus Torvalds	ff6e6ded54	RTC for 6.6 Subsystem: - Add a way for drivers to tell the core the supported alarm range is smaller than the date range. This is not used yet but will be useful for the alarmtimers in the next release. - fix Wvoid-pointer-to-enum-cast warnings - remove redundant of_match_ptr() - stop warning for invalid alarms when the alarm is disabled Drivers: - isl12022: allow setting the trip level for battery level detection - pcf2127: add support for PCF2131 and multiple timestamps - stm32: time precision improvement, many fixes - twl: NVRAM support -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEBqsFVZXh8s/0O5JiY6TcMGxwOjIFAmT6M14ACgkQY6TcMGxw OjKE3Q/+N7OMaTeJcqk1c5JEFZukOIYGZnogDSAvOF5mP9Y0TBlc/+UD5V6SgPLG B0oQfAEbYoo5e9VRCawIXBuE1fFA4AsHd6yc4395pEB5OZPpYcdvEWHfB1MQCKSy /sQ2xeevdOLtpHxqjK2cN1crlWOo1YACuUoXgN1IvOgtLC7NKxH77q+I9gvhNWiy P/U9gCRR7ji/GZ5LssOi9+rTj5gi8eeW6R9WzPfARAljGZwgMY2dzZWDmmq5gKJn VTLeoxbvsMNgM5G72sUlSBeeN1lPFIIknx3+DU4+v9JqSklI1GarySfnFqRqIDp6 JQKicAfv+PXbrqJliUGUvU5O0JTDjpsRI23ridKQ/jFaBkk8k26jXehK5IYUBO+A xvIsTZgSn53oGMoAZFSUGTIC4m0KDCQ6Wu0rCDr1OCecds2ueYfBHfCOj5rNvBdE ZAb/m0tquX5hkVuLqGt7Wn32PdpsPA9noz4cdgtDkNLL2juEYwBHUUBnuuHsEEzR dDoKdcm3q+KA9PEW0ftLb069UQ1ddca2Y8Mxp79Xt3mQhg28BMWV9h9qyPdicS5W Cti54qplG7EP5SRqsyqh6KH91ZfJFMbyb+ivEEQlYeOK45IFLkGXjVBZmHRPuZaV tjbNkp74asaftpMIB9kiW82JdJDUNGKYjYbSyssAWABTDbRlpGc= =4kkJ -----END PGP SIGNATURE----- Merge tag 'rtc-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC updates from Alexandre Belloni: "Subsystem: - Add a way for drivers to tell the core the supported alarm range is smaller than the date range. This is not used yet but will be useful for the alarmtimers in the next release. - fix Wvoid-pointer-to-enum-cast warnings - remove redundant of_match_ptr() - stop warning for invalid alarms when the alarm is disabled Drivers: - isl12022: allow setting the trip level for battery level detection - pcf2127: add support for PCF2131 and multiple timestamps - stm32: time precision improvement, many fixes - twl: NVRAM support" * tag 'rtc-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (73 commits) dt-bindings: rtc: ds3231: Remove text binding rtc: wm8350: remove unnecessary messages rtc: twl: remove unnecessary messages rtc: sun6i: remove unnecessary message rtc: stop warning for invalid alarms when the alarm is disabled rtc: twl: add NVRAM support rtc: pcf85363: Allow to wake up system without IRQ rtc: m48t86: add DT support for m48t86 dt-bindings: rtc: Add ST M48T86 rtc: pcf2127: remove useless check rtc: rzn1: Report maximum alarm limit to rtc core rtc: ds1305: Report maximum alarm limit to rtc core rtc: tps6586x: Report maximum alarm limit to rtc core rtc: cmos: Report supported alarm limit to rtc infrastructure rtc: cros-ec: Detect and report supported alarm window size rtc: Add support for limited alarm timer offsets rtc: isl1208: Fix incorrect logic in isl1208_set_xtoscb() MAINTAINERS: remove obsolete pattern in RTC SUBSYSTEM section rtc: tps65910: Remove redundant dev_warn() and do not check for 0 return after calling platform_get_irq() rtc: omap: Do not check for 0 return after calling platform_get_irq() ...	2023-09-07 16:07:35 -07:00
Linus Torvalds	e59a698b2d	I3C for 6.6 Core: - Fix SETDASA when static and dynamic adress are equal - Fix cmd_v1 DAA exit criteria Drivers: - svc: allow probing without any device -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEBqsFVZXh8s/0O5JiY6TcMGxwOjIFAmT49e4ACgkQY6TcMGxw OjLM/g//R0z4u82PTEzefslGWmwiap6uH4cXV3HofXIyk7hsYG+81Z3evdqaaIjQ rg+rGdXP4tt0VUmhEYtuYoM2NeNGaSm6FPa7FcF5qcquSvPW9TGtyCLX2T4L93ns CQC7CQah0XAnFresoMtvvSOfFJ6mz3EUzWR35n386L3t9ap1f7P87Gl9d7dyIwf7 bmbV2yAc0ezITTDyqTEUiFIBE+wupt9b1go5nY9i/F4DyxDx9xZhPT2ryaA2lEfQ dzBkz9i8PnrawURdG1zjnGGalIaq7g6djXIUJ1g2SZ3pMHt77OZtFgU63UD7/LcU NiXHi1CXJA5iwXVNpp3yA09xtwWwrV5hIoPqN4A1eQqArbS/nMl+mOGAbL40JVkd nkE2SmFXUWBBYI6UDnuDXhpmgwbFk6jWWV3mmuMxJwAjEXIHNmzgIlX8J2o+dAyA JVFzJGBcgUtto6Kklrv49cRzdycuVw28JPro8TL2Fodpp4EuK+7R1QfDE9Lejnxf aAScSf0EoHCaIik3NxxXb8WYB+LO1QQPUMKZkN9TZ9jTYxjBViE3POyUChmMmW/U zWggD8ky6yTJPZuuyE++KlqbkQMPowI77PUkYGYXxtj+QH09I9kPJk03G/AAeqrm rsnwYAddRZfUknvGOXPfDHw0wY/7Y9UHG1Keg9UFl8XZevZNJfc= =av5t -----END PGP SIGNATURE----- Merge tag 'i3c/for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux Pull i3c updates from Alexandre Belloni: "Core: - Fix SETDASA when static and dynamic adress are equal - Fix cmd_v1 DAA exit criteria Drivers: - svc: allow probing without any device" * tag 'i3c/for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux: i3c: master: svc: fix probe failure when no i3c device exist i3c: master: Fix SETDASA process dt-bindings: i3c: Fix description for assigned-address i3c: master: svc: Describe member 'saved_regs' i3c: master: svc: Do not check for 0 return after calling platform_get_irq() i3c/master: cmd_v1: Fix the exit criteria for the daa procedure i3c: Explicitly include correct DT includes	2023-09-07 15:59:57 -07:00
Linus Torvalds	d9b9ea589b	regulator: Fixes for v6.6 A couple of fixes that came in during the merge window, both driver specific - one for a bug that came up in testing, one for a bug due to a misreading of the datasheet. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmT5wtcACgkQJNaLcl1U h9D42Qf+LB9UYFJKAXPyLkpQYGHKbSaqz6hIMFWjmpsVjvuahI63wVRv/GoPZDIU B1Zy8mrC4t37BK/CmjlwvxVFPuY+DaYTTBjjnSEnfrDsQZ+PgfIRmfqzxp/q6TYr cpDOg/HrLkvfzNoTVtnUDz4Kn1l66OAMWXlwZ/AZEW8uD/JY172H7tBfjqDdzDvc NCxKW0FO/pX07kc1IFaPl6UvZRqJLQUkYl7dKFl7PrG6Lzh1+FKQSUmtZY33Ndgd LqFux/mc3jsV5E6Y0F899Lf1890VxzYHpbwT1QrHdaXhbqoACrQoaFfLfvd5YQLY K9ncwzj8hNM87t0LYLH09lSsH5L2Xw== =UyE6 -----END PGP SIGNATURE----- Merge tag 'regulator-fix-v6.6-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "A couple of fixes that came in during the merge window, both driver specific - one for a bug that came up in testing, one for a bug due to a misreading of the datasheet" * tag 'regulator-fix-v6.6-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: tps6594-regulator: Fix random kernel crash regulator: tps6287x: Fix n_voltages	2023-09-07 15:51:07 -07:00
Linus Torvalds	32904dec06	spi: Fixes for v6.6 A couple of fixes for the sun6i driver, the patch to reduce DMA RX to single byte width all the time is hopefully excessively cautious but it's unclear which SoCs are affected so the fix just covers everything for safety. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmT5wQ0ACgkQJNaLcl1U h9Bltgf/T7tViGF1+g0fUx7Nlju+h8Hg3qBhlE+VKwRO6/qjBczT4ny4nkiej+CK IapuEDul+F56Cy4/dgyFQkICTCm7qGev+r7Oya0OCWqAiXif5fz/zfxc4rHUzRA9 lZrdWrYNxhlAqcDu80G00nh5Gx5UCpTHEriWJQUQ0oxXGQoPRPW2WczMsksF6QLs gnqr7hhIWzSvj0OOHVWi6a1iOMVgw0ICEgUF6TSoDsMOzvi6c1WVnCLZaE6udWhv 0wwOBzH1YBbwhfK/RdjUcSfHIy/jI2Gs7BnrUWviIfQSCb90V9YvAGlgusJAR4Ks Y5UM3d8pVxQg5tW75IFjMt/fic6Rfg== =HZqv -----END PGP SIGNATURE----- Merge tag 'spi-fix-v6.6-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A couple of fixes for the sun6i driver. The patch to reduce DMA RX to single byte width all the time is hopefully excessively cautious but it's unclear which SoCs are affected so the fix just covers everything for safety" * tag 'spi-fix-v6.6-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: sun6i: fix race between DMA RX transfer completion and RX FIFO drain spi: sun6i: reduce DMA RX transfer width to single byte	2023-09-07 15:49:20 -07:00
Linus Torvalds	0c02183427	ARM: * Clean up vCPU targets, always returning generic v8 as the preferred target * Trap forwarding infrastructure for nested virtualization (used for traps that are taken from an L2 guest and are needed by the L1 hypervisor) * FEAT_TLBIRANGE support to only invalidate specific ranges of addresses when collapsing a table PTE to a block PTE. This avoids that the guest refills the TLBs again for addresses that aren't covered by the table PTE. * Fix vPMU issues related to handling of PMUver. * Don't unnecessary align non-stack allocations in the EL2 VA space * Drop HCR_VIRT_EXCP_MASK, which was never used... * Don't use smp_processor_id() in kvm_arch_vcpu_load(), but the cpu parameter instead * Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort() * Remove prototypes without implementations RISC-V: * Zba, Zbs, Zicntr, Zicsr, Zifencei, and Zihpm support for guest * Added ONE_REG interface for SATP mode * Added ONE_REG interface to enable/disable multiple ISA extensions * Improved error codes returned by ONE_REG interfaces * Added KVM_GET_REG_LIST ioctl() implementation for KVM RISC-V * Added get-reg-list selftest for KVM RISC-V s390: * PV crypto passthrough enablement (Tony, Steffen, Viktor, Janosch) Allows a PV guest to use crypto cards. Card access is governed by the firmware and once a crypto queue is "bound" to a PV VM every other entity (PV or not) looses access until it is not bound anymore. Enablement is done via flags when creating the PV VM. * Guest debug fixes (Ilya) x86: * Clean up KVM's handling of Intel architectural events * Intel bugfixes * Add support for SEV-ES DebugSwap, allowing SEV-ES guests to use debug registers and generate/handle #DBs * Clean up LBR virtualization code * Fix a bug where KVM fails to set the target pCPU during an IRTE update * Fix fatal bugs in SEV-ES intrahost migration * Fix a bug where the recent (architecturally correct) change to reinject #BP and skip INT3 broke SEV guests (can't decode INT3 to skip it) * Retry APIC map recalculation if a vCPU is added/enabled * Overhaul emergency reboot code to bring SVM up to par with VMX, tie the "emergency disabling" behavior to KVM actually being loaded, and move all of the logic within KVM * Fix user triggerable WARNs in SVM where KVM incorrectly assumes the TSC ratio MSR cannot diverge from the default when TSC scaling is disabled up related code * Add a framework to allow "caching" feature flags so that KVM can check if the guest can use a feature without needing to search guest CPUID * Rip out the ancient MMU_DEBUG crud and replace the useful bits with CONFIG_KVM_PROVE_MMU * Fix KVM's handling of !visible guest roots to avoid premature triple fault injection * Overhaul KVM's page-track APIs, and KVMGT's usage, to reduce the API surface that is needed by external users (currently only KVMGT), and fix a variety of issues in the process This last item had a silly one-character bug in the topic branch that was sent to me. Because it caused pretty bad selftest failures in some configurations, I decided to squash in the fix. So, while the exact commit ids haven't been in linux-next, the code has (from the kvm-x86 tree). Generic: * Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier events to pass action specific data without needing to constantly update the main handlers. * Drop unused function declarations Selftests: * Add testcases to x86's sync_regs_test for detecting KVM TOCTOU bugs * Add support for printf() in guest code and covert all guest asserts to use printf-based reporting * Clean up the PMU event filter test and add new testcases * Include x86 selftests in the KVM x86 MAINTAINERS entry -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmT1m0kUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroMNgggAiN7nz6UC423FznuI+yO3TLm8tkx1 CpKh5onqQogVtchH+vrngi97cfOzZb1/AtifY90OWQi31KEWhehkeofcx7G6ERhj 5a9NFADY1xGBsX4exca/VHDxhnzsbDWaWYPXw5vWFWI6erft9Mvy3tp1LwTvOzqM v8X4aWz+g5bmo/DWJf4Wu32tEU6mnxzkrjKU14JmyqQTBawVmJ3RYvHVJ/Agpw+n hRtPAy7FU6XTdkmq/uCT+KUHuJEIK0E/l1js47HFAqSzwdW70UDg14GGo1o4ETxu RjZQmVNvL57yVgi6QU38/A0FWIsWQm5IlaX1Ug6x8pjZPnUKNbo9BY4T1g== =W+4p -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "ARM: - Clean up vCPU targets, always returning generic v8 as the preferred target - Trap forwarding infrastructure for nested virtualization (used for traps that are taken from an L2 guest and are needed by the L1 hypervisor) - FEAT_TLBIRANGE support to only invalidate specific ranges of addresses when collapsing a table PTE to a block PTE. This avoids that the guest refills the TLBs again for addresses that aren't covered by the table PTE. - Fix vPMU issues related to handling of PMUver. - Don't unnecessary align non-stack allocations in the EL2 VA space - Drop HCR_VIRT_EXCP_MASK, which was never used... - Don't use smp_processor_id() in kvm_arch_vcpu_load(), but the cpu parameter instead - Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort() - Remove prototypes without implementations RISC-V: - Zba, Zbs, Zicntr, Zicsr, Zifencei, and Zihpm support for guest - Added ONE_REG interface for SATP mode - Added ONE_REG interface to enable/disable multiple ISA extensions - Improved error codes returned by ONE_REG interfaces - Added KVM_GET_REG_LIST ioctl() implementation for KVM RISC-V - Added get-reg-list selftest for KVM RISC-V s390: - PV crypto passthrough enablement (Tony, Steffen, Viktor, Janosch) Allows a PV guest to use crypto cards. Card access is governed by the firmware and once a crypto queue is "bound" to a PV VM every other entity (PV or not) looses access until it is not bound anymore. Enablement is done via flags when creating the PV VM. - Guest debug fixes (Ilya) x86: - Clean up KVM's handling of Intel architectural events - Intel bugfixes - Add support for SEV-ES DebugSwap, allowing SEV-ES guests to use debug registers and generate/handle #DBs - Clean up LBR virtualization code - Fix a bug where KVM fails to set the target pCPU during an IRTE update - Fix fatal bugs in SEV-ES intrahost migration - Fix a bug where the recent (architecturally correct) change to reinject #BP and skip INT3 broke SEV guests (can't decode INT3 to skip it) - Retry APIC map recalculation if a vCPU is added/enabled - Overhaul emergency reboot code to bring SVM up to par with VMX, tie the "emergency disabling" behavior to KVM actually being loaded, and move all of the logic within KVM - Fix user triggerable WARNs in SVM where KVM incorrectly assumes the TSC ratio MSR cannot diverge from the default when TSC scaling is disabled up related code - Add a framework to allow "caching" feature flags so that KVM can check if the guest can use a feature without needing to search guest CPUID - Rip out the ancient MMU_DEBUG crud and replace the useful bits with CONFIG_KVM_PROVE_MMU - Fix KVM's handling of !visible guest roots to avoid premature triple fault injection - Overhaul KVM's page-track APIs, and KVMGT's usage, to reduce the API surface that is needed by external users (currently only KVMGT), and fix a variety of issues in the process Generic: - Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier events to pass action specific data without needing to constantly update the main handlers. - Drop unused function declarations Selftests: - Add testcases to x86's sync_regs_test for detecting KVM TOCTOU bugs - Add support for printf() in guest code and covert all guest asserts to use printf-based reporting - Clean up the PMU event filter test and add new testcases - Include x86 selftests in the KVM x86 MAINTAINERS entry" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (279 commits) KVM: x86/mmu: Include mmu.h in spte.h KVM: x86/mmu: Use dummy root, backed by zero page, for !visible guest roots KVM: x86/mmu: Disallow guest from using !visible slots for page tables KVM: x86/mmu: Harden TDP MMU iteration against root w/o shadow page KVM: x86/mmu: Harden new PGD against roots without shadow pages KVM: x86/mmu: Add helper to convert root hpa to shadow page drm/i915/gvt: Drop final dependencies on KVM internal details KVM: x86/mmu: Handle KVM bookkeeping in page-track APIs, not callers KVM: x86/mmu: Drop @slot param from exported/external page-track APIs KVM: x86/mmu: Bug the VM if write-tracking is used but not enabled KVM: x86/mmu: Assert that correct locks are held for page write-tracking KVM: x86/mmu: Rename page-track APIs to reflect the new reality KVM: x86/mmu: Drop infrastructure for multiple page-track modes KVM: x86/mmu: Use page-track notifiers iff there are external users KVM: x86/mmu: Move KVM-only page-track declarations to internal header KVM: x86: Remove the unused page-track hook track_flush_slot() drm/i915/gvt: switch from ->track_flush_slot() to ->track_remove_region() KVM: x86: Add a new page-track hook to handle memslot deletion drm/i915/gvt: Don't bother removing write-protection on to-be-deleted slot KVM: x86: Reject memslot MOVE operations if KVMGT is attached ...	2023-09-07 13:52:20 -07:00
Vladimir Oltean	1b36955cc0	net: enetc: distinguish error from valid pointers in enetc_fixup_clear_rss_rfs() enetc_psi_create() returns an ERR_PTR() or a valid station interface pointer, but checking for the non-NULL quality of the return code blurs that difference away. So if enetc_psi_create() fails, we call enetc_psi_destroy() when we shouldn't. This will likely result in crashes, since enetc_psi_create() cleans up everything after itself when it returns an ERR_PTR(). Fixes: f0168042a212 ("net: enetc: reimplement RFS/RSS memory clearing as PCI quirk") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/netdev/582183ef-e03b-402b-8e2d-6d9bb3c83bd9@moroto.mountain/ Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230906141609.247579-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-09-07 11:19:42 -07:00
Jakub Kicinski	6afcf0fb92	Revert "net: team: do not use dynamic lockdep key" This reverts commit 39285e124edbc752331e98ace37cc141a6a3747a. Looks like the change has unintended consequences in exposing objects before they are initialized. Let's drop this patch and try again in net-next. Reported-by: syzbot+44ae022028805f4600fc@syzkaller.appspotmail.com Fixes: 39285e124edb ("net: team: do not use dynamic lockdep key") Link: https://lore.kernel.org/all/20230907103124.6adb7256@kernel.org/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-09-07 11:19:36 -07:00
Linus Torvalds	4a0fc73da9	more s390 updates for 6.6 merge window - Couple of virtual vs physical address confusion fixes - Rework locking in dcssblk driver to address a lockdep warning - Remove support for "noexec" kernel command line option since there is no use case where it would make sense - Simplify kernel mapping setup and get rid of quite a bit of code - Add architecture specific __set_memory_yy() functions which allow to modify kernel mappings. Unlike the set_memory_xx() variants they take void pointer start and end parameters, which allows to use them without the usual casts, and also to use them on areas larger than 8TB. Note that the set_memory_xx() family comes with an int num_pages parameter which overflows with 8TB. This could be addressed by changing the num_pages parameter to unsigned long, however requires to change all architectures, since the module code expects an int parameter (see module_set_memory()). This was indeed an issue since for debug_pagealloc() we call set_memory_4k() on the whole identity mapping. Therefore address this for now with the __set_memory_yy() variant, and address common code later - Use dev_set_name() and also fix memory leak in zcrypt driver error handling - Remove unused lsi_mask from airq_struct - Add warning for invalid kernel mapping requests -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmT5sYkACgkQIg7DeRsp bsJQHg//SFJOcVr2uViorvmt+IyiW78H75y8+AC6qoxZ3fuJeuyW2zz/resRsFRO UDCjkV2QYT885ADFUMcunf7LQnJYpSFVjAB6WEhgFr+bFdMWtX1crjHsMHkpcobR Fo4IFurG5NOMIdunPwKiO/dLU8pNGTm1T430uyLZH3ApVrmJ8stgOczusqTO/h3V tjbC5HAm9BBwNyOd2lnZSi8pbUMcfGE2DkRYhFQup1N9NT0BC4S6B8vTsD7NVYVZ WbJzqSl3CnTUJxZFX287YDW6cSaPCLAUfCl7r5FJZUoGvUtOVLlqcOjIIm/72epv YygMI8RdM0KHuzOd8XTNDbBnyYWTmWnZypCQVgNezY1GiO5nfTCkwYLZZagcc0Lo 6fx1B7xBbNao9/pURa6wJ9dXsx+NmD52DjM806hP1H+qoL/YP8dLqV6Qh2C1PFzI VncrnG2mnxjwry5WoOyl19SKqXDgmDylnFuHQHKA9mdCaqZWNkqovOo/6qE6OdWI Q3TinPwTug9vmgwBuQgua/2R1owy7G9mJwwDW9AtZXVzlMG6k06deKIYrZjLxf6Z hOtAu+Xy5IeoXOaaOFU92RsRKX481OTG77rWLREC+orAT1dPrfnguPLlQ7KeFbkR foduTTHcDPOLiIVL1RbWQC9AAw/i7IjN04Jp6KFlQTZqLATyPbM= =+iSZ -----END PGP SIGNATURE----- Merge tag 's390-6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: - A couple of virtual vs physical address confusion fixes - Rework locking in dcssblk driver to address a lockdep warning - Remove support for "noexec" kernel command line option since there is no use case where it would make sense - Simplify kernel mapping setup and get rid of quite a bit of code - Add architecture specific __set_memory_yy() functions which allow us to modify kernel mappings. Unlike the set_memory_xx() variants they take void pointer start and end parameters, which allows using them without the usual casts, and also to use them on areas larger than 8TB. Note that the set_memory_xx() family comes with an int num_pages parameter which overflows with 8TB. This could be addressed by changing the num_pages parameter to unsigned long, however requires to change all architectures, since the module code expects an int parameter (see module_set_memory()). This was indeed an issue since for debug_pagealloc() we call set_memory_4k() on the whole identity mapping. Therefore address this for now with the __set_memory_yy() variant, and address common code later - Use dev_set_name() and also fix memory leak in zcrypt driver error handling - Remove unused lsi_mask from airq_struct - Add warning for invalid kernel mapping requests * tag 's390-6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/vmem: do not silently ignore mapping limit s390/zcrypt: utilize dev_set_name() ability to use a formatted string s390/zcrypt: don't leak memory if dev_set_name() fails s390/mm: fix MAX_DMA_ADDRESS physical vs virtual confusion s390/airq: remove lsi_mask from airq_struct s390/mm: use __set_memory() variants where useful s390/set_memory: add __set_memory() variant s390/set_memory: generate all set_memory() functions s390/mm: improve description of mapping permissions of prefix pages s390/amode31: change type of __samode31, __eamode31, etc s390/mm: simplify kernel mapping setup s390: remove "noexec" option s390/vmem: fix virtual vs physical address confusion s390/dcssblk: fix lockdep warning s390/monreader: fix virtual vs physical address confusion	2023-09-07 10:52:13 -07:00
Linus Torvalds	ac2224a467	just cleanups and fixes -----BEGIN PGP SIGNATURE----- iQJOBAABCAA4FiEEbt46xwy6kEcDOXoUeZbBVTGwZHAFAmT42rEaHHRzYm9nZW5k QGFscGhhLmZyYW5rZW4uZGUACgkQeZbBVTGwZHCh6A/9H6hXrmKx1upxzgYiDAwU BiS+eEKWPTUCTyFT5Qs02GiEtDpAVPBoPIaPpcVub9nyvvUEJrUdS7QccRCiZ4se JJBwieKcoLX5v2bGqXsFp5Bjgldm53TS7g/SP5291V8tU5KANnTZuIFibvTnzA1y o3A5yky9FcauJ0hfLpKR2y7bnhD4XZNHRqqkiYylxtMer/+Ymqsu+V92N8aACM/x cPwp72ELyDg+keVMrIOOdQdHti54ZUcfB8lnmmkpm0EOo21pxQrCwVQJpQsnJbVd o1K+qu1DPT2E/PQI6YiroOClyKjnwa8GoVFBr2VAlbDrPWHJlk0iSL66m/KbvrPK EfoPgL59pUUWZ0HQ4iCq9AFrpFg8n7kqfwlKnvyDz39RnRrCA28tYBaNkg+BiUi2 NoDsvLgIC72E420X2PJisU48X2wxITuUt5CBtEcxA5Ry0lWeEZk0fqdYMNDgkYuD /LjEGxW/NyhhM5D8OZIc5beSf0mRwALMQuY90FkfQacJorr1mWQbVLxI6yPrhJpl EizxfOnC440p5A9IaSq6TGnUHhftZpOT70lZw3+SA2IuDN9y1IhaPAYl63RdSIHw 9LuIwFjbghrkXd1189p2li1Wy3DLBv2SbuhoJoNYtCiu8CPBj3Vuzw5mAoqJWje8 rePhw/NWMnI2OCRMVK4JYnc= =db73 -----END PGP SIGNATURE----- Merge tag 'mips_6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS updates from Thomas Bogendoerfer: "Just cleanups and fixes" * tag 'mips_6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: TXx9: Do PCI error checks on own line arch/mips/configs/*_defconfig cleanup MIPS: VDSO: Conditionally export __vdso_gettimeofday() Mips: loongson3_defconfig: Enable ast drm driver by default mips: remove <asm/export.h> mips: replace #include <asm/export.h> with #include <linux/export.h> mips: remove unneeded #include <asm/export.h> MIPS: Loongson64: Fix more __iomem attributes MIPS: loongson32: Remove regs-rtc.h MIPS: loongson32: Remove regs-clk.h MIPS: More explicit DT include clean-ups MIPS: Fixup explicit DT include clean-up Revert MIPS: Loongson: Fix build error when make modules_install MIPS: Only fiddle with CHECKFLAGS if `need-compiler' MIPS: Fix CONFIG_CPU_DADDI_WORKAROUNDS `modules_install' regression MIPS: Explicitly include correct DT includes	2023-09-07 10:35:14 -07:00
Linus Torvalds	dd1386dd3c	Xtensa updates for v6.6 - enable MTD XIP support - fix base address of the xtensa perf module in newer hardware -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEK2eFS5jlMn3N6xfYUfnMkfg/oEQFAmT3iNQTHGpjbXZia2Jj QGdtYWlsLmNvbQAKCRBR+cyR+D+gROrfD/4zCnLRQ36L6QY2+F7xZWZiPAIbZO01 0fFKUm2C+crWNGO5ccYmBRo0DYq9qHzS1P3r/617NmcuA+MN2+2avr9u6oq3j4O5 MR3xnnj3wMo9BS4VlFxf3JimI6S+DAFrYqF+smwES+9yqX3tlg23qOTyxINvAPIX D7Cs/8PlQ7mV/d42wspQvnPucBlgW9T3UxoT1nbxIUcTBYE6TLMchkVsRigQHHS0 Mr+T1kEmGKNSSz7F8oLNYtwqKoVf+WyjMePSkFJTu8Uki7xAHPfZ/5iZGDqGRKS9 yTgbmSp/bYfOmO7DnM3Dxx7mrYV8v7HOK5OPN1X0W/r5S8EC7dcCVTAyalds5Zhp edpdiQ9izP0gTHZTsXuBWh6sTIOSiSSDveuSyeT06YsPtZNanG0il41jZJAy5GVo 2fA88tE/GVBTiMQe5AfC4b3AIRCGKca6DMuJHdPgwfB0JBpItE7V4rdD+r9qPHym O5GVBgWhYCtXHT/qe2tbFAXY7y98uFLQBNltuEV/Ky3l47vw7qCP6aI+csSAVNp1 oUNqQo8YAgI+i5gMUpqCZZivrH0iQDQX1pGS06Vftj1wkFppyVRrPHBiseG3vNng SnpNwHZiornub239hPy52kr/EgOG5TV89htx1Iu6Y+s2UaR1e/HfIcsow0SjqysS tAj8M397lzlHtA== =GV6O -----END PGP SIGNATURE----- Merge tag 'xtensa-20230905' of https://github.com/jcmvbkbc/linux-xtensa Pull xtensa updates from Max Filippov: - enable MTD XIP support - fix base address of the xtensa perf module in newer hardware * tag 'xtensa-20230905' of https://github.com/jcmvbkbc/linux-xtensa: xtensa: add XIP-aware MTD support xtensa: PMU: fix base address for the newer hardware	2023-09-07 10:30:17 -07:00

1 2 3 4 5 ...

1214666 Commits