41568 Commits

Author SHA1 Message Date
Antoine Tenart
2845db7bed libbpf: Skip base btf sanity checks
[ Upstream commit c73a9683cb21012b6c0f14217974837151c527a8 ]

When upgrading to libbpf 1.3 we noticed a big performance hit while
loading programs using CORE on non base-BTF symbols. This was tracked
down to the new BTF sanity check logic. The issue is the base BTF
definitions are checked first for the base BTF and then again for every
module BTF.

Loading 5 dummy programs (using libbpf-rs) that are using CORE on a
non-base BTF symbol on my system:
- Before this fix: 3s.
- With this fix: 0.1s.

Fix this by only checking the types starting at the BTF start id. This
should ensure the base BTF is still checked as expected but only once
(btf->start_id == 1 when creating the base BTF), and then only
additional types are checked for each module BTF.

Fixes: 3903802bb99a ("libbpf: Add basic BTF sanity validation")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20240624090908.171231-1-atenart@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03 08:59:40 +02:00
Donglin Peng
aeb79296c6 libbpf: Checking the btf_type kind when fixing variable offsets
[ Upstream commit cc5083d1f3881624ad2de1f3cbb3a07e152cb254 ]

I encountered an issue when building the test_progs from the repository [1]:

  $ pwd
  /work/Qemu/x86_64/linux-6.10-rc2/tools/testing/selftests/bpf/

  $ make test_progs V=1
  [...]
  ./tools/sbin/bpftool gen object ./ip_check_defrag.bpf.linked2.o ./ip_check_defrag.bpf.linked1.o
  libbpf: failed to find symbol for variable 'bpf_dynptr_slice' in section '.ksyms'
  Error: failed to link './ip_check_defrag.bpf.linked1.o': No such file or directory (2)
  [...]

Upon investigation, I discovered that the btf_types referenced in the '.ksyms'
section had a kind of BTF_KIND_FUNC instead of BTF_KIND_VAR:

  $ bpftool btf dump file ./ip_check_defrag.bpf.linked1.o
  [...]
  [2] DATASEC '.ksyms' size=0 vlen=2
        type_id=16 offset=0 size=0 (FUNC 'bpf_dynptr_from_skb')
        type_id=17 offset=0 size=0 (FUNC 'bpf_dynptr_slice')
  [...]
  [16] FUNC 'bpf_dynptr_from_skb' type_id=82 linkage=extern
  [17] FUNC 'bpf_dynptr_slice' type_id=85 linkage=extern
  [...]

For a detailed analysis, please refer to [2]. We can add a kind checking to
fix the issue.

  [1] https://github.com/eddyz87/bpf/tree/binsort-btf-dedup
  [2] https://lore.kernel.org/all/0c0ef20c-c05e-4db9-bad7-2cbc0d6dfae7@oracle.com/

Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240619122355.426405-1-dolinux.peng@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03 08:59:39 +02:00
Jiri Olsa
fe60c69183 bpf: Change bpf_session_cookie return value to __u64 *
[ Upstream commit 717d6313bba1b3179f0bf1026aaec6b7e26f484e ]

This reverts [1] and changes return value for bpf_session_cookie
in bpf selftests. Having long * might lead to problems on 32-bit
architectures.

Fixes: 2b8dd87332cd ("bpf: Make bpf_session_cookie() kfunc return long *")
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240619081624.1620152-1-jolsa@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03 08:59:39 +02:00
Ido Schimmel
431342f4ba mlxsw: spectrum_acl: Fix ACL scale regression and firmware errors
[ Upstream commit 75d8d7a63065b18df9555dbaab0b42d4c6f20943 ]

ACLs that reside in the algorithmic TCAM (A-TCAM) in Spectrum-2 and
newer ASICs can share the same mask if their masks only differ in up to
8 consecutive bits. For example, consider the following filters:

 # tc filter add dev swp1 ingress pref 1 proto ip flower dst_ip 192.0.2.0/24 action drop
 # tc filter add dev swp1 ingress pref 1 proto ip flower dst_ip 198.51.100.128/25 action drop

The second filter can use the same mask as the first (dst_ip/24) with a
delta of 1 bit.

However, the above only works because the two filters have different
values in the common unmasked part (dst_ip/24). When entries have the
same value in the common unmasked part they create undesired collisions
in the device since many entries now have the same key. This leads to
firmware errors such as [1] and to a reduced scale.

Fix by adjusting the hash table key to only include the value in the
common unmasked part. That is, without including the delta bits. That
way the driver will detect the collision during filter insertion and
spill the filter into the circuit TCAM (C-TCAM).

Add a test case that fails without the fix and adjust existing cases
that check C-TCAM spillage according to the above limitation.

[1]
mlxsw_spectrum2 0000:06:00.0: EMAD reg access failed (tid=3379b18a00003394,reg_id=3027(ptce3),type=write,status=8(resource not available))

Fixes: c22291f7cf45 ("mlxsw: spectrum: acl: Implement delta for ERP")
Reported-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03 08:59:37 +02:00
Geliang Tang
bd334e2f45 selftests/bpf: Check length of recv in test_sockmap
[ Upstream commit de1b5ea789dc28066cc8dc634b6825bd6148f38b ]

The value of recv in msg_loop may be negative, like EWOULDBLOCK, so it's
necessary to check if it is positive before accumulating it to bytes_recvd.

Fixes: 16962b2404ac ("bpf: sockmap, add selftests")
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/5172563f7c7b2a2e953cef02e89fc34664a7b190.1716446893.git.tanggeliang@kylinos.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03 08:59:36 +02:00
Andrii Nakryiko
f6f67fff74 libbpf: keep FD_CLOEXEC flag when dup()'ing FD
[ Upstream commit 531876c80004ecff7bfdbd8ba6c6b48835ef5e22 ]

Make sure to preserve and/or enforce FD_CLOEXEC flag on duped FDs.
Use dup3() with O_CLOEXEC flag for that.

Without this fix libbpf effectively clears FD_CLOEXEC flag on each of BPF
map/prog FD, which is definitely not the right or expected behavior.

Reported-by: Lennart Poettering <lennart@poettering.net>
Fixes: bc308d011ab8 ("libbpf: call dup2() syscall directly")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20240529223239.504241-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03 08:59:36 +02:00
Geliang Tang
eab9ecfa8e selftests/bpf: Fix prog numbers in test_sockmap
[ Upstream commit 6c8d7598dfed759bf1d9d0322b4c2b42eb7252d8 ]

bpf_prog5 and bpf_prog7 are removed from progs/test_sockmap_kern.h in
commit d79a32129b21 ("bpf: Selftests, remove prints from sockmap tests"),
now there are only 9 progs in it, not 11:

	SEC("sk_skb1")
	int bpf_prog1(struct __sk_buff *skb)
	SEC("sk_skb2")
	int bpf_prog2(struct __sk_buff *skb)
	SEC("sk_skb3")
	int bpf_prog3(struct __sk_buff *skb)
	SEC("sockops")
	int bpf_sockmap(struct bpf_sock_ops *skops)
	SEC("sk_msg1")
	int bpf_prog4(struct sk_msg_md *msg)
	SEC("sk_msg2")
	int bpf_prog6(struct sk_msg_md *msg)
	SEC("sk_msg3")
	int bpf_prog8(struct sk_msg_md *msg)
	SEC("sk_msg4")
	int bpf_prog9(struct sk_msg_md *msg)
	SEC("sk_msg5")
	int bpf_prog10(struct sk_msg_md *msg)

This patch updates the array sizes of prog_fd[], prog_attach_type[] and
prog_type[] from 11 to 9 accordingly.

Fixes: d79a32129b21 ("bpf: Selftests, remove prints from sockmap tests")
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/9c10d9f974f07fcb354a43a8eca67acb2fafc587.1715926605.git.tanggeliang@kylinos.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03 08:59:32 +02:00
Ivan Babrou
b04566066c bpftool: Un-const bpf_func_info to fix it for llvm 17 and newer
[ Upstream commit f4aba3471cfb9ccf69b476463f19b4c50fef6b14 ]

LLVM 17 started treating const structs as constants:

* https://github.com/llvm/llvm-project/commit/0b2d5b967d98

Combined with pointer laundering via ptr_to_u64, which takes a const ptr,
but in reality treats the underlying memory as mutable, this makes clang
always pass zero to btf__type_by_id, which breaks full name resolution.

Disassembly before (LLVM 16) and after (LLVM 17):

    -    8b 75 cc                 mov    -0x34(%rbp),%esi
    -    e8 47 8d 02 00           call   3f5b0 <btf__type_by_id>
    +    31 f6                    xor    %esi,%esi
    +    e8 a9 8c 02 00           call   3f510 <btf__type_by_id>

It's a bigger project to fix this properly (and a question whether LLVM
itself should detect this), but for right now let's just fix bpftool.

For more information, see this thread in bpf mailing list:

* https://lore.kernel.org/bpf/CABWYdi0ymezpYsQsPv7qzpx2fWuTkoD1-wG1eT-9x-TSREFrQg@mail.gmail.com/T/

Fixes: b662000aff84 ("bpftool: Adding support for BTF program names")
Signed-off-by: Ivan Babrou <ivan@cloudflare.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20240520225149.5517-1-ivan@cloudflare.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03 08:59:32 +02:00
Josh Poimboeuf
f38d2e9214 x86/syscall: Mark exit[_group] syscall handlers __noreturn
[ Upstream commit 9142be9e6443fd641ca37f820efe00d9cd890eb1 ]

The direct-call syscall dispatch function doesn't know that the exit()
and exit_group() syscall handlers don't return, so the call sites aren't
optimized accordingly.

Fix that by marking the exit syscall declarations __noreturn.

Fixes the following warnings:

  vmlinux.o: warning: objtool: x64_sys_call+0x2804: __x64_sys_exit() is missing a __noreturn annotation
  vmlinux.o: warning: objtool: ia32_sys_call+0x29b6: __ia32_sys_exit_group() is missing a __noreturn annotation

Fixes: 1e3ad78334a6 ("x86/syscall: Don't force use of indirect calls for system calls")
Closes: https://lkml.kernel.org/lkml/6dba9b32-db2c-4e6d-9500-7a08852f17a3@paulmck-laptop
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/5d8882bc077d8eadcc7fd1740b56dfb781f12288.1719381528.git.jpoimboe@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03 08:59:12 +02:00
Linus Torvalds
51df8e0cba Including fixes from bpf and netfilter.
Current release - regressions:
 
   - core: fix rc7's __skb_datagram_iter() regression
 
 Current release - new code bugs:
 
   - eth: bnxt: fix crashes when reducing ring count with active RSS contexts
 
 Previous releases - regressions:
 
   - sched: fix UAF when resolving a clash
 
   - skmsg: skip zero length skb in sk_msg_recvmsg2
 
   - sunrpc: fix kernel free on connection failure in xs_tcp_setup_socket
 
   - tcp: avoid too many retransmit packets
 
   - tcp: fix incorrect undo caused by DSACK of TLP retransmit
 
   - udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port().
 
   - eth: ks8851: fix deadlock with the SPI chip variant
 
   - eth: i40e: fix XDP program unloading while removing the driver
 
 Previous releases - always broken:
 
   - bpf:
     - fix too early release of tcx_entry
     - fail bpf_timer_cancel when callback is being cancelled
     - bpf: fix order of args in call to bpf_map_kvcalloc
 
   - netfilter: nf_tables: prefer nft_chain_validate
 
   - ppp: reject claimed-as-LCP but actually malformed packets
 
   - wireguard: avoid unaligned 64-bit memory accesses
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmaP4GYSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkzUgQAKroA17iMt2rD8h385hL8T9r483CGsR+
 MX6SuWn6T8v4cuhKbqhkf25pWOD0mKH2i+dmgYon7g9LjLG4DMiZBZAqmBwArbyM
 mITgndWH57MnQQh3pgkDFp0lhzYkeERCVSgcgh2AFTcNoxXbazHkMMIghsBENx3+
 wccTsqtPmT2GWRpw6IrHO6kUs98Gry4O2p6fw3dX3/umD0z8OgnyRoCdVkylCDTM
 2tBl4rWsXw8LvzSzmQ7qX3FSzdS++RJk2iXKWdrglah8cuKohZ98WbUwlt2ObCxz
 fLbaAocPzOijaX2YAsNzKYzWizq0i4IjpgSebNUcI3YFthAog5nNGJv79w+cxBKy
 NpaQA31Hd6K6oFJybkysHAf776RC3ueF48Isp3kag7NQ8Qy2+nfWAM9g1wq4UnOu
 IePFdgojqUbfp3GzOG5yFyqqD8RPppJp7DowSjjfYN8Dxw1y5R090suDyrjFeiiC
 MezV4xu7vZdi/6R8RXVfskR/iczHqDHGuuMikTlkm1LaLty9dfnoIZC5AagFH1rA
 Jkzztkd9MnSGK94G9upIhu+t8F22/wzmrJhJOgl9LFvbXP91uGjZGr75u4cc8Q/G
 jn2uy05T/vzoBSiLRQc0Z2Wp9GYEWRXHKSjLkabrGeakw6tYYgZSAVAlxqrpvqbV
 +fm9Ibpvs9fb
 =qwVA
 -----END PGP SIGNATURE-----

Merge tag 'net-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bpf and netfilter.

  Current release - regressions:

   - core: fix rc7's __skb_datagram_iter() regression

  Current release - new code bugs:

   - eth: bnxt: fix crashes when reducing ring count with active RSS
     contexts

  Previous releases - regressions:

   - sched: fix UAF when resolving a clash

   - skmsg: skip zero length skb in sk_msg_recvmsg2

   - sunrpc: fix kernel free on connection failure in
     xs_tcp_setup_socket

   - tcp: avoid too many retransmit packets

   - tcp: fix incorrect undo caused by DSACK of TLP retransmit

   - udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port().

   - eth: ks8851: fix deadlock with the SPI chip variant

   - eth: i40e: fix XDP program unloading while removing the driver

  Previous releases - always broken:

   - bpf:
       - fix too early release of tcx_entry
       - fail bpf_timer_cancel when callback is being cancelled
       - bpf: fix order of args in call to bpf_map_kvcalloc

   - netfilter: nf_tables: prefer nft_chain_validate

   - ppp: reject claimed-as-LCP but actually malformed packets

   - wireguard: avoid unaligned 64-bit memory accesses"

* tag 'net-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits)
  net, sunrpc: Remap EPERM in case of connection failure in xs_tcp_setup_socket
  net/sched: Fix UAF when resolving a clash
  net: ks8851: Fix potential TX stall after interface reopen
  udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port().
  netfilter: nf_tables: prefer nft_chain_validate
  netfilter: nfnetlink_queue: drop bogus WARN_ON
  ethtool: netlink: do not return SQI value if link is down
  ppp: reject claimed-as-LCP but actually malformed packets
  selftests/bpf: Add timer lockup selftest
  net: ethernet: mtk-star-emac: set mac_managed_pm when probing
  e1000e: fix force smbus during suspend flow
  tcp: avoid too many retransmit packets
  bpf: Defer work in bpf_timer_cancel_and_free
  bpf: Fail bpf_timer_cancel when callback is being cancelled
  bpf: fix order of args in call to bpf_map_kvcalloc
  net: ethernet: lantiq_etop: fix double free in detach
  i40e: Fix XDP program unloading while removing the driver
  net: fix rc7's __skb_datagram_iter()
  net: ks8851: Fix deadlock with the SPI chip variant
  octeontx2-af: Fix incorrect value output on error path in rvu_check_rsrc_availability()
  ...
2024-07-11 09:29:49 -07:00
Kumar Kartikeya Dwivedi
50bd5a0c65 selftests/bpf: Add timer lockup selftest
Add a selftest that tries to trigger a situation where two timer callbacks
are attempting to cancel each other's timer. By running them continuously,
we hit a condition where both run in parallel and cancel each other.

Without the fix in the previous patch, this would cause a lockup as
hrtimer_cancel on either side will wait for forward progress from the
callback.

Ensure that this situation leads to a EDEADLK error.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240711052709.2148616-1-memxor@gmail.com
2024-07-11 10:18:31 +02:00
Linus Torvalds
920bc844ba linux_kselftest-fixes-6.10
This kselftest fixes update for Linux 6.10 consists of fixes to clang
 build failures to timerns, vDSO tests and fixes to vDSO makefile.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmaMaSgACgkQCwJExA0N
 Qxz+xw/8DG3OciD30wrXyPbH7Lw35AJXH3IsIjirQET6hoE+saxYOWbVN9POqqVy
 VTh21ZwJpuSrh8gIYHUEZPLezTwIYWN7aZm2Zps7VttlfXjNvbWiLBB4ptAL/XWu
 SogHFeE1u1KHk8JZY3v3j//hQxL0FqnUbqRjv5nnOUS1krgL4shP6JsdUU65Bs9x
 TLSCJrJSCSpG/u7KAXSHlYy0kn9fnL+F2LUqTFf+kzOOdLZ+XaxHS/02GsZYgcVI
 SUvL6x4NEqVMyxcnvL4QBs91SD1/q80vf7g0+gKHkcuHKluto/Zmnwhw40oN92lr
 T6muSS2jW+OemZzglJdD4aIbCEisVtwPsPkdtux9JZV9VAH2lyYz0+G0J2fX7r11
 LOcd4Y7HhoYA5UL6s6puE8xQEZOUrBNMY4exfeOkW/UaJhscewtyTMQsNRs8qW+4
 lEoHFJSsVQtfuZSxUaiXm49loVxu8JueynG6dafRue8tf9mCWpOzl01fVpkoLL/1
 5lMOau3DZallsiHKU0COg6eJhAi6QQjC2nYNMJHwO3DFCKpwneMYbbU9xqS5MZ5Y
 wVijpgyFdIMk5qxHDdVEmevFNyYG3xGYKq/sReDuwb4qJkdx7rDS5mMTkVxyHdCe
 ezHxw6tuiLohHXDHVCR/KxQwjiHkXZF2uudzTFDt6Lxeu68PAFk=
 =c2zt
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan
 "Fixes to clang build failures to timerns, vDSO tests and fixes to vDSO
  makefile"

* tag 'linux_kselftest-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/vDSO: remove duplicate compiler invocations from Makefile
  selftests/vDSO: remove partially duplicated "all:" target in Makefile
  selftests/vDSO: fix clang build errors and warnings
  selftest/timerns: fix clang build failures for abs() calls
2024-07-09 08:11:39 -07:00
Linus Torvalds
4376e966ec perf tools: Fix the performance issue for v6.10
These addresses the performance issues reported by Matt, Namhyung and
 Linus.  Recently it changed processing comm string and DSO with sorted
 arrays but it required to sort the array whenever it adds a new entry.
 This caused a performance issue and fix is to enhance the sorting by
 finding the insertion point in the sorted array and to shift righthand
 side using memmove().
 
 Signed-off-by: Namhyung Kim <namhyung@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZov/wgAKCRCMstVUGiXM
 g3olAQCFzp/BnopE7VgUSK5j0EOnMjSsvkQGkocqCVN1Km3y8AEAlV3EKb1rUN8s
 SQ+QcEx7F4V38s+Aoo2SqU1yAwYsXAc=
 =Ao/v
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v6.10-2024-07-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools fixes from Namhyung Kim:
 "Fix performance issue for v6.10

  These address the performance issues reported by Matt, Namhyung and
  Linus. Recently perf changed the processing of the comm string and DSO
  using sorted arrays but this caused it to sort the array whenever
  adding a new entry.

  This caused a performance issue and the fix is to enhance the sorting
  by finding the insertion point in the sorted array and to shift
  righthand side using memmove()"

* tag 'perf-tools-fixes-for-v6.10-2024-07-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  perf dsos: When adding a dso into sorted dsos maintain the sort order
  perf comm str: Avoid sort during insert
2024-07-08 14:08:43 -07:00
Daniel Borkmann
5f1d18de79 selftests/bpf: Extend tcx tests to cover late tcx_entry release
Add a test case which replaces an active ingress qdisc while keeping the
miniq in-tact during the transition period to the new clsact qdisc.

  # ./vmtest.sh -- ./test_progs -t tc_link
  [...]
  ./test_progs -t tc_link
  [    3.412871] bpf_testmod: loading out-of-tree module taints kernel.
  [    3.413343] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  #332     tc_links_after:OK
  #333     tc_links_append:OK
  #334     tc_links_basic:OK
  #335     tc_links_before:OK
  #336     tc_links_chain_classic:OK
  #337     tc_links_chain_mixed:OK
  #338     tc_links_dev_chain0:OK
  #339     tc_links_dev_cleanup:OK
  #340     tc_links_dev_mixed:OK
  #341     tc_links_ingress:OK
  #342     tc_links_invalid:OK
  #343     tc_links_prepend:OK
  #344     tc_links_replace:OK
  #345     tc_links_revision:OK
  Summary: 14/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240708133130.11609-2-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-07-08 14:07:31 -07:00
Ian Rogers
7b2450bb40 perf dsos: When adding a dso into sorted dsos maintain the sort order
dsos__add would add at the end of the dso array possibly requiring a
later find to re-sort the array. Patterns of find then add were
becoming O(n*log n) due to the sorts. Change the add routine to be
O(n) rather than O(1) but to maintain the sorted-ness of the dsos
array so that later finds don't need the O(n*log n) sort.

Fixes: 3f4ac23a9908 ("perf dsos: Switch backing storage to array from rbtree/list")
Reported-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Steinar Gunderson <sesse@google.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Matt Fleming <matt@readmodwrite.com>
Link: https://lore.kernel.org/r/20240703172117.810918-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-07 22:26:29 -07:00
Ian Rogers
88076e4699 perf comm str: Avoid sort during insert
The array is sorted, so just move the elements and insert in order.

Fixes: 13ca628716c6 ("perf comm: Add reference count checking to 'struct comm_str'")
Reported-by: Matt Fleming <matt@readmodwrite.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Matt Fleming <matt@readmodwrite.com>
Cc: Steinar Gunderson <sesse@google.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20240703172117.810918-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-07 22:26:27 -07:00
Linus Torvalds
c6653f49e4 powerpc fixes for 6.10 #4
- Fix unnecessary copy to 0 when kernel is booted at address 0.
 
  - Fix usercopy crash when dumping dtl via debugfs.
 
  - Avoid possible crash when PCI hotplug races with error handling.
 
  - Fix kexec crash caused by scv being disabled before other CPUs call-in.
 
  - Fix powerpc selftests build with USERCFLAGS set.
 
 Thanks to: Anjali K, Ganesh Goudar, Gautam Menghani, Jinglin Wen, Nicholas
 Piggin, Sourabh Jain, Srikar Dronamraju, Vishal Chourasia.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmaJyekTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgN6zD/0d8lPrWQ3TRkS+jLdhsDfHc+qMW1/N
 DuxPrVJl4qLgvYPEZWAF5+uWuhJurmbTCXNRnUQ5HHfwPtkU77pbTNiQcCAYsy2l
 W35DYE+vqnNNid9hFCgvLoSrGDA0qvcGpMVBVfqjRygOLxpWztmV7S7q9E0CvuWg
 ESXt4HNyPiRVl4ufPam12lmiEDh+PycsD24U6FSjaTxqvd4kwSTyLDLfmI+gTaqx
 1PdzKt0c3g2QhDBoR7cpRaTCRamKRPwqFHANMUAkIXm3fIdHpWOEF03lvTsA0OgA
 0ktzaEUhCPHr6kjAizbybmgXZovh/eoZc9wUd7zCWdSGNiq8FlhsmFuIuScrbQ7k
 YCYz+X/KoqNk2VbxKkDneO6/H2juzu9wzzK5OMcKsVGSWi7+DjBp9FBDiFCfb3VQ
 ZMuc71dOTtA7fDqWDnYtFMtEwrUGpTixE5xPNBzbzIVkKdSjb1H3RLd/mhu7+X/B
 eVjFOPj7mRburIX5M3UllvsdbOiLqjbg6P28JL3qG6qT/OiiQAmF5apKvf1LNvPV
 xgJHGPemlAkVNihg6Xu8+up+wcPuMi13osjA9FZkLUdLXK4O+d3q/K/Rf7TGjT2X
 rBNhCd3lRd6gmpa52ujm5X0f9czEJxMrfy0Ota3L8YFUb7hW8JK6fgcwSispXGSI
 o/JUlQ30K6JAjA==
 =B1pw
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix unnecessary copy to 0 when kernel is booted at address 0

 - Fix usercopy crash when dumping dtl via debugfs

 - Avoid possible crash when PCI hotplug races with error handling

 - Fix kexec crash caused by scv being disabled before other CPUs
   call-in

 - Fix powerpc selftests build with USERCFLAGS set

Thanks to Anjali K, Ganesh Goudar, Gautam Menghani, Jinglin Wen,
Nicholas Piggin, Sourabh Jain, Srikar Dronamraju, and Vishal Chourasia.

* tag 'powerpc-6.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  selftests/powerpc: Fix build with USERCFLAGS set
  powerpc/pseries: Fix scv instruction crash with kexec
  powerpc/eeh: avoid possible crash when edev->pdev changes
  powerpc/pseries: Whitelist dtl slub object for copying to userspace
  powerpc/64s: Fix unnecessary copy to 0 when kernel is booted at address 0
2024-07-06 18:31:24 -07:00
Michael Ellerman
8b7f59de92 selftests/powerpc: Fix build with USERCFLAGS set
Currently building the powerpc selftests with USERCFLAGS set to anything
causes the build to break:

  $ make -C tools/testing/selftests/powerpc V=1 USERCFLAGS=-Wno-error
  ...
  gcc -Wno-error    cache_shape.c ...
  cache_shape.c:18:10: fatal error: utils.h: No such file or directory
     18 | #include "utils.h"
        |          ^~~~~~~~~
  compilation terminated.

This happens because the USERCFLAGS are added to CFLAGS in lib.mk, which
causes the check of CFLAGS in powerpc/flags.mk to skip setting CFLAGS at
all, resulting in none of the usual CFLAGS being passed. That can
be seen in the output above, the only flag passed to the compiler is
-Wno-error.

Fix it by dropping the conditional setting of CFLAGS in flags.mk.
Instead always set CFLAGS, but also append USERCFLAGS if they are set.

Note that appending to CFLAGS (with +=) wouldn't work, because flags.mk
is included by multiple Makefiles (to support partial builds), causing
CFLAGS to be appended to multiple times. Additionally that would place
the USERCFLAGS prior to the standard CFLAGS, meaning the USERCFLAGS
couldn't override the standard flags. Being able to override the
standard flags is desirable, for example for adding -Wno-error.

With the fix in place, the CFLAGS are set correctly, including the
USERCFLAGS:

  $ make -C tools/testing/selftests/powerpc V=1 USERCFLAGS=-Wno-error
  ...
  gcc -std=gnu99 -O2 -Wall -Werror -DGIT_VERSION='"v6.10-rc2-7-gdea17e7e56c3"'
  -I/home/michael/linux/tools/testing/selftests/powerpc/include -Wno-error
  cache_shape.c ...

Fixes: 5553a79387e9 ("selftests/powerpc: Add flags.mk to support pmu buildable")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240706120833.909853-1-mpe@ellerman.id.au
2024-07-06 22:10:14 +10:00
Jason A. Donenfeld
2cb489eb8d wireguard: selftests: use acpi=off instead of -no-acpi for recent QEMU
QEMU 9.0 removed -no-acpi, in favor of machine properties, so update the
Makefile to use the correct QEMU invocation.

Cc: stable@vger.kernel.org
Fixes: b83fdcd9fb8a ("wireguard: selftests: use microvm on x86")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://patch.msgid.link/20240704154517.1572127-2-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-05 17:21:10 -07:00
John Hubbard
66cde337fa selftests/vDSO: remove duplicate compiler invocations from Makefile
The Makefile open-codes compiler invocations that ../lib.mk already
provides.

Avoid this by using a Make feature that allows setting per-target
variables, which in this case are: CFLAGS and LDFLAGS. This approach
generates the exact same compiler invocations as before, but removes all
of the code duplication, along with the quirky mangled variable names.
So now the Makefile is smaller, less unusual, and easier to read.

The new dependencies are listed after including lib.mk, in order to
let lib.mk provide the first target ("all:"), and are grouped together
with their respective source file dependencies, for visual clarity.

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-07-05 14:12:34 -06:00
John Hubbard
bb2a605de3 selftests/vDSO: remove partially duplicated "all:" target in Makefile
There were a couple of errors here:

1. TEST_GEN_PROGS was incorrectly prepending $(OUTPUT) to each program
to be built. However, lib.mk already does that because it assumes "bare"
program names are passed in, so this ended up creating
$(OUTPUT)/$(OUTPUT)/file.c, which of course won't work as intended.

2. lib.mk was included before TEST_GEN_PROGS was set, which led to
lib.mk's "all:" target not seeing anything to rebuild.

So nothing worked, which caused the author to force things by creating
an "all:" target locally--while still including ../lib.mk.

Fix all of this by including ../lib.mk at the right place, and removing
the $(OUTPUT) prefix to the programs to be built, and removing the
duplicate "all:" target.

Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-07-05 14:12:28 -06:00
John Hubbard
73810cd45b selftests/vDSO: fix clang build errors and warnings
When building with clang, via:

    make LLVM=1 -C tools/testing/selftests

...there are several warnings, and an error. This fixes all of those and
allows these tests to run and pass.

1. Fix linker error (undefined reference to memcpy) by providing a local
   version of memcpy.

2. clang complains about using this form:

    if (g = h & 0xf0000000)

...so factor out the assignment into a separate step.

3. The code is passing a signed const char* to elf_hash(), which expects
   a const unsigned char *. There are several callers, so fix this at
   the source by allowing the function to accept a signed argument, and
   then converting to unsigned operations, once inside the function.

4. clang doesn't have __attribute__((externally_visible)) and generates
   a warning to that effect. Fortunately, gcc 12 and gcc 13 do not seem
   to require that attribute in order to build, run and pass tests here,
   so remove it.

Reviewed-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Edward Liaw <edliaw@google.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-07-05 14:12:23 -06:00
Linus Torvalds
b673f2bda0 RISC-V Fixes for 6.10-rc7
* A fix for the CMODX example in therecently added icache flushing
   prctl().
 * A fix to the perf driver to avoid corrupting event data on counter
   overflows when external overflow handlers are in use.
 * A fix to clear all hardware performance monitor events on boot, to
   avoid dangling events firmware or previously booted kernels from
   triggering spuriously.
 * A fix to the perf event probing logic to avoid erroneously reporting
   the presence of unimplemented counters.  This also prevents some
   implemented counters from being reported.
 * A build fix for the vector sigreturn selftest on clang.
 * A fix to ftrace, which now requires the previously optional index
   argument to ftrace_graph_ret_addr().
 * A fix to avoid deadlocking if kexec crash handling triggers in an
   interrupt context.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmaIJBATHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiYk6D/981DUAWJ5JPsqve7PihWnhFXh7T/fm
 KZL7cNQN7/9QmqzJMD756oQCHZT2TeDTxwji4WUQo27uoS1SamsAxRWCPdW8GqDt
 GwBJeviyWDwjNMgrejWwgH3d9so+WZ4kNKfiUrY+j1vgQ8TkE4h5wMzUtOBTSgDI
 5EhHT5B5yjiRcadPshXivZAyimc6mxKJKph5v8W3BGgtLQRHs5tYop4ZkP5Utmv3
 yBie7orfMRx5fNxE6fgn0c/3r49i+KGTSCzkK+0689qPlQNt7MTj4kqDVp7xu2ll
 jl5GJNZrWSZR0cST9AG3VByqfeN2f9sbGYq5fAozkZy3idEYovtvGIU2xJVZRuIU
 ZhY+VTk0fwO8HlilTLMbyk7t99EJ4a7bXcUuD6ub3BthlKfc41PArhZgasL/dFPd
 VOSjy5hfGpJgmifSTpPXElf8jgBq6N4Kw9N+rBNkNiruEiwtWfsyqOckYAfNbULe
 Z8Nikl+3pfWlwzQrAb30X78s4ZyJyOX+XxP118lvx+UAbZofxg5qJJGo7U0Ru54r
 JPBCW8swlco6AXwvAj3yKcaL3qtKlc6f068QvcSaRELUvS2qfuJ7w4fjKdl/IT93
 QggGUyuEVG3UC1Dj961plrACXmqISTAlW8HqkdPvUgLY9rSPuTLuCR54b3fGI+n/
 3wJF6gl5leEPMw==
 =/Gsf
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix for the CMODX example in the recently added icache flushing
   prctl()

 - A fix to the perf driver to avoid corrupting event data on counter
   overflows when external overflow handlers are in use

 - A fix to clear all hardware performance monitor events on boot, to
   avoid dangling events firmware or previously booted kernels from
   triggering spuriously

 - A fix to the perf event probing logic to avoid erroneously reporting
   the presence of unimplemented counters. This also prevents some
   implemented counters from being reported

 - A build fix for the vector sigreturn selftest on clang

 - A fix to ftrace, which now requires the previously optional index
   argument to ftrace_graph_ret_addr()

 - A fix to avoid deadlocking if kexec crash handling triggers in an
   interrupt context

* tag 'riscv-for-linus-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: kexec: Avoid deadlock in kexec crash path
  riscv: stacktrace: fix usage of ftrace_graph_ret_addr()
  riscv: selftests: Fix vsetivli args for clang
  perf: RISC-V: Check standard event availability
  drivers/perf: riscv: Reset the counter to hpmevent mapping while starting cpus
  drivers/perf: riscv: Do not update the event data if uptodate
  documentation: Fix riscv cmodx example
2024-07-05 12:22:51 -07:00
John Hubbard
f76f9bc616 selftest/timerns: fix clang build failures for abs() calls
When building with clang, via:

    make LLVM=1 -C tools/testing/selftests

...clang warns about mismatches between the expected and required
integer length being supplied to abs(3).

Fix this by using the correct variant of abs(3): labs(3) or llabs(3), in
these cases.

Reviewed-by: Dmitry Safonov <dima@arista.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Andrei Vagin <avagin@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-07-05 13:21:48 -06:00
Linus Torvalds
033771c085 Including fixes from bluetooth, wireless and netfilter.
There's one fix for power management with Intel's e1000e here,
 Thorsten tells us there's another problem that started in v6.9.
 We're trying to wrap that up but I don't think it's blocking.
 
 Current release - new code bugs:
 
  - wifi: mac80211: disable softirqs for queued frame handling
 
  - af_unix: fix uninit-value in __unix_walk_scc(), with the new garbage
    collection algo
 
 Previous releases - regressions:
 
  - Bluetooth:
    - qca: fix BT enable failure for QCA6390 after warm reboot
    - add quirk to ignore reserved PHY bits in LE Extended Adv Report,
      abused by some Broadcom controllers found on Apple machines
 
  - wifi: wilc1000: fix ies_len type in connect path
 
 Previous releases - always broken:
 
  - tcp: fix DSACK undo in fast recovery to call tcp_try_to_open(),
    avoid premature timeouts
 
  - net: make sure skb_datagram_iter maps fragments page by page,
    in case we somehow get compound highmem mixed in
 
  - eth: bnx2x: fix multiple UBSAN array-index-out-of-bounds when
    more queues are used
 
 Misc:
 
  - MAINTAINERS: Remembering Larry Finger
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmaGv3sACgkQMUZtbf5S
 IrviRxAAr7yvDg07P1UzFsXI1khOijGCxOYcDDV+AHVZKwJ9fAq0ky5pcaiW62mY
 h3HffvEbNgZ07zd9l4Z9dVoel5ke9k4yofYZrI0D8X/T1e/Xo0LlxUFGwb0IidBj
 IYkTnZfu1lGa73TWCIh369s1HgybupiHQicYSw+KIO1wtfds8gvyZJUyjNhlvUYQ
 NdB/JQrBp/oxm2JlAMDubZfNuVEFCum5J3Ldj5W32j+H82RbGDi/eMn5w+Cs/tEx
 rRFSuJ1L0rBhNcB4HDbcfin9jHjLhDjNXyYprlZAauMXK5AEBwRcOuEzyXWt1Npq
 ZkJ8t/ToVLk9QkXaKA1gR9C6Bo8A+SL5a8ddfj/pHEqOa/GNXKYqEvGOmM7mmbBf
 93sU+dBYZ3nLGrUtuTRVGTnbr+J1AhP/kUqIY1c787m3gCSB1qFkF67DQiYTGB9g
 qf+xTcmJeGpL+4OtXgjpK4gUa152g0VsuAMTzecW/7EU/owD0+zCWuVGK9Gv/bgf
 si40hgZ7Ipnq8k+N+4e2VQp1ufCduT8zGn6sxiivdS5GSNc8e2BnQH3AfjfIM8Z8
 rK15U5WJIVQiCkthYh8cx8pxh2uwtcXevjUh4B682/U4HbLdiYfAQuD4/AOc2i8M
 EJVzl7/5AaxhjoZPxloe9mtRnMvt7XhUiNOW0lR9fgDYOqcmnSo=
 =MZAG
 -----END PGP SIGNATURE-----

Merge tag 'net-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth, wireless and netfilter.

  There's one fix for power management with Intel's e1000e here,
  Thorsten tells us there's another problem that started in v6.9. We're
  trying to wrap that up but I don't think it's blocking.

  Current release - new code bugs:

   - wifi: mac80211: disable softirqs for queued frame handling

   - af_unix: fix uninit-value in __unix_walk_scc(), with the new
     garbage collection algo

  Previous releases - regressions:

   - Bluetooth:
      - qca: fix BT enable failure for QCA6390 after warm reboot
      - add quirk to ignore reserved PHY bits in LE Extended Adv Report,
        abused by some Broadcom controllers found on Apple machines

   - wifi: wilc1000: fix ies_len type in connect path

  Previous releases - always broken:

   - tcp: fix DSACK undo in fast recovery to call tcp_try_to_open(),
     avoid premature timeouts

   - net: make sure skb_datagram_iter maps fragments page by page, in
     case we somehow get compound highmem mixed in

   - eth: bnx2x: fix multiple UBSAN array-index-out-of-bounds when more
     queues are used

  Misc:

   - MAINTAINERS: Remembering Larry Finger"

* tag 'net-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (62 commits)
  bnxt_en: Fix the resource check condition for RSS contexts
  mlxsw: core_linecards: Fix double memory deallocation in case of invalid INI file
  inet_diag: Initialize pad field in struct inet_diag_req_v2
  tcp: Don't flag tcp_sk(sk)->rx_opt.saw_unknown for TCP AO.
  selftests: make order checking verbose in msg_zerocopy selftest
  selftests: fix OOM in msg_zerocopy selftest
  ice: use proper macro for testing bit
  ice: Reject pin requests with unsupported flags
  ice: Don't process extts if PTP is disabled
  ice: Fix improper extts handling
  selftest: af_unix: Add test case for backtrack after finalising SCC.
  af_unix: Fix uninit-value in __unix_walk_scc()
  bonding: Fix out-of-bounds read in bond_option_arp_ip_targets_set()
  net: rswitch: Avoid use-after-free in rswitch_poll()
  netfilter: nf_tables: unconditionally flush pending work before notifier
  wifi: iwlwifi: mvm: check vif for NULL/ERR_PTR before dereference
  wifi: iwlwifi: mvm: avoid link lookup in statistics
  wifi: iwlwifi: mvm: don't wake up rx_sync_waitq upon RFKILL
  wifi: iwlwifi: properly set WIPHY_FLAG_SUPPORTS_EXT_KEK_KCK
  wifi: wilc1000: fix ies_len type in connect path
  ...
2024-07-04 10:11:12 -07:00
Linus Torvalds
4d85acef10 Fix Kselftests timeout and race condition
-----BEGIN PGP SIGNATURE-----
 
 iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZoaS3BAcbWljQGRpZ2lr
 b2QubmV0AAoJEOXj0OiMgvbSCq4A/0yPPeNTsq0mio4mq1cvCCq5C5HcDOEjqAc/
 80qeUvb/AQCaV4dETTRSgHWQw3PNdsFBkZrqjWByakaLep5ZTPqOCQ==
 =0QIq
 -----END PGP SIGNATURE-----

Merge tag 'kselftest-fix-2024-07-04' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull Kselftest fix from Mickaël Salaün:
 "Fix Kselftests timeout.

  We can't use CLONE_VFORK, since that blocks the parent - and thus the
  timeout handling - until the child exits or execve's.

  Go back to using plain fork()"

* tag 'kselftest-fix-2024-07-04' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  selftests/harness: Fix tests timeout and race condition
2024-07-04 09:29:42 -07:00
Zijian Zhang
7d6d8f0c8b selftests: make order checking verbose in msg_zerocopy selftest
We find that when lock debugging is on, notifications may not come in
order. Thus, we have order checking outputs managed by cfg_verbose, to
avoid too many outputs in this case.

Fixes: 07b65c5b31ce ("test: add msg_zerocopy test")
Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com>
Signed-off-by: Xiaochun Lu <xiaochun.lu@bytedance.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240701225349.3395580-3-zijianzhang@bytedance.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-03 19:42:32 -07:00
Zijian Zhang
af2b7e5b74 selftests: fix OOM in msg_zerocopy selftest
In selftests/net/msg_zerocopy.c, it has a while loop keeps calling sendmsg
on a socket with MSG_ZEROCOPY flag, and it will recv the notifications
until the socket is not writable. Typically, it will start the receiving
process after around 30+ sendmsgs. However, as the introduction of commit
dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale"), the sender is
always writable and does not get any chance to run recv notifications.
The selftest always exits with OUT_OF_MEMORY because the memory used by
opt_skb exceeds the net.core.optmem_max. Meanwhile, it could be set to a
different value to trigger OOM on older kernels too.

Thus, we introduce "cfg_notification_limit" to force sender to receive
notifications after some number of sendmsgs.

Fixes: 07b65c5b31ce ("test: add msg_zerocopy test")
Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com>
Signed-off-by: Xiaochun Lu <xiaochun.lu@bytedance.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240701225349.3395580-2-zijianzhang@bytedance.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-03 19:42:32 -07:00
Kuniyuki Iwashima
2a79651bf2 selftest: af_unix: Add test case for backtrack after finalising SCC.
syzkaller reported a KMSAN splat in __unix_walk_scc() while backtracking
edge_stack after finalising SCC.

Let's add a test case exercising the path.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Link: https://patch.msgid.link/20240702160428.10153-2-syoshida@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-03 19:36:22 -07:00
Charlie Jenkins
3582ce0d7c
riscv: selftests: Fix vsetivli args for clang
Clang does not support implicit LMUL in the vset* instruction sequences.
Introduce an explicit LMUL in the vsetivli instruction.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Fixes: 9d5328eeb185 ("riscv: selftests: Add signal handling vector tests")
Link: https://lore.kernel.org/r/20240702-fix_sigreturn_test-v1-1-485f88a80612@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03 13:04:54 -07:00
Linus Torvalds
e9d22f7a66 linux_kselftest-fixes-6.10-rc7
This kselftest fixes update for Linux 6.10-rc7 consists of one single
 patch to fix the non-contiguous CBM resctrl:
 
 - AMD supports non-contiguous CBM but does not report it via CPUID. This
   test should not use CPUID on AMD to detect non-contiguous CBM support.
   Fix the problem so the test uses CPUID to discover non-contiguous CBM
   support only on Intel.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmaEZZkACgkQCwJExA0N
 QxwgbRAAwLYXO30O83wEYW3CpTQ+zuEGB9OUwZpDX1zFatREYXUfwxWWnnZVHoed
 4/ifd4Xn+GcHEDZi1adzAadEAbHWyta8nTzpNvP15wQ1zLCDq33G/ec+aFYiAntK
 zu/AGqwDvV+mgOq0OJ9QuZy/PSaB1WPcRXghxbaeiD0jJCZt8QL/WDjTuq7+n7J1
 Sgbo0W9eoRpVNAgtAf8kFrghggvPAorrTvah28YMRM3yEGc5Vp3XtkURYAhbKBzr
 ZF/04PUoM2GDN3ua1wY63n3eGz5CupP7f/AdCRxwW0YJgKjGQuKmyBSt7AAdsAvy
 kV2eAy0Qb1u6JowJwfvJ/P5/nEgvtqovfMah/yLpr1Y0AIgycgKsmJcy6FaK1KlD
 hV3omXNLlRQGgepViHI7PEQcsYhi8GX3Mi9M8oWu5QgMCsxXocvpUVBu64gmUakK
 Foj3gI4CGPskgO5IT5FsZlv/PbyWNVntn+I7geSpemTxHy1e7CIxMGaDgSM3Wsps
 3/IQvP4RwF0uHzvC5dK85JJe9toxD9zzXUZavyYUmrhSc6dfex0qvlynfhvvTL9S
 +sJe3gdu09PFZCMRFIsWmmeN6EopNEr6cLO46uLnkrHZVi33yhtezuTloXimo2yR
 udMYLsM2nVV5+S2m+0XGb+peXWu48bmxepNxHFfiIXDKWReaid0=
 =Net1
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-fixes-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:
 "One single patch to fix the non-contiguous CBM resctrl:

  - AMD supports non-contiguous CBM but does not report it via CPUID.
    This test should not use CPUID on AMD to detect non-contiguous CBM
    support. Fix the problem so the test uses CPUID to discover
    non-contiguous CBM support only on Intel"

* tag 'linux_kselftest-fixes-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/resctrl: Fix non-contiguous CBM for AMD
2024-07-02 13:53:24 -07:00
Linus Torvalds
73e931504f cxl fixes for v6.10-rc7
- Fix no cxl_nvd during pmem region auto-assemble
 - Avoid NULLL pointer dereference in region lookup
 - Add missing checks to interleave capability
 - Add cxl kdoc fix to address document compilation error
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmaC5f0ACgkQYGjFFmlT
 OEpjug/8DxnDA/hxi45kQHz0iwtMNFPv87qpC5MQ5XEXG0RLY4xka5UI7SKDPuo8
 JZGWCi/wObUU0iAtG9tBji3aMkx9BM0PYj3LIMkQaZccUtS47xgzQius4c4KpPCC
 gGDGEXc1oMVIIBzh7/ZXU6PTd4a3+8c6DZoSIUEyyt1j72R8ef9lans/w4Dl39FI
 c+SVE4GdlnVe5/34CUTe+It8vn8bV/a9YXwjadIuXnOFxsPym2CdeADssj8IZUOS
 pmDU5CdGPJAnL9jT+/NtuY312wrGi7ImxhLtD9/3pJhluqs/OMW4OWcIgDoDAP13
 Ndn/eLoO2zgZtVAoCeMMuEQcRBZGwCcrIbN1CBVJ2HR+n6XlO7ICmABcOoZvhG0b
 wdK6ukNnWLoP0xXRpqYWcTsGfjWTadKqom1hs6jJqMeJcK8HcQNzb8xNzoXdk0QR
 wT8AKYRupwQuAY90mSp4aAlo8aJypAXB6tJzzp05QcgbTd+uif0STiiOzkG9FCAE
 1v1snUjjkrIjUwODzX4Me2xw0AxSZdOvk//5mKB3fSQdXYxQUNfRunAI02qIM+8M
 XPM/QxA+DeJEyD8BTnDo5J5SK5XhoMHaCPrOPYMm5bPYKS0TxHHCY4tMRAQTfhW+
 QVcbkqi+WSAvVibl0OcwYmR64TnGMtCwhQrPFqaX+aWBpvhc+1c=
 =6MV8
 -----END PGP SIGNATURE-----

Merge tag 'cxl-fixes-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl fixes from Dave Jiang:

 - Fix no cxl_nvd during pmem region auto-assemble

 - Avoid NULLL pointer dereference in region lookup

 - Add missing checks to interleave capability

 - Add cxl kdoc fix to address document compilation error

* tag 'cxl-fixes-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl: documentation: add missing files to cxl driver-api
  cxl/region: check interleave capability
  cxl/region: Avoid null pointer dereference in region lookup
  cxl/mem: Fix no cxl_nvd during pmem region auto-assembling
2024-07-01 13:03:30 -07:00
Linus Torvalds
234458d049 Fix three recent minor turbostat regressions.
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEE67dNfPFP+XUaA73mB9BFOha3NhcFAmZ+PUEUHGxlbi5icm93
 bkBpbnRlbC5jb20ACgkQB9BFOha3Nhcq4hAAiGPxoZMkTZLVbipLJFo8AZRvG0Wh
 BGhWJ76L4gfKtYIFnR1nBItPIvrD//5gXvRE48wI2dsylHjDyCPqktEpYivv5ESA
 SA7oqb47FroXXO4b5u0A5lG+Nzbt3SJK28z0t6GWy9NjQz6HHGT09UAP6X35/ZJH
 mc2lwLmDH60f2zdaKIJfAYfhUWdOk8VyctnZVeu8LFYnZf6lP9hD54gqhtd1eYB0
 1upAWeH4h8oYcr153zuRECAuz2kZMjjgy+mskHBMRLGcGTojtKwROBeVtXiSJo+N
 81ds2QsCAG5YtH3Er3eoLCyWTI+pNeraKuZbnVKfgnR943vh+J89CpCA56hSACGT
 9DBA6sjngYE6HKB+NxeAxrBHDC3kE0orsfArpaPsWyWeSYj9lxqjP1raH96w5uwS
 +t0KOgKlfOOxrUyUpfXqor4Z/J7JvpRF2ymvzsDWnhQVTnGIsBhmpyyW+FVn/QYm
 K38HkMK9eippA0t0wShsg2qF79vVMNhlPMwYhuA9kamTKh0jkOmtn8g9LNP3LBup
 qLjq5GT+r2dAO1FPZhk88I0OjhHTjAQkzbCeKM+C0++TVi+HQ9Npcv8hkfebb8rw
 ZP6owRfAho6cniYxJNQSsghUcus51zUs4jVQaKwetQxp3awgMUsuOHu5eUy8V7GL
 eqjfchtqgWHzxK0=
 =if6M
 -----END PGP SIGNATURE-----

Merge tag 'v6.10-rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux

Pull turbostat fixes from Len Brown:
 "Fix three recent minor turbostat regressions"

* tag 'v6.10-rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: Add local build_bug.h header for snapshot target
  tools/power turbostat: Fix unc freq columns not showing with '-q' or '-l'
  tools/power turbostat: option '-n' is ambiguous
2024-06-28 09:04:33 -07:00
Mickaël Salaün
130e428067
selftests/harness: Fix tests timeout and race condition
We cannot use CLONE_VFORK because we also need to wait for the timeout
signal.

Restore tests timeout by using the original fork() call in __run_test()
but also in __TEST_F_IMPL().  Also fix a race condition when waiting for
the test child process.

Because test metadata are shared between test processes, only the
parent process must set the test PID (child).  Otherwise, t->pid may be
set to zero, leading to inconsistent error cases:

  #  RUN           layout1.rule_on_mountpoint ...
  # rule_on_mountpoint: Test ended in some other way [127]
  #            OK  layout1.rule_on_mountpoint
  ok 20 layout1.rule_on_mountpoint

As safeguards, initialize the "status" variable with a valid exit code,
and handle unknown test exits as errors.

The use of fork() introduces a new race condition in landlock/fs_test.c
which seems to be specific to hostfs bind mounts, but I haven't found
the root cause and it's difficult to trigger.  I'll try to fix it with
another patch.

Cc: Christian Brauner <brauner@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Will Drewry <wad@chromium.org>
Cc: stable@vger.kernel.org
Closes: https://lore.kernel.org/r/9341d4db-5e21-418c-bf9e-9ae2da7877e1@sirena.org.uk
Fixes: a86f18903db9 ("selftests/harness: Fix interleaved scheduling leading to race conditions")
Fixes: 24cf65a62266 ("selftests/harness: Share _metadata between forked processes")
Link: https://lore.kernel.org/r/20240621180605.834676-1-mic@digikod.net
Tested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2024-06-28 16:06:03 +02:00
Patryk Wlazlyn
b15943c4b3 tools/power turbostat: Add local build_bug.h header for snapshot target
Fixes compilation errors for Makefile snapshot target described in:
commit 231ce08b662a ("tools/power turbostat: Add "snapshot:" Makefile target")

Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2024-06-27 23:53:27 -04:00
Adam Hawley
c5120a3356 tools/power turbostat: Fix unc freq columns not showing with '-q' or '-l'
Commit 78464d7681f7 ("tools/power turbostat: Add columns for clustered
uncore frequency") introduced 'probe_intel_uncore_frequency_cluster()'
in a way which prevents printing uncore frequency columns if either of
the '-q' or '-l' options are used. Systems which do not have multiple
uncore frequencies per package are unaffected by this regression.

Fix the function so that uncore frequency columns are shown when either
the '-l' or '-q' option is used by checking if 'quiet' is true after
adding counters for the uncore frequency columns.

Fixes: 78464d7681f7 ("tools/power turbostat: Add columns for clustered uncore frequency")

Signed-off-by: Adam Hawley <adam.james.hawley@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2024-06-27 23:53:27 -04:00
David Arcari
ebb5b260af tools/power turbostat: option '-n' is ambiguous
In some cases specifying the '-n' command line argument will cause
turbostat to fail.  For instance 'turbostat -n 1' works fine; however,
'turbostat -n 1 -d' will fail.  This is the result of the first call
to getopt_long_only() where "MP" is specified as the optstring.  This can
be easily fixed by changing the optstring from "MP" to "MPn:" to remove
ambiguity between the arguments.

tools/power turbostat: option '-n' is ambiguous; possibilities: '-num_iterations' '-no-msr' '-no-perf'

Fixes: a0e86c90b83c ("tools/power turbostat: Add --no-perf option")

Signed-off-by: David Arcari <darcari@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2024-06-27 23:53:27 -04:00
Linus Torvalds
fd19d4a492 Including fixes from can, bpf and netfilter.
Current release - regressions:
 
   - core: add softirq safety to netdev_rename_lock
 
   - tcp: fix tcp_rcv_fastopen_synack() to enter TCP_CA_Loss for failed TFO
 
   - batman-adv: fix RCU race at module unload time
 
 Current release - new code bugs:
 
 Previous releases - regressions:
 
   - openvswitch: get related ct labels from its master if it is not confirmed
 
   - eth: bonding: fix incorrect software timestamping report
 
   - eth: mlxsw: fix memory corruptions on spectrum-4 systems
 
   - eth: ionic: use dev_consume_skb_any outside of napi
 
 Previous releases - always broken:
 
   - netfilter: fully validate NFT_DATA_VALUE on store to data registers
 
   - unix: several fixes for OoB data
 
   - tcp: fix race for duplicate reqsk on identical SYN
 
   - bpf:
     - fix may_goto with negative offset.
     - fix the corner case with may_goto and jump to the 1st insn.
     - fix overrunning reservations in ringbuf
 
   - can:
     - j1939: recover socket queue on CAN bus error during BAM transmission
     - mcp251xfd: fix infinite loop when xmit fails
 
   - dsa: microchip: monitor potential faults in half-duplex mode
 
   - eth: vxlan: pull inner IP header in vxlan_xmit_one()
 
   - eth: ionic: fix kernel panic due to multi-buffer handling
 
 Misc:
 
   - selftest: unix tests refactor and a lot of new cases added
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmZ9ZlQSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkawoQAKLTWHswqM790uaAAgqP6jGuC4/waRS8
 MowEt5rHlwdMXcHhLrDSrLQoDJAZRsWmjniIgbsaeX+HtY4HXfF0tfDMPKiws3vx
 Z51qVj7zYjdT7IoZ7Yc8Zlwmt2kVgO4ba6gSigQSORQO9Qq/WNSb0q8BM6cDaYXT
 cXC7ikPeMlLnxKxsFRpZ3CUD06dI/aJFp/pefPEm7/X/EbROlSs5y+2GshPdp5t7
 tzOUsLHs6ORVq/6jg2nRHH+0D+LMuQG0Z0yCMmYerJMJNtRIxyW6tTYeAsWXeyn3
 UN3gaoQ/SIURDrNRZvHsaVDNO/u4rbYtFLoK7S5uPffPWqsGJY59FcH+xYFukFCD
 P5Lca4kKBr8xOahsRfSiO0uFbwQfQAauzNiz9Ue39n1hj+ZhZ/CliBLhUeoBl6Y6
 jSsxq+/8CZCQ7beek96cyLx83skAcWAU5BEC9xOVlOTuTL91Gxr9UzSx/FqLI34h
 Smgw9ZUPzJgvFLgB/OBQ/WYne9LfJ5RYQHZoAXObiozO3TX7NgBUfa0e1T9dLE3F
 TalysSO3/goiZNK5a/UNJcj3fAcSEs4M2z9UIK790i3P3GuRigs1sJEtTUqyowWk
 aaTFmWCXE0wdoshJjux3syh3Vk6phJWpOlMLYjy0v5s0BF/ZOfDaKQT/dGsvV1HE
 AFGpKpybizNV
 =BYgZ
 -----END PGP SIGNATURE-----

Merge tag 'net-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from can, bpf and netfilter.

  There are a bunch of regressions addressed here, but hopefully nothing
  spectacular. We are still waiting the driver fix from Intel, mentioned
  by Jakub in the previous networking pull.

  Current release - regressions:

   - core: add softirq safety to netdev_rename_lock

   - tcp: fix tcp_rcv_fastopen_synack() to enter TCP_CA_Loss for failed
     TFO

   - batman-adv: fix RCU race at module unload time

  Previous releases - regressions:

   - openvswitch: get related ct labels from its master if it is not
     confirmed

   - eth: bonding: fix incorrect software timestamping report

   - eth: mlxsw: fix memory corruptions on spectrum-4 systems

   - eth: ionic: use dev_consume_skb_any outside of napi

  Previous releases - always broken:

   - netfilter: fully validate NFT_DATA_VALUE on store to data registers

   - unix: several fixes for OoB data

   - tcp: fix race for duplicate reqsk on identical SYN

   - bpf:
       - fix may_goto with negative offset
       - fix the corner case with may_goto and jump to the 1st insn
       - fix overrunning reservations in ringbuf

   - can:
       - j1939: recover socket queue on CAN bus error during BAM
         transmission
       - mcp251xfd: fix infinite loop when xmit fails

   - dsa: microchip: monitor potential faults in half-duplex mode

   - eth: vxlan: pull inner IP header in vxlan_xmit_one()

   - eth: ionic: fix kernel panic due to multi-buffer handling

  Misc:

   - selftest: unix tests refactor and a lot of new cases added"

* tag 'net-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits)
  net: mana: Fix possible double free in error handling path
  selftest: af_unix: Check SIOCATMARK after every send()/recv() in msg_oob.c.
  af_unix: Fix wrong ioctl(SIOCATMARK) when consumed OOB skb is at the head.
  selftest: af_unix: Check EPOLLPRI after every send()/recv() in msg_oob.c
  selftest: af_unix: Check SIGURG after every send() in msg_oob.c
  selftest: af_unix: Add SO_OOBINLINE test cases in msg_oob.c
  af_unix: Don't stop recv() at consumed ex-OOB skb.
  selftest: af_unix: Add non-TCP-compliant test cases in msg_oob.c.
  af_unix: Don't stop recv(MSG_DONTWAIT) if consumed OOB skb is at the head.
  af_unix: Stop recv(MSG_PEEK) at consumed OOB skb.
  selftest: af_unix: Add msg_oob.c.
  selftest: af_unix: Remove test_unix_oob.c.
  tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()
  netfilter: nf_tables: fully validate NFT_DATA_VALUE on store to data registers
  net: usb: qmi_wwan: add Telit FN912 compositions
  tcp: fix tcp_rcv_fastopen_synack() to enter TCP_CA_Loss for failed TFO
  ionic: use dev_consume_skb_any outside of napi
  net: dsa: microchip: fix wrong register write when masking interrupt
  Fix race for duplicate reqsk on identical SYN
  ibmvnic: Add tx check to prevent skb leak
  ...
2024-06-27 10:05:35 -07:00
Kuniyuki Iwashima
91b7186c8d selftest: af_unix: Check SIOCATMARK after every send()/recv() in msg_oob.c.
To catch regression, let's check ioctl(SIOCATMARK) after every
send() and recv() calls.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-27 12:05:01 +02:00
Kuniyuki Iwashima
e400cfa38b af_unix: Fix wrong ioctl(SIOCATMARK) when consumed OOB skb is at the head.
Even if OOB data is recv()ed, ioctl(SIOCATMARK) must return 1 when the
OOB skb is at the head of the receive queue and no new OOB data is queued.

Without fix:

  #  RUN           msg_oob.no_peek.oob ...
  # msg_oob.c:305:oob:Expected answ[0] (0) == oob_head (1)
  # oob: Test terminated by assertion
  #          FAIL  msg_oob.no_peek.oob
  not ok 2 msg_oob.no_peek.oob

With fix:

  #  RUN           msg_oob.no_peek.oob ...
  #            OK  msg_oob.no_peek.oob
  ok 2 msg_oob.no_peek.oob

Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-27 12:05:01 +02:00
Kuniyuki Iwashima
48a9983730 selftest: af_unix: Check EPOLLPRI after every send()/recv() in msg_oob.c
When OOB data is in recvq, we can detect it with epoll by checking
EPOLLPRI.

This patch add checks for EPOLLPRI after every send() and recv() in
all test cases.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-27 12:05:01 +02:00
Kuniyuki Iwashima
d02689e686 selftest: af_unix: Check SIGURG after every send() in msg_oob.c
When data is sent with MSG_OOB, SIGURG is sent to a process if the
receiver socket has set its owner to the process by ioctl(FIOSETOWN)
or fcntl(F_SETOWN).

This patch adds SIGURG check after every send(MSG_OOB) call.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-27 12:05:01 +02:00
Kuniyuki Iwashima
436352e8e5 selftest: af_unix: Add SO_OOBINLINE test cases in msg_oob.c
When SO_OOBINLINE is enabled on a socket, MSG_OOB can be recv()ed
without MSG_OOB flag, and ioctl(SIOCATMARK) will behaves differently.

This patch adds some test cases for SO_OOBINLINE.

Note the new test cases found two bugs in TCP.

  1) After reading OOB data with non-inline mode, we can re-read
     the data by setting SO_OOBINLINE.

  #  RUN           msg_oob.no_peek.inline_oob_ahead_break ...
  # msg_oob.c:146:inline_oob_ahead_break:AF_UNIX :world
  # msg_oob.c:147:inline_oob_ahead_break:TCP     :oworld
  #            OK  msg_oob.no_peek.inline_oob_ahead_break
  ok 14 msg_oob.no_peek.inline_oob_ahead_break

  2) The head OOB data is dropped if SO_OOBINLINE is disabled
     if a new OOB data is queued.

  #  RUN           msg_oob.no_peek.inline_ex_oob_drop ...
  # msg_oob.c:171:inline_ex_oob_drop:AF_UNIX :x
  # msg_oob.c:172:inline_ex_oob_drop:TCP     :y
  # msg_oob.c:146:inline_ex_oob_drop:AF_UNIX :y
  # msg_oob.c:147:inline_ex_oob_drop:TCP     :Resource temporarily unavailable
  #            OK  msg_oob.no_peek.inline_ex_oob_drop
  ok 17 msg_oob.no_peek.inline_ex_oob_drop

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-27 12:05:01 +02:00
Kuniyuki Iwashima
36893ef0b6 af_unix: Don't stop recv() at consumed ex-OOB skb.
Currently, recv() is stopped at a consumed OOB skb even if a new
OOB skb is queued and we can ignore the old OOB skb.

  >>> from socket import *
  >>> c1, c2 = socket(AF_UNIX, SOCK_STREAM)
  >>> c1.send(b'hellowor', MSG_OOB)
  8
  >>> c2.recv(1, MSG_OOB)  # consume OOB data stays at middle of recvq.
  b'r'
  >>> c1.send(b'ld', MSG_OOB)
  2
  >>> c2.recv(10)          # recv() stops at the old consumed OOB
  b'hellowo'               # should be 'hellowol'

manage_oob() should not stop recv() at the old consumed OOB skb if
there is a new OOB data queued.

Note that TCP behaviour is apparently wrong in this test case because
we can recv() the same OOB data twice.

Without fix:

  #  RUN           msg_oob.no_peek.ex_oob_ahead_break ...
  # msg_oob.c:138:ex_oob_ahead_break:AF_UNIX :hellowo
  # msg_oob.c:139:ex_oob_ahead_break:Expected:hellowol
  # msg_oob.c:141:ex_oob_ahead_break:Expected ret[0] (7) == expected_len (8)
  # ex_oob_ahead_break: Test terminated by assertion
  #          FAIL  msg_oob.no_peek.ex_oob_ahead_break
  not ok 11 msg_oob.no_peek.ex_oob_ahead_break

With fix:

  #  RUN           msg_oob.no_peek.ex_oob_ahead_break ...
  # msg_oob.c:146:ex_oob_ahead_break:AF_UNIX :hellowol
  # msg_oob.c:147:ex_oob_ahead_break:TCP     :helloworl
  #            OK  msg_oob.no_peek.ex_oob_ahead_break
  ok 11 msg_oob.no_peek.ex_oob_ahead_break

Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-27 12:05:01 +02:00
Kuniyuki Iwashima
f5ea0768a2 selftest: af_unix: Add non-TCP-compliant test cases in msg_oob.c.
While testing, I found some weird behaviour on the TCP side as well.

For example, TCP drops the preceding OOB data when queueing a new
OOB data if the old OOB data is at the head of recvq.

  #  RUN           msg_oob.no_peek.ex_oob_drop ...
  # msg_oob.c:146:ex_oob_drop:AF_UNIX :x
  # msg_oob.c:147:ex_oob_drop:TCP     :Resource temporarily unavailable
  # msg_oob.c:146:ex_oob_drop:AF_UNIX :y
  # msg_oob.c:147:ex_oob_drop:TCP     :Invalid argument
  #            OK  msg_oob.no_peek.ex_oob_drop
  ok 9 msg_oob.no_peek.ex_oob_drop

  #  RUN           msg_oob.no_peek.ex_oob_drop_2 ...
  # msg_oob.c:146:ex_oob_drop_2:AF_UNIX :x
  # msg_oob.c:147:ex_oob_drop_2:TCP     :Resource temporarily unavailable
  #            OK  msg_oob.no_peek.ex_oob_drop_2
  ok 10 msg_oob.no_peek.ex_oob_drop_2

This patch allows AF_UNIX's MSG_OOB implementation to produce different
results from TCP when operations are guarded with tcp_incompliant{}.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-27 12:05:01 +02:00
Kuniyuki Iwashima
93c99f21db af_unix: Don't stop recv(MSG_DONTWAIT) if consumed OOB skb is at the head.
Let's say a socket send()s "hello" with MSG_OOB and "world" without flags,

  >>> from socket import *
  >>> c1, c2 = socketpair(AF_UNIX)
  >>> c1.send(b'hello', MSG_OOB)
  5
  >>> c1.send(b'world')
  5

and its peer recv()s "hell" and "o".

  >>> c2.recv(10)
  b'hell'
  >>> c2.recv(1, MSG_OOB)
  b'o'

Now the consumed OOB skb stays at the head of recvq to return a correct
value for ioctl(SIOCATMARK), which is broken now and fixed by a later
patch.

Then, if peer issues recv() with MSG_DONTWAIT, manage_oob() returns NULL,
so recv() ends up with -EAGAIN.

  >>> c2.setblocking(False)  # This causes -EAGAIN even with available data
  >>> c2.recv(5)
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  BlockingIOError: [Errno 11] Resource temporarily unavailable

However, next recv() will return the following available data, "world".

  >>> c2.recv(5)
  b'world'

When the consumed OOB skb is at the head of the queue, we need to fetch
the next skb to fix the weird behaviour.

Note that the issue does not happen without MSG_DONTWAIT because we can
retry after manage_oob().

This patch also adds a test case that covers the issue.

Without fix:

  #  RUN           msg_oob.no_peek.ex_oob_break ...
  # msg_oob.c:134:ex_oob_break:AF_UNIX :Resource temporarily unavailable
  # msg_oob.c:135:ex_oob_break:Expected:ld
  # msg_oob.c:137:ex_oob_break:Expected ret[0] (-1) == expected_len (2)
  # ex_oob_break: Test terminated by assertion
  #          FAIL  msg_oob.no_peek.ex_oob_break
  not ok 8 msg_oob.no_peek.ex_oob_break

With fix:

  #  RUN           msg_oob.no_peek.ex_oob_break ...
  #            OK  msg_oob.no_peek.ex_oob_break
  ok 8 msg_oob.no_peek.ex_oob_break

Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-27 12:05:01 +02:00
Kuniyuki Iwashima
b94038d841 af_unix: Stop recv(MSG_PEEK) at consumed OOB skb.
After consuming OOB data, recv() reading the preceding data must break at
the OOB skb regardless of MSG_PEEK.

Currently, MSG_PEEK does not stop recv() for AF_UNIX, and the behaviour is
not compliant with TCP.

  >>> from socket import *
  >>> c1, c2 = socketpair(AF_UNIX)
  >>> c1.send(b'hello', MSG_OOB)
  5
  >>> c1.send(b'world')
  5
  >>> c2.recv(1, MSG_OOB)
  b'o'
  >>> c2.recv(9, MSG_PEEK)  # This should return b'hell'
  b'hellworld'              # even with enough buffer.

Let's fix it by returning NULL for consumed skb and unlinking it only if
MSG_PEEK is not specified.

This patch also adds test cases that add recv(MSG_PEEK) before each recv().

Without fix:

  #  RUN           msg_oob.peek.oob_ahead_break ...
  # msg_oob.c:134:oob_ahead_break:AF_UNIX :hellworld
  # msg_oob.c:135:oob_ahead_break:Expected:hell
  # msg_oob.c:137:oob_ahead_break:Expected ret[0] (9) == expected_len (4)
  # oob_ahead_break: Test terminated by assertion
  #          FAIL  msg_oob.peek.oob_ahead_break
  not ok 13 msg_oob.peek.oob_ahead_break

With fix:

  #  RUN           msg_oob.peek.oob_ahead_break ...
  #            OK  msg_oob.peek.oob_ahead_break
  ok 13 msg_oob.peek.oob_ahead_break

Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-27 12:05:00 +02:00
Kuniyuki Iwashima
d098d77232 selftest: af_unix: Add msg_oob.c.
AF_UNIX's MSG_OOB functionality lacked thorough testing, and we found
some bizarre behaviour.

The new selftest validates every MSG_OOB operation against TCP as a
reference implementation.

This patch adds only a few tests with basic send() and recv() that
do not fail.

The following patches will add more test cases for SO_OOBINLINE, SIGURG,
EPOLLPRI, and SIOCATMARK.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-27 12:05:00 +02:00
Kuniyuki Iwashima
7d139181a8 selftest: af_unix: Remove test_unix_oob.c.
test_unix_oob.c does not fully cover AF_UNIX's MSG_OOB functionality,
thus there are discrepancies between TCP behaviour.

Also, the test uses fork() to create message producer, and it's not
easy to understand and add more test cases.

Let's remove test_unix_oob.c and rewrite a new test.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-27 12:05:00 +02:00
Babu Moger
48236960c0 selftests/resctrl: Fix non-contiguous CBM for AMD
The non-contiguous CBM test fails on AMD with:
Starting L3_NONCONT_CAT test ...
Mounting resctrl to "/sys/fs/resctrl"
CPUID output doesn't match 'sparse_masks' file content!
not ok 5 L3_NONCONT_CAT: test

AMD always supports non-contiguous CBM but does not report it via CPUID.

Fix the non-contiguous CBM test to use CPUID to discover non-contiguous
CBM support only on Intel.

Fixes: ae638551ab64 ("selftests/resctrl: Add non-contiguous CBMs CAT test")
Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-06-26 13:22:34 -06:00