When tests are run by runner.sh, bond_options.sh gets killed before
it can complete:
make -C tools/testing/selftests run_tests TARGETS="drivers/net/bonding"
[...]
# timeout set to 120
# selftests: drivers/net/bonding: bond_options.sh
# TEST: prio (active-backup miimon primary_reselect 0) [ OK ]
# TEST: prio (active-backup miimon primary_reselect 1) [ OK ]
# TEST: prio (active-backup miimon primary_reselect 2) [ OK ]
# TEST: prio (active-backup arp_ip_target primary_reselect 0) [ OK ]
# TEST: prio (active-backup arp_ip_target primary_reselect 1) [ OK ]
# TEST: prio (active-backup arp_ip_target primary_reselect 2) [ OK ]
#
not ok 7 selftests: drivers/net/bonding: bond_options.sh # TIMEOUT 120 seconds
This test includes many sleep statements, at least some of which are
related to timers in the operation of the bonding driver itself. Increase
the test timeout to allow the test to complete.
I ran the test in slightly different VMs (including one without HW
virtualization support) and got runtimes of 13m39.760s, 13m31.238s, and
13m2.956s. Use a ~1.5x "safety factor" and set the timeout to 1200s.
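The margin implied by these numbers can be checked with a short calculation. This is only an illustrative sketch in Python: the runtimes and the 1200s value come from the message above, while the variable names are made up for the example.

```python
# Runtimes reported in the commit message, converted to seconds.
runtimes_s = [13 * 60 + 39.760, 13 * 60 + 31.238, 13 * 60 + 2.956]

slowest_s = max(runtimes_s)   # slowest observed run
fastest_s = min(runtimes_s)   # fastest observed run
chosen_timeout_s = 1200       # timeout picked in the commit

# Effective safety factor relative to the slowest and fastest runs.
margin_slowest = chosen_timeout_s / slowest_s
margin_fastest = chosen_timeout_s / fastest_s

print(f"slowest run {slowest_s:.3f}s, margin {margin_slowest:.2f}x")
print(f"fastest run {fastest_s:.3f}s, margin {margin_fastest:.2f}x")
```

Both ratios land close to the "~1.5x" quoted above, and far above the default 120s that killed the test.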
Fixes: 42a8d4aaea84 ("selftests: bonding: add bonding prio option test")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20240116104402.1203850a@kernel.org/#t
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240118001233.304759-1-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Merge tag 'net-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf and netfilter.
Previous releases - regressions:
- Revert "net: rtnetlink: Enslave device before bringing it up",
breaks the case inverse to the one it was trying to fix
- net: dsa: fix oob access in DSA's netdevice event handler, which
dereferenced netdev_priv() before checking that it's a DSA port
- sched: track device in tcf_block_get/put_ext() only for clsact
binder types
- net: tls, fix WARNING in __sk_msg_free when record becomes full
during splice and MORE hint set
- sfp-bus: fix SFP mode detect from bitrate
- drv: stmmac: prevent DSA tags from breaking COE
Previous releases - always broken:
- bpf: fix no forward progress in bpf_iter_udp if output buffer is
too small
- bpf: reject variable offset alu on registers with a type of
PTR_TO_FLOW_KEYS to prevent oob access
- netfilter: tighten input validation
- net: add more sanity check in virtio_net_hdr_to_skb()
- rxrpc: fix use of Don't Fragment flag on RESPONSE packets, avoid
infinite loop
- amt: do not use the portion of skb->cb area which may get clobbered
- mptcp: improve validation of the MPTCPOPT_MP_JOIN MPTCP option
Misc:
- spring cleanup of inactive maintainers"
* tag 'net-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits)
i40e: Include types.h to some headers
ipv6: mcast: fix data-race in ipv6_mc_down / mld_ifc_work
selftests: mlxsw: qos_pfc: Adjust the test to support 8 lanes
selftests: mlxsw: qos_pfc: Remove wrong description
mlxsw: spectrum_router: Register netdevice notifier before nexthop
mlxsw: spectrum_acl_tcam: Fix stack corruption
mlxsw: spectrum_acl_tcam: Fix NULL pointer dereference in error path
mlxsw: spectrum_acl_erp: Fix error flow of pool allocation failure
ethtool: netlink: Add missing ethnl_ops_begin/complete
selftests: bonding: Add more missing config options
selftests: netdevsim: add a config file
libbpf: warn on unexpected __arg_ctx type when rewriting BTF
selftests/bpf: add tests confirming type logic in kernel for __arg_ctx
bpf: enforce types for __arg_ctx-tagged arguments in global subprogs
bpf: extract bpf_ctx_convert_map logic and make it more reusable
libbpf: feature-detect arg:ctx tag support in kernel
ipvs: avoid stat macros calls from preemptible context
netfilter: nf_tables: reject NFT_SET_CONCAT with not field length description
netfilter: nf_tables: skip dead set elements in netlink dump
netfilter: nf_tables: do not allow mismatch field size and set key length
...
Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd updates from Jason Gunthorpe:
"This brings the first of three planned user IO page table invalidation
operations:
- IOMMU_HWPT_INVALIDATE allows invalidating the IOTLB integrated into
the iommu itself. The Intel implementation will also generate an
ATC invalidation to flush the device IOTLB as it unambiguously
knows the device, but other HW will not.
It goes along with the prior PR to implement userspace IO page tables
(aka nested translation for VMs) to allow Intel to have full
functionality for simple cases. An Intel implementation of the
operation is provided.
Also fix a small bug in the selftest mock iommu driver probe"
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
iommufd/selftest: Check the bus type during probe
iommu/vt-d: Add iotlb flush for nested domain
iommufd: Add data structure for Intel VT-d stage-1 cache invalidation
iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
iommufd/selftest: Add mock_domain_cache_invalidate_user support
iommu: Add iommu_copy_struct_from_user_array helper
iommufd: Add IOMMU_HWPT_INVALIDATE
iommu: Add cache_invalidate_user op
Merge tag 'trace-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:
- Allow kernel trace instance creation to specify what events are
created
Inside the kernel, a subsystem may create a tracing instance that it
can use to send events to user space. This sub-system may not care
about the thousands of events that exist in eventfs. Allow the
sub-system to specify what sub-systems of events it cares about, and
only those events are exposed to this instance.
- Allow the ring buffer to be broken up into bigger sub-buffers than
just the architecture page size.
A new tracefs file called "buffer_subbuf_size_kb" is created. The
user can now specify a minimum size the sub-buffer may be in
kilobytes. Note that the implementation currently makes the
sub-buffer size a power of 2 pages (1, 2, 4, 8, 16, ...) but the user
only writes in kilobyte size, and the sub-buffer will be updated to
the next size that can accommodate it. If the user writes in 10, it
will change the size to be 4 pages on x86 (16K), as that is the next
available size that can hold 10K.
- Update the debug output when a corrupt time is detected in the ring
buffer. If the ring buffer detects inconsistent timestamps, there's a
debug config option that will dump the contents of the meta data of
the sub-buffer that is used for debugging. Add some more information
to this dump that helps with debugging.
- Add more timestamp debugging checks (only triggers when the config is
enabled)
- Increase the trace_seq iterator to 2 page sizes.
- Allow strings written into tracefs_marker to be larger. Up to just
under 2 page sizes (based on what trace_seq can hold).
- Increase the trace_marker_raw write to be as big as a sub-buffer can
hold.
- Remove 32 bit time stamp logic, now that the rb_time_cmpxchg() has
been removed.
- More selftests were added.
- Some code clean ups as well.
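The rounding rule described for "buffer_subbuf_size_kb" above can be sketched as follows. This is a Python illustration of the arithmetic only, not the kernel code: the function name and the default 4K page size are assumptions, and any upper limit the kernel enforces is ignored.

```python
def subbuf_size_kb(requested_kb: int, page_kb: int = 4) -> int:
    """Round a requested size up to a power-of-two number of pages.

    Hypothetical helper: page_kb=4 matches the x86 example in the text.
    """
    if requested_kb <= 0:
        raise ValueError("requested size must be positive")
    pages_needed = -(-requested_kb // page_kb)  # ceiling division
    pages = 1
    while pages < pages_needed:                 # next power of two >= pages_needed
        pages *= 2
    return pages * page_kb


# Writing 10 needs 3 pages, which rounds up to 4 pages.
print(subbuf_size_kb(10))
```

With a 4K page, a request of 10 needs three pages, and the next power of two is four pages, which gives the 16K figure quoted in the text.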
* tag 'trace-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (29 commits)
ring-buffer: Remove stale comment from ring_buffer_size()
tracing histograms: Simplify parse_actions() function
tracing/selftests: Remove exec permissions from trace_marker.tc test
ring-buffer: Use subbuf_order for buffer page masking
tracing: Update subbuffer with kilobytes not page order
ringbuffer/selftest: Add basic selftest to test changing subbuf order
ring-buffer: Add documentation on the buffer_subbuf_order file
ring-buffer: Just update the subbuffers when changing their allocation order
ring-buffer: Keep the same size when updating the order
tracing: Stop the tracing while changing the ring buffer subbuf size
tracing: Update snapshot order along with main buffer order
ring-buffer: Make sure the spare sub buffer used for reads has same size
ring-buffer: Do no swap cpu buffers if order is different
ring-buffer: Clear pages on error in ring_buffer_subbuf_order_set() failure
ring-buffer: Read and write to ring buffers with custom sub buffer size
ring-buffer: Set new size of the ring buffer sub page
ring-buffer: Add interface for configuring trace sub buffer size
ring-buffer: Page size per ring buffer
ring-buffer: Have ring_buffer_print_page_header() be able to access ring_buffer_iter
ring-buffer: Check if absolute timestamp goes backwards
...
Merge tag 'x86_sgx_for_6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SGX updates from Dave Hansen:
"This time, these are entirely confined to SGX selftests fixes.
The mini SGX enclave built by the selftests has garnered some
attention because it stands alone and does not need the sizable
infrastructure of the official SGX SDK. I think that's why folks are
suddenly interested in cleaning it up.
- Clean up selftest compilation issues, mostly from non-gcc compilers
- Avoid building selftests when not on x86"
* tag 'x86_sgx_for_6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests/sgx: Skip non X86_64 platform
selftests/sgx: Remove incomplete ABI sanitization code in test enclave
selftests/sgx: Discard unsupported ELF sections
selftests/sgx: Ensure expected location of test enclave buffer
selftests/sgx: Ensure test enclave buffer is entirely preserved
selftests/sgx: Fix linker script asserts
selftests/sgx: Handle relocations in test enclave
selftests/sgx: Produce static-pie executable for test enclave
selftests/sgx: Remove redundant enclave base address save/restore
selftests/sgx: Specify freestanding environment for enclave compilation
selftests/sgx: Separate linker options
selftests/sgx: Include memory clobber for inline asm in test enclave
selftests/sgx: Fix uninitialized pointer dereferences in encl_get_entry
selftests/sgx: Fix uninitialized pointer dereference in error path
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2024-01-18
We've added 10 non-merge commits during the last 5 day(s) which contain
a total of 12 files changed, 806 insertions(+), 51 deletions(-).
The main changes are:
1) Fix an issue in bpf_iter_udp under backward progress which prevents
user space process from finishing iteration, from Martin KaFai Lau.
2) Fix BPF verifier to reject variable offset alu on registers with a type
of PTR_TO_FLOW_KEYS to prevent oob access, from Hao Sun.
3) Follow up fixes for kernel- and libbpf-side logic around handling
arg:ctx tagged arguments of BPF global subprogs, from Andrii Nakryiko.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
libbpf: warn on unexpected __arg_ctx type when rewriting BTF
selftests/bpf: add tests confirming type logic in kernel for __arg_ctx
bpf: enforce types for __arg_ctx-tagged arguments in global subprogs
bpf: extract bpf_ctx_convert_map logic and make it more reusable
libbpf: feature-detect arg:ctx tag support in kernel
selftests/bpf: Add test for alu on PTR_TO_FLOW_KEYS
bpf: Reject variable offset alu on PTR_TO_FLOW_KEYS
selftests/bpf: Test udp and tcp iter batching
bpf: Avoid iter->offset making backward progress in bpf_iter_udp
bpf: iter_udp: Retry with a larger batch size without going back to the previous bucket
====================
Link: https://lore.kernel.org/r/20240118153936.11769-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The 'qos_pfc' test checks PFC behavior. The idea is to limit the traffic
using a shaper somewhere in the flow of the packets. In this area, the
buffer is smaller than the buffer at the beginning of the flow, so it fills
up until there is no more space left. The test configures PFC there,
which is supposed to notice that the headroom is filling up and send PFC
Xoff to tell the transmitter to stop sending traffic for the priorities
sharing this PG.
The Xon/Xoff threshold is auto-configured and always equal to
2*(MTU rounded up to cell size). Even after sending the PFC Xoff packet,
traffic will keep arriving until the transmitter receives and processes
the PFC packet. This amount of traffic is known as the PFC delay allowance.
Currently the buffer for the delay traffic is configured as 100KB. The
MTU in the test is 10KB, therefore the threshold for Xoff is about 20KB.
This allows 80KB extra to be stored in this buffer.
8-lane ports use two buffers, between which the configured buffer is
split; the Xoff threshold then applies to each buffer in parallel.
The test does not take the behavior of 8-lane ports into account: when the
ports are configured to 400Gbps with 8 lanes or 800Gbps with 8 lanes,
packets are dropped and the test fails.
Check whether the relevant ports use 8 lanes and, if so, double the size
of the buffer, as the headroom is split half-half.
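The buffer arithmetic behind this change can be sketched as follows. This is a Python illustration only: the function and parameter names are made up, and the rounding of the MTU up to the cell size is ignored, so the threshold is taken as exactly 2*MTU.

```python
def delay_allowance_kb(buffer_kb: float, mtu_kb: float, lanes: int) -> float:
    """Space left above the Xoff threshold in each headroom buffer.

    Assumption for this sketch: 8-lane ports split the configured buffer
    evenly in two, and every other lane count uses a single buffer.
    """
    parts = 2 if lanes == 8 else 1
    xoff_threshold_kb = 2 * mtu_kb  # cell-size rounding ignored here
    return buffer_kb / parts - xoff_threshold_kb


print(delay_allowance_kb(100, 10, lanes=4))  # single buffer
print(delay_allowance_kb(100, 10, lanes=8))  # split buffer, original size
print(delay_allowance_kb(200, 10, lanes=8))  # split buffer, doubled size
```

Under these assumptions, splitting the 100KB buffer leaves only 30KB per half above the threshold, while doubling the buffer restores the 80KB of delay allowance that other ports get.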
Cc: Shuah Khan <shuah@kernel.org>
Fixes: bfa804784e32 ("selftests: mlxsw: Add a PFC test")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/23ff11b7dff031eb04a41c0f5254a2b636cd8ebb.1705502064.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In the diagram of the topology, $swp3 and $swp4 are described as 1Gbps
ports. This is wrong: the test does not configure such a speed.
Cc: Shuah Khan <shuah@kernel.org>
Fixes: bfa804784e32 ("selftests: mlxsw: Add a PFC test")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/0087e2d416aff7e444d15f7c2958fc1d438dc27e.1705502064.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When tc filters are first added to a net device, the corresponding local
port gets bound to an ACL group in the device. The group contains a list
of ACLs. In turn, each ACL points to a different TCAM region where the
filters are stored. During forwarding, the ACLs are sequentially
evaluated until a match is found.
One reason to place filters in different regions is when they are added
with decreasing priorities and in an alternating order so that two
consecutive filters can never fit in the same region because of their
key usage.
In Spectrum-2 and newer ASICs the firmware started to report that the
maximum number of ACLs in a group is more than 16, but the layout of the
register that configures ACL groups (PAGT) was not updated to account
for that. It is therefore possible to hit stack corruption [1] in the
rare case where more than 16 ACLs in a group are required.
Fix by limiting the maximum ACL group size to the minimum between what
the firmware reports and the maximum ACLs that fit in the PAGT register.
Add a test case to make sure the machine does not crash when this
condition is hit.
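The clamping described above amounts to taking a minimum. A minimal Python sketch, assuming the PAGT layout holds 16 ACLs as the text implies; the names are hypothetical and are not the driver's identifiers.

```python
# ACL slots that fit in the PAGT register layout, per the text above.
PAGT_MAX_ACLS = 16


def max_group_size(firmware_reported: int,
                   register_limit: int = PAGT_MAX_ACLS) -> int:
    """Never allow more ACLs in a group than the register can describe."""
    return min(firmware_reported, register_limit)


# Firmware that reports a larger limit is clamped to the register size.
print(max_group_size(32))
# Firmware that reports a smaller limit is used as-is.
print(max_group_size(8))
```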
[1]
Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: mlxsw_sp_acl_tcam_group_update+0x116/0x120
[...]
dump_stack_lvl+0x36/0x50
panic+0x305/0x330
__stack_chk_fail+0x15/0x20
mlxsw_sp_acl_tcam_group_update+0x116/0x120
mlxsw_sp_acl_tcam_group_region_attach+0x69/0x110
mlxsw_sp_acl_tcam_vchunk_get+0x492/0xa20
mlxsw_sp_acl_tcam_ventry_add+0x25/0xe0
mlxsw_sp_acl_rule_add+0x47/0x240
mlxsw_sp_flower_replace+0x1a9/0x1d0
tc_setup_cb_add+0xdc/0x1c0
fl_hw_replace_filter+0x146/0x1f0
fl_change+0xc17/0x1360
tc_new_tfilter+0x472/0xb90
rtnetlink_rcv_msg+0x313/0x3b0
netlink_rcv_skb+0x58/0x100
netlink_unicast+0x244/0x390
netlink_sendmsg+0x1e4/0x440
____sys_sendmsg+0x164/0x260
___sys_sendmsg+0x9a/0xe0
__sys_sendmsg+0x7a/0xc0
do_syscall_64+0x40/0xe0
entry_SYSCALL_64_after_hwframe+0x63/0x6b
Fixes: c3ab435466d5 ("mlxsw: spectrum: Extend to support Spectrum-2 ASIC")
Reported-by: Orel Hagag <orelh@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/2d91c89afba59c22587b444994ae419dbea8d876.1705502064.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Recently, a bug was found when many TC filters are added: at some point,
several bugs are printed to dmesg [1] and the switch crashes with a
segmentation fault.
The issue starts when gen_pool_free() fails because of unexpected
behavior: an attempt to free memory which is already freed. This leads to
a BUG() call, which crashes the switch and triggers many other bugs.
Tracking down the unexpected behavior led to a bug in the eRP code. The
function mlxsw_sp_acl_erp_table_alloc() gets a pointer to the allocated
index, sets the value and returns an error code. When gen_pool_alloc()
fails it returns address 0; we detect this and return -ENOBUFS, but
the call to gen_pool_alloc() has already overwritten the index in the
erp_table structure. This is a problem when such an allocation is done as
part of table expansion: unlike a new table, which would not be used
after an allocation failure, an existing table stays in use. We try to
expand the eRP table and overwrite the current (non-zero) index with zero.
This then leads to unexpected behavior when address 0 is freed twice. Note
that address 0 is valid in erp_table->base_index and indeed other tables
use it.
gen_pool_alloc() fails when there is no space left in the
pre-allocated pool. In our case, the pool is limited to
ACL_MAX_ERPT_BANK_SIZE, which is read from hardware. When more than the
maximum number of eRP entries are required, we exceed the limit and return
an error, which leads to the "Failed to migrate vregion" print.
Fix this by changing erp_table->base_index only in case of a successful
allocation.
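The difference between the old and the fixed behavior can be modeled in a few lines. This is a toy Python model, not the driver code: the allocator, the class and the function names are invented for illustration, and the model treats 0 purely as the failure value.

```python
import errno


class ErpTable:
    """Toy stand-in for the erp_table structure."""

    def __init__(self, base_index: int):
        self.base_index = base_index


def pool_alloc(free_slots: list) -> int:
    """Toy allocator: returns an address, or 0 when the pool is exhausted."""
    return free_slots.pop() if free_slots else 0


def table_alloc_buggy(table: ErpTable, free_slots: list) -> int:
    # Old behavior: the index is overwritten even when allocation fails.
    table.base_index = pool_alloc(free_slots)
    if table.base_index == 0:
        return -errno.ENOBUFS
    return 0


def table_alloc_fixed(table: ErpTable, free_slots: list) -> int:
    # Fixed behavior: update the index only after a successful allocation.
    addr = pool_alloc(free_slots)
    if addr == 0:
        return -errno.ENOBUFS
    table.base_index = addr
    return 0


old = ErpTable(base_index=7)
new = ErpTable(base_index=7)
print(table_alloc_buggy(old, []), old.base_index)  # index clobbered
print(table_alloc_fixed(new, []), new.base_index)  # index preserved
```

In the buggy variant a failed expansion leaves the table pointing at index 0, so a later free releases an address the table never owned. The fixed variant leaves the previous index intact.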
Add a test case for such a scenario. Without this fix it causes a
segmentation fault:
$ TESTS="max_erp_entries_test" ./tc_flower.sh
./tc_flower.sh: line 988: 1560 Segmentation fault tc filter del dev $h2 ingress chain $i protocol ip pref $i handle $j flower &>/dev/null
[1]:
kernel BUG at lib/genalloc.c:508!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 6 PID: 3531 Comm: tc Not tainted 6.7.0-rc5-custom-ga6893f479f5e #1
Hardware name: Mellanox Technologies Ltd. MSN4700/VMOD0010, BIOS 5.11 07/12/2021
RIP: 0010:gen_pool_free_owner+0xc9/0xe0
...
Call Trace:
<TASK>
__mlxsw_sp_acl_erp_table_other_dec+0x70/0xa0 [mlxsw_spectrum]
mlxsw_sp_acl_erp_mask_destroy+0xf5/0x110 [mlxsw_spectrum]
objagg_obj_root_destroy+0x18/0x80 [objagg]
objagg_obj_destroy+0x12c/0x130 [objagg]
mlxsw_sp_acl_erp_mask_put+0x37/0x50 [mlxsw_spectrum]
mlxsw_sp_acl_ctcam_region_entry_remove+0x74/0xa0 [mlxsw_spectrum]
mlxsw_sp_acl_ctcam_entry_del+0x1e/0x40 [mlxsw_spectrum]
mlxsw_sp_acl_tcam_ventry_del+0x78/0xd0 [mlxsw_spectrum]
mlxsw_sp_flower_destroy+0x4d/0x70 [mlxsw_spectrum]
mlxsw_sp_flow_block_cb+0x73/0xb0 [mlxsw_spectrum]
tc_setup_cb_destroy+0xc1/0x180
fl_hw_destroy_filter+0x94/0xc0 [cls_flower]
__fl_delete+0x1ac/0x1c0 [cls_flower]
fl_destroy+0xc2/0x150 [cls_flower]
tcf_proto_destroy+0x1a/0xa0
...
mlxsw_spectrum3 0000:07:00.0: Failed to migrate vregion
mlxsw_spectrum3 0000:07:00.0: Failed to migrate vregion
Fixes: f465261aa105 ("mlxsw: spectrum_acl: Implement common eRP core")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/4cfca254dfc0e5d283974801a24371c7b6db5989.1705502064.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As a followup to commit 03fb8565c880 ("selftests: bonding: add missing
build configs"), add more networking-specific config options which are
needed for bonding tests.
For testing, I used the minimal config generated by virtme-ng and I added
the options in the config file. All bonding tests passed.
Fixes: bbb774d921e2 ("net: Add tests for bonding and team address list management") # for ipv6
Fixes: 6cbe791c0f4e ("kselftest: bonding: add num_grat_arp test") # for tc options
Fixes: 222c94ec0ad4 ("selftests: bonding: add tests for ether type changes") # for nlmon
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Link: https://lore.kernel.org/r/20240116154926.202164-1-bpoirier@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
netdevsim tests aren't very well integrated with kselftest,
which has its advantages and disadvantages. But regardless
of the intended integration - a config file to know what kernel
to build is very useful, add one.
Fixes: fc4c93f145d7 ("selftests: add basic netdevsim devlink flash testing")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20240116154311.1945801-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add a bunch of global subprogs across a variety of program types to
validate expected kernel type enforcement logic for __arg_ctx arguments.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240118033143.3384355-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add feature detector of kernel-side arg:ctx (__arg_ctx) tag support. If
this is detected, libbpf will avoid doing any __arg_ctx-related BTF
rewriting and checks in favor of letting kernel handle this completely.
test_global_funcs/ctx_arg_rewrite subtest is adjusted to do the same
feature detection (albeit in a much simpler, though roundabout and
inefficient, way), and skip the tests. This is done to still be able to
execute this test on older kernels (like in libbpf CI).
Note, BPF token series ([0]) does a major refactor and code moving of
libbpf-internal feature detection "framework", so to avoid unnecessary
conflicts we keep newly added feature detection stand-alone with ad-hoc
result caching. Once things settle, there will be a small follow up to
re-integrate everything back and move code into its final place in
newly-added (by BPF token series) features.c file.
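The general shape of a stand-alone feature detector with ad-hoc result caching can be sketched as follows. This is a Python illustration of the pattern only: libbpf is written in C, and the class, the probe callback and the helper below are hypothetical, not libbpf APIs.

```python
class FeatureCache:
    """Run a (possibly expensive) probe once and remember the answer."""

    def __init__(self, probe):
        self._probe = probe   # hypothetical callable returning True/False
        self._result = None   # None means "not probed yet"

    def supported(self) -> bool:
        if self._result is None:
            self._result = bool(self._probe())
        return self._result


def needs_btf_rewrite(kernel_arg_ctx: FeatureCache) -> bool:
    # If the kernel handles the tag itself, skip the user-space rewriting.
    return not kernel_arg_ctx.supported()


calls = []


def fake_probe() -> bool:
    """Stand-in probe that records each invocation."""
    calls.append(1)
    return True


cache = FeatureCache(fake_probe)
print(needs_btf_rewrite(cache), needs_btf_rewrite(cache), len(calls))
```

Keeping the cached result next to the probe, rather than in a shared table, is what lets such a detector stay independent of a feature-detection framework that is being refactored elsewhere.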
[0] https://patchwork.kernel.org/project/netdevbpf/list/?series=814209&state=*
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240118033143.3384355-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
- Use memdup_array_user() to harden against overflow.
- Unconditionally advertise KVM_CAP_DEVICE_CTRL for all architectures.
- Clean up Kconfigs that all KVM architectures were selecting
- New functionality around "guest_memfd", a new userspace API that
creates an anonymous file and returns a file descriptor that refers
to it. guest_memfd files are bound to their owning virtual machine,
cannot be mapped, read, or written by userspace, and cannot be resized.
guest_memfd files do however support PUNCH_HOLE, which can be used to
switch a memory area between guest_memfd and regular anonymous memory.
- New ioctl KVM_SET_MEMORY_ATTRIBUTES allowing userspace to specify
per-page attributes for a given page of guest memory; right now the
only attribute is whether the guest expects to access memory via
guest_memfd or not, which in confidential VMs backed by SEV-SNP,
TDX or ARM64 pKVM is checked by firmware or hypervisor that guarantees
confidentiality (AMD PSP, Intel TDX module, or EL2 in the case of pKVM).
x86:
- Support for "software-protected VMs" that can use the new guest_memfd
and page attributes infrastructure. This is mostly useful for testing,
since there is no pKVM-like infrastructure to provide a meaningfully
reduced TCB.
- Fix a relatively benign off-by-one error when splitting huge pages during
CLEAR_DIRTY_LOG.
- Fix a bug where KVM could incorrectly test-and-clear dirty bits in non-leaf
TDP MMU SPTEs if a racing thread replaces a huge SPTE with a non-huge SPTE.
- Use more generic lockdep assertions in paths that don't actually care
about whether the caller is a reader or a writer.
- Let Xen guests opt out of having PV clock reported as "based on a stable TSC",
because some of them don't expect the "TSC stable" bit (added to the pvclock
ABI by KVM, but never set by Xen) to be set.
- Revert a bogus, made-up nested SVM consistency check for TLB_CONTROL.
- Advertise flush-by-ASID support for nSVM unconditionally, as KVM always
flushes on nested transitions, i.e. always satisfies flush requests. This
allows running bleeding edge versions of VMware Workstation on top of KVM.
- Sanity check that the CPU supports flush-by-ASID when enabling SEV support.
- On AMD machines with vNMI, always rely on hardware instead of intercepting
IRET in some cases to detect unmasking of NMIs.
- Support for virtualizing Linear Address Masking (LAM)
- Fix a variety of vPMU bugs where KVM fails to stop/reset counters and other state
prior to refreshing the vPMU model.
- Fix a double-overflow PMU bug by tracking emulated counter events using a
dedicated field instead of snapshotting the "previous" counter. If the
hardware PMC count triggers overflow that is recognized in the same VM-Exit
that KVM manually bumps an event count, KVM would pend PMIs for both the
hardware-triggered overflow and for KVM-triggered overflow.
- Turn off KVM_WERROR by default for all configs so that it's not
inadvertently enabled by non-KVM developers, which can be problematic for
subsystems that require no regressions for W=1 builds.
- Advertise all of the host-supported CPUID bits that enumerate IA32_SPEC_CTRL
"features".
- Don't force a masterclock update when a vCPU synchronizes to the current TSC
generation, as updating the masterclock can cause kvmclock's time to "jump"
unexpectedly, e.g. when userspace hotplugs a pre-created vCPU.
- Use RIP-relative address to read kvm_rebooting in the VM-Enter fault paths,
partly as a super minor optimization, but mostly to make KVM play nice with
position independent executable builds.
- Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on
CONFIG_HYPERV as a minor optimization, and to self-document the code.
- Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV "emulation"
at build time.
ARM64:
- LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB
base granule sizes. Branch shared with the arm64 tree.
- Large Fine-Grained Trap rework, bringing some sanity to the
feature, although there is more to come. This comes with
a prefix branch shared with the arm64 tree.
- Some additional Nested Virtualization groundwork, mostly
introducing the NV2 VNCR support and retargeting the NV
support to that version of the architecture.
- A small set of vgic fixes and associated cleanups.
Loongarch:
- Optimization for memslot hugepage checking
- Cleanup and fix some HW/SW timer issues
- Add LSX/LASX (128bit/256bit SIMD) support
RISC-V:
- KVM_GET_REG_LIST improvement for vector registers
- Generate ISA extension reg_list using macros in get-reg-list selftest
- Support for reporting steal time along with selftest
s390:
- Bugfixes
Selftests:
- Fix an annoying goof where the NX hugepage test prints out garbage
instead of the magic token needed to run the test.
- Fix build errors when a header is deleted/moved due to a missing flag
in the Makefile.
- Detect if KVM bugged/killed a selftest's VM and print out a helpful
message instead of complaining that a random ioctl() failed.
- Annotate the guest printf/assert helpers with __printf(), and fix the
various bugs that were lurking due to lack of said annotation.
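The double-overflow PMU fix listed under x86 above can be illustrated with a toy model. This is only a sketch under stated assumptions, not KVM's code: the 48-bit counter width, the function names, and the order in which the hardware count is folded into the software count are all hypothetical simplifications. It contrasts detecting the emulated overflow via a stale "previous" snapshot with detecting it from a dedicated emulated-events value applied after the hardware count.

```python
MASK = (1 << 48) - 1  # assumed counter width, for illustration only


def pmis_with_snapshot(sw_count, hw_events, emulated_events):
    """Toy model of the old scheme: compare against a 'previous' snapshot."""
    pmis = 0
    # Hardware-triggered overflow, signalled by the hardware PMC itself.
    if sw_count + hw_events > MASK:
        pmis += 1
    prev = sw_count  # snapshot taken before the hardware count is folded in
    count = (sw_count + emulated_events) & MASK
    count = (count + hw_events) & MASK  # fold in the hardware count
    # The wrap caused by hardware events is mistaken for an emulated overflow.
    if count < prev:
        pmis += 1
    return pmis


def pmis_with_dedicated_field(sw_count, hw_events, emulated_events):
    """Toy model of the new scheme: emulated events are tracked separately."""
    pmis = 0
    if sw_count + hw_events > MASK:
        pmis += 1
    count = (sw_count + hw_events) & MASK  # fold in the hardware count first
    prev = count
    count = (count + emulated_events) & MASK
    # Only a wrap caused by the emulated events themselves counts here.
    if count < prev:
        pmis += 1
    return pmis
```

In this model, a counter two events short of wrapping that sees two hardware events and one emulated event in the same exit pends two PMIs under the snapshot scheme and one under the dedicated-field scheme.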
There are two non-KVM patches buried in the middle of guest_memfd support:
fs: Rename anon_inode_getfile_secure() and anon_inode_getfd_secure()
mm: Add AS_UNMOVABLE to mark mapping as completely unmovable
The first is small and mostly suggested-by Christian Brauner; the second
a bit less so but it was written by an mm person (Vlastimil Babka).
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"Generic:
- Use memdup_array_user() to harden against overflow.
- Unconditionally advertise KVM_CAP_DEVICE_CTRL for all
architectures.
- Clean up Kconfigs that all KVM architectures were selecting
- New functionality around "guest_memfd", a new userspace API that
creates an anonymous file and returns a file descriptor that refers
to it. guest_memfd files are bound to their owning virtual machine,
cannot be mapped, read, or written by userspace, and cannot be
resized. guest_memfd files do however support PUNCH_HOLE, which can
be used to switch a memory area between guest_memfd and regular
anonymous memory.
- New ioctl KVM_SET_MEMORY_ATTRIBUTES allowing userspace to specify
per-page attributes for a given page of guest memory; right now the
only attribute is whether the guest expects to access memory via
guest_memfd or not, which in confidential VMs backed by SEV-SNP,
TDX or ARM64 pKVM is checked by firmware or hypervisor that
guarantees confidentiality (AMD PSP, Intel TDX module, or EL2 in
the case of pKVM).
x86:
- Support for "software-protected VMs" that can use the new
guest_memfd and page attributes infrastructure. This is mostly
useful for testing, since there is no pKVM-like infrastructure to
provide a meaningfully reduced TCB.
- Fix a relatively benign off-by-one error when splitting huge pages
during CLEAR_DIRTY_LOG.
- Fix a bug where KVM could incorrectly test-and-clear dirty bits in
non-leaf TDP MMU SPTEs if a racing thread replaces a huge SPTE with
a non-huge SPTE.
- Use more generic lockdep assertions in paths that don't actually
care about whether the caller is a reader or a writer.
- Let Xen guests opt out of having PV clock reported as "based on a
stable TSC", because some of them don't expect the "TSC stable" bit
(added to the pvclock ABI by KVM, but never set by Xen) to be set.
- Revert a bogus, made-up nested SVM consistency check for
TLB_CONTROL.
- Advertise flush-by-ASID support for nSVM unconditionally, as KVM
always flushes on nested transitions, i.e. always satisfies flush
requests. This allows running bleeding edge versions of VMware
Workstation on top of KVM.
- Sanity check that the CPU supports flush-by-ASID when enabling SEV
support.
- On AMD machines with vNMI, always rely on hardware instead of
intercepting IRET in some cases to detect unmasking of NMIs.
- Support for virtualizing Linear Address Masking (LAM).
- Fix a variety of vPMU bugs where KVM fails to stop/reset counters
and other state prior to refreshing the vPMU model.
- Fix a double-overflow PMU bug by tracking emulated counter events
using a dedicated field instead of snapshotting the "previous"
counter. If the hardware PMC count triggers overflow that is
recognized in the same VM-Exit that KVM manually bumps an event
count, KVM would pend PMIs for both the hardware-triggered overflow
and for KVM-triggered overflow.
- Turn off KVM_WERROR by default for all configs so that it's not
inadvertently enabled by non-KVM developers, which can be
problematic for subsystems that require no regressions for W=1
builds.
- Advertise all of the host-supported CPUID bits that enumerate
IA32_SPEC_CTRL "features".
- Don't force a masterclock update when a vCPU synchronizes to the
current TSC generation, as updating the masterclock can cause
kvmclock's time to "jump" unexpectedly, e.g. when userspace
hotplugs a pre-created vCPU.
- Use RIP-relative address to read kvm_rebooting in the VM-Enter
fault paths, partly as a super minor optimization, but mostly to
make KVM play nice with position independent executable builds.
- Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on
CONFIG_HYPERV as a minor optimization, and to self-document the
code.
- Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV
"emulation" at build time.
ARM64:
- LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB base
granule sizes. Branch shared with the arm64 tree.
- Large Fine-Grained Trap rework, bringing some sanity to the
feature, although there is more to come. This comes with a prefix
branch shared with the arm64 tree.
- Some additional Nested Virtualization groundwork, mostly
introducing the NV2 VNCR support and retargeting the NV support to
that version of the architecture.
- A small set of vgic fixes and associated cleanups.
Loongarch:
- Optimization for memslot hugepage checking
- Cleanup and fix some HW/SW timer issues
- Add LSX/LASX (128bit/256bit SIMD) support
RISC-V:
- KVM_GET_REG_LIST improvement for vector registers
- Generate ISA extension reg_list using macros in get-reg-list
selftest
- Support for reporting steal time along with selftest
s390:
- Bugfixes
Selftests:
- Fix an annoying goof where the NX hugepage test prints out garbage
instead of the magic token needed to run the test.
- Fix build errors when a header is deleted/moved due to a missing
flag in the Makefile.
- Detect if KVM bugged/killed a selftest's VM and print out a helpful
message instead of complaining that a random ioctl() failed.
- Annotate the guest printf/assert helpers with __printf(), and fix
the various bugs that were lurking due to lack of said annotation"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (185 commits)
x86/kvm: Do not try to disable kvmclock if it was not enabled
KVM: x86: add missing "depends on KVM"
KVM: fix direction of dependency on MMU notifiers
KVM: introduce CONFIG_KVM_COMMON
KVM: arm64: Add missing memory barriers when switching to pKVM's hyp pgd
KVM: arm64: vgic-its: Avoid potential UAF in LPI translation cache
RISC-V: KVM: selftests: Add get-reg-list test for STA registers
RISC-V: KVM: selftests: Add steal_time test support
RISC-V: KVM: selftests: Add guest_sbi_probe_extension
RISC-V: KVM: selftests: Move sbi_ecall to processor.c
RISC-V: KVM: Implement SBI STA extension
RISC-V: KVM: Add support for SBI STA registers
RISC-V: KVM: Add support for SBI extension registers
RISC-V: KVM: Add SBI STA info to vcpu_arch
RISC-V: KVM: Add steal-update vcpu request
RISC-V: KVM: Add SBI STA extension skeleton
RISC-V: paravirt: Implement steal-time support
RISC-V: Add SBI STA extension definitions
RISC-V: paravirt: Add skeleton for pv-time support
RISC-V: KVM: Fix indentation in kvm_riscv_vcpu_set_reg_csr()
...
* Support for many new extensions in hwprobe, along with a handful of
cleanups.
* Various cleanups to our page table handling code, so we always use
{READ,WRITE}_ONCE.
* Support for the which-cpus flavor of hwprobe.
* Support for XIP kernels has been resurrected.
Merge tag 'riscv-for-linus-6.8-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support for many new extensions in hwprobe, along with a handful of
cleanups
- Various cleanups to our page table handling code, so we always use
{READ,WRITE}_ONCE
- Support for the which-cpus flavor of hwprobe
- Support for XIP kernels has been resurrected
* tag 'riscv-for-linus-6.8-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (52 commits)
riscv: hwprobe: export Zicond extension
riscv: hwprobe: export Zacas ISA extension
riscv: add ISA extension parsing for Zacas
dt-bindings: riscv: add Zacas ISA extension description
riscv: hwprobe: export Ztso ISA extension
riscv: add ISA extension parsing for Ztso
use linux/export.h rather than asm-generic/export.h
riscv: Remove SHADOW_OVERFLOW_STACK_SIZE macro
riscv; fix __user annotation in save_v_state()
riscv: fix __user annotation in traps_misaligned.c
riscv: Select ARCH_WANTS_NO_INSTR
riscv: Remove obsolete rv32_defconfig file
riscv: Allow disabling of BUILTIN_DTB for XIP
riscv: Fixed wrong register in XIP_FIXUP_FLASH_OFFSET macro
riscv: Make XIP bootable again
riscv: Fix set_direct_map_default_noflush() to reset _PAGE_EXEC
riscv: Fix module_alloc() that did not reset the linear mapping permissions
riscv: Fix wrong usage of lm_alias() when splitting a huge linear mapping
riscv: Check if the code to patch lies in the exit section
riscv: Use the same CPU operations for all CPUs
...
Merge tag 'mm-hotfixes-stable-2024-01-12-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc hotfixes from Andrew Morton:
"For once not mostly MM-related.
17 hotfixes. 10 address post-6.7 issues and the other 7 are cc:stable"
* tag 'mm-hotfixes-stable-2024-01-12-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
userfaultfd: avoid huge_zero_page in UFFDIO_MOVE
MAINTAINERS: add entry for shrinker
selftests: mm: hugepage-vmemmap fails on 64K page size systems
mm/memory_hotplug: fix memmap_on_memory sysfs value retrieval
mailmap: switch email for Tanzir Hasan
mailmap: add old address mappings for Randy
kernel/crash_core.c: make __crash_hotplug_lock static
efi: disable mirror feature during crashkernel
kexec: do syscore_shutdown() in kernel_kexec
mailmap: update entry for Manivannan Sadhasivam
fs/proc/task_mmu: move mmu notification mechanism inside mm lock
mm: zswap: switch maintainers to recently active developers and reviewers
scripts/decode_stacktrace.sh: optionally use LLVM utilities
kasan: avoid resetting aux_lock
lib/Kconfig.debug: disable CONFIG_DEBUG_INFO_BTF for Hexagon
MAINTAINERS: update LTP maintainers
kdump: defer the insertion of crashkernel resources
This is a follow-up of commit a159cbe81d3b ("selftests: rtnetlink: check
enslaving iface in a bond") after the merge of net-next into net.
The goal is to follow the new convention,
see commit d3b6b1116127 ("selftests/net: convert rtnetlink.sh to run it in
unique namespace") for more details.
Let's also use the generic dummy name instead of defining a new one.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240115135922.3662648-1-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a test case for ALU operations on PTR_TO_FLOW_KEYS. It checks that ALU
with a variable offset on flow_keys is rejected. The fixed-offset success
case is already covered by C code (e.g. via bpf_flow.c).
Signed-off-by: Hao Sun <sunhao.th@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20240115082028.9992-2-sunhao.th@gmail.com
The bonding tests also try to create bridge, veth and dummy
interfaces. These are not currently listed in the config.
Fixes: bbb774d921e2 ("net: Add tests for bonding and team address list management")
Fixes: c078290a2b76 ("selftests: include bonding tests into the kselftest infra")
Acked-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20240116020201.1883023-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A number of tests fail when netdev renaming is active
on the system. Add udevadm settle to the logic that determines
the names.
Fixes: 242aaf03dc9b ("selftests: add a test for ethtool pause stats")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240114224726.1210532-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The lib.sh script is meant to be sourced from other scripts, not executed
directly. Therefore, remove the executable bits from lib.sh's permissions.
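As a rough illustration of what removing the executable bits means at the permission level, here is a minimal sketch. The helper names are hypothetical and are not part of the selftests; the commit itself only changes the file mode of lib.sh.

```python
import os
import stat

# All three execute bits: user, group and other.
EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def without_exec_bits(mode):
    """Return the permission bits of 'mode' with every execute bit cleared."""
    return stat.S_IMODE(mode) & ~EXEC_BITS


def clear_exec_bits(path):
    """Apply without_exec_bits() to a file on disk and return the new mode."""
    new_mode = without_exec_bits(os.stat(path).st_mode)
    os.chmod(path, new_mode)
    return new_mode
```

For a file with mode 0755, `without_exec_bits()` yields 0644, which is what a script meant only to be sourced would normally carry.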
Fixes: fe32dffdcd33 ("selftests: forwarding: add TCPDUMP_EXTRA_FLAGS to lib.sh")
Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The tests changed by this patch, as well as the scripts they source, use
features which are not part of POSIX sh (e.g. 'source' and 'local'). As a
result, these tests fail when /bin/sh is dash, such as on Debian. Change the
interpreter to bash so that these tests can run successfully.
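A crude way to spot scripts affected by this kind of problem is sketched below. It is a heuristic under stated assumptions, not the method used in the patch: the function names are hypothetical, and only lines that begin with the 'source' or 'local' keywords named above are detected.

```python
import re

# Crude heuristic: a line that starts with the 'source' or 'local' keyword.
NON_POSIX_BUILTIN = re.compile(r"^\s*(?:source|local)\s+\S")


def needs_bash_shebang(script_text):
    """True if the script claims /bin/sh but uses 'source' or 'local'."""
    lines = script_text.splitlines()
    if not lines or lines[0].strip() != "#!/bin/sh":
        return False
    return any(NON_POSIX_BUILTIN.match(line) for line in lines[1:])


def with_bash_shebang(script_text):
    """Return the script with its first line switched to bash when needed."""
    if not needs_bash_shebang(script_text):
        return script_text
    _first, sep, rest = script_text.partition("\n")
    return "#!/bin/bash" + sep + rest
```

Scripts that already use bash, or that stick to the POSIX `.` command, are returned unchanged.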
Fixes: d43eff0b85ae ("selftests: bonding: up/down delay w/ slave link flapping")
Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
syzbot found an error in how splice() is handled with a msg greater
than 32. This was fixed in the previous patch, but let's add a test for
it to ensure it continues to work.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch adds a test to exercise the bpf_iter_udp batching
logic. It specifically tests the case that there are multiple
so_reuseport udp_sk in a bucket of the udp_table.
The test creates two sets of so_reuseport sockets, with
each set on a different port, meaning there will be
two buckets in the udp_table.
The test does the following:
1. read() 3 out of 4 sockets in the first bucket.
2. close() all sockets in the first bucket. This
will ensure the current bucket's offset in
the kernel does not affect the read() of the
following bucket.
3. read() all 4 sockets in the second bucket.
The test also reads one udp_sk at a time from
the bpf_iter_udp prog. This is the "true" case in
"do_test(..., bool onebyone)" and is the buggy case
that the previous patch fixed.
It also tests the "false" case in "do_test(..., bool onebyone)",
meaning that userspace reads the whole bucket. There is
no bug in this case, but the test is added for
completeness.
Considering the way to have multiple tcp_sk in the same
bucket is similar (by using so_reuseport),
this patch also tests the bpf_iter_tcp even though the
bpf_iter_tcp batching logic works correctly.
Both IPv4 and IPv6 exercise the same bpf_iter batching
code path, so only v6 is tested.
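The socket setup described above, several so_reuseport UDP sockets sharing one port, can be sketched from userspace as follows. This is only an illustration: the helper name and its parameters are hypothetical, the actual selftest is written in C, and the bpf_iter program itself is not shown.

```python
import socket


def reuseport_set(count, family=socket.AF_INET6, host="::1"):
    """Create 'count' UDP sockets that all share one SO_REUSEPORT port.

    The first socket binds to port 0 so the kernel picks a free port;
    the remaining sockets bind to that same port.
    """
    socks = []
    port = 0
    for _ in range(count):
        s = socket.socket(family, socket.SOCK_DGRAM)
        # SO_REUSEPORT must be set on every socket before bind().
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind((host, port))
        port = s.getsockname()[1]
        socks.append(s)
    return port, socks


def close_all(socks):
    """Close every socket in the set, as step 2 of the test does."""
    for s in socks:
        s.close()
```

Calling `reuseport_set(4)` twice would give the two sets of four sockets that the test iterates over. Since the kernel chooses the port, a caller that needs the two sets on different ports should compare the returned port numbers.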
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240112190530.3751661-4-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This pull request contains the following branches:
doc.2023.12.13a: Documentation and comment updates.
torture.2023.11.23a: RCU torture, locktorture updates that include
cleanups; nolibc init build support for mips, ppc and rv64;
testing of mid stall duration scenario and fixing fqs task
creation conditions.
fixes.2023.12.13a: Misc fixes, most notably restricting usage of
RCU CPU stall notifiers, to confine their usage primarily
to debug kernels.
rcu-tasks.2023.12.12b: RCU tasks minor fixes.
srcu.2023.12.13a: lockdep annotation fix for NMI-safe accesses,
callback advancing/acceleration cleanup and documentation
improvements.
Merge tag 'rcu.release.v6.8' of https://github.com/neeraju/linux
Pull RCU updates from Neeraj Upadhyay:
- Documentation and comment updates
- RCU torture, locktorture updates that include cleanups; nolibc init
build support for mips, ppc and rv64; testing of mid stall duration
scenario and fixing fqs task creation conditions
- Misc fixes, most notably restricting usage of RCU CPU stall
notifiers, to confine their usage primarily to debug kernels
- RCU tasks minor fixes
- lockdep annotation fix for NMI-safe accesses, callback
advancing/acceleration cleanup and documentation improvements
* tag 'rcu.release.v6.8' of https://github.com/neeraju/linux:
rcu: Force quiescent states only for ongoing grace period
doc: Clarify historical disclaimers in memory-barriers.txt
doc: Mention address and data dependencies in rcu_dereference.rst
doc: Clarify RCU Tasks reader/updater checklist
rculist.h: docs: Fix wrong function summary
Documentation: RCU: Remove repeated word in comments
srcu: Use try-lock lockdep annotation for NMI-safe access.
srcu: Explain why callbacks invocations can't run concurrently
srcu: No need to advance/accelerate if no callback enqueued
srcu: Remove superfluous callbacks advancing from srcu_gp_start()
rcu: Remove unused macros from rcupdate.h
rcu: Restrict access to RCU CPU stall notifiers
rcu-tasks: Mark RCU Tasks accesses to current->rcu_tasks_idle_cpu
rcutorture: Add fqs_holdoff check before fqs_task is created
rcutorture: Add mid-sized stall to TREE07
rcutorture: add nolibc init support for mips, ppc and rv64
locktorture: Increase Hamming distance between call_rcu_chain and rcu_call_chains
The kernel selftest mm/hugepage-vmemmap fails on architectures which have
a page size other than 4K. In hugepage-vmemmap the page size used is
4K, so the pfn calculation goes wrong on systems which have a different
page size. The length of MAP_HUGETLB memory must be hugepage aligned, but
in hugepage-vmemmap the map length is 2M, so it will not be aligned if the
system has a different hugepage size.
Add psize() to get the page size and default_huge_page_size() to
get the default hugepage size at run time. With this change, the
hugepage-vmemmap test passes on powerpc with 64K page size and on x86
with 4K page size.
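The idea of querying both sizes at run time can be sketched as follows. This is not the selftest's C code: the function names are hypothetical Python analogues, and the pagemap offset formula assumes the documented layout of one 64-bit entry per virtual page.

```python
import os
import re


def page_size():
    """Base page size in bytes, queried at run time instead of assuming 4K."""
    return os.sysconf("SC_PAGE_SIZE")


def default_huge_page_size(meminfo_text):
    """Default hugepage size in bytes from /proc/meminfo text, or 0 if absent."""
    match = re.search(r"^Hugepagesize:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) * 1024 if match else 0


def pagemap_offset(vaddr, psize):
    """Byte offset of the /proc/self/pagemap entry for a virtual address."""
    return (vaddr // psize) * 8  # one 8-byte entry per virtual page


def align_up(length, alignment):
    """Round 'length' up to the next multiple of 'alignment'."""
    return (length + alignment - 1) // alignment * alignment
```

With a hardcoded 4K page size on a 64K system, `pagemap_offset()` points at the wrong entry, which is consistent with the bogus pfn of 0 in the unpatched powerpc run. Similarly, a fixed 2M mapping length has to be rounded up with something like `align_up()` when the default hugepage size is larger.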
Result on powerpc without patch (page size 64K)
*# ./hugepage-vmemmap
Returned address is 0x7effff000000 whose pfn is 0
Head page flags (100000000) is invalid
check_page_flags: Invalid argument
*#
Result on powerpc with patch (page size 64K)
*# ./hugepage-vmemmap
Returned address is 0x7effff000000 whose pfn is 600
*#
Result on x86 with patch (page size 4K)
*# ./hugepage-vmemmap
Returned address is 0x7fc7c2c00000 whose pfn is 1dac00
*#
Link: https://lkml.kernel.org/r/3b3a3ae37ba21218481c482a872bbf7526031600.1704865754.git.donettom@linux.vnet.ibm.com
Fixes: b147c89cd429 ("selftests: vm: add a hugetlb test case")
Signed-off-by: Donet Tom <donettom@linux.vnet.ibm.com>
Reported-by: Geetika Moolchandani <geetika@linux.ibm.com>
Tested-by: Geetika Moolchandani <geetika@linux.ibm.com>
Acked-by: Muchun Song <muchun.song@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Merge tag 'hid-for-linus-2024010801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID updates from Jiri Kosina:
- assorted functional fixes for hid-steam ported from SteamOS betas
(Vicki Pfau)
- fix for custom sensor-hub sensors (hinge angle sensor and LISS
sensors) not working (Yauhen Kharuzhy)
- functional fix for handling Confidence in Wacom driver (Jason
Gerecke)
- support for Ilitek ili2901 touchscreen (Zhengqiao Xia)
- power management fix for Wacom userspace battery exporting
(Tatsunosuke Tobita)
- rework of wait-for-reset in order to reduce the need for
I2C_HID_QUIRK_NO_IRQ_AFTER_RESET quirk; the success rate is now 50%
better, but there are still further improvements to be made (Hans de
Goede)
- greatly improved coverage of Tablets in hid-selftests (Benjamin
Tissoires)
- support for Nintendo NSO controllers -- SNES, Genesis and N64 (Ryan
McClelland)
- support for controlling mcp2200 GPIOs (Johannes Roith)
- power management improvement for EHL OOB wakeup in intel-ish
(Kai-Heng Feng)
- other assorted device-specific fixes and code cleanups
* tag 'hid-for-linus-2024010801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (53 commits)
HID: amd_sfh: Add a new interface for exporting ALS data
HID: amd_sfh: Add a new interface for exporting HPD data
HID: amd_sfh: rename float_to_int() to amd_sfh_float_to_int()
HID: i2c-hid: elan: Add ili2901 timing
dt-bindings: HID: i2c-hid: elan: Introduce Ilitek ili2901
HID: bpf: make bus_type const in struct hid_bpf_ops
HID: make ishtp_cl_bus_type const
HID: make hid_bus_type const
HID: hid-steam: Add gamepad-only mode switched to by holding options
HID: hid-steam: Better handling of serial number length
HID: hid-steam: Update list of identifiers from SDL
HID: hid-steam: Make client_opened a counter
HID: hid-steam: Clean up locking
HID: hid-steam: Disable watchdog instead of using a heartbeat
HID: hid-steam: Avoid overwriting smoothing parameter
HID: magicmouse: fix kerneldoc for struct magicmouse_sc
HID: sensor-hub: Enable hid core report processing for all devices
HID: wacom: Add additional tests of confidence behavior
HID: wacom: Correct behavior when processing some confidence == false touches
HID: nintendo: add support for nso controllers
...
It was a calm development cycle. There was an ALSA core extension
for subformat PCM bits and a few ASoC core changes to support N:M
mappings, while most of the remaining changes are driver-specific.
Core:
- API extensions for properly limiting PCM format bits via subformat
- Enhanced support for N:M CPU:CODEC mappings in the core and in
audio-graph-card2
ASoC:
- Lots of SOF updates: fallback support to older IPC versions,
notification on control changes with IPC4.
Also support for ACPI parsing in the ES83xx driver, which reduces
quirks.
- Device tree support for describing parts of the card which can be
active over suspend (for very low power playback or wake word use
cases)
- Support for more AMD and Intel systems, NXP i.MX8m MICFIL, Qualcomm
SM8250, SM8550, SM8650 and X1E80100
- Drop of Freescale MPC8610 code that is no longer supported
HD-audio:
- More CS35L41 codec extensions for Dell, HP and Lenovo models
- TAS2781 codec extensions for Lenovo and co
- New PCM subformat support
Others:
- More enhancements for Scarlett2 USB mixer support
- Various kselftest fixes
Merge tag 'sound-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"It was a calm development cycle. There was an ALSA core extension for
subformat PCM bits and a few ASoC core changes to support N:M
mappings, while most of the remaining changes are driver-specific.
Core:
- API extensions for properly limiting PCM format bits via subformat
- Enhanced support for N:M CPU:CODEC mappings in the core and in
audio-graph-card2
ASoC:
- Lots of SOF updates: fallback support to older IPC versions,
notification on control changes with IPC4. Also support for ACPI
parsing in the ES83xx driver, which reduces quirks.
- Device tree support for describing parts of the card which can be
active over suspend (for very low power playback or wake word use
cases)
- Support for more AMD and Intel systems, NXP i.MX8m MICFIL, Qualcomm
SM8250, SM8550, SM8650 and X1E80100
- Drop of Freescale MPC8610 code that is no longer supported
HD-audio:
- More CS35L41 codec extensions for Dell, HP and Lenovo models
- TAS2781 codec extensions for Lenovo and co
- New PCM subformat support
Others:
- More enhancements for Scarlett2 USB mixer support
- Various kselftest fixes"
* tag 'sound-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (337 commits)
kselftest/alsa - conf: Stringify the printed errno in sysfs_get()
kselftest/alsa - mixer-test: Fix the print format specifier warning
kselftest/alsa - mixer-test: Fix the print format specifier warning
kselftest/alsa - mixer-test: fix the number of parameters to ksft_exit_fail_msg()
ALSA: hda/tas2781: annotate calibration data endianness
ALSA: hda/realtek: Fix mute and mic-mute LEDs for HP Envy X360 13-ay0xxx
ALSA: hda/conexant: Fix headset auto detect fail in cx8070 and SN6140
ALSA: ac97: fix build regression
ALSA: hda: cs35l41: Support more HP models without _DSD
ALSA: hda/tas2781: add fixup for Lenovo 14ARB7
ALSA: hda/tas2781: add TAS2563 support for 14ARB7
ALSA: hda/tas2781: add configurable global i2c address
ALSA: hda/tas2781: add ptrs to calibration functions
ALSA: hda: Add driver properties for cs35l41 for Lenovo Legion Slim 7 Gen 8 serie
ALSA: hda/realtek: enable SND_PCI_QUIRK for Lenovo Legion Slim 7 Gen 8 (2023) serie
ALSA: hda/tas2781: configure the amp after firmware load
ALSA: mark all struct bus_type as const
ASoC: pxa: sspa: Don't select SND_ARM
ASoC: rt5663: cancel the work when system suspends
ALSA: scarlett2: Add PCM Input Switch for Solo Gen 4
...
The goal is to check the following two sequences:
> ip link set dummy0 up
> ip link set dummy0 master bond0 down
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240108094103.2001224-3-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The rules to link selftests are:
> $(OUTPUT)/%_ipv4: %.c
> $(LINK.c) $^ $(LDLIBS) -o $@
>
> $(OUTPUT)/%_ipv6: %.c
> $(LINK.c) -DIPV6_TEST $^ $(LDLIBS) -o $@
The intel test robot uses only selftest's Makefile, not the top linux
Makefile:
> make W=1 O=/tmp/kselftest -C tools/testing/selftests
So, $(LINK.c) is determined by environment, rather than by kernel
Makefiles. On my machine (as well as other people that ran tcp-ao
selftests) GNU/Make implicit definition does use $(LDFLAGS):
> [dima@Mindolluin ~]$ make -p -f/dev/null | grep '^LINK.c\>'
> make: *** No targets. Stop.
> LINK.c = $(CC) $(CFLAGS) $(CPPFLAGS) $(LDFLAGS) $(TARGET_ARCH)
But, according to build robot report, it's not the case for them.
While I could just avoid using pre-defined $(LINK.c), it's also used by
selftests/lib.mk by default.
Anyways, according to GNU/Make documentation [1], I should have used
$(LDLIBS) instead of $(LDFLAGS) in the first place, so let's just do it:
> LDFLAGS
> Extra flags to give to compilers when they are supposed to invoke
> the linker, ‘ld’, such as -L. Libraries (-lfoo) should be added
> to the LDLIBS variable instead.
> LDLIBS
> Library flags or names given to compilers when they are supposed
> to invoke the linker, ‘ld’. LOADLIBES is a deprecated (but still
> supported) alternative to LDLIBS. Non-library linker flags, such
> as -L, should go in the LDFLAGS variable.
[1]: https://www.gnu.org/software/make/manual/html_node/Implicit-Variables.html
Fixes: cfbab37b3da0 ("selftests/net: Add TCP-AO library")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202401011151.veyYTJzq-lkp@intel.com/
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20240110-tcp_ao-selftests-makefile-v1-1-aa07d043f052@arista.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Core & protocols
----------------
- Analyze and reorganize core networking structs (socks, netdev,
netns, mibs) to optimize cacheline consumption and set up
build time warnings to safeguard against future header changes.
This improves TCP performance with many concurrent connections
by up to 40%.
- Add page-pool netlink-based introspection, exposing the
memory usage and recycling stats. This helps identify
bad PP users and possible leaks.
- Refine TCP/DCCP source port selection to no longer favor even
source ports at connect() time when IP_LOCAL_PORT_RANGE is set.
This lowers the time taken by connect() for hosts having
many active connections to the same destination.
- Refactor the TCP bind conflict code, shrinking related socket
structs.
- Refactor TCP SYN-Cookie handling, as a preparation step to
allow arbitrary SYN-Cookie processing via eBPF.
- Tune optmem_max for 0-copy usage, increasing the default value
to 128KB and making it a per-network-namespace setting.
- Allow coalescing for cloned skbs coming from page pools, improving
RX performance with some common configurations.
- Reduce extension header parsing overhead at GRO time.
- Add bridge MDB bulk deletion support, allowing user-space to
request the deletion of matching entries.
- Reorder nftables struct members, to keep data accessed by the
datapath first.
- Introduce TC block ports tracking and use. This allows supporting
multicast-like behavior at the TC layer.
- Remove UAPI support for retired TC qdiscs (dsmark, CBQ and ATM) and
classifiers (RSVP and tcindex).
- More data-race annotations.
- Extend the diag interface to dump TCP bound-only sockets.
- Conditional notification of events for TC qdisc class and actions.
- Support for WPAN dynamic associations with nearby devices, to form
a sub-network using a specific PAN ID.
- Implement SMCv2.1 virtual ISM device support.
- Add support for the batman-adv multicast packet type.
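As a rough illustration of the struct reorganization item at the top of this list, the sketch below groups the fields a fast path touches together and checks the layout at build time. Everything in it (struct demo_conn, DEMO_CACHELINE_BYTES, DEMO_LINE_OF, the 64-byte line size) is hypothetical and only mimics the idea in user-space C; the kernel's own structs and assertion macros are different.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size for this demo; real hardware may differ. */
#define DEMO_CACHELINE_BYTES 64

/* Hypothetical struct: fields used together on the TX fast path come first. */
struct demo_conn {
	/* read-mostly, TX fast path */
	uint32_t mss;
	uint32_t window_clamp;
	uint64_t tx_flags;
	/* read-write, TX fast path */
	uint64_t bytes_sent;
	uint32_t snd_nxt;
	uint32_t snd_una;
	/* cold: rarely touched after setup */
	char name[96];
};

/* Cache line index of a member, assuming the struct starts line-aligned. */
#define DEMO_LINE_OF(type, member) \
	(offsetof(type, member) / DEMO_CACHELINE_BYTES)

/* Build-time guards: a later header change that splits the group fails here. */
_Static_assert(DEMO_LINE_OF(struct demo_conn, mss) ==
	       DEMO_LINE_OF(struct demo_conn, snd_una),
	       "TX fast-path fields must share one cache line");
_Static_assert(offsetof(struct demo_conn, name) -
	       offsetof(struct demo_conn, mss) <= DEMO_CACHELINE_BYTES,
	       "TX fast-path group must fit in one cache line");

/* Size of the hot group, i.e. the offset where the cold fields begin. */
size_t demo_hot_group_bytes(void)
{
	return offsetof(struct demo_conn, name);
}
```

Moving a cold field such as name[] in front of snd_una would trip the first assertion at compile time, which is the kind of safeguard the item describes.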
BPF
---
- Tons of verifier improvements:
- BPF register bounds logic and range support along with a large
test suite
- log improvements
- complete precision tracking support for register spills
- track aligned STACK_ZERO cases as imprecise spilled registers. It
improves the verifier "instructions processed" metric by
single-digit percentages up to 50-60% for some programs
- support for user's global BPF subprogram arguments with a few
commonly requested annotations for a better developer experience
- support tracking of BPF_JNE which helps cases when the compiler
transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the
like
- several fixes
- Add initial TX metadata implementation for AF_XDP with support in
mlx5 and stmmac drivers. Two types of offloads are supported right
now, that is, TX timestamp and TX checksum offload.
- Fix kCFI bugs in BPF: all forms of indirect calls from BPF into
kernel and from kernel into BPF now work with CFI enabled. This allows
BPF to work with CONFIG_FINEIBT=y.
- Change BPF verifier logic to validate global subprograms lazily
instead of unconditionally before the main program, so they can be
guarded using BPF CO-RE techniques.
- Support uid/gid options when mounting bpffs.
- Add a new kfunc which acquires the associated cgroup of a task
within a specific cgroup v1 hierarchy where the latter is identified
by its id.
- Extend verifier to allow bpf_refcount_acquire() of a map value field
obtained via direct load which is a use-case needed in sched_ext.
- Add BPF link_info support for uprobe multi link along with bpftool
integration for the latter.
- Support for VLAN tag in XDP hints.
- Remove deprecated bpfilter kernel leftovers given the project
is developed in user-space (https://github.com/facebook/bpfilter).
Misc
----
- Support for parallel TC self-tests execution.
- Increase MPTCP self-tests coverage.
- Updated the bridge documentation, including several so-far
undocumented features.
- Convert all the net self-tests to run in unique netns, to
avoid random failures due to conflict and allow concurrent
runs.
- Add TCP-AO self-tests.
- Add kunit tests for both cfg80211 and mac80211.
- Autogenerate Netlink families documentation from YAML spec.
- Add ynl-gen support for fixed headers and recursive nests; the
tool can now generate user-space code for all genetlink families
for which we have specs.
- A bunch of additional module description fixes.
- Catch incorrect freeing of pages belonging to a page pool.
Driver API
----------
- Rust abstractions for network PHY drivers; they do not yet cover the
full C API, but already allow implementing functional PHY drivers
in Rust.
- Introduce queue and NAPI support in the netdev Netlink interface,
allowing complete access to the device <> NAPIs <> queues
relationship.
- Introduce notification filtering for devlink to allow control
applications to scale to thousands of instances.
- Improve PHY validation, requesting rate matching information for
each ethtool link mode supported by both the PHY and host.
- Add support for ethtool symmetric-xor RSS hash.
- ACPI based Wifi band RFI (WBRF) mitigation feature for the AMD
platform.
- Expose pin fractional frequency offset value over new DPLL generic
netlink attribute.
- Convert older drivers to platform remove callback returning void.
- Add support for PHY package MMD read/write.
New hardware / drivers
----------------------
- Ethernet:
- Octeon CN10K devices
- Broadcom 5760X P7
- Qualcomm SM8550 SoC
- Texas Instruments DP83TG720S PHY
- Bluetooth:
- IMC Networks Bluetooth radio
Removed
-------
- WiFi:
- libertas 16-bit PCMCIA support
- Atmel at76c50x drivers
- HostAP ISA/PCMCIA style 802.11b driver
- zd1201 802.11b USB dongles
- Orinoco ISA/PCMCIA 802.11b driver
- Aviator/Raytheon driver
- Planet WL3501 driver
- RNDIS USB 802.11b driver
Drivers
-------
- Ethernet high-speed NICs:
- Intel (100G, ice, idpf):
- allow one-by-one port representor creation and removal
- add temperature and clock information reporting
- add get/set for ethtool's header split ringparam
- add FW logging again
- add support for switchdev hardware packet mirroring
- iavf: implement symmetric-xor RSS hash
- igc: add support for concurrent physical and free-running timers
- i40e: increase the allowable descriptors
- nVidia/Mellanox:
- Preparation for Socket-Direct multi-dev netdev. This will allow
future releases to combine multiple PF devices attached to
different NUMA nodes under the same netdev
- Broadcom (bnxt):
- TX completion handling improvements
- add basic ntuple filter support
- reduce MSI-X vector usage for MQPRIO offload
- add VXLAN support, USO offload and TX coalesce completion for P7
- Marvell Octeon EP:
- xmit-more support
- add PF-VF mailbox support and use it for FW notifications for VFs
- Wangxun (ngbe/txgbe):
- implement ethtool functions to operate pause param, ring param,
coalesce channel number and msglevel
- Netronome/Corigine (nfp):
- add flow-steering support
- support UDP segmentation offload
- Ethernet NICs embedded, slower, virtual:
- Xilinx AXI: remove duplicate DMA code by adopting the dmaengine driver
- stmmac: add support for HW-accelerated VLAN stripping
- TI AM654x sw: add mqprio, frame preemption & coalescing
- gve: add support for non-4k page sizes
- virtio-net: support dynamic coalescing moderation
- nVidia/Mellanox Ethernet datacenter switches:
- allow firmware upgrade without a reboot
- more flexible support for bridge flooding via the compressed
FID flooding mode
- Ethernet embedded switches:
- Microchip:
- fine-tune flow control and speed configurations in KSZ8xxx
- KSZ88X3: enable setting rmii reference
- Renesas:
- add jumbo frames support
- Marvell:
- 88E6xxx: add "eth-mac" and "rmon" stats support
- Ethernet PHYs:
- aquantia: add firmware load support
- at803x: refactor the driver to simplify adding support for more
chip variants
- NXP C45 TJA11xx: add MACsec offload support
- Wifi:
- MediaTek (mt76):
- NVMEM EEPROM improvements
- mt7996 Extremely High Throughput (EHT) improvements
- mt7996 Wireless Ethernet Dispatcher (WED) support
- mt7996 36-bit DMA support
- Qualcomm (ath12k):
- support for a single MSI vector
- WCN7850: support AP mode
- Intel (iwlwifi):
- new debugfs file fw_dbg_clear
- allow concurrent P2P operation on DFS channels
- Bluetooth:
- QCA2066: support HFP offload
- ISO: more broadcast-related improvements
- NXP: better recovery in case receiver/transmitter get out of sync
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Merge tag 'net-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"The most interesting thing is probably the networking structs
reorganization and a significant amount of changes is around
self-tests.
Core & protocols:
- Analyze and reorganize core networking structs (socks, netdev,
netns, mibs) to optimize cacheline consumption and set up build
time warnings to safeguard against future header changes
This improves TCP performance with many concurrent connections by up
to 40%
- Add page-pool netlink-based introspection, exposing the memory
usage and recycling stats. This helps identify bad PP users and
possible leaks
- Refine TCP/DCCP source port selection to no longer favor even
source ports at connect() time when IP_LOCAL_PORT_RANGE is set. This
lowers the time taken by connect() for hosts having many active
connections to the same destination
- Refactor the TCP bind conflict code, shrinking related socket
structs
- Refactor TCP SYN-Cookie handling, as a preparation step to allow
arbitrary SYN-Cookie processing via eBPF
- Tune optmem_max for 0-copy usage, increasing the default value to
128KB and making it a per-network-namespace setting
- Allow coalescing for cloned skbs coming from page pools, improving
RX performance with some common configurations
- Reduce extension header parsing overhead at GRO time
- Add bridge MDB bulk deletion support, allowing user-space to
request the deletion of matching entries
- Reorder nftables struct members, to keep data accessed by the
datapath first
- Introduce TC block ports tracking and use. This allows supporting
multicast-like behavior at the TC layer
- Remove UAPI support for retired TC qdiscs (dsmark, CBQ and ATM) and
classifiers (RSVP and tcindex)
- More data-race annotations
- Extend the diag interface to dump TCP bound-only sockets
- Conditional notification of events for TC qdisc class and actions
- Support for WPAN dynamic associations with nearby devices, to form
a sub-network using a specific PAN ID
- Implement SMCv2.1 virtual ISM device support
- Add support for the batman-adv multicast packet type
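For readers unfamiliar with the IP_LOCAL_PORT_RANGE socket option named in the source port selection item above, here is a minimal user-space sketch of how a per-socket range is set before connect(). The helper names (demo_pack_port_range, demo_set_local_port_range) are made up for illustration; the encoding (upper bound in the high 16 bits, lower bound in the low 16 bits of a 32-bit value) follows the ip(7) description, and the fallback define is only there for libc headers that predate the option.

```c
#include <stdint.h>
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IP_LOCAL_PORT_RANGE
#define IP_LOCAL_PORT_RANGE 51	/* value from linux/in.h, for older libc headers */
#endif

/* Pack the range: upper bound in the high 16 bits, lower bound in the low. */
uint32_t demo_pack_port_range(uint16_t lo, uint16_t hi)
{
	return ((uint32_t)hi << 16) | lo;
}

/*
 * Restrict the local ports the kernel may pick for this socket.
 * Returns 0 on success, -1 with errno set on failure (for example on
 * kernels that do not know the option).
 */
int demo_set_local_port_range(int fd, uint16_t lo, uint16_t hi)
{
	uint32_t range = demo_pack_port_range(lo, hi);

	return setsockopt(fd, IPPROTO_IP, IP_LOCAL_PORT_RANGE,
			  &range, sizeof(range));
}
```

A caller would typically invoke demo_set_local_port_range(fd, 40000, 40999) on a fresh TCP socket and then connect(); the change described above concerns how the kernel then searches that range.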
BPF:
- Tons of verifier improvements:
- BPF register bounds logic and range support along with a large
test suite
- log improvements
- complete precision tracking support for register spills
- track aligned STACK_ZERO cases as imprecise spilled registers.
This improves the verifier "instructions processed" metric by
single-digit percentages up to 50-60% for some programs
- support for user's global BPF subprogram arguments with a few
commonly requested annotations for a better developer
experience
- support tracking of BPF_JNE which helps cases when the compiler
transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the
like
- several fixes
- Add initial TX metadata implementation for AF_XDP with support in
mlx5 and stmmac drivers. Two types of offloads are supported right
now, that is, TX timestamp and TX checksum offload
- Fix kCFI bugs in BPF: all forms of indirect calls from BPF into
kernel and from kernel into BPF now work with CFI enabled. This allows
BPF to work with CONFIG_FINEIBT=y
- Change BPF verifier logic to validate global subprograms lazily
instead of unconditionally before the main program, so they can be
guarded using BPF CO-RE techniques
- Support uid/gid options when mounting bpffs
- Add a new kfunc which acquires the associated cgroup of a task
within a specific cgroup v1 hierarchy where the latter is
identified by its id
- Extend verifier to allow bpf_refcount_acquire() of a map value
field obtained via direct load which is a use-case needed in
sched_ext
- Add BPF link_info support for uprobe multi link along with bpftool
integration for the latter
- Support for VLAN tag in XDP hints
- Remove deprecated bpfilter kernel leftovers given the project is
developed in user-space (https://github.com/facebook/bpfilter)
Misc:
- Support for parallel TC self-tests execution
- Increase MPTCP self-tests coverage
- Updated the bridge documentation, including several so-far
undocumented features
- Convert all the net self-tests to run in unique netns, to avoid
random failures due to conflict and allow concurrent runs
- Add TCP-AO self-tests
- Add kunit tests for both cfg80211 and mac80211
- Autogenerate Netlink families documentation from YAML spec
- Add ynl-gen support for fixed headers and recursive nests; the tool
can now generate user-space code for all genetlink families for
which we have specs
- A bunch of additional module description fixes
- Catch incorrect freeing of pages belonging to a page pool
Driver API:
- Rust abstractions for network PHY drivers; they do not yet cover the
full C API, but already allow implementing functional PHY drivers
in Rust
- Introduce queue and NAPI support in the netdev Netlink interface,
allowing complete access to the device <> NAPIs <> queues
relationship
- Introduce notification filtering for devlink to allow control
applications to scale to thousands of instances
- Improve PHY validation, requesting rate matching information for
each ethtool link mode supported by both the PHY and host
- Add support for ethtool symmetric-xor RSS hash
- ACPI based Wifi band RFI (WBRF) mitigation feature for the AMD
platform
- Expose pin fractional frequency offset value over new DPLL generic
netlink attribute
- Convert older drivers to platform remove callback returning void
- Add support for PHY package MMD read/write
New hardware / drivers:
- Ethernet:
- Octeon CN10K devices
- Broadcom 5760X P7
- Qualcomm SM8550 SoC
- Texas Instruments DP83TG720S PHY
- Bluetooth:
- IMC Networks Bluetooth radio
Removed:
- WiFi:
- libertas 16-bit PCMCIA support
- Atmel at76c50x drivers
- HostAP ISA/PCMCIA style 802.11b driver
- zd1201 802.11b USB dongles
- Orinoco ISA/PCMCIA 802.11b driver
- Aviator/Raytheon driver
- Planet WL3501 driver
- RNDIS USB 802.11b driver
Driver updates:
- Ethernet high-speed NICs:
- Intel (100G, ice, idpf):
- allow one-by-one port representor creation and removal
- add temperature and clock information reporting
- add get/set for ethtool's header split ringparam
- add FW logging again
- add support for switchdev hardware packet mirroring
- iavf: implement symmetric-xor RSS hash
- igc: add support for concurrent physical and free-running
timers
- i40e: increase the allowable descriptors
- nVidia/Mellanox:
- Preparation for Socket-Direct multi-dev netdev. This will
allow future releases to combine multiple PF devices
attached to different NUMA nodes under the same netdev
- Broadcom (bnxt):
- TX completion handling improvements
- add basic ntuple filter support
- reduce MSI-X vector usage for MQPRIO offload
- add VXLAN support, USO offload and TX coalesce completion
for P7
- Marvell Octeon EP:
- xmit-more support
- add PF-VF mailbox support and use it for FW notifications
for VFs
- Wangxun (ngbe/txgbe):
- implement ethtool functions to operate pause param, ring
param, coalesce channel number and msglevel
- Netronome/Corigine (nfp):
- add flow-steering support
- support UDP segmentation offload
- Ethernet NICs embedded, slower, virtual:
- Xilinx AXI: remove duplicate DMA code by adopting the dmaengine
driver
- stmmac: add support for HW-accelerated VLAN stripping
- TI AM654x sw: add mqprio, frame preemption & coalescing
- gve: add support for non-4k page sizes
- virtio-net: support dynamic coalescing moderation
- nVidia/Mellanox Ethernet datacenter switches:
- allow firmware upgrade without a reboot
- more flexible support for bridge flooding via the compressed
FID flooding mode
- Ethernet embedded switches:
- Microchip:
- fine-tune flow control and speed configurations in KSZ8xxx
- KSZ88X3: enable setting rmii reference
- Renesas:
- add jumbo frames support
- Marvell:
- 88E6xxx: add "eth-mac" and "rmon" stats support
- Ethernet PHYs:
- aquantia: add firmware load support
- at803x: refactor the driver to simplify adding support for more
chip variants
- NXP C45 TJA11xx: add MACsec offload support
- Wifi:
- MediaTek (mt76):
- NVMEM EEPROM improvements
- mt7996 Extremely High Throughput (EHT) improvements
- mt7996 Wireless Ethernet Dispatcher (WED) support
- mt7996 36-bit DMA support
- Qualcomm (ath12k):
- support for a single MSI vector
- WCN7850: support AP mode
- Intel (iwlwifi):
- new debugfs file fw_dbg_clear
- allow concurrent P2P operation on DFS channels
- Bluetooth:
- QCA2066: support HFP offload
- ISO: more broadcast-related improvements
- NXP: better recovery in case receiver/transmitter get out of sync"
* tag 'net-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1714 commits)
lan78xx: remove redundant statement in lan78xx_get_eee
lan743x: remove redundant statement in lan743x_ethtool_get_eee
bnxt_en: Fix RCU locking for ntuple filters in bnxt_rx_flow_steer()
bnxt_en: Fix RCU locking for ntuple filters in bnxt_srxclsrldel()
bnxt_en: Remove unneeded variable in bnxt_hwrm_clear_vnic_filter()
tcp: Revert no longer abort SYN_SENT when receiving some ICMP
Revert "mlx5 updates 2023-12-20"
Revert "net: stmmac: Enable Per DMA Channel interrupt"
ipvlan: Remove usage of the deprecated ida_simple_xx() API
ipvlan: Fix a typo in a comment
net/sched: Remove ipt action tests
net: stmmac: Use interrupt mode INTM=1 for per channel irq
net: stmmac: Add support for TX/RX channel interrupt
net: stmmac: Make MSI interrupt routine generic
dt-bindings: net: snps,dwmac: per channel irq
net: phy: at803x: make read_status more generic
net: phy: at803x: add support for cdt cross short test for qca808x
net: phy: at803x: refactor qca808x cable test get status function
net: phy: at803x: generalize cdt fault length function
net: ethernet: cortina: Drop TSO support
...
Add test cases for the IOMMU_HWPT_INVALIDATE ioctl and verify it by using
the new IOMMU_TEST_OP_MD_CHECK_IOTLB.
Link: https://lore.kernel.org/r/20240111041015.47920-7-yi.l.liu@intel.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Co-developed-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Allow to test whether IOTLB has been invalidated or not.
Link: https://lore.kernel.org/r/20240111041015.47920-6-yi.l.liu@intel.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
To help make the move of sysctls out of kernel/sysctl.c not incur a size
penalty, sysctl has been changed to allow us to not require the sentinel, the
final empty element on the sysctl array. Joel Granados has been doing all this
work. In the v6.6 kernel we got the major infrastructure changes required to
support this. For v6.7 we had all arch/ and drivers/ modified to remove
the sentinel. For v6.8-rc1 we get a few more updates for fs/ directory only.
The kernel/ directory is left but we'll save that for v6.9-rc1 as those patches
are still being reviewed. After that we then can expect also the removal of the
no longer needed check for procname == NULL.
Let us recap the purpose of this work:
- this helps reduce the overall build time size of the kernel and run time
memory consumed by the kernel by about 64 bytes per array
- the extra 64-byte penalty is no longer incurred now when we move sysctls
out from kernel/sysctl.c to their own files
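To make the sentinel idea concrete, here is a small user-space sketch. It is not the kernel's code: struct demo_ctl_table is a hypothetical, much smaller stand-in for struct ctl_table, and all demo_* names are invented for illustration. It contrasts a table that ends with an empty element against one whose length is derived from the array size.

```c
#include <stddef.h>

/* Hypothetical, much smaller stand-in for the kernel's struct ctl_table. */
struct demo_ctl_table {
	const char *procname;
	int *data;
	int maxlen;
};

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

int demo_a, demo_b;

/* Old style: a trailing empty element (procname == NULL) marks the end. */
struct demo_ctl_table demo_with_sentinel[] = {
	{ .procname = "a", .data = &demo_a, .maxlen = sizeof(int) },
	{ .procname = "b", .data = &demo_b, .maxlen = sizeof(int) },
	{ .procname = NULL },
};

/* New style: no terminator; the caller passes the element count instead. */
struct demo_ctl_table demo_without_sentinel[] = {
	{ .procname = "a", .data = &demo_a, .maxlen = sizeof(int) },
	{ .procname = "b", .data = &demo_b, .maxlen = sizeof(int) },
};

/* Count entries by walking the table until the sentinel is reached. */
size_t demo_count_by_sentinel(const struct demo_ctl_table *table)
{
	size_t n = 0;

	while (table[n].procname != NULL)
		n++;
	return n;
}

/* Bytes saved per array by dropping the sentinel: exactly one element. */
size_t demo_bytes_saved(void)
{
	return sizeof(demo_with_sentinel) - sizeof(demo_without_sentinel);
}
```

With the size known up front (DEMO_ARRAY_SIZE(demo_without_sentinel) here), the walk to a NULL procname is no longer needed, and each array shrinks by one element. The demo struct is smaller than the real one, so the saving it reports is smaller than the roughly 64 bytes quoted above.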
Thomas Weißschuh also sent a few cleanups; for v6.9-rc1 we expect to see further
work by Thomas Weißschuh with the constification of the struct ctl_table.
Due to Joel Granados's work, and to help bring in new blood, I have suggested
for him to become a maintainer and he's accepted. So for v6.9-rc1 I look forward
to seeing him send you a pull request for further sysctl changes. This also
removes Iurii Zaikin as a maintainer as he has moved on to other projects and
has had no time to help at all.
Merge tag 'sysctl-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull sysctl updates from Luis Chamberlain:
"To help make the move of sysctls out of kernel/sysctl.c not incur a
size penalty, sysctl has been changed to allow us to not require the
sentinel, the final empty element on the sysctl array. Joel Granados
has been doing all this work.
In the v6.6 kernel we got the major infrastructure changes required to
support this. For v6.7 we had all arch/ and drivers/ modified to
remove the sentinel. For v6.8-rc1 we get a few more updates for fs/
directory only.
The kernel/ directory is left but we'll save that for v6.9-rc1 as
those patches are still being reviewed. After that we then can expect
also the removal of the no longer needed check for procname == NULL.
Let us recap the purpose of this work:
- this helps reduce the overall build time size of the kernel and run
time memory consumed by the kernel by about 64 bytes per array
- the extra 64-byte penalty is no longer incurred now when we move
sysctls out from kernel/sysctl.c to their own files
Thomas Weißschuh also sent a few cleanups; for v6.9-rc1 we expect to
see further work by Thomas Weißschuh with the constification of the
struct ctl_table.
Due to Joel Granados's work, and to help bring in new blood, I have
suggested for him to become a maintainer and he's accepted. So for
v6.9-rc1 I look forward to seeing him send you a pull request for
further sysctl changes. This also removes Iurii Zaikin as a maintainer
as he has moved on to other projects and has had no time to help at
all"
* tag 'sysctl-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
sysctl: remove struct ctl_path
sysctl: delete unused define SYSCTL_PERM_EMPTY_DIR
coda: Remove the now superfluous sentinel elements from ctl_table array
sysctl: Remove the now superfluous sentinel elements from ctl_table array
fs: Remove the now superfluous sentinel elements from ctl_table array
cachefiles: Remove the now superfluous sentinel element from ctl_table array
sysclt: Clarify the results of selftest run
sysctl: Add a selftest for handling empty dirs
sysctl: Fix out of bounds access for empty sysctl registers
MAINTAINERS: Add Joel Granados as co-maintainer for proc sysctl
MAINTAINERS: remove Iurii Zaikin from proc sysctl
This kselftest update for Linux 6.8-rc1 consists of enhancements
to reporting test results, fixes to root and user run behavior
and fixing ksft_print_msg() calls.
Merge tag 'linux_kselftest-next-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest update from Shuah Khan:
"Enhancements to reporting test results, fixes to root and user run
behavior and fixing ksft_print_msg() calls"
* tag 'linux_kselftest-next-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
tracing/selftests: Add ownership modification tests for eventfs
selftests: sched: Remove initialization to 0 for a static variable
selftests: capabilities: namespace create varies for root and normal user
selftests: prctl: Add prctl test for PR_GET_NAME
kselftest/vDSO: Use ksft_print_msg() rather than printf in vdso_test_abi
kselftest/vDSO: Fix message formatting for clock_id logging
kselftest/vDSO: Make test name reporting for vdso_abi_test tooling friendly
selftests:x86: Fix Format String Warnings in lam.c
selftests/breakpoints: Fix format specifier in ksft_print_msg in step_after_suspend_test.c
selftests:breakpoints: Fix Format String Warning in breakpoint_test
This nolibc update for Linux 6.8-rc1 consists of:
* Support for PIC mode on MIPS.
* Support for getrlimit()/setrlimit().
* Replace some custom declarations with UAPI includes.
* A new script "run-tests.sh" to run the testsuite over different architectures
and configurations.
* A few non-functional code cleanups.
* Minor improvements to nolibc-test, primarily to support the test script.
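As a hedged illustration of what the new getrlimit()/setrlimit() support is used for, the sketch below disables core dumps by setting RLIMIT_CORE to zero and reads the limit back. It is written against a regular libc rather than nolibc, and the function names (demo_disable_coredumps, demo_core_soft_limit) are invented for this example; it is not the code from the nolibc test suite.

```c
#include <sys/resource.h>

/*
 * Set both the soft and hard core-file size limits to zero so that a
 * crashing process does not write a core dump.
 * Returns 0 on success, -1 with errno set on failure.
 */
int demo_disable_coredumps(void)
{
	const struct rlimit no_core = { .rlim_cur = 0, .rlim_max = 0 };

	return setrlimit(RLIMIT_CORE, &no_core);
}

/* Read back the current soft limit; returns 0 on success, -1 on failure. */
int demo_core_soft_limit(rlim_t *out)
{
	struct rlimit cur;

	if (getrlimit(RLIMIT_CORE, &cur) != 0)
		return -1;
	*out = cur.rlim_cur;
	return 0;
}
```

Lowering a limit never requires privileges, so a test program can call demo_disable_coredumps() early in main() before exercising code paths that are expected to crash.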
There are no urgent fixes available at this time.
Merge tag 'linux_kselftest-nolibc-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull nolibc updates from Shuah Khan:
- Support for PIC mode on MIPS
- Support for getrlimit()/setrlimit()
- Replace some custom declarations with UAPI includes
- A new script "run-tests.sh" to run the testsuite over different
architectures and configurations
- A few non-functional code cleanups
- Minor improvements to nolibc-test, primarily to support the test
script
* tag 'linux_kselftest-nolibc-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (22 commits)
selftests/nolibc: disable coredump via setrlimit
tools/nolibc: add support for getrlimit/setrlimit
tools/nolibc: drop custom definition of struct rusage
tools/nolibc: drop duplicated testcase ioctl_tiocinq
tools/nolibc: annotate va_list printf formats
selftests/nolibc: make result alignment more robust
tools/nolibc: mips: add support for PIC
selftests/nolibc: run-tests.sh: enable testing via qemu-user
selftests/nolibc: introduce QEMU_ARCH_USER
selftests/nolibc: fix testcase status alignment
selftests/nolibc: add configuration for mipso32be
selftests/nolibc: extraconfig support
selftests/nolibc: explicitly specify ABI for MIPS
selftests/nolibc: use XARCH for MIPS
tools/nolibc: move MIPS ABI validation into arch-mips.h
tools/nolibc: error out on unsupported architecture
selftests/nolibc: add script to run testsuite
selftests/nolibc: support out-of-tree builds
selftests/nolibc: anchor paths in $(srcdir) if possible
selftests/nolibc: use EFI -bios for LoongArch qemu
...
Merge tag 'landlock-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull Landlock updates from Mickaël Salaün:
"New tests, a slight optimization, and some cosmetic changes"
* tag 'landlock-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
landlock: Optimize the number of calls to get_access_mask slightly
selftests/landlock: Rename "permitted" to "allowed" in ftruncate tests
landlock: Remove remaining "inline" modifiers in .c files [v6.6]
landlock: Remove remaining "inline" modifiers in .c files [v6.1]
landlock: Remove remaining "inline" modifiers in .c files [v5.15]
selftests/landlock: Add tests to check unhandled rule's access rights
selftests/landlock: Add tests to check unknown rule's access rights
Merge tag 'lsm-pr-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull security module updates from Paul Moore:
- Add three new syscalls: lsm_list_modules(), lsm_get_self_attr(), and
lsm_set_self_attr().
The first syscall simply lists the LSMs enabled, while the second and
third get and set the current process' LSM attributes. Yes, these
syscalls may provide similar functionality to what can be found under
/proc or /sys, but they were designed to support multiple,
simultaneous (stacked) LSMs from the start as opposed to the current
/proc based solutions which were created at a time when only one LSM
was allowed to be active at a given time.
We have spent considerable time discussing ways to extend the
existing /proc interfaces to support multiple, simultaneous LSMs and
even our best ideas have been far too ugly to support as a kernel
API; after 20+ years in the kernel, I felt the LSM layer had
established itself enough to justify a handful of syscalls.
Support amongst the individual LSM developers has been nearly
unanimous, with a single objection coming from Tetsuo (TOMOYO) as he
is worried that the LSM_ID_XXX token concept will make it more
difficult for out-of-tree LSMs to survive. Several members of the LSM
community have demonstrated the ability for out-of-tree LSMs to
continue to exist by picking high/unused LSM_ID values as well as
pointing out that many kernel APIs rely on integer identifiers, e.g.
syscalls (!), but unfortunately Tetsuo's objections remain.
My personal opinion is that while I have no interest in penalizing
out-of-tree LSMs, I'm not going to penalize in-tree development to
support out-of-tree development, and I view this as a necessary step
forward to support the push for expanded LSM stacking and reduce our
reliance on /proc and /sys which has occasionally been problematic
for some container users. Finally, we have included the linux-api
folks on (all?) recent revisions of the patchset and addressed all of
their concerns.
- Add a new security_file_ioctl_compat() LSM hook to handle the 32-bit
ioctls on 64-bit systems problem.
This patch includes support for all of the existing LSMs which
provide ioctl hooks, although it turns out only SELinux actually
cares about the individual ioctls. It is worth noting that while
Casey (Smack) and Tetsuo (TOMOYO) did not give explicit ACKs to this
patch, they did both indicate they are okay with the changes.
- Fix a potential memory leak in the CALIPSO code when IPv6 is disabled
at boot.
While it's good that we are fixing this, I doubt this is something
users are seeing in the wild as you need to both disable IPv6 and
then attempt to configure IPv6 labeled networking via
NetLabel/CALIPSO; that just doesn't make much sense.
Normally this would go through netdev, but Jakub asked me to take
this patch and of all the trees I maintain, the LSM tree seemed like
the best fit.
- Update the LSM MAINTAINERS entry with additional information about
our process docs, patchwork, bug reporting, etc.
I also noticed that the Lockdown LSM is missing a dedicated
MAINTAINERS entry so I've added that to the pull request. I've been
working with one of the major Lockdown authors/contributors to see if
they are willing to step up and assume a Lockdown maintainer role;
hopefully that will happen soon, but in the meantime I'll continue to
look after it.
- Add a handful of mailmap entries for Serge Hallyn and myself.
* tag 'lsm-pr-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (27 commits)
lsm: new security_file_ioctl_compat() hook
lsm: Add a __counted_by() annotation to lsm_ctx.ctx
calipso: fix memory leak in netlbl_calipso_add_pass()
selftests: remove the LSM_ID_IMA check in lsm/lsm_list_modules_test
MAINTAINERS: add an entry for the lockdown LSM
MAINTAINERS: update the LSM entry
mailmap: add entries for Serge Hallyn's dead accounts
mailmap: update/replace my old email addresses
lsm: mark the lsm_id variables are marked as static
lsm: convert security_setselfattr() to use memdup_user()
lsm: align based on pointer length in lsm_fill_user_ctx()
lsm: consolidate buffer size handling into lsm_fill_user_ctx()
lsm: correct error codes in security_getselfattr()
lsm: cleanup the size counters in security_getselfattr()
lsm: don't yet account for IMA in LSM_CONFIG_COUNT calculation
lsm: drop LSM_ID_IMA
LSM: selftests for Linux Security Module syscalls
SELinux: Add selfattr hooks
AppArmor: Add selfattr hooks
Smack: implement setselfattr and getselfattr hooks
...
Merge tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"Many singleton patches against the MM code. The patch series which are
included in this merge do the following:
- Peng Zhang has done some mapletree maintenance work in the series
'maple_tree: add mt_free_one() and mt_attr() helpers'
'Some cleanups of maple tree'
- In the series 'mm: use memmap_on_memory semantics for dax/kmem'
Vishal Verma has altered the interworking between memory-hotplug
and dax/kmem so that newly added 'device memory' can more easily
have its memmap placed within that newly added memory.
- Matthew Wilcox continues folio-related work (including a few fixes)
in the patch series
'Add folio_zero_tail() and folio_fill_tail()'
'Make folio_start_writeback return void'
'Fix fault handler's handling of poisoned tail pages'
'Convert aops->error_remove_page to ->error_remove_folio'
'Finish two folio conversions'
'More swap folio conversions'
- Kefeng Wang has also contributed folio-related work in the series
'mm: cleanup and use more folio in page fault'
- Jim Cromie has improved the kmemleak reporting output in the series
'tweak kmemleak report format'.
- In the series 'stackdepot: allow evicting stack traces' Andrey
Konovalov permits clients (in this case KASAN) to cause eviction
of no longer needed stack traces.
- Charan Teja Kalla has fixed some accounting issues in the page
allocator's atomic reserve calculations in the series 'mm:
page_alloc: fixes for high atomic reserve calculations'.
- Dmitry Rokosov has added to the samples/ directory some sample code
for a userspace memcg event listener application. See the series
'samples: introduce cgroup events listeners'.
- Some mapletree maintenance work from Liam Howlett in the series
'maple_tree: iterator state changes'.
- Nhat Pham has improved zswap's approach to writeback in the series
'workload-specific and memory pressure-driven zswap writeback'.
- DAMON/DAMOS feature and maintenance work from SeongJae Park in the
series
'mm/damon: let users feed and tame/auto-tune DAMOS'
'selftests/damon: add Python-written DAMON functionality tests'
'mm/damon: misc updates for 6.8'
- Yosry Ahmed has improved memcg's stats flushing in the series 'mm:
memcg: subtree stats flushing and thresholds'.
- In the series 'Multi-size THP for anonymous memory' Ryan Roberts
has added a runtime opt-in feature to transparent hugepages which
improves performance by allocating larger chunks of memory during
anonymous page faults.
- Matthew Wilcox has also contributed some cleanup and maintenance
work against the buffer_head code in the series 'More buffer_head
cleanups'.
- Suren Baghdasaryan has done work on Andrea Arcangeli's series
'userfaultfd move option'. UFFDIO_MOVE permits userspace heap
compaction algorithms to move userspace's pages around rather than
UFFDIO_COPY's alloc/copy/free.
- Stefan Roesch has developed a 'KSM Advisor', in the series 'mm/ksm:
Add ksm advisor'. This is a governor which tunes KSM's scanning
aggressiveness in response to userspace's current needs.
- Chengming Zhou has optimized zswap's temporary working memory use
in the series 'mm/zswap: dstmem reuse optimizations and cleanups'.
- Matthew Wilcox has performed some maintenance work on the writeback
code, both core and within filesystems. The series is 'Clean up the
writeback paths'.
- Andrey Konovalov has optimized KASAN's handling of alloc and free
stack traces for secondary-level allocators, in the series 'kasan:
save mempool stack traces'.
- Andrey also performed some KASAN maintenance work in the series
'kasan: assorted clean-ups'.
- David Hildenbrand has gone to town on the rmap code. Cleanups, more
pte batching, folio conversions and more. See the series 'mm/rmap:
interface overhaul'.
- Kinsey Ho has contributed some maintenance work on the MGLRU code
in the series 'mm/mglru: Kconfig cleanup'.
- Matthew Wilcox has contributed lruvec page accounting code cleanups
in the series 'Remove some lruvec page accounting functions'"
* tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (361 commits)
mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER
mm, treewide: introduce NR_PAGE_ORDERS
selftests/mm: add separate UFFDIO_MOVE test for PMD splitting
selftests/mm: skip test if application doesn't has root privileges
selftests/mm: conform test to TAP format output
selftests: mm: hugepage-mmap: conform to TAP format output
selftests/mm: gup_test: conform test to TAP format output
mm/selftests: hugepage-mremap: conform test to TAP format output
mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING
mm: zsmalloc: return -ENOSPC rather than -EINVAL in zs_malloc while size is too large
mm/memcontrol: remove __mod_lruvec_page_state()
mm/khugepaged: use a folio more in collapse_file()
slub: use a folio in __kmalloc_large_node
slub: use folio APIs in free_large_kmalloc()
slub: use alloc_pages_node() in alloc_slab_page()
mm: remove inc/dec lruvec page state functions
mm: ratelimit stat flush from workingset shrinker
kasan: stop leaking stack trace handles
mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE
mm/mglru: add dummy pmd_dirty()
...
GCC 13.2.0 reported a warning about the print format specifier:
conf.c: In function ‘sysfs_get’:
conf.c:181:72: warning: format ‘%s’ expects argument of type ‘char *’, \
but argument 3 has type ‘int’ [-Wformat=]
181 | ksft_exit_fail_msg("sysfs: unable to read value '%s': %s\n",
| ~^
| |
| char *
| %d
The fix passes strerror(errno) as it was intended, like in the sibling error
exit message.
Fixes: aba51cd0949ae ("selftests: alsa - add PCM test")
Cc: Mark Brown <broonie@kernel.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-sound@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240107173704.937824-5-mirsad.todorovac@alu.unizg.hr
Signed-off-by: Takashi Iwai <tiwai@suse.de>
GCC 13.2.0 compiler issued the following warning:
mixer-test.c:350:80: warning: format ‘%ld’ expects argument of type ‘long int’, \
but argument 5 has type ‘unsigned int’ [-Wformat=]
350 | ksft_print_msg("%s.%d value %ld more than item count %ld\n",
| ~~^
| |
| long int
| %d
351 | ctl->name, index, int_val,
352 | snd_ctl_elem_info_get_items(ctl->info));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| unsigned int
Fixing the format specifier in the call to ksft_print_msg() according to
the compiler suggestion silences the warning.
Fixes: 10f2f194663af ("kselftest: alsa: Validate values read from enumerations")
Cc: Mark Brown <broonie@kernel.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-sound@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240107173704.937824-4-mirsad.todorovac@alu.unizg.hr
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The GCC 13.2.0 compiler issued the following warning:
mixer-test.c: In function ‘ctl_value_index_valid’:
mixer-test.c:322:79: warning: format ‘%lld’ expects argument of type ‘long long int’, \
but argument 5 has type ‘long int’ [-Wformat=]
322 | ksft_print_msg("%s.%d value %lld more than maximum %lld\n",
| ~~~^
| |
| long long int
| %ld
323 | ctl->name, index, int64_val,
324 | snd_ctl_elem_info_get_max(ctl->info));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| long int
Fixing the format specifier as the compiler suggests removes the
warning.
Fixes: 3f48b137d88e7 ("kselftest: alsa: Factor out check that values meet constraints")
Cc: Mark Brown <broonie@kernel.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-sound@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240107173704.937824-3-mirsad.todorovac@alu.unizg.hr
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Minor fix to the number of arguments passed to the error reporting function
in the test program, as reported by a GCC 13.2.0 warning.
mixer-test.c: In function ‘find_controls’:
mixer-test.c:169:44: warning: too many arguments for format [-Wformat-extra-args]
169 | ksft_exit_fail_msg("snd_ctl_poll_descriptors() failed for %d\n",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The number of arguments in the call to ksft_exit_fail_msg() doesn't
correspond to the format specifiers, so the call is adjusted to resemble
the sibling calls to the error function.
Fixes: b1446bda56456 ("kselftest: alsa: Check for event generation when we write to controls")
Cc: Mark Brown <broonie@kernel.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-sound@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240107173704.937824-2-mirsad.todorovac@alu.unizg.hr
Signed-off-by: Takashi Iwai <tiwai@suse.de>
- Yafang Shao added task_get_cgroup1() helper to enable a similar BPF helper
so that BPF progs can be more useful on cgroup1 hierarchies. While cgroup1
is mostly in maintenance mode, this addition is very small while having an
outsized usefulness for users who are still on cgroup1. Yafang also
optimized root cgroup list access by making it RCU protected in the
process.
- Waiman Long optimized rstat operation leading to substantially lower and
more consistent lock hold time while flushing the hierarchical statistics.
As the lock can be acquired briefly in various hot paths, this reduction
has cascading benefits.
- Waiman also improved the quality of isolation for cpuset's isolated
partitions. CPUs which are allocated to isolated partitions are now
excluded from running unbound work items, and the cpu_is_isolated() test,
which is used by vmstat and memcg to reduce interference, now includes
cpuset isolated CPUs. While it isn't there yet, the hope is to eventually
reach parity with the isolation level provided by the `isolcpus` boot
param but in a dynamic manner.
This involved a couple of workqueue patches which were applied directly to
cgroup/for-6.8 rather than ping-ponged through the wq tree. This was
because the wq code change was small and the area is usually very static
and unlikely to cause conflicts. However, luck had it that there was a wq
bug fix in the area during the 6.7 cycle which caused a conflict. The
conflict is contextual but can be a bit confusing to resolve, so there is
one merge from wq/for-6.7-fixes.
-----BEGIN PGP SIGNATURE-----
iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZYnuJg4cdGpAa2VybmVs
Lm9yZwAKCRCxYfJx3gVYGQ5kAP9nMMWqi+R1HeG7+hWROTVjQZ0OM9KRcpZ1TmjF
FNbkJgEAzt+sPnoWwYDTSI7pkNeZ/IM7x1qkkKGvENNtUXrz0Ac=
=PyYN
-----END PGP SIGNATURE-----
Merge tag 'cgroup-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
- Yafang Shao added task_get_cgroup1() helper to enable a similar BPF
helper so that BPF progs can be more useful on cgroup1 hierarchies.
While cgroup1 is mostly in maintenance mode, this addition is very
small while having an outsized usefulness for users who are still on
cgroup1. Yafang also optimized root cgroup list access by making it
RCU protected in the process.
- Waiman Long optimized rstat operation leading to substantially lower
and more consistent lock hold time while flushing the hierarchical
statistics. As the lock can be acquired briefly in various hot paths,
this reduction has cascading benefits.
- Waiman also improved the quality of isolation for cpuset's isolated
partitions. CPUs which are allocated to isolated partitions are now
excluded from running unbound work items, and the cpu_is_isolated()
test, which is used by vmstat and memcg to reduce interference, now
includes cpuset isolated CPUs. While it isn't there yet, the hope is
to eventually reach parity with the isolation level provided by the
`isolcpus` boot param but in a dynamic manner.
* tag 'cgroup-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: Move rcu_head up near the top of cgroup_root
cgroup/cpuset: Include isolated cpuset CPUs in cpu_is_isolated() check
cgroup: Avoid false cacheline sharing of read mostly rstat_cpu
cgroup/rstat: Optimize cgroup_rstat_updated_list()
cgroup: Fix documentation for cpu.idle
cgroup/cpuset: Expose cpuset.cpus.isolated
workqueue: Move workqueue_set_unbound_cpumask() and its helpers inside CONFIG_SYSFS
cgroup/rstat: Reduce cpu_lock hold time in cgroup_rstat_flush_locked()
cgroup/cpuset: Take isolated CPUs out of workqueue unbound cpumask
cgroup/cpuset: Keep track of CPUs in isolated partitions
selftests/cgroup: Minor code cleanup and reorganization of test_cpuset_prs.sh
workqueue: Add workqueue_unbound_exclude_cpumask() to exclude CPUs from wq_unbound_cpumask
selftests: cgroup: Fixes a typo in a comment
cgroup: Add a new helper for cgroup1 hierarchy
cgroup: Add annotation for holding namespace_sem in current_cgns_cgroup_from_root()
cgroup: Eliminate the need for cgroup_mutex in proc_cgroup_show()
cgroup: Make operations on the cgroup root_list RCU safe
cgroup: Remove unnecessary list_empty()
* for-next/cpufeature
- Remove ARM64_HAS_NO_HW_PREFETCH copy_page() optimisation for ye olde
Thunder-X machines.
- Avoid mapping KPTI trampoline when it is not required.
- Make CPU capability API more robust during early initialisation.
* for-next/early-idreg-overrides
- Remove dependencies on core kernel helpers from the early
command-line parsing logic in preparation for moving this code
before the kernel is mapped.
* for-next/fpsimd
- Restore kernel-mode fpsimd context lazily, allowing us to run fpsimd
code sequences in the kernel with pre-emption enabled.
* for-next/kbuild
- Install 'vmlinuz.efi' when CONFIG_EFI_ZBOOT=y.
- Makefile cleanups.
* for-next/lpa2-prep
- Preparatory work for enabling the 'LPA2' extension, which will
introduce 52-bit virtual and physical addressing even with 4KiB
pages (including for KVM guests).
* for-next/misc
- Remove dead code and fix a typo.
* for-next/mm
- Pass NUMA node information for IRQ stack allocations.
* for-next/perf
- Add perf support for the Synopsys DesignWare PCIe PMU.
- Add support for event counting thresholds (FEAT_PMUv3_TH) introduced
in Armv8.8.
- Add support for i.MX8DXL SoCs to the IMX DDR PMU driver.
- Minor PMU driver fixes and optimisations.
* for-next/rip-vpipt
- Remove what support we had for the obsolete VPIPT I-cache policy.
* for-next/selftests
- Improvements to the SVE and SME selftests.
* for-next/stacktrace
- Refactor kernel unwind logic so that it can be used by BPF unwinding
and, eventually, reliable backtracing.
* for-next/sysregs
- Update a bunch of register definitions based on the latest XML drop
from Arm.
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmWWvKYQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNIiTB/9agZBkEhZjP2sNDGyE4UFwawweWHkt2r8h
WyvdwP91Z/AIsYSsGYu36J0l4pOnMKp/i6t+rt031SK4j+Q8hJYhSfDt3RvVbc0/
Pz9D18V6cLrfq+Yxycqq9ufVdjs+m+CQ5WeLaRGmNIyEzJ/Jv/qrAN+2r603EeLP
nq08qMZhDIQd2ZzbigCnGaNrTsVSafFfBFv1GsgDvnMZAjs1G6457A6zu+NatNUc
+TMSG+3EawutHZZ2noXl0Ra7VOfIbVZFiUssxRPenKQByHHHR+QB2c/O1blri+dm
XLMutvqO2/WvYGIfXO5koqZqvpVeR3zXxPwmGi5hQBsmOjtXzKd+
=U4mo
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"CPU features:
- Remove ARM64_HAS_NO_HW_PREFETCH copy_page() optimisation for ye
olde Thunder-X machines
- Avoid mapping KPTI trampoline when it is not required
- Make CPU capability API more robust during early initialisation
Early idreg overrides:
- Remove dependencies on core kernel helpers from the early
command-line parsing logic in preparation for moving this code
before the kernel is mapped
FPsimd:
- Restore kernel-mode fpsimd context lazily, allowing us to run
fpsimd code sequences in the kernel with pre-emption enabled
KBuild:
- Install 'vmlinuz.efi' when CONFIG_EFI_ZBOOT=y
- Makefile cleanups
LPA2 prep:
- Preparatory work for enabling the 'LPA2' extension, which will
introduce 52-bit virtual and physical addressing even with 4KiB
pages (including for KVM guests).
Misc:
- Remove dead code and fix a typo
MM:
- Pass NUMA node information for IRQ stack allocations
Perf:
- Add perf support for the Synopsys DesignWare PCIe PMU
- Add support for event counting thresholds (FEAT_PMUv3_TH)
introduced in Armv8.8
- Add support for i.MX8DXL SoCs to the IMX DDR PMU driver.
- Minor PMU driver fixes and optimisations
RIP VPIPT:
- Remove what support we had for the obsolete VPIPT I-cache policy
Selftests:
- Improvements to the SVE and SME selftests
Stacktrace:
- Refactor kernel unwind logic so that it can be used by BPF unwinding
and, eventually, reliable backtracing
Sysregs:
- Update a bunch of register definitions based on the latest XML drop
from Arm"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (87 commits)
kselftest/arm64: Don't probe the current VL for unsupported vector types
efi/libstub: zboot: do not use $(shell ...) in cmd_copy_and_pad
arm64: properly install vmlinuz.efi
arm64/sysreg: Add missing system instruction definitions for FGT
arm64/sysreg: Add missing system register definitions for FGT
arm64/sysreg: Add missing ExtTrcBuff field definition to ID_AA64DFR0_EL1
arm64/sysreg: Add missing Pauth_LR field definitions to ID_AA64ISAR1_EL1
arm64: memory: remove duplicated include
arm: perf: Fix ARCH=arm build with GCC
arm64: Align boot cpucap handling with system cpucap handling
arm64: Cleanup system cpucap handling
MAINTAINERS: add maintainers for DesignWare PCIe PMU driver
drivers/perf: add DesignWare PCIe PMU driver
PCI: Move pci_clear_and_set_dword() helper to PCI header
PCI: Add Alibaba Vendor ID to linux/pci_ids.h
docs: perf: Add description for Synopsys DesignWare PCIe PMU driver
arm64: irq: set the correct node for shadow call stack
Revert "perf/arm_dmc620: Remove duplicate format attribute #defines"
arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD
arm64: fpsimd: Preserve/restore kernel mode NEON at context switch
...
- Add initial support to recognise the HeXin C2000 processor.
- Add papr-vpd and papr-sysparm character device drivers for VPD & sysparm
retrieval, so userspace tools can be adapted to avoid doing raw firmware
calls from userspace.
- Sched domains optimisations for shared processor partitions on P9/P10.
- A series of optimisations for KVM running as a nested HV under PowerVM.
- Other small features and fixes.
Thanks to: Aditya Gupta, Aneesh Kumar K.V, Arnd Bergmann, Christophe Leroy,
Colin Ian King, Dario Binacchi, David Heidelberg, Geoff Levand, Gustavo A.
R. Silva, Haoran Liu, Jordan Niethe, Kajol Jain, Kevin Hao, Kunwu Chan, Li
kunyu, Li zeming, Masahiro Yamada, Michal Suchánek, Nathan Lynch, Naveen N Rao,
Nicholas Piggin, Randy Dunlap, Sathvika Vasireddy, Srikar Dronamraju, Stephen
Rothwell, Vaibhav Jain, Zhao Ke.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmWRVf0THG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgIfpEACns86LkKuH1wTxbXJFaY2vIdPbBVUO
oh0+y6Bm6ybCVvSp/CcyDPRRWpVlnp4BZlAh4x3gHrdRYEbIaFhI3gUzUtPLxAmf
Oza1qyN570AFOudTNOy3VErtHiMHSuI7ckRshXWCakbAN8VlBDFWje3VJ4vZZ5OB
Ii4RM0a3e/XqUZodLQXvDcqo3GDeIVmf1BnOTvEFFPhjZUZBfJarL6OHuyX7Xp1J
oGSBA3O7UBVGrQsoGS5UAMRqZQnvLc5hn150FU1qDPkHu5X5iLvIMUakTFCYgGYw
mT7DBPpDWKKFSfVjsjIVX2GPv8XSMPnZDmxOl/SIKM1F4aKAL9vmbYP6AMXXmvVB
SpluSmkcp+YujtK5QO8BN4I2SD3xIbhH8yjMUh2CAFP1SBR0QnKpXUGHRiZ0m7fM
SSFAHHLEzKJC46vUsazazoldyWQMAwBHKQzoASHf59yrEP4uta/+pimHdsOeU2UP
IAQEYzw7fTKbEIvqV4qf6sW+5bVUhISS1vSlJ3OEkGqUxVvaUMQ2ePPbX+rfv7lS
hXlxh9vjFzcDK5PYmLi0Agua9ct0ER0MOdY5kRMXAb4+AlVLQi4EgymxRCrjYu2/
XodDf1xJU2w7gdMc4TpiouHRrOtZQ9JWH5j+x0YnN4lG2vmG7lbU22a4myn6PjP9
RLAymXt4/1iHqA==
=LjlQ
-----END PGP SIGNATURE-----
Merge tag 'powerpc-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Add initial support to recognise the HeXin C2000 processor.
- Add papr-vpd and papr-sysparm character device drivers for VPD &
sysparm retrieval, so userspace tools can be adapted to avoid doing
raw firmware calls from userspace.
- Sched domains optimisations for shared processor partitions on
P9/P10.
- A series of optimisations for KVM running as a nested HV under
PowerVM.
- Other small features and fixes.
Thanks to Aditya Gupta, Aneesh Kumar K.V, Arnd Bergmann, Christophe
Leroy, Colin Ian King, Dario Binacchi, David Heidelberg, Geoff Levand,
Gustavo A. R. Silva, Haoran Liu, Jordan Niethe, Kajol Jain, Kevin Hao,
Kunwu Chan, Li kunyu, Li zeming, Masahiro Yamada, Michal Suchánek,
Nathan Lynch, Naveen N Rao, Nicholas Piggin, Randy Dunlap, Sathvika
Vasireddy, Srikar Dronamraju, Stephen Rothwell, Vaibhav Jain, and
Zhao Ke.
* tag 'powerpc-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (96 commits)
powerpc/ps3_defconfig: Disable PPC64_BIG_ENDIAN_ELF_ABI_V2
powerpc/86xx: Drop unused CONFIG_MPC8610
powerpc/powernv: Add error handling to opal_prd_range_is_valid
selftests/powerpc: Fix spelling mistake "EACCESS" -> "EACCES"
powerpc/hvcall: Reorder Nestedv2 hcall opcodes
powerpc/ps3: Add missing set_freezable() for ps3_probe_thread()
powerpc/mpc83xx: Use wait_event_freezable() for freezable kthread
powerpc/mpc83xx: Add the missing set_freezable() for agent_thread_fn()
powerpc/fsl: Fix fsl,tmu-calibration to match the schema
powerpc/smp: Dynamically build Powerpc topology
powerpc/smp: Avoid asym packing within thread_group of a core
powerpc/smp: Add __ro_after_init attribute
powerpc/smp: Disable MC domain for shared processor
powerpc/smp: Enable Asym packing for cores on shared processor
powerpc/sched: Cleanup vcpu_is_preempted()
powerpc: add cpu_spec.cpu_features to vmcoreinfo
powerpc/imc-pmu: Add a null pointer check in update_events_in_group()
powerpc/powernv: Add a null pointer check in opal_powercap_init()
powerpc/powernv: Add a null pointer check in opal_event_init()
powerpc/powernv: Add a null pointer check to scom_debug_init_one()
...
commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has
changed the definition of MAX_ORDER to be inclusive. This has caused
issues with code that was not yet upstream and depended on the previous
definition.
To draw attention to the altered meaning of the define, rename MAX_ORDER
to MAX_PAGE_ORDER.
Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>