IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Daniel Borkmann says:
====================
pull-request: bpf-next 2023-11-30
We've added 30 non-merge commits during the last 7 day(s) which contain
a total of 58 files changed, 1598 insertions(+), 154 deletions(-).
The main changes are:
1) Add initial TX metadata implementation for AF_XDP with support in mlx5
and stmmac drivers. Two types of offloads are supported right now, that
is, TX timestamp and TX checksum offload, from Stanislav Fomichev with
stmmac implementation from Song Yoong Siang.
2) Change BPF verifier logic to validate global subprograms lazily instead
of unconditionally before the main program, so they can be guarded using
BPF CO-RE techniques, from Andrii Nakryiko.
3) Add BPF link_info support for uprobe multi link along with bpftool
integration for the latter, from Jiri Olsa.
4) Use pkg-config in BPF selftests to determine ld flags which is
in particular needed for linking statically, from Akihiko Odaki.
5) Fix a few BPF selftest failures to adapt to the upcoming LLVM18,
from Yonghong Song.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (30 commits)
bpf/tests: Remove duplicate JSGT tests
selftests/bpf: Add TX side to xdp_hw_metadata
selftests/bpf: Convert xdp_hw_metadata to XDP_USE_NEED_WAKEUP
selftests/bpf: Add TX side to xdp_metadata
selftests/bpf: Add csum helpers
selftests/xsk: Support tx_metadata_len
xsk: Add option to calculate TX checksum in SW
xsk: Validate xsk_tx_metadata flags
xsk: Document tx_metadata_len layout
net: stmmac: Add Tx HWTS support to XDP ZC
net/mlx5e: Implement AF_XDP TX timestamp and checksum offload
tools: ynl: Print xsk-features from the sample
xsk: Add TX timestamp and TX checksum offload support
xsk: Support tx_metadata_len
selftests/bpf: Use pkg-config for libelf
selftests/bpf: Override PKG_CONFIG for static builds
selftests/bpf: Choose pkg-config for the target
bpftool: Add support to display uprobe_multi links
selftests/bpf: Add link_info test for uprobe_multi link
selftests/bpf: Use bpf_link__destroy in fill_link_info tests
...
====================
Conflicts:
Documentation/netlink/specs/netdev.yaml:
839ff60df3 ("net: page_pool: add nlspec for basic access to page pools")
48eb03dd26 ("xsk: Add TX timestamp and TX checksum offload support")
https://lore.kernel.org/all/20231201094705.1ee3cab8@canb.auug.org.au/
While at it also regen, tree is dirty after:
48eb03dd26 ("xsk: Add TX timestamp and TX checksum offload support")
looks like code wasn't re-rendered after "render-max" was removed.
Link: https://lore.kernel.org/r/20231130145708.32573-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
syzbot was able to trigger a crash [1] in page_pool_unlist()
page_pool_list() only inserts a page pool into a netdev page pool list
if a netdev was set in params.
Even if the kzalloc() call in page_pool_create happens to initialize
pool->user.list, I chose to be more explicit in page_pool_list()
adding one INIT_HLIST_NODE().
We could test in page_pool_unlist() if netdev was set,
but since netdev can be changed to lo, it seems more robust to
check if pool->user.list is hashed before calling hlist_del().
[1]
Illegal XDP return value 4294946546 on prog (id 2) dev N/A, expect packet loss!
general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 PID: 5064 Comm: syz-executor391 Not tainted 6.7.0-rc2-syzkaller-00533-ga379972973a8 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
RIP: 0010:__hlist_del include/linux/list.h:988 [inline]
RIP: 0010:hlist_del include/linux/list.h:1002 [inline]
RIP: 0010:page_pool_unlist+0xd1/0x170 net/core/page_pool_user.c:342
Code: df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 90 00 00 00 4c 8b a3 f0 06 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 75 68 48 85 ed 49 89 2c 24 74 24 e8 1b ca 07 f9 48 8d
RSP: 0018:ffffc900039ff768 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: ffff88814ae02000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff88814ae026f0
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff1d57fdc
R10: ffffffff8eabfee3 R11: ffffffff8aa0008b R12: 0000000000000000
R13: ffff88814ae02000 R14: dffffc0000000000 R15: 0000000000000001
FS: 000055555717a380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000002555398 CR3: 0000000025044000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__page_pool_destroy net/core/page_pool.c:851 [inline]
page_pool_release+0x507/0x6b0 net/core/page_pool.c:891
page_pool_destroy+0x1ac/0x4c0 net/core/page_pool.c:956
xdp_test_run_teardown net/bpf/test_run.c:216 [inline]
bpf_test_run_xdp_live+0x1578/0x1af0 net/bpf/test_run.c:388
bpf_prog_test_run_xdp+0x827/0x1530 net/bpf/test_run.c:1254
bpf_prog_test_run kernel/bpf/syscall.c:4041 [inline]
__sys_bpf+0x11bf/0x4920 kernel/bpf/syscall.c:5402
__do_sys_bpf kernel/bpf/syscall.c:5488 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5486 [inline]
__x64_sys_bpf+0x78/0xc0 kernel/bpf/syscall.c:5486
Fixes: 083772c9f9 ("net: page_pool: record pools per netdev")
Reported-and-tested-by: syzbot+f9f8efb58a4db2ca98d0@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20231130092259.3797753-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
During reload-reinit, all entities except for params, resources, regions
and health reporter should be removed and re-added. Add a warning to
be triggered in case the driver behaves differently.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
We will support arbitrary SYN Cookie with BPF, and then kfunc at
TC will preallocate reqsk and initialise some fields that should
not be overwritten later by cookie_v[46]_check().
To simplify the flow in cookie_v[46]_check(), we move such fields'
initialisation to cookie_tcp_reqsk_alloc() and factorise non-BPF
SYN Cookie handling into cookie_tcp_check(), where we validate the
cookie and allocate reqsk, as done by kfunc later.
Note that we set ireq->ecn_ok in two steps, the latter of which will
be shared by the BPF case. As cookie_ecn_ok() is one-liner, now
it's inlined.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231129022924.96156-9-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We will support arbitrary SYN Cookie with BPF, and then some reqsk fields
are initialised in kfunc, and others are done in cookie_v[46]_check().
This patch factorises the common part as cookie_tcp_reqsk_init() and
calls it in cookie_tcp_reqsk_alloc() to minimise the discrepancy between
cookie_v[46]_check().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231129022924.96156-8-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We initialise treq->af_specific in cookie_tcp_reqsk_alloc() so that
we can look up a key later in tcp_create_openreq_child().
Initially, that change was added for MD5 by commit ba5a4fdd63 ("tcp:
make sure treq->af_specific is initialized"), but it has not been used
since commit d0f2b7a9ca ("tcp: Disable header prediction for MD5
flow.").
Now, treq->af_specific is used only by TCP-AO, so, we can move that
initialisation into tcp_ao_syncookie().
In addition to that, l3index in cookie_v[46]_check() is only used for
tcp_ao_syncookie(), so let's move it as well.
While at it, we move down tcp_ao_syncookie() in cookie_v4_check() so
that it will be called after security_inet_conn_request() to make
functions order consistent with cookie_v6_check().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231129022924.96156-7-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When we create a full socket from SYN Cookie, we initialise
tcp_sk(sk)->tsoffset redundantly in tcp_get_cookie_sock() as
the field is inherited from tcp_rsk(req)->ts_off.
cookie_v[46]_check
|- treq->ts_off = 0
`- tcp_get_cookie_sock
|- tcp_v[46]_syn_recv_sock
| `- tcp_create_openreq_child
| `- newtp->tsoffset = treq->ts_off
`- tcp_sk(child)->tsoffset = tsoff
Let's initialise tcp_rsk(req)->ts_off with the correct offset
and remove the second initialisation of tcp_sk(sk)->tsoffset.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231129022924.96156-6-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If the initial subflow has been removed, we cannot know without checking
other counters, e.g. ss -ti <filter> | grep -c tcp-ulp-mptcp or
getsockopt(SOL_MPTCP, MPTCP_FULL_INFO, ...) (or others except MPTCP_INFO
of course) and then check mptcp_subflow_data->num_subflows to get the
total amount of subflows.
This patch adds a new counter mptcpi_subflows_total in mptcpi_flags to
store the total amount of subflows, including the initial one. A new
helper __mptcp_has_initial_subflow() is added to check whether the
initial subflow has been removed or not. With this helper, we can then
compute the total amount of subflows from mptcp_info by doing something
like:
mptcpi_subflows_total = mptcpi_subflows +
__mptcp_has_initial_subflow(msk).
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/428
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20231128-send-net-next-2023107-v4-1-8d6b94150f6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Johannes Berg says:
====================
wireless fixes:
- debugfs had a deadlock (removal vs. use of files),
fixes going through wireless ACKed by Greg
- support for HT STAs on 320 MHz channels, even if it's
not clear that should ever happen (that's 6 GHz), best
not to WARN()
- fix for the previous CQM fix that broke most cases
- various wiphy locking fixes
- various small driver fixes
* tag 'wireless-2023-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: mac80211: use wiphy locked debugfs for sdata/link
wifi: mac80211: use wiphy locked debugfs helpers for agg_status
wifi: cfg80211: add locked debugfs wrappers
debugfs: add API to allow debugfs operations cancellation
debugfs: annotate debugfs handlers vs. removal with lockdep
debugfs: fix automount d_fsdata usage
wifi: mac80211: handle 320 MHz in ieee80211_ht_cap_ie_to_sta_ht_cap
wifi: avoid offset calculation on NULL pointer
wifi: cfg80211: hold wiphy mutex for send_interface
wifi: cfg80211: lock wiphy mutex for rfkill poll
wifi: cfg80211: fix CQM for non-range use
wifi: mac80211: do not pass AP_VLAN vif pointer to drivers during flush
wifi: iwlwifi: mvm: fix an error code in iwl_mvm_mld_add_sta()
wifi: mt76: mt7925: fix typo in mt7925_init_he_caps
wifi: mt76: mt7921: fix 6GHz disabled by the missing default CLC config
====================
Link: https://lore.kernel.org/r/20231129150809.31083-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Borkmann says:
====================
pull-request: bpf 2023-11-30
We've added 5 non-merge commits during the last 7 day(s) which contain
a total of 10 files changed, 66 insertions(+), 15 deletions(-).
The main changes are:
1) Fix AF_UNIX splat from use after free in BPF sockmap,
from John Fastabend.
2) Fix a syzkaller splat in netdevsim by properly handling offloaded
programs (and not device-bound ones), from Stanislav Fomichev.
3) Fix bpf_mem_cache_alloc_flags() to initialize the allocation hint,
from Hou Tao.
4) Fix netkit by rejecting IFLA_NETKIT_PEER_INFO in changelink,
from Daniel Borkmann.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf, sockmap: Add af_unix test with both sockets in map
bpf, sockmap: af_unix stream sockets need to hold ref for pair sock
netkit: Reject IFLA_NETKIT_PEER_INFO in netkit_change_link
bpf: Add missed allocation hint for bpf_mem_cache_alloc_flags()
netdevsim: Don't accept device bound programs
====================
Link: https://lore.kernel.org/r/20231129234916.16128-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
AF_UNIX stream sockets are a paired socket. So sending on one of the pairs
will lookup the paired socket as part of the send operation. It is possible
however to put just one of the pairs in a BPF map. This currently increments
the refcnt on the sock in the sockmap to ensure it is not free'd by the
stack before sockmap cleans up its state and stops any skbs being sent/recv'd
to that socket.
But we missed a case. If the peer socket is closed it will be free'd by the
stack. However, the paired socket can still be referenced from BPF sockmap
side because we hold a reference there. Then if we are sending traffic through
BPF sockmap to that socket it will try to dereference the free'd pair in its
send logic creating a use after free. And following splat:
[59.900375] BUG: KASAN: slab-use-after-free in sk_wake_async+0x31/0x1b0
[59.901211] Read of size 8 at addr ffff88811acbf060 by task kworker/1:2/954
[...]
[59.905468] Call Trace:
[59.905787] <TASK>
[59.906066] dump_stack_lvl+0x130/0x1d0
[59.908877] print_report+0x16f/0x740
[59.910629] kasan_report+0x118/0x160
[59.912576] sk_wake_async+0x31/0x1b0
[59.913554] sock_def_readable+0x156/0x2a0
[59.914060] unix_stream_sendmsg+0x3f9/0x12a0
[59.916398] sock_sendmsg+0x20e/0x250
[59.916854] skb_send_sock+0x236/0xac0
[59.920527] sk_psock_backlog+0x287/0xaa0
To fix let BPF sockmap hold a refcnt on both the socket in the sockmap and its
paired socket. It wasn't obvious how to contain the fix to bpf_unix logic. The
primarily problem with keeping this logic in bpf_unix was: In the sock close()
we could handle the deref by having a close handler. But, when we are destroying
the psock through a map delete operation we wouldn't have gotten any signal
thorugh the proto struct other than it being replaced. If we do the deref from
the proto replace its too early because we need to deref the sk_pair after the
backlog worker has been stopped.
Given all this it seems best to just cache it at the end of the psock and eat 8B
for the af_unix and vsock users. Notice dgram sockets are OK because they handle
locking already.
Fixes: 94531cfcbe ("af_unix: Add unix_stream_proto for sockmap")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20231129012557.95371-2-john.fastabend@gmail.com
Accept only the flags that the kernel knows about to make
sure we can extend this field in the future. Note that only
in XDP_COPY mode we propagate the error signal back to the user
(via sendmsg). For zerocopy mode we silently skip the metadata
for the descriptors that have wrong flags (since we process
the descriptors deep in the driver).
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20231127190319.1190813-8-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This change actually defines the (initial) metadata layout
that should be used by AF_XDP userspace (xsk_tx_metadata).
The first field is flags which requests appropriate offloads,
followed by the offload-specific fields. The supported per-device
offloads are exported via netlink (new xsk-flags).
The offloads themselves are still implemented in a bit of a
framework-y fashion that's left from my initial kfunc attempt.
I'm introducing new xsk_tx_metadata_ops which drivers are
supposed to implement. The drivers are also supposed
to call xsk_tx_metadata_request/xsk_tx_metadata_complete in
the right places. Since xsk_tx_metadata_{request,_complete}
are static inline, we don't incur any extra overhead doing
indirect calls.
The benefit of this scheme is as follows:
- keeps all metadata layout parsing away from driver code
- makes it easy to grep and see which drivers implement what
- don't need any extra flags to maintain to keep track of what
offloads are implemented; if the callback is implemented - the offload
is supported (used by netlink reporting code)
Two offloads are defined right now:
1. XDP_TXMD_FLAGS_CHECKSUM: skb-style csum_start+csum_offset
2. XDP_TXMD_FLAGS_TIMESTAMP: writes TX timestamp back into metadata
area upon completion (tx_timestamp field)
XDP_TXMD_FLAGS_TIMESTAMP is also implemented for XDP_COPY mode: it writes
SW timestamp from the skb destructor (note I'm reusing hwtstamps to pass
metadata pointer).
The struct is forward-compatible and can be extended in the future
by appending more fields.
Reviewed-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20231127190319.1190813-3-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
For zerocopy mode, tx_desc->addr can point to an arbitrary offset
and carry some TX metadata in the headroom. For copy mode, there
is no way currently to populate skb metadata.
Introduce new tx_metadata_len umem config option that indicates how many
bytes to treat as metadata. Metadata bytes come prior to tx_desc address
(same as in RX case).
The size of the metadata has mostly the same constraints as XDP:
- less than 256 bytes
- 8-byte aligned (compared to 4-byte alignment on xdp, due to 8-byte
timestamp in the completion)
- non-zero
This data is not interpreted in any way right now.
Reviewed-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20231127190319.1190813-2-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The default dump handler needs to clear ret before returning.
Otherwise if the last interface returns an inconsequential
error this error will propagate to user space.
This may confuse user space (ethtool CLI seems to ignore it,
but YNL doesn't). It will also terminate the dump early
for mutli-skb dump, because netlink core treats EOPNOTSUPP
as a real error.
Fixes: 728480f124 ("ethtool: default handlers for GET requests")
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231126225806.2143528-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mute the periodic "stalled pool shutdown" warning if the page pool
is visible to user space. Rolling out a driver using page pools
to just a few hundred hosts at Meta surfaces applications which
fail to reap their broken sockets. Obviously it's best if the
applications are fixed, but we don't generally print warnings
for application resource leaks. Admins can now depend on the
netlink interface for getting page pool info to detect buggy
apps.
While at it throw in the ID of the pool into the message,
in rare cases (pools from destroyed netns) this will make
finding the pool with a debugger easier.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Report when page pool was destroyed. Together with the inflight
/ memory use reporting this can serve as a replacement for the
warning about leaked page pools we currently print to dmesg.
Example output for a fake leaked page pool using some hacks
in netdevsim (one "live" pool, and one "leaked" on the same dev):
$ ./cli.py --no-schema --spec netlink/specs/netdev.yaml \
--dump page-pool-get
[{'id': 2, 'ifindex': 3},
{'id': 1, 'ifindex': 3, 'destroyed': 133, 'inflight': 1}]
Tested-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Advanced deployments need the ability to check memory use
of various system components. It makes it possible to make informed
decisions about memory allocation and to find regressions and leaks.
Report memory use of page pools. Report both number of references
and bytes held.
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
To avoid any issues with race conditions on accessing napi
and having to think about the lifetime of NAPI objects
in netlink GET - stash the napi_id to which page pool
was linked at creation time.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link the page pools with netdevs. This needs to be netns compatible
so we have two options. Either we record the pools per netns and
have to worry about moving them as the netdev gets moved.
Or we record them directly on the netdev so they move with the netdev
without any extra work.
Implement the latter option. Since pools may outlast netdev we need
a place to store orphans. In time honored tradition use loopback
for this purpose.
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Kalle Valo says:
====================
wireless-next patches for v6.8
The first features pull request for v6.8. Not so big in number of
commits but we removed quite a few ancient drivers: libertas 16-bit
PCMCIA support, atmel, hostap, zd1201, orinoco, ray_cs, wl3501 and
rndis_wlan.
Major changes:
cfg80211/mac80211
- extend support for scanning while Multi-Link Operation (MLO) connected
* tag 'wireless-next-2023-11-27' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (68 commits)
wifi: nl80211: Documentation update for NL80211_CMD_PORT_AUTHORIZED event
wifi: mac80211: Extend support for scanning while MLO connected
wifi: cfg80211: Extend support for scanning while MLO connected
wifi: ieee80211: fix PV1 frame control field name
rfkill: return ENOTTY on invalid ioctl
MAINTAINERS: update iwlwifi maintainers
wifi: rtw89: 8922a: read efuse content from physical map
wifi: rtw89: 8922a: read efuse content via efuse map struct from logic map
wifi: rtw89: 8852c: read RX gain offset from efuse for 6GHz channels
wifi: rtw89: mac: add to access efuse for WiFi 7 chips
wifi: rtw89: mac: use mac_gen pointer to access about efuse
wifi: rtw89: 8922a: add 8922A basic chip info
wifi: rtlwifi: drop unused const_amdpci_aspm
wifi: mwifiex: mwifiex_process_sleep_confirm_resp(): remove unused priv variable
wifi: rtw89: regd: update regulatory map to R65-R44
wifi: rtw89: regd: handle policy of 6 GHz according to BIOS
wifi: rtw89: acpi: process 6 GHz band policy from DSM
wifi: rtlwifi: simplify rtl_action_proc() and rtl_tx_agg_start()
wifi: rtw89: pci: update interrupt mitigation register for 8922AE
wifi: rtw89: pci: correct interrupt mitigation register for 8852CE
...
====================
Link: https://lore.kernel.org/r/20231127180056.0B48DC433C8@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
With latest upstream llvm18, the following test cases failed:
$ ./test_progs -j
#13/2 bpf_cookie/multi_kprobe_link_api:FAIL
#13/3 bpf_cookie/multi_kprobe_attach_api:FAIL
#13 bpf_cookie:FAIL
#77 fentry_fexit:FAIL
#78/1 fentry_test/fentry:FAIL
#78 fentry_test:FAIL
#82/1 fexit_test/fexit:FAIL
#82 fexit_test:FAIL
#112/1 kprobe_multi_test/skel_api:FAIL
#112/2 kprobe_multi_test/link_api_addrs:FAIL
[...]
#112 kprobe_multi_test:FAIL
#356/17 test_global_funcs/global_func17:FAIL
#356 test_global_funcs:FAIL
Further analysis shows llvm upstream patch [1] is responsible for the above
failures. For example, for function bpf_fentry_test7() in net/bpf/test_run.c,
without [1], the asm code is:
0000000000000400 <bpf_fentry_test7>:
400: f3 0f 1e fa endbr64
404: e8 00 00 00 00 callq 0x409 <bpf_fentry_test7+0x9>
409: 48 89 f8 movq %rdi, %rax
40c: c3 retq
40d: 0f 1f 00 nopl (%rax)
... and with [1], the asm code is:
0000000000005d20 <bpf_fentry_test7.specialized.1>:
5d20: e8 00 00 00 00 callq 0x5d25 <bpf_fentry_test7.specialized.1+0x5>
5d25: c3 retq
... and <bpf_fentry_test7.specialized.1> is called instead of <bpf_fentry_test7>
and this caused test failures for #13/#77 etc. except #356.
For test case #356/17, with [1] (progs/test_global_func17.c)), the main prog
looks like:
0000000000000000 <global_func17>:
0: b4 00 00 00 2a 00 00 00 w0 = 0x2a
1: 95 00 00 00 00 00 00 00 exit
... which passed verification while the test itself expects a verification
failure.
Let us add 'barrier_var' style asm code in both places to prevent function
specialization which caused selftests failure.
[1] https://github.com/llvm/llvm-project/pull/72903
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231127050342.1945270-1-yonghong.song@linux.dev
The debugfs files for netdevs (sdata) and links are removed
with the wiphy mutex held, which may deadlock. Use the new
wiphy locked debugfs to avoid that.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The read is currently with RCU and the write can deadlock,
convert both for the sake of illustration.
Make mac80211 depend on cfg80211 debugfs to get the helpers,
but mac80211 debugfs without it does nothing anyway. This
also required some adjustments in ath9k.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Add wrappers for debugfs files that should be called with
the wiphy mutex held, while the file is also to be removed
under the wiphy mutex. This could otherwise deadlock when
a file is trying to acquire the wiphy mutex while the code
removing it holds the mutex but waits for the removal.
This actually works by pushing the execution of the read
or write handler to a wiphy work that can be cancelled
using the debugfs cancellation API.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Given all the locking rework in mac80211, we pretty much
need to get into the driver with the wiphy mutex held in
all callbacks. This is already mostly the case, but as
Johan reported, in the get_txpower it may not be true.
Lock the wiphy mutex around nl80211_send_iface(), then
is also around callers of nl80211_notify_iface(). This
is easy to do, fixes the problem, and aligns the locking
between various calls to it in different parts of the
code of cfg80211.
Fixes: 0e8185ce1d ("wifi: mac80211: check wiphy mutex in ops")
Reported-by: Johan Hovold <johan@kernel.org>
Closes: https://lore.kernel.org/r/ZVOXX6qg4vXEx8dX@hovoldconsulting.com
Tested-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
My prior race fix here broke CQM when ranges aren't used, as
the reporting worker now requires the cqm_config to be set in
the wdev, but isn't set when there's no range configured.
Rather than continuing to special-case the range version, set
the cqm_config always and configure accordingly, also tracking
if range was used or not to be able to clear the configuration
appropriately with the same API, which was actually not right
if both were implemented by a driver for some reason, as is
the case with mac80211 (though there the implementations are
equivalent so it doesn't matter.)
Also, the original multiple-RSSI commit lost checking for the
callback, so might have potentially crashed if a driver had
neither implementation, and userspace tried to use it despite
not being advertised as supported.
Cc: stable@vger.kernel.org
Fixes: 4a4b816950 ("cfg80211: Accept multiple RSSI thresholds for CQM")
Fixes: 37c20b2eff ("wifi: cfg80211: fix cqm_config access race")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The commit dcd2cf5f2f ("net/smc: add autocorking support") adds an
atomic variable tx_pushing in smc_connection to make sure only one can
send to let it cork more and save CDC slot. since smc_tx_pending can be
called in the soft IRQ without checking sock_owned_by_user() at that
time, which would cause a race condition because bh_lock_sock() did
not honor sock_lock()
After commit 6b88af839d ("net/smc: don't send in the BH context if
sock_owned_by_user"), the transmission is deferred to when sock_lock()
is held by the user. Therefore, we no longer need tx_pending to hold
message.
So remove atomic variable tx_pushing and its operation, and
smc_tx_sndbuf_nonempty becomes a wrapper of __smc_tx_sndbuf_nonempty,
so rename __smc_tx_sndbuf_nonempty back to smc_tx_sndbuf_nonempty
Suggested-by: Alexandra Winter <wintera@linux.ibm.com>
Co-developed-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
diff v4: remove atomic variable tx_pushing
diff v3: improvements in the commit body and comments
diff v2: fix a typo in commit body and add net-next subject-prefix
net/smc/smc.h | 1 -
net/smc/smc_tx.c | 30 +-----------------------------
2 files changed, 1 insertion(+), 30 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new sysctl: net.smc.smcr_max_conns_per_lgr, which is
used to control the preferred max connections per lgr for
SMC-R v2.1. The default value of this sysctl is 255, and
the acceptable value ranges from 16 to 255.
Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>