IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Since [1], dma_alloc_coherent() does not accept requests for GFP_COMP
anymore, even on archs that may be able to fulfill this. Functionality that
relied on the receive buffer being a compound page broke at that point:
The SMC-D protocol, that utilizes the ism device driver, passes receive
buffers to the splice processor in a struct splice_pipe_desc with a
single entry list of struct pages. As the buffer is no longer a compound
page, the splice processor now rejects requests to handle more than a
page worth of data.
Replace dma_alloc_coherent() and allocate a buffer with folio_alloc and
create a DMA map for it with dma_map_page(). Since only receive buffers
on ISM devices use DMA, qualify the mapping as FROM_DEVICE.
Since ISM devices are available on arch s390, only and on that arch all
DMA is coherent, there is no need to introduce and export some kind of
dma_sync_to_cpu() method to be called by the SMC-D protocol layer.
Analogously, replace dma_free_coherent by a two step dma_unmap_page,
then folio_put to free the receive buffer.
[1] https://lore.kernel.org/all/20221113163535.884299-1-hch@lst.de/
Fixes: c08004eede4b ("s390/ism: don't pass bogus GFP_ flags to dma_alloc_coherent")
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
syzbot is able to trigger an uninit-value in geneve_xmit() [1]
Problem : While most ip tunnel helpers (like ip_tunnel_get_dsfield())
uses skb_protocol(skb, true), pskb_inet_may_pull() is only using
skb->protocol.
If anything else than ETH_P_IPV6 or ETH_P_IP is found in skb->protocol,
pskb_inet_may_pull() does nothing at all.
If a vlan tag was provided by the caller (af_packet in the syzbot case),
the network header might not point to the correct location, and skb
linear part could be smaller than expected.
Add skb_vlan_inet_prepare() to perform a complete mac validation.
Use this in geneve for the moment, I suspect we need to adopt this
more broadly.
v4 - Jakub reported v3 broke l2_tos_ttl_inherit.sh selftest
- Only call __vlan_get_protocol() for vlan types.
Link: https://lore.kernel.org/netdev/20240404100035.3270a7d5@kernel.org/
v2,v3 - Addressed Sabrina comments on v1 and v2
Link: https://lore.kernel.org/netdev/Zg1l9L2BNoZWZDZG@hog/
[1]
BUG: KMSAN: uninit-value in geneve_xmit_skb drivers/net/geneve.c:910 [inline]
BUG: KMSAN: uninit-value in geneve_xmit+0x302d/0x5420 drivers/net/geneve.c:1030
geneve_xmit_skb drivers/net/geneve.c:910 [inline]
geneve_xmit+0x302d/0x5420 drivers/net/geneve.c:1030
__netdev_start_xmit include/linux/netdevice.h:4903 [inline]
netdev_start_xmit include/linux/netdevice.h:4917 [inline]
xmit_one net/core/dev.c:3531 [inline]
dev_hard_start_xmit+0x247/0xa20 net/core/dev.c:3547
__dev_queue_xmit+0x348d/0x52c0 net/core/dev.c:4335
dev_queue_xmit include/linux/netdevice.h:3091 [inline]
packet_xmit+0x9c/0x6c0 net/packet/af_packet.c:276
packet_snd net/packet/af_packet.c:3081 [inline]
packet_sendmsg+0x8bb0/0x9ef0 net/packet/af_packet.c:3113
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x30f/0x380 net/socket.c:745
__sys_sendto+0x685/0x830 net/socket.c:2191
__do_sys_sendto net/socket.c:2203 [inline]
__se_sys_sendto net/socket.c:2199 [inline]
__x64_sys_sendto+0x125/0x1d0 net/socket.c:2199
do_syscall_64+0xd5/0x1f0
entry_SYSCALL_64_after_hwframe+0x6d/0x75
Uninit was created at:
slab_post_alloc_hook mm/slub.c:3804 [inline]
slab_alloc_node mm/slub.c:3845 [inline]
kmem_cache_alloc_node+0x613/0xc50 mm/slub.c:3888
kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:577
__alloc_skb+0x35b/0x7a0 net/core/skbuff.c:668
alloc_skb include/linux/skbuff.h:1318 [inline]
alloc_skb_with_frags+0xc8/0xbf0 net/core/skbuff.c:6504
sock_alloc_send_pskb+0xa81/0xbf0 net/core/sock.c:2795
packet_alloc_skb net/packet/af_packet.c:2930 [inline]
packet_snd net/packet/af_packet.c:3024 [inline]
packet_sendmsg+0x722d/0x9ef0 net/packet/af_packet.c:3113
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x30f/0x380 net/socket.c:745
__sys_sendto+0x685/0x830 net/socket.c:2191
__do_sys_sendto net/socket.c:2203 [inline]
__se_sys_sendto net/socket.c:2199 [inline]
__x64_sys_sendto+0x125/0x1d0 net/socket.c:2199
do_syscall_64+0xd5/0x1f0
entry_SYSCALL_64_after_hwframe+0x6d/0x75
CPU: 0 PID: 5033 Comm: syz-executor346 Not tainted 6.9.0-rc1-syzkaller-00005-g928a87efa423 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
Fixes: d13f048dd40e ("net: geneve: modify IP header check in geneve6_xmit_skb and geneve_xmit_skb")
Reported-by: syzbot+9ee20ec1de7b3168db09@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/000000000000d19c3a06152f9ee4@google.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Phillip Potter <phil@philpotter.co.uk>
Cc: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Phillip Potter <phil@philpotter.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
When sending a patch to (among others) Li Yang the nxp MTA replied that
the address doesn't exist and so the mail couldn't be delivered. The
error code was 550, so at least technically that's not a temporal issue.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
- void infinite loop trying to resize local TT, by Sven Eckelmann
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAmYPtjcWHHN3QHNpbW9u
d3VuZGVybGljaC5kZQAKCRChK+OYQpKeofxdD/92tUfDpBCxd8lgMmLuQK6mciOl
4xb+KsYx/yTIMWV0JWYqgntJnfqVb4ycppla72A9eBjvIfD0F4Rki9zStm3WQOdZ
6KmjpIJicMQTbaXeS5hecQ5arGR39GbyyVVVDzxj3xFFZ/lD00DQdfPzyn6kcSWM
huKJTtineXTokwxMfhrKQZUm2tXa0oUPc9JICH/VElvdWD7wJiT/5T8oa821ukaP
JxJLargoMrnnCQvvO9m047dhz5u6xrIkuyEQPj7gjFcR04MhZxOb3qbiC/gwgQoD
SHMM0e7eenkq9NdNQtrIdb5az4Urv7nUVTvw2KQhuqtxNgK9Fcn6Fy2O3OGE8MBZ
aEQVlL658l/fLX/yCYYnIK2dGCg3af0q4qBuuU1J+a0DnMgFK8AldIZ6q4ZwYBed
e8HQJLiyLQ9o0XSRiElNQ7ucE+RtzGOy9ypBYIEek4CKycN1uJHkwM6tZmf/oJse
xCp3kMMCxFSmS+/PyInpB7jGm0j/D/U9v56g+8LaD/kzZ2xSOs96fUuDSxSHgViI
FWf4g2XWTccjb01+g0dU2LndOSa0YiS1S1bN8cG2p6LHO4DRL3M3KuEPvWmpGQVC
XrzcdwMOpiElwdmw/pIgSMe9VHqiX0INbSQkoXUS/eioO0HVClUrB7nbOlRoqtvi
ifGT3286JN1Nubm+Og==
=t7p3
-----END PGP SIGNATURE-----
Merge tag 'batadv-net-pullrequest-20240405' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
Here is a batman-adv bugfix:
- void infinite loop trying to resize local TT, by Sven Eckelmann
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When CONFIG_NET is disabled, an extra warning shows up for this
unused variable:
lib/checksum_kunit.c:218:18: error: 'expected_csum_ipv6_magic' defined but not used [-Werror=unused-const-variable=]
Replace the #ifdef with an IS_ENABLED() check that makes the compiler's
dead-code-elimination take care of the link failure.
Fixes: f24a70106dc1 ("lib: checksum: Fix build with CONFIG_NET=n")
Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Signed-off-by: David S. Miller <davem@davemloft.net>
Inorder to support shaping and scheduling, Upon class creation
Netdev driver allocates trasmit schedulers.
The previous patch which added support for Round robin scheduling has
a bug due to which driver is not freeing transmit schedulers post
class deletion.
This patch fixes the same.
Fixes: 47a9656f168a ("octeontx2-pf: htb offload support for Round Robin scheduling")
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a bug when setting the RSS options in virtio_net that can break
the whole machine, getting the kernel into an infinite loop.
Running the following command in any QEMU virtual machine with virtionet
will reproduce this problem:
# ethtool -X eth0 hfunc toeplitz
This is how the problem happens:
1) ethtool_set_rxfh() calls virtnet_set_rxfh()
2) virtnet_set_rxfh() calls virtnet_commit_rss_command()
3) virtnet_commit_rss_command() populates 4 entries for the rss
scatter-gather
4) Since the command above does not have a key, then the last
scatter-gatter entry will be zeroed, since rss_key_size == 0.
sg_buf_size = vi->rss_key_size;
5) This buffer is passed to qemu, but qemu is not happy with a buffer
with zero length, and do the following in virtqueue_map_desc() (QEMU
function):
if (!sz) {
virtio_error(vdev, "virtio: zero sized buffers are not allowed");
6) virtio_error() (also QEMU function) set the device as broken
vdev->broken = true;
7) Qemu bails out, and do not repond this crazy kernel.
8) The kernel is waiting for the response to come back (function
virtnet_send_command())
9) The kernel is waiting doing the following :
while (!virtqueue_get_buf(vi->cvq, &tmp) &&
!virtqueue_is_broken(vi->cvq))
cpu_relax();
10) None of the following functions above is true, thus, the kernel
loops here forever. Keeping in mind that virtqueue_is_broken() does
not look at the qemu `vdev->broken`, so, it never realizes that the
vitio is broken at QEMU side.
Fix it by not sending RSS commands if the feature is not available in
the device.
Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
Cc: stable@vger.kernel.org
Cc: qemu-devel@nongnu.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
syzbot reported an illegal copy in xsk_setsockopt() [1]
Make sure to validate setsockopt() @optlen parameter.
[1]
BUG: KASAN: slab-out-of-bounds in copy_from_sockptr_offset include/linux/sockptr.h:49 [inline]
BUG: KASAN: slab-out-of-bounds in copy_from_sockptr include/linux/sockptr.h:55 [inline]
BUG: KASAN: slab-out-of-bounds in xsk_setsockopt+0x909/0xa40 net/xdp/xsk.c:1420
Read of size 4 at addr ffff888028c6cde3 by task syz-executor.0/7549
CPU: 0 PID: 7549 Comm: syz-executor.0 Not tainted 6.8.0-syzkaller-08951-gfe46a7dd189e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
copy_from_sockptr_offset include/linux/sockptr.h:49 [inline]
copy_from_sockptr include/linux/sockptr.h:55 [inline]
xsk_setsockopt+0x909/0xa40 net/xdp/xsk.c:1420
do_sock_setsockopt+0x3af/0x720 net/socket.c:2311
__sys_setsockopt+0x1ae/0x250 net/socket.c:2334
__do_sys_setsockopt net/socket.c:2343 [inline]
__se_sys_setsockopt net/socket.c:2340 [inline]
__x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340
do_syscall_64+0xfb/0x240
entry_SYSCALL_64_after_hwframe+0x6d/0x75
RIP: 0033:0x7fb40587de69
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fb40665a0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 00007fb4059abf80 RCX: 00007fb40587de69
RDX: 0000000000000005 RSI: 000000000000011b RDI: 0000000000000006
RBP: 00007fb4058ca47a R08: 0000000000000002 R09: 0000000000000000
R10: 0000000020001980 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007fb4059abf80 R15: 00007fff57ee4d08
</TASK>
Allocated by task 7549:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
__kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
kasan_kmalloc include/linux/kasan.h:211 [inline]
__do_kmalloc_node mm/slub.c:3966 [inline]
__kmalloc+0x233/0x4a0 mm/slub.c:3979
kmalloc include/linux/slab.h:632 [inline]
__cgroup_bpf_run_filter_setsockopt+0xd2f/0x1040 kernel/bpf/cgroup.c:1869
do_sock_setsockopt+0x6b4/0x720 net/socket.c:2293
__sys_setsockopt+0x1ae/0x250 net/socket.c:2334
__do_sys_setsockopt net/socket.c:2343 [inline]
__se_sys_setsockopt net/socket.c:2340 [inline]
__x64_sys_setsockopt+0xb5/0xd0 net/socket.c:2340
do_syscall_64+0xfb/0x240
entry_SYSCALL_64_after_hwframe+0x6d/0x75
The buggy address belongs to the object at ffff888028c6cde0
which belongs to the cache kmalloc-8 of size 8
The buggy address is located 1 bytes to the right of
allocated 2-byte region [ffff888028c6cde0, ffff888028c6cde2)
The buggy address belongs to the physical page:
page:ffffea0000a31b00 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888028c6c9c0 pfn:0x28c6c
anon flags: 0xfff00000000800(slab|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000000800 ffff888014c41280 0000000000000000 dead000000000001
raw: ffff888028c6c9c0 0000000080800057 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x112cc0(GFP_USER|__GFP_NOWARN|__GFP_NORETRY), pid 6648, tgid 6644 (syz-executor.0), ts 133906047828, free_ts 133859922223
set_page_owner include/linux/page_owner.h:31 [inline]
post_alloc_hook+0x1ea/0x210 mm/page_alloc.c:1533
prep_new_page mm/page_alloc.c:1540 [inline]
get_page_from_freelist+0x33ea/0x3580 mm/page_alloc.c:3311
__alloc_pages+0x256/0x680 mm/page_alloc.c:4569
__alloc_pages_node include/linux/gfp.h:238 [inline]
alloc_pages_node include/linux/gfp.h:261 [inline]
alloc_slab_page+0x5f/0x160 mm/slub.c:2175
allocate_slab mm/slub.c:2338 [inline]
new_slab+0x84/0x2f0 mm/slub.c:2391
___slab_alloc+0xc73/0x1260 mm/slub.c:3525
__slab_alloc mm/slub.c:3610 [inline]
__slab_alloc_node mm/slub.c:3663 [inline]
slab_alloc_node mm/slub.c:3835 [inline]
__do_kmalloc_node mm/slub.c:3965 [inline]
__kmalloc_node+0x2db/0x4e0 mm/slub.c:3973
kmalloc_node include/linux/slab.h:648 [inline]
__vmalloc_area_node mm/vmalloc.c:3197 [inline]
__vmalloc_node_range+0x5f9/0x14a0 mm/vmalloc.c:3392
__vmalloc_node mm/vmalloc.c:3457 [inline]
vzalloc+0x79/0x90 mm/vmalloc.c:3530
bpf_check+0x260/0x19010 kernel/bpf/verifier.c:21162
bpf_prog_load+0x1667/0x20f0 kernel/bpf/syscall.c:2895
__sys_bpf+0x4ee/0x810 kernel/bpf/syscall.c:5631
__do_sys_bpf kernel/bpf/syscall.c:5738 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5736 [inline]
__x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5736
do_syscall_64+0xfb/0x240
entry_SYSCALL_64_after_hwframe+0x6d/0x75
page last free pid 6650 tgid 6647 stack trace:
reset_page_owner include/linux/page_owner.h:24 [inline]
free_pages_prepare mm/page_alloc.c:1140 [inline]
free_unref_page_prepare+0x95d/0xa80 mm/page_alloc.c:2346
free_unref_page_list+0x5a3/0x850 mm/page_alloc.c:2532
release_pages+0x2117/0x2400 mm/swap.c:1042
tlb_batch_pages_flush mm/mmu_gather.c:98 [inline]
tlb_flush_mmu_free mm/mmu_gather.c:293 [inline]
tlb_flush_mmu+0x34d/0x4e0 mm/mmu_gather.c:300
tlb_finish_mmu+0xd4/0x200 mm/mmu_gather.c:392
exit_mmap+0x4b6/0xd40 mm/mmap.c:3300
__mmput+0x115/0x3c0 kernel/fork.c:1345
exit_mm+0x220/0x310 kernel/exit.c:569
do_exit+0x99e/0x27e0 kernel/exit.c:865
do_group_exit+0x207/0x2c0 kernel/exit.c:1027
get_signal+0x176e/0x1850 kernel/signal.c:2907
arch_do_signal_or_restart+0x96/0x860 arch/x86/kernel/signal.c:310
exit_to_user_mode_loop kernel/entry/common.c:105 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:201 [inline]
syscall_exit_to_user_mode+0xc9/0x360 kernel/entry/common.c:212
do_syscall_64+0x10a/0x240 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x6d/0x75
Memory state around the buggy address:
ffff888028c6cc80: fa fc fc fc fa fc fc fc fa fc fc fc fa fc fc fc
ffff888028c6cd00: fa fc fc fc fa fc fc fc 00 fc fc fc 06 fc fc fc
>ffff888028c6cd80: fa fc fc fc fa fc fc fc fa fc fc fc 02 fc fc fc
^
ffff888028c6ce00: fa fc fc fc fa fc fc fc fa fc fc fc fa fc fc fc
ffff888028c6ce80: fa fc fc fc fa fc fc fc fa fc fc fc fa fc fc fc
Fixes: 423f38329d26 ("xsk: add umem fill queue support and mmap")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: "Björn Töpel" <bjorn@kernel.org>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20240404202738.3634547-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix bogus lockdep warnings if multiple u64_stats_sync variables are
initialized in the same file.
With CONFIG_LOCKDEP, seqcount_init() is a macro which declares:
static struct lock_class_key __key;
Since u64_stats_init() is a function (albeit an inline one), all calls
within the same file end up using the same instance, effectively treating
them all as a single lock-class.
Fixes: 9464ca650008 ("net: make u64_stats_init() a function")
Closes: https://lore.kernel.org/netdev/ea1567d9-ce66-45e6-8168-ac40a47d1821@roeck-us.net/
Signed-off-by: Petr Tesarik <petr@tesarici.cz>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240404075740.30682-1-petr@tesarici.cz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
On startup, ovs-vswitchd probes different datapath features including
support for timeout policies. While probing, it tries to execute
certain operations with OVS_PACKET_ATTR_PROBE or OVS_FLOW_ATTR_PROBE
attributes set. These attributes tell the openvswitch module to not
log any errors when they occur as it is expected that some of the
probes will fail.
For some reason, setting the timeout policy ignores the PROBE attribute
and logs a failure anyway. This is causing the following kernel log
on each re-start of ovs-vswitchd:
kernel: Failed to associated timeout policy `ovs_test_tp'
Fix that by using the same logging macro that all other messages are
using. The message will still be printed at info level when needed
and will be rate limited, but with a net rate limiter instead of
generic printk one.
The nf_ct_set_timeout() itself will still print some info messages,
but at least this change makes logging in openvswitch module more
consistent.
Fixes: 06bd2bdf19d2 ("openvswitch: Add timeout support to ct action")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://lore.kernel.org/r/20240403203803.2137962-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fairly usual collection of driver and core fixes. The large selftest
accompanying one of the fixes is also becoming a common occurrence.
Current release - regressions:
- ipv6: fix infinite recursion in fib6_dump_done()
- net/rds: fix possible null-deref in newly added error path
Current release - new code bugs:
- net: do not consume a full cacheline for system_page_pool
- bpf: fix bpf_arena-related file descriptor leaks in the verifier
- drv: ice: fix freeing uninitialized pointers, fixing misuse of
the newfangled __free() auto-cleanup
Previous releases - regressions:
- x86/bpf: fixes the BPF JIT with retbleed=stuff
- xen-netfront: add missing skb_mark_for_recycle, fix page pool
accounting leaks, revealed by recently added explicit warning
- tcp: fix bind() regression for v6-only wildcard and v4-mapped-v6
non-wildcard addresses
- Bluetooth:
- replace "hci_qca: Set BDA quirk bit if fwnode exists in DT"
with better workarounds to un-break some buggy Qualcomm devices
- set conn encrypted before conn establishes, fix re-connecting
to some headsets which use slightly unusual sequence of msgs
- mptcp:
- prevent BPF accessing lowat from a subflow socket
- don't account accept() of non-MPC client as fallback to TCP
- drv: mana: fix Rx DMA datasize and skb_over_panic
- drv: i40e: fix VF MAC filter removal
Previous releases - always broken:
- gro: various fixes related to UDP tunnels - netns crossing problems,
incorrect checksum conversions, and incorrect packet transformations
which may lead to panics
- bpf: support deferring bpf_link dealloc to after RCU grace period
- nf_tables:
- release batch on table validation from abort path
- release mutex after nft_gc_seq_end from abort path
- flush pending destroy work before exit_net release
- drv: r8169: skip DASH fw status checks when DASH is disabled
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmYO91wACgkQMUZtbf5S
IrvHBQ/+PH/hobI+o3aLqwtdVlyxhmA31bVQ0I3aTIZV7c3ideMBcfgYa8TiZM2g
pLiBiWoJXCN0h33wgUmlUee+sBvpoPCdPjGD/g99OJyKWjVt2D7ObnSwxMfjHUoq
dtcN2JupqHP0SHz6wPPCmnWtTLxSGUsDdKjmkHQcCRhQIGTYFkYyHcOmPgNbBjaB
6jvmH1kE9WQTFD8QcOMaZmXQ5omoafpxxQLsgundtOWxPWHL7XNvk0B5k/ESDRG1
ujbxwtNnOESzpxZMQ6OyZlsnN/1tWfnEvLJFYVwf9BMrOlahJT/f5b/EJ9/Xy4dC
zkAp7Tul3uAvNRKhBNhVBTWQbnIykmiNMp1VBFmiScQAy8hcnX+6d4LKTIHxbXZK
V3AqcUS6YU2nyMdLRkhvq9f3uxD6hcY19gQdyqgCUPOtyUAs/JPv7lXQjCuuEqkq
urEZkigUApnEqPIrIqANJ7nXUy3U0K8qU6evOZoGZ5OdiKeNKC3+tIr+g2f1ZUZq
a7Dkat7JH9WQ7IG8Geody6Z30K9EpSqYMTKzB5wTfmuqw6cV8bl9OAW9UOSRK0GL
pyG8GwpkpFPkNiZdu9Zt44Pno5xdLIa1+C3QZR0r5CJWYAzCbI80MppP5veF9Mw+
v+2v8iBWuh9iv0AUj9KJOwG5QQ+EXLUuSlhtx/DFnmn2CJ9plXI=
=6bQI
-----END PGP SIGNATURE-----
Merge tag 'net-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter, bluetooth and bpf.
Fairly usual collection of driver and core fixes. The large selftest
accompanying one of the fixes is also becoming a common occurrence.
Current release - regressions:
- ipv6: fix infinite recursion in fib6_dump_done()
- net/rds: fix possible null-deref in newly added error path
Current release - new code bugs:
- net: do not consume a full cacheline for system_page_pool
- bpf: fix bpf_arena-related file descriptor leaks in the verifier
- drv: ice: fix freeing uninitialized pointers, fixing misuse of the
newfangled __free() auto-cleanup
Previous releases - regressions:
- x86/bpf: fixes the BPF JIT with retbleed=stuff
- xen-netfront: add missing skb_mark_for_recycle, fix page pool
accounting leaks, revealed by recently added explicit warning
- tcp: fix bind() regression for v6-only wildcard and v4-mapped-v6
non-wildcard addresses
- Bluetooth:
- replace "hci_qca: Set BDA quirk bit if fwnode exists in DT" with
better workarounds to un-break some buggy Qualcomm devices
- set conn encrypted before conn establishes, fix re-connecting to
some headsets which use slightly unusual sequence of msgs
- mptcp:
- prevent BPF accessing lowat from a subflow socket
- don't account accept() of non-MPC client as fallback to TCP
- drv: mana: fix Rx DMA datasize and skb_over_panic
- drv: i40e: fix VF MAC filter removal
Previous releases - always broken:
- gro: various fixes related to UDP tunnels - netns crossing
problems, incorrect checksum conversions, and incorrect packet
transformations which may lead to panics
- bpf: support deferring bpf_link dealloc to after RCU grace period
- nf_tables:
- release batch on table validation from abort path
- release mutex after nft_gc_seq_end from abort path
- flush pending destroy work before exit_net release
- drv: r8169: skip DASH fw status checks when DASH is disabled"
* tag 'net-6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits)
netfilter: validate user input for expected length
net/sched: act_skbmod: prevent kernel-infoleak
net: usb: ax88179_178a: avoid the interface always configured as random address
net: dsa: sja1105: Fix parameters order in sja1110_pcs_mdio_write_c45()
net: ravb: Always update error counters
net: ravb: Always process TX descriptor ring
netfilter: nf_tables: discard table flag update with pending basechain deletion
netfilter: nf_tables: Fix potential data-race in __nft_flowtable_type_get()
netfilter: nf_tables: reject new basechain after table flag update
netfilter: nf_tables: flush pending destroy work before exit_net release
netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path
netfilter: nf_tables: release batch on table validation from abort path
Revert "tg3: Remove residual error handling in tg3_suspend"
tg3: Remove residual error handling in tg3_suspend
net: mana: Fix Rx DMA datasize and skb_over_panic
net/sched: fix lockdep splat in qdisc_tree_reduce_backlog()
net: phy: micrel: lan8814: Fix when enabling/disabling 1-step timestamping
net: stmmac: fix rx queue priority assignment
net: txgbe: fix i2c dev name cannot match clkdev
net: fec: Set mac_managed_pm during probe
...
A couple more small fixes, and new repair code.
We can now automatically recover from arbitrary corrupted interior btree
nodes by scanning, and we can reconstruct metadata as needed to bring a
filesystem back into a working, consistent, read-write state and
preserve access to whatevver wasn't corrupted.
Meaning - you can blow away all metadata except for extents and dirents
leaf nodes, and repair will reconstruct everything else and give you
your data, and under the correct paths. If inodes are missing i_size
will be slightly off and permissions/ownership/timestamps will be gone,
and we do still need the snapshots btree if snapshots were in use - in
the future we'll be able to guess the snapshot tree structure in some
situations.
IOW - aside from shaking out remaining bugs (fuzz testing is still
coming), repair code should be complete and if repair ever doesn't work
that's the highest priority bug that I want to know about immediately.
This patchset was kindly tested by a user from India who accidentally
wiped one drive out of a three drive filesystem with no replication on
the family computer - it took a couple weeks but we got everything
important back.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmYNq9IACgkQE6szbY3K
bnaG9w/+Od0iq4Nqx62Mf8+O5DLnZZNu3c9aUOEiuzdXlNrpUr4S9j4WwDxTb/EN
2a3ldXY5AhauZqEW7Qv+WBZvVVbm3GYH+oOYQo8V+yf1oGNB3+AGxBCCmruHJGLk
5nmwsRyVm1ihAKxn1oxwrDDPtOlxbGOlc4peR+nCY/b5QnlXegGkGfRAHO/z9bul
4JdBYBqR4KBGdevIV8EG2WVa6ASA6mF1QOboeB6INekD4klDpm41gK/0S9Uf2oXm
q1PiN655YHquXbJTT9k/HtVX4WhlcaHv+R4KeZ5TEReJjB57ot/M8Rx57lgsYHP6
TeyR4Y5VYGLYqlwMK5RiKyGLB92qNFcSlg5inASyTCUNi1KKu12SpqS3+Nel6+tF
gu4F4ElSvAcsmJ6LrfsfP9B8u0ULDkIyq9xBFFbLTIpLuDOqz8FcgFpZrpiO445w
F6FcYXqt2/fP7gxA3GzdFjeUojIjWNMJapgpsePg/HGNArBsoAsBL8rAhAyetG3Z
EOJlrJ8m59/QoPgXBpScfbS7cxk3JgrUzfSI/oKaEr2lS0YNlYjQANYHoEHTFaxA
bMWKXwMkvqz49MMm5WLaMIOYDJRDtrt0qpnW7x+qU7ik/VkHeUTJr07bSRIKT0z1
yNCynYtdbeQVfekZQS6JwsyTs/ehbI1OVN8MGwVRCrQTonYz+BA=
=7/rR
-----END PGP SIGNATURE-----
Merge tag 'bcachefs-2024-04-03' of https://evilpiepirate.org/git/bcachefs
Pull bcachefs repair code from Kent Overstreet:
"A couple more small fixes, and new repair code.
We can now automatically recover from arbitrary corrupted interior
btree nodes by scanning, and we can reconstruct metadata as needed to
bring a filesystem back into a working, consistent, read-write state
and preserve access to whatevver wasn't corrupted.
Meaning - you can blow away all metadata except for extents and
dirents leaf nodes, and repair will reconstruct everything else and
give you your data, and under the correct paths. If inodes are missing
i_size will be slightly off and permissions/ownership/timestamps will
be gone, and we do still need the snapshots btree if snapshots were in
use - in the future we'll be able to guess the snapshot tree structure
in some situations.
IOW - aside from shaking out remaining bugs (fuzz testing is still
coming), repair code should be complete and if repair ever doesn't
work that's the highest priority bug that I want to know about
immediately.
This patchset was kindly tested by a user from India who accidentally
wiped one drive out of a three drive filesystem with no replication on
the family computer - it took a couple weeks but we got everything
important back"
* tag 'bcachefs-2024-04-03' of https://evilpiepirate.org/git/bcachefs:
bcachefs: reconstruct_inode()
bcachefs: Subvolume reconstruction
bcachefs: Check for extents that point to same space
bcachefs: Reconstruct missing snapshot nodes
bcachefs: Flag btrees with missing data
bcachefs: Topology repair now uses nodes found by scanning to fill holes
bcachefs: Repair pass for scanning for btree nodes
bcachefs: Don't skip fake btree roots in fsck
bcachefs: bch2_btree_root_alloc() -> bch2_btree_root_alloc_fake()
bcachefs: Etyzinger cleanups
bcachefs: bch2_shoot_down_journal_keys()
bcachefs: Clear recovery_passes_required as they complete without errors
bcachefs: ratelimit informational fsck errors
bcachefs: Check for bad needs_discard before doing discard
bcachefs: Improve bch2_btree_update_to_text()
mean_and_variance: Drop always failing tests
bcachefs: fix nocow lock deadlock
bcachefs: BCH_WATERMARK_interior_updates
bcachefs: Fix btree node reserve
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZg7vEAAKCRDbK58LschI
gxAeAQD18uHqGbcCCnOVnaETdi3bQcSBpobuclThpN1WdMILcwEA+5Vz9Lxmt6BH
qF/vbVHAOeCZt3LTOJ/jx1GRVstVWQk=
=+QKv
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2024-04-04
We've added 7 non-merge commits during the last 5 day(s) which contain
a total of 9 files changed, 75 insertions(+), 24 deletions(-).
The main changes are:
1) Fix x86 BPF JIT under retbleed=stuff which causes kernel panics due to
incorrect destination IP calculation and incorrect IP for relocations,
from Uros Bizjak and Joan Bruguera Micó.
2) Fix BPF arena file descriptor leaks in the verifier,
from Anton Protopopov.
3) Defer bpf_link deallocation to after RCU grace period as currently
running multi-{kprobes,uprobes} programs might still access cookie
information from the link, from Andrii Nakryiko.
4) Fix a BPF sockmap lock inversion deadlock in map_delete_elem reported
by syzkaller, from Jakub Sitnicki.
5) Fix resolve_btfids build with musl libc due to missing linux/types.h
include, from Natanael Copa.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf, sockmap: Prevent lock inversion deadlock in map delete elem
x86/bpf: Fix IP for relocating call depth accounting
x86/bpf: Fix IP after emitting call depth accounting
bpf: fix possible file descriptor leaks in verifier
tools/resolve_btfids: fix build with musl libc
bpf: support deferring bpf_link dealloc to after RCU grace period
bpf: put uprobe link's path and task in release callback
====================
Link: https://lore.kernel.org/r/20240404183258.4401-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmYOgt8ACgkQ1V2XiooU
IOTX7RAAjhvbva/ussuSDVjDNlYkeZ96uwJNeJz/sHKm7tV043yWwQrYpOy74pBe
oxgya6IbI9yR5JVPlfrsfpgjmCbiWuiZmrMBUr0v1EQ4Zb1WD1KWnoEFKRS80hMq
qZfqCP6gWH6e7nxkD9URamBUOBemN6rg6gqDPIs9khtm9dKrUCKms8kalTlN2gKl
x1jnQwwKUZPtWXw5gN875BHrmX+kBm0YPGr11ys5mYvtUpmvSAwE1l33BAUKLzR+
vkd54f5zF+ZuURE2yYAsa4ZQh/Muho9OVv69r3rJM4thsJmvZLgPYqLZ7iEgxsBQ
RnG034d9liDeNYvsAhZSVTtzT8M8ctfSH/omyAF6QtPN6RdDd5A0XWp6/EhnuAeq
G1qKlXULwfG82uwD6hui0OBRJUTB4muFF4T28iDlCvUbFaEqftU6C/eSjcBbzJNc
nfD7AZsmXJFbEfqamHmezrKcCoCcdkXJlw0SoUp8YgO0mn9txJ3f2hTDsh/48ibA
Y5v6YXwQxP/UYugW8pFTsircaf3uasMJpk49/OlTuGXGj+NtHS0yiaw1kgpVsvPC
mzc/k9iHwan+lCAXQVHPZPwY0puIXIzgcBpJxfzcsJOQVi6SQIAjjRANpqFoGYcz
P2VGeBj1/+Xvhi62OFmvNNDw6iG4LUae4LEVj6gJuzPVqsvZ4qY=
=PKX1
-----END PGP SIGNATURE-----
Merge tag 'nf-24-04-04' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
Patch #1 unlike early commit path stage which triggers a call to abort,
an explicit release of the batch is required on abort, otherwise
mutex is released and commit_list remains in place.
Patch #2 release mutex after nft_gc_seq_end() in commit path, otherwise
async GC worker could collect expired objects.
Patch #3 flush pending destroy work in module removal path, otherwise UaF
is possible.
Patch #4 and #6 restrict the table dormant flag with basechain updates
to fix state inconsistency in the hook registration.
Patch #5 adds missing RCU read side lock to flowtable type to avoid races
with module removal.
* tag 'nf-24-04-04' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: discard table flag update with pending basechain deletion
netfilter: nf_tables: Fix potential data-race in __nft_flowtable_type_get()
netfilter: nf_tables: reject new basechain after table flag update
netfilter: nf_tables: flush pending destroy work before exit_net release
netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path
netfilter: nf_tables: release batch on table validation from abort path
====================
Link: https://lore.kernel.org/r/20240404104334.1627-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-04-03 (ice, idpf)
This series contains updates to ice and idpf drivers.
Dan Carpenter initializes some pointer declarations to NULL as needed for
resource cleanup on ice driver.
Petr Oros corrects assignment of VLAN operators to fix Rx VLAN filtering
in legacy mode for ice.
Joshua calls eth_type_trans() on unknown packets to prevent possible
kernel panic on idpf.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
idpf: fix kernel panic on unknown packet types
ice: fix enabling RX VLAN filtering
ice: Fix freeing uninitialized pointers
====================
Link: https://lore.kernel.org/r/20240403201929.1945116-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
After the commit d2689b6a86b9 ("net: usb: ax88179_178a: avoid two
consecutive device resets"), reset is not executed from bind operation and
mac address is not read from the device registers or the devicetree at that
moment. Since the check to configure if the assigned mac address is random
or not for the interface, happens after the bind operation from
usbnet_probe, the interface keeps configured as random address, although the
address is correctly read and set during open operation (the only reset
now).
In order to keep only one reset for the device and to avoid the interface
always configured as random address, after reset, configure correctly the
suitable field from the driver, if the mac address is read successfully from
the device registers or the devicetree. Take into account if a locally
administered address (random) was previously stored.
cc: stable@vger.kernel.org # 6.6+
Fixes: d2689b6a86b9 ("net: usb: ax88179_178a: avoid two consecutive device resets")
Reported-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240403132158.344838-1-jtornosm@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The definition and declaration of sja1110_pcs_mdio_write_c45() don't have
parameters in the same order.
Knowing that sja1110_pcs_mdio_write_c45() is used as a function pointer
in 'sja1105_info' structure with .pcs_mdio_write_c45, and that we have:
int (*pcs_mdio_write_c45)(struct mii_bus *bus, int phy, int mmd,
int reg, u16 val);
it is likely that the definition is the one to change.
Found with cppcheck, funcArgOrderDifferent.
Fixes: ae271547bba6 ("net: dsa: sja1105: C45 only transactions for PCS")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Michael Walle <mwalle@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/ff2a5af67361988b3581831f7bd1eddebfb4c48f.1712082763.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The error statistics should be updated each time the poll function is
called, even if the full RX work budget has been consumed. This prevents
the counts from becoming stuck when RX bandwidth usage is high.
This also ensures that error counters are not updated after we've
re-enabled interrupts as that could result in a race condition.
Also drop an unnecessary space.
Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/20240402145305.82148-2-paul.barker.ct@bp.renesas.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The TX queue should be serviced each time the poll function is called,
even if the full RX work budget has been consumed. This prevents
starvation of the TX queue when RX bandwidth usage is high.
Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/20240402145305.82148-1-paul.barker.ct@bp.renesas.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Hook unregistration is deferred to the commit phase, same occurs with
hook updates triggered by the table dormant flag. When both commands are
combined, this results in deleting a basechain while leaving its hook
still registered in the core.
Fixes: 179d9ba5559a ("netfilter: nf_tables: fix table flag updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
nft_unregister_flowtable_type() within nf_flow_inet_module_exit() can
concurrent with __nft_flowtable_type_get() within nf_tables_newflowtable().
And thhere is not any protection when iterate over nf_tables_flowtables
list in __nft_flowtable_type_get(). Therefore, there is pertential
data-race of nf_tables_flowtables list entry.
Use list_for_each_entry_rcu() to iterate over nf_tables_flowtables list
in __nft_flowtable_type_get(), and use rcu_read_lock() in the caller
nft_flowtable_type_get() to protect the entire type query process.
Fixes: 3b49e2e94e6e ("netfilter: nf_tables: add flow table netlink frontend")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When dormant flag is toggled, hooks are disabled in the commit phase by
iterating over current chains in table (existing and new).
The following configuration allows for an inconsistent state:
add table x
add chain x y { type filter hook input priority 0; }
add table x { flags dormant; }
add chain x w { type filter hook input priority 1; }
which triggers the following warning when trying to unregister chain w
which is already unregistered.
[ 127.322252] WARNING: CPU: 7 PID: 1211 at net/netfilter/core.c:50 1 __nf_unregister_net_hook+0x21a/0x260
[...]
[ 127.322519] Call Trace:
[ 127.322521] <TASK>
[ 127.322524] ? __warn+0x9f/0x1a0
[ 127.322531] ? __nf_unregister_net_hook+0x21a/0x260
[ 127.322537] ? report_bug+0x1b1/0x1e0
[ 127.322545] ? handle_bug+0x3c/0x70
[ 127.322552] ? exc_invalid_op+0x17/0x40
[ 127.322556] ? asm_exc_invalid_op+0x1a/0x20
[ 127.322563] ? kasan_save_free_info+0x3b/0x60
[ 127.322570] ? __nf_unregister_net_hook+0x6a/0x260
[ 127.322577] ? __nf_unregister_net_hook+0x21a/0x260
[ 127.322583] ? __nf_unregister_net_hook+0x6a/0x260
[ 127.322590] ? __nf_tables_unregister_hook+0x8a/0xe0 [nf_tables]
[ 127.322655] nft_table_disable+0x75/0xf0 [nf_tables]
[ 127.322717] nf_tables_commit+0x2571/0x2620 [nf_tables]
Fixes: 179d9ba5559a ("netfilter: nf_tables: fix table flag updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The commit mutex should not be released during the critical section
between nft_gc_seq_begin() and nft_gc_seq_end(), otherwise, async GC
worker could collect expired objects and get the released commit lock
within the same GC sequence.
nf_tables_module_autoload() temporarily releases the mutex to load
module dependencies, then it goes back to replay the transaction again.
Move it at the end of the abort phase after nft_gc_seq_end() is called.
Cc: stable@vger.kernel.org
Fixes: 720344340fb9 ("netfilter: nf_tables: GC transaction race with abort path")
Reported-by: Kuan-Ting Chen <hexrabbit@devco.re>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Unlike early commit path stage which triggers a call to abort, an
explicit release of the batch is required on abort, otherwise mutex is
released and commit_list remains in place.
Add WARN_ON_ONCE to ensure commit_list is empty from the abort path
before releasing the mutex.
After this patch, commit_list is always assumed to be empty before
grabbing the mutex, therefore
03c1f1ef1584 ("netfilter: Cleanup nft_net->module_list from nf_tables_exit_net()")
only needs to release the pending modules for registration.
Cc: stable@vger.kernel.org
Fixes: c0391b6ab810 ("netfilter: nf_tables: missing validation from the abort path")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This reverts commit 9ab4ad295622a3481818856762471c1f8c830e18.
I went out of coffee and applied it to the wrong tree. Blame on me.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
As of now, tg3_power_down_prepare always ends with success, but
the error handling code from former tg3_set_power_state call is still here.
This code became unreachable in commit c866b7eac073 ("tg3: Do not use
legacy PCI power management").
Remove (now unreachable) error handling code for simplification and change
tg3_power_down_prepare to a void function as its result is no more checked.
Signed-off-by: Nikita Kiryushin <kiryushin@ancud.ru>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240401191418.361747-1-kiryushin@ancud.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
mana_get_rxbuf_cfg() aligns the RX buffer's DMA datasize to be
multiple of 64. So a packet slightly bigger than mtu+14, say 1536,
can be received and cause skb_over_panic.
Sample dmesg:
[ 5325.237162] skbuff: skb_over_panic: text:ffffffffc043277a len:1536 put:1536 head:ff1100018b517000 data:ff1100018b517100 tail:0x700 end:0x6ea dev:<NULL>
[ 5325.243689] ------------[ cut here ]------------
[ 5325.245748] kernel BUG at net/core/skbuff.c:192!
[ 5325.247838] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 5325.258374] RIP: 0010:skb_panic+0x4f/0x60
[ 5325.302941] Call Trace:
[ 5325.304389] <IRQ>
[ 5325.315794] ? skb_panic+0x4f/0x60
[ 5325.317457] ? asm_exc_invalid_op+0x1f/0x30
[ 5325.319490] ? skb_panic+0x4f/0x60
[ 5325.321161] skb_put+0x4e/0x50
[ 5325.322670] mana_poll+0x6fa/0xb50 [mana]
[ 5325.324578] __napi_poll+0x33/0x1e0
[ 5325.326328] net_rx_action+0x12e/0x280
As discussed internally, this alignment is not necessary. To fix
this bug, remove it from the code. So oversized packets will be
marked as CQE_RX_TRUNCATED by NIC, and dropped.
Cc: stable@vger.kernel.org
Fixes: 2fbbd712baf1 ("net: mana: Enable RX path to handle various MTU sizes")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/1712087316-20886-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There are 2 issues with the blamed commit.
1. When the phy is initialized, it would enable the disabled of UDPv4
checksums. The UDPv6 checksum is already enabled by default. So when
1-step is configured then it would clear these flags.
2. After the 1-step is configured, then if 2-step is configured then the
1-step would be still configured because it is not clearing the flag.
So the sync frames will still have origin timestamps set.
Fix this by reading first the value of the register and then
just change bit 12 as this one determines if the timestamp needs to
be inserted in the frame, without changing any other bits.
Fixes: ece19502834d ("net: phy: micrel: 1588 support for LAN8814 phy")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Divya Koppera <divya.koppera@microchip.com>
Link: https://lore.kernel.org/r/20240402071634.2483524-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The driver should ensure that same priority is not mapped to multiple
rx queues. From DesignWare Cores Ethernet Quality-of-Service
Databook, section 17.1.29 MAC_RxQ_Ctrl2:
"[...]The software must ensure that the content of this field is
mutually exclusive to the PSRQ fields for other queues, that is,
the same priority is not mapped to multiple Rx queues[...]"
Previously rx_queue_priority() function was:
- clearing all priorities from a queue
- adding new priorities to that queue
After this patch it will:
- first assign new priorities to a queue
- then remove those priorities from all other queues
- keep other priorities previously assigned to that queue
Fixes: a8f5102af2a7 ("net: stmmac: TX and RX queue priority configuration")
Fixes: 2142754f8b9c ("net: stmmac: Add MAC related callbacks for XGMAC2")
Signed-off-by: Piotr Wejman <piotrwejman90@gmail.com>
Link: https://lore.kernel.org/r/20240401192239.33942-1-piotrwejman90@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Setting mac_managed_pm during interface up is too late.
In situations where the link is not brought up yet and the system suspends
the regular PHY power management will run. Since the FEC ETHEREN control
bit is cleared (automatically) on suspend the controller is off in resume.
When the regular PHY power management resume path runs in this context it
will write to the MII_DATA register but nothing will be transmitted on the
MDIO bus.
This can be observed by the following log:
fec 5b040000.ethernet eth0: MDIO read timeout
Microchip LAN87xx T1 5b040000.ethernet-1:04: PM: dpm_run_callback(): mdio_bus_phy_resume+0x0/0xc8 returns -110
Microchip LAN87xx T1 5b040000.ethernet-1:04: PM: failed to resume: error -110
The data written will however remain in the MII_DATA register.
When the link later is set to administrative up it will trigger a call to
fec_restart() which will restore the MII_SPEED register. This triggers the
quirk explained in f166f890c8f0 ("net: ethernet: fec: Replace interrupt
driven MDIO with polled IO") causing an extra MII_EVENT.
This extra event desynchronizes all the MDIO register reads, causing them
to complete too early. Leading all reads to read as 0 because
fec_enet_mdio_wait() returns too early.
When a Microchip LAN8700R PHY is connected to the FEC, the 0 reads causes
the PHY to be initialized incorrectly and the PHY will not transmit any
ethernet signal in this state. It cannot be brought out of this state
without a power cycle of the PHY.
Fixes: 557d5dc83f68 ("net: fec: use mac-managed PHY PM")
Closes: https://lore.kernel.org/netdev/1f45bdbe-eab1-4e59-8f24-add177590d27@actia.se/
Signed-off-by: Wei Fang <wei.fang@nxp.com>
[jernberg: commit message]
Signed-off-by: John Ernberg <john.ernberg@actia.se>
Link: https://lore.kernel.org/r/20240328155909.59613-2-john.ernberg@actia.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In the very rare case where a packet type is unknown to the driver,
idpf_rx_process_skb_fields would return early without calling
eth_type_trans to set the skb protocol / the network layer handler.
This is especially problematic if tcpdump is running when such a
packet is received, i.e. it would cause a kernel panic.
Instead, call eth_type_trans for every single packet, even when
the packet type is unknown.
Fixes: 3a8845af66ed ("idpf: add RX splitq napi poll support")
Reported-by: Balazs Nemeth <bnemeth@redhat.com>
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Salvatore Daniele <sdaniele@redhat.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
If an inode is missing, but corresponding extents and dirent still
exist, it's well worth recreating it - this does so.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
In backpointer repair, if we get a missing backpointer - but there's
already a backpointer that points to an existing extent - we've got
multiple extents that point to the same space and need to decide which
to keep.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
When the snapshots btree is going, we'll have to delete huge amounts of
data - unless we can reconstruct it by looking at the keys that refer to
it.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
With the new btree node scan code, we can now recover from corrupt btree
roots - simply create a new fake root at depth 1, and then insert all
the leaves we found.
If the root wasn't corrupt but there's corruption elsewhere in the
btree, we can fill in holes as needed with the newest version of a given
node(s) from the scan; we also check if a given btree node is older than
what we found from the scan.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
If a btree root or interior btree node goes bad, we're going to lose a
lot of data, unless we can recover the nodes that it pointed to by
scanning.
Fortunately btree node headers are fully self describing, and
additionally the magic number is xored with the filesytem UUID, so we
can do so safely.
This implements the scanning - next patch will rework topology repair to
make use of the found nodes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Pull out eytzinger.c and kill eytzinger_cmp_fn. We now provide
eytzinger0_sort and eytzinger0_sort_r, which use the standard cmp_func_t
and cmp_r_func_t callbacks.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>