linux

iv/linux

Author	SHA1	Message	Date
Geliang Tang	6a0653b96f	selftests: mptcp: add fullmesh setting tests This patch added the fullmesh setting and clearing selftests in mptcp_join.sh. Now we can set both backup and fullmesh flags, so avoid using the words 'backup' and 'bkup'. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-03 11:44:08 +00:00
Geliang Tang	c25d29be00	selftests: mptcp: set fullmesh flag in pm_nl_ctl This patch added the fullmesh flag setting and clearing support in pm_nl_ctl: # pm_nl_ctl set ip flags fullmesh # pm_nl_ctl set ip flags nofullmesh Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-03 11:44:08 +00:00
Geliang Tang	73c762c1f0	mptcp: set fullmesh flag in pm_netlink This patch added the fullmesh flag setting support in pm_netlink. If the fullmesh flag of the address is changed, remove all the related subflows, update the fullmesh flag and create subflows again. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-03 11:44:08 +00:00
Geliang Tang	9ddd1cac6f	mptcp: print out reset infos of MP_RST This patch printed out the reset infos, reset_transient and reset_reason, of MP_RST in mptcp_parse_option() to show that MP_RST is received. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-03 11:44:08 +00:00
Matthieu Baerts	8cca39e251	mptcp: clarify when options can be used RFC8684 doesn't seem to clearly specify which MPTCP options can be used together. Some options are mutually exclusive -- e.g. MP_CAPABLE and MP_JOIN --, some can be used together -- e.g. DSS + MP_PRIO --, some can but we prefer not to -- e.g. DSS + ADD_ADDR -- and some have to be used together at some points -- e.g. MP_FAIL and DSS. We need to clarify this as a base before allowing other modifications. For example, does it make sense to send a RM_ADDR with an MPC or MPJ? This remains open for possible future discussions. Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-03 11:44:08 +00:00
Matthieu Baerts	902c8f8648	mptcp: reduce branching when writing MP_FAIL option MP_FAIL should be use in very rare cases, either when the TCP RST flag is set -- with or without an MP_RST -- or with a DSS, see mptcp_established_options(). Here, we do the same in mptcp_write_options(). Co-developed-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-03 11:44:08 +00:00
Geliang Tang	d7889cfa0b	mptcp: move the declarations of ssk and subflow Move the declarations of ssk and subflow in MP_FAIL and MP_PRIO to the beginning of the function mptcp_write_options(). Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-03 11:44:08 +00:00
Hou Tao	b293dcc473	bpf: Use VM_MAP instead of VM_ALLOC for ringbuf After commit 2fd3fb0be1d1 ("kasan, vmalloc: unpoison VM_ALLOC pages after mapping"), non-VM_ALLOC mappings will be marked as accessible in __get_vm_area_node() when KASAN is enabled. But now the flag for ringbuf area is VM_ALLOC, so KASAN will complain out-of-bound access after vmap() returns. Because the ringbuf area is created by mapping allocated pages, so use VM_MAP instead. After the change, info in /proc/vmallocinfo also changes from [start]-[end] 24576 ringbuf_map_alloc+0x171/0x290 vmalloc user to [start]-[end] 24576 ringbuf_map_alloc+0x171/0x290 vmap user Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it") Reported-by: syzbot+5ad567a418794b9b5983@syzkaller.appspotmail.com Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220202060158.6260-1-houtao1@huawei.com	2022-02-02 23:15:24 -08:00
Jakub Kicinski	156a532b48	Merge branch 'net-ipa-support-variable-rx-buffer-size' Alex Elder says: ==================== net: ipa: support variable RX buffer size Specify the size of receive buffers used for RX endpoints in the configuration data, rather than using 8192 bytes for all of them. Increase the size of the AP receive buffer for the modem to 32KB. ==================== Link: https://lore.kernel.org/r/20220201153737.601149-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-02 21:13:52 -08:00
Alex Elder	33230aeb2e	net: ipa: set IPA v4.11 AP<-modem RX buffer size to 32KB Increase the receive buffer size used for data received from the modem to 32KB, to improve download performance by allowing much greater aggregation. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-02 21:13:45 -08:00
Alex Elder	ed23f02680	net: ipa: define per-endpoint receive buffer size Allow RX endpoints to have differing receive buffer sizes. Define the receive buffer size in the configuration data, and use that rather than IPA_RX_BUFFER_SIZE when configuring the endpoint. Add verification in ipa_endpoint_data_valid_one() that the receive buffer specified for AP RX endpoints is both big enough to handle at least one full packet, and not so big in an aggregating endpoint that its size can't be represented when programming the hardware. Move aggr_byte_limit_max() up in "ipa_endpoint.c" so it can be used earlier in the file without a forward-reference. Initially we'll just keep the 8KB receive buffer size already in use for all AP RX endpoints.. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-02 21:13:45 -08:00
Daniel Borkmann	4a81f6da9c	net, neigh: Do not trigger immediate probes on NUD_FAILED from neigh_managed_work syzkaller was able to trigger a deadlock for NTF_MANAGED entries [0]: kworker/0:16/14617 is trying to acquire lock: ffffffff8d4dd370 (&tbl->lock){++-.}-{2:2}, at: ___neigh_create+0x9e1/0x2990 net/core/neighbour.c:652 [...] but task is already holding lock: ffffffff8d4dd370 (&tbl->lock){++-.}-{2:2}, at: neigh_managed_work+0x35/0x250 net/core/neighbour.c:1572 The neighbor entry turned to NUD_FAILED state, where __neigh_event_send() triggered an immediate probe as per commit cd28ca0a3dd1 ("neigh: reduce arp latency") via neigh_probe() given table lock was held. One option to fix this situation is to defer the neigh_probe() back to the neigh_timer_handler() similarly as pre cd28ca0a3dd1. For the case of NTF_MANAGED, this deferral is acceptable given this only happens on actual failure state and regular / expected state is NUD_VALID with the entry already present. The fix adds a parameter to __neigh_event_send() in order to communicate whether immediate probe is allowed or disallowed. Existing call-sites of neigh_event_send() default as-is to immediate probe. However, the neigh_managed_work() disables it via use of neigh_event_send_probe(). [0] <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_deadlock_bug kernel/locking/lockdep.c:2956 [inline] check_deadlock kernel/locking/lockdep.c:2999 [inline] validate_chain kernel/locking/lockdep.c:3788 [inline] __lock_acquire.cold+0x149/0x3ab kernel/locking/lockdep.c:5027 lock_acquire kernel/locking/lockdep.c:5639 [inline] lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5604 __raw_write_lock_bh include/linux/rwlock_api_smp.h:202 [inline] _raw_write_lock_bh+0x2f/0x40 kernel/locking/spinlock.c:334 ___neigh_create+0x9e1/0x2990 net/core/neighbour.c:652 ip6_finish_output2+0x1070/0x14f0 net/ipv6/ip6_output.c:123 __ip6_finish_output net/ipv6/ip6_output.c:191 [inline] __ip6_finish_output+0x61e/0xe90 net/ipv6/ip6_output.c:170 ip6_finish_output+0x32/0x200 net/ipv6/ip6_output.c:201 NF_HOOK_COND include/linux/netfilter.h:296 [inline] ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:224 dst_output include/net/dst.h:451 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ndisc_send_skb+0xa99/0x17f0 net/ipv6/ndisc.c:508 ndisc_send_ns+0x3a9/0x840 net/ipv6/ndisc.c:650 ndisc_solicit+0x2cd/0x4f0 net/ipv6/ndisc.c:742 neigh_probe+0xc2/0x110 net/core/neighbour.c:1040 __neigh_event_send+0x37d/0x1570 net/core/neighbour.c:1201 neigh_event_send include/net/neighbour.h:470 [inline] neigh_managed_work+0x162/0x250 net/core/neighbour.c:1574 process_one_work+0x9ac/0x1650 kernel/workqueue.c:2307 worker_thread+0x657/0x1110 kernel/workqueue.c:2454 kthread+0x2e9/0x3a0 kernel/kthread.c:377 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 </TASK> Fixes: 7482e3841d52 ("net, neigh: Add NTF_MANAGED flag for managed neighbor entries") Reported-by: syzbot+5239d0e1778a500d477a@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Roopa Prabhu <roopa@nvidia.com> Tested-by: syzbot+5239d0e1778a500d477a@syzkaller.appspotmail.com Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220201193942.5055-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-02 20:30:18 -08:00
Eric Dumazet	b67985be40	tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data() tcp_shift_skb_data() might collapse three packets into a larger one. P_A, P_B, P_C -> P_ABC Historically, it used a single tcp_skb_can_collapse_to(P_A) call, because it was enough. In commit 85712484110d ("tcp: coalesce/collapse must respect MPTCP extensions"), this call was replaced by a call to tcp_skb_can_collapse(P_A, P_B) But the now needed test over P_C has been missed. This probably broke MPTCP. Then later, commit 9b65b17db723 ("net: avoid double accounting for pure zerocopy skbs") added an extra condition to tcp_skb_can_collapse(), but the missing call from tcp_shift_skb_data() is also breaking TCP zerocopy, because P_A and P_C might have different skb_zcopy_pure() status. Fixes: 85712484110d ("tcp: coalesce/collapse must respect MPTCP extensions") Fixes: 9b65b17db723 ("net: avoid double accounting for pure zerocopy skbs") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mat Martineau <mathew.j.martineau@linux.intel.com> Cc: Talal Ahmad <talalahmad@google.com> Cc: Arjun Roy <arjunroy@google.com> Cc: Willem de Bruijn <willemb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20220201184640.756716-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-02 16:22:37 -08:00
Linus Torvalds	88808fbbea	Notable bug fixes: - Ensure SM_NOTIFY doesn't crash the NFS server host - Ensure NLM locks are cleaned up after client reboot - Fix a leak of internal NFSv4 lease information -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmH1em4ACgkQM2qzM29m f5drhRAAq8uU+tgABqZNj4aLivUOAionkSiV6Blk1V44DO00yhY2y3dAsOu8bO0k Kh1Yu0QSZeaYDSi2Ak9qCKAl8eNg8lvlxWJ5pQ+GERVJiZj3JJRPSUJI+5r/aQMi k774Y+DzLwPn6/r5iTyymm3vx1wcas+Y/v2nvmHob/G74UKngbhOhP05XS/1MDlM fdTtXVKqLx92grDljTXWCtT5q5mpOc+OFufo2a5+b1aJjUWiU/rraT1mArNlEC7F IMw/eZn6ZnZv+ywbVJFGeRib/Xa7jNeKA+4CQMH+quk/s8rHEaUJqeM5439HLBYk E0KrFAdn+VDV5A6I9TIB1vtykl0KzC/r2u8G4vbA++rfpuxW36lGS95JFnDctGG+ uwk/f4p2+D7oSGt7gLXt8LTOAx0/NeT+OTtUqZRPcoKO7uXvkkCCu2irD9VpGSpD A83Qq0ewT9ntNy0Feik3FgmRSmPTgvywE78MeRFoundd3QhtghUunfY1N2soDt7t 0hyqBhcH8ypWjFoKmv+wAHLPcGcdeg+8T0w3hFPcyTrrdYo/OJl4MNgrIczA2z8O nWCZ+lOZq3QtAkd0eGSFPhnTVebCP5n6yvIfDN4rZc+ASNAqXCR5e1yCDE1gfO+E I1uCcxzewWPe3DsuYWQznEx5u4Rpiml5JF1q5uKFwTNj4UTBFKQ= =IC/r -----END PGP SIGNATURE----- Merge tag 'nfsd-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "Notable bug fixes: - Ensure SM_NOTIFY doesn't crash the NFS server host - Ensure NLM locks are cleaned up after client reboot - Fix a leak of internal NFSv4 lease information" * tag 'nfsd-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: nfsd4_setclientid_confirm mistakenly expires confirmed client. lockd: fix failure to cleanup client locks lockd: fix server crash on reboot of client holding lock	2022-02-02 10:14:31 -08:00
Linus Torvalds	d5084ffbc5	\n -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmH6gtsACgkQnJ2qBz9k QNkHTggAvFfCBZ11AM3MIQifVrN06q0Aq/xTCR56lbLoqjVbLx+oUAPYk45AgzBJ 9NhYFwPbevLs+c1JrigmTW/i2juZghdfV3CBu6uWvnIDgZbjDwt1RFZ/TMuzG488 Orr12n34J+kaM89BxBPxecFXqGW1bqtIeIUkM6M4OefagVvueRP591GEHRPGk60S nz90LIZN2fsXrDq6K0EC4LVnMF8VWe7lpW8vHORc/O83KasHFGv1xXJ/Z1ovq9ln N17pbjFvECyKwIsvQCuKoxa/iutKqiUSQiVyFyN9IryBE5bMKSuv87EP+dKPcP/O Vmkg2AZRAbL9+M/rhgu4mF0bfhnuJQ== =uGov -----END PGP SIGNATURE----- Merge tag 'fsnotify_for_v5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fanotify fix from Jan Kara: "Fix stale file descriptor in copy_event_to_user" * tag 'fsnotify_for_v5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fanotify: Fix stale file descriptor in copy_event_to_user()	2022-02-02 10:08:52 -08:00
Linus Torvalds	27bb0b18c2	linux-kselftest-kunit-fixes-5.17-rc3 This kunit update for Linux 5.17-rc3 consists of a single fix to an error seen on qemu due to a missing import. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmH5reYACgkQCwJExA0N QxyTlRAA7CYxkE9UjeReUDHWllwI+/hMLqk06Xe+T/uUNlXJTnr7NBLbUI9896Ur 2LB7k+yP1h1tlimQFMH1SfzAQrSUdpwhX+IXN1/3oLgcXYCNeoEUhF7DfGXST1BX mOd1Qr+GU/yIru7WfpRDxEZgjech9BnDlmuEi10ZF7uYyte49ZbyBIrR4VQjuSvb SK2iHpdQk8/7TVoE2r6fTJW2n6LcFzlXCOi48jzBdBfSfeo8WFd9EK2DD9Y7PJSM UjXQBEXGqae2Occu6+qqBZCRMR3sxxo3T7Ak01pAhzgEDDQjuKNFLhQHPti6bgYJ VbMSaP7onuwIRBVSvoXeOnzjT6ozXXb+C0q406gE4vu71UAxD5E0k6wZuoh5q4xr MKkACZbsEmylkTq+cQhe8LmdwEN3yhcjZ0cRsqvhSpwnpXVBb4yUSyIyGJesSQkS JvCxKaCm+wnCyr3xZL+PXK0tfEqTt53wMeVRG0PlMd+zBdxoZo4KbNlX1I2klV9H Mi5J2mUsyoGFaJp3MOqFNTn3WQ3jFNOFFKlFhlUHzsm72Nth4GnBC1W95QxKNAOG Peut78LWVdlCcgnN7IPXL1/MZLUVC0WMf7Q6UbMQCZy+dK6RPIzliN5Fipmah2Re BpH5E0UycoL7NJo8aFhavSm+LbzYa+cFrEVCeXRLDxjjuqRwTso= =maDS -----END PGP SIGNATURE----- Merge tag 'linux-kselftest-kunit-fixes-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull KUnit fixes from Shuah Khan: "A single fix to an error seen on qemu due to a missing import" * tag 'linux-kselftest-kunit-fixes-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: tool: Import missing importlib.abc	2022-02-02 10:00:08 -08:00
Linus Torvalds	3e5832e923	Pin control fixes for the v5.17 kernel cycle: - Fix up group name building on the Intel Thunderbay - Fix interrupt problems on the Intel Cherryview - Fix some pin data on the Sunxi H616 - Fix up the CONFIG_PINCTRL_ST Kconfig sort order as noted during the merge window - Fix an unexpected interrupt problem on the Intel Sunrisepoint - Fix a glitch when updating IRQ flags on all Intel pin controllers - Revert a Zynqmp patch to unify the pin naming, let's find some better solution - Fix some error paths in the Broadcom BCM2835 driver - Fix a Kconfig problem pertaining to the BCM63XX drivers - Fix the regmap support in the Microchip SGPIO driver -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAmH5x5MACgkQQRCzN7AZ XXPNeQ/9EaMOGK6pE4zok8R0QJ9jChcor07HeOJc0jr/1lKixdpxK/edaugQZRFs CSqdMsHKashPimMn0X//IHhGg3PCC70FFIh0TA1F2dThROqH0JD3j59MUdSkLcCd OMtuIpeRZrhuygJam3MYJETzrI/QeQznUZarri+YJ0ba/Me5XaEY+QWkCB8u7ZLY SsG1p9LRs1PxNvlBMk+QrTg3nMcV3ZhVtEg7soQpNw08oUfiwDvTwkO1pGX/0ntD R0vCvgfaqX1w4v7eiVP4zUj2T7tDxU14WTCDEGbsLr6Z6vzYS7Wzw/tMw7h5iDwX T86CoxD1Yj3RaJplFMFyW9ZN3HHJvISWhujx/EmX871lPwDsHn8zZsS5WSyfiYRb Qddiu3gg4nBBbfeDwP5lcy4ZrHsszAy12Zv3OZTwVkapWxfaKYd2+QTJI+RPrSSS 3F+hAH1cHrOQY0sfGUTI6tmwovccQnNv1qk/IuQwxtDLlWPSyduLH8mmhgLb9wMR AQF6lnlW5M59CDqTN1v/trXC7lJyM4lpSnRxek9rMqQrhy+JWHUCWZ1Je+mk8R0k 03eRmaFHLWhzoQ8ZLToO16He9WI+VFx9KOqoHHUqhBjeBbE69S46PwY3jev3k6YW bViDImOI6DhCeAnu6TH18LOq1UGbPjZ0slmLjWaARPwtC69X3ZE= =3/r6 -----END PGP SIGNATURE----- Merge tag 'pinctrl-v5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "Most interesting and urgent is the Intel stuff affecting Chromebooks and laptops. - Fix up group name building on the Intel Thunderbay - Fix interrupt problems on the Intel Cherryview - Fix some pin data on the Sunxi H616 - Fix up the CONFIG_PINCTRL_ST Kconfig sort order as noted during the merge window - Fix an unexpected interrupt problem on the Intel Sunrisepoint - Fix a glitch when updating IRQ flags on all Intel pin controllers - Revert a Zynqmp patch to unify the pin naming, let's find some better solution - Fix some error paths in the Broadcom BCM2835 driver - Fix a Kconfig problem pertaining to the BCM63XX drivers - Fix the regmap support in the Microchip SGPIO driver" * tag 'pinctrl-v5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: microchip-sgpio: Fix support for regmap pinctrl: bcm63xx: fix unmet dependency on REGMAP for GPIO_REGMAP pinctrl: bcm2835: Fix a few error paths pinctrl: zynqmp: Revert "Unify pin naming" pinctrl: intel: Fix a glitch when updating IRQ flags on a preconfigured line pinctrl: intel: fix unexpected interrupt pinctrl: Place correctly CONFIG_PINCTRL_ST in the Makefile pinctrl: sunxi: Fix H616 I2S3 pin data pinctrl: cherryview: Trigger hwirq0 for interrupt-lines without a mapping pinctrl: thunderbay: rework loops looking for groups names pinctrl: thunderbay: comment process of building functions a bit	2022-02-02 09:50:17 -08:00
Steen Hegelund	81eb8b0b18	net: sparx5: do not refer to skb after passing it on Do not try to use any SKB fields after the packet has been passed up in the receive stack. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Link: https://lore.kernel.org/r/20220202083039.3774851-1-steen.hegelund@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-02 09:07:22 -08:00
Rafael J. Wysocki	52dae93f3b	drivers: net: Replace acpi_bus_get_device() Replace acpi_bus_get_device() that is going to be dropped with acpi_fetch_acpi_dev(). No intentional functional impact. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/11918902.O9o76ZdvQC@kreacher Link: https://lore.kernel.org/r/11920660.O9o76ZdvQC@kreacher Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-02 08:05:06 -08:00
Vratislav Bendel	186edf7e36	selinux: fix double free of cond_list on error paths On error path from cond_read_list() and duplicate_policydb_cond_list() the cond_list_destroy() gets called a second time in caller functions, resulting in NULL pointer deref. Fix this by resetting the cond_list_len to 0 in cond_list_destroy(), making subsequent calls a noop. Also consistently reset the cond_list pointer to NULL after freeing. Cc: stable@vger.kernel.org Signed-off-by: Vratislav Bendel <vbendel@redhat.com> [PM: fix line lengths in the description] Signed-off-by: Paul Moore <paul@paul-moore.com>	2022-02-02 11:02:10 -05:00
Dmitry V. Levin	c86d86131a	Partially revert "net/smc: Add netlink net namespace support" The change of sizeof(struct smc_diag_linkinfo) by commit 79d39fc503b4 ("net/smc: Add netlink net namespace support") introduced an ABI regression: since struct smc_diag_lgrinfo contains an object of type "struct smc_diag_linkinfo", offset of all subsequent members of struct smc_diag_lgrinfo was changed by that change. As result, applications compiled with the old version of struct smc_diag_linkinfo will receive garbage in struct smc_diag_lgrinfo.role if the kernel implements this new version of struct smc_diag_linkinfo. Fix this regression by reverting the part of commit 79d39fc503b4 that changes struct smc_diag_linkinfo. After all, there is SMC_GEN_NETLINK interface which is good enough, so there is probably no need to touch the smc_diag ABI in the first place. Fixes: 79d39fc503b4 ("net/smc: Add netlink net namespace support") Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Reviewed-by: Karsten Graul <kgraul@linux.ibm.com> Link: https://lore.kernel.org/r/20220202030904.GA9742@altlinux.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-02 07:42:41 -08:00
Akhmat Karakotov	5903123f66	tcp: Use BPF timeout setting for SYN ACK RTO When setting RTO through BPF program, some SYN ACK packets were unaffected and continued to use TCP_TIMEOUT_INIT constant. This patch adds timeout option to struct request_sock. Option is initialized with TCP_TIMEOUT_INIT and is reassigned through BPF using tcp_timeout_init call. SYN ACK retransmits now use newly added timeout option. Signed-off-by: Akhmat Karakotov <hmukos@yandex-team.ru> Acked-by: Martin KaFai Lau <kafai@fb.com> v2: - Add timeout option to struct request_sock. Do not call tcp_timeout_init on every syn ack retransmit. v3: - Use unsigned long for min. Bound tcp_timeout_init to TCP_RTO_MAX. v4: - Refactor duplicate code by adding reqsk_timeout function. Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:45:18 +00:00
David S. Miller	0b6b0d3113	Merge branch 'qca8k-mdio' Ansuel Smith says: ==================== Add support for qca8k mdio rw in Ethernet packet The main reason for this is that we notice some routing problem in the switch and it seems assisted learning is needed. Considering mdio is quite slow due to the indirect write using this Ethernet alternative way seems to be quicker. The qca8k switch supports a special way to pass mdio read/write request using specially crafted Ethernet packet. This works by putting some defined data in the Ethernet header where the mac source and dst should be placed. The Ethernet type header is set to qca header and is set to a mdio read/write type. This is used to communicate to the switch that this is a special packet and should be parsed differently. Currently we use Ethernet packet for - MIB counter - mdio read/write configuration - phy read/write for each port Current implementation of this use completion API to wait for the packet to be processed by the tagger and has a timeout that fallback to the legacy mdio way and mutex to enforce one transaction at time. We now have connect()/disconnect() ops for the tagger. They are used to allocate priv data in the dsa priv. The header still has to be put in global include to make it usable by a dsa driver. They are called when the tag is connect to the dst and the data is freed using discconect on tagger change. (if someone wonder why the bind function is put at in the general setup function it's because tag is set in the cpu port where the notifier is still not available and we require the notifier to sen the tag_proto_connect() event. We now have a tag_proto_connect() for the dsa driver used to put additional data in the tagger priv (that is actually the dsa priv). This is called using a switch event DSA_NOTIFIER_TAG_PROTO_CONNECT. Current use for this is adding handler for the Ethernet packet to keep the tagger code as dumb as possible. The tagger priv implement only the handler for the special packet. All the other stuff is placed in the qca8k_priv and the tagger has to access it under lock. We use the new API from Vladimir to track if the master port is operational or not. We had to track many thing to reach a usable state. Checking if the port is UP is not enough and tracking a NETDEV_CHANGE is also not enough since it use also for other task. The correct way was both track for interface UP and if a qdisc was assigned to the interface. That tells us the port (and the tagger indirectly) is ready to accept and process packet. I tested this with multicpu port and with port6 set as the unique port and it's sad. It seems they implemented this feature in a bad way and this is only supported with cpu port0. When cpu port6 is the unique port, the switch doesn't send ack packet. With multicpu port, packet ack are not duplicated and only cpu port0 sends them. This is the same for the MIB counter. For this reason this feature is enabled only when cpu port0 is enabled and operational. v8: - Reworked to rolling counter for the seq_num - Reworked the hi/lo cache patch - Fix multiple missing skb free and mutex lock errors - Fix some spelling mistake - Add macro build check for mgmt packet size - Change some struct naming to make them more descriptive v7: - Rebase on net-next changes - Add bulk patches to speedup this even more v6: - Fix some error in ethtool handler caused by rebase/cleanup v5: - Adapt to new API fixes - Fix a wrong logic for noop - Add additional lock for master_state change - Limit mdio Ethernet to cpu port0 (switch limitation) - Add priority to these special packet - Move mdio cache to qca8k_priv v4: - Remove duplicate patch sent by mistake. v3: - Include MIB with Ethernet packet. - Include phy read/write with Ethernet packet. - Reorganize code with new API. - Introuce master tracking by Vladimir v2: - Address all suggestion from Vladimir. Try to generilize this with connect/disconnect function from the tagger and tag_proto_connect for the driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:44:00 +00:00
Ansuel Smith	4f3701fc59	net: dsa: qca8k: introduce qca8k_bulk_read/write function Introduce qca8k_bulk_read/write() function to use mgmt Ethernet way to read/write packet in bulk. Make use of this new function in the fdb function and while at it reduce the reg for fdb_read from 4 to 3 as the max bit for the ARL(fdb) table is 83 bits. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:44:00 +00:00
Ansuel Smith	90386223f4	net: dsa: qca8k: add support for larger read/write size with mgmt Ethernet mgmt Ethernet packet can read/write up to 16byte at times. The len reg is limited to 15 (0xf). The switch actually sends and accepts data in 4 different steps of len values. Len steps: - 0: nothing - 1-4: first 4 byte - 5-6: first 12 byte - 7-15: all 16 byte In the alloc skb function we check if the len is 16 and we fix it to a len of 15. It the read/write function interest to extract the real asked data. The tagger handler will always copy the fully 16byte with a READ command. This is useful for some big regs like the fdb reg that are more than 4byte of data. This permits to introduce a bulk function that will send and request the entire entry in one go. Write function is changed and it does now require to pass the pointer to val to also handle array val. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:44:00 +00:00
Ansuel Smith	2481d206fa	net: dsa: qca8k: cache lo and hi for mdio write From Documentation, we can cache lo and hi the same way we do with the page. This massively reduce the mdio write as 3/4 of the time as we only require to write the lo or hi part for a mdio write. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:44:00 +00:00
Ansuel Smith	4264350acb	net: dsa: qca8k: move page cache to driver priv There can be multiple qca8k switch on the same system. Move the static qca8k_current_page to qca8k_priv and make it specific for each switch. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:43:59 +00:00
Ansuel Smith	2cd5485663	net: dsa: qca8k: add support for phy read/write with mgmt Ethernet Use mgmt Ethernet also for phy read/write if availabale. Use a different seq number to make sure we receive the correct packet. On any error, we fallback to the legacy mdio read/write. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:43:59 +00:00
Ansuel Smith	5c957c7ca7	net: dsa: qca8k: add support for mib autocast in Ethernet packet The switch can autocast MIB counter using Ethernet packet. Add support for this and provide a handler for the tagger. The switch will send packet with MIB counter for each port, the switch will use completion API to wait for the correct packet to be received and will complete the task only when each packet is received. Although the handler will drop all the other packet, we still have to consume each MIB packet to complete the request. This is done to prevent mixed data with concurrent ethtool request. connect_tag_protocol() is used to add the handler to the tag_qca tagger, master_state_change() use the MIB lock to make sure no MIB Ethernet is in progress. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:43:59 +00:00
Ansuel Smith	5950c7c0a6	net: dsa: qca8k: add support for mgmt read/write in Ethernet packet Add qca8k side support for mgmt read/write in Ethernet packet. qca8k supports some specially crafted Ethernet packet that can be used for mgmt read/write instead of the legacy method uart/internal mdio. This add support for the qca8k side to craft the packet and enqueue it. Each port and the qca8k_priv have a special struct to put data in it. The completion API is used to wait for the packet to be received back with the requested data. The various steps are: 1. Craft the special packet with the qca hdr set to mgmt read/write mode. 2. Set the lock in the dedicated mgmt struct. 3. Increment the seq number and set it in the mgmt pkt 4. Reinit the completion. 5. Enqueue the packet. 6. Wait the packet to be received. 7. Use the data set by the tagger to complete the mdio operation. If the completion timeouts or the ack value is not true, the legacy mdio way is used. It has to be considered that in the initial setup mdio is still used and mdio is still used until DSA is ready to accept and tag packet. tag_proto_connect() is used to fill the required handler for the tagger to correctly parse and elaborate the special Ethernet mdio packet. Locking is added to qca8k_master_change() to make sure no mgmt Ethernet are in progress. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:43:59 +00:00
Ansuel Smith	cddbec1946	net: dsa: qca8k: add tracking state of master port MDIO/MIB Ethernet require the master port and the tagger availabale to correctly work. Use the new api master_state_change to track when master is operational or not and set a bool in qca8k_priv. We cache the first cached master available and we check if other cpu port are operational when the cached one goes down. This cached master will later be used by mdio read/write and mib request to correctly use the working function. qca8k implementation for MDIO/MIB Ethernet is bad. CPU port0 is the only one that answers with the ack packet or sends MIB Ethernet packets. For this reason the master_state_change ignore CPU port6 and only checks CPU port0 if it's operational and enables this mode. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:43:59 +00:00
Ansuel Smith	31eb6b4386	net: dsa: tag_qca: add support for handling mgmt and MIB Ethernet packet Add connect/disconnect helper to assign private struct to the DSA switch. Add support for Ethernet mgmt and MIB if the DSA driver provide an handler to correctly parse and elaborate the data. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:43:59 +00:00
Ansuel Smith	18be654a43	net: dsa: tag_qca: add define for handling MIB packet Add struct to correctly parse a mib Ethernet packet. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:43:59 +00:00
Ansuel Smith	c2ee8181fd	net: dsa: tag_qca: add define for handling mgmt Ethernet packet Add all the required define to prepare support for mgmt read/write in Ethernet packet. Any packet of this type has to be dropped as the only use of these special packet is receive ack for an mgmt write request or receive data for an mgmt read request. A struct is used that emulates the Ethernet header but is used for a different purpose. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:43:59 +00:00
Ansuel Smith	101c04c346	net: dsa: tag_qca: enable promisc_on_master flag Ethernet MDIO packets are non-standard and DSA master expects the first 6 octets to be the MAC DA. To address these kind of packet, enable promisc_on_master flag for the tagger. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:43:59 +00:00
Ansuel Smith	3ec762fb13	net: dsa: tag_qca: move define to include linux/dsa Move tag_qca define to include dir linux/dsa as the qca8k require access to the tagger define to support in-band mdio read/write using ethernet packet. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:43:59 +00:00
Ansuel Smith	6b04582992	net: dsa: tag_qca: convert to FIELD macro Convert driver to FIELD macro to drop redundant define. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:43:59 +00:00
Vladimir Oltean	e83d565378	net: dsa: replay master state events in dsa_tree_{setup,teardown}_master In order for switch driver to be able to make simple and reliable use of the master tracking operations, they must also be notified of the initial state of the DSA master, not just of the changes. This is because they might enable certain features only during the time when they know that the DSA master is up and running. Therefore, this change explicitly checks the state of the DSA master under the same rtnl_mutex as we were holding during the dsa_master_setup() and dsa_master_teardown() call. The idea being that if the DSA master became operational in between the moment in which it became a DSA master (dsa_master_setup set dev->dsa_ptr) and the moment when we checked for the master being up, there is a chance that we would emit a ->master_state_change() call with no actual state change. We need to avoid that by serializing the concurrent netdevice event with us. If the netdevice event started before, we force it to finish before we begin, because we take rtnl_lock before making netdev_uses_dsa() return true. So we also handle that early event and do nothing on it. Similarly, if the dev_open() attempt is concurrent with us, it will attempt to take the rtnl_mutex, but we're holding it. We'll see that the master flag IFF_UP isn't set, then when we release the rtnl_mutex we'll process the NETDEV_UP notifier. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:43:59 +00:00
Vladimir Oltean	295ab96f47	net: dsa: provide switch operations for tracking the master state Certain drivers may need to send management traffic to the switch for things like register access, FDB dump, etc, to accelerate what their slow bus (SPI, I2C, MDIO) can already do. Ethernet is faster (especially in bulk transactions) but is also more unreliable, since the user may decide to bring the DSA master down (or not bring it up), therefore severing the link between the host and the attached switch. Drivers needing Ethernet-based register access already should have fallback logic to the slow bus if the Ethernet method fails, but that fallback may be based on a timeout, and the I/O to the switch may slow down to a halt if the master is down, because every Ethernet packet will have to time out. The driver also doesn't have the option to turn off Ethernet-based I/O momentarily, because it wouldn't know when to turn it back on. Which is where this change comes in. By tracking NETDEV_CHANGE, NETDEV_UP and NETDEV_GOING_DOWN events on the DSA master, we should know the exact interval of time during which this interface is reliably available for traffic. Provide this information to switches so they can use it as they wish. An helper is added dsa_port_master_is_operational() to check if a master port is operational. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:43:59 +00:00
David S. Miller	c8ff576e4e	mlx5-fixes-2022-02-01 -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmH6D8AACgkQSD+KveBX +j6E2ggA0cntPZ4neBnDk/fg09qVW74/1VUc3Yh9lRGTnOLYUubbeAuU3WKvUTo/ CngDVS8qNJ7bTf5PTXF4+UlpZSok0yVD4ReMgNHln5mnEONvAuiRQdno1amIQ/AN 6OKk9cy+Mn/ua8XFu75iTCJ9YJuR4HsZowE+/rTHaWGU/cFNMyzSFcQwtnz4aS9G 3sDTowblDtinvSLRN/RS5IyhEfPB4zII4HZEtvM/obobYk40FxkwZ4qWw1VY5ush PZYuqDyXF12rKUvJI1GGKO8mWgyUrmori/VjBPt18uaSK5Om0V6pWNTHjvw3UTrz Q3ypiQS+VPdFB0bhkqfaIxfnG4G6iw== =FwNK -----END PGP SIGNATURE----- Merge tag 'mlx5-fixes-2022-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2022-02-01 This series provides bug fixes to mlx5 driver. Please pull and let me know if there is any problem. Sorry about the long series, but I had to move the top two patches from net-next to net to help avoiding a build break when kspp branch is merged into linus-next on next merge window. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-02 14:19:38 +00:00
Jakub Kicinski	3aa430d33b	Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-02-01 This series contains updates to e1000e driver only. Sasha removes CSME handshake with TGL platform as this is not supported and is causing hardware unit hangs to be reported. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: e1000e: Handshake with CSME starts from ADL platforms e1000e: Separate ADP board type from TGP ==================== Link: https://lore.kernel.org/r/20220201173754.580305-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-01 21:03:15 -08:00
Kees Cook	ad5185735f	net/mlx5e: Avoid field-overflowing memcpy() In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally writing across neighboring fields. Use flexible arrays instead of zero-element arrays (which look like they are always overflowing) and split the cross-field memcpy() into two halves that can be appropriately bounds-checked by the compiler. We were doing: #define ETH_HLEN 14 #define VLAN_HLEN 4 ... #define MLX5E_XDP_MIN_INLINE (ETH_HLEN + VLAN_HLEN) ... struct mlx5e_tx_wqe wqe = mlx5_wq_cyc_get_wqe(wq, pi); ... struct mlx5_wqe_eth_seg eseg = &wqe->eth; struct mlx5_wqe_data_seg dseg = wqe->data; ... memcpy(eseg->inline_hdr.start, xdptxd->data, MLX5E_XDP_MIN_INLINE); target is wqe->eth.inline_hdr.start (which the compiler sees as being 2 bytes in size), but copying 18, intending to write across start (really vlan_tci, 2 bytes). The remaining 16 bytes get written into wqe->data[0], covering byte_count (4 bytes), lkey (4 bytes), and addr (8 bytes). struct mlx5e_tx_wqe { struct mlx5_wqe_ctrl_seg ctrl; / 0 16 / struct mlx5_wqe_eth_seg eth; / 16 16 / struct mlx5_wqe_data_seg data[]; / 32 0 / / size: 32, cachelines: 1, members: 3 / / last cacheline: 32 bytes / }; struct mlx5_wqe_eth_seg { u8 swp_outer_l4_offset; / 0 1 / u8 swp_outer_l3_offset; / 1 1 / u8 swp_inner_l4_offset; / 2 1 / u8 swp_inner_l3_offset; / 3 1 / u8 cs_flags; / 4 1 / u8 swp_flags; / 5 1 / __be16 mss; / 6 2 / __be32 flow_table_metadata; / 8 4 / union { struct { __be16 sz; / 12 2 / u8 start[2]; / 14 2 / } inline_hdr; / 12 4 / struct { __be16 type; / 12 2 / __be16 vlan_tci; / 14 2 / } insert; / 12 4 / __be32 trailer; / 12 4 / }; / 12 4 / / size: 16, cachelines: 1, members: 9 / / last cacheline: 16 bytes / }; struct mlx5_wqe_data_seg { __be32 byte_count; / 0 4 / __be32 lkey; / 4 4 / __be64 addr; / 8 8 / / size: 16, cachelines: 1, members: 3 / / last cacheline: 16 bytes */ }; So, split the memcpy() so the compiler can reason about the buffer sizes. "pahole" shows no size nor member offset changes to struct mlx5e_tx_wqe nor struct mlx5e_umr_wqe. "objdump -d" shows no meaningful object code changes (i.e. only source line number induced differences and optimizations). Fixes: b5503b994ed5 ("net/mlx5e: XDP TX forwarding support") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-01 20:59:43 -08:00
Kees Cook	6d5c900eb6	net/mlx5e: Use struct_group() for memcpy() region In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally writing across neighboring fields. Use struct_group() in struct vlan_ethhdr around members h_dest and h_source, so they can be referenced together. This will allow memcpy() and sizeof() to more easily reason about sizes, improve readability, and avoid future warnings about writing beyond the end of h_dest. "pahole" shows no size nor member offset changes to struct vlan_ethhdr. "objdump -d" shows no object code changes. Fixes: 34802a42b352 ("net/mlx5e: Do not modify the TX SKB") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-01 20:59:43 -08:00
Roi Dayan	5b209d1a22	net/mlx5e: Avoid implicit modify hdr for decap drop rule Currently the driver adds implicit modify hdr action for decap rules on tunnel devices if the port is an ovs port. This is also done if the action is drop and makes the modify hdr redundant and also the FW doesn't support it and will generate a syndrome. kernel: mlx5_core 0000:08:00.0: mlx5_cmd_check:777:(pid 102063): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x8708c3) Fix it by adding the implicit modify hdr only for fwd actions. Fixes: b16eb3c81fe2 ("net/mlx5: Support internal port as decap route device") Fixes: 077cdda764c7 ("net/mlx5e: TC, Fix memory leak with rules with internal port") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-01 20:59:43 -08:00
Raed Salem	de47db0cf7	net/mlx5e: IPsec: Fix tunnel mode crypto offload for non TCP/UDP traffic IPsec Tunnel mode crypto offload software parser (SWP) setting in data path currently always set the inner L4 offset regardless of the encapsulated L4 header type and whether it exists in the first place, this breaks non TCP/UDP traffic as such. Set the SWP inner L4 offset only when the IPsec tunnel encapsulated L4 header protocol is TCP/UDP. While at it fix inner ip protocol read for setting MLX5_ETH_WQE_SWP_INNER_L4_UDP flag to address the case where the ip header protocol is IPv6. Fixes: f1267798c980 ("net/mlx5: Fix checksum issue of VXLAN and IPsec crypto offload") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-01 20:59:43 -08:00
Raed Salem	5352859b3b	net/mlx5e: IPsec: Fix crypto offload for non TCP/UDP encapsulated traffic IPsec crypto offload always set the ethernet segment checksum flags with the inner L4 header checksum flag enabled for encapsulated IPsec offloaded packet regardless of the encapsulated L4 header type, and even if it doesn't exists in the first place, this breaks non TCP/UDP traffic as such. Set the inner L4 checksum flag only when the encapsulated L4 header protocol is TCP/UDP using software parser swp_inner_l4_offset field as indication. Fixes: 5cfb540ef27b ("net/mlx5e: Set IPsec WAs only in IP's non checksum partial case.") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-01 20:59:42 -08:00
Maxim Mikityanskiy	736dfe4e68	net/mlx5e: Don't treat small ceil values as unlimited in HTB offload The hardware spec defines max_average_bw == 0 as "unlimited bandwidth". max_average_bw is calculated as `ceil / BYTES_IN_MBIT`, which can become 0 when ceil is small, leading to an undesired effect of having no bandwidth limit. This commit fixes it by rounding up small values of ceil to 1 Mbit/s. Fixes: 214baf22870c ("net/mlx5e: Support HTB offload") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-01 20:59:42 -08:00
Maor Dickman	d8e5883d69	net/mlx5: E-Switch, Fix uninitialized variable modact The variable modact is not initialized before used in command modify header allocation which can cause command to fail. Fix by initializing modact with zeros. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: 8f1e0b97cc70 ("net/mlx5: E-Switch, Mark miss packets with new chain id mapping") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-01 20:59:42 -08:00
Maor Dickman	ec41332e02	net/mlx5e: Fix handling of wrong devices during bond netevent Current implementation of bond netevent handler only check if the handled netdev is VF representor and it missing a check if the VF representor is on the same phys device of the bond handling the netevent. Fix by adding the missing check and optimizing the check if the netdev is VF representor so it will not access uninitialized private data and crashes. BUG: kernel NULL pointer dereference, address: 000000000000036c PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI Workqueue: eth3bond0 bond_mii_monitor [bonding] RIP: 0010:mlx5e_is_uplink_rep+0xc/0x50 [mlx5_core] RSP: 0018:ffff88812d69fd60 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff8881cf800000 RCX: 0000000000000000 RDX: ffff88812d69fe10 RSI: 000000000000001b RDI: ffff8881cf800880 RBP: ffff8881cf800000 R08: 00000445cabccf2b R09: 0000000000000008 R10: 0000000000000004 R11: 0000000000000008 R12: ffff88812d69fe10 R13: 00000000fffffffe R14: ffff88820c0f9000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88846fb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000036c CR3: 0000000103d80006 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: mlx5e_eswitch_uplink_rep+0x31/0x40 [mlx5_core] mlx5e_rep_is_lag_netdev+0x94/0xc0 [mlx5_core] mlx5e_rep_esw_bond_netevent+0xeb/0x3d0 [mlx5_core] raw_notifier_call_chain+0x41/0x60 call_netdevice_notifiers_info+0x34/0x80 netdev_lower_state_changed+0x4e/0xa0 bond_mii_monitor+0x56b/0x640 [bonding] process_one_work+0x1b9/0x390 worker_thread+0x4d/0x3d0 ? rescuer_thread+0x350/0x350 kthread+0x124/0x150 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x1f/0x30 Fixes: 7e51891a237f ("net/mlx5e: Use netdev events to set/del egress acl forward-to-vport rule") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-01 20:59:41 -08:00
Khalid Manaa	7957837b81	net/mlx5e: Fix broken SKB allocation in HW-GRO In case the HW doesn't perform header-data split, it will write the whole packet into the data buffer in the WQ, in this case the SHAMPO CQE handler couldn't use the header entry to build the SKB, instead it should allocate a new memory to build the SKB using the function: mlx5e_skb_from_cqe_mpwrq_nonlinear. Fixes: f97d5c2a453e ("net/mlx5e: Add handle SHAMPO cqe support") Signed-off-by: Khalid Manaa <khalidm@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-02-01 20:59:41 -08:00

1 2 3 4 5 ...

1073155 Commits