linux

iv/linux

Author	SHA1	Message	Date
Stanislav Fomichev	26ba7c3f13	MAINTAINERS: mailmap: Update Stanislav's email address Moving to personal address for upstream work. Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20240612225334.41869-1-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-06-12 17:58:36 -07:00
Petr Pavlu	14a20e5b4a	net/ipv6: Fix the RT cache flush via sysctl using a previous delay The net.ipv6.route.flush system parameter takes a value which specifies a delay used during the flush operation for aging exception routes. The written value is however not used in the currently requested flush and instead utilized only in the next one. A problem is that ipv6_sysctl_rtcache_flush() first reads the old value of net->ipv6.sysctl.flush_delay into a local delay variable and then calls proc_dointvec() which actually updates the sysctl based on the provided input. Fix the problem by switching the order of the two operations. Fixes: `4990509f19` ("[NETNS][IPV6]: Make sysctls route per namespace.") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240607112828.30285-1-petr.pavlu@suse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-12 17:51:35 -07:00
Jakub Kicinski	d92589f8fd	netfilter pull request 24-06-11 -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmZoxygACgkQ1V2XiooU IOSznA/+IofQMauyDxkynJAVVQSPH0Ee8kbeGj0jJWVTZVtLejZ/nLenGbsVGL16 SPv6QbiHIRCCbJ/Py+LlkXJ8iXF0ka2bu8p5IYQ2ic949JSay32XAaK4EZdQTtTS 3gx3cM2PKIYY4yi9K0qDpMsC7ZyKjAlVVjOrBPiP8viYtXcgoVbMq1/E8kAUf/G1 YB+occVHk6NDY6MJE+ooYwhssv4qSaPNHNHmQDQFHhQ4cdVKzT93do55Hu4/K7DN 6LlXAW+YpVeSmopwARrc+FmchwxyIoSQsh26Yn2Q/LiE9kQGeLQ34+nZjtmp/4aD MkThTSk+ImJO1kuLibC2m84bg94c0dfDk/p5Gbcr1l0DxqOseYOz1hfnNTvfcBE8 4m30+jdtFabDy2pJKIEC730+UW5TrH+0f8HWazGOYJxvJOpip/ZiVjzcp57EybJS y3GxExVWGab3hl+w2wfFfM5uHWosfXfJzKjdKMJNswpcZ89QR4/GqAZAvPMh8ljW dxhwcRQ6IEy6B6yVnQ9dq9W9aGfsojIHqcK3HXdp3xvSnW2ZcRD89EL+nw1xfRzw gl/89/EnTjKywCAJ1XhKh/WwFT5r+b8RGqDTH4aeT4OBrR0v2/Z9RUmXW9wF+C0t nll1UlwdZqE0E0VvhhvwGGgF0Fr8nfLC1taIE0YW1/u9DxWleHs= =yxF0 -----END PGP SIGNATURE----- Merge tag 'nf-24-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: Patch #1 fixes insufficient sanitization of netlink attributes for the inner expression which can trigger nul-pointer dereference, from Davide Ornaghi. Patch #2 address a report that there is a race condition between namespace cleanup and the garbage collection of the list:set type. This patch resolves this issue with other minor issues as well, from Jozsef Kadlecsik. Patch #3 ip6_route_me_harder() ignores flowlabel/dsfield when ip dscp has been mangled, this unbreaks ip6 dscp set $v, from Florian Westphal. All of these patches address issues that are present in several releases. * tag 'nf-24-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: Use flowlabel flow key when re-routing mangled packets netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type netfilter: nft_inner: validate mandatory meta and payload ==================== Link: https://lore.kernel.org/r/20240611220323.413713-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-12 16:29:00 -07:00
Xiaolei Wang	be27b89652	net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters The current cbs parameter depends on speed after uplinking, which is not needed and will report a configuration error if the port is not initially connected. The UAPI exposed by tc-cbs requires userspace to recalculate the send slope anyway, because the formula depends on port_transmit_rate (see man tc-cbs), which is not an invariant from tc's perspective. Therefore, we use offload->sendslope and offload->idleslope to derive the original port_transmit_rate from the CBS formula. Fixes: `1f705bc61a` ("net: stmmac: Add support for CBS QDISC") Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20240608143524.2065736-1-xiaolei.wang@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-11 19:58:18 -07:00
Joshua Washington	1b9f756344	gve: ignore nonrelevant GSO type bits when processing TSO headers TSO currently fails when the skb's gso_type field has more than one bit set. TSO packets can be passed from userspace using PF_PACKET, TUNTAP and a few others, using virtio_net_hdr (e.g., PACKET_VNET_HDR). This includes virtualization, such as QEMU, a real use-case. The gso_type and gso_size fields as passed from userspace in virtio_net_hdr are not trusted blindly by the kernel. It adds gso_type \|= SKB_GSO_DODGY to force the packet to enter the software GSO stack for verification. This issue might similarly come up when the CWR bit is set in the TCP header for congestion control, causing the SKB_GSO_TCP_ECN gso_type bit to be set. Fixes: `a57e5de476` ("gve: DQO: Add TX path") Signed-off-by: Joshua Washington <joshwash@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Suggested-by: Eric Dumazet <edumazet@google.com> Acked-by: Andrei Vagin <avagin@gmail.com> v2 - Remove unnecessary comments, remove line break between fixes tag and signoffs. v3 - Add back unrelated empty line removal. Link: https://lore.kernel.org/r/20240610225729.2985343-1-joshwash@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-11 19:42:35 -07:00
Jakub Kicinski	f6b2f578df	bluetooth pull request for net: - hci_sync: Fix not using correct handle - L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ - L2CAP: fix connection setup in l2cap_connect -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmZnBfgZHGx1aXoudm9u LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKXIHD/45N5EpMuwOkHMeXDO9FnNa LKAD2t1weNDd8SOqu96jmyxxfLwkU2b/mrSkn5gM+y8g6NNrCH1B+Pp7yRzZE1C4 nMoxTUM26YY4AoznmWb56JgpTfXMsjUMAan6hZUnRdSKtycUOArDJdFwjACRwIkp KtpaUkRPq70kAV/OOtaB5aCn5GuCDIg10vyFhTD8ieP7d1DcRKyw2CJcnu+9s9ao KkvTiEICPGzmHOmOw5QXIUr5MUuhHfGsbk3oEE7dAgNd0E6yOgPa6Cs7KfNz3x9i gyCjJm9vOIr6xozCwIpwUhlPVbvH1aKY0LECLacK7vUYFTrcE6IimGi6HZcSqfsP G/WHMMEMJkLG53FeHVSSX9zzWN8Rvw0hBorW4EZsQm0RSbzhxQInMda05nSTJCos UuNSQCTTynkt27q/UqJj/NGAwkCL5f86UnrDk1Jb9O6r4QKlcQTxS40FSqXpy5nY 6scPyagLdIEWuWM9eUTqtHP0Ax8ROHfwGvGRONFvhTZmRAEcIU3VtwkXV2OGLVhy E+yZTsW3EUnKRQRL30/N2KdTIBgH6V6stn5Bf0mX5pFoykNxSf+scBtvEcRDyUKu 3pc47Q/iypzfa/VFDuXAXhx+IgKp+7d/cPpH+6MTpSGtZbyhkeK0eX112kPijF15 sRbbDuQYZg11aVPThMhC0A== =mmLZ -----END PGP SIGNATURE----- Merge tag 'for-net-2024-06-10' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - hci_sync: fix not using correct handle - L2CAP: fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ - L2CAP: fix connection setup in l2cap_connect * tag 'for-net-2024-06-10' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: fix connection setup in l2cap_connect Bluetooth: L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ Bluetooth: hci_sync: Fix not using correct handle ==================== Link: https://lore.kernel.org/r/20240610135803.920662-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-11 19:40:27 -07:00
Kory Maincent	144ba8580b	net: pse-pd: Use EOPNOTSUPP error code instead of ENOTSUPP ENOTSUPP is not a SUSV4 error code, prefer EOPNOTSUPP as reported by checkpatch script. Fixes: `18ff0bcda6` ("ethtool: add interface to interact with Ethernet Power Equipment") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://lore.kernel.org/r/20240610083426.740660-1-kory.maincent@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-11 19:33:37 -07:00
Florian Westphal	6f8f132cc7	netfilter: Use flowlabel flow key when re-routing mangled packets 'ip6 dscp set $v' in an nftables outpute route chain has no effect. While nftables does detect the dscp change and calls the reroute hook. But ip6_route_me_harder never sets the dscp/flowlabel: flowlabel/dsfield routing rules are ignored and no reroute takes place. Thanks to Yi Chen for an excellent reproducer script that I used to validate this change. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Reported-by: Yi Chen <yiche@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-06-11 18:46:04 +02:00
Jozsef Kadlecsik	4e7aaa6b82	netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type Lion Ackermann reported that there is a race condition between namespace cleanup in ipset and the garbage collection of the list:set type. The namespace cleanup can destroy the list:set type of sets while the gc of the set type is waiting to run in rcu cleanup. The latter uses data from the destroyed set which thus leads use after free. The patch contains the following parts: - When destroying all sets, first remove the garbage collectors, then wait if needed and then destroy the sets. - Fix the badly ordered "wait then remove gc" for the destroy a single set case. - Fix the missing rcu locking in the list:set type in the userspace test case. - Use proper RCU list handlings in the list:set type. The patch depends on `c1193d9bbb` (netfilter: ipset: Add list flush to cancel_gc). Fixes: `97f7cf1cd8` (netfilter: ipset: fix performance regression in swap operation) Reported-by: Lion Ackermann <nnamrec@gmail.com> Tested-by: Lion Ackermann <nnamrec@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-06-11 18:46:04 +02:00
Davide Ornaghi	c4ab9da85b	netfilter: nft_inner: validate mandatory meta and payload Check for mandatory netlink attributes in payload and meta expression when used embedded from the inner expression, otherwise NULL pointer dereference is possible from userspace. Fixes: `a150d122b6` ("netfilter: nft_meta: add inner match support") Fixes: `3a07327d10` ("netfilter: nft_inner: support for inner tunnel header matching") Signed-off-by: Davide Ornaghi <d.ornaghi97@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2024-06-11 18:46:04 +02:00
Eric Dumazet	36534d3c54	tcp: use signed arithmetic in tcp_rtx_probe0_timed_out() Due to timer wheel implementation, a timer will usually fire after its schedule. For instance, for HZ=1000, a timeout between 512ms and 4s has a granularity of 64ms. For this range of values, the extra delay could be up to 63ms. For TCP, this means that tp->rcv_tstamp may be after inet_csk(sk)->icsk_timeout whenever the timer interrupt finally triggers, if one packet came during the extra delay. We need to make sure tcp_rtx_probe0_timed_out() handles this case. Fixes: `e89688e3e9` ("net: tcp: fix unexcepted socket die when snd_wnd is 0") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Menglong Dong <imagedong@tencent.com> Acked-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://lore.kernel.org/r/20240607125652.1472540-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-10 19:50:10 -07:00
Jakub Kicinski	70b3c88cec	Merge branch 'mptcp-various-fixes' Matthieu Baerts says: ==================== mptcp: various fixes The different patches here are some unrelated fixes for MPTCP: - Patch 1 ensures 'snd_una' is initialised on connect in case of MPTCP fallback to TCP followed by retransmissions before the processing of any other incoming packets. A fix for v5.9+. - Patch 2 makes sure the RmAddr MIB counter is incremented, and only once per ID, upon the reception of a RM_ADDR. A fix for v5.10+. - Patch 3 doesn't update 'add addr' related counters if the connect() was not possible. A fix for v5.7+. - Patch 4 updates the mailmap file to add Geliang's new email address. ==================== Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-0-1ab9ddfa3d00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-10 19:49:14 -07:00
Geliang Tang	74acb250e1	mailmap: map Geliang's new email address Just like my other email addresses, map my new one to kernel.org account too. My new email address uses "last name, first name" format, which is different from my other email addresses. This mailmap is also used to indicate that it is actually the same person. Suggested-by: Mat Martineau <martineau@kernel.org> Suggested-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang@kernel.org> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-4-1ab9ddfa3d00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-10 19:49:11 -07:00
YonglongLi	40eec1795c	mptcp: pm: update add_addr counters after connect The creation of new subflows can fail for different reasons. If no subflow have been created using the received ADD_ADDR, the related counters should not be updated, otherwise they will never be decremented for events related to this ID later on. For the moment, the number of accepted ADD_ADDR is only decremented upon the reception of a related RM_ADDR, and only if the remote address ID is currently being used by at least one subflow. In other words, if no subflow can be created with the received address, the counter will not be decremented. In this case, it is then important not to increment pm.add_addr_accepted counter, and not to modify pm.accept_addr bit. Note that this patch does not modify the behaviour in case of failures later on, e.g. if the MP Join is dropped or rejected. The "remove invalid addresses" MP Join subtest has been modified to validate this case. The broadcast IP address is added before the "valid" address that will be used to successfully create a subflow, and the limit is decreased by one: without this patch, it was not possible to create the last subflow, because: - the broadcast address would have been accepted even if it was not usable: the creation of a subflow to this address results in an error, - the limit of 2 accepted ADD_ADDR would have then been reached. Fixes: `01cacb00b3` ("mptcp: add netlink-based PM") Cc: stable@vger.kernel.org Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: YonglongLi <liyonglong@chinatelecom.cn> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-3-1ab9ddfa3d00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-10 19:49:10 -07:00
YonglongLi	6a09788c1a	mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID The RmAddr MIB counter is supposed to be incremented once when a valid RM_ADDR has been received. Before this patch, it could have been incremented as many times as the number of subflows connected to the linked address ID, so it could have been 0, 1 or more than 1. The "RmSubflow" is incremented after a local operation. In this case, it is normal to tied it with the number of subflows that have been actually removed. The "remove invalid addresses" MP Join subtest has been modified to validate this case. A broadcast IP address is now used instead: the client will not be able to create a subflow to this address. The consequence is that when receiving the RM_ADDR with the ID attached to this broadcast IP address, no subflow linked to this ID will be found. Fixes: `7a7e52e38a` ("mptcp: add RM_ADDR related mibs") Cc: stable@vger.kernel.org Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: YonglongLi <liyonglong@chinatelecom.cn> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-2-1ab9ddfa3d00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-10 19:49:10 -07:00
Paolo Abeni	8031b58c3a	mptcp: ensure snd_una is properly initialized on connect This is strictly related to commit `fb7a0d3348` ("mptcp: ensure snd_nxt is properly initialized on connect"). It turns out that syzkaller can trigger the retransmit after fallback and before processing any other incoming packet - so that snd_una is still left uninitialized. Address the issue explicitly initializing snd_una together with snd_nxt and write_seq. Suggested-by: Mat Martineau <martineau@kernel.org> Fixes: `8fd738049a` ("mptcp: fallback in case of simultaneous connect") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/485 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-1-1ab9ddfa3d00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-10 19:49:10 -07:00
Johannes Berg	44180feacc	net/sched: initialize noop_qdisc owner When the noop_qdisc owner isn't initialized, then it will be 0, so packets will erroneously be regarded as having been subject to recursion as long as only CPU 0 queues them. For non-SMP, that's all packets, of course. This causes a change in what's reported to userspace, normally noop_qdisc would drop packets silently, but with this change the syscall returns -ENOBUFS if RECVERR is also set on the socket. Fix this by initializing the owner field to -1, just like it would be for dynamically allocated qdiscs by qdisc_alloc(). Fixes: `0f022d32c3` ("net/sched: Fix mirred deadlock on device recursion") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240607175340.786bfb938803.I493bf8422e36be4454c08880a8d3703cea8e421a@changeid Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-10 19:36:49 -07:00
Pauli Virtanen	c695439d19	Bluetooth: fix connection setup in l2cap_connect The amp_id argument of l2cap_connect() was removed in commit `84a4bb6548` ("Bluetooth: HCI: Remove HCI_AMP support") It was always called with amp_id == 0, i.e. AMP_ID_BREDR == 0x00 (ie. non-AMP controller). In the above commit, the code path for amp_id != 0 was preserved, although it should have used the amp_id == 0 one. Restore the previous behavior of the non-AMP code path, to fix problems with L2CAP connections. Fixes: `84a4bb6548` ("Bluetooth: HCI: Remove HCI_AMP support") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-06-10 09:48:30 -04:00
Luiz Augusto von Dentz	806a5198c0	Bluetooth: L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ This removes the bogus check for max > hcon->le_conn_max_interval since the later is just the initial maximum conn interval not the maximum the stack could support which is really 3200=4000ms. In order to pass GAP/CONN/CPUP/BV-05-C one shall probably enter values of the following fields in IXIT that would cause hci_check_conn_params to fail: TSPX_conn_update_int_min TSPX_conn_update_int_max TSPX_conn_update_peripheral_latency TSPX_conn_update_supervision_timeout Link: https://github.com/bluez/bluez/issues/847 Fixes: `e4b019515f` ("Bluetooth: Enforce validation on max value of connection interval") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-06-10 09:48:27 -04:00
Luiz Augusto von Dentz	86fbd9f63a	Bluetooth: hci_sync: Fix not using correct handle When setting up an advertisement the code shall always attempt to use the handle set by the instance since it may not be equal to the instance ID. Fixes: `e77f43d531` ("Bluetooth: hci_core: Fix not handling hdev->le_num_of_adv_sets=1") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-06-10 09:48:25 -04:00
David S. Miller	93792130a9	Merge branch 'geneve-fixes' Tariq Toukan says: ==================== geneve fixes This small patchset by Gal provides bug fixes to the geneve tunnels flows. Patch 1 fixes an incorrect value returned by the inner network header offset helper. Patch 2 fixes an issue inside the mlx5e tunneling flow. It 'happened' to be harmless so far, before applying patch 1. Series generated against: commit `d30d0e49da` ("Merge tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net") ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-10 13:18:09 +01:00
Gal Pressman	791b4089e3	net/mlx5e: Fix features validation check for tunneled UDP (non-VXLAN) packets Move the vxlan_features_check() call to after we verified the packet is a tunneled VXLAN packet. Without this, tunneled UDP non-VXLAN packets (for ex. GENENVE) might wrongly not get offloaded. In some cases, it worked by chance as GENEVE header is the same size as VXLAN, but it is obviously incorrect. Fixes: `e3cfc7e6b7` ("net/mlx5e: TX, Add geneve tunnel stateless offload support") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-10 13:18:09 +01:00
Gal Pressman	c6ae073f59	geneve: Fix incorrect inner network header offset when innerprotoinherit is set When innerprotoinherit is set, the tunneled packets do not have an inner Ethernet header. Change 'maclen' to not always assume the header length is ETH_HLEN, as there might not be a MAC header. This resolves issues with drivers (e.g. mlx5, in mlx5e_tx_tunnel_accel()) who rely on the skb inner network header offset to be correct, and use it for TX offloads. Fixes: `d8a6213d70` ("geneve: fix header validation in geneve[6]_xmit_skb") Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-10 13:18:08 +01:00
Andy Shevchenko	d029edefed	net dsa: qca8k: fix usages of device_get_named_child_node() The documentation for device_get_named_child_node() mentions this important point: " The caller is responsible for calling fwnode_handle_put() on the returned fwnode pointer. " Add fwnode_handle_put() to avoid leaked references. Fixes: `1e264f9d29` ("net: dsa: qca8k: add LEDs basic support") Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-10 13:12:14 +01:00
Eric Dumazet	d37fe4255a	tcp: fix race in tcp_v6_syn_recv_sock() tcp_v6_syn_recv_sock() calls ip6_dst_store() before inet_sk(newsk)->pinet6 has been set up. This means ip6_dst_store() writes over the parent (listener) np->dst_cookie. This is racy because multiple threads could share the same parent and their final np->dst_cookie could be wrong. Move ip6_dst_store() call after inet_sk(newsk)->pinet6 has been changed and after the copy of parent ipv6_pinfo. Fixes: `e994b2f0fb` ("tcp: do not lock listener to process SYN packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-10 13:11:22 +01:00
David Wei	5add2f7288	netdevsim: fix backwards compatibility in nsim_get_iflink() The default ndo_get_iflink() implementation returns the current ifindex of the netdev. But the overridden nsim_get_iflink() returns 0 if the current nsim is not linked, breaking backwards compatibility for userspace that depend on this behaviour. Fix the problem by returning the current ifindex if not linked to a peer. Fixes: `8debcf5832` ("netdevsim: add ndo_get_iflink() implementation") Reported-by: Yu Watanabe <watanabe.yu@gmail.com> Suggested-by: Yu Watanabe <watanabe.yu@gmail.com> Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-10 11:51:04 +01:00
Sagar Cheluvegowda	0579f27249	net: stmmac: dwmac-qcom-ethqos: Configure host DMA width Commit `070246e467` ("net: stmmac: Fix for mismatched host/device DMA address width") added support in the stmmac driver for platform drivers to indicate the host DMA width, but left it up to authors of the specific platforms to indicate if their width differed from the addr64 register read from the MAC itself. Qualcomm's EMAC4 integration supports only up to 36 bit width (as opposed to the addr64 register indicating 40 bit width). Let's indicate that in the platform driver to avoid a scenario where the driver will allocate descriptors of size that is supported by the CPU which in our case is 36 bit, but as the addr64 register is still capable of 40 bits the device will use two descriptors as one address. Fixes: `8c4d92e82d` ("net: stmmac: dwmac-qcom-ethqos: add support for emac4 on sa8775p platforms") Signed-off-by: Sagar Cheluvegowda <quic_scheluve@quicinc.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-09 15:52:52 +01:00
Aleksandr Mishin	c44711b786	liquidio: Adjust a NULL pointer handling path in lio_vf_rep_copy_packet In lio_vf_rep_copy_packet() pg_info->page is compared to a NULL value, but then it is unconditionally passed to skb_add_rx_frag() which looks strange and could lead to null pointer dereference. lio_vf_rep_copy_packet() call trace looks like: octeon_droq_process_packets octeon_droq_fast_process_packets octeon_droq_dispatch_pkt octeon_create_recv_info ...search in the dispatch_list... ->disp_fn(rdisp->rinfo, ...) lio_vf_rep_pkt_recv(struct octeon_recv_info *recv_info, ...) In this path there is no code which sets pg_info->page to NULL. So this check looks unneeded and doesn't solve potential problem. But I guess the author had reason to add a check and I have no such card and can't do real test. In addition, the code in the function liquidio_push_packet() in liquidio/lio_core.c does exactly the same. Based on this, I consider the most acceptable compromise solution to adjust this issue by moving skb_add_rx_frag() into conditional scope. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `1f233f3279` ("liquidio: switchdev support for LiquidIO NIC") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-07 14:22:19 +01:00
David S. Miller	dbfb886465	Merge branch 'hns3-fixes' Jijie Shao says: ==================== There are some bugfix for the HNS3 ethernet driver There are some bugfix for the HNS3 ethernet driver ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-07 12:20:28 +01:00
Jie Wang	968fde8384	net: hns3: add cond_resched() to hns3 ring buffer init process Currently hns3 ring buffer init process would hold cpu too long with big Tx/Rx ring depth. This could cause soft lockup. So this patch adds cond_resched() to the process. Then cpu can break to run other tasks instead of busy looping. Fixes: `a723fb8efe` ("net: hns3: refine for set ring parameters") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-07 12:20:28 +01:00
Yonglong Liu	12cda92021	net: hns3: fix kernel crash problem in concurrent scenario When link status change, the nic driver need to notify the roce driver to handle this event, but at this time, the roce driver may uninit, then cause kernel crash. To fix the problem, when link status change, need to check whether the roce registered, and when uninit, need to wait link update finish. Fixes: `45e92b7e4e` ("net: hns3: add calling roce callback function when link status change") Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-07 12:20:28 +01:00
Udit Kumar	b472b996a4	dt-bindings: net: dp8386x: Add MIT license along with GPL-2.0 Modify license to include dual licensing as GPL-2.0-only OR MIT license for TI specific phy header files. This allows for Linux kernel files to be used in other Operating System ecosystems such as Zephyr or FreeBSD. While at this, update the GPL-2.0 to be GPL-2.0-only to be in sync with latest SPDX conventions (GPL-2.0 is deprecated). While at this, update the TI copyright year to sync with current year to indicate license change. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Trent Piepho <tpiepho@impinj.com> Cc: Wadim Egorov <w.egorov@phytec.de> Cc: Kip Broadhurst <kbroadhurst@ti.com> Signed-off-by: Udit Kumar <u-kumar1@ti.com> Acked-by: Wadim Egorov <w.egorov@phytec.de> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-07 12:16:22 +01:00
Csókás, Bence	e96b293315	net: sfp: Always call `sfp_sm_mod_remove()` on remove If the module is in SFP_MOD_ERROR, `sfp_sm_mod_remove()` will not be run. As a consequence, `sfp_hwmon_remove()` is not getting run either, leaving a stale `hwmon` device behind. `sfp_sm_mod_remove()` itself checks `sfp->sm_mod_state` anyways, so this check was not really needed in the first place. Fixes: `d2e816c029` ("net: sfp: handle module remove outside state machine") Signed-off-by: "Csókás, Bence" <csokas.bence@prolan.hu> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240605084251.63502-1-csokas.bence@prolan.hu Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-06 17:34:03 -07:00
Linus Torvalds	d30d0e49da	Including fixes from BPF and big collection of fixes for WiFi core and drivers. Current release - regressions: - vxlan: fix regression when dropping packets due to invalid src addresses - bpf: fix a potential use-after-free in bpf_link_free() - xdp: revert support for redirect to any xsk socket bound to the same UMEM as it can result in a corruption - virtio_net: - add missing lock protection when reading return code from control_buf - fix false-positive lockdep splat in DIM - Revert "wifi: wilc1000: convert list management to RCU" - wifi: ath11k: fix error path in ath11k_pcic_ext_irq_config Previous releases - regressions: - rtnetlink: make the "split" NLM_DONE handling generic, restore the old behavior for two cases where we started coalescing those messages with normal messages, breaking sloppily-coded userspace - wifi: - cfg80211: validate HE operation element parsing - cfg80211: fix 6 GHz scan request building - mt76: mt7615: add missing chanctx ops - ath11k: move power type check to ASSOC stage, fix connecting to 6 GHz AP - ath11k: fix WCN6750 firmware crash caused by 17 num_vdevs - rtlwifi: ignore IEEE80211_CONF_CHANGE_RETRY_LIMITS - iwlwifi: mvm: fix a crash on 7265 Previous releases - always broken: - ncsi: prevent multi-threaded channel probing, a spec violation - vmxnet3: disable rx data ring on dma allocation failure - ethtool: init tsinfo stats if requested, prevent unintentionally reporting all-zero stats on devices which don't implement any - dst_cache: fix possible races in less common IPv6 features - tcp: auth: don't consider TCP_CLOSE to be in TCP_AO_ESTABLISHED - ax25: fix two refcounting bugs - eth: ionic: fix kernel panic in XDP_TX action Misc: - tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmZh3mUACgkQMUZtbf5S IrvPwRAApv8X0ZIbPD5PuVEkiYuSkSE6QVou5GaVO7DzF4gj07zPNtCe6B/ZZdBu RLdlppxjAmVwdCRmUo0plxSydYZcqFpQqV6lRH/rbWmktWIp0pGIOAcOG7ISRPCC FAYJ4udSt4+wrq0hXTsE1KO1JZ0p7zE2bXxNC8uR8wgM9yonUjqhYdAUZhrl3yCY zOCD/+kvWFLYtehDcmyNK0ANS3yNveTNkRhXDc1UrpOGMtza60lf5u3bWK+sU5VS NGPe9cU60WKMQi6QnWFBZKIcp4Vgy2MukOLdNn9e8BRjFLh2dbY86LAmE4HWPA7I ONZagOfEjeOcRSCMdFHxui/PUDZLBZNhrnqQ6x8uC2yKwwIMr+CgEt5sCmVFwH6n 3HTlWSjL38yuiVuYuhxGchmVnZfC4bLi2qAFF1oxhlDGViBDhAwi36MSCnjDpN8k Jo0x6crQLS/uvwVXPKWAUcQhy7OE69A3FwwA1PtkxRX5EQPn1if2Z7yq7YfYb9aD bChvCarlfuVDm+CBItphXg0ajVZc+im7+JK62Zn50A1cTbEK0lnYCOcmqzqiqrXI Vr3XXt6gVVnvwY374JDO1vmB5ft2IYBn7sWnLcIvR2UlggqEfqMdKSSwm7pOprG9 YJ/LDAXVmG0kLN7rZUYUBLItnpuHAhYDrBOsV5HaFeksWauc1oY= =mwEJ -----END PGP SIGNATURE----- Merge tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from BPF and big collection of fixes for WiFi core and drivers. Current release - regressions: - vxlan: fix regression when dropping packets due to invalid src addresses - bpf: fix a potential use-after-free in bpf_link_free() - xdp: revert support for redirect to any xsk socket bound to the same UMEM as it can result in a corruption - virtio_net: - add missing lock protection when reading return code from control_buf - fix false-positive lockdep splat in DIM - Revert "wifi: wilc1000: convert list management to RCU" - wifi: ath11k: fix error path in ath11k_pcic_ext_irq_config Previous releases - regressions: - rtnetlink: make the "split" NLM_DONE handling generic, restore the old behavior for two cases where we started coalescing those messages with normal messages, breaking sloppily-coded userspace - wifi: - cfg80211: validate HE operation element parsing - cfg80211: fix 6 GHz scan request building - mt76: mt7615: add missing chanctx ops - ath11k: move power type check to ASSOC stage, fix connecting to 6 GHz AP - ath11k: fix WCN6750 firmware crash caused by 17 num_vdevs - rtlwifi: ignore IEEE80211_CONF_CHANGE_RETRY_LIMITS - iwlwifi: mvm: fix a crash on 7265 Previous releases - always broken: - ncsi: prevent multi-threaded channel probing, a spec violation - vmxnet3: disable rx data ring on dma allocation failure - ethtool: init tsinfo stats if requested, prevent unintentionally reporting all-zero stats on devices which don't implement any - dst_cache: fix possible races in less common IPv6 features - tcp: auth: don't consider TCP_CLOSE to be in TCP_AO_ESTABLISHED - ax25: fix two refcounting bugs - eth: ionic: fix kernel panic in XDP_TX action Misc: - tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB" * tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits) selftests: net: lib: set 'i' as local selftests: net: lib: avoid error removing empty netns name selftests: net: lib: support errexit with busywait net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool() ipv6: fix possible race in __fib6_drop_pcpu_from() af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill(). af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen(). af_unix: Use skb_queue_empty_lockless() in unix_release_sock(). af_unix: Use unix_recvq_full_lockless() in unix_stream_connect(). af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen. af_unix: Annotate data-races around sk->sk_sndbuf. af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG. af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb(). af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg(). af_unix: Annotate data-race of sk->sk_state in unix_accept(). af_unix: Annotate data-race of sk->sk_state in unix_stream_connect(). af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll(). af_unix: Annotate data-race of sk->sk_state in unix_inq_len(). af_unix: Annodate data-races around sk->sk_state for writers. af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer. ...	2024-06-06 09:55:27 -07:00
Linus Torvalds	2faf6332c5	Single patch, no behavior changes. Tetsuo Handa (1): tomoyo: update project links Documentation/admin-guide/LSM/tomoyo.rst \| 35 +++++++++---------------------- MAINTAINERS \| 2 - security/tomoyo/Kconfig \| 2 - security/tomoyo/common.c \| 2 - 4 files changed, 14 insertions(+), 27 deletions(-) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJmYchgAAoJEEJfEo0MZPUqqXYP/ROdUgeoGCYo4Fv7PKoQtiwm cCf53gQD0ozv2pVpYQH6TnF4MUfnqxEjskYgL9sJahwSQ8pdyj8SO08uVACBgwuJ 1cXAGrSBFJEYTZY9/V3JTSbdTvqQsVTwSii3hj/VABfYTTQtnLdqPFmsslStIstx sGNwIZPvwX5xCTG6YkZCXBPtZGxtAhVvUueRF46525qZvmsgV7ziGfqNecNdfHyi 6wiw9HXZJlaKcj+RNQrcc10JeX0g3we3gpVIa8FJ5+wnpOvVuQOtq9lm9Idzw6xo AsKyg3jTjDaJjIv125lv7++DIXyipvDK8TPZJOwiC8WYsChLveb+fZV3YMNLpz2N Qepgzejf1O7rLT55zJID4KQGwCkCTg7TJILLA57wFAa+7VGLspkvIXsNzOjpe9P7 9ufclnrAkM1RbBIUqSOj1OcTm6dSBkNG32MI869NZ6M8UH3gDbmCLTsMNv7JT2Qe ax7E8zRqDTJBzH4dcAIKJ1pFF4lIj6H7dhbDJf0TPB89UGJdBdil4b+JIaJyZXEn 0M/RFdPiiw/vGsaFn1m6RCkV0WuuLhUHCOhq+0ukzsVfs9XqXWs/Yfngt07I3ldH ALB+dE7sddFI0dvyrJub/MTd3KRHZfB6TF1mKeHQe7Y4lR1TNctQxUuqClDJVXaT a38bb4G+qgIOcVMHeSaL =cwIu -----END PGP SIGNATURE----- Merge tag 'tomoyo-pr-20240606' of git://git.code.sf.net/p/tomoyo/tomoyo Pull tomoyo fixlet from Tetsuo Handa: "Single patch to update project links, no behavior changes" * tag 'tomoyo-pr-20240606' of git://git.code.sf.net/p/tomoyo/tomoyo: tomoyo: update project links	2024-06-06 09:48:57 -07:00
Linus Torvalds	a34adf6010	EFI fixes for v6.10 #2 - Ensure that .discard sections are really discarded in the EFI zboot image build - Return proper error numbers from efi-pstore - Add __nocfi annotations to EFI runtime wrappers -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCZmAmwwAKCRAwbglWLn0t XNbNAQDsnOTRK4Azr0rqHUvOoB2g+0XlIL9yR+r5MwV8lAdL+QD9GJpX7p7pzT4q aT4zzzoS1h9FFUNTDtE7by18bDBElgI= =RxkM -----END PGP SIGNATURE----- Merge tag 'efi-fixes-for-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: - Ensure that .discard sections are really discarded in the EFI zboot image build - Return proper error numbers from efi-pstore - Add __nocfi annotations to EFI runtime wrappers * tag 'efi-fixes-for-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: Add missing __nocfi annotations to runtime wrappers efi: pstore: Return proper errors on UEFI failures efi/libstub: zboot.lds: Discard .discard sections	2024-06-06 09:39:36 -07:00
Jakub Kicinski	27bc865408	Merge branch 'selftests-net-lib-small-fixes' Matthieu Baerts says: ==================== selftests: net: lib: small fixes While looking at using 'lib.sh' for the MPTCP selftests [1], we found some small issues with 'lib.sh'. Here they are: - Patch 1: fix 'errexit' (set -e) support with busywait. 'errexit' is supported in some functions, not all. A fix for v6.8+. - Patch 2: avoid confusing error messages linked to the cleaning part when the netns setup fails. A fix for v6.8+. - Patch 3: set a variable as local to avoid accidentally changing the value of a another one with the same name on the caller side. A fix for v6.10-rc1+. Link: https://lore.kernel.org/mptcp/5f4615c3-0621-43c5-ad25-55747a4350ce@kernel.org/T/ [1] ==================== Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-0-b3afadd368c9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-06 08:29:07 -07:00
Matthieu Baerts (NGI0)	84a8bc3ec2	selftests: net: lib: set 'i' as local Without this, the 'i' variable declared before could be overridden by accident, e.g. for i in "${@}"; do __ksft_status_merge "${i}" ## 'i' has been modified foo "${i}" ## using 'i' with an unexpected value done After a quick look, it looks like 'i' is currently not used after having been modified in __ksft_status_merge(), but still, better be safe than sorry. I saw this while modifying the same file, not because I suspected an issue somewhere. Fixes: `596c8819cb` ("selftests: forwarding: Have RET track kselftest framework constants") Acked-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-3-b3afadd368c9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-06 08:29:07 -07:00
Matthieu Baerts (NGI0)	79322174bc	selftests: net: lib: avoid error removing empty netns name If there is an error to create the first netns with 'setup_ns()', 'cleanup_ns()' will be called with an empty string as first parameter. The consequences is that 'cleanup_ns()' will try to delete an invalid netns, and wait 20 seconds if the netns list is empty. Instead of just checking if the name is not empty, convert the string separated by spaces to an array. Manipulating the array is cleaner, and calling 'cleanup_ns()' with an empty array will be a no-op. Fixes: `25ae948b44` ("selftests/net: add lib.sh") Cc: stable@vger.kernel.org Acked-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-2-b3afadd368c9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-06 08:29:07 -07:00
Matthieu Baerts (NGI0)	41b02ea4c0	selftests: net: lib: support errexit with busywait If errexit is enabled ('set -e'), loopy_wait -- or busywait and others using it -- will stop after the first failure. Note that if the returned status of loopy_wait is checked, and even if errexit is enabled, Bash will not stop at the first error. Fixes: `25ae948b44` ("selftests/net: add lib.sh") Cc: stable@vger.kernel.org Acked-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-1-b3afadd368c9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-06 08:29:07 -07:00
Su Hui	0dcc53abf5	net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool() Clang static checker (scan-build) warning: net/ethtool/ioctl.c:line 2233, column 2 Called function pointer is null (null dereference). Return '-EOPNOTSUPP' when 'ops->get_ethtool_phy_stats' is NULL to fix this typo error. Fixes: `201ed315f9` ("net/ethtool/ioctl: split ethtool_get_phy_stats into multiple helpers") Signed-off-by: Su Hui <suhui@nfschina.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Link: https://lore.kernel.org/r/20240605034742.921751-1-suhui@nfschina.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 13:34:33 +02:00
Eric Dumazet	b01e1c0307	ipv6: fix possible race in __fib6_drop_pcpu_from() syzbot found a race in __fib6_drop_pcpu_from() [1] If compiler reads more than once (*ppcpu_rt), second read could read NULL, if another cpu clears the value in rt6_get_pcpu_route(). Add a READ_ONCE() to prevent this race. Also add rcu_read_lock()/rcu_read_unlock() because we rely on RCU protection while dereferencing pcpu_rt. [1] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000012: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097] CPU: 0 PID: 7543 Comm: kworker/u8:17 Not tainted 6.10.0-rc1-syzkaller-00013-g2bfcfd584ff5 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024 Workqueue: netns cleanup_net RIP: 0010:__fib6_drop_pcpu_from.part.0+0x10a/0x370 net/ipv6/ip6_fib.c:984 Code: f8 48 c1 e8 03 80 3c 28 00 0f 85 16 02 00 00 4d 8b 3f 4d 85 ff 74 31 e8 74 a7 fa f7 49 8d bf 90 00 00 00 48 89 f8 48 c1 e8 03 <80> 3c 28 00 0f 85 1e 02 00 00 49 8b 87 90 00 00 00 48 8b 0c 24 48 RSP: 0018:ffffc900040df070 EFLAGS: 00010206 RAX: 0000000000000012 RBX: 0000000000000001 RCX: ffffffff89932e16 RDX: ffff888049dd1e00 RSI: ffffffff89932d7c RDI: 0000000000000091 RBP: dffffc0000000000 R08: 0000000000000005 R09: 0000000000000007 R10: 0000000000000001 R11: 0000000000000006 R12: ffff88807fa080b8 R13: fffffbfff1a9a07d R14: ffffed100ff41022 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff8880b9200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32c26000 CR3: 000000005d56e000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __fib6_drop_pcpu_from net/ipv6/ip6_fib.c:966 [inline] fib6_drop_pcpu_from net/ipv6/ip6_fib.c:1027 [inline] fib6_purge_rt+0x7f2/0x9f0 net/ipv6/ip6_fib.c:1038 fib6_del_route net/ipv6/ip6_fib.c:1998 [inline] fib6_del+0xa70/0x17b0 net/ipv6/ip6_fib.c:2043 fib6_clean_node+0x426/0x5b0 net/ipv6/ip6_fib.c:2205 fib6_walk_continue+0x44f/0x8d0 net/ipv6/ip6_fib.c:2127 fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2175 fib6_clean_tree+0xd7/0x120 net/ipv6/ip6_fib.c:2255 __fib6_clean_all+0x100/0x2d0 net/ipv6/ip6_fib.c:2271 rt6_sync_down_dev net/ipv6/route.c:4906 [inline] rt6_disable_ip+0x7ed/0xa00 net/ipv6/route.c:4911 addrconf_ifdown.isra.0+0x117/0x1b40 net/ipv6/addrconf.c:3855 addrconf_notify+0x223/0x19e0 net/ipv6/addrconf.c:3778 notifier_call_chain+0xb9/0x410 kernel/notifier.c:93 call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1992 call_netdevice_notifiers_extack net/core/dev.c:2030 [inline] call_netdevice_notifiers net/core/dev.c:2044 [inline] dev_close_many+0x333/0x6a0 net/core/dev.c:1585 unregister_netdevice_many_notify+0x46d/0x19f0 net/core/dev.c:11193 unregister_netdevice_many net/core/dev.c:11276 [inline] default_device_exit_batch+0x85b/0xae0 net/core/dev.c:11759 ops_exit_list+0x128/0x180 net/core/net_namespace.c:178 cleanup_net+0x5b7/0xbf0 net/core/net_namespace.c:640 process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231 process_scheduled_works kernel/workqueue.c:3312 [inline] worker_thread+0x6c8/0xf70 kernel/workqueue.c:3393 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Fixes: `d52d3997f8` ("ipv6: Create percpu rt6_info") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20240604193549.981839-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 13:05:54 +02:00
Paolo Abeni	411c0ea696	Merge branch 'af_unix-fix-lockless-access-of-sk-sk_state-and-others-fields' Kuniyuki Iwashima says: ==================== af_unix: Fix lockless access of sk->sk_state and others fields. The patch 1 fixes a bug where SOCK_DGRAM's sk->sk_state is changed to TCP_CLOSE even if the socket is connect()ed to another socket. The rest of this series annotates lockless accesses to the following fields. * sk->sk_state * sk->sk_sndbuf * net->unx.sysctl_max_dgram_qlen * sk->sk_receive_queue.qlen * sk->sk_shutdown Note that with this series there is skb_queue_empty() left in unix_dgram_disconnected() that needs to be changed to lockless version, and unix_peer(other) access there should be protected by unix_state_lock(). This will require some refactoring, so another series will follow. Changes: v2: * Patch 1: Fix wrong double lock v1: https://lore.kernel.org/netdev/20240603143231.62085-1-kuniyu@amazon.com/ ==================== Link: https://lore.kernel.org/r/20240604165241.44758-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:18 +02:00
Kuniyuki Iwashima	efaf24e30e	af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill(). While dumping sockets via UNIX_DIAG, we do not hold unix_state_lock(). Let's use READ_ONCE() to read sk->sk_shutdown. Fixes: `e4e541a848` ("sock-diag: Report shutdown for inet and unix sockets (v2)") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:15 +02:00
Kuniyuki Iwashima	5d915e584d	af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen(). We can dump the socket queue length via UNIX_DIAG by specifying UDIAG_SHOW_RQLEN. If sk->sk_state is TCP_LISTEN, we return the recv queue length, but here we do not hold recvq lock. Let's use skb_queue_len_lockless() in sk_diag_show_rqlen(). Fixes: `c9da99e647` ("unix_diag: Fixup RQLEN extension report") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:15 +02:00
Kuniyuki Iwashima	83690b82d2	af_unix: Use skb_queue_empty_lockless() in unix_release_sock(). If the socket type is SOCK_STREAM or SOCK_SEQPACKET, unix_release_sock() checks the length of the peer socket's recvq under unix_state_lock(). However, unix_stream_read_generic() calls skb_unlink() after releasing the lock. Also, for SOCK_SEQPACKET, __skb_try_recv_datagram() unlinks skb without unix_state_lock(). Thues, unix_state_lock() does not protect qlen. Let's use skb_queue_empty_lockless() in unix_release_sock(). Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:15 +02:00
Kuniyuki Iwashima	45d872f0e6	af_unix: Use unix_recvq_full_lockless() in unix_stream_connect(). Once sk->sk_state is changed to TCP_LISTEN, it never changes. unix_accept() takes advantage of this characteristics; it does not hold the listener's unix_state_lock() and only acquires recvq lock to pop one skb. It means unix_state_lock() does not prevent the queue length from changing in unix_stream_connect(). Thus, we need to use unix_recvq_full_lockless() to avoid data-race. Now we remove unix_recvq_full() as no one uses it. Note that we can remove READ_ONCE() for sk->sk_max_ack_backlog in unix_recvq_full_lockless() because of the following reasons: (1) For SOCK_DGRAM, it is a written-once field in unix_create1() (2) For SOCK_STREAM and SOCK_SEQPACKET, it is changed under the listener's unix_state_lock() in unix_listen(), and we hold the lock in unix_stream_connect() Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:15 +02:00
Kuniyuki Iwashima	bd9f2d0573	af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen. net->unx.sysctl_max_dgram_qlen is exposed as a sysctl knob and can be changed concurrently. Let's use READ_ONCE() in unix_create1(). Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:15 +02:00
Kuniyuki Iwashima	b0632e53e0	af_unix: Annotate data-races around sk->sk_sndbuf. sk_setsockopt() changes sk->sk_sndbuf under lock_sock(), but it's not used in af_unix.c. Let's use READ_ONCE() to read sk->sk_sndbuf in unix_writable(), unix_dgram_sendmsg(), and unix_stream_sendmsg(). Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:15 +02:00
Kuniyuki Iwashima	0aa3be7b3e	af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG. While dumping AF_UNIX sockets via UNIX_DIAG, sk->sk_state is read locklessly. Let's use READ_ONCE() there. Note that the result could be inconsistent if the socket is dumped during the state change. This is common for other SOCK_DIAG and similar interfaces. Fixes: `c9da99e647` ("unix_diag: Fixup RQLEN extension report") Fixes: `2aac7a2cb0` ("unix_diag: Pending connections IDs NLA") Fixes: `45a96b9be6` ("unix_diag: Dumping all sockets core") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-06 12:57:15 +02:00

1 2 3 4 5 ...

1279946 Commits