linux

Author	SHA1	Message	Date
Xiaomeng Tong	b423e54ba9	myri10ge: fix an incorrect free for skb in myri10ge_sw_tso All remaining skbs should be released when myri10ge_xmit fails to transmit a packet. Fix it within another skb_list_walk_safe. Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 15:29:18 +01:00
Jakub Kicinski	a5b116a0fa	net: wan: remove the lanmedia (lmc) driver The driver for LAN Media WAN interfaces spews build warnings on microblaze. The virt_to_bus() calls discard the volatile keyword. The right thing to do would be to migrate this driver to a modern DMA API but it seems unlikely anyone is actually using it. There had been no fixes or functional changes here since the git era began. Let's remove this driver, there isn't much changing in the APIs, if users come forward we can apologize and revert. Link: https://lore.kernel.org/all/20220321144013.440d7fc0@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 15:28:23 +01:00
Marcin Kozlowski	afb8e24652	net: usb: aqc111: Fix out-of-bounds accesses in RX fixup aqc111_rx_fixup() contains several out-of-bounds accesses that can be triggered by a malicious (or defective) USB device, in particular: - The metadata array (desc_offset..desc_offset+2*pkt_count) can be out of bounds, causing OOB reads and (on big-endian systems) OOB endianness flips. - A packet can overlap the metadata array, causing a later OOB endianness flip to corrupt data used by a cloned SKB that has already been handed off into the network stack. - A packet SKB can be constructed whose tail is far beyond its end, causing out-of-bounds heap data to be considered part of the SKB's data. Found doing variant analysis. Tested it with another driver (ax88179_178a), since I don't have an aqc111 device to test it, but the code looks very similar. Signed-off-by: Marcin Kozlowski <marcinguy@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 15:22:49 +01:00
Wang Qing	207d924dcf	net: usb: remove duplicate assignment netdev_alloc_skb() already assigns ssi->netdev to skb->dev on success, so there is no need to repeat the assignment. Signed-off-by: Wang Qing <wangqing@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 15:19:53 +01:00
Wang Qing	be8d9d0527	net: ethernet: xilinx: use of_property_read_bool() instead of of_get_property "little-endian" has no specific content, so use the more suitable helper of_property_read_bool() instead of of_get_property(). Signed-off-by: Wang Qing <wangqing@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 15:18:03 +01:00
Jamie Bainbridge	4e910dbe36	qede: confirm skb is allocated before using qede_build_skb() assumes build_skb() always works and goes straight to skb_reserve(). However, build_skb() can fail under memory pressure. This results in a kernel panic because the skb to reserve is NULL. Add a check in case build_skb() failed to allocate and return NULL. The NULL return is handled correctly in callers to qede_build_skb(). Fixes: `8a8633978b` ("qede: Add build_skb() support.") Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 15:16:23 +01:00
Florian Westphal	a3ebe92a0f	net: ipv6mr: fix unused variable warning with CONFIG_IPV6_PIMSM_V2=n net/ipv6/ip6mr.c:1656:14: warning: unused variable 'do_wrmifwhole' Move it to the CONFIG_IPV6_PIMSM_V2 scope where it's used. Fixes: `4b340a5a72` ("net: ip6mr: add support for passing full packet on wrong mif") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 15:14:30 +01:00
David S. Miller	74edbe9ede	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-04-05 Maciej Fijalkowski says: We were solving issues around AF_XDP busy poll's not-so-usual scenarios, such as very big busy poll budgets applied to very small HW rings. This set carries the things that were found during that work that apply to net tree. One thing that was fixed for all in-tree ZC drivers was missing on ice side all the time - it's about syncing RCU before destroying XDP resources. Next one fixes the bit that is checked in ice_xsk_wakeup and third one avoids false setting of DD bits on Tx descriptors. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 15:03:50 +01:00
Andrea Parri (Microsoft)	eaa03d3453	Drivers: hv: vmbus: Replace smp_store_mb() with virt_store_mb() Following the recommendation in Documentation/memory-barriers.txt for virtual machine guests. Fixes: `8b6a877c06` ("Drivers: hv: vmbus: Replace the per-CPU channel lists with a global array of channels") Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20220328154457.100872-1-parri.andrea@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>	2022-04-06 13:31:58 +00:00
Boqun Feng	be5802795c	Drivers: hv: balloon: Disable balloon and hot-add accordingly Currently there are known potential issues for balloon and hot-add on ARM64: * Unballoon requests from Hyper-V should only unballoon ranges that are guest page size aligned, otherwise guests cannot handle them because it's impossible to partially free a page. This is a problem when guest page size > 4096 bytes. * Memory hot-add requests from Hyper-V should provide the NUMA node id of the added ranges or ARM64 should have a functional memory_add_physaddr_to_nid(), otherwise the node id is missing for add_memory(). These issues require discussions on design and implementation. In the meantime, post_status() is working and essential to guest monitoring. Therefore instead of disabling the entire hv_balloon driver, the ballooning (when page size > 4096 bytes) and hot-add are disabled accordingly for now. Once the issues are fixed, they can be re-enabled in these cases. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220325023212.1570049-3-boqun.feng@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>	2022-04-06 13:15:06 +00:00
Boqun Feng	b3d6dd09ff	Drivers: hv: balloon: Support status report for larger page sizes DM_STATUS_REPORT expects the numbers of pages in the unit of 4k pages (HV_HYP_PAGE) instead of guest pages, so to make it work when guest page sizes are larger than 4k, convert the numbers of guest pages into the numbers of HV_HYP_PAGEs. Note that the numbers of guest pages are still used for tracing because tracing is internal to the guest kernel. Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220325023212.1570049-2-boqun.feng@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org>	2022-04-06 13:15:06 +00:00
Jann Horn	1448769c9c	random: check for signal_pending() outside of need_resched() check signal_pending() checks TIF_NOTIFY_SIGNAL and TIF_SIGPENDING, which signal that the task should bail out of the syscall when possible. This is a separate concept from need_resched(), which checks TIF_NEED_RESCHED, signaling that the task should preempt. In particular, with the current code, the signal_pending() bailout probably won't work reliably. Change this to look like other functions that read lots of data, such as read_zero(). Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>	2022-04-06 15:09:33 +02:00
David S. Miller	f90e5a3d5b	Merge branch 'mtk_eth_soc-flo-offload-plus-wireless' Felix Fietkau says: ==================== MediaTek SoC flow offload improvements + wireless support This series contains the following improvements to mediatek ethernet flow offload support: - support dma-coherent on ethernet to improve performance - add ipv6 offload support - rework hardware flow table entry handling to improve dealing with hash collisions and competing flows - support creating offload entries from user space - support creating offload entries with just source/destination mac address, vlan and output device information - add driver changes for supporting the Wireless Ethernet Dispatch core, which can be used to offload flows from ethernet to MT7915 PCIe WLAN devices Changes in v2: - add missing dt-bindings patches ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:08:52 +01:00
Felix Fietkau	33fc42de33	net: ethernet: mtk_eth_soc: support creating mac address based offload entries This will be used to implement a limited form of bridge offloading. Since the hardware does not support flow table entries with just source and destination MAC address, the driver has to emulate it. The hardware automatically creates entries for incoming flows, even when they are bridged instead of routed, and reports when packets for these flows have reached the minimum PPS rate for offloading. After this happens, we look up the L2 flow offload entry based on the MAC header and fill in the output routing information in the flow table. The dynamically created per-flow entries are automatically removed when the hardware flowtable entry expires or is replaced, or when the offload rule they belong to is removed. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:08:51 +01:00
Felix Fietkau	8ff25d3774	net: ethernet: mtk_eth_soc: remove bridge flow offload type entry support According to MediaTek, this feature is not supported in current hardware Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:08:50 +01:00
Felix Fietkau	c4f033d9e0	net: ethernet: mtk_eth_soc: rework hardware flow table management The hardware was designed to handle flow detection and creation of flow entries by itself, relying on the software primarily for filling in egress routing information. When there is a hash collision between multiple flows, this allows the hardware to maintain the entry for the most active flow. Additionally, the hardware only keeps offloading active for entries with at least 30 packets per second. With this rework, the code no longer creates hardware entries directly. Instead, the hardware entry is only created when the PPE reports a matching unbound flow with the minimum target rate. In order to reduce CPU overhead, looking for flows belonging to a hash entry is rate limited to once every 100ms. This rework is also used as preparation for emulating bridge offload by managing L4 offload entries on demand. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:08:50 +01:00
Felix Fietkau	1ccc723b58	net: ethernet: mtk_eth_soc: allocate struct mtk_ppe separately Preparation for adding more data to it, which will increase its size. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:08:50 +01:00
Felix Fietkau	bb14c19122	net: ethernet: mtk_eth_soc: support TC_SETUP_BLOCK for PPE offload This allows offload entries to be created from user space Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:08:50 +01:00
David Bentham	817b2fdf16	net: ethernet: mtk_eth_soc: add ipv6 flow offload support Add the missing IPv6 flow offloading support for routing only. Hardware flow offloading is done by the packet processing engine (PPE) of the Ethernet MAC and as it doesn't support mangling of IPv6 packets, IPv6 NAT cannot be supported. Signed-off-by: David Bentham <db260179@gmail.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:08:50 +01:00
Felix Fietkau	e9b65ecb7c	arm64: dts: mediatek: mt7622: introduce nodes for Wireless Ethernet Dispatch Introduce wed0 and wed1 nodes in order to enable offloading forwarding between ethernet and wireless devices on the mt7622 chipset. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:08:49 +01:00
Felix Fietkau	a333215e10	net: ethernet: mtk_eth_soc: implement flow offloading to WED devices This allows hardware flow offloading from Ethernet to WLAN on MT7622 SoC Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:08:49 +01:00
Felix Fietkau	804775dfc2	net: ethernet: mtk_eth_soc: add support for Wireless Ethernet Dispatch (WED) The Wireless Ethernet Dispatch subsystem on the MT7622 SoC can be configured to intercept and handle access to the DMA queues and PCIe interrupts for a MT7615/MT7915 wireless card. It can manage the internal WDMA (Wireless DMA) controller, which allows ethernet packets to be passed from the packet switch engine (PSE) to the wireless card, bypassing the CPU entirely. This can be used to implement hardware flow offloading from ethernet to WLAN. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:08:49 +01:00
Lorenzo Bianconi	f14ac41b78	dt-bindings: arm: mediatek: document the pcie mirror node on MT7622 This patch adds the pcie mirror document bindings for MT7622 SoC. The feature is used for intercepting PCIe MMIO access for the WED core Add related info in mediatek-net bindings. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:08:49 +01:00
Lorenzo Bianconi	55c1c4e945	dt-bindings: arm: mediatek: document WED binding for MT7622 Document the binding for the Wireless Ethernet Dispatch core on the MT7622 SoC, which is used for Ethernet->WLAN offloading Add related info in mediatek-net bindings. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:08:48 +01:00
Felix Fietkau	3abd063019	arm64: dts: mediatek: mt7622: add support for coherent DMA It improves performance by eliminating the need for a cache flush on rx and tx Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:08:48 +01:00
Felix Fietkau	d776a57e4a	net: ethernet: mtk_eth_soc: add support for coherent DMA It improves performance by eliminating the need for a cache flush on rx and tx In preparation for supporting WED (Wireless Ethernet Dispatch), also add a function for disabling coherent DMA at runtime. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:08:48 +01:00
Lorenzo Bianconi	1dafd0d607	dt-bindings: net: mediatek: add optional properties for the SoC ethernet core Introduce dma-coherent, cci-control and hifsys optional properties to the mediatek ethernet controller bindings Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:08:47 +01:00
Jason A. Donenfeld	aba120cc10	random: do not allow user to keep crng key around on stack The fast key erasure RNG design relies on the key that's used to be used and then discarded. We do this, making judicious use of memzero_explicit(). However, reads to /dev/urandom and calls to getrandom() involve a copy_to_user(), and userspace can use FUSE or userfaultfd, or make a massive call, dynamically remap memory addresses as it goes, and set the process priority to idle, in order to keep a kernel stack alive indefinitely. By probing /proc/sys/kernel/random/entropy_avail to learn when the crng key is refreshed, a malicious userspace could mount this attack every 5 minutes thereafter, breaking the crng's forward secrecy. In order to fix this, we just overwrite the stack's key with the first 32 bytes of the "free" fast key erasure output. If we're returning <= 32 bytes to the user, then we can still return those bytes directly, so that short reads don't become slower. And for long reads, the difference is hopefully lost in the amortization, so it doesn't change much, with that amortization helping variously for medium reads. We don't need to do this for get_random_bytes() and the various kernel-space callers, and later, if we ever switch to always batching, this won't be necessary either, so there's no need to change the API of these functions. Cc: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jann Horn <jannh@google.com> Fixes: `c92e040d57` ("random: add backtracking protection to the CRNG") Fixes: `186873c549` ("random: use simpler fast key erasure flow on per-cpu keys") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>	2022-04-06 15:05:10 +02:00
David S. Miller	44ec5f71a0	Merge branch 'mscc-miim' Michael Walle says: ==================== net: phy: mscc-miim: add MDIO bus frequency support Introduce MDIO bus frequency support. This way the board can have a faster (or maybe slower) bus frequency than the hardware default. changes since v2: - resend, no RFC anymore, because net-next is open again ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:04:17 +01:00
Michael Walle	bb2a1934ca	net: phy: mscc-miim: add support to set MDIO bus frequency Until now, the MDIO bus will have the hardware default bus frequency. Read the desired frequency of the bus from the device tree and configure it. Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:04:16 +01:00
Michael Walle	b0385d4c1f	dt-bindings: net: mscc-miim: add clock and clock-frequency Add the (optional) clock input of the MDIO controller and indicate that the common clock-frequency property is supported. The driver can use it to set the desired MDIO bus frequency. Signed-off-by: Michael Walle <michael@walle.cc> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:04:16 +01:00
Michael Walle	ed941f65da	dt-bindings: net: convert mscc-miim to YAML format Convert the mscc-miim device tree binding to the new YAML format. The original binding doesn't mention whether the interrupt property is optional. But on the SparX-5 SoC, for example, the interrupt property isn't used, thus in the new binding that property is optional. FWIW the driver doesn't use interrupts at all. Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 14:04:16 +01:00
Michael Walle	8d90991e5b	net: phy: mscc-miim: reject clause 45 register accesses The driver doesn't support clause 45 register access yet, but doesn't check if the access is a c45 one either. This leads to spurious register reads and writes. Add the check. Fixes: `542671fe4d` ("net: phy: mscc-miim: Add MDIO driver") Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 13:57:48 +01:00
David S. Miller	9386d1811f	Merge branch 'axienet-broken-link' Andy Chiu says: ==================== Fix broken link on Xilinx's AXI Ethernet in SGMII mode The Ethernet driver uses phy-handle to reference the PCS/PMA PHY. This could be a problem if one wants to configure an external PHY via phylink, since it uses the same phandle to get the PHY. To fix this, introduce a dedicated pcs-handle to point to the PCS/PMA PHY and deprecate the use of pointing it with phy-handle. A similar use case of pcs-handle can be seen on dpaa2 as well. --- patch v5 --- - Re-apply the v4 patch on the net tree. - Describe the pcs-handle DT binding at ethernet-controller level. --- patch v6 --- - Remove "preferrably" to clarify usage of pcs_handle. --- patch v7 --- - Rebase the patch on latest net/master --- patch v8 --- - Rebase the patch on net-next/master - Add "reviewed-by" tag in PATCH 3/4: dt-bindings: net: add pcs-handle attribute - Remove "fix" tag in last commit message since this is not a critical bug and will not be back ported to stable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 13:54:52 +01:00
Andy Chiu	19c7a43912	net: axiemac: use a phandle to reference pcs_phy In some SGMII use cases where both a fixed link external PHY and the internal PCS/PMA PHY need to be configured, we should explicitly use a phandle "pcs-phy" to get the reference to the PCS/PMA PHY. Otherwise, the driver would use "phy-handle" in the DT as the reference to both the external and the internal PCS/PMA PHY. In other cases where the core is connected to a SFP cage, we could still point phy-handle to the internal PCS/PMA PHY, and let the driver connect to the SFP module, if one exists, via phylink. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reviewed-by: Greentime Hu <greentime.hu@sifive.com> Reviewed-by: Robert Hancock <robert.hancock@calian.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 13:54:52 +01:00
Andy Chiu	dc48f04fd6	dt-bindings: net: add pcs-handle attribute Document the new pcs-handle attribute to support connecting to an external PHY. For Xilinx's AXI Ethernet, this is used when the core operates in SGMII or 1000Base-X modes and links through the internal PCS/PMA PHY. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reviewed-by: Greentime Hu <greentime.hu@sifive.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 13:54:51 +01:00
Andy Chiu	ab3a5d4c60	net: axienet: factor out phy_node in struct axienet_local The struct member `phy_node` of struct axienet_local is not used by the driver anymore after initialization. It might be a remnant of old code and could be removed. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reviewed-by: Greentime Hu <greentime.hu@sifive.com> Reviewed-by: Robert Hancock <robert.hancock@calian.com> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 13:54:51 +01:00
Andy Chiu	d1c4f93e3f	net: axienet: setup mdio unconditionally The call to axienet_mdio_setup should not depend on whether "phy-node" is present in the DT. Besides, since `lp->phy_node` is used if PHY is in SGMII or 1000Base-X modes, move it into the if statement. And the next patch will remove `lp->phy_node` from driver's private structure and do an of_node_put on it right away after use since it is not used elsewhere. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reviewed-by: Greentime Hu <greentime.hu@sifive.com> Reviewed-by: Robert Hancock <robert.hancock@calian.com> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 13:54:51 +01:00
Taehee Yoo	fb5833d81e	net: sfc: fix using uninitialized xdp tx_queue In some cases, the xdp tx_queue can get used before initialization: 1. interface up/down 2. ring buffer size change When the number of CPU cores is lower than the maximum number of channels of the sfc driver, it creates new channels only for XDP. When an interface is brought up or the ring buffer size is changed, all channels are initialized, but xdp channels are always initialized later. So the following scenario is possible: packets are received on the rx queue of normal channels, the XDP program returns XDP_TX, and the tx_queue of the xdp channels gets used while these tx_queues are not initialized yet. A TX DMA or queue error then occurs. To avoid this problem: 1. initialize xdp tx_queues earlier than other rx_queues in efx_start_channels(). 2. check whether the tx_queue is initialized in efx_xdp_tx_buffers(). Splat looks like: sfc 0000:08:00.1 enp8s0f1np1: TX queue 10 spurious TX completion id 250 sfc 0000:08:00.1 enp8s0f1np1: resetting (RECOVER_OR_ALL) sfc 0000:08:00.1 enp8s0f1np1: MC command 0x80 inlen 100 failed rc=-22 (raw=22) arg=789 sfc 0000:08:00.1 enp8s0f1np1: has been disabled Fixes: `f28100cb9c` ("sfc: fix lack of XDP TX queues - error XDP TX failed (-22)") Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 13:50:17 +01:00
Eric Dumazet	1946014ca3	rxrpc: fix a race in rxrpc_exit_net() Current code can lead to the following race: CPU0 CPU1 rxrpc_exit_net() rxrpc_peer_keepalive_worker() if (rxnet->live) rxnet->live = false; del_timer_sync(&rxnet->peer_keepalive_timer); timer_reduce(&rxnet->peer_keepalive_timer, jiffies + delay); cancel_work_sync(&rxnet->peer_keepalive_work); rxrpc_exit_net() exits while peer_keepalive_timer is still armed, leading to use-after-free. syzbot report was: ODEBUG: free active (active state 0) object type: timer_list hint: rxrpc_peer_keepalive_timeout+0x0/0xb0 WARNING: CPU: 0 PID: 3660 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505 Modules linked in: CPU: 0 PID: 3660 Comm: kworker/u4:6 Not tainted 5.17.0-syzkaller-13993-g88e6c0207623 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505 Code: ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 af 00 00 00 48 8b 14 dd 00 1c 26 8a 4c 89 ee 48 c7 c7 00 10 26 8a e8 b1 e7 28 05 <0f> 0b 83 05 15 eb c5 09 01 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e c3 RSP: 0018:ffffc9000353fb00 EFLAGS: 00010082 RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000 RDX: ffff888029196140 RSI: ffffffff815efad8 RDI: fffff520006a7f52 RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff815ea4ae R11: 0000000000000000 R12: ffffffff89ce23e0 R13: ffffffff8a2614e0 R14: ffffffff816628c0 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe1f2908924 CR3: 0000000043720000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __debug_check_no_obj_freed lib/debugobjects.c:992 [inline] debug_check_no_obj_freed+0x301/0x420 lib/debugobjects.c:1023 kfree+0xd6/0x310 mm/slab.c:3809 ops_free_list.part.0+0x119/0x370 net/core/net_namespace.c:176 ops_free_list net/core/net_namespace.c:174 [inline] cleanup_net+0x591/0xb00 net/core/net_namespace.c:598 process_one_work+0x996/0x1610 kernel/workqueue.c:2289 worker_thread+0x665/0x1080 kernel/workqueue.c:2436 kthread+0x2e9/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298 </TASK> Fixes: `ace45bec6d` ("rxrpc: Fix firewall route keepalive") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Howells <dhowells@redhat.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: linux-afs@lists.infradead.org Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 13:48:51 +01:00
Nick Desaulniers	1ee375d77b	net, uapi: remove inclusion of arpa/inet.h In include/uapi/linux/tipc_config.h, there's a comment that it includes arpa/inet.h for ntohs; but ntohs is not defined in any UAPI header. For now, reuse the definitions from include/linux/byteorder/generic.h, since the various conversion functions do exist in UAPI headers: include/uapi/linux/byteorder/big_endian.h include/uapi/linux/byteorder/little_endian.h We would like to get to the point where we can build UAPI header tests with -nostdinc, meaning that kernel UAPI headers should not have a circular dependency on libc headers. Link: https://android-review.googlesource.com/c/platform/bionic/+/2048127 Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 13:48:02 +01:00
Oliver Hartkopp	f4b41f062c	net: remove noblock parameter from skb_recv_datagram() skb_recv_datagram() has two parameters 'flags' and 'noblock' that are merged inside skb_recv_datagram() by 'flags | (noblock ? MSG_DONTWAIT : 0)' As 'flags' may contain MSG_DONTWAIT as value most callers split the 'flags' into 'flags' and 'noblock' with finally obsolete bit operations like this: skb_recv_datagram(sk, flags & ~MSG_DONTWAIT, flags & MSG_DONTWAIT, &rc); And this is not even done consistently with the 'flags' parameter. This patch removes the obsolete and costly splitting into two parameters and only performs bit operations when really needed on the caller side. One missing conversion thankfully reported by kernel test robot. I failed to enable kunit tests to build the mctp code. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 13:45:26 +01:00
Ilya Maximets	1f30fb9166	net: openvswitch: fix leak of nested actions While parsing user-provided actions, openvswitch module may dynamically allocate memory and store pointers in the internal copy of the actions. So this memory has to be freed while destroying the actions. Currently there are only two such actions: ct() and set(). However, there are many actions that can hold nested lists of actions and ovs_nla_free_flow_actions() just jumps over them leaking the memory. For example, removal of the flow with the following actions will lead to a leak of the memory allocated by nf_ct_tmpl_alloc(): actions:clone(ct(commit),0) Non-freed set() action may also leak the 'dst' structure for the tunnel info including device references. Under certain conditions with a high rate of flow rotation that may cause significant memory leak problem (2MB per second in reporter's case). The problem is also hard to mitigate, because the user doesn't have direct control over the datapath flows generated by OVS. Fix that by iterating over all the nested actions and freeing everything that needs to be freed recursively. New build time assertion should protect us from this problem if new actions are added in the future. Unfortunately, openvswitch module doesn't use NLA_F_NESTED, so all attributes have to be explicitly checked. sample() and clone() actions are mixing extra attributes into the user-provided action list. That prevents some code generalization too. Fixes: `34ae932a40` ("openvswitch: Make tunnel set action attach a metadata dst") Link: https://mail.openvswitch.org/pipermail/ovs-dev/2022-March/392922.html Reported-by: Stéphane Graber <stgraber@ubuntu.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-06 13:36:50 +01:00
Mario Limonciello	55b014159e	ata: ahci: Rename CONFIG_SATA_LPM_POLICY configuration item back CONFIG_SATA_LPM_MOBILE_POLICY was renamed to CONFIG_SATA_LPM_POLICY in commit `4dd4d3deb5` ("ata: ahci: Rename CONFIG_SATA_LPM_MOBILE_POLICY configuration item"). This can potentially cause problems as users would invisibly lose configuration policy defaults when they built the new kernel. To avoid such problems, switch back to the old name (even if it's wrong). Suggested-by: Christoph Hellwig <hch@infradead.org> Suggested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>	2022-04-06 11:08:04 +09:00
Andrew Lunn	11f8e7c122	net: ethernet: mv643xx: Fix over zealous checking of_get_mac_address() There is often not a MAC address available in an EEPROM accessible by Linux with Marvell devices. Instead the bootloader has the MAC address and directly programs it into the hardware. So don't consider an error from of_get_mac_address() as fatal. However, the check was added for the case where there is a MAC address in the EEPROM, but the EEPROM has not probed yet, and -EPROBE_DEFER is returned. In that case the error should be returned. So make the check specific to this error code. Cc: Mauri Sandberg <maukka@ext.kapsi.fi> Reported-by: Thomas Walther <walther-it@gmx.de> Fixes: `42404d8f1c` ("net: mv643xx_eth: process retval from of_get_mac_address") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220405000404.3374734-1-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-05 18:12:55 -07:00
Ilya Maximets	3f2a3050b4	net: openvswitch: don't send internal clone attribute to the userspace. 'OVS_CLONE_ATTR_EXEC' is an internal attribute that is used for performance optimization inside the kernel. It's added by the kernel while parsing user-provided actions and should not be sent during the flow dump as it's not part of the uAPI. The issue doesn't cause any significant problems to the ovs-vswitchd process, because reported actions are not really used in the application lifecycle and only supposed to be shown to a human via ovs-dpctl flow dump. However, the action list is still incorrect and causes the following error if the user wants to look at the datapath flows: # ovs-dpctl add-dp system@ovs-system # ovs-dpctl add-flow "<flow match>" "clone(ct(commit),0)" # ovs-dpctl dump-flows <flow match>, packets:0, bytes:0, used:never, actions:clone(bad length 4, expected -1 for: action0(01 00 00 00), ct(commit),0) With the fix: # ovs-dpctl dump-flows <flow match>, packets:0, bytes:0, used:never, actions:clone(ct(commit),0) Additionally fixed an incorrect attribute name in the comment. Fixes: `b233504033` ("openvswitch: kernel datapath clone action") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Aaron Conole <aconole@redhat.com> Link: https://lore.kernel.org/r/20220404104150.2865736-1-i.maximets@ovn.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-05 17:57:54 -07:00
Horatiu Vultur	1d7e4fd72b	net: micrel: Fix KS8851 Kconfig KS8851 selects MICREL_PHY, which depends on PTP_1588_CLOCK_OPTIONAL, so make KS8851 also depend on PTP_1588_CLOCK_OPTIONAL. Fixes kconfig warning and build errors: WARNING: unmet direct dependencies detected for MICREL_PHY Depends on [m]: NETDEVICES [=y] && PHYLIB [=y] && PTP_1588_CLOCK_OPTIONAL [=m] Selected by [y]: - KS8851 [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICREL [=y] && SPI [=y] ld.lld: error: undefined symbol: ptp_clock_register referenced by micrel.c net/phy/micrel.o:(lan8814_probe) in archive drivers/built-in.a ld.lld: error: undefined symbol: ptp_clock_index referenced by micrel.c net/phy/micrel.o:(lan8814_ts_info) in archive drivers/built-in.a Reported-by: kernel test robot <lkp@intel.com> Fixes: `ece1950283` ("net: phy: micrel: 1588 support for LAN8814 phy") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Tested-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20220405065936.4105272-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-05 17:32:05 -07:00
Yuntao Wang	2d0df01974	selftests/bpf: Fix file descriptor leak in load_kallsyms() Currently, if sym_cnt > 0, it just returns and does not close file, fix it. Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220405145711.49543-1-ytcoode@gmail.com	2022-04-05 16:49:32 -07:00
Xu Kuohai	042152c27c	bpf, arm64: Sign return address for JITed code Sign return address for JITed code when the kernel is built with pointer authentication enabled: 1. Sign LR with paciasp instruction before LR is pushed to stack. Since paciasp acts like landing pads for function entry, no need to insert bti instruction before paciasp. 2. Authenticate LR with autiasp instruction after LR is popped from stack. For BPF tail call, the stack frame constructed by the caller is reused by the callee. That is, the stack frame is constructed by the caller and destructed by the callee. Thus LR is signed and pushed to the stack in the caller's prologue, and popped from the stack and authenticated in the callee's epilogue. For BPF2BPF call, the caller and callee construct their own stack frames, and sign and authenticate their own LRs. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://events.static.linuxfound.org/sites/events/files/slides/slides_23.pdf Link: https://lore.kernel.org/bpf/20220402073942.3782529-1-xukuohai@huawei.com	2022-04-06 00:04:22 +02:00
Johannes Berg	0b5c21bbc0	net: ensure net_todo_list is processed quickly In [1], Will raised a potential issue that the cfg80211 code, which does (from a locking perspective) rtnl_lock() wiphy_lock() rtnl_unlock() might be susceptible to ABBA deadlocks, because rtnl_unlock() calls netdev_run_todo(), which might end up calling rtnl_lock() again, which could then deadlock (see the comment in the code added here for the scenario). Some back and forth and thinking ensued, but clearly this can't happen if the net_todo_list is empty at the rtnl_unlock() here. Clearly, the code here cannot actually put an entry on it, and all other users of rtnl_unlock() will empty it since that will always go through netdev_run_todo(), emptying the list. So the only other way to get there would be to add to the list and then unlock the RTNL without going through rtnl_unlock(), which is only possible through __rtnl_unlock(). However, this isn't exported and not used in many places, and none of them seem to be able to unregister before using it. Therefore, add a WARN_ON() in the code to ensure this invariant won't be broken, so that the cfg80211 (or any similar) code stays safe. [1] https://lore.kernel.org/r/Yjzpo3TfZxtKPMAG@google.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20220404113847.0ee02e4a70da.Ic73d206e217db20fd22dcec14fe5442ca732804b@changeid Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-05 14:28:16 -07:00
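
The descriptor leak described in the load_kallsyms() entry above (2d0df01974) follows a common pattern: a file is opened, then an "already loaded" check returns early without closing it. The sketch below is a user-space illustration only, not the actual selftest code or the actual patch. The names `load_symbols`, `sym_addrs`, and `MAX_SYMS` are hypothetical, and the input format (hex address, type character, name) is assumed to resemble /proc/kallsyms. It shows one way to avoid such a leak, by testing the cache before fopen() so that no early return holds an open file.

```c
#include <stdio.h>

#define MAX_SYMS 1024	/* hypothetical table size for this sketch */

static unsigned long sym_addrs[MAX_SYMS];
static int sym_cnt;

/*
 * Hypothetical loader: parses "<hex addr> <type> <name>" lines from path.
 * Returns 0 on success (or if symbols were already loaded), -1 on error.
 */
int load_symbols(const char *path)
{
	char buf[256], name[128], type;
	unsigned long addr;
	FILE *f;

	/* Check the cache before opening, so this return holds no descriptor. */
	if (sym_cnt > 0)
		return 0;

	f = fopen(path, "r");
	if (!f)
		return -1;

	while (sym_cnt < MAX_SYMS && fgets(buf, sizeof(buf), f)) {
		/* Stop at the first line that does not match the format. */
		if (sscanf(buf, "%lx %c %127s", &addr, &type, name) != 3)
			break;
		sym_addrs[sym_cnt++] = addr;
	}

	/* Single exit after the open: the file is always closed. */
	fclose(f);
	return 0;
}
```

A second call to `load_symbols()` returns immediately without touching the file, which is the situation in which the original early return is reported to have leaked.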

1088929 Commits