linux

iv/linux

Author	SHA1	Message	Date
Shannon Nelson	688a5efe0c	ionic: no transition while stopping Make sure we don't try to transition the fw_status_ready while we're still in the FW_STOPPING state, else we can get stuck in limbo waiting on a transition that already happened. While we're here we can remove a superfluous check on the lif pointer. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-28 11:42:45 +00:00
Eric Dumazet	b3483bc7a1	net/sysctl: avoid two synchronize_rcu() calls Both rps_sock_flow_sysctl() and flow_limit_cpu_sysctl() are using synchronize_rcu() right before freeing memory either by vfree() or kfree() They can switch to kvfree_rcu(ptr) and kfree_rcu(ptr) to benefit from asynchronous mode, instead of blocking the current thread. Note that kvfree_rcu(ptr) and kfree_rcu(ptr) eventually can have to use synchronize_rcu() in some memory pressure cases. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-28 11:40:47 +00:00
Lorenzo Bianconi	6a4696c428	net: netsec: enable pp skb recycling Similar to mvneta or mvpp2, enable page_pool skb recycling for netsec dirver. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-28 11:39:23 +00:00
Jia-Ju Bai	d4e26aaea7	atm: firestream: check the return value of ioremap() in fs_init() The function ioremap() in fs_init() can fail, so its return value should be checked. Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-28 11:36:01 +00:00
Casper Andersson	90d4025285	net: sparx5: Add #include to remove warning main.h uses NUM_TARGETS from main_regs.h, but the missing include never causes any errors because everywhere main.h is (currently) included, main_regs.h is included before. But since it is dependent on main_regs.h it should always be included. Signed-off-by: Casper Andersson <casper.casan@gmail.com> Reviewed-by: Joacim Zetterling <joacim.zetterling@westermo.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-28 11:34:26 +00:00
Tony Lu	6900de507c	net/smc: Call trace_smc_tx_sendmsg when data corked This also calls trace_smc_tx_sendmsg() even if data is corked. For ease of understanding, if statements are not expanded here. Link: https://lore.kernel.org/all/f4166712-9a1e-51a0-409d-b7df25a66c52@linux.ibm.com/ Fixes: 139653bc6635 ("net/smc: Remove corked dealyed work") Suggested-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-28 11:32:42 +00:00
Tony Lu	4d08b7b57e	net/smc: Fix cleanup when register ULP fails This patch calls smc_ib_unregister_client() when tcp_register_ulp() fails, and make sure to clean it up. Fixes: d7cd421da9da ("net/smc: Introduce TCP ULP support") Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-28 11:31:49 +00:00
David S. Miller	c4eb058ead	Merge branch 'flow_offload-tc-police-parameters' Jianbo Liu says: ==================== flow_offload: add tc police parameters As a preparation for more advanced police offload in mlx5 (e.g., jumping to another chain when bandwidth is not exceeded), extend the flow offload API with more tc-police parameters. Adjust existing drivers to reject unsupported configurations. Changes since v2: * Rename index to extval in exceed and notexceed acts. * Add policer validate functions for all drivers. Changes since v1: * Add one more strict validation for the control of drop/ok. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-28 11:12:39 +00:00
Jianbo Liu	d97b4b105c	flow_offload: reject offload for all drivers with invalid police parameters As more police parameters are passed to flow_offload, driver can check them to make sure hardware handles packets in the way indicated by tc. The conform-exceed control should be drop/pipe or drop/ok. Besides, for drop/ok, the police should be the last action. As hardware can't configure peakrate/avrate/overhead, offload should not be supported if any of them is configured. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-28 11:12:20 +00:00
Jianbo Liu	b8cd5831c6	net: flow_offload: add tc police action parameters The current police offload action entry is missing exceed/notexceed actions and parameters that can be configured by tc police action. Add the missing parameters as a pre-step for offloading police actions to hardware. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-28 11:11:35 +00:00
j.nixdorf@avm.de	9995b408f1	net: ipv6: ensure we call ipv6_mc_down() at most once There are two reasons for addrconf_notify() to be called with NETDEV_DOWN: either the network device is actually going down, or IPv6 was disabled on the interface. If either of them stays down while the other is toggled, we repeatedly call the code for NETDEV_DOWN, including ipv6_mc_down(), while never calling the corresponding ipv6_mc_up() in between. This will cause a new entry in idev->mc_tomb to be allocated for each multicast group the interface is subscribed to, which in turn leaks one struct ifmcaddr6 per nontrivial multicast group the interface is subscribed to. The following reproducer will leak at least $n objects: ip addr add ff2e::4242/32 dev eth0 autojoin sysctl -w net.ipv6.conf.eth0.disable_ipv6=1 for i in $(seq 1 $n); do ip link set up eth0; ip link set down eth0 done Joining groups with IPV6_ADD_MEMBERSHIP (unprivileged) or setting the sysctl net.ipv6.conf.eth0.forwarding to 1 (=> subscribing to ff02::2) can also be used to create a nontrivial idev->mc_list, which will the leak objects with the right up-down-sequence. Based on both sources for NETDEV_DOWN events the interface IPv6 state should be considered: - not ready if the network interface is not ready OR IPv6 is disabled for it - ready if the network interface is ready AND IPv6 is enabled for it The functions ipv6_mc_up() and ipv6_down() should only be run when this state changes. Implement this by remembering when the IPv6 state is ready, and only run ipv6_mc_down() if it actually changed from ready to not ready. The other direction (not ready -> ready) already works correctly, as: - the interface notification triggered codepath for NETDEV_UP / NETDEV_CHANGE returns early if ipv6 is disabled, and - the disable_ipv6=0 triggered codepath skips fully initializing the interface as long as addrconf_link_ready(dev) returns false - calling ipv6_mc_up() repeatedly does not leak anything Fixes: 3ce62a84d53c ("ipv6: exit early in addrconf_notify() if IPv6 is disabled") Signed-off-by: Johannes Nixdorf <j.nixdorf@avm.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-28 11:04:45 +00:00
Jann Horn	258dd90202	efivars: Respect "block" flag in efivar_entry_set_safe() When the "block" flag is false, the old code would sometimes still call check_var_size(), which wrongly tells ->query_variable_store() that it can block. As far as I can tell, this can't really materialize as a bug at the moment, because ->query_variable_store only does something on X86 with generic EFI, and in that configuration we always take the efivar_entry_set_nonblocking() path. Fixes: ca0e30dcaa53 ("efi: Add nonblocking option to efi_query_variable_store()") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220218180559.1432559-1-jannh@google.com	2022-02-28 10:07:50 +01:00
Sunil V L	dcf0c83885	riscv/efi_stub: Fix get_boot_hartid_from_fdt() return value The get_boot_hartid_from_fdt() function currently returns U32_MAX for failure case which is not correct because U32_MAX is a valid hartid value. This patch fixes the issue by returning error code. Cc: <stable@vger.kernel.org> Fixes: d7071743db31 ("RISC-V: Add EFI stub support.") Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Reviewed-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2022-02-28 10:07:49 +01:00
Linus Torvalds	7e57714cd0	Linux 5.17-rc6 v5.17-rc6	2022-02-27 14:36:33 -08:00
Linus Torvalds	52a0255467	A single fix for a regression caused by the recent PCI/MSI rework which resulted in a recursive locking problem in the VMD driver. The cure is to cache the relevant information upfront instead of retrieving it at runtime. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmIbQhATHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoX+ND/0ZY7N+NbHnOWz8aPRIelSgchSq4xEU jjNe+FIe32Zq8zmVDeS59aAXF/gbT4LwR8eL0clzpM+Sd0Rg7xyvYE5v9ltwgv17 3IJNnmJgLmeJazI5qMRSeDZcV5Ys0AIJYueVkDOkiMiJd0alLuGkRocOsejVdFhh 27mLu33tfnXf0qFCZHFUiQtrus5zgJWh+kKz2vOuzLUxF9QPUe+CCTyA9HVNRneh 94PFK7hjjbtyI65KLqSjEQRnGP3ddRwwII4EwE1aa+x/Fx6cDA6/L0PinpIDCSkh vXfODriwqW2Y9M4g3WrKLU69OB+LxVzV5pKcbC8Rrs9xOfNVGOBJNbzyqnR3nye6 jPOb1I5DF427LJpac8BQKcdu9kxwqTF8D77BWZpkjYdKbIFh5Otd0/DgKaLOH4EG u4eMSNsgYkFLTc1Aa59CrYdAM03yflYI0BJ0Sdrw+fZbhRoFFmuEMm9R7f6J6E4+ 2tbq8uZpZcqBP7YLbAuMmC1Km7fhMlGZNj/8XXHj2168wKmTmQm48J2bARkZmIPt Jk2el2wKM14gGttES2nqEf/UDrl8XCllTD+cRzBqEAjOv3himpsErZmuKxni6BAd pQozQpyJlK5swF7U3mZkalJE/btyVL6dzAzlDp0psZbDGFmFK5O+/F3kxQOpoGzo hsbHVeTZFmmWdA== =ukul -----END PGP SIGNATURE----- Merge tag 'irq-urgent-2022-02-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Thomas Gleixner: "A single fix for a regression caused by the recent PCI/MSI rework which resulted in a recursive locking problem in the VMD driver. The cure is to cache the relevant information upfront instead of retrieving it at runtime" * tag 'irq-urgent-2022-02-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: PCI: vmd: Prevent recursive locking on interrupt allocation	2022-02-27 13:07:40 -08:00
Linus Torvalds	98f3e84f8d	dma-mapping fix for Linux 5.17 - fix a swiotlb info leak (Halil Pasic) -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmIbvm8LHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYMcBg/8DD+QsKEEv+D/+bCSWZVbW1ekFDiDyEdO9xuSymxT pf56Dy693G0BoPZ1JFiN17LIOp3vsHQfwE4ZDo/OHmyDTcziepD40SZiHD3eqEDB cl7RXhAcoxnUMkgK+hIb6Q7t/4dXaC3+rVXUuL//xUjXZu7GML46tRzuDoVr9MZp MBSGdZlXIzfvpzNf+yVzpk7sZP/w4nrzyuzeE5T6fdipjN2o78PhWmZzqsHGujeI ptMuZQwwDuXXetQNiA/t0+DkeWCz/+ERI7k/6lAma14aKHjM1Jn8HcNqW0UGUuGV 148V4wKFqhaOXnoWuodrgjo9AJkOVqh+YH4ui60MMOUvZ4MPh4+5muobybzndtNy 3bAM3JbcNUH1/vyyazvwr22My/TapbydwPuXda63Ls/2WbnV66xbgFEm6S9GXC4X 6+ynK5CzWaTF5INgpd6WjrEMPXGCi0hXM9yQmJvlI1I0muFv9mBXhW/wP7OE3Na8 eQXuP/v+QqYaDVTsGTtySNPkwBstFa9GQUTghrnxKzk3giLVZfGPX5Bs4/+gl4gl rrGBNhUAnMOdvBSkLUf4hDgyGvsCiYtB1YX9QYxq2aRCbIT03164VnXB0lQRKnmz Yu547NO96tBnoIT9/gQYMTHvwPU1WjWi3jVbT2zpMKgG4DBUEEIH33P9S1vbgecY vLY= =bR6d -----END PGP SIGNATURE----- Merge tag 'dma-mapping-5.17-1' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fix from Christoph Hellwig: - fix a swiotlb info leak (Halil Pasic) * tag 'dma-mapping-5.17-1' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: fix info leak with DMA_FROM_DEVICE	2022-02-27 12:42:37 -08:00
Linus Torvalds	6676ba2a6d	Pin control fixes for the v5.17 series: - Fix some drive strength and pull-up code in the K210 driver. - Add the Alder Lake-M ACPI ID so it starts to work properly. - Use a static name for the StarFive GPIO irq_chip, forestalling an upcoming fixes series from Marc Zyngier. - Fix an ages old bug in the Tegra 186 driver where we were indexing at random into struct and being lucky getting the right member. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAmIaztAACgkQQRCzN7AZ XXOp+xAAvF525eqPH/wAvoj0p7KbpQQJ9gYpqCL4FNp+t/Wj3q+7ihgy/Y4iQ6wC vU72M8OQ+i7t/5R8l1cJUj/f/OA2+icNeD5L1+DD+4RB52wQvdbjz7XDqVqEHSFG YnV9YJFGQ3Tr62HU2MrWUCxsY13J6YHlRWHTcMoM0/fcRSHaDYU5mlgGPQV6fb94 WViv+PZncebY9PeyNm/wIpqL/VHqLI5fcaHz+0u6ppNTF7rGRBv7La/Du0mTbmlw rofbc2ynv+gIERyZBZ26UepBid2ZY4qaBzNy5S4srNeY8odlE+C9qCi/UcC3j3aM 1UgsiuZKvn7arR7uR6cKPQSeIEHS25zxbL+FXPa5wtg9KrNhZUG7LG1IB3M7jcK4 CiNj7zm9Zuy/qGGbMNWmmqpFk8ueL2fq7oE6K8oQa9HxzMFd48sB0Ckhyt5PCOEV zcLEo/WeIp3BqOJ4vZQquWO0lcEZr+2SeiGUaUJYwfZI7K2Myrc66hxKVUzBB+EK QWOQj+2W15qBkZd49ygQMJK5D8CQPBkT66AjqtZHr/7jk5H4S0oyyhJHyEWMPcSW oEk7UxKGfULG+zPfg6tCKQSN/QEyF9V2DZ5Ve13klWYZwR6uTZIGhQ2ZVWhJH7DN KXmTGOLGKUp1xR17t8hrAg60WPRoltofY21U/XiABDMGHgLmyYk= =es6M -----END PGP SIGNATURE----- Merge tag 'pinctrl-v5-17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: - Fix some drive strength and pull-up code in the K210 driver. - Add the Alder Lake-M ACPI ID so it starts to work properly. - Use a static name for the StarFive GPIO irq_chip, forestalling an upcoming fixes series from Marc Zyngier. - Fix an ages old bug in the Tegra 186 driver where we were indexing at random into struct and being lucky getting the right member. * tag 'pinctrl-v5-17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: gpio: tegra186: Fix chip_data type confusion pinctrl: starfive: Use a static name for the GPIO irq_chip pinctrl: tigerlake: Revert "Add Alder Lake-M ACPI ID" pinctrl: k210: Fix bias-pull-up pinctrl: fix loop in k210_pinconf_get_drive()	2022-02-27 12:30:54 -08:00
David S. Miller	b42a738e40	Merge branch 'dsa-fdb-isolation' Vladimir Oltean says: ==================== DSA FDB isolation There are use cases which need FDB isolation between standalone ports and bridged ports, as well as isolation between ports of different bridges. Most of these use cases are a result of the fact that packets can now be partially forwarded by the software bridge, so one port might need to send a packet to the CPU but its FDB lookup will see that it can forward it directly to a bridge port where that packet was autonomously learned. So the source port will attempt to shortcircuit the CPU and forward autonomously, which it can't due to the forwarding isolation we have in place. So we will have packet drops instead of proper operation. Additionally, before DSA can implement IFF_UNICAST_FLT for standalone ports, we must have control over which database we install FDB entries corresponding to port MAC addresses in. We don't want to hinder the operation of the bridging layer. DSA does not have a driver API that encourages FDB isolation, so this needs to be created. The basis for this is a new struct dsa_db which annotates each FDB and MDB entry with the database it belongs to. The sja1105 and felix drivers are modified to observe the dsa_db argument, and therefore, enforce the FDB isolation. Compared to the previous RFC patch series from August: https://patchwork.kernel.org/project/netdevbpf/cover/20210818120150.892647-1-vladimir.oltean@nxp.com/ what is different is that I stopped trying to make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE blocking, instead I'm making use of the fact that DSA waits for switchdev FDB work items to finish before a port leaves the bridge. This is possible since: https://patchwork.kernel.org/project/netdevbpf/patch/20211024171757.3753288-7-vladimir.oltean@nxp.com/ Additionally, v2 is also rebased over the DSA LAG FDB work. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
Vladimir Oltean	54c3198460	net: mscc: ocelot: enforce FDB isolation when VLAN-unaware Currently ocelot uses a pvid of 0 for standalone ports and ports under a VLAN-unaware bridge, and the pvid of the bridge for ports under a VLAN-aware bridge. Standalone ports do not perform learning, but packets received on them are still subject to FDB lookups. So if the MAC DA that a standalone port receives has been also learned on a VLAN-unaware bridge port, ocelot will attempt to forward to that port, even though it can't, so it will drop packets. So there is a desire to avoid that, and isolate the FDBs of different bridges from one another, and from standalone ports. The ocelot switch library has two distinct entry points: the felix DSA driver and the ocelot switchdev driver. We need to code up a minimal bridge_num allocation in the ocelot switchdev driver too, this is copied from DSA with the exception that ocelot does not care about DSA trees, cross-chip bridging etc. So it only looks at its own ports that are already in the same bridge. The ocelot switchdev driver uses the bridge_num it has allocated itself, while the felix driver uses the bridge_num allocated by DSA. They are both stored inside ocelot_port->bridge_num by the common function ocelot_port_bridge_join() which receives the bridge_num passed by value. Once we have a bridge_num, we can only use it to enforce isolation between VLAN-unaware bridges. As far as I can see, ocelot does not have anything like a FID that further makes VLAN 100 from a port be different to VLAN 100 from another port with regard to FDB lookup. So we simply deny multiple VLAN-aware bridges. For VLAN-unaware bridges, we crop the 4000-4095 VLAN region and we allocate a VLAN for each bridge_num. This will be used as the pvid of each port that is under that VLAN-unaware bridge, for as long as that bridge is VLAN-unaware. VID 0 remains only for standalone ports. It is okay if all standalone ports use the same VID 0, since they perform no address learning, the FDB will contain no entry in VLAN 0, so the packets will always be flooded to the only possible destination, the CPU port. The CPU port module doesn't need to be member of the VLANs to receive packets, but if we use the DSA tag_8021q protocol, those packets are part of the data plane as far as ocelot is concerned, so there it needs to. Just ensure that the DSA tag_8021q CPU port is a member of all reserved VLANs when it is created, and is removed when it is deleted. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
Vladimir Oltean	219827ef92	net: dsa: sja1105: enforce FDB isolation For sja1105, to enforce FDB isolation simply means to turn on Independent VLAN Learning unconditionally, and to remap VLAN-unaware FDB and MDB entries towards the private VLAN allocated by tag_8021q for each bridge. Standalone ports each have their own standalone tag_8021q VLAN. No learning happens in that VLAN due to: - learning being disabled on standalone user ports - learning being disabled on the CPU port (we use assisted_learning_on_cpu_port which only installs bridge FDBs) VLAN-aware ports learn FDB entries with the bridge VLANs. VLAN-unaware bridge ports learn with the tag_8021q VLAN for bridging. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
Vladimir Oltean	06b9cce426	net: dsa: pass extack to .port_bridge_join driver methods As FDB isolation cannot be enforced between VLAN-aware bridges in lack of hardware assistance like extra FID bits, it seems plausible that many DSA switches cannot do it. Therefore, they need to reject configurations with multiple VLAN-aware bridges from the two code paths that can transition towards that state: - joining a VLAN-aware bridge - toggling VLAN awareness on an existing bridge The .port_vlan_filtering method already propagates the netlink extack to the driver, let's propagate it from .port_bridge_join too, to make sure that the driver can use the same function for both. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
Vladimir Oltean	c26933639b	net: dsa: request drivers to perform FDB isolation For DSA, to encourage drivers to perform FDB isolation simply means to track which bridge does each FDB and MDB entry belong to. It then becomes the driver responsibility to use something that makes the FDB entry from one bridge not match the FDB lookup of ports from other bridges. The top-level functions where the bridge is determined are: - dsa_port_fdb_{add,del} - dsa_port_host_fdb_{add,del} - dsa_port_mdb_{add,del} - dsa_port_host_mdb_{add,del} aka the pre-crosschip-notifier functions. Changing the API to pass a reference to a bridge is not superfluous, and looking at the passed bridge argument is not the same as having the driver look at dsa_to_port(ds, port)->bridge from the ->port_fdb_add() method. DSA installs FDB and MDB entries on shared (CPU and DSA) ports as well, and those do not have any dp->bridge information to retrieve, because they are not in any bridge - they are merely the pipes that serve the user ports that are in one or multiple bridges. The struct dsa_bridge associated with each FDB/MDB entry is encapsulated in a larger "struct dsa_db" database. Although only databases associated to bridges are notified for now, this API will be the starting point for implementing IFF_UNICAST_FLT in DSA. There, the idea is to install FDB entries on the CPU port which belong to the corresponding user port's port database. These are supposed to match only when the port is standalone. It is better to introduce the API in its expected final form than to introduce it for bridges first, then to have to change drivers which may have made one or more assumptions. Drivers can use the provided bridge.num, but they can also use a different numbering scheme that is more convenient. DSA must perform refcounting on the CPU and DSA ports by also taking into account the bridge number. So if two bridges request the same local address, DSA must notify the driver twice, once for each bridge. In fact, if the driver supports FDB isolation, DSA must perform refcounting per bridge, but if the driver doesn't, DSA must refcount host addresses across all bridges, otherwise it would be telling the driver to delete an FDB entry for a bridge and the driver would delete it for all bridges. So introduce a bool fdb_isolation in drivers which would make all bridge databases passed to the cross-chip notifier have the same number (0). This makes dsa_mac_addr_find() -> dsa_db_equal() say that all bridge databases are the same database - which is essentially the legacy behavior. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
Vladimir Oltean	b6362bdf75	net: dsa: tag_8021q: rename dsa_8021q_bridge_tx_fwd_offload_vid The dsa_8021q_bridge_tx_fwd_offload_vid is no longer used just for bridge TX forwarding offload, it is the private VLAN reserved for VLAN-unaware bridging in a way that is compatible with FDB isolation. So just rename it dsa_tag_8021q_bridge_vid. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
Vladimir Oltean	04b67e18ce	net: dsa: tag_8021q: merge RX and TX VLANs In the old Shared VLAN Learning mode of operation that tag_8021q previously used for forwarding, we needed to have distinct concepts for an RX and a TX VLAN. An RX VLAN could be installed on all ports that were members of a given bridge, so that autonomous forwarding could still work, while a TX VLAN was dedicated for precise packet steering, so it just contained the CPU port and one egress port. Now that tag_8021q uses Independent VLAN Learning and imprecise RX/TX all over, those lines have been blurred and we no longer have the need to do precise TX towards a port that is in a bridge. As for standalone ports, it is fine to use the same VLAN ID for both RX and TX. This patch changes the tag_8021q format by shifting the VLAN range it reserves, and halving it. Previously, our DIR bits were encoding the VLAN direction (RX/TX) and were set to either 1 or 2. This meant that tag_8021q reserved 2K VLANs, or 50% of the available range. Change the DIR bits to a hardcoded value of 3 now, which makes tag_8021q reserve only 1K VLANs, and a different range now (the last 1K). This is done so that we leave the old format in place in case we need to return to it. In terms of code, the vid_is_dsa_8021q_rxvlan and vid_is_dsa_8021q_txvlan functions go away. Any vid_is_dsa_8021q is both a TX and an RX VLAN, and they are no longer distinct. For example, felix which did different things for different VLAN types, now needs to handle the RX and the TX logic for the same VLAN. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
Vladimir Oltean	08f44db3ab	net: dsa: felix: delete workarounds present due to SVL tag_8021q bridging The felix driver, which also has a tagging protocol implementation based on tag_8021q, does not care about adding the RX VLAN that is pvid on one port on the other ports that are in the same bridge with it. It simply doesn't need that, because in its implementation, the RX VLAN that is pvid of a port is only used to install a TCAM rule that pushes that VLAN ID towards the CPU port. Now that tag_8021q no longer performs Shared VLAN Learning based forwarding, the RX VLANs are actually segregated into two types: standalone VLANs and VLAN-unaware bridging VLANs. Since you actually have to call dsa_tag_8021q_bridge_join() to get a bridging VLAN from tag_8021q, and felix does not do that because it doesn't need it, it means that it only gets standalone port VLANs from tag_8021q. Which is perfect because this means it can drop its workarounds that avoid the VLANs it does not need. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:14 +00:00
Vladimir Oltean	d27656d02d	docs: net: dsa: sja1105: document limitations of tc-flower rule VLAN awareness After change "net: dsa: tag_8021q: replace the SVL bridging with VLAN-unaware IVL bridging", tag_8021q enforces two different pvids on a port, depending on whether it is standalone or in a VLAN-unaware bridge. Up until now, there was a single pvid, represented by dsa_tag_8021q_rx_vid(), and that was used as the VLAN for VLAN-unaware virtual link rules, regardless of whether the port was bridged or standalone. To keep VLAN-unaware virtual links working, we need to follow whether the port is in a bridge or not, and update the VLAN ID from those rules. In fact we can't fully do that. Depending on whether the switch is VLAN-aware or not, we can accept Virtual Link rules with just the MAC DA, or with a MAC DA and a VID. So we already deny changes to the VLAN awareness of the switch. But the VLAN awareness may also change as a result of joining or leaving a bridge. One might say we could just allow the following: a port may leave a VLAN-unaware bridge while it has VLAN-unaware VL (tc-flower) rules, and the driver will update those with the new tag_8021q pvid for standalone mode, but the driver won't accept joining a bridge at all while VL rules were installed in standalone mode. This is sort of a compromise made because leaving a bridge is an operation that cannot be vetoed. But this sort of setup change is not fully supported, either: as mentioned, VLAN filtering changes can also be triggered by leaving a bridge, therefore, the existing veto we have in place for turning VLAN filtering off with VLAN-aware VL rules active still isn't fully effective. I really don't know how to deal with this in a way that produces predictable behavior for user space. Since at the moment, keeping this feature fully functional on constellation changes (not changing the tag_8021q port pvid when joining a bridge) is blocking progress for the DSA FDB isolation, I'd rather document it as a (potentially temporary) limitation and go on without it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:13 +00:00
Vladimir Oltean	d7f9787a76	net: dsa: tag_8021q: add support for imprecise RX based on the VBID The sja1105 switch can't populate the PORT field of the tag_8021q header when sending a frame to the CPU with a non-zero VBID. Similar to dsa_find_designated_bridge_port_by_vid() which performs imprecise RX for VLAN-aware bridges, let's introduce a helper in tag_8021q for performing imprecise RX based on the VLAN that it has allocated for a VLAN-unaware bridge. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:13 +00:00
Vladimir Oltean	91495f21fc	net: dsa: tag_8021q: replace the SVL bridging with VLAN-unaware IVL bridging For VLAN-unaware bridging, tag_8021q uses something perhaps a bit too tied with the sja1105 switch: each port uses the same pvid which is also used for standalone operation (a unique one from which the source port and device ID can be retrieved when packets from that port are forwarded to the CPU). Since each port has a unique pvid when performing autonomous forwarding, the switch must be configured for Shared VLAN Learning (SVL) such that the VLAN ID itself is ignored when performing FDB lookups. Without SVL, packets would always be flooded, since FDB lookup in the source port's VLAN would never find any entry. First of all, to make tag_8021q more palatable to switches which might not support Shared VLAN Learning, let's just use a common VLAN for all ports that are under the same bridge. Secondly, using Shared VLAN Learning means that FDB isolation can never be enforced. But if all ports under the same VLAN-unaware bridge share the same VLAN ID, it can. The disadvantage is that the CPU port can no longer perform precise source port identification for these packets. But at least we have a mechanism which has proven to be adequate for that situation: imprecise RX (dsa_find_designated_bridge_port_by_vid), which is what we use for termination on VLAN-aware bridges. The VLAN ID that VLAN-unaware bridges will use with tag_8021q is the same one as we were previously using for imprecise TX (bridge TX forwarding offload). It is already allocated, it is just a matter of using it. Note that because now all ports under the same bridge share the same VLAN, the complexity of performing a tag_8021q bridge join decreases dramatically. We no longer have to install the RX VLAN of a newly joining port into the port membership of the existing bridge ports. The newly joining port just becomes a member of the VLAN corresponding to that bridge, and the other ports are already members of it from when they joined the bridge themselves. So forwarding works properly. This means that we can unhook dsa_tag_8021q_bridge_{join,leave} from the cross-chip notifier level dsa_switch_bridge_{join,leave}. We can put these calls directly into the sja1105 driver. With this new mode of operation, a port controlled by tag_8021q can have two pvids whereas before it could only have one. The pvid for standalone operation is different from the pvid used for VLAN-unaware bridging. This is done, again, so that FDB isolation can be enforced. Let tag_8021q manage this by deleting the standalone pvid when a port joins a bridge, and restoring it when it leaves it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 11:06:13 +00:00
David S. Miller	1bb1c5bc54	Merge branch 'FFungible-ethernet-driver' Dimitris Michailidis says: ==================== new Fungible Ethernet driver This patch series contains a new network driver for the Ethernet functionality of Fungible cards. It contains two modules. The first one in patch 2 is a library module that implements some of the device setup, queue managenent, and support for operating an admin queue. These are placed in a separate module because the cards provide a number of PCI functions handled by different types of drivers and all use the same common means to interact with the device. Each of the drivers will be relying on this library module for them. The remaining patches provide the Ethernet driver for the cards. v2: - Fix set_pauseparam, remove get_wol, remove module param (Andrew Lunn) - Fix a register poll loop (Andrew) - Replace constants defined with 'static const' - make W=1 C=1 is clean - Remove devlink FW update (Jakub) - Remove duplicate ethtool stats covered by structured API (Jakub) v3: - Make TLS stats unconditional (Andrew) - Remove inline from .c (Andrew) - Replace some ifdef with IS_ENABLED (Andrew) - Fix build failure on 32b arches (build robot) - Fix build issue with make O= (Jakub) v4: - Fix for newer bpf_warn_invalid_xdp_action() (Jakub) - Remove 32b dma_set_mask_and_coherent() v5: - Make XDP enter/exit non-disruptive to active traffic - Remove dormant port state - Style fixes, unused stuff removal (Jakub) v6: - When changing queue depth or numbers allocate the new queues before shutting down the existing ones (Jakub) v7: - Convert IRQ bookeeping to use XArray. - Changes to the numbers of Tx/Rx queues are now incremental and do not disrupt ongoing traffic. - Implement .ndo_eth_ioctl instead of .ndo_do_ioctl. - Replace deprecated irq_set_affinity_hint. - Remove TLS 1.3 support (Jakub) - Remove hwtstamp_config.flags check (Jakub) - Add locking in SR-IOV enable/disable. (Jakub) v8: - Remove dropping of <33B packets and the associated counter (Jakub) - Report CQE size. - Show last MAC stats when the netdev isn't running (Andrew) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:24 +00:00
Dimitris Michailidis	749efb1e6d	net/fungible: Kconfig, Makefiles, and MAINTAINERS Hook up the new driver to configuration and build. Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:23 +00:00
Dimitris Michailidis	a3662007a1	net/funeth: add kTLS TX control part This provides the control pieces for kTLS Tx offload, implementinng the offload operations. Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:23 +00:00
Dimitris Michailidis	db37bc177d	net/funeth: add the data path Add the driver's data path. Tx handles skbs, XDP, and kTLS, Rx has skbs and XDP. Also included are Rx and Tx queue creation/tear-down and tracing. Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:23 +00:00
Dimitris Michailidis	d1d899f244	net/funeth: devlink support The devlink part, which is minimal at this time giving just the driver name. Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:23 +00:00
Dimitris Michailidis	21c5ea95da	net/funeth: ethtool operations Add ethtool operations, primarily related to queues and ports, as well as device statistics. Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:23 +00:00
Dimitris Michailidis	ee6373ddf3	net/funeth: probing and netdev ops This is the first part of the Fungible ethernet driver. It deals with device probing, net_device creation, and netdev ops. Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:23 +00:00
Dimitris Michailidis	e1ffcc6681	net/fungible: Add service module for Fungible drivers Fungible cards have a number of different PCI functions and thus different drivers, all of which use a common method to initialize and interact with the device. This commit adds a library module that collects these common mechanisms. They mainly deal with device initialization, setting up and destroying queues, and operating an admin queue. A subset of the FW interface is also included here. Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:23 +00:00
Dimitris Michailidis	e8eb9e3299	PCI: Add Fungible Vendor ID to pci_ids.h Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-pci@vger.kernel.org Signed-off-by: Dimitris Michailidis <dmichail@fungible.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-27 10:51:23 +00:00
Linus Torvalds	2293be58d6	Tracing fixes for 5.17: - rtla (Real-Time Linux Analysis tool): fix typo in man page - rtla: Update API -e to -E before it is released - rlla: Error message fix and memory leak fix - Partially uninline trace event soft disable to shrink text - Fix function graph start up test - Have triggers affect the trace instance they are in and not top level - Have osnoise sleep in the units it says it uses - Remove unused ftrace stub function - Remove event probe redundant info from event in the buffer - Fix group ownership setting in tracefs - Ensure trace buffer is minimum size to prevent crashes -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYho7XBQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qiOhAQDbCbEjIYwkGCpckuGgSQiMU4bAWUzk jCz9PoaTxoIWJwEAsLWrAPb0pDzNwdEKjiC3fJoUJhz3NwlEjJ7hQ3BxzAI= =iXOQ -----END PGP SIGNATURE----- Merge tag 'trace-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - rtla (Real-Time Linux Analysis tool): - fix typo in man page - Update API -e to -E before it is released - Error message fix and memory leak fix - Partially uninline trace event soft disable to shrink text - Fix function graph start up test - Have triggers affect the trace instance they are in and not top level - Have osnoise sleep in the units it says it uses - Remove unused ftrace stub function - Remove event probe redundant info from event in the buffer - Fix group ownership setting in tracefs - Ensure trace buffer is minimum size to prevent crashes * tag 'trace-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: rtla/osnoise: Fix error message when failing to enable trace instance rtla/osnoise: Free params at the exit rtla/hist: Make -E the short version of --entries tracing: Fix selftest config check for function graph start up test tracefs: Set the group ownership in apply_options() not parse_options() tracing/osnoise: Make osnoise_main to sleep for microseconds ftrace: Remove unused ftrace_startup_enable() stub tracing: Ensure trace buffer is at least 4096 bytes large tracing: Uninline trace_trigger_soft_disabled() partly eprobes: Remove redundant event type information tracing: Have traceon and traceoff trigger honor the instance tracing: Dump stacktrace trigger to the corresponding instance rtla: Fix systme -> system typo on man page	2022-02-26 12:10:17 -08:00
Linus Torvalds	e41898d2ba	memblock: use kfree() to release kmalloced memblock regions memblock.{reserved,memory}.regions may be allocated using kmalloc() in memblock_double_array(). Use kfree() to release these kmalloced regions. -----BEGIN PGP SIGNATURE----- iQFHBAABCAAxFiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmIZ1qgTHHJwcHRAbGlu dXguaWJtLmNvbQAKCRA5A4Ymyw79kXsaB/0TnrLt98t/jPvVGinsnf7r3hXnNq7F 8FXWqdUIBWRfiHVd74pX6VE4Be56BbMUUyRQWDfbjrluVFnBibA3qJhNmpIuwdSb 9GESikUdEnuq0t059yPLupKvYY0ysq4OjNLWage+8tnA/TzlN/+t27c75iZWwGn2 JbutM/j5YKnvAcqUVv/plLVIVrGz1RCaG0diYoY1vxrbpRCicmAI8LHTkK1Xtow2 7YVkRuQWY+yJLOJ/SCst5pxy6cm3R96KvnaC9fg1Pp+8wVFrZp/hDsH8nObFccXq 6zQTbXqS88VKOoNEcuqk2ITbFyghepPIBrliEmcI2h96OSdp6BtrNau7 =UpBU -----END PGP SIGNATURE----- Merge tag 'fixes-2022-02-26' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock fix from Mike Rapoport: "Use kfree() to release kmalloced memblock regions memblock.{reserved,memory}.regions may be allocated using kmalloc() in memblock_double_array(). Use kfree() to release these kmalloced regions" * tag 'fixes-2022-02-26' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: memblock: use kfree() to release kmalloced memblock regions	2022-02-26 12:00:44 -08:00
Linus Torvalds	086ee11b03	Merge branch 'akpm' (patches from Andrew) Merge misc fixes from Andrew Morton: "12 patches. Subsystems affected by this patch series: MAINTAINERS, mailmap, memfd, and mm (hugetlb, kasan, hugetlbfs, pagemap, selftests, memcg, and slab)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: selftests/memfd: clean up mapping in mfd_fail_write mailmap: update Roman Gushchin's email MAINTAINERS, SLAB: add Roman as reviewer, git tree MAINTAINERS: add Shakeel as a memcg co-maintainer MAINTAINERS: remove Vladimir from memcg maintainers MAINTAINERS: add Roman as a memcg co-maintainer selftest/vm: fix map_fixed_noreplace test failure mm: fix use-after-free bug when mm->mmap is reused after being freed hugetlbfs: fix a truncation issue in hugepages parameter kasan: test: prevent cache merging in kmem_cache_double_destroy mm/hugetlb: fix kernel crash with hugetlb mremap MAINTAINERS: add sysctl-next git tree	2022-02-26 11:52:14 -08:00
Linus Torvalds	2c8c230eda	RISC-V Fixes for 5.17-rc6 * A fix for the K210 sdcard defconfig, to avoid using a fixed delay for the root FS. * A fix to make sure there's a proper call frame for trace_hardirqs_{on,off}(). --- There are a handful of additional fixes in flight, but not for this week. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAmIZHmQTHHBhbG1lckBk YWJiZWx0LmNvbQAKCRDvTKFQLMurQT4ND/42sEQLhcQcDpdvFDX/0zBr1Y8RNy25 7I9JBYmuTK5AfwmE52I/OcdCLE9bNELH1g+LMK/3amEqhkUtDelBb5UdC4TYfvRm SRlj75XKPxESMEW9EjU5BeAz+uDI4oMkOmDPyp+Xv/OayGrFQIPUTo75/SiOdlH7 a2khiH4/OxqkVlOff3Ko96M4RNSUeUIEVSfrH4pgJC8n+031u02TvR1IIx5TT7ti W5YIMw6VZ32Gl5ByZaBMbs9pz+iOKDrn3UfnPrVpbs6P3389EmR4btJpqfzN9JeC UQzcx4rqoDzTtJvqkOxiR+Ig4nNJGyeYVvxaGH67MkD/nz6rS26adIs+xPGKjDCC TtFyLt4h2+JX+1kNiutTLrQLAaQO4N+LSkysIsoSr9wNGCdnSrAQRxoOLwIuTMBS 61kRsBvuiuRJZQlbgkP2tiTug/8dYs4vQzPNeC5VO/c3MZB5/j2ykYdKBSsElrpi +br602CMdeqvT+M+pT9lWdxa8X9lbYVm1z3hx2FyRdbnYw3nbcQq/mGp8Ju9O5zt JXajwPFtUPpWXzm2CcRjeh+2GKoLetgVpwHAIOmnDd6meTXp/BEu12+o7c8vf3H7 k12BPHVPH1gPklG2Oh8Z/UvevICd4AHlSJT7J19xh7fYVMaaALVQ71Nd2jFbv9N/ eu8KKxR2pKXiPw== =1a16 -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A fix for the K210 sdcard defconfig, to avoid using a fixed delay for the root FS - A fix to make sure there's a proper call frame for trace_hardirqs_{on,off}(). * tag 'riscv-for-linus-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: fix oops caused by irqsoff latency tracer riscv: fix nommu_k210_sdcard_defconfig	2022-02-26 10:26:24 -08:00
Linus Torvalds	3bd9dd8138	Bug fixes for 5.17-rc4: - Only call sync_filesystem when we're remounting the filesystem readonly readonly, and actually check its return value. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAmIEnhUACgkQ+H93GTRK tOsn7g//e3R/lqpkx6jJtg6SqiC1KiI9euwD0wBdIvrCWSJZ6IdjOSvfRRG13vN0 S1spU0joiLLlVhzLIQdysgZkRub57P0mRmq3zVpHYxMOWKBvPH1OmZtdu83HOiAv /zjy3tNFc/1ZaqHudAZv3+4780qMZtQTmL7DbgLnvFspCBf4PdBlT0d7Wbf982w8 dMWF7Pla8DhLVFbMsGdyXlnGROz+pw3jofVwY9P6f+PaY37mo+lZu65GrTlNecnc QfTyX45VAWFO/XZtXm7pXCr8211eK2SnrOFZXZH9u3qxSD5vo1NWf9KPKVkYxc8/ 7icz+Yp5t61HQg3o0z7cNAQZp7CQl0BWz6gp2YXMWHS3ZJMnd6H4zTDBdV2MSA5/ alT4kcwncRVcmHtFET7JAsnQkWNeREBqhqCRoAf8hW8uxpjkXw6sPop7+hbZtoJw VAp1TxbEMbPGTZb76Kw4nZt1eZ3SyJOl6ByzsJMxekEFiMYVh4yxO+a3Q6KNOkMM O62JpzdE1EeFgV7qmoZ8QzCZuD7z7KC99iv5QtyacFITCqv5y0h/RLGCsOwJ0EMc fJGKN7uQOZrBIJYInx53S7fCYGGMm0+HUUXMUatBe4RK3dADyqapLzQb0tCGamAf NQra6NotwfNq8SN+Sn17PJ1KifSRKfw6l7Q+6pt9LA2eVbr2jV4= =6ODm -----END PGP SIGNATURE----- Merge tag 'xfs-5.17-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fixes from Darrick Wong: "Nothing exciting, just more fixes for not returning sync_filesystem error values (and eliding it when it's not necessary). Summary: - Only call sync_filesystem when we're remounting the filesystem readonly readonly, and actually check its return value" * tag 'xfs-5.17-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: only bother with sync_filesystem during readonly remount	2022-02-26 09:53:19 -08:00
Mike Kravetz	fda153c89a	selftests/memfd: clean up mapping in mfd_fail_write Running the memfd script ./run_hugetlbfs_test.sh will often end in error as follows: memfd-hugetlb: CREATE memfd-hugetlb: BASIC memfd-hugetlb: SEAL-WRITE memfd-hugetlb: SEAL-FUTURE-WRITE memfd-hugetlb: SEAL-SHRINK fallocate(ALLOC) failed: No space left on device ./run_hugetlbfs_test.sh: line 60: 166855 Aborted (core dumped) ./memfd_test hugetlbfs opening: ./mnt/memfd fuse: DONE If no hugetlb pages have been preallocated, run_hugetlbfs_test.sh will allocate 'just enough' pages to run the test. In the SEAL-FUTURE-WRITE test the mfd_fail_write routine maps the file, but does not unmap. As a result, two hugetlb pages remain reserved for the mapping. When the fallocate call in the SEAL-SHRINK test attempts allocate all hugetlb pages, it is short by the two reserved pages. Fix by making sure to unmap in mfd_fail_write. Link: https://lkml.kernel.org/r/20220219004340.56478-1-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-02-26 09:51:17 -08:00
Roman Gushchin	9502bdbf34	mailmap: update Roman Gushchin's email I'm moving to a @linux.dev account. Map my old addresses. Link: https://lkml.kernel.org/r/20220221200006.416377-1-roman.gushchin@linux.dev Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-02-26 09:51:17 -08:00
Vlastimil Babka	7b0112f343	MAINTAINERS, SLAB: add Roman as reviewer, git tree The slab code has an overlap with kmem accounting, where Roman has done a lot of work recently and it would be useful to make sure he's CC'd on patches that potentially affect it. Thus add him as a reviewer for the SLAB subsystem. Also while at it, add the link to slab git tree. Link: https://lkml.kernel.org/r/20220222103104.13241-1-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-02-26 09:51:17 -08:00
Shakeel Butt	bb9d545499	MAINTAINERS: add Shakeel as a memcg co-maintainer I have been contributing and reviewing to the memcg codebase for last couple of years. So, making it official. Link: https://lkml.kernel.org/r/20220224060148.4092228-1-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-02-26 09:51:17 -08:00
Vladimir Davydov	0a972e72e2	MAINTAINERS: remove Vladimir from memcg maintainers Link: https://lkml.kernel.org/r/4ad1f8da49d7b71c84a0c15bd5347f5ce704e730.1645608825.git.vdavydov.dev@gmail.com Signed-off-by: Vladimir Davydov <vdavydov.dev@gmail.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-02-26 09:51:17 -08:00
Roman Gushchin	7d547dcf97	MAINTAINERS: add Roman as a memcg co-maintainer Add myself as a memcg co-maintainer. My primary focus over last few years was the kernel memory accounting stack, but I do work on some other parts of the memory controller as well. Link: https://lkml.kernel.org/r/20220221233951.659048-1-roman.gushchin@linux.dev Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-02-26 09:51:17 -08:00
Aneesh Kumar K.V	f39c58008d	selftest/vm: fix map_fixed_noreplace test failure On the latest RHEL the test fails due to executable mapped at 256MB address # ./map_fixed_noreplace mmap() @ 0x10000000-0x10050000 p=0xffffffffffffffff result=File exists 10000000-10010000 r-xp 00000000 fd:04 34905657 /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace 10010000-10020000 r--p 00000000 fd:04 34905657 /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace 10020000-10030000 rw-p 00010000 fd:04 34905657 /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace 10029b90000-10029bc0000 rw-p 00000000 00:00 0 [heap] 7fffbb510000-7fffbb750000 r-xp 00000000 fd:04 24534 /usr/lib64/libc.so.6 7fffbb750000-7fffbb760000 r--p 00230000 fd:04 24534 /usr/lib64/libc.so.6 7fffbb760000-7fffbb770000 rw-p 00240000 fd:04 24534 /usr/lib64/libc.so.6 7fffbb780000-7fffbb7a0000 r--p 00000000 00:00 0 [vvar] 7fffbb7a0000-7fffbb7b0000 r-xp 00000000 00:00 0 [vdso] 7fffbb7b0000-7fffbb800000 r-xp 00000000 fd:04 24514 /usr/lib64/ld64.so.2 7fffbb800000-7fffbb810000 r--p 00040000 fd:04 24514 /usr/lib64/ld64.so.2 7fffbb810000-7fffbb820000 rw-p 00050000 fd:04 24514 /usr/lib64/ld64.so.2 7fffd93f0000-7fffd9420000 rw-p 00000000 00:00 0 [stack] Error: couldn't map the space we need for the test Fix this by finding a free address using mmap instead of hardcoding BASE_ADDRESS. Link: https://lkml.kernel.org/r/20220217083417.373823-1-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Jann Horn <jannh@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-02-26 09:51:17 -08:00
Suren Baghdasaryan	f798a1d4f9	mm: fix use-after-free bug when mm->mmap is reused after being freed oom reaping (__oom_reap_task_mm) relies on a 2 way synchronization with exit_mmap. First it relies on the mmap_lock to exclude from unlock path[1], page tables tear down (free_pgtables) and vma destruction. This alone is not sufficient because mm->mmap is never reset. For historical reasons[2] the lock is taken there is also MMF_OOM_SKIP set for oom victims before. The oom reaper only ever looks at oom victims so the whole scheme works properly but process_mrelease can opearate on any task (with fatal signals pending) which doesn't really imply oom victims. That means that the MMF_OOM_SKIP part of the synchronization doesn't work and it can see a task after the whole address space has been demolished and traverse an already released mm->mmap list. This leads to use after free as properly caught up by KASAN report. Fix the issue by reseting mm->mmap so that MMF_OOM_SKIP synchronization is not needed anymore. The MMF_OOM_SKIP is not removed from exit_mmap yet but it acts mostly as an optimization now. [1] 27ae357fa82b ("mm, oom: fix concurrent munlock and oom reaper unmap, v3") [2] 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently") [mhocko@suse.com: changelog rewrite] Link: https://lore.kernel.org/all/00000000000072ef2c05d7f81950@google.com/ Link: https://lkml.kernel.org/r/20220215201922.1908156-1-surenb@google.com Fixes: 64591e8605d6 ("mm: protect free_pgtables with mmap_lock write lock in exit_mmap") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reported-by: syzbot+2ccf63a4bd07cf39cab0@syzkaller.appspotmail.com Suggested-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Rik van Riel <riel@surriel.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: David Rientjes <rientjes@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Rik van Riel <riel@surriel.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jan Engelhardt <jengelh@inai.de> Cc: Tim Murray <timmurray@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-02-26 09:51:17 -08:00

... 7 8 9 10 11 ...

1075874 Commits