Jakub Kicinski
3a17ea77da Merge branch 'mlxsw-preparations-for-support-of-cff-flood-mode'
Petr Machata says:

====================
mlxsw: Preparations for support of CFF flood mode

PGT is an in-HW table that maps addresses to sets of ports. Then when some
HW process needs a set of ports as an argument, instead of embedding the
actual set in the dynamic configuration, what gets configured is the
address referencing the set. The HW then works with the appropriate PGT
entry.

Among other allocations, the PGT currently contains two large blocks for
bridge flooding: one for 802.1q and one for 802.1d. Within each of these
blocks are three tables, for unknown-unicast, multicast and broadcast
flooding:

      . . . |    802.1q    |    802.1d    | . . .
            | UC | MC | BC | UC | MC | BC |
             \______ _____/ \_____ ______/
                    v             v
                   FID flood vectors

Thus each FID (which corresponds to an 802.1d bridge or one VLAN in an
802.1q bridge) uses three flood vectors spread across a fairly large region
of PGT.

This way of organizing the flood table (called "controlled") is not very
flexible. E.g. to decrease bridge scale and store more IP MC vectors, one
would need to completely rewrite the bridge PGT blocks, or resort to hacks
such as storing individual MC flood vectors in an unused part of the bridge
table.

In order to address these shortcomings, Spectrum-2 and above support what
is called CFF flood mode, for Compressed FID Flooding. In CFF flood mode,
each FID has a little table of its own, with three entries adjacent to each
other, one for unknown-UC, one for MC, one for BC. This allows for a much
more fine-grained approach to PGT management, where bits of it are
allocated on demand.

      . . . | FID | FID | FID | FID | FID | . . .
            |U|M|B|U|M|B|U|M|B|U|M|B|U|M|B|
             \_____________ _____________/
                           v
                   FID flood vectors
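
To make the difference concrete, the addressing in the two layouts can be
sketched as follows (hypothetical helpers for illustration only, not the
actual mlxsw code):

#include <linux/types.h>

/* Illustration only: names and parameters are assumptions, not mlxsw code. */
enum ex_flood_type { EX_FLOOD_UC, EX_FLOOD_MC, EX_FLOOD_BC };

/* Controlled mode: one large per-type table within the bridge block,
 * indexed by the FID offset.
 */
static u16 ex_pgt_addr_controlled(u16 bridge_block_base, u16 per_type_table_len,
				  enum ex_flood_type type, u16 fid_offset)
{
	return bridge_block_base + type * per_type_table_len + fid_offset;
}

/* CFF mode: each FID owns three adjacent entries at its own base. */
static u16 ex_pgt_addr_cff(u16 fid_mid_base, enum ex_flood_type type)
{
	return fid_mid_base + type;
}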

Besides the FID table organization, the CFF flood mode also impacts the
Router Subport (RSP) table. This table contains flood vectors for rFIDs,
which are FIDs that reference front panel ports or LAGs. The RSP table
contains two entries per front panel port and LAG, one for unknown-UC
traffic, and one for everything else. Currently, the FW allocates and
manages the table in its own part of PGT. rFIDs are marked with the
flood_rsp bit and managed specially. In CFF mode, rFIDs are managed like
all other FIDs. The driver therefore has to allocate and maintain the
flood vectors. Like with bridge FIDs, this is more work, but it increases
the flexibility of the system.

The FW currently supports both the controlled and CFF flood modes. To shed
complexity, in the future it should only support CFF flood mode. Hence this
patchset, which is the first in a series of two to add CFF flood mode support
to mlxsw.

There are FW versions out there that do not support CFF flood mode, and on
Spectrum-1 in particular, there is no plan to support it at all. mlxsw will
therefore have to support both controlled flood mode and CFF.

Another aspect is that at least on Spectrum-1, there are FW versions out
there that claim to support CFF flood mode, but then reject or ignore
configurations enabling the same. The driver thus has to have a say in
whether an attempt to configure CFF flood mode should even be made.

Much like with the LAG mode, the feature is therefore expressed in terms of
"does the driver prefer CFF flood mode?", and "what flood mode the PCI
module managed to configure the FW with". This gives the driver a chance
to determine whether CFF flood mode configuration should be attempted.

In this patchset, we lay the ground with new definitions, registers and
their fields, and some minor code shaping. The next patchset will be more
focused on introducing necessary abstractions and implementation.

- Patches #1 and #2 add CFF-related items to the command interface.

- Patch #3 adds a new resource, for maximum number of flood profiles
  supported. (A flood profile is a mapping between traffic type and offset
  in the per-FID flood vector table.)

- Patches #4 to #8 adjust reg.h. The SFFP register is added, which is used
  for configuring the abovementioned traffic-type-to-offset mapping. The
  SFMR register, which serves for FID configuration, is extended with
  fields specific to CFF mode. And other minor adjustments.

- Patches #9 and #10 add the plumbing for CFF mode: a way to request that
  CFF flood mode be configured, and a way to query the flood mode that was
  actually configured.

- Patch #11 removes dead code.

- Patches #12 and #13 add helpers that the next patchset will make use of.
  Patch #14 moves RIF setup ahead so that FID code can make use of it.
====================

Link: https://lore.kernel.org/r/cover.1700503643.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:53:12 -08:00
Petr Machata
f7ebb40237 mlxsw: spectrum_router: Call RIF setup before obtaining FID
For subport RIFs, the setup initializes, among other things, RIF port and
LAG numbers. Those are important to determine where in the PGT the RIF FID
will be stored. Therefore, call the RIF setup before fid_get.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/f24d8cad7e4748b8e8e0e16894ca6a20704dea32.1700503644.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:53:09 -08:00
Petr Machata
27851dfaa3 mlxsw: spectrum_router: Add a helper to get subport number from a RIF
In the CFF flood mode, responsibility for management of the PGT entries for
rFIDs is moved from FW to the driver. All rFIDs are based off either a
front panel port, or a LAG port. The flood vectors for port-based rFIDs
enable just the port itself, while the ones for LAG-based rFIDs enable
all member ports of the LAG in question.

Since all rFIDs based off the same port have the same flood vector, and
similarly for LAG-based rFIDs, the flood entries are shared. The PGT
address of the flood vector is therefore determined based on the port (or
LAG) number of the RIF connected with the rFID.

Add a helper to determine subport number given a RIF, to be used in these
calculations.
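
As a rough sketch (hypothetical names, not the actual mlxsw code), the
shared-entry addressing described above boils down to:

#include <linux/types.h>

/* Illustration only: since all rFIDs based off the same port or LAG share
 * one flood vector, the PGT address depends only on the subport number of
 * the RIF that the rFID is connected with.
 */
static u16 ex_rfid_pgt_addr(u16 rfid_flood_base, u16 rif_subport)
{
	return rfid_flood_base + rif_subport;
}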

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/d7ab43cf5b021f785f363f236e4b6780d10eea93.1700503644.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:53:09 -08:00
Petr Machata
2b7bccd1f1 mlxsw: spectrum_fid: Extract SFMR packing into a helper
Both mlxsw_sp_fid_op() and mlxsw_sp_fid_edit_op() pack the core of SFMR the
same way. Extract the common code into a helper and call that. Extract out
of that a wrapper that just calls mlxsw_reg_sfmr_pack(), because it will
be useful for the dummy family later on.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/31f32b4d767183f6cb197148d0792feab2efadba.1700503644.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:53:09 -08:00
Petr Machata
b51c876c22 mlxsw: spectrum_fid: Drop unnecessary conditions
The caller already only calls mlxsw_sp_fid_flood_tables_init() and
mlxsw_sp_fid_flood_tables_fini() if (fid_family->flood_tables). There
is no configuration where the pointer is non-NULL but the number of
tables is zero. So drop the conditions.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/897c6841bc756ac632b797bf67ac83c6a66ba359.1700503644.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:53:09 -08:00
Petr Machata
9aad19a363 mlxsw: pci: Permit enabling CFF mode
There are FW versions out there that do not support CFF flood mode, and on
Spectrum-1 in particular, there is no plan to support it at all. mlxsw will
therefore have to support both controlled flood mode and CFF. There
are also FW versions out there that claim to support CFF flood mode, but
then reject or ignore configurations enabling the same. The driver thus has
to have a say in whether an attempt to configure CFF flood mode should even
be made, and what to use as a fallback.

Hence express the feature in terms of "does the driver prefer CFF flood
mode?", and "what flood mode the PCI module managed to configure the FW
with". This gives to the driver a chance to determine whether CFF flood
mode configuration should be attempted.

The latter bit was added in previous patches. In this patch, add the bit
that allows the driver to determine whether CFF enablement should be
attempted, and the enablement code itself.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/41640a0ee58e0a9538f820f7b601a0e35f6449e4.1700503644.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:53:09 -08:00
Petr Machata
0959159568 mlxsw: core, pci: Add plumbing related to CFF mode
CFF mode, for Compressed FID Flooding, is a way of organizing flood vectors
in the PGT table. The bus module determines whether CFF is supported, can
configure flood mode to CFF if it is, and knows what flood mode has been
configured. Therefore add a bus callback to determine the configured flood
mode. Also add an API to the core to query it.

Since after this patch, we rely on mlxsw_pci->flood_mode being set, it
becomes a coding error if a driver invokes this function with a set of
fields that misses the initialization. Warn and bail out in that case.

The CFF mode is not used as of this patch. The code to actually use it will
be added later.
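
A hedged sketch of the shape of such a query (names are hypothetical; the
actual mlxsw API differs in detail):

#include <linux/bug.h>
#include <linux/errno.h>
#include <linux/types.h>

enum ex_flood_mode {
	EX_FLOOD_MODE_CONTROLLED,
	EX_FLOOD_MODE_CFF,
};

struct ex_bus_info {
	bool flood_mode_valid;	/* set by the bus (PCI) module */
	enum ex_flood_mode flood_mode;
};

static int ex_core_flood_mode(const struct ex_bus_info *bus_info,
			      enum ex_flood_mode *mode)
{
	/* Invoking this before the bus set the flood mode is a coding
	 * error: warn and bail out.
	 */
	if (WARN_ON(!bus_info->flood_mode_valid))
		return -EINVAL;

	*mode = bus_info->flood_mode;
	return 0;
}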

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/889d58759dd40f5037f2206b9fc4a78a9240da80.1700503644.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:53:09 -08:00
Petr Machata
6b10371c38 mlxsw: reg: Add to SFMR register the fields related to CFF flood mode
Add the field cff_mid_base, which specifies at which point in PGT the
per-FID flood table is stored. Add cff_prf_id, the profile ID, which
determines on which row of the flood table a flood vector can be found for
a given traffic type.
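
For orientation, mlxsw declares register fields with the MLXSW_ITEM32()
helper; a hypothetical sketch of the two new fields follows (offsets,
shifts and widths below are placeholders, not the real SFMR layout):

/* Placeholder layout only: the 0x30/0x34 offsets and widths are made up. */

/* reg_sfmr_cff_mid_base
 * PGT base address of this FID's per-FID flood table (CFF mode only).
 */
MLXSW_ITEM32(reg, sfmr, cff_mid_base, 0x30, 0, 16);

/* reg_sfmr_cff_prf_id
 * Flood profile ID selecting the traffic-type-to-row mapping (CFF mode only).
 */
MLXSW_ITEM32(reg, sfmr, cff_prf_id, 0x34, 0, 2);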

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/3ad7ae38cf6534bedcd876f16090d109a814b3e3.1700503644.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:53:08 -08:00
Petr Machata
446bc1e9de mlxsw: reg: Extract flood-mode specific part of mlxsw_reg_sfmr_pack()
In CFF mode, it is necessary to set a different set of SFMR fields. Leave
in mlxsw_reg_sfmr_pack() only the common bits, and move the parts relevant
to controlled flood mode directly to the call site.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/6f29639ebc3ca0722272e6c644ca910096469413.1700503644.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:53:08 -08:00
Petr Machata
642d6a2033 mlxsw: reg: Drop unnecessary writes from mlxsw_reg_sfmr_pack()
The MLXSW_REG_ZERO at the beginning of the function wipes the whole
payload. There's no need to set vtfp and vv to false explicitly.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/04a51ea7cf31eea0ef7707311d8e864e2d9ef307.1700503644.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:53:08 -08:00
Petr Machata
7eb902954b mlxsw: reg: Mark SFGC & some SFMR fields as reserved in CFF mode
Some existing SFMR fields and the whole SFGC register are reserved in CFF
mode. Add the corresponding reservation notes to these fields.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/e1d5977a8cb778227e4ea2fd1515529957ce5de7.1700503643.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:53:08 -08:00
Petr Machata
e1e4ce6c6d mlxsw: reg: Add Switch FID Flooding Profiles Register
The SFFP register populates the FID flooding profile tables used for
NVE flooding and Compressed FID Flooding (CFF).

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/ca42eb67763bd0c7cf035afc62ef73632f3f61a6.1700503643.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:53:08 -08:00
Petr Machata
2d19da9277 mlxsw: resources: Add max_cap_nve_flood_prf
max_cap_nve_flood_prf describes the maximum number of NVE flooding profiles.
The same value also applies to the flooding profiles used in CFF flood mode.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/064a2e013d879e5f5494167a6c120c4bb85a2204.1700503643.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:53:08 -08:00
Petr Machata
50ee67789b mlxsw: cmd: Add MLXSW_CMD_MBOX_CONFIG_PROFILE_FLOOD_MODE_CFF
PGT, a port-group table, is an in-HW block of specialized memory that holds
sets of ports. Allocated within the PGT are series of flood tables that
describe to which ports traffic of various types (unknown UC, BC, MC)
should be flooded from which FID. The hitherto-used layout of these flood
tables is being replaced with a more flexible scheme, called compressed FID
flooding (CFF). CFF can be configured through CONFIG_PROFILE.flood_mode.

In this patch, add MLXSW_CMD_MBOX_CONFIG_PROFILE_FLOOD_MODE_CFF, the value
to use to enable the CFF mode.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/fc2e063742856492f8f22b0b87abf431ea6d53d0.1700503643.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:53:08 -08:00
Petr Machata
8405d66262 mlxsw: cmd: Add cmd_mbox.query_fw.cff_support
PGT, a port-group table, is an in-HW block of specialized memory that holds
sets of ports. Allocated within the PGT are series of flood tables that
describe to which ports traffic of various types (unknown UC, BC, MC)
should be flooded from which FID. The hitherto-used layout of these flood
tables is being replaced with a more flexible scheme, called compressed FID
flooding (CFF). CFF can be configured through CONFIG_PROFILE.flood_mode.

cff_support determines whether CONFIG_PROFILE.flood_mode can be set to CFF.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/af727d0e1095e30fa45c7e60404637cdc491aeec.1700503643.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:53:07 -08:00
Jakub Kicinski
dd891b5b10 net: do not send a MOVE event when netdev changes netns
Networking supports changing a netdevice's netns and name
at the same time. This allows avoiding name conflicts
and having to rename the interface in multiple steps.
E.g. netns1={eth0, eth1}, netns2={eth1} - we want
to move netns1:eth1 to netns2 and call it eth0 there.
If we can't rename "in flight" we'd need to (1) rename
eth1 -> $tmp, (2) change netns, (3) rename $tmp -> eth0.

To rename the underlying struct device we have to call
device_rename(). The rename()'s MOVE event, however, doesn't
"belong" to either the old or the new namespace.
If there are conflicts on both sides it's actually impossible
to issue a real MOVE (old name -> new name) without confusing
user space. And Daniel reports that such confusions do in fact
happen for systemd, in real life.

Since we already issue explicit REMOVE and ADD events
manually - suppress the MOVE event completely. Move
the ADD after the rename, so that the REMOVE uses
the old name, and the ADD the new one.

If there is no rename this changes the picture as follows:

Before:

old ns | KERNEL[213.399289] remove   /devices/virtual/net/eth0 (net)
new ns | KERNEL[213.401302] add      /devices/virtual/net/eth0 (net)
new ns | KERNEL[213.401397] move     /devices/virtual/net/eth0 (net)

After:

old ns | KERNEL[266.774257] remove   /devices/virtual/net/eth0 (net)
new ns | KERNEL[266.774509] add      /devices/virtual/net/eth0 (net)

If there is a rename and a conflict (using the exact eth0/eth1
example explained above) we get this:

Before:

old ns | KERNEL[224.316833] remove   /devices/virtual/net/eth1 (net)
new ns | KERNEL[224.318551] add      /devices/virtual/net/eth1 (net)
new ns | KERNEL[224.319662] move     /devices/virtual/net/eth0 (net)

After:

old ns | KERNEL[333.033166] remove   /devices/virtual/net/eth1 (net)
new ns | KERNEL[333.035098] add      /devices/virtual/net/eth0 (net)

Note that "in flight" rename is only performed when needed.
If there is no conflict for the old name in the target netns -
the rename will be performed separately by dev_change_name(),
as if the rename was a different command, and there will still
be a MOVE event for the rename:

Before:

old ns | KERNEL[194.416429] remove   /devices/virtual/net/eth0 (net)
new ns | KERNEL[194.418809] add      /devices/virtual/net/eth0 (net)
new ns | KERNEL[194.418869] move     /devices/virtual/net/eth0 (net)
new ns | KERNEL[194.420866] move     /devices/virtual/net/eth1 (net)

After:

old ns | KERNEL[71.917520] remove   /devices/virtual/net/eth0 (net)
new ns | KERNEL[71.919155] add      /devices/virtual/net/eth0 (net)
new ns | KERNEL[71.920729] move     /devices/virtual/net/eth1 (net)

If deleting the MOVE event breaks some user space we should insert
an explicit kobject_uevent(MOVE) after the ADD, like this:

@@ -11192,6 +11192,12 @@ int __dev_change_net_namespace(struct net_device *dev, struct net *net,
 	kobject_uevent(&dev->dev.kobj, KOBJ_ADD);
 	netdev_adjacent_add_links(dev);

+	/* User space wants an explicit MOVE event, issue one unless
+	 * dev_change_name() will get called later and issue one.
+	 */
+	if (!pat || new_name[0])
+		kobject_uevent(&dev->dev.kobj, KOBJ_MOVE);
+
 	/* Adapt owner in case owning user namespace of target network
 	 * namespace is different from the original one.
 	 */

Reported-by: Daniel Gröber <dxld@darkboxed.org>
Link: https://lore.kernel.org/all/20231010121003.x3yi6fihecewjy4e@House.clients.dxld.at/
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/all/20231120184140.578375-1-kuba@kernel.org/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:39:03 -08:00
Jose Ignacio Tornos Martinez
d2689b6a86 net: usb: ax88179_178a: avoid two consecutive device resets
The device is always reset two consecutive times (ax88179_reset is called
twice), once from usbnet_probe during device binding and again from
usbnet_open.

Remove the unnecessary reset during device binding and let the reset
operation from open keep the normal behavior (tested with a generic ASIX
Electronics Corp. AX88179 Gigabit Ethernet device).

Reported-by: Herb Wei <weihao.bj@ieisystem.com>
Tested-by: Herb Wei <weihao.bj@ieisystem.com>
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Link: https://lore.kernel.org/r/20231120121239.54504-1-jtornosm@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21 14:34:11 -08:00
Russell King (Oracle)
335662889f net: phylink: use for_each_set_bit()
Use for_each_set_bit() rather than open-coding the for() test_bit()
loop.
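
A generic illustration of the conversion (not the phylink code itself):

#include <linux/bitops.h>
#include <linux/printk.h>

static void ex_walk_bits(const unsigned long *mask, unsigned int nbits)
{
	unsigned int bit;

	/* Open-coded form this replaces:
	 *
	 * for (bit = 0; bit < nbits; bit++)
	 *	if (test_bit(bit, mask))
	 *		pr_info("bit %u is set\n", bit);
	 */

	/* Equivalent, using the helper: */
	for_each_set_bit(bit, mask, nbits)
		pr_info("bit %u is set\n", bit);
}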

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Link: https://lore.kernel.org/r/E1r4p15-00Cpxe-C7@rmk-PC.armlinux.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-21 13:26:03 +01:00
Baruch Siach
79a4f4dfa6 net: stmmac: reduce dma ring display code duplication
The code to show extended descriptors is identical to the code for normal
ones. Consolidate the code to remove duplication.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Link: https://lore.kernel.org/r/a2a5c5ce9338bdea60ec71d7eeb00fe757281557.1700372381.git.baruch@tkos.co.il
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-21 12:45:11 +01:00
Baruch Siach
7911deba29 net: stmmac: remove extra newline from descriptors display
One newline per line should be enough. Reduce the verbosity of
the descriptor dump.

Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Link: https://lore.kernel.org/r/444f3b1dd409fdb14ed2a1ae7679a86b110dadcd.1700372381.git.baruch@tkos.co.il
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-21 12:45:11 +01:00
Zhengchao Shao
d6b83f1e37 bonding: return -ENOMEM instead of BUG in alb_upper_dev_walk
If allocating "tags" fails, or the final upper device cannot be found from
start_dev's upper list in bond_verify_device_path(), only the loopback
detection of the current upper device should be affected; there is no need
to panic the system. So return -ENOMEM in alb_upper_dev_walk() to stop
walking, and print a warning when allocating memory for the vlan tags in
bond_verify_device_path() fails.

Also, given the call chain
netdev_walk_all_upper_dev_rcu
  ---->>> alb_upper_dev_walk
  ---------->>> bond_verify_device_path
the "end device" can always be reached from the "start device" in
bond_verify_device_path(), so IS_ERR(tags) can be used instead of
IS_ERR_OR_NULL(tags) in alb_upper_dev_walk().

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Link: https://lore.kernel.org/r/20231118081653.1481260-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-21 12:06:50 +01:00
Shinas Rasheed
0807dc76f3 octeon_ep: support Octeon CN10K devices
Add PCI Endpoint NIC support for Octeon CN10K devices.
CN10K devices are part of the Octeon 10 family of products with
similar PCI NIC characteristics. These include:
- CN10KA
- CNF10KA
- CNF10KB
- CN10KB

Update the supported device list in the documentation.

Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Link: https://lore.kernel.org/r/20231117103817.2468176-1-srasheed@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-21 10:19:56 +01:00
Lorenzo Bianconi
31c54867fd net: ethernet: mtk_wed: add support for devices with more than 4GB of dram
Introduce WED offloading support for boards with more than 4GB of
memory.

Co-developed-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/1c7efdf5d384ea7af3c0209723e40b2ee0f956bf.1700239272.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 18:12:59 -08:00
Jakub Kicinski
4da325cc61 Merge branch 'selftests-tc-testing-more-updates-to-tdc'
Pedro Tammela says:

====================
selftests: tc-testing: more updates to tdc

Address the issues making tdc time out on downstream CIs like lkp and
tuxsuite.
====================

Link: https://lore.kernel.org/r/20231117171208.2066136-1-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 18:06:38 -08:00
Pedro Tammela
4968afa014 selftests: tc-testing: report number of workers in use
Report the number of workers in use to process the test batches.
Since the number is now subject to a limit, avoid users getting
confused.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231117171208.2066136-7-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 18:06:36 -08:00
Pedro Tammela
4b480cfb10 selftests: tc-testing: timeout on unbounded loops
In the spirit of failing early, time out on unbounded loops that take
longer than 20 ticks to complete. Such loops exist to ensure that created
objects are already visible so tests can proceed without any issues.

If a test setup takes more than 20 ticks to see an object, there's
definitely something wrong.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231117171208.2066136-6-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 18:06:36 -08:00
Pedro Tammela
3f2d94a4ff selftests: tc-testing: leverage -all in suite ns teardown
Instead of listing lingering ns pinned files and deleting them one by one,
leverage '-all' from iproute2 to do it in a single process fork.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231117171208.2066136-5-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 18:06:36 -08:00
Pedro Tammela
3d5026fc5a selftests: tc-testing: use netns delete from pyroute2
When pyroute2 is available, use the native netns delete routine instead
of calling iproute2 to do it. As forks are expensive with some kernel
configs, minimize its usage to avoid kselftests timeouts.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231117171208.2066136-4-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 18:06:36 -08:00
Pedro Tammela
50a5988a7a selftests: tc-testing: move back to per test ns setup
Surprisingly, in kernel configs with most of the debug knobs turned on,
pre-allocating the test resources makes tdc run much slower overall than
when allocating resources on a per test basis.

As these knobs are used in kselftests in downstream CIs, let's go back
to the old way of doing things to avoid kselftests timeouts.

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202311161129.3b45ed53-oliver.sang@intel.com
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231117171208.2066136-3-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 18:06:36 -08:00
Pedro Tammela
025de7b6a6 selftests: tc-testing: cap parallel tdc to 4 cores
We have observed a lot of lock contention and test instability when running
with >8 cores, enough to actually make the tests run slower than with fewer
cores.

Cap the maximum number of cores used by parallel tdc to 4, which testing
showed to be a reasonable number for efficiency and stability in different
kernel config scenarios.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231117171208.2066136-2-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 18:06:35 -08:00
Jakub Kicinski
b1711d4310 Merge branch 'nfp-add-flow-steering-support'
Louis Peens says:

====================
nfp: add flow-steering support

This short series adds flow steering support for the nfp driver.
The first patch adds the part to communicate with ethtool but
stubs out the HW offload parts. The second patch implements the
HW communication and offloads flow steering.

After this series the user can now use 'ethtool -N/-n' to configure
and display rx classification rules.
====================

Link: https://lore.kernel.org/r/20231117071114.10667-1-louis.peens@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 18:04:32 -08:00
Yinjun Zhang
c38fb3dcd5 nfp: offload flow steering to the nfp
This is the second part of implementing flow steering. A mailbox is used
for the communication between the driver and HW.

Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231117071114.10667-3-louis.peens@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 18:04:30 -08:00
Yinjun Zhang
9eb03bb1c0 nfp: add ethtool flow steering callbacks
This is the first part of implementing flow steering. The communication
between ethtool and the driver is implemented. Users can use the following
commands to display and set flows:

ethtool -n <netdev>
ethtool -N <netdev> flow-type ...

Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231117071114.10667-2-louis.peens@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 18:04:30 -08:00
Jakub Kicinski
21612f52e4 Merge branch 'net-axienet-introduce-dmaengine'
Radhey Shyam Pandey says:

====================
net: axienet: Introduce dmaengine

The axiethernet driver can use the dmaengine framework to communicate
with the Xilinx DMA engine driver (AXIDMA, MCDMA). The inspiration behind
this dmaengine adoption is to reuse the in-kernel Xilinx DMA engine
driver[1] and remove the redundant DMA programming sequence[2] from the
ethernet driver. This simplifies the ethernet driver and also makes it
generic enough to be hooked to any compliant DMA IP, i.e. AXIDMA or MCDMA,
without any modification.

The dmaengine framework was extended for metadata API support during
the axidma RFC[3] discussion. However, it still needs further
enhancements to make it well suited for ethernet use cases.

Comments, suggestions, thoughts to implement remaining functional
features are very welcome!

[1]: https://github.com/torvalds/linux/blob/master/drivers/dma/xilinx/xilinx_dma.c
[2]: https://github.com/torvalds/linux/blob/master/drivers/net/ethernet/xilinx/xilinx_axienet_main.c#L238
[3]: http://lkml.iu.edu/hypermail/linux/kernel/1804.0/00367.html
[4]: https://lore.kernel.org/all/20221124102745.2620370-1-sarath.babu.naidu.gaddam@amd.com
====================

Link: https://lore.kernel.org/r/1700074613-1977070-1-git-send-email-radhey.shyam.pandey@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 17:52:25 -08:00
Radhey Shyam Pandey
6a91b846af net: axienet: Introduce dmaengine support
Add dmaengine framework support to communicate with the Xilinx DMA engine
driver (AXIDMA).

The AXI Ethernet driver uses separate channels for transmit and receive.
Add support for these channels to handle TX and RX with skbs and
appropriate callbacks. Also add the AXI Ethernet core interrupt for
dmaengine framework support.

The dmaengine framework was extended for metadata API support. However,
it still needs further enhancements to make it well suited for ethernet
use cases. The ethernet features, i.e. ethtool set/get of DMA IP
properties and ndo_poll_controller (mentioned in TODO), are not supported
and require follow-up discussion.

dmaengine support has a dependency on xilinx_dma as it uses the
xilinx_vdma_channel_set_config() API to reset the DMA IP,
which internally resets the MAC prior to accessing the MDIO.
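
A rough sketch of how a client driver obtains such channels from the
dmaengine framework (illustrative only; the channel names follow the
"dmas"/"dma-names" binding, error handling is simplified, and this is not
the actual axienet probe code):

#include <linux/device.h>
#include <linux/dmaengine.h>
#include <linux/err.h>

static int ex_request_dma_channels(struct device *dev,
				   struct dma_chan **tx_chan,
				   struct dma_chan **rx_chan)
{
	*tx_chan = dma_request_chan(dev, "tx_chan0");
	if (IS_ERR(*tx_chan))
		return PTR_ERR(*tx_chan);

	*rx_chan = dma_request_chan(dev, "rx_chan0");
	if (IS_ERR(*rx_chan)) {
		dma_release_channel(*tx_chan);
		return PTR_ERR(*rx_chan);
	}

	return 0;
}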

Benchmark with netperf:

xilinx-zcu102-20232:~$ netperf -H 192.168.10.20 -t TCP_STREAM
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 192.168.10.20 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

131072  16384  16384    10.02     886.69

xilinx-zcu102-20232:~$ netperf -H 192.168.10.20 -t UDP_STREAM
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 192.168.10.20 () port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992   65507   10.00       15851      0     830.66
212992           10.00       15851            830.66

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Link: https://lore.kernel.org/r/1700074613-1977070-4-git-send-email-radhey.shyam.pandey@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 17:52:22 -08:00
Sarath Babu Naidu Gaddam
6b1b40f704 net: axienet: Preparatory changes for dmaengine support
The axiethernet driver has inbuilt DMA programming. In order to add
dmaengine support and make its integration seamless, the current axidma
inbuilt programming code is put under a use_dmaengine check.

It also performs minor code reordering to minimize conditional
use_dmaengine checks; there is no functional change. It uses the
"dmas" property to identify whether it should use the dmaengine
framework or inbuilt axidma programming.

Signed-off-by: Sarath Babu Naidu Gaddam <sarath.babu.naidu.gaddam@amd.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Link: https://lore.kernel.org/r/1700074613-1977070-3-git-send-email-radhey.shyam.pandey@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 17:52:22 -08:00
Radhey Shyam Pandey
5e63c5ef7a dt-bindings: net: xlnx,axi-ethernet: Introduce DMA support
Xilinx 1G/2.5G Ethernet Subsystem provides 32-bit AXI4-Stream buses to
move transmit and receive Ethernet data to and from the subsystem.

These buses are designed to be used with an AXI Direct Memory Access(DMA)
IP or AXI Multichannel Direct Memory Access (MCDMA) IP core, AXI4-Stream
Data FIFO, or any other custom logic in any supported device.

Primary high-speed DMA data movement between system memory and stream
target is through the AXI4 Read Master to AXI4 memory-mapped to stream
(MM2S) Master, and AXI stream to memory-mapped (S2MM) Slave to AXI4
Write Master. AXI DMA/MCDMA enables channel of data movement on both
MM2S and S2MM paths in scatter/gather mode.

AXI DMA has two channels, whereas MCDMA has 16 TX and 16 RX channels.
To uniquely identify each channel, use the 'chan' suffix. Depending on the
use case, the AXI Ethernet driver can request any combination of
multichannel DMA channels using the generic dmas and dma-names properties.

Example:
dma-names = tx_chan0, rx_chan0, tx_chan1, rx_chan1;

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/1700074613-1977070-2-git-send-email-radhey.shyam.pandey@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 17:52:22 -08:00
Willem de Bruijn
a0bc96c0cd selftests: net: verify fq per-band packet limit
Commit 29f834aa326e ("net_sched: sch_fq: add 3 bands and WRR
scheduling") introduces multiple traffic bands, and per-band maximum
packet count.

Per-band limits ensure that packets in one class cannot fill the
entire qdisc and so cause DoS to the traffic in the other classes.

Verify this behavior:
  1. set the limit to 10 per band
  2. send 20 pkts on band A: verify that 10 are queued, 10 dropped
  3. send 20 pkts on band A: verify that  0 are queued, 20 dropped
  4. send 20 pkts on band B: verify that 10 are queued, 10 dropped

Packets must remain queued for a period to trigger this behavior.
Use SO_TXTIME to store packets for 100 msec.

The test reuses existing upstream test infra. The script is a fork of
cmsg_time.sh. The scripts call cmsg_sender.

The test extends cmsg_sender with two arguments:

* '-P' SO_PRIORITY
  There is a subtle difference between IPv4 and IPv6 stack behavior:
  PF_INET/IP_TOS        sets IP header bits and sk_priority
  PF_INET6/IPV6_TCLASS  sets IP header bits BUT NOT sk_priority

* '-n' num pkts
  Send multiple packets in quick succession.
  I first attempted a for loop in the script, but this is too slow in
  virtualized environments, causing flakiness as the 100ms timeout is
  reached and packets are dequeued.

Also do not wait for timestamps to be queued unless timestamps are
requested.
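
A userspace sketch of the two knobs the test relies on (illustrative, not
the cmsg_sender code; SO_TXTIME availability in libc headers and the exact
clock are assumptions):

#include <linux/net_tstamp.h>
#include <sys/socket.h>
#include <time.h>

static int ex_setup_socket(int fd, int prio)
{
	struct sock_txtime txtime = {
		.clockid = CLOCK_MONOTONIC,
		.flags = 0,
	};

	/* '-P': select the fq band via sk_priority. */
	if (setsockopt(fd, SOL_SOCKET, SO_PRIORITY, &prio, sizeof(prio)))
		return -1;

	/* Packets then carry an SCM_TXTIME cmsg with a delivery time
	 * (e.g. now + 100 ms) so they stay queued in the qdisc.
	 */
	return setsockopt(fd, SOL_SOCKET, SO_TXTIME, &txtime, sizeof(txtime));
}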

Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231116203449.2627525-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 17:48:36 -08:00
Vishvambar Panth S
45933b2db9 net: microchip: lan743x : bidirectional throughput improvement
The LAN743x/PCI11xxx DMA descriptors are always 4 dwords long, but the
device supports placing the descriptors in memory back to back or
reserving space in between them using its DMA_DESCRIPTOR_SPACE (DSPACE)
configurable hardware setting. Currently DSPACE is unnecessarily set to
match the host's L1 cache line size, resulting in space reserved in
between descriptors on most platforms and causing suboptimal behavior
(a single PCIe Mem transaction per descriptor). By changing the setting
to DSPACE=16 many descriptors can be packed in a single PCIe Mem
transaction resulting in a massive performance improvement in
bidirectional tests without any negative effects.
Tested and verified improvements on x64 PC and several ARM platforms
(typical data below)

Test setup 1: x64 PC with LAN7430 ---> x64 PC

iperf3 UDP bidirectional with DSPACE set to L1 CACHE Size:
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval           Transfer     Bitrate
[  5][TX-C]   0.00-10.00  sec   170 MBytes   143 Mbits/sec  sender
[  5][TX-C]   0.00-10.04  sec   169 MBytes   141 Mbits/sec  receiver
[  7][RX-C]   0.00-10.00  sec  1.02 GBytes   876 Mbits/sec  sender
[  7][RX-C]   0.00-10.04  sec  1.02 GBytes   870 Mbits/sec  receiver

iperf3 UDP bidirectional with DSPACE set to 16 Bytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval           Transfer     Bitrate
[  5][TX-C]   0.00-10.00  sec  1.11 GBytes   956 Mbits/sec  sender
[  5][TX-C]   0.00-10.04  sec  1.11 GBytes   951 Mbits/sec  receiver
[  7][RX-C]   0.00-10.00  sec  1.10 GBytes   948 Mbits/sec  sender
[  7][RX-C]   0.00-10.04  sec  1.10 GBytes   942 Mbits/sec  receiver

Test setup 2 : RK3399 with LAN7430 ---> x64 PC

RK3399 Spec:
The SOM-RK3399 is ARM module designed and developed by FriendlyElec.
Cores: 64-bit Dual Core Cortex-A72 + Quad Core Cortex-A53
Frequency: Cortex-A72(up to 2.0GHz), Cortex-A53(up to 1.5GHz)
PCIe: PCIe x4, compatible with PCIe 2.1, Dual operation mode

iperf3 UDP bidirectional with DSPACE set to L1 CACHE Size:
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval           Transfer     Bitrate
[  5][TX-C]   0.00-10.00  sec   534 MBytes   448 Mbits/sec  sender
[  5][TX-C]   0.00-10.05  sec   534 MBytes   446 Mbits/sec  receiver
[  7][RX-C]   0.00-10.00  sec  1.12 GBytes   961 Mbits/sec  sender
[  7][RX-C]   0.00-10.05  sec  1.11 GBytes   946 Mbits/sec  receiver

iperf3 UDP bidirectional with DSPACE set to 16 Bytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval           Transfer     Bitrate
[  5][TX-C]   0.00-10.00  sec   966 MBytes   810 Mbits/sec   sender
[  5][TX-C]   0.00-10.04  sec   965 MBytes   806 Mbits/sec   receiver
[  7][RX-C]   0.00-10.00  sec  1.11 GBytes   956 Mbits/sec   sender
[  7][RX-C]   0.00-10.04  sec  1.07 GBytes   919 Mbits/sec   receiver

Signed-off-by: Vishvambar Panth S <vishvambarpanth.s@microchip.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231116054350.620420-1-vishvambarpanth.s@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20 17:47:30 -08:00
Lorenzo Bianconi
94c81c6266 net: ethernet: mtk_wed: rely on __dev_alloc_page in mtk_wed_tx_buffer_alloc
Simplify the code and use __dev_alloc_page() instead of __dev_alloc_pages()
with order 0 in the mtk_wed_tx_buffer_alloc() routine.
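
The change amounts to the following (generic illustration, not the mtk_wed
code; the GFP flags here are an assumption):

#include <linux/skbuff.h>

static struct page *ex_alloc(void)
{
	/* Before: return __dev_alloc_pages(GFP_KERNEL, 0); */
	return __dev_alloc_page(GFP_KERNEL);	/* dedicated order-0 helper */
}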

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-19 19:49:20 +00:00
David S. Miller
69d5ee8c12 Merge branch 'am65-cpsw-ethtool-mac-stats'
Roger Quadros says:

====================
net: eth: am65-cpsw: add ethtool MAC stats

Gets the 'ethtool -S eth0 --groups eth-mac' command to work.

Also sets the default number of TX channels to the maximum available and
cleans up the error path in am65_cpsw_nuss_common_open().

Changelog:
v2:
- add __iomem to *stats, to prevent sparse warning
- clean up RX descriptors and free up SKB in error handling of
  am65_cpsw_nuss_common_open()
- Re-arrange some functions to avoid forward declarations
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-19 19:46:40 +00:00
Roger Quadros
ebd7bf60e2 net: ethernet: ti: am65-cpsw: Fix error handling in am65_cpsw_nuss_common_open()
k3_udma_glue_enable_rx/tx_chn returns an error code on failure.
Bail out on error while enabling the TX/RX channels.

In the error path, clean up the RX descriptors and SKBs.
Get rid of kmemleak_not_leak() as it seems unnecessary now.

Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-19 19:46:40 +00:00
Roger Quadros
be397ea347 net: ethernet: am65-cpsw: Set default TX channels to maximum
am65-cpsw supports 8 TX hardware queues. Set this as the default.

The rationale is that some am65-cpsw devices can have up to 4 ethernet
ports. If the number of TX channels has to be changed then all
interfaces have to be brought down and up, as the old default of 1
TX channel is too restrictive for any mqprio/taprio usage.

Another reason for this change is to allow testing using
kselftest:net/forwarding:ethtool_mm.sh out of the box.

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-19 19:46:40 +00:00
Roger Quadros
ac09946696 net: ethernet: ti: am65-cpsw: Re-arrange functions to avoid forward declaration
Re-arrange am65_cpsw_nuss_rx_cleanup(), am65_cpsw_nuss_xmit_free() and
am65_cpsw_nuss_tx_cleanup() to avoid forward declarations.

No functional change.

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-19 19:46:40 +00:00
Roger Quadros
67372d7a85 net: ethernet: am65-cpsw: Add standard Ethernet MAC stats to ethtool
Gets the 'ethtool -S eth0 --groups eth-mac' command to work.

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-19 19:46:40 +00:00
Li RongQing
ac40916a3f rtnetlink: introduce nlmsg_new_large and use it in rtnl_getlink
If a PF has 256 or more VFs, the ip link command will allocate memory of
order 3 or higher, and may trigger OOM due to memory fragmentation; the
memory size needed for the VFs is computed in rtnl_vfinfo_size.

So introduce nlmsg_new_large, which calls netlink_alloc_large_skb, in
which vmalloc is used for large allocations, to avoid the failure of
allocating memory.

    ip invoked oom-killer: gfp_mask=0xc2cc0(GFP_KERNEL|__GFP_NOWARN|\
	__GFP_COMP|__GFP_NOMEMALLOC), order=3, oom_score_adj=0
    CPU: 74 PID: 204414 Comm: ip Kdump: loaded Tainted: P           OE
    Call Trace:
    dump_stack+0x57/0x6a
    dump_header+0x4a/0x210
    oom_kill_process+0xe4/0x140
    out_of_memory+0x3e8/0x790
    __alloc_pages_slowpath.constprop.116+0x953/0xc50
    __alloc_pages_nodemask+0x2af/0x310
    kmalloc_large_node+0x38/0xf0
    __kmalloc_node_track_caller+0x417/0x4d0
    __kmalloc_reserve.isra.61+0x2e/0x80
    __alloc_skb+0x82/0x1c0
    rtnl_getlink+0x24f/0x370
    rtnetlink_rcv_msg+0x12c/0x350
    netlink_rcv_skb+0x50/0x100
    netlink_unicast+0x1b2/0x280
    netlink_sendmsg+0x355/0x4a0
    sock_sendmsg+0x5b/0x60
    ____sys_sendmsg+0x1ea/0x250
    ___sys_sendmsg+0x88/0xd0
    __sys_sendmsg+0x5e/0xa0
    do_syscall_64+0x33/0x40
    entry_SYSCALL_64_after_hwframe+0x44/0xa9
    RIP: 0033:0x7f95a65a5b70
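
A hedged sketch of the helper described above (the exact placement and
signature in the tree may differ):

#include <net/netlink.h>

struct sk_buff *nlmsg_new_large(size_t payload)
{
	/* netlink_alloc_large_skb() falls back to a vmalloc-backed skb for
	 * large sizes; broadcast is 0 for a unicast reply.
	 */
	return netlink_alloc_large_skb(nlmsg_total_size(payload), 0);
}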

Cc: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Link: https://lore.kernel.org/r/20231115120108.3711-1-lirongqing@baidu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-18 20:18:25 -08:00
Jakub Kicinski
459a70bae4 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
ice: one by one port representors creation

Michal Swiatkowski says:

Currently ice supports creating port representors only for VFs. For that
use case they can be created and removed in one step.

This patchset is refactoring current flow to support port representor
creation also for subfunctions and SIOV. In this case port representors
need to be created and removed one by one. Also, they can be added and
removed while other port representors are running.

To achieve that we need to change the switchdev configuration flow.
The first three patches are only cosmetic (renaming, removing unused code).
The next few are preparation for the new flow. The most important one
is "add VF representors one by one", which fully implements the new flow.

A new type of port representor (for subfunctions) will be introduced in a
follow-up patchset.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: reserve number of CP queues
  ice: adjust switchdev rebuild path
  ice: add VF representors one by one
  ice: realloc VSI stats arrays
  ice: set Tx topology every time new repr is added
  ice: allow changing SWITCHDEV_CTRL VSI queues
  ice: return pointer to representor
  ice: make representor code generic
  ice: remove VF pointer reference in eswitch code
  ice: track port representors in xarray
  ice: use repr instead of vf->repr
  ice: track q_id in representor
  ice: remove unused control VSI parameter
  ice: remove redundant max_vsi_num variable
  ice: rename switchdev to eswitch
====================

Link: https://lore.kernel.org/r/20231114181449.1290117-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-18 19:46:32 -08:00
Jakub Kicinski
a49296e070 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
igc: Add support for physical + free-running timers

Vinicius Costa Gomes says:

The objective is to allow having functionality that depends on the
physical timer (taprio and ETF offloads, for example) and vclocks
operating together.

The "big" missing piece is the implementation of the .getcyclesx64()
function in igc; as i225/i226 have multiple timers, we use one of
those timers (timer 1) as a free-running (non adjustable) timer.

The complication is that only implementing .getcyclesx64() and nothing
else will break synchronization when using vclocks, as reading the clock
will retrieve the free-running value but timestamps will come from the
adjustable timer. The solution is to modify the timestamping code "in one
go" to be able to retrieve the timestamp from the correct timer (if a
socket is "phc_bound" to a vclock the timestamp will come from the
free-running timer).

I was debating whether or not to do the adjustments for the internal latencies
for the free-running timestamps, and decided to do the adjustments so the path
delay when using vclocks is similar to the one when using the physical clock.

One future improvement is to implement the .getcrosscycles() function.
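
A generic, hedged sketch of the .getcyclesx64() shape (not the igc
implementation; ex_read_free_running_ns() is a hypothetical stand-in for
the i225/i226 timer-1 read):

#include <linux/ptp_clock_kernel.h>

static u64 ex_read_free_running_ns(void)
{
	return 0;	/* hypothetical: read the free-running HW timer here */
}

static int ex_getcyclesx64(struct ptp_clock_info *ptp,
			   struct timespec64 *ts,
			   struct ptp_system_timestamp *sts)
{
	ptp_read_system_prets(sts);
	*ts = ns_to_timespec64(ex_read_free_running_ns());
	ptp_read_system_postts(sts);
	return 0;
}

static const struct ptp_clock_info ex_ptp_caps = {
	.getcyclesx64	= ex_getcyclesx64,
	/* .gettimex64, .adjfine, ... keep driving the adjustable timer */
};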

* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  igc: Add support for PTP .getcyclesx64()
  igc: Simplify setting flags in the TX data descriptor
====================

Link: https://lore.kernel.org/r/20231114183640.1303163-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-18 19:42:29 -08:00
Jakub Kicinski
516cba96e8 Merge branch 'net-sched-cls_u32-use-proper-refcounts'
Pedro Tammela says:

====================
net/sched: cls_u32: use proper refcounts

In u32 we are open-coding refcounts of hashtables with integers, which is
far from ideal. Update those to use proper refcounts and add a couple of
tests to tdc that exercise the refcounts explicitly.
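
A generic illustration of the conversion (not the cls_u32 code itself):

#include <linux/refcount.h>

struct ex_hashtable {
	refcount_t refcnt;	/* was: int refcnt; */
};

static void ex_ht_hold(struct ex_hashtable *ht)
{
	refcount_inc(&ht->refcnt);		/* was: ht->refcnt++; */
}

static bool ex_ht_put(struct ex_hashtable *ht)
{
	/* was: return --ht->refcnt == 0;
	 * refcount_t additionally warns on underflow/overflow and saturates.
	 */
	return refcount_dec_and_test(&ht->refcnt);
}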
====================

Link: https://lore.kernel.org/r/20231114141856.974326-1-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-18 19:38:25 -08:00
Pedro Tammela
54293e4d6a selftests/tc-testing: add hashtable tests for u32
Add tests to specifically check for the refcount interactions of
hashtables created by u32. These tables should not be deleted when
referenced and the flush order should respect a tree-like composition.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231114141856.974326-3-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-18 19:38:23 -08:00