linux

iv/linux

Author	SHA1	Message	Date
Petr Machata	5ade50e2df	selftests: router_vid_1: Add a diagram, fix coding style It is customary for selftests to have a comment with a topology diagram, which serves to illustrate the situation in which the test is done. This selftest lacks it. Add it. While at it, fix the list of tests so that the test names are enumerated one at a line. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 11:21:32 +01:00
Petr Machata	18d2c710e5	selftests: mlxsw: bail_on_lldpad before installing the cleanup trap A number of mlxsw-specific QoS tests use manual QoS DCB management. As such, they need to make sure lldpad is not running, because it would override the configuration the test has applied using other tools. To that end, these selftests invoke the bail_on_lldpad() helper, which terminates the selftest if th lldpad is running. Some of these tests however first install the bash exit trap, which invokes a cleanup() at the test exit. If bail_on_lldpad() has terminated the script even before the setup part was run, the cleanup part will be very confused. Therefore make sure bail_on_lldpad() is invoked before the cleanup is registered. While there are still edge cases where the user terminates the script before the setup was fully done, this takes care of a common situation where the cleanup would be invoked in an inconsistent state. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 11:21:32 +01:00
David S. Miller	39e85fe011	Merge branch 'sfc-Siena-subdir' Martin Habets says: ==================== sfc: Move Siena into a separate subdirectory The Siena NICs (SFN5000 and SFN6000 series) went EOL in November 2021. Most of these adapters have been remove from our test labs, and testing has been reduced to a minimum. This patch series creates a separate kernel module for the Siena architecture, analogous to what was done for Falcon some years ago. This reduces our maintenance for the sfc.ko module, and allows us to enhance the EF10 and EF100 drivers without the risk of breaking Siena NICs. After this series further enhancements are needed to differentiate the new kernel module from sfc.ko, and the Siena code can be removed from sfc.ko. Thes will be posted as a small follow-up series. The Siena module is not built by default, but can be enabled using Kconfig option SFC_SIENA. This will create module sfc-siena.ko. Patches Patch 1 disables the Siena code in the sfc.ko module. Patches 2-6 establish the code base for the Siena driver. Patches 7-12 ensure the allyesconfig build succeeds. Patch 13 adds the basic Siena module. I do not expect patch 2 through 5 to be reviewed, they are FYI only. No checkpatch issues were resolved as part of these, but they were fixed in the subsequent patches. Testing Various build tests were done such as allyesconfig, W=1 and sparse. The new sfc-siena.ko and sfc.ko modules were tested on a machine with both these NICs in them, and several tests were run on both drivers. Martin --- v3: - Fix build errors after rebase. v2: - Split up patch that copies existing files. - Only copy a subset of mcdi_pcol.h. - Use --find-copies-harder as suggested by Benjamin Poirier. - Merge several patches for the allyesconfig build into larger ones. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 11:18:08 +01:00
Martin Habets	6b73f20ab6	sfc: Copy a subset of mcdi_pcol.h to siena For Siena we do not need new messages that were defined for the EF100 architecture. Several debug messages have also been removed. Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 11:18:08 +01:00
Martin Habets	0c38a5bd60	sfc: Disable Siena support Disable the build of Siena code until later in this patch series. Prevent sfc.ko from binding to Siena NICs. efx_init_sriov/efx_fini_sriov is only used for Siena. Remove calls to those. Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 11:18:08 +01:00
David S. Miller	402f2d6b6b	mlx5-updates-2022-05-03 Leon Romanovsky Says: ===================== Extra IPsec cleanup After FPGA IPsec removal, we can go further and make sure that flow steering logic is aligned to mlx5_core standard together with deep cleaning of whole IPsec path. ===================== -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmJyFjcACgkQSD+KveBX +j5UUwgAxKTtSaC2Qd/3Wzn9YBitvrS/y/O7QFEaCyKTn8SMcyFHYxySwSgaABJ2 fbx1FgCRvj0XGJzMs8gN4IcUDai/ff6TlN1KstwjiDV7Xj6yy/6LnqR32YJafJ9Q zKt3c4Ws662/VDryaTbVgScRU/h11HmESM+BEzxowN+S3MExGaJfAUquee4YAip+ KVV6Eas/vjmZcmHPy0+Pk2pndO0cIx7vK//cMeA2bmXsdF+3IqLlzQVy/KVbd7ro +pOZKVJIKF19ToXHGVMB/DQjTbSg3rWFb7kzUTksZO6o/5z2UYp6/PX6UuMgky0S 5AvZxHXq+CoLGWJZvpxCrwbkEQ+2Ng== =/07+ -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2022-05-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-updates-2022-05-03 Leon Romanovsky Says: ===================== Extra IPsec cleanup After FPGA IPsec removal, we can go further and make sure that flow steering logic is aligned to mlx5_core standard together with deep cleaning of whole IPsec path. ===================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 10:53:00 +01:00
David S. Miller	ad0724b90a	mlx5-fixes-2022-05-03 -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmJyJHcACgkQSD+KveBX +j60dQf/UFKW7VzUfUyT4+pgMZHdCQLsEkf7lgHzFjcWD3XaiBr7him3RIFl0ZrI 0mD2AHA7YT6ePtub8Pjyqbfu4hC7rYWMDW6J2iqV7V05co/AVdWLFgxex1zEMhK8 nD2f4nFkgs5rOeSn6tOj0ttSlXQOPmPPh5FfIGy2oiGTeNWOYG5URw1crOoNe+HG 0Z4uAmQDXETFLgim+A8DDTrbjSY60EDT/qnpOhPISpE/EtiG8OCaPDtcsSEmF9re 0LebIKUvx4X9fsHPvxyC0TfKvYpxSHkbz+VAx//O8+aGmHKgco07/3aeWM5Z9D/7 ulbQJQTIwcsM3GwlAFygCjFWFqiLvQ== =gXIJ -----END PGP SIGNATURE----- Merge tag 'mlx5-fixes-2022-05-03' of git://git.kernel.org/pub/scm/linux/kernel/g it/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2022-05-03 This series provides bug fixes to mlx5 driver. Please pull and let me know if there is any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 10:51:05 +01:00
David S. Miller	6a9b3de825	Merge branch 'mptcp-pathmanager-api' Mat Martineau says: ==================== mptcp: Userspace path manager API Userspace path managers (PMs) make use of generic netlink MPTCP events and commands to control addition and removal of MPTCP subflows on an existing MPTCP connection. The path manager events have already been upstream for a while, and this patch series adds four netlink commands for userspace: * MPTCP_PM_CMD_ANNOUNCE: advertise an address that's available for additional subflow connections. * MPTCP_PM_CMD_REMOVE: revoke an advertisement * MPTCP_PM_CMD_SUBFLOW_CREATE: initiate a new subflow on an existing MPTCP connection * MPTCP_PM_CMD_SUBFLOW_DESTROY: close a subflow on an existing MPTCP connection Userspace path managers, such as mptcpd, can be more easily customized for different devices. The in-kernel path manager remains available to handle server use cases. Patches 1-3 update common path manager code (used by both in-kernel and userspace PMs) Patches 4, 6, and 8 implement the new generic netlink commands. Patches 5, 7, and 9-13 add self test support and test cases for the new path manager commands. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 10:49:32 +01:00
Kishen Maloor	259a834fad	selftests: mptcp: functional tests for the userspace PM type This change adds a selftest script that performs a comprehensive behavioral/functional test of all userspace PM capabilities by exercising all the newly added APIs and changes to support said capabilities. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 10:49:32 +01:00
Kishen Maloor	bdde081d72	selftests: mptcp: create listeners to receive MPJs This change updates the "pm_nl_ctl" testing sample with a "listen" option to bind a MPTCP listening socket to the provided addr+port. This option is exercised in testing subflow initiation scenarios in conjunction with userspace path managers where the MPTCP application does not hold an active listener to accept requests for new subflows. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 10:49:32 +01:00
Kishen Maloor	b3e5fd653d	selftests: mptcp: capture netlink events This change adds to self-testing support for the MPTCP netlink interface by capturing various MPTCP netlink events (and all their metadata) associated with connections, subflows and address announcements. It is used in self-testing scripts that exercise MPTCP netlink commands to precisely validate those operations by examining the dispatched MPTCP netlink events in response to those commands. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 10:49:32 +01:00
Kishen Maloor	57cc361b8d	selftests: mptcp: support MPTCP_PM_CMD_SUBFLOW_DESTROY This change updates the "pm_nl_ctl" testing sample with a "dsf" (destroy subflow) option to support the newly added netlink interface command MPTCP_PM_CMD_SUBFLOW_DESTROY over the chosen MPTCP connection. E.g. ./pm_nl_ctl dsf lip 10.0.2.1 lport 44567 rip 10.0.2.2 rport 56789 token 823274047 Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 10:49:32 +01:00
Kishen Maloor	cf8d0a6dfd	selftests: mptcp: support MPTCP_PM_CMD_SUBFLOW_CREATE This change updates the "pm_nl_ctl" testing sample with a "csf" (create subflow) option to support the newly added netlink interface command MPTCP_PM_CMD_SUBFLOW_CREATE over the chosen MPTCP connection. E.g. ./pm_nl_ctl csf lip 10.0.2.1 lid 23 rip 10.0.2.2 rport 56789 token 823274047 Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 10:49:32 +01:00
Florian Westphal	702c2f646d	mptcp: netlink: allow userspace-driven subflow establishment This allows userspace to tell kernel to add a new subflow to an existing mptcp connection. Userspace provides the token to identify the mptcp-level connection that needs a change in active subflows and the local and remote addresses of the new or the to-be-removed subflow. MPTCP_PM_CMD_SUBFLOW_CREATE requires the following parameters: { token, { loc_id, family, loc_addr4 \| loc_addr6 }, { family, rem_addr4 \| rem_addr6, rem_port } MPTCP_PM_CMD_SUBFLOW_DESTROY requires the following parameters: { token, { family, loc_addr4 \| loc_addr6, loc_port }, { family, rem_addr4 \| rem_addr6, rem_port } Acked-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 10:49:32 +01:00
Kishen Maloor	ecd2a77d67	selftests: mptcp: support MPTCP_PM_CMD_REMOVE This change updates the "pm_nl_ctl" testing sample with a "rem" (remove) option to support the newly added netlink interface command MPTCP_PM_CMD_REMOVE to issue a REMOVE_ADDR signal over the chosen MPTCP connection. E.g. ./pm_nl_ctl rem token 823274047 id 23 Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 10:49:32 +01:00
Kishen Maloor	d9a4594eda	mptcp: netlink: Add MPTCP_PM_CMD_REMOVE This change adds a MPTCP netlink command for issuing a REMOVE_ADDR signal for an address over the chosen MPTCP connection from a userspace path manager. The command requires the following parameters: {token, loc_id}. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 10:49:31 +01:00
Kishen Maloor	9a0b36509d	selftests: mptcp: support MPTCP_PM_CMD_ANNOUNCE This change updates the "pm_nl_ctl" testing sample with an "ann" (announce) option to support the newly added netlink interface command MPTCP_PM_CMD_ANNOUNCE to issue ADD_ADDR advertisements over the chosen MPTCP connection. E.g. ./pm_nl_ctl ann 192.168.122.75 token 823274047 id 25 dev enp1s0 Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 10:49:31 +01:00
Kishen Maloor	9ab4807c84	mptcp: netlink: Add MPTCP_PM_CMD_ANNOUNCE This change adds a MPTCP netlink interface for issuing ADD_ADDR advertisements over the chosen MPTCP connection from a userspace path manager. The command requires the following parameters: { token, { loc_id, family, daddr4 \| daddr6 [, dport] } [, if_idx], flags[signal] }. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 10:49:31 +01:00
Florian Westphal	982f17ba1a	mptcp: netlink: split mptcp_pm_parse_addr into two functions Next patch will need to parse MPTCP_PM_ATTR_ADDR attributes and fill an mptcp_addr_info structure from a different genl command callback. To avoid copy-paste, split the existing function to a helper that does the common part and then call the helper from the (renamed)mptcp_pm_parse_entry function. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 10:49:31 +01:00
Kishen Maloor	8b20137012	mptcp: read attributes of addr entries managed by userspace PMs This change introduces a parallel path in the kernel for retrieving the local id, flags, if_index for an addr entry in the context of an MPTCP connection that's being managed by a userspace PM. The userspace and in-kernel PM modes deviate in their procedures for obtaining this information. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 10:49:31 +01:00
Kishen Maloor	4638de5aef	mptcp: handle local addrs announced by userspace PMs This change adds an internal function to store/retrieve local addrs announced by userspace PM implementations to/from its kernel context. The function addresses the requirements of three scenarios: 1) ADD_ADDR announcements (which require that a local id be provided), 2) retrieving the local id associated with an address, and also where one may need to be assigned, and 3) reissuance of ADD_ADDRs when there's a successful match of addr/id. The list of all stored local addr entries is held under the MPTCP sock structure. Memory for these entries is allocated from the sock option buffer, so the list of addrs is bounded by optmem_max. The list if not released via REMOVE_ADDR signals is ultimately freed when the sock is destructed. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-04 10:49:31 +01:00
Hector Martin	2ac2fab529	iommu/dart: Add missing module owner to ops structure This is required to make loading this as a module work. Signed-off-by: Hector Martin <marcan@marcan.st> Fixes: `46d1fb072e` ("iommu/dart: Add DART iommu driver") Reviewed-by: Sven Peter <sven@svenpeter.dev> Link: https://lore.kernel.org/r/20220502092238.30486-1-marcan@marcan.st Signed-off-by: Joerg Roedel <jroedel@suse.de>	2022-05-04 10:36:25 +02:00
Mark Bloch	a042d7f5bb	net/mlx5: Fix matching on inner TTC The cited commits didn't use proper matching on inner TTC as a result distribution of encapsulated packets wasn't symmetric between the physical ports. Fixes: `4c71ce50d2` ("net/mlx5: Support partial TTC rules") Fixes: `8e25a2bc66` ("net/mlx5: Lag, add support to create TTC tables for LAG port selection") Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-04 00:00:07 -07:00
Moshe Shemesh	fc3d3db07b	net/mlx5: Avoid double clear or set of sync reset requested Double clear of reset requested state can lead to NULL pointer as it will try to delete the timer twice. This can happen for example on a race between abort from FW and pci error or reset. Avoid such case using test_and_clear_bit() to verify only one time reset requested state clear flow. Similarly use test_and_set_bit() to verify only one time reset requested state set flow. Fixes: `7dd6df329d` ("net/mlx5: Handle sync reset abort event") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-04 00:00:06 -07:00
Moshe Shemesh	cb7786a76e	net/mlx5: Fix deadlock in sync reset flow The sync reset flow can lead to the following deadlock when poll_sync_reset() is called by timer softirq and waiting on del_timer_sync() for the same timer. Fix that by moving the part of the flow that waits for the timer to reset_reload_work. It fixes the following kernel Trace: RIP: 0010:del_timer_sync+0x32/0x40 ... Call Trace: <IRQ> mlx5_sync_reset_clear_reset_requested+0x26/0x50 [mlx5_core] poll_sync_reset.cold+0x36/0x52 [mlx5_core] call_timer_fn+0x32/0x130 __run_timers.part.0+0x180/0x280 ? tick_sched_handle+0x33/0x60 ? tick_sched_timer+0x3d/0x80 ? ktime_get+0x3e/0xa0 run_timer_softirq+0x2a/0x50 __do_softirq+0xe1/0x2d6 ? hrtimer_interrupt+0x136/0x220 irq_exit+0xae/0xb0 smp_apic_timer_interrupt+0x7b/0x140 apic_timer_interrupt+0xf/0x20 </IRQ> Fixes: `3c5193a87b` ("net/mlx5: Use del_timer_sync in fw reset flow of halting poll") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-04 00:00:06 -07:00
Moshe Tal	b781bff882	net/mlx5e: Fix trust state reset in reload Setting dscp2prio during the driver reload can cause dcb ieee app list to be not empty after the reload finish and as a result to a conflict between the priority trust state reported by the app and the state in the device register. Reset the dcb ieee app list on initialization in case this is conflicting with the register status. Fixes: `2a5e7a1344` ("net/mlx5e: Add dcbnl dscp to priority support") Signed-off-by: Moshe Tal <moshet@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-04 00:00:06 -07:00
Ariel Levkovich	0e322efd64	net/mlx5e: Avoid checking offload capability in post_parse action During TC action parsing, the can_offload callback is called before calling the action's main parsing callback. Later on, the can_offload callback is called again before handling the action's post_parse callback if exists. Since the main parsing callback might have changed and set parsing params for the rule, following can_offload checks might fail because some parsing params were already set. Specifically, the ct action main parsing sets the ct param in the parsing status structure and when the second can_offload for ct action is called, before handling the ct post parsing, it will return an error since it checks this ct param to indicate multiple ct actions which are not supported. Therefore, the can_offload call is removed from the post parsing handling to prevent such cases. This is allowed since the first can_offload call will ensure that the action can be offloaded and the fact the code reached the post parsing handling already means that the action can be offloaded. Fixes: `8300f22526` ("net/mlx5e: Create new flow attr for multi table actions") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-04 00:00:05 -07:00
Paul Blakey	b069e14fff	net/mlx5e: CT: Fix queued up restore put() executing after relevant ft release __mlx5_tc_ct_entry_put() queues release of tuple related to some ct FT, if that is the last reference to that tuple, the actual deletion of the tuple can happen after the FT is already destroyed and freed. Flush the used workqueue before destroying the ct FT. Fixes: `a217313152` ("net/mlx5e: CT: manage the lifetime of the ct entry object") Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-04 00:00:05 -07:00
Ariel Levkovich	e3fdc71bcb	net/mlx5e: TC, fix decap fallback to uplink when int port not supported When resolving the decap route device for a tunnel decap rule, the result may be an OVS internal port device. Prior to adding the support for internal port offload, such case would result in using the uplink as the default decap route device which allowed devices that can't support internal port offload to offload this decap rule. This behavior got broken by adding the internal port offload which will fail in case the device can't support internal port offload. To restore the old behavior, use the uplink device as the decap route as before when internal port offload is not supported. Fixes: `b16eb3c81f` ("net/mlx5: Support internal port as decap route device") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-04 00:00:05 -07:00
Ariel Levkovich	087032ee70	net/mlx5e: TC, Fix ct_clear overwriting ct action metadata ct_clear action is translated to clearing reg_c metadata which holds ct state and zone information using mod header actions. These actions are allocated during the actions parsing, as part of the flow attributes main mod header action list. If ct action exists in the rule, the flow's main mod header is used only in the post action table rule, after the ct tables which set the ct info in the reg_c as part of the ct actions. Therefore, if the original rule has a ct_clear action followed by a ct action, the ct action reg_c setting will be done first and will be followed by the ct_clear resetting reg_c and overwriting the ct info. Fix this by moving the ct_clear mod header actions allocation from the ct action parsing stage to the ct action post parsing stage where it is already known if ct_clear is followed by a ct action. In such case, we skip the mod header actions allocation for the ct clear since the ct action will write to reg_c anyway after clearing it. Fixes: `806401c20a` ("net/mlx5e: CT, Fix multiple allocations and memleak of mod acts") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-04 00:00:04 -07:00
Vlad Buslov	4a2a664ed8	net/mlx5e: Lag, Don't skip fib events on current dst Referenced change added check to skip updating fib when new fib instance has same or lower priority. However, new fib instance can be an update on same dst address as existing one even though the structure is another instance that has different address. Ignoring events on such instances causes multipath LAG state to not be correctly updated. Track 'dst' and 'dst_len' fields of fib event fib_entry_notifier_info structure and don't skip events that have the same value of that fields. Fixes: `ad11c4f1d8` ("net/mlx5e: Lag, Only handle events from highest priority multipath entry") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-04 00:00:04 -07:00
Vlad Buslov	a6589155ec	net/mlx5e: Lag, Fix fib_info pointer assignment Referenced change incorrectly sets single path fib_info even when LAG is not active. Fix it by moving call to mlx5_lag_fib_set() into conditional that verifies LAG state. Fixes: `ad11c4f1d8` ("net/mlx5e: Lag, Only handle events from highest priority multipath entry") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-04 00:00:04 -07:00
Vlad Buslov	27b0420fd9	net/mlx5e: Lag, Fix use-after-free in fib event handler Recent commit that modified fib route event handler to handle events according to their priority introduced use-after-free[0] in mp->mfi pointer usage. The pointer now is not just cached in order to be compared to following fib_info instances, but is also dereferenced to obtain fib_priority. However, since mlx5 lag code doesn't hold the reference to fin_info during whole mp->mfi lifetime, it could be used after fib_info instance has already been freed be kernel infrastructure code. Don't ever dereference mp->mfi pointer. Refactor it to be 'const void*' type and cache fib_info priority in dedicated integer. Group fib_info-related data into dedicated 'fib' structure that will be further extended by following patches in the series. [0]: [ 203.588029] ================================================================== [ 203.590161] BUG: KASAN: use-after-free in mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core] [ 203.592386] Read of size 4 at addr ffff888144df2050 by task kworker/u20:4/138 [ 203.594766] CPU: 3 PID: 138 Comm: kworker/u20:4 Tainted: G B 5.17.0-rc7+ #6 [ 203.596751] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 203.598813] Workqueue: mlx5_lag_mp mlx5_lag_fib_update [mlx5_core] [ 203.600053] Call Trace: [ 203.600608] <TASK> [ 203.601110] dump_stack_lvl+0x48/0x5e [ 203.601860] print_address_description.constprop.0+0x1f/0x160 [ 203.602950] ? mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core] [ 203.604073] ? mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core] [ 203.605177] kasan_report.cold+0x83/0xdf [ 203.605969] ? mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core] [ 203.607102] mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core] [ 203.608199] ? mlx5_lag_init_fib_work+0x1c0/0x1c0 [mlx5_core] [ 203.609382] ? read_word_at_a_time+0xe/0x20 [ 203.610463] ? strscpy+0xa0/0x2a0 [ 203.611463] process_one_work+0x722/0x1270 [ 203.612344] worker_thread+0x540/0x11e0 [ 203.613136] ? rescuer_thread+0xd50/0xd50 [ 203.613949] kthread+0x26e/0x300 [ 203.614627] ? kthread_complete_and_exit+0x20/0x20 [ 203.615542] ret_from_fork+0x1f/0x30 [ 203.616273] </TASK> [ 203.617174] Allocated by task 3746: [ 203.617874] kasan_save_stack+0x1e/0x40 [ 203.618644] __kasan_kmalloc+0x81/0xa0 [ 203.619394] fib_create_info+0xb41/0x3c50 [ 203.620213] fib_table_insert+0x190/0x1ff0 [ 203.621020] fib_magic.isra.0+0x246/0x2e0 [ 203.621803] fib_add_ifaddr+0x19f/0x670 [ 203.622563] fib_inetaddr_event+0x13f/0x270 [ 203.623377] blocking_notifier_call_chain+0xd4/0x130 [ 203.624355] __inet_insert_ifa+0x641/0xb20 [ 203.625185] inet_rtm_newaddr+0xc3d/0x16a0 [ 203.626009] rtnetlink_rcv_msg+0x309/0x880 [ 203.626826] netlink_rcv_skb+0x11d/0x340 [ 203.627626] netlink_unicast+0x4cc/0x790 [ 203.628430] netlink_sendmsg+0x762/0xc00 [ 203.629230] sock_sendmsg+0xb2/0xe0 [ 203.629955] ____sys_sendmsg+0x58a/0x770 [ 203.630756] ___sys_sendmsg+0xd8/0x160 [ 203.631523] __sys_sendmsg+0xb7/0x140 [ 203.632294] do_syscall_64+0x35/0x80 [ 203.633045] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 203.634427] Freed by task 0: [ 203.635063] kasan_save_stack+0x1e/0x40 [ 203.635844] kasan_set_track+0x21/0x30 [ 203.636618] kasan_set_free_info+0x20/0x30 [ 203.637450] __kasan_slab_free+0xfc/0x140 [ 203.638271] kfree+0x94/0x3b0 [ 203.638903] rcu_core+0x5e4/0x1990 [ 203.639640] __do_softirq+0x1ba/0x5d3 [ 203.640828] Last potentially related work creation: [ 203.641785] kasan_save_stack+0x1e/0x40 [ 203.642571] __kasan_record_aux_stack+0x9f/0xb0 [ 203.643478] call_rcu+0x88/0x9c0 [ 203.644178] fib_release_info+0x539/0x750 [ 203.644997] fib_table_delete+0x659/0xb80 [ 203.645809] fib_magic.isra.0+0x1a3/0x2e0 [ 203.646617] fib_del_ifaddr+0x93f/0x1300 [ 203.647415] fib_inetaddr_event+0x9f/0x270 [ 203.648251] blocking_notifier_call_chain+0xd4/0x130 [ 203.649225] __inet_del_ifa+0x474/0xc10 [ 203.650016] devinet_ioctl+0x781/0x17f0 [ 203.650788] inet_ioctl+0x1ad/0x290 [ 203.651533] sock_do_ioctl+0xce/0x1c0 [ 203.652315] sock_ioctl+0x27b/0x4f0 [ 203.653058] __x64_sys_ioctl+0x124/0x190 [ 203.653850] do_syscall_64+0x35/0x80 [ 203.654608] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 203.666952] The buggy address belongs to the object at ffff888144df2000 which belongs to the cache kmalloc-256 of size 256 [ 203.669250] The buggy address is located 80 bytes inside of 256-byte region [ffff888144df2000, ffff888144df2100) [ 203.671332] The buggy address belongs to the page: [ 203.672273] page:00000000bf6c9314 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x144df0 [ 203.674009] head:00000000bf6c9314 order:2 compound_mapcount:0 compound_pincount:0 [ 203.675422] flags: 0x2ffff800010200(slab\|head\|node=0\|zone=2\|lastcpupid=0x1ffff) [ 203.676819] raw: 002ffff800010200 0000000000000000 dead000000000122 ffff888100042b40 [ 203.678384] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000 [ 203.679928] page dumped because: kasan: bad access detected [ 203.681455] Memory state around the buggy address: [ 203.682421] ffff888144df1f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 203.683863] ffff888144df1f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 203.685310] >ffff888144df2000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 203.686701] ^ [ 203.687820] ffff888144df2080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 203.689226] ffff888144df2100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 203.690620] ================================================================== Fixes: `ad11c4f1d8` ("net/mlx5e: Lag, Only handle events from highest priority multipath entry") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-04 00:00:03 -07:00
Mark Zhang	c4d963a588	net/mlx5e: Fix the calling of update_buffer_lossy() API The arguments of update_buffer_lossy() is in a wrong order. Fix it. Fixes: `88b3d5c90e` ("net/mlx5e: Fix port buffers cell size value") Signed-off-by: Mark Zhang <markzhang@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-04 00:00:03 -07:00
Vlad Buslov	ada09af92e	net/mlx5e: Don't match double-vlan packets if cvlan is not set Currently, match VLAN rule also matches packets that have multiple VLAN headers. This behavior is similar to buggy flower classifier behavior that has recently been fixed. Fix the issue by matching on outer_second_cvlan_tag with value 0 which will cause the HW to verify the packet doesn't contain second vlan header. Fixes: `699e96ddf4` ("net/mlx5e: Support offloading tc double vlan headers match") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-04 00:00:02 -07:00
Aya Levin	7ba2d9d8de	net/mlx5: Fix slab-out-of-bounds while reading resource dump menu Resource dump menu may span over more than a single page, support it. Otherwise, menu read may result in a memory access violation: reading outside of the allocated page. Note that page format of the first menu page contains menu headers while the proceeding menu pages contain only records. The KASAN logs are as follows: BUG: KASAN: slab-out-of-bounds in strcmp+0x9b/0xb0 Read of size 1 at addr ffff88812b2e1fd0 by task systemd-udevd/496 CPU: 5 PID: 496 Comm: systemd-udevd Tainted: G B 5.16.0_for_upstream_debug_2022_01_10_23_12 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x57/0x7d print_address_description.constprop.0+0x1f/0x140 ? strcmp+0x9b/0xb0 ? strcmp+0x9b/0xb0 kasan_report.cold+0x83/0xdf ? strcmp+0x9b/0xb0 strcmp+0x9b/0xb0 mlx5_rsc_dump_init+0x4ab/0x780 [mlx5_core] ? mlx5_rsc_dump_destroy+0x80/0x80 [mlx5_core] ? lockdep_hardirqs_on_prepare+0x286/0x400 ? raw_spin_unlock_irqrestore+0x47/0x50 ? aomic_notifier_chain_register+0x32/0x40 mlx5_load+0x104/0x2e0 [mlx5_core] mlx5_init_one+0x41b/0x610 [mlx5_core] .... The buggy address belongs to the object at ffff88812b2e0000 which belongs to the cache kmalloc-4k of size 4096 The buggy address is located 4048 bytes to the right of 4096-byte region [ffff88812b2e0000, ffff88812b2e1000) The buggy address belongs to the page: page:000000009d69807a refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88812b2e6000 pfn:0x12b2e0 head:000000009d69807a order:3 compound_mapcount:0 compound_pincount:0 flags: 0x8000000000010200(slab\|head\|zone=2) raw: 8000000000010200 0000000000000000 dead000000000001 ffff888100043040 raw: ffff88812b2e6000 0000000080040000 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88812b2e1e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88812b2e1f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88812b2e1f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff88812b2e2000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88812b2e2080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Fixes: `12206b1723` ("net/mlx5: Add support for resource dump") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-04 00:00:02 -07:00
Ariel Levkovich	cb0d54cbf9	net/mlx5e: Fix wrong source vport matching on tunnel rule When OVS internal port is the vtep device, the first decap rule is matching on the internal port's vport metadata value and then changes the metadata to be the uplink's value. Therefore, following rules on the tunnel, in chain > 0, should avoid matching on internal port metadata and use the uplink vport metadata instead. Select the uplink's metadata value for the source vport match in case the rule is in chain greater than zero, even if the tunnel route device is internal port. Fixes: `166f431ec6` ("net/mlx5e: Add indirect tc offload of ovs internal port") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-04 00:00:01 -07:00
Leon Romanovsky	656d338907	net/mlx5: Allow future addition of IPsec object modifiers Currently, all released FW versions support only two IPsec object modifiers, and modify_field_select get and set same value with proper bits. However, it is not future compatible, as new FW can have more modifiers and "default" will cause to overwrite not-changed fields. Fix it by setting explicitly fields that need to be overwritten. Fixes: `7ed92f97a1` ("net/mlx5e: IPsec: Add Connect-X IPsec ESN update offload support") Signed-off-by: Huy Nguyen <huyn@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-03 22:59:18 -07:00
Leon Romanovsky	bd24d1ffb4	net/mlx5: Don't perform lookup after already known sec_path There is no need to perform extra lookup in order to get already known sec_path that was set a couple of lines above. Simply reuse it. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-03 22:59:18 -07:00
Leon Romanovsky	6cd2126ac6	net/mlx5: Cleanup XFRM attributes struct Remove everything that is not used or from mlx5_accel_esp_xfrm_attrs, together with change type of spi to store proper type from the beginning. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-03 22:59:18 -07:00
Leon Romanovsky	1c4a59b9fa	net/mlx5: Remove not-supported ICV length mlx5 doesn't allow to configure any AEAD ICV length other than 128, so remove the logic that configures other unsupported values. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-03 22:59:17 -07:00
Leon Romanovsky	effbe26751	net/mlx5: Simplify IPsec capabilities logic Reduce number of hard-coded IPsec capabilities by making sure that mlx5_ipsec_device_caps() sets only supported bits. As part of this change, remove _ACCEL_ notations from the capabilities names as they represent IPsec-capable device, so it is aligned with MLX5_CAP_IPSEC() macro. And prepare the code to IPsec full offload mode. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-03 22:59:17 -07:00
Leon Romanovsky	a8444b0bdd	net/mlx5: Don't advertise IPsec netdev support for non-IPsec device Device that lacks proper IPsec capabilities won't pass mlx5e_ipsec_init() later, so no need to advertise HW netdev offload support for something that isn't going to work anyway. Fixes: `8ad893e516` ("net/mlx5e: Remove dependency in IPsec initialization flows") Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-03 22:59:17 -07:00
Leon Romanovsky	b7242ffc56	net/mlx5: Make sure that no dangling IPsec FS pointers exist The IPsec FS code was implemented with anti-pattern there failures in create functions left the system with dangling pointers that were cleaned in global routines. The less error prone approach is to make sure that failed function cleans everything internally. As part of this change, we remove the batch of one liners and rewrite get/put functions to remove ambiguity. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-03 22:59:17 -07:00
Leon Romanovsky	82f7bdba37	net/mlx5: Clean IPsec FS add/delete rules Reuse existing struct to pass parameters instead of open code them. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-03 22:59:16 -07:00
Leon Romanovsky	b73e67287b	net/mlx5: Simplify HW context interfaces by using SA entry SA context logic used multiple structures to store same data over and over. By simplifying the SA context interfaces, we can remove extra structs. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-03 22:59:16 -07:00
Leon Romanovsky	a534e24d72	net/mlx5: Remove indirections from esp functions This change cleanups the mlx5 esp interface. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-03 22:59:16 -07:00
Leon Romanovsky	c6e3b421c7	net/mlx5: Merge various control path IPsec headers into one file The mlx5 IPsec code has logical separation between code that operates with XFRM objects (ipsec.c), HW objects (ipsec_offload.c), flow steering logic (ipsec_fs.c) and data path (ipsec_rxtx.c). Such separation makes sense for C-files, but isn't needed at all for H-files as they are included in batch anyway. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-03 22:59:15 -07:00
Leon Romanovsky	2ea36e2e4a	net/mlx5: Remove useless validity check All callers build xfrm attributes with help of mlx5e_ipsec_build_accel_xfrm_attrs() function that ensure validity of attributes. There is no need to recheck them again. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-03 22:59:15 -07:00
Leon Romanovsky	c674df973a	net/mlx5: Store IPsec ESN update work in XFRM state mlx5 IPsec code updated ESN through workqueue with allocation calls in the data path, which can be saved easily if the work is created during XFRM state initialization routine. The locking used later in the work didn't protect from anything because change of HW context is possible during XFRM state add or delete only, which can cancel work and make sure that it is not running. Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-05-03 22:59:15 -07:00

1 2 3 4 5 ...

1091318 Commits