linux

iv/linux

Author	SHA1	Message	Date
Joerg Roedel	09aaa2d064	MAINTAINERS: Update IOMMU tree location Update the maintainers entries to the new location of the IOMMU tree. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2024-06-28 14:31:50 +02:00
David S. Miller	dc6be0b73f	Merge tag 'ieee802154-for-net-2024-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan into main	2024-06-28 13:10:12 +01:00
David S. Miller	109e2f5b98	Merge branch 'mlx5-fixes' into main Tariq Toukan says: ==================== mlx5 fixes 2024-06-27 This patchset provides fixes from the team to the mlx5 core and EN drivers. The first 3 patches by Daniel replace a buggy cap field with a newly introduced one. Patch 4 by Chris de-couples ingress ACL creation from a specific flow, so it's invoked by other flows if needed. Patch 5 by Jianbo fixes a possible missing cleanup of QoS objects. Patches 6 and 7 by Leon fixes IPsec stats logic to better reflect the traffic. Series generated against: commit 02ea312055da ("octeontx2-pf: Fix coverity and klockwork issues in octeon PF driver") V2: Fixed wrong cited SHA in patch 6. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 12:58:12 +01:00
Leon Romanovsky	e562f2d46d	net/mlx5e: Approximate IPsec per-SA payload data bytes count ConnectX devices lack ability to count payload data byte size which is needed for SA to return to libreswan for rekeying. As a solution let's approximate that by decreasing headers size from total size counted by flow steering. The calculation doesn't take into account any other headers which can be in the packet (e.g. IP extensions). Fixes: 5a6cddb89b51 ("net/mlx5e: Update IPsec per SA packets/bytes count") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 12:58:12 +01:00
Leon Romanovsky	2d9dac5559	net/mlx5e: Present succeeded IPsec SA bytes and packet IPsec SA statistics presents successfully decrypted and encrypted packet and bytes, and not total handled by this SA. So update the calculation logic to take into account failures. Fixes: 6fb7f9408779 ("net/mlx5e: Connect mlx5 IPsec statistics with XFRM core") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 12:58:11 +01:00
Jianbo Liu	1da839eab6	net/mlx5e: Add mqprio_rl cleanup and free in mlx5e_priv_cleanup() In the cited commit, mqprio_rl cleanup and free are mistakenly removed in mlx5e_priv_cleanup(), and it causes the leakage of host memory and firmware SCHEDULING_ELEMENT objects while changing eswitch mode. So, add them back. Fixes: 0bb7228f7096 ("net/mlx5e: Fix mqprio_rl handling on devlink reload") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 12:58:11 +01:00
Chris Mi	b20c2fb454	net/mlx5: E-switch, Create ingress ACL when needed Currently, ingress acl is used for three features. It is created only when vport metadata match and prio tag are enabled. But active-backup lag mode also uses it. It is independent of vport metadata match and prio tag. And vport metadata match can be disabled using the following devlink command: # devlink dev param set pci/0000:08:00.0 name esw_port_metadata \ value false cmode runtime If ingress acl is not created, will hit panic when creating drop rule for active-backup lag mode. If always create it, there will be about 5% performance degradation. Fix it by creating ingress acl when needed. If esw_port_metadata is true, ingress acl exists, then create drop rule using existing ingress acl. If esw_port_metadata is false, create ingress acl and then create drop rule. Fixes: 1749c4c51c16 ("net/mlx5: E-switch, add drop rule support to ingress ACL") Signed-off-by: Chris Mi <cmi@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 12:58:11 +01:00
Daniel Jurgens	5dbf647367	net/mlx5: Use max_num_eqs_24b when setting max_io_eqs Due a bug in the device max_num_eqs doesn't always reflect a written value. As a result, setting max_io_eqs may not work but appear successful. Instead write max_num_eqs_24b, which reflects correct value. Fixes: 93197c7c509d ("mlx5/core: Support max_io_eqs for a function") Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 12:58:11 +01:00
Daniel Jurgens	29c6a562bf	net/mlx5: Use max_num_eqs_24b capability if set A new capability with more bits is added. If it's set use that value as the maximum number of EQs available. This cap is also writable by the vhca_resource_manager to allow limiting the number of EQs available to SFs and VFs. Fixes: 93197c7c509d ("mlx5/core: Support max_io_eqs for a function") Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 12:58:11 +01:00
Daniel Jurgens	048a403648	net/mlx5: IFC updates for changing max EQs Expose new capability to support changing the number of EQs available to other functions. Fixes: 93197c7c509d ("mlx5/core: Support max_io_eqs for a function") Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 12:58:10 +01:00
David S. Miller	748e3bbf47	Merge branch 'net-selftests-mirroring-cleanup' into main Petr Machata says: ==================== selftest: Clean-up and stabilize mirroring tests The mirroring selftests work by sending ICMP traffic between two hosts. Along the way, this traffic is mirrored to a gretap netdevice, and counter taps are then installed strategically along the path of the mirrored traffic to verify the mirroring took place. The problem with this is that besides mirroring the primary traffic, any other service traffic is mirrored as well. At the same time, because the tests need to work in HW-offloaded scenarios, the ability of the device to do arbitrary packet inspection should not be taken for granted. Most tests therefore simply use matchall, one uses flower to match on IP address. As a result, the selftests are noisy. mirror_test() accommodated this noisiness by giving the counters an allowance of several packets. But that only works up to a point, and on busy systems won't be always enough. In this patch set, clean up and stabilize the mirroring selftests. The original intention was to port the tests over to UDP, but the logic of ICMP ends up being so entangled in the mirroring selftests that the changes feel overly invasive. Instead, ICMP is kept, but where possible, we match on ICMP message type, thus filtering out hits by other ICMP messages. Where this is not practical (where the counter tap is put on a device that carries encapsulated packets), switch the counter condition to _at least_ X observed packets. This is less robust, but barely so -- probably the only scenario that this would not catch is something like erroneous packet duplication, which would hopefully get caught by the numerous other tests in this extensive suite. - Patches #1 to #3 clean up parameters at various helpers. - Patches #4 to #6 stabilize the mirroring selftests as described above. - Mirroring tests currently allow testing SW datapath even on HW netdevices by trapping traffic to the SW datapath. This complicates the tests a bit without a good reason: to test SW datapath, just run the selftests on the veth topology. Thus in patch #7, drop support for this dual SW/HW testing. - At this point, some cleanups were either made possible by the previous patches, or were always possible. In patches #8 to #11, realize these cleanups. - In patch #12, fix mlxsw mirror_gre selftest to respect setting TESTS. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:55:38 +01:00
Petr Machata	098ba97d0e	selftests: mlxsw: mirror_gre: Obey TESTS This test is unusual in that overriding TESTS does not change the tests to be run. Split the individual tests into several functions and invoke them through tests_run() as appropriate. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:55:38 +01:00
Petr Machata	06704a0d5e	selftests: libs: Drop unused functions Nothing calls these. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:55:38 +01:00
Petr Machata	4e9cd3d03a	selftests: libs: Drop slow_path_trap_install()/_uninstall() These functions are not used anymore. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:55:38 +01:00
Petr Machata	95d33989ce	selftests: mirror_gre_lag_lacp: Drop unnecessary code The selftest does not use functions from mirror_gre_lib, ditch the import. It does not use arping either, so drop the require_command as well. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:55:37 +01:00
Petr Machata	388b2d985a	selftests: mlxsw: mirror_gre: Simplify After the previous patch, the function test_span_failable() is always called with should_fail=1. Drop the argument and streamline the code. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:55:37 +01:00
Petr Machata	d361d78fe2	selftests: mirror: Drop dual SW/HW testing The mirroring tests are currently run in a skip_hw and optionally a skip_sw mode. The former tests the SW datapath, the latter the HW datapath, if available. In order to be able to test SW datapath on HW loopbacks, traps are installed on ingress to get traffic from the HW datapath to the SW one. This adds an unnecessary complexity when it would be much simpler to just use a veth-based topology to test the SW datapath. Thus drop all the code that supports this dual testing. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:55:37 +01:00
Petr Machata	a86e0df9ce	selftests: mirror: mirror_test(): Allow exact count of packets The mirroring selftests work by sending ICMP traffic between two hosts. Along the way, this traffic is mirrored to a gretap netdevice, and counter taps are then installed strategically along the path of the mirrored traffic to verify the mirroring took place. The problem with this is that besides mirroring the primary traffic, any other service traffic is mirrored as well. At the same time, because the tests need to work in HW-offloaded scenarios, the ability of the device to do arbitrary packet inspection should not be taken for granted. Most tests therefore simply use matchall, one uses flower to match on IP address. As a result, the selftests are noisy, because besides the primary ICMP traffic, any amount of other service traffic is mirrored as well. mirror_test() accommodated this noisiness by giving the counters an allowance of several packets. But in the previous patch, where possible, counter taps were changed to match only on an exact ICMP message. At least in those cases, we can demand an exact number of packets to match. Where the tap is installed on a connective netdevice, the exact matching is not practical (though with u32, anything is possible). In those places, there should still be some leeway -- and probably bigger than before, because experience shows that these tests are very noisy. To that end, change mirror_test() so that it can be either called with an exact number to expect, or with an expression. Where leeway is needed, adjust callers to pass a ">= 10" instead of mere 10. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:55:37 +01:00
Petr Machata	833415358f	selftests: mirror: do_test_span_dir_ips(): Install accurate taps The mirroring selftests work by sending ICMP traffic between two hosts. Along the way, this traffic is mirrored to a gretap netdevice, and counter taps are then installed strategically along the path of the mirrored traffic to verify the mirroring took place. The problem with this is that besides mirroring the primary traffic, any other service traffic is mirrored as well. At the same time, because the tests need to work in HW-offloaded scenarios, the ability of the device to do arbitrary packet inspection should not be taken for granted. Most tests therefore simply use matchall, one uses flower to match on IP address. As a result, the selftests are noisy, because besides the primary ICMP traffic, any amount of other service traffic is mirrored as well. However, often the counter tap is installed at the remote end of the gretap tunnel. Since this is a SW-datapath scenario anyway, we can make the filter arbitrarily accurate. Thus in this patch, add parameters forward_type and backward_type to several mirroring test helpers, as some other helpers already have. Then change do_test_span_dir_ips() to instead of installing one generic tap and using it for test in both directions, install the tap for each direction separately, matching on the ICMP type given by these parameters. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:55:37 +01:00
Petr Machata	95e7b860e1	selftests: mirror_gre_lag_lacp: Check counters at tunnel The test works by sending packets through a tunnel, whence they are forwarded to a LAG. One of the LAG children is removed from the LAG prior to the exercise, and the test then counts how many packets pass through the other one. The issue with this is that it counts all packets, not just the encapsulated ones. So instead add a second gretap endpoint to receive the sent packets, and check reception counters there. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:55:36 +01:00
Petr Machata	9b5d5f2726	selftests: lib: tc_rule_stats_get(): Move default to argument definition The argument $dir has a fallback value of "ingress". Move the fallback from the usage site to the argument definition block to make the fact clearer. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:55:36 +01:00
Petr Machata	28e67746b7	selftests: mirror: Drop direction argument from several functions The argument is not used by these functions except to propagate it for ultimately no purpose. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:55:36 +01:00
Petr Machata	d5fbb2eb33	selftests: libs: Expand "$@" where possible In some functions, argument-forwarding through "$@" without listing the individual arguments explicitly is fundamental to the operation of a function. E.g. xfail_on_veth() should be able to run various tests in the fail-to-xfail regime, and usage of "$@" is appropriate as an abstraction mechanism. For functions such as simple_if_init(), $@ is a handy way to pass an array. In other functions, it's merely a mechanism to save some typing, which however ends up obscuring the real arguments and makes life hard for those that end up reading the code. This patch adds some of the implicit function arguments and correspondingly expands $@'s. In several cases this will come in handy as following patches adjust the parameter lists. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:55:36 +01:00
David S. Miller	c977ac49fe	Merge branch 'net-flash-modees-firmware' into main Danielle Ratson says: ==================== Add ability to flash modules' firmware CMIS compliant modules such as QSFP-DD might be running a firmware that can be updated in a vendor-neutral way by exchanging messages between the host and the module as described in section 7.2.2 of revision 4.0 of the CMIS standard. According to the CMIS standard, the firmware update process is done using a CDB commands sequence. CDB (Command Data Block Message Communication) reads and writes are performed on memory map pages 9Fh-AFh according to the CMIS standard, section 8.12 of revision 4.0. Add a pair of new ethtool messages that allow: * User space to trigger firmware update of transceiver modules * The kernel to notify user space about the progress of the process The user interface is designed to be asynchronous in order to avoid RTNL being held for too long and to allow several modules to be updated simultaneously. The interface is designed with CMIS compliant modules in mind, but kept generic enough to accommodate future use cases, if these arise. The kernel interface that will implement the firmware update using CDB command will include 2 layers that will be added under ethtool: * The upper layer that will be triggered from the module layer, is cmis_ fw_update. * The lower one is cmis_cdb. In the future there might be more operations to implement using CDB commands. Therefore, the idea is to keep the cmis_cdb interface clean and the cmis_fw_update specific to the cdb commands handling it. The communication between the kernel and the driver will be done using two ethtool operations that enable reading and writing the transceiver module EEPROM. The operation ethtool_ops::get_module_eeprom_by_page, that is already implemented, will be used for reading from the EEPROM the CDB reply, e.g. reading module setting, state, etc. The operation ethtool_ops::set_module_eeprom_by_page, that is added in the current patchset, will be used for writing to the EEPROM the CDB command such as start firmware image, run firmware image, etc. Therefore in order for a driver to implement module flashing, that driver needs to implement the two functions mentioned above. Patchset overview: Patch #1-#2: Implement the EEPROM writing in mlxsw. Patch #3: Define the interface between the kernel and user space. Patch #4: Add ability to notify the flashing firmware progress. Patch #5: Veto operations during flashing. Patch #6: Add extended compliance codes. Patch #7: Add the cdb layer. Patch #8: Add the fw_update layer. Patch #9: Add ability to flash transceiver modules' firmware. v8: Patch #7: * In the ethtool_cmis_wait_for_cond() evaluate the condition once more to decide if the error code should be -ETIMEDOUT or something else. * s/netdev_err/netdev_err_once. v7: Patch #4: * Return -ENOMEM instead of PTR_ERR(attr) on ethnl_module_fw_flash_ntf_put_err(). Patch #9: * Fix Warning for not unlocking the spin_lock in the error flow on module_flash_fw_work_list_add(). * Avoid the fall-through on ethnl_sock_priv_destroy(). v6: * Squash some of the last patch to patch #5 and patch #9. Patch #3: * Add paragraph in .rst file. Patch #4: * Reserve '1' more place on SKB for NUL terminator in the error message string. * Add more prints on error flow, re-write the printing function and add ethnl_module_fw_flash_ntf_put_err(). * Change the communication method so notification will be sent in unicast instead of multicast. * Add new 'struct ethnl_module_fw_flash_ntf_params' that holds the relevant info for unicast communication and use it to send notification to the specific socket. * s/nla_put_u64_64bit/nla_put_uint/ Patch #7: * In ethtool_cmis_cdb_init(), Use 'const' for the 'params' parameter. Patch #8: * Add a list field to struct ethtool_module_fw_flash for module_fw_flash_work_list that will be presented in the next patch. * Move ethtool_cmis_fw_update() cleaning to a new function that will be represented in the next patch. * Move some of the fields in struct ethtool_module_fw_flash to a separate struct, so ethtool_cmis_fw_update() will get only the relevant parameters for it. * Edit the relevant functions to get the relevant params for them. * s/CMIS_MODULE_READY_MAX_DURATION_USEC/CMIS_MODULE_READY_MAX_DURATION_MSEC Patch #9: * Add a paragraph in the commit message. * Rename labels in module_flash_fw_schedule(). * Add info to genl_sk_priv_() and implement the relevant callbacks, in order to handle properly a scenario of closing the socket from user space before the work item was ended. Add a list the holds all the ethtool_module_fw_flash struct that corresponds to the in progress work items. * Add a new enum for the socket types. * Use both above to identify a flashing socket, add it to the list and when closing socket affect only the flashing type. * Create a new function that will get the work item instead of ethtool_cmis_fw_update(). * Edit the relevant functions to get the relevant params for them. * The new function will call the old ethtool_cmis_fw_update(), and do the cleaning, so the existence of the list should be completely isolated in module.c. =================== Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:49:35 +01:00
Danielle Ratson	32b4c8b53e	ethtool: Add ability to flash transceiver modules' firmware Add the ability to flash the modules' firmware by implementing the interface between the user space and the kernel. Example from a succeeding implementation: # ethtool --flash-module-firmware swp40 file test.bin Transceiver module firmware flashing started for device swp40 Transceiver module firmware flashing in progress for device swp40 Progress: 99% Transceiver module firmware flashing completed for device swp40 In addition, add infrastructure that allows modules to set socket-specific private data. This ensures that when a socket is closed from user space during the flashing process, the right socket halts sending notifications to user space until the work item is completed. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:48:23 +01:00
Danielle Ratson	c4f78134d4	ethtool: cmis_fw_update: add a layer for supporting firmware update using CDB According to the CMIS standard, the firmware update process is done using a CDB commands sequence. Implement a work that will be triggered from the module layer in the next patch the will initiate and execute all the CDB commands in order, to eventually complete the firmware update process. This flashing process includes, writing the firmware image, running the new firmware image and committing it after testing, so that it will run upon reset. This work will also notify user space about the progress of the firmware update process. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:48:23 +01:00
Danielle Ratson	a39c84d796	ethtool: cmis_cdb: Add a layer for supporting CDB commands CDB (Command Data Block Message Communication) reads and writes are performed on memory map pages 9Fh-AFh according to the CMIS standard, section 8.20 of revision 5.2. Page 9Fh is used to specify the CDB command to be executed and also provides an area for a local payload (LPL). According to the CMIS standard, the firmware update process is done using a CDB commands sequence that will be implemented in the next patch. The kernel interface that will implement the firmware update using CDB command will include 2 layers that will be added under ethtool: * The upper layer that will be triggered from the module layer, is cmis_fw_update. * The lower one is cmis_cdb. In the future there might be more operations to implement using CDB commands. Therefore, the idea is to keep the CDB interface clean and the cmis_fw_update specific to the CDB commands handling it. These two layers will communicate using the API the consists of three functions: - struct ethtool_cmis_cdb * ethtool_cmis_cdb_init(struct net_device dev, struct ethtool_module_fw_flash_params params); - void ethtool_cmis_cdb_fini(struct ethtool_cmis_cdb cdb); - int ethtool_cmis_cdb_execute_cmd(struct net_device dev, struct ethtool_cmis_cdb_cmd_args args); Add the CDB layer to support initializing, finishing and executing CDB commands: The initialization process will include creating of an ethtool_cmis_cdb instance, querying the module CDB support, entering and validating the password from user space (CMD 0x0000) and querying the module features (CMD 0x0040). * The finishing API will simply free the ethtool_cmis_cdb instance. * The executing process will write the CDB command to EEPROM using set_module_eeprom_by_page() that was presented earlier, and will process the reply from EEPROM. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:48:23 +01:00
Danielle Ratson	e4f9193699	net: sfp: Add more extended compliance codes SFF-8024 is used to define various constants re-used in several SFF SFP-related specifications. Add SFF-8024 extended compliance code definitions for CMIS compliant modules and use them in the next patch to determine the firmware flashing work. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:48:23 +01:00
Danielle Ratson	31e0aa99dc	ethtool: Veto some operations during firmware flashing process Some operations cannot be performed during the firmware flashing process. For example: - Port must be down during the whole flashing process to avoid packet loss while committing reset for example. - Writing to EEPROM interrupts the flashing process, so operations like ethtool dump, module reset, get and set power mode should be vetoed. - Split port firmware flashing should be vetoed. In order to veto those scenarios, add a flag in 'struct net_device' that indicates when a firmware flash is taking place on the module and use it to prevent interruptions during the process. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:48:22 +01:00
Danielle Ratson	d7d4cfc4c9	ethtool: Add flashing transceiver modules' firmware notifications ability Add progress notifications ability to user space while flashing modules' firmware by implementing the interface between the user space and the kernel. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:48:22 +01:00
Danielle Ratson	46fb3ba95b	ethtool: Add an interface for flashing transceiver modules' firmware CMIS compliant modules such as QSFP-DD might be running a firmware that can be updated in a vendor-neutral way by exchanging messages between the host and the module as described in section 7.3.1 of revision 5.2 of the CMIS standard. Add a pair of new ethtool messages that allow: * User space to trigger firmware update of transceiver modules * The kernel to notify user space about the progress of the process The user interface is designed to be asynchronous in order to avoid RTNL being held for too long and to allow several modules to be updated simultaneously. The interface is designed with CMIS compliant modules in mind, but kept generic enough to accommodate future use cases, if these arise. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:48:22 +01:00
Ido Schimmel	1983a80070	mlxsw: Implement ethtool operation to write to a transceiver module EEPROM Implement the ethtool_ops::set_module_eeprom_by_page operation to allow ethtool to write to a transceiver module EEPROM, in a similar fashion to the ethtool_ops::get_module_eeprom_by_page operation. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:48:22 +01:00
Ido Schimmel	69540b7987	ethtool: Add ethtool operation to write to a transceiver module EEPROM Ethtool can already retrieve information from a transceiver module EEPROM by invoking the ethtool_ops::get_module_eeprom_by_page operation. Add a corresponding operation that allows ethtool to write to a transceiver module EEPROM. The new write operation is purely an in-kernel API and is not exposed to user space. The purpose of this operation is not to enable arbitrary read / write access, but to allow the kernel to write to specific addresses as part of transceiver module firmware flashing. In the future, more functionality can be implemented on top of these read / write operations. Adjust the comments of the 'ethtool_module_eeprom' structure as it is no longer used only for read access. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:48:21 +01:00
Neal Cardwell	a6458ab7fd	UPSTREAM: tcp: fix DSACK undo in fast recovery to call tcp_try_to_open() In some production workloads we noticed that connections could sometimes close extremely prematurely with ETIMEDOUT after transmitting only 1 TLP and RTO retransmission (when we would normally expect roughly tcp_retries2 = TCP_RETR2 = 15 RTOs before a connection closes with ETIMEDOUT). From tracing we determined that these workloads can suffer from a scenario where in fast recovery, after some retransmits, a DSACK undo can happen at a point where the scoreboard is totally clear (we have retrans_out == sacked_out == lost_out == 0). In such cases, calling tcp_try_keep_open() means that we do not execute any code path that clears tp->retrans_stamp to 0. That means that tp->retrans_stamp can remain erroneously set to the start time of the undone fast recovery, even after the fast recovery is undone. If minutes or hours elapse, and then a TLP/RTO/RTO sequence occurs, then the start_ts value in retransmits_timed_out() (which is from tp->retrans_stamp) will be erroneously ancient (left over from the fast recovery undone via DSACKs). Thus this ancient tp->retrans_stamp value can cause the connection to die very prematurely with ETIMEDOUT via tcp_write_err(). The fix: we change DSACK undo in fast recovery (TCP_CA_Recovery) to call tcp_try_to_open() instead of tcp_try_keep_open(). This ensures that if no retransmits are in flight at the time of DSACK undo in fast recovery then we properly zero retrans_stamp. Note that calling tcp_try_to_open() is more consistent with other loss recovery behavior, since normal fast recovery (CA_Recovery) and RTO recovery (CA_Loss) both normally end when tp->snd_una meets or exceeds tp->high_seq and then in tcp_fastretrans_alert() the "default" switch case executes tcp_try_to_open(). Also note that by inspection this change to call tcp_try_to_open() implies at least one other nice bug fix, where now an ECE-marked DSACK that causes an undo will properly invoke tcp_enter_cwr() rather than ignoring the ECE mark. Fixes: c7d9d6a185a7 ("tcp: undo on DSACK during recovery") Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 10:28:14 +01:00
Johannes Berg	816c6bec09	wifi: mac80211: fix BSS_CHANGED_UNSOL_BCAST_PROBE_RESP Fix the definition of BSS_CHANGED_UNSOL_BCAST_PROBE_RESP so that not all higher bits get set, 1<<31 is a signed variable, so when we do u64 changed = BSS_CHANGED_UNSOL_BCAST_PROBE_RESP; we get sign expansion, so the value is 0xffff'ffff'8000'0000 and that's clearly not desired. Use BIT_ULL() to make it unsigned as well as the right type for the change flags. Fixes: 178e9d6adc43 ("wifi: mac80211: fix unsolicited broadcast probe config") Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20240627104257.06174d291db2.Iba0d642916eb78a61f8ab2cc5ca9280783d9c1db@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-06-28 10:26:32 +02:00
Marek Vasut	8fda53719a	dt-bindings: net: realtek,rtl82xx: Document known PHY IDs as compatible strings Extract known PHY IDs from Linux kernel realtek PHY driver and convert them into supported compatible string list for this DT binding document. Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-28 08:56:22 +01:00
Christophe JAILLET	62d73261a0	can: m_can: Constify struct m_can_ops 'struct m_can_ops' is not modified in these drivers. Constifying this structure moves some data to a read-only section, so increase overall security. On a x86_64, with allmodconfig, as an example: Before: ====== text data bss dec hex filename 4806 520 0 5326 14ce drivers/net/can/m_can/m_can_pci.o After: ===== text data bss dec hex filename 4862 464 0 5326 14ce drivers/net/can/m_can/m_can_pci.o Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/all/a17b96d1be5341c11f263e1e45c9de1cb754e416.1719172843.git.christophe.jaillet@wanadoo.fr Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2024-06-28 09:35:13 +02:00
Marc Kleine-Budde	580d1712a4	Merge patch series "can: rcar_canfd: Small improvements and cleanups" Geert Uytterhoeven <geert+renesas@glider.be> says: This series contains some improvements and cleanups for the R-Car CAN-FD driver. It has been tested on R-Car V4H (White Hawk and White Hawk Single). Link: https://lore.kernel.org/all/cover.1716973640.git.geert+renesas@glider.be [mkl: fixed typo in cover letter] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2024-06-28 09:35:07 +02:00
Geert Uytterhoeven	f9a83965d4	can: rcar_canfd: Remove superfluous parentheses in address calculations There is no need to wrap simple variables or multiplications inside parentheses. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/all/b5aee80895fa029070fd37d1d837cf1c0ecb52dc.1716973640.git.geert+renesas@glider.be Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2024-06-28 09:34:42 +02:00
Geert Uytterhoeven	0c1d0a69c5	can: rcar_canfd: Improve printing of global operational state Replace the printing of internal numerical values by the printing of strings reflecting their meaning, to make the message self-explanatory. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Wolfram Sang <wsa@kernel.org> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/all/14c8c5ce026e9fec128404706d1c73c8ffa11ced.1716973640.git.geert+renesas@glider.be Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2024-06-28 09:34:42 +02:00
Geert Uytterhoeven	dd20d16dae	can: rcar_canfd: Simplify clock handling The main CAN clock is either the internal CANFD clock, or the external CAN clock. Hence replace the two-valued enum by a simple boolean flag. Consolidate all CANFD clock handling inside a single branch. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/all/2cf38c10b83c8e5c04d68b17a930b6d9dbf66f40.1716973640.git.geert+renesas@glider.be Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2024-06-28 09:34:42 +02:00
Patryk Wlazlyn	b15943c4b3	tools/power turbostat: Add local build_bug.h header for snapshot target Fixes compilation errors for Makefile snapshot target described in: commit 231ce08b662a ("tools/power turbostat: Add "snapshot:" Makefile target") Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2024-06-27 23:53:27 -04:00
Adam Hawley	c5120a3356	tools/power turbostat: Fix unc freq columns not showing with '-q' or '-l' Commit 78464d7681f7 ("tools/power turbostat: Add columns for clustered uncore frequency") introduced 'probe_intel_uncore_frequency_cluster()' in a way which prevents printing uncore frequency columns if either of the '-q' or '-l' options are used. Systems which do not have multiple uncore frequencies per package are unaffected by this regression. Fix the function so that uncore frequency columns are shown when either the '-l' or '-q' option is used by checking if 'quiet' is true after adding counters for the uncore frequency columns. Fixes: 78464d7681f7 ("tools/power turbostat: Add columns for clustered uncore frequency") Signed-off-by: Adam Hawley <adam.james.hawley@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2024-06-27 23:53:27 -04:00
David Arcari	ebb5b260af	tools/power turbostat: option '-n' is ambiguous In some cases specifying the '-n' command line argument will cause turbostat to fail. For instance 'turbostat -n 1' works fine; however, 'turbostat -n 1 -d' will fail. This is the result of the first call to getopt_long_only() where "MP" is specified as the optstring. This can be easily fixed by changing the optstring from "MP" to "MPn:" to remove ambiguity between the arguments. tools/power turbostat: option '-n' is ambiguous; possibilities: '-num_iterations' '-no-msr' '-no-perf' Fixes: a0e86c90b83c ("tools/power turbostat: Add --no-perf option") Signed-off-by: David Arcari <darcari@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>	2024-06-27 23:53:27 -04:00
Linus Torvalds	5bbd9b2498	This push fixes a build failure in qat. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmZ1aqQACgkQxycdCkmx i6eg4BAAtAbtNbBqBUlAHeVxii4Y3Mhf1JZhc7Gu0q83dNMifW61qSG9azEi4Bom 0QQ4B+B8iMEKACAOLkCEvFI7ciKXCV4qPTII0+oLrjYssBzW6kqt9oRhpFcNoLC4 ZK61gZl+Yz5xOkxRs9vgq+Hp3xfpG/HuvN3Fom/MAUCFHpDlTBuh0HmIjGE19naC kyP6xUyb0IfBQ3LYi0ouA/u8q9dAnZviiTJO5JYSoKLFi5SLFLMY+0vT1izaUFVO Q81U2rrsSzY7EMt7CIV2l8drBwUK/km3EgsBrGv8OKZj8L13QPCdrSdCnDuvuaz6 CXkeO3jQIUXoQ6X3e1iKyBvhUC5HDwqxodKd6UMNjtgw2kHAgsd3axH5e+Uq15RT rNrHar83zGn+Kx2aMqn2wbxj3MRxzt1Vf23NZqkqYVg5cneWZmle1geLpJZ9vvdw zEJwnozDPjp99Rym5gzRACI/NNAaXYQlQTVCsxdiIifMGyXuOPRjxGtroRHUY9WD mVy2VMddTlVcSkRxl1lSpnaduieI58JpekqSWbMzJvCu+mKY4ZJlhKqhAx3xpRik WvF767zktxjlzxuMm0gIVZ4/q8Yg4irX7Xa176GwIKKzyh51EQAuWOSED7XJgjHh PNClQRr8cqtil4+dqU3YnXTHKK4T46q8sukCz1Ath7QoNmISPco= =fcdu -----END PGP SIGNATURE----- Merge tag 'v6.10-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pyll crypto fix from Herbert Xu: "Fix a build failure in qat" * tag 'v6.10-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: qat - fix linking errors when PCI_IOV is disabled	2024-06-27 17:43:15 -07:00
Linus Torvalds	1c52cf5e79	drm fixes for 6.10-rc6 core: - fix refcounting race on pid handover fbdev: - Fix fb_info when vmalloc is used, regression from CONFIG_DRM_FBDEV_LEAK_PHYS_SMEM. amdgpu: - SMU 14.x fix - vram info parsing fix - mode1 reset fix - LTTPR fix - Virtual display fix - Avoid spurious error in PSP init i915: - Fix potential UAF due to race on fence register revocation nouveau - nouveau tv mode fixes. panel: - Add KOE TX26D202VM0BWA timings. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmZ9+dUACgkQDHTzWXnE hr4wKQ/+Irl7gSP6pjQaxX8+vQEztbOAMcugQPilJwOE2VOShkA9BBJsdRGOYtcP 5yCEXIC/iYQj3kIg0b4SbJDbn9BZBjJeGMChQpmb5YEKi5nxoOri30UvH8PWb3xP 9MpO9xACQz5i9iA6FPmSDkRx76Dy5o8GF5GASsuZrXF/Hd5YFRFxw9NtYwomcZ2b ghn0v7MCIzeC2Z+snqjkh9TTSMzPGMQJdn8uZ5lHiqS3SGIh9cnrmysxqL+f4OVa xJX0pElE0pEA2O08trghasog4g/VnE5TOwYsNEqQVfX4fW69vas0ZL3S63IX/UUG 05kSd75iy+rZsDQHWblaTsVtOuaBhyqDNt/KOgMDg0fWICN8Y7y37dF4fPn5zJjt RE+axCz1mlVZ7rwPv0I43cEt0dbKe82Qfgcc9aoiJ8imT+ElI/bRkjmjs662Ihfc hlIHdYTbLlUMb1MyM6c9gt8uprd643LcI9hOvJqoYQDUsbR0nZewu9/ElYSb2thl R2YH7Ytc8y6zAXC1TjO6TBifQbtELzrVJkqFtECreiDOI3DlH1M1x10iCxJT9y2L p7MD1VahIdnViLtXm9i6AwEGQAaPGbcJZhsC5QhQ/ev022uMAo8OVbu/3BknDjaO MFKV26POWmmy7RFRU2xNZkEX3YZjUpDiN+DGqLA1JzKBoTUaJSU= =0isZ -----END PGP SIGNATURE----- Merge tag 'drm-fixes-2024-06-28' of https://gitlab.freedesktop.org/drm/kernel Pull drm fixes from Dave Airlie: "Regular fixes, mostly amdgpu with some minor fixes in other places, along with a fix for a very narrow UAF race in the pid handover code. core: - fix refcounting race on pid handover fbdev: - Fix fb_info when vmalloc is used, regression from CONFIG_DRM_FBDEV_LEAK_PHYS_SMEM. amdgpu: - SMU 14.x fix - vram info parsing fix - mode1 reset fix - LTTPR fix - Virtual display fix - Avoid spurious error in PSP init i915: - Fix potential UAF due to race on fence register revocation nouveau - nouveau tv mode fixes panel: - Add KOE TX26D202VM0BWA timings" * tag 'drm-fixes-2024-06-28' of https://gitlab.freedesktop.org/drm/kernel: drm/drm_file: Fix pid refcounting race drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_ld_modes drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_hd_modes drm/amdgpu: Don't show false warning for reg list drm/amdgpu: avoid using null object of framebuffer drm/amd/display: Send DP_TOTAL_LTTPR_CNT during detection if LTTPR is present drm/amdgpu: Fix pci state save during mode-1 reset drm/amdgpu/atomfirmware: fix parsing of vram_info drm/amd/swsmu: add MALL init support workaround for smu_v14_0_1 drm/i915/gt: Fix potential UAF by revoke of fence registers drm/panel: simple: Add missing display timing flags for KOE TX26D202VM0BWA drm/fbdev-dma: Only set smem_start is enable per module option	2024-06-27 17:24:34 -07:00
Breno Leitao	94833addfa	net: thunderx: Unembed netdev structure Embedding net_device into structures prohibits the usage of flexible arrays in the net_device structure. For more details, see the discussion at [1]. Un-embed the net_devices from struct lmac by converting them into pointers, and allocating them dynamically. Use the leverage alloc_netdev() to allocate the net_device object at bgx_lmac_enable(). The free of the device occurs at bgx_lmac_disable(). Do not free_netdevice() if bgx_lmac_enable() fails after lmac->netdev is allocated, since bgx_lmac_disable() is called if bgx_lmac_enable() fails, and lmac->netdev will be freed there (similarly to lmac->dmacs). Link: https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/ [1] Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20240626173503.87636-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-27 16:55:34 -07:00
Sagi Grimberg	2d5f6801db	Revert "net: micro-optimize skb_datagram_iter" This reverts commit 934c29999b57b835d65442da6f741d5e27f3b584. This triggered a usercopy BUG() in systems with HIGHMEM, reported by the test robot in: https://lore.kernel.org/oe-lkp/202406161539.b5ff7b20-oliver.sang@intel.com Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Link: https://patch.msgid.link/20240626070153.759257-1-sagi@grimberg.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-27 16:44:32 -07:00
Marek Vasut	d3dcb084c7	net: phy: phy_device: Fix PHY LED blinking code comment Fix copy-paste error in the code comment. The code refers to LED blinking configuration, not brightness configuration. It was likely copied from comment above this one which does refer to brightness configuration. Fixes: 4e901018432e ("net: phy: phy_device: Call into the PHY driver to set LED blinking") Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20240626030638.512069-1-marex@denx.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-27 16:33:26 -07:00
Jann Horn	4f2a129b33	drm/drm_file: Fix pid refcounting race <maarten.lankhorst@linux.intel.com>, Maxime Ripard <mripard@kernel.org>, Thomas Zimmermann <tzimmermann@suse.de> filp->pid is supposed to be a refcounted pointer; however, before this patch, drm_file_update_pid() only increments the refcount of a struct pid after storing a pointer to it in filp->pid and dropping the dev->filelist_mutex, making the following race possible: process A process B ========= ========= begin drm_file_update_pid mutex_lock(&dev->filelist_mutex) rcu_replace_pointer(filp->pid, <pid B>, 1) mutex_unlock(&dev->filelist_mutex) begin drm_file_update_pid mutex_lock(&dev->filelist_mutex) rcu_replace_pointer(filp->pid, <pid A>, 1) mutex_unlock(&dev->filelist_mutex) get_pid(<pid A>) synchronize_rcu() put_pid(<pid B>) * pid B reaches refcount 0 and is freed here * get_pid(<pid B>) * UAF * synchronize_rcu() put_pid(<pid A>) As far as I know, this race can only occur with CONFIG_PREEMPT_RCU=y because it requires RCU to detect a quiescent state in code that is not explicitly calling into the scheduler. This race leads to use-after-free of a "struct pid". It is probably somewhat hard to hit because process A has to pass through a synchronize_rcu() operation while process B is between mutex_unlock() and get_pid(). Fix it by ensuring that by the time a pointer to the current task's pid is stored in the file, an extra reference to the pid has been taken. This fix also removes the condition for synchronize_rcu(); I think that optimization is unnecessary complexity, since in that case we would usually have bailed out on the lockless check above. Fixes: 1c7a387ffef8 ("drm: Update file owner during use") Cc: <stable@vger.kernel.org> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2024-06-28 08:56:26 +10:00

... 4 5 6 7 8 ...

1282257 Commits