linux

iv/linux

Author	SHA1	Message	Date
Jakub Kicinski	48eed027d3	netpoll: allocate netdev tracker right away Commit 5fa5ae605821 ("netpoll: add net device refcount tracker to struct netpoll") was part of one of the initial netdev tracker introduction patches. It added an explicit netdev_tracker_alloc() for netpoll, presumably because the flow of the function is somewhat odd. After most of the core networking stack was converted to use the tracking hold() variants, netpoll's call to old dev_hold() stands out a bit. np is allocated by the caller and ready to use, we can use netdev_hold() here, even tho np->ndev will only be set to ndev inside __netpoll_setup(). Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-15 08:21:11 +01:00
Jakub Kicinski	70f7457ad6	net: create device lookup API with reference tracking New users of dev_get_by_index() and dev_get_by_name() keep getting added and it would be nice to steer them towards the APIs with reference tracking. Add variants of those calls which allocate the reference tracker and use them in a couple of places. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-15 08:21:11 +01:00
Xin Long	89da780aa4	rtnetlink: move validate_linkmsg out of do_setlink This patch moves validate_linkmsg() out of do_setlink() to its callers and deletes the early validate_linkmsg() call in __rtnl_newlink(), so that it will not call validate_linkmsg() twice in either of the paths: - __rtnl_newlink() -> do_setlink() - __rtnl_newlink() -> rtnl_newlink_create() -> rtnl_create_link() Additionally, as validate_linkmsg() is now only called with a real dev, we can remove the NULL check for dev in validate_linkmsg(). Note that we moved validate_linkmsg() check to the places where it has not done any changes to the dev, as Jakub suggested. Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/cf2ef061e08251faf9e8be25ff0d61150c030475.1686585334.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-14 22:34:13 -07:00
Edwin Peer	fa0e21fa44	rtnetlink: extend RTEXT_FILTER_SKIP_STATS to IFLA_VF_INFO This filter already exists for excluding IPv6 SNMP stats. Extend its definition to also exclude IFLA_VF_INFO stats in RTM_GETLINK. This patch constitutes a partial fix for a netlink attribute nesting overflow bug in IFLA_VFINFO_LIST. By excluding the stats when the requester doesn't need them, the truncation of the VF list is avoided. While it was technically only the stats added in commit c5a9f6f0ab40 ("net/core: Add drop counters to VF statistics") breaking the camel's back, the appreciable size of the stats data should never have been included without due consideration for the maximum number of VFs supported by PCI. Fixes: 3b766cd83232 ("net/core: Add reading VF statistics through the PF netdevice") Fixes: c5a9f6f0ab40 ("net/core: Add drop counters to VF statistics") Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Cc: Edwin Peer <espeer@gmail.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://lore.kernel.org/r/20230611105108.122586-1-gal@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-06-14 13:28:26 +02:00
Paolo Abeni	c180f85825	Merge branch 'mlxsw-preparations-for-out-of-order-operations-patches' Petr Machata says: ==================== mlxsw: Preparations for out-of-order-operations patches The mlxsw driver currently makes the assumption that the user applies configuration in a bottom-up manner. Thus netdevices need to be added to the bridge before IP addresses are configured on that bridge or SVI added on top of it. Enslaving a netdevice to another netdevice that already has uppers is in fact forbidden by mlxsw for this reason. Despite this safety, it is rather easy to get into situations where the offloaded configuration is just plain wrong. As an example, take a front panel port, configure an IP address: it gets a RIF. Now enslave the port to a bridge, and the RIF is gone. Remove the port from the bridge again, but the RIF never comes back. There is a number of similar situations, where changing the configuration there and back utterly breaks the offload. Over the course of the following several patchsets, mlxsw code is going to be adjusted to diminish the space of wrongly offloaded configurations. Ideally the offload state will reflect the actual state, regardless of the sequence of operation used to construct that state. No functional changes are intended in this patchset yet. Rather the patches prepare the codebase for easier introduction of functional changes in later patchsets. - In patch #1, extract a helper to join a RIF of a given port, if there is one. In patch #2, use it in a newly-added helper to join a LAG interface. - In patches #3, #4 and #5, add helpers that abstract away the rif->dev access. This will make it simpler in the future to change the way the deduction is done. In patch #6, do this for deduction from nexthop group info to RIF. - In patch #7, add a helper to destroy a RIF. So far RIF was destroyed simply by kfree'ing it. - In patch #8, add a helper to check if any IP addresses are configured on a netdevice. This helper will be useful later. - In patch #9, add a helper to migrate a RIF. This will be a convenient place to put extensions later on. - Patch #10 move IPIP initialization up to make ipip_ops_arr available earlier. ==================== Link: https://lore.kernel.org/r/cover.1686581444.git.petrm@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-06-14 13:13:36 +02:00
Petr Machata	d4a37bf094	mlxsw: spectrum_router: Move IPIP init up mlxsw will need to keep track of certain devices that are not related to any of its front panel ports. This includes IPIP netdevices. To be able to query the list of supported IPIP types, router->ipip_ops_arr needs to be initialized. To that end, move the IPIP initialization up (and finalization correspondingly down). Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-06-14 13:12:21 +02:00
Petr Machata	440273e763	mlxsw: spectrum_router: Extract a helper for RIF migration RIF configuration contains a number of parameters that cannot be changed after the RIF is created. For the IPIP loopbacks, this is currently worked around by creating a new RIF with the desired configuration changes applied, and updating next hops to the new RIF, and then destroying the old RIF. This operation will be useful as a reusable atom, so extract a helper to that effect. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-06-14 13:12:20 +02:00
Petr Machata	33d11c4e5c	mlxsw: spectrum_router: Add a helper to check if netdev has addresses This function will be useful later as the driver will need to retroactively create RIFs for new uppers with addresses. Add another helper that assumes RCU lock, and restructure the code to skip the IPv6 branch not through conditioning on the addr_list_empty variable, but by directly returning the result value. This makes the skip more obvious than it previously was. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-06-14 13:12:20 +02:00
Petr Machata	571c56911b	mlxsw: spectrum_router: Extract a helper to free a RIF Right now freeing the object that mlxsw uses to keep track of a RIF is as simple as calling a kfree. But later on as CRIF abstraction is brought in, it will involve severing the link between CRIF and its RIF as well. Better to have the logic encapsulated in a helper. Since a helper is being introduced, make it a full-fledged destructor and have it validate that the objects tracked at the RIF have been released. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-06-14 13:12:20 +02:00
Petr Machata	532b6e2bbc	mlxsw: spectrum_router: Access nhgi->rif through a helper To abstract away deduction of RIF from the corresponding next hop group info (NHGI), mlxsw currently uses a macro. In its current form, that macro is impossible to extend to more general computation. Therefore introduce a helper, mlxsw_sp_nhgi_rif(), and use it throughout. This will make it possible to change the deduction path easily later on. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-06-14 13:12:20 +02:00
Petr Machata	69f4ba177d	mlxsw: spectrum_router: Access nh->rif->dev through a helper In order to abstract away deduction of netdevice from the corresponding next hop, introduce a helper, mlxsw_sp_nexthop_dev(), and use it throughout. This will make it possible to change the deduction path easily later on. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-06-14 13:12:20 +02:00
Petr Machata	2019b5eeae	mlxsw: spectrum_router: Access rif->dev from params in mlxsw_sp_rif_create() The previous patch added a helper to access a netdevice given a RIF. Using this helper in mlxsw_sp_rif_create() is unreasonable: the netdevice was given in RIF creation parameters. Just take it there. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-06-14 13:12:20 +02:00
Petr Machata	fb6ac45e86	mlxsw: spectrum_router: Access rif->dev through a helper In order to abstract away deduction of netdevice from the corresponding RIF, introduce a helper, mlxsw_sp_rif_dev(), and use it throughout. This will make it possible to change the deduction path easily later on. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-06-14 13:12:20 +02:00
Petr Machata	76962b802e	mlxsw: spectrum_router: Add a helper specifically for joining a LAG Currently, joining a LAG very simply means that the LAG RIF should be joined by the subport representing untagged traffic. If the RIF does not exist, it does not have to be created: if the user wants there to be RIF for the LAG device, they are supposed to add an IP address, and they are supposed to do it after tha LAG becomes mlxsw upper. We can also assume that the LAG has no uppers, otherwise the enslavement is not allowed. In the future, these ordering dependencies should be removed. That means that joining LAG will be more complex operation, possibly involving a lazy RIF creation, and possibly joining / lazily creating RIFs for VLAN uppers of the LAG. It will be handy to have a dedicated function that handles all this. The new function mlxsw_sp_router_port_join_lag() is that. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-06-14 13:12:20 +02:00
Petr Machata	e0db883b69	mlxsw: spectrum_router: Extract a helper from mlxsw_sp_port_vlan_router_join() Split out of mlxsw_sp_port_vlan_router_join() the part that checks for RIF and dispatches to __mlxsw_sp_port_vlan_router_join(), leaving it as wrapper that just manages the router lock. The new function, mlxsw_sp_port_vlan_router_join_existing(), will be useful as an atom in later patches. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-06-14 13:12:20 +02:00
Justin Chen	2bddad9ec6	ethtool: ioctl: account for sopass diff in set_wol sopass won't be set if wolopt doesn't change. This means the following will fail to set the correct sopass. ethtool -s eth0 wol s sopass 11:22:33:44:55:66 ethtool -s eth0 wol s sopass 22:44:55:66:77:88 Make sure we call into the driver layer set_wol if sopass is different. Fixes: 55b24334c0f2 ("ethtool: ioctl: improve error checking for set_wol") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Link: https://lore.kernel.org/r/1686605822-34544-1-git-send-email-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-13 22:05:52 -07:00
Uwe Kleine-König	e5d4a21b3a	mctp i2c: Switch back to use struct i2c_driver's .probe() After commit b8a1a4cd5a98 ("i2c: Provide a temporary .probe_new() call-back type"), all drivers being converted to .probe_new() and then commit 03c835f498b5 ("i2c: Switch .probe() to not take an id parameter") convert back to (the new) .probe() to be able to eventually drop .probe_new() from struct i2c_driver. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20230612071641.836976-1-u.kleine-koenig@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-13 20:46:20 -07:00
Jakub Kicinski	a9c4769788	Merge branch 'tools-ynl-gen-improvements-for-dpll' Jakub Kicinski says: ==================== tools: ynl-gen: improvements for DPLL A set of improvements needed to generate the kernel code for DPLL. ==================== Link: https://lore.kernel.org/r/20230612155920.1787579-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-13 11:46:47 -07:00
Jakub Kicinski	be093a80df	tools: ynl-gen: inherit policy in multi-attr Instead of reimplementing policies in MutliAttr for every underlying type forward the calls to the base type. This will be needed for DPLL which uses a multi-attr nest, and currently gets an invalid NLA_NEST policy generated. Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-13 11:46:46 -07:00
Jakub Kicinski	10c4d2a7b8	tools: ynl-gen: correct enum policies Scalar range validation assumes enums start at 0. Teach it to properly calculate the value range. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-13 11:46:46 -07:00
Raju Rangoju	6b5f9a87e1	amd-xgbe: extend 10Mbps support to MAC version 21H MAC version 21H supports the 10Mbps speed. So, extend support to platforms that support it. Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-13 10:44:11 +01:00
David S. Miller	15f522411c	Merge branch 'octeontx2-updates' Naveen Mamindlapalli says: ==================== RVU NIX AF driver updates This patch series includes a few enhancements and other updates to the RVU NIX AF driver. The first patch adds devlink option to configure NPC MCAM high priority zone entries reservation. This is useful when the requester needs more high priority entries than default reserved entries. The second patch adds support for RSS hash computation using L3 SRC or DST only, or L4 SRC or DST only. The third patch updates DWRR MTU configuration for CN10KB silicon. HW uses the DWRR MTU to compute DWRR weight. Patch 4 configures the LBK link in TL3_TL2 configuration only when switch mode is enabled. Patch 5 adds an option in the mailbox request to enable/disable DROP_RE bit which drops packets with L2 errors when set. Patch 6 updates SMQ flush mechanism to stop other child nodes from enqueuing any packets while SMQ flush is active. Otherwise SMQ flush may timeout. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-13 09:54:58 +01:00
Naveen Mamindlapalli	e18aab0470	octeontx2-af: Set XOFF on other child transmit schedulers during SMQ flush When multiple transmit scheduler queues feed a TL1 transmit link, the SMQ flush initiated on a low priority queue might get stuck when a high priority queue fully subscribes the transmit link. This inturn effects interface teardown. To avoid this, temporarily XOFF all TL1's other immediate child transmit scheduler queues and also clear any rate limit configuration on all the scheduler queues in SMQ(flush) hierarchy. Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-13 09:54:58 +01:00
Nithin Dabilpuram	4ed6387a61	octeontx2-af: add option to toggle DROP_RE enable in rx cfg Add option to toggle DROP_RE bit in rx cfg mbox. This helps in modifying the config runtime as opposed to setting available via nix_lf_alloc() mbox at NIX LF init time. Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Signed-off-by: Jerin Jacob Kollanukkaran <jerinj@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-13 09:54:57 +01:00
Subbaraya Sundeep	b6a072a153	octeontx2-af: Enable LBK links only when switch mode is on. Currently, all the TL3_TL2 nodes are being configured to enable switch LBK channel 63 in them. Instead enable them only when switch mode is enabled. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-13 09:54:57 +01:00
Sunil Goutham	bbba125ead	octeontx2-af: cn10k: Set NIX DWRR MTU for CN10KB silicon The DWRR MTU config added for SDP and RPM/LBK links on CN10K silicon is further extended on CK10KB silicon variant and made it configurable. Now there are 4 DWRR MTU config to choose while setting transmit scheduler's RR_WEIGHT. Here we are reserving one config for each of RPM, SDP and LBK. NIXX_AF_DWRR_MTUX(0) ---> RPM NIXX_AF_DWRR_MTUX(1) ---> SDP NIXX_AF_DWRR_MTUX(2) ---> LBK PF/VF drivers can choose the DWRR_MTU to be used by setting SMQX_CFG[pkt_link_type] to one of above. TLx_SCHEDULE[RR_WEIGHT] is to be as configured 'quantum / 2^DWRR_MTUX[MTU]'. DWRR_MTU of each link is exposed to PF/VF drivers via mailbox for RR_WEIGHT calculation. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-13 09:54:57 +01:00
Kiran Kumar K	79bc788c03	octeontx2-af: extend RSS supported offload types Add support to select L3 SRC or DST only, L4 SRC or DST only for RSS calculation. AF consumer may have requirement as we can select only SRC or DST data for RSS calculation in L3, L4 layers. With this requirement there will be following combinations, IPV[4,6]_SRC_ONLY, IPV[4,6]_DST_ONLY, [TCP,UDP,SCTP]_SRC_ONLY, [TCP,UDP,SCTP]_DST_ONLY. So, instead of creating a bit for each combination, we are using upper 4 bits (31:28) in the flow_key_cfg to represent the SRC, DST selection. 31 => L3_SRC, 30 => L3_DST, 29 => L4_SRC, 28 => L4_DST. These won't be part of flow_cfg, so that we don't need to change the existing ABI. Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-13 09:54:57 +01:00
Naveen Mamindlapalli	09de114c77	octeontx2-af: Add devlink option to adjust mcam high prio zone entries The NPC MCAM entries are currently divided into three priority zones in AF driver: high, mid, and low. The high priority zone and low priority zone take up 1/8th (each) of the available MCAM entries, and remaining going to the mid priority zone. The current allocation scheme may not meet certain requirements, such as when a requester needs more high priority zone entries than are reserved. This patch adds a devlink configurable option to increase the number of high priority zone entries that can be allocated by requester. The max number of entries that can be reserved for high priority usage is 100% of available MCAM entries. Usage: 1) Change high priority zone percentage to 75%: devlink -p dev param set pci/0002:01:00.0 name npc_mcam_high_zone_percent \ value 75 cmode runtime 2) Read high priority zone percentage: devlink -p dev param show pci/0002:01:00.0 name npc_mcam_high_zone_percent The devlink set configuration is only permitted when no MCAM entries are assigned, i.e., all MCAM entries are free, indicating that no PF/VF driver is loaded. So user must unload/unbind PF/VF driver/devices before modifying the high priority zone percentage. Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-13 09:54:57 +01:00
Ido Schimmel	c29e012eae	selftests: forwarding: Fix layer 2 miss test syntax The test currently specifies "l2_miss" as "true" / "false", but the version that eventually landed in iproute2 uses "1" / "0" [1]. Align the test accordingly. [1] https://lore.kernel.org/netdev/20230607153550.3829340-1-idosch@nvidia.com/ Fixes: 8c33266ae26a ("selftests: forwarding: Add layer 2 miss test cases") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-13 09:38:42 +01:00
Jakub Kicinski	7d4e87e973	Merge branch 'splice-net-some-miscellaneous-msg_splice_pages-changes' David Howells says: ==================== splice, net: Some miscellaneous MSG_SPLICE_PAGES changes Now that the splice_to_socket() has been rewritten so that nothing now uses the ->sendpage() file op[1], some further changes can be made, so here are some miscellaneous changes that can now be done. (1) Remove the ->sendpage() file op. (2) Remove hash_sendpage*() from AF_ALG. (3) Make sunrpc send multiple pages in single sendmsg() call rather than calling sendpage() in TCP (or maybe TLS). (4) Make tcp_bpf_sendpage() a wrapper around tcp_bpf_sendmsg(). (5) Make AF_KCM use sendmsg() when calling down to TCP and then make it send entire fragment lists in single sendmsg calls. Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=fd5f4d7da29218485153fd8b4c08da7fc130c79f [1] ==================== Link: https://lore.kernel.org/r/20230609100221.2620633-1-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-12 21:13:25 -07:00
David Howells	c31a25e1db	kcm: Send multiple frags in one sendmsg() Rewrite the AF_KCM transmission loop to send all the fragments in a single skb or frag_list-skb in one sendmsg() with MSG_SPLICE_PAGES set. The list of fragments in each skb is conveniently a bio_vec[] that can just be attached to a BVEC iter. Note: I'm working out the size of each fragment-skb by adding up bv_len for all the bio_vecs in skb->frags[] - but surely this information is recorded somewhere? For the skbs in head->frag_list, this is equal to skb->data_len, but not for the head. head->data_len includes all the tail frags too. Signed-off-by: David Howells <dhowells@redhat.com> cc: Tom Herbert <tom@herbertland.com> cc: Tom Herbert <tom@quantonium.net> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-12 21:13:23 -07:00
David Howells	264ba53fac	kcm: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage When transmitting data, call down into the transport socket using sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather than using sendpage. Signed-off-by: David Howells <dhowells@redhat.com> cc: Tom Herbert <tom@herbertland.com> cc: Tom Herbert <tom@quantonium.net> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-12 21:13:23 -07:00
David Howells	de17c68573	tcp_bpf: Make tcp_bpf_sendpage() go through tcp_bpf_sendmsg(MSG_SPLICE_PAGES) Make tcp_bpf_sendpage() a wrapper around tcp_bpf_sendmsg(MSG_SPLICE_PAGES) rather than a loop calling tcp_sendpage(). sendpage() will be removed in the future. Signed-off-by: David Howells <dhowells@redhat.com> cc: John Fastabend <john.fastabend@gmail.com> cc: Jakub Sitnicki <jakub@cloudflare.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-12 21:13:23 -07:00
David Howells	5df5dd03a8	sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage When transmitting data, call down into TCP using sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather than performing sendpage calls to transmit header, data pages and trailer. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> cc: Trond Myklebust <trond.myklebust@hammerspace.com> cc: Anna Schumaker <anna@kernel.org> cc: Jeff Layton <jlayton@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-12 21:13:23 -07:00
David Howells	345ee3e812	algif: Remove hash_sendpage() Remove hash_sendpage() as nothing should now call it since the rewrite of splice_to_socket()[1]. Signed-off-by: David Howells <dhowells@redhat.com> cc: Herbert Xu <herbert@gondor.apana.org.au> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=2dc334f1a63a8839b88483a3e73c0f27c9c1791c [1] Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-12 21:13:23 -07:00
David Howells	a3bbdc52c3	Remove file->f_op->sendpage Remove file->f_op->sendpage as splicing to a socket now calls sendmsg rather than sendpage. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-12 21:13:23 -07:00
Jakub Kicinski	ccbe64be15	Merge branch 'net-flower-add-cfm-support' Zahari Doychev says: ==================== net: flower: add cfm support The first patch adds cfm support to the flow dissector. The second adds the flower classifier support. The third adds a selftest for the flower cfm functionality. ==================== Link: https://lore.kernel.org/r/20230608105648.266575-1-zahari.doychev@linux.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-12 17:01:48 -07:00
Zahari Doychev	1668a55a73	selftests: net: add tc flower cfm test New cfm flower test case is added to the net forwarding selfttests. Example output: # ./tc_flower_cfm.sh p1 p2 TEST: CFM opcode match test [ OK ] TEST: CFM level match test [ OK ] TEST: CFM opcode and level match test [ OK ] Signed-off-by: Zahari Doychev <zdoychev@maxlinear.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-12 17:01:45 -07:00
Zahari Doychev	7cfffd5fed	net: flower: add support for matching cfm fields Add support to the tc flower classifier to match based on fields in CFM information elements like level and opcode. tc filter add dev ens6 ingress protocol 802.1q \ flower vlan_id 698 vlan_ethtype 0x8902 cfm mdl 5 op 46 \ action drop Signed-off-by: Zahari Doychev <zdoychev@maxlinear.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-12 17:01:45 -07:00
Zahari Doychev	d7ad70b5ef	net: flow_dissector: add support for cfm packets Add support for dissecting cfm packets. The cfm packet header fields maintenance domain level and opcode can be dissected. Signed-off-by: Zahari Doychev <zdoychev@maxlinear.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-06-12 17:01:45 -07:00
Uwe Kleine-König	3a2cb45ca0	net: mlxsw: i2c: Switch back to use struct i2c_driver's .probe() After commit b8a1a4cd5a98 ("i2c: Provide a temporary .probe_new() call-back type"), all drivers being converted to .probe_new() and then commit 03c835f498b5 ("i2c: Switch .probe() to not take an id parameter") convert back to (the new) .probe() to be able to eventually drop .probe_new() from struct i2c_driver. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-12 11:56:41 +01:00
Daniel Golle	98c485eaf5	net: phy: add driver for MediaTek SoC built-in GE PHYs Some of MediaTek's Filogic SoCs come with built-in gigabit Ethernet PHYs which require calibration data from the SoC's efuse. Despite the similar design the driver doesn't share any code with the existing mediatek-ge.c. Add support for such PHYs by introducing a new driver with basic support for MediaTek SoCs MT7981 and MT7988 built-in 1GE PHYs. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-12 11:55:04 +01:00
David S. Miller	a89dc58703	mlx5-updates-2023-06-09 1) Embedded CPU Virtual Functions 2) Lightweight local SFs Daniel Jurgens says: ==================== Embedded CPU Virtual Functions This series enables the creation of virtual functions on Bluefield (the embedded CPU platform). Embedded CPU virtual functions (EC VFs). EC VF creation, deletion and management interfaces are the same as those for virtual functions in a server with a Connect-X NIC. When using EC VFs on the ARM the creation of virtual functions on the host system is still supported. Host VFs eswitch vports occupy a range of 1..max_vfs, the EC VF vport range is max_vfs+1..max_ec_vfs. Every function (PF, ECPF, VF, EC VF, and subfunction) has a function ID associated with it. Prior to this series the function ID and the eswitch vport were the same. That is no longer the case, the EC VF function ID range is 1..max_ec_vfs. When querying or setting the capabilities of an EC VF function an new bit must be set in the query/set HCA cap structure. This is a high level overview of the changes made: - Allocate vports for EC VFs if they are enabled. - Create representors and devlink ports for the EC VF vports. - When querying/setting HCA caps by vport break the assumption that function ID is the same a vport number and adjust accordingly. - Create a new type of page, so that when SRIOV on the ARM is disabled, but remains enabled on the host, the driver can wait for the correct pages. - Update SRIOV code to support EC VF creation/deletion. =================== Lightweight local SFs: Last 3 patches form Shay Drory: SFs are heavy weight and by default they come with the full package of ConnectX features. Usually users want specialized SFs for one specific purpose and using devlink users will almost always override the set of advertises features of an SF and reload it. Shay Drory says: ================ In order to avoid the wasted time and resources on the reload, local SFs will probe without any auxiliary sub-device, so that the SFs can be configured prior to its full probe. The defaults of the enable_* devlink params of these SFs are set to false. Usage example: Create SF: $ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11 $ devlink port function set pci/0000:08:00.0/32768 \ hw_addr 00:00:00:00:00:11 state active Enable ETH auxiliary device: $ devlink dev param set auxiliary/mlx5_core.sf.1 \ name enable_eth value true cmode driverinit Now, in order to fully probe the SF, use devlink reload: $ devlink dev reload auxiliary/mlx5_core.sf.1 At this point the user have SF devlink instance with auxiliary device for the Ethernet functionality only. ================ -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmSD1KUACgkQSD+KveBX +j5JMwf/SGo0DwXm0GTu+RlUmoUpCYG40QjJwNwJ9WUNIB/zMpnGrx59dtdhuVUb 3gD8T9MVT08NXjQIzQReAZu9hm9MvM6jvTnUG++fFyjjlHuI/9hXWBgz1aSPcQOC ZR2ZHuFZsTTRWNe3JU5v9tuHOE0c6xZBAzPdDU3Lib0LvhIbZGK2eiSb11RIGU3V FTUskLUPMo5ObUNS55lgcTFquK90Qy41LUv0DX3MpbOY/YnDBiycikopwDoYtYQ6 9/kOM+VOprW+BgalLyGHJ77BDg7O/k1ZxnL6J7mGn7tAhOkKX5+06xGfZnEifI2F nTuXYI5O8TDBdZYpmO+sBOh+SVyUrg== =H9kD -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2023-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-updates-2023-06-09 1) Embedded CPU Virtual Functions 2) Lightweight local SFs Daniel Jurgens says: ==================== Embedded CPU Virtual Functions This series enables the creation of virtual functions on Bluefield (the embedded CPU platform). Embedded CPU virtual functions (EC VFs). EC VF creation, deletion and management interfaces are the same as those for virtual functions in a server with a Connect-X NIC. When using EC VFs on the ARM the creation of virtual functions on the host system is still supported. Host VFs eswitch vports occupy a range of 1..max_vfs, the EC VF vport range is max_vfs+1..max_ec_vfs. Every function (PF, ECPF, VF, EC VF, and subfunction) has a function ID associated with it. Prior to this series the function ID and the eswitch vport were the same. That is no longer the case, the EC VF function ID range is 1..max_ec_vfs. When querying or setting the capabilities of an EC VF function an new bit must be set in the query/set HCA cap structure. This is a high level overview of the changes made: - Allocate vports for EC VFs if they are enabled. - Create representors and devlink ports for the EC VF vports. - When querying/setting HCA caps by vport break the assumption that function ID is the same a vport number and adjust accordingly. - Create a new type of page, so that when SRIOV on the ARM is disabled, but remains enabled on the host, the driver can wait for the correct pages. - Update SRIOV code to support EC VF creation/deletion. =================== Lightweight local SFs: Last 3 patches form Shay Drory: SFs are heavy weight and by default they come with the full package of ConnectX features. Usually users want specialized SFs for one specific purpose and using devlink users will almost always override the set of advertises features of an SF and reload it. Shay Drory says: ================ In order to avoid the wasted time and resources on the reload, local SFs will probe without any auxiliary sub-device, so that the SFs can be configured prior to its full probe. The defaults of the enable_* devlink params of these SFs are set to false. Usage example: Create SF: $ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11 $ devlink port function set pci/0000:08:00.0/32768 \ hw_addr 00:00:00:00:00:11 state active Enable ETH auxiliary device: $ devlink dev param set auxiliary/mlx5_core.sf.1 \ name enable_eth value true cmode driverinit Now, in order to fully probe the SF, use devlink reload: $ devlink dev reload auxiliary/mlx5_core.sf.1 At this point the user have SF devlink instance with auxiliary device for the Ethernet functionality only. ================	2023-06-12 11:41:57 +01:00
David S. Miller	73f49f8cc1	Merge branch 'tcp-tx-headless' Eric Dumazet says: ==================== tcp: tx path fully headless This series completes transition of TCP stack tx path to headless packets : All payload now reside in page frags, never in skb->head. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-12 11:38:55 +01:00
Eric Dumazet	5882efff88	tcp: remove size parameter from tcp_stream_alloc_skb() Now all tcp_stream_alloc_skb() callers pass @size == 0, we can remove this parameter. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-12 11:38:54 +01:00
Eric Dumazet	b4a2439713	tcp: remove some dead code Now all skbs in write queue do not contain any payload in skb->head, we can remove some dead code. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-12 11:38:54 +01:00
Eric Dumazet	fbf934068f	tcp: let tcp_send_syn_data() build headless packets tcp_send_syn_data() is the last component in TCP transmit path to put payload in skb->head. Switch it to use page frags, so that we can remove dead code later. This allows to put more payload than previous implementation. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-12 11:38:54 +01:00
David S. Miller	f2f069da4c	Merge branch 'ethtool-extack' Jakub Kicinski says: ==================== net: support extack in dump and simplify ethtool uAPI Ethtool currently requires header nest to be always present even if it doesn't have to carry any attr for a given request. This inflicts unnecessary pain on the users. What makes it worse is that extack was not working in dump's ->start() callback. Address both of those issues. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-12 11:32:45 +01:00
Jakub Kicinski	500e1340d1	net: ethtool: don't require empty header nests Ethtool currently requires a header nest (which is used to carry the common family options) in all requests including dumps. $ cli.py --spec netlink/specs/ethtool.yaml --dump channels-get lib.ynl.NlError: Netlink error: Invalid argument nl_len = 64 (48) nl_flags = 0x300 nl_type = 2 error: -22 extack: {'msg': 'request header missing'} $ cli.py --spec netlink/specs/ethtool.yaml --dump channels-get \ --json '{"header":{}}'; ) [{'combined-count': 1, 'combined-max': 1, 'header': {'dev-index': 2, 'dev-name': 'enp1s0'}}] Requiring the header nest to always be there may seem nice from the consistency perspective, but it's not serving any practical purpose. We shouldn't burden the user like this. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-12 11:32:45 +01:00
Jakub Kicinski	5ab8c41cef	netlink: support extack in dump ->start() Commit 4a19edb60d02 ("netlink: Pass extack to dump handlers") added extack support to netlink dumps. It was focused on rtnl and since rtnl does not use ->start(), ->done() callbacks it ignored those. Genetlink on the other hand uses ->start() extensively, for parsing and input validation. Pass the extact in via struct netlink_dump_control and link it to cb for the time of ->start(). Both struct netlink_dump_control and extack itself live on the stack so we can't keep the same extack for the duration of the dump. This means that the extack visible in ->start() and each ->dump() callbacks will be different. Corner cases like reporting a warning message in DONE across dump calls are still not supported. We could put the extack (for dumps) in the socket struct, but layering makes it slightly awkward (extack pointer is decided before the DO / DUMP split). The genetlink dump error extacks are now surfaced: $ cli.py --spec netlink/specs/ethtool.yaml --dump channels-get lib.ynl.NlError: Netlink error: Invalid argument nl_len = 64 (48) nl_flags = 0x300 nl_type = 2 error: -22 extack: {'msg': 'request header missing'} Previously extack was missing: $ cli.py --spec netlink/specs/ethtool.yaml --dump channels-get lib.ynl.NlError: Netlink error: Invalid argument nl_len = 36 (20) nl_flags = 0x100 nl_type = 2 error: -22 Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-06-12 11:32:44 +01:00

1 2 3 4 5 ...

1187671 Commits