Marcin Szycik
0960a27bd4 ice: Add direction metadata
Currently it is possible to create a filter which breaks TX traffic, e.g.:

tc filter add dev $PF1 ingress protocol ip prio 1 flower ip_proto udp
dst_port $PORT action mirred egress redirect dev $VF1_PR

This adds a rule which might match both TX and RX traffic, and in the
TX path the PF will actually receive the traffic, which breaks
communication.

To fix this, add a match on direction metadata flag when adding a tc rule.

Because of the way metadata is currently handled, a duplicate lookup word
would appear if VLAN metadata is also added. The lookup would still work
correctly, but one word would be wasted. To prevent it, lookup 0 now always
contains all metadata. When any metadata needs to be added, it is added to
lookup 0 and lookup count is not incremented. This way, two flags residing
in the same word will take up one word, instead of two.

Note: the drop action is also affected, i.e. it will now only work in one
direction.
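
As a sketch of the word-sharing idea described above (the structure and
helper names here are illustrative assumptions, not the actual ice
switch code):

  /* lookup 0 is reserved for metadata; a second flag reuses its word */
  struct meta_lookup {
          u16 flags;      /* metadata flags packed into one 16-bit word */
          u16 flags_mask;
  };

  static void add_metadata_flag(struct meta_lookup *lkups, u16 flag)
  {
          lkups[0].flags |= flag;       /* e.g. direction + VLAN flags */
          lkups[0].flags_mask |= flag;
          /* note: the lookup count is deliberately not incremented */
  }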

Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-07 13:01:29 -07:00
Wojciech Drewek
505a1fdada ice: Accept LAG netdevs in bridge offloads
Allow LAG interfaces to be used in bridge offload using
netif_is_lag_master(). In this case, search for the ice netdev in
the list of the LAG's lower devices.
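
A minimal sketch of the lower-dev walk (ice_netdev_check() stands in
for the driver's own "is this an ice netdev" test):

  static struct net_device *ice_get_ice_lower(struct net_device *dev)
  {
          struct net_device *lower;
          struct list_head *iter;

          if (!netif_is_lag_master(dev))
                  return ice_netdev_check(dev) ? dev : NULL;

          netdev_for_each_lower_dev(dev, lower, iter)
                  if (ice_netdev_check(lower))
                          return lower;

          return NULL;
  }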

Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-07 13:01:29 -07:00
Jakub Kicinski
96bc313783 linux-can-next-for-6.6-20230807

Merge tag 'linux-can-next-for-6.6-20230807' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2023-08-07

The patch is from me and reverts the addition of the CAN controller
nodes to the Allwinner D1 SoC devicetree.

* tag 'linux-can-next-for-6.6-20230807' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
  Revert "riscv: dts: allwinner: d1: Add CAN controller nodes"
====================

Link: https://lore.kernel.org/r/20230807074222.1576119-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-07 12:25:46 -07:00
Jakub Kicinski
0d0c5f0b9b Merge branch 'net-stmmac-correct-mac-propagation-delay'
Johannes Zink says:

====================
net: stmmac: correct MAC propagation delay

Changes in v3:
- incorporate Richard's review feedback. Thank you for reviewing my patch:
  - as some of the hardware may have no or invalid correction value
    registers: introduce feature switch which can be enabled in the glue
    code drivers depending on the actual hardware support
  - only enable the feature on the i.MX8MP for the time being, as the patch
    improves timing accuracy and is tested for this hardware
- Link to v2: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v2-1-3366f38ee9a6@pengutronix.de

Changes in v2:
- fix builds for 32-bit, as found by the kernel test robot
	Reported-by: kernel test robot <lkp@intel.com>
	Closes: https://lore.kernel.org/oe-kbuild-all/202307200225.B8rmKQPN-lkp@intel.com/
- while at it, also fix an overflow when shifting a u32 constant from a macro
  by 10 bits, by casting the constant to u64
- Link to v1: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v1-1-768aa4d09334@pengutronix.de

Tested-by: Kurt Kanzenbach <kurt@linutronix.de> # imx8mp
====================

Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v3-0-61e63427735e@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-07 12:17:15 -07:00
Johannes Zink
6cb2e613c7 net: stmmac: dwmac-imx: enable MAC propagation delay correction for i.MX8MP
As the i.MX8MP supports reading the MAC propagation delay and
correcting the hardware timestamp counter for additional delays [1],
enable the feature for this SoC.

This reduces the phase error of the PPS output from the PTP Hardware
Clock from approx. 150 ns to 100 ns.

[1] i.MX8MP Reference Manual, rev.1 Section 11.7.2.5.3 "Timestamp
correction"

Signed-off-by: Johannes Zink <j.zink@pengutronix.de>
Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v3-2-61e63427735e@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-07 12:17:13 -07:00
Johannes Zink
26cfb838aa net: stmmac: correct MAC propagation delay
The IEEE 1588 standard specifies that the timestamps of packets must be
captured when the PTP message timestamp point (leading edge of first
octet after the start of frame delimiter) crosses the boundary between
the node and the network. As the MAC latches the timestamp at an
internal point, the captured timestamp must be corrected for the
additional data transmission latency, as described in the publicly
available datasheet [1].

This patch only corrects for the MAC-internal delay, which can be read
out from the MAC_Ingress_Timestamp_Latency register on DWMAC version 5,
since the PHY framework currently does not support querying the PHY
ingress and egress latency. The clock domain crossing circuit errors as
indicated in [1] are already accounted for in the
stmmac_get_tx_hwtstamp() function and are not corrected here.

As the latency varies for different link speeds and MII
modes of operation, the correction value needs to be updated on each
link state change.

As the delay also causes a phase shift in the timestamp counter compared
to the rest of the network, this correction will also reduce phase error
when generating PPS outputs from the timestamp counter.

Since the correction registers may be unavailable on some hardware and
no feature bits are documented for dynamic detection of the MAC
propagation delay readout, introduce a feature bit to explicitly enable
MAC delay correction in the glue code driver.

[1] i.MX8MP Reference Manual, rev.1 Section 11.7.2.5.3 "Timestamp
correction"

Signed-off-by: Johannes Zink <j.zink@pengutronix.de>
Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v2-1-3366f38ee9a6@pengutronix.de
Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v3-1-61e63427735e@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-07 12:17:13 -07:00
Roi Dayan
b56fb19c33 net/mlx5: Bridge, Only handle registered netdev bridge events
Don't handle bridge events for a netdev that doesn't belong to
an eswitch that registered for bridge events.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07 10:53:52 -07:00
Roi Dayan
d602be220c net/mlx5: E-Switch, Remove redundant arg ignore_flow_lvl
The arg is always passed as true and is thus redundant.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07 10:53:52 -07:00
Gal Pressman
58f6d9d044 net/mlx5: Fix typo reminder -> remainder
Fix a typo in esw_qos_devlink_rate_to_mbps(): reminder -> remainder.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07 10:53:52 -07:00
Ruan Jinjie
a0ae00e71e net/mlx5: remove many unnecessary NULL values
Many of these pointers are assigned before first use and need not be
initialized, so remove the unnecessary NULL assignments.

Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07 10:53:52 -07:00
Maher Sanalla
f14c1a14e6 net/mlx5: Allocate completion EQs dynamically
This commit enables dynamic allocation of EQs at runtime, allowing
for more flexibility in managing completion EQs and reducing the memory
overhead of driver load. Whenever a CQ is created for a given vector
index, the driver looks up whether a completion EQ is already mapped
for that vector and, if so, uses it. Otherwise, it allocates a new EQ
on demand and uses it for the CQ's completion events.

Add a protection lock to the EQ table to protect from concurrent EQ
creation attempts.

While at it, replace mlx5_vector2irqn()/mlx5_vector2eqn() with
mlx5_comp_eqn_get() and mlx5_comp_irqn_get() which will allocate an
EQ on demand if no EQ is found for the given vector.
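
A hedged sketch of the lookup-or-create flow in mlx5_comp_eqn_get()
(locking and field names are illustrative, not the exact driver code):

  int mlx5_comp_eqn_get(struct mlx5_core_dev *dev, u16 vecidx, int *eqn)
  {
          struct mlx5_eq_table *table = dev->priv.eq_table;
          struct mlx5_eq_comp *eq;

          mutex_lock(&table->comp_lock);   /* serialize EQ creation */
          eq = xa_load(&table->comp_eqs, vecidx);
          if (!eq)
                  eq = create_comp_eq(dev, vecidx);  /* on demand */
          mutex_unlock(&table->comp_lock);

          if (IS_ERR(eq))
                  return PTR_ERR(eq);
          *eqn = eq->core.eqn;
          return 0;
  }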

Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07 10:53:52 -07:00
Maher Sanalla
54c5297801 net/mlx5: Handle SF IRQ request in the absence of SF IRQ pool
In case the SF IRQ pool is not available due to setup limitations,
SF currently relies on the already allocated PF IRQs to fulfill
its IRQ vector requests.

However, with the dynamic EQ allocation introduced in the next patch,
it is possible that not all PF IRQs will be allocated after the driver
is loaded. In such a case, if an SF requests a completion IRQ without
having its own independent IRQ pool, the SF will lack a PF IRQ to
utilize.

To address this scenario, allocate an IRQ for the SF from the PF's IRQ
pool on demand. The new IRQ will be shared between the SF and its PF.

Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07 10:53:51 -07:00
Maher Sanalla
674dd4e2e0 net/mlx5: Rename mlx5_comp_vectors_count() to mlx5_comp_vectors_max()
To accurately represent its purpose, rename the function that retrieves
the value of maximum vectors from mlx5_comp_vectors_count() to
mlx5_comp_vectors_max().

Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07 10:53:51 -07:00
Maher Sanalla
f3147015fa net/mlx5: Add IRQ vector to CPU lookup function
Previously, IRQs were requested for all vectors once driver load
completed. However, as we move to support dynamic creation of EQs, this
will no longer be the case, as some IRQs will not exist at this stage.
Thus, in such a case, use the default CPU-to-IRQ mapping, which is the
serial mapping based on the IRQ vector index: the n'th vector gets
mapped to the n'th CPU.

Introduce an API function mlx5_comp_vector_cpu() that takes an IRQ index and
provides the corresponding CPU mapping. It utilizes the existing IRQ
affinity if defined, or resorts to the default serialized CPU mapping
otherwise.

Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07 10:53:51 -07:00
Maher Sanalla
ddd2c79da0 net/mlx5: Introduce mlx5_cpumask_default_spread
For better code readability in the completion IRQ request code, define
the cpu lookup per completion vector logic in a separate function.

The new method mlx5_cpumask_default_spread(), given a vector index 'n',
returns the n'th CPU. This new method will also be used in the next
patch.
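
The described behaviour closely matches what the generic
cpumask_local_spread() helper already provides, so a minimal sketch
under that assumption could be:

  /* return the n'th CPU, preferring CPUs local to the device's node */
  static int mlx5_cpumask_default_spread(int numa_node, int index)
  {
          return cpumask_local_spread(index, numa_node);
  }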

Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07 10:53:51 -07:00
Maher Sanalla
e3e56775e9 net/mlx5: Implement single completion EQ create/destroy methods
Currently, create_comp_eqs() function handles the creation of all
completion EQs for all the vectors on driver load. While on driver
unload, destroy_comp_eqs() performs the equivalent job.

In preparation for dynamic EQ creation, replace create_comp_eqs() /
destroy_comp_eqs() with create_comp_eq() / destroy_comp_eq() functions
which receive a vector index and allocate/destroy an EQ for that
specific vector, thus allowing more flexibility in the management of
completion EQs.

Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07 10:53:51 -07:00
Maher Sanalla
273c697fde net/mlx5: Use xarray to store and manage completion EQs
Use xarray to store the completion EQs instead of a linked list.
The xarray offers more scalability, reduced memory overhead, and
facilitates the lookup of a certain EQ given a vector index.
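
For reference, the xarray bookkeeping boils down to the standard xarray
API (a sketch, not the exact driver hunks):

  struct xarray comp_eqs;

  xa_init(&comp_eqs);

  /* store an EQ under its vector index; xa_err() yields any errno */
  err = xa_err(xa_store(&comp_eqs, vecidx, eq, GFP_KERNEL));

  /* direct lookup by vector index, no list walk required */
  eq = xa_load(&comp_eqs, vecidx);

  /* teardown */
  xa_erase(&comp_eqs, vecidx);
  xa_destroy(&comp_eqs);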

Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07 10:53:50 -07:00
Maher Sanalla
54b2cf41b8 net/mlx5: Refactor completion IRQ request/release handlers in EQ layer
Break the completion IRQ request/release functions into per-vector
handlers for both PCI devices and SFs in the EQ layer.

On EQ table creation, loop over all vectors and request an IRQ for each
one using the new per-vector functions. Perform the symmetrical change
when releasing IRQs on EQ table cleanup.

Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07 10:53:50 -07:00
Maher Sanalla
c8a0245c39 net/mlx5: Use xarray to store and manage completion IRQs
Use xarray to store the completion IRQs instead of a fixed-size allocated
array as not all completion IRQs will be requested on driver load, but
rather on demand when an EQ is created. The xarray offers more scalability,
reduced memory overhead, and provides the ability to dynamically resize the
array when needed.

Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07 10:53:50 -07:00
Maher Sanalla
a1772de78d net/mlx5: Refactor completion IRQ request/release API
Introduce a per-vector completion IRQ request API that requests a
single IRQ for a given vector index, replacing the API that requested
multiple IRQs at once.
On driver load, loop over all completion vectors and request an IRQ for
each one via the newly introduced API.

Symmetrically, introduce an IRQ release API per vector. On driver
unload, loop over all vectors and release each completion IRQ via
the new per-vector API.

As IRQ vectors will be requested dynamically later in the patchset,
add a cpumask of the bound CPUs to avoid possibly mapping two IRQs of
the same device to the same CPU.

Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07 10:53:50 -07:00
Maher Sanalla
18cf3d31f8 net/mlx5: Track the current number of completion EQs
In preparation for allocating completion EQs dynamically, add a counter
to track the number of completion EQs currently allocated. Store the
maximum number of EQs in the max_comp_eqs variable.

Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-07 10:53:50 -07:00
Yue Haibing
cc97777c80 udp/udplite: Remove unused function declarations udp{,lite}_get_port()
Commit 6ba5a3c52da0 ("[UDP]: Make full use of proto.h.udp_hash innovation.")
removed these implementations but left the declarations behind.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-07 08:53:55 +01:00
Yue Haibing
a6ab5c29b8 net: sfp: Remove unused function declaration sfp_link_configure()
Commit ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices and sfp cages")
declared sfp_link_configure() but never implemented it.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-07 08:53:55 +01:00
Yue Haibing
2c6af36beb ndisc: Remove unused ndisc_ifinfo_sysctl_strategy() declaration
Commit f8572d8f2a2b ("sysctl net: Remove unused binary sysctl code")
left behind this declaration.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-07 08:53:55 +01:00
Yue Haibing
992b47851b net: pkt_cls: Remove unused inline helpers
Commit acb674428c3d ("net: sched: introduce per-block callbacks")
implemented these but never used them.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-07 08:53:54 +01:00
Yue Haibing
047551cd30 neighbour: Remove unused function declaration pneigh_for_each()
pneigh_for_each() has never been implemented since the beginning of git history.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-07 08:53:54 +01:00
Yue Haibing
f6ecb68b38 net/tls: Remove unused function declarations
Commit 3c4d7559159b ("tls: kernel TLS support") declared but never implemented
these functions.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-07 08:53:54 +01:00
Marc Kleine-Budde
84059a0ef5 Revert "riscv: dts: allwinner: d1: Add CAN controller nodes"
It turned out the dtsi changes were not quite ready; revert them for
now.

This reverts commit 6ea1ad888f5900953a21853e709fa499fdfcb317.

Link: https://lore.kernel.org/all/2690764.mvXUDI8C0e@jernej-laptop
Suggested-by: Jernej Škrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/all/20230807-riscv-allwinner-d1-revert-can-controller-nodes-v1-1-eb3f70b435d9@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-08-07 09:37:31 +02:00
Vladimir Oltean
c35e927cbe net: omit ndo_hwtstamp_get() call when possible in dev_set_hwtstamp_phylib()
Setting dev->priv_flags & IFF_SEE_ALL_HWTSTAMP_REQUESTS is only legal
for drivers which were converted to ndo_hwtstamp_get() and
ndo_hwtstamp_set(), and it is only there that we call ndo_hwtstamp_set()
for a request that otherwise goes to phylib (for stuff like packet traps,
which need to be undone if phylib failed, hence the old_cfg logic).

The problem is that we end up calling ndo_hwtstamp_get() when we don't
need to (even if the SIOCSHWTSTAMP wasn't intended for phylib, or if it
was, but the driver didn't set IFF_SEE_ALL_HWTSTAMP_REQUESTS). For those
unnecessary conditions, we share a code path with virtual drivers (vlan,
macvlan, bonding) where ndo_hwtstamp_get() is implemented as
generic_hwtstamp_get_lower(), and may be resolved through
generic_hwtstamp_ioctl_lower() if the lower device is unconverted.

I.e. this situation:

$ ip link add link eno0 name eno0.100 type vlan id 100
$ hwstamp_ctl -i eno0.100 -t 1

We are unprepared to deal with this, because if ndo_hwtstamp_get() is
resolved through a legacy ndo_eth_ioctl(SIOCGHWTSTAMP) lower_dev
implementation, that needs a non-NULL old_cfg.ifr pointer, and we don't
have it.

But we don't even need to deal with it either. In the general case,
drivers may not even implement SIOCGHWTSTAMP handling, only SIOCSHWTSTAMP,
so it makes sense to completely avoid a SIOCGHWTSTAMP call if we can.

The solution is to split the single "if" condition into 3 smaller ones,
thus separating the decision to call ndo_hwtstamp_get() from the
decision to call ndo_hwtstamp_set(). The third "if" condition is
identical to the first one, and both are subsets of the second one.
Thus, the "cfg" argument of kernel_hwtstamp_config_changed() is always
valid.
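
A hedged sketch of the resulting three-condition flow (variable names
simplified; see dev_set_hwtstamp_phylib() for the real code):

  bool phy_req = phy_has_hwtstamp(dev->phydev);
  bool see_all = dev->priv_flags & IFF_SEE_ALL_HWTSTAMP_REQUESTS;

  if (phy_req && see_all) {          /* 1: snapshot only when needed */
          err = ops->ndo_hwtstamp_get(dev, &old_cfg);
          if (err)
                  return err;
  }

  if (!phy_req || see_all) {         /* 2: superset of 1 and 3 */
          err = ops->ndo_hwtstamp_set(dev, cfg, extack);
          if (err)
                  return err;
  }

  if (phy_req && see_all)            /* 3: identical to 1 */
          changed = kernel_hwtstamp_config_changed(&old_cfg, cfg);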

Reported-by: Eric Dumazet <edumazet@google.com>
Closes: https://lore.kernel.org/netdev/CANn89iLOspJsvjPj+y8jikg7erXDomWe8sqHMdfL_2LQSFrPAg@mail.gmail.com/
Fixes: fd770e856e22 ("net: remove phy_has_hwtstamp() -> phy_mii_ioctl() decision from converted drivers")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 13:25:10 +01:00
Yang Yingliang
54024dbec9 net: ethernet: adi: adin1110: use eth_broadcast_addr() to assign broadcast address
Use eth_broadcast_addr() to assign the broadcast address instead
of memset().
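
For reference, the helper in <linux/etherdevice.h> is a thin wrapper
around the same memset():

  static inline void eth_broadcast_addr(u8 *addr)
  {
          memset(addr, 0xff, ETH_ALEN);
  }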

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 13:22:47 +01:00
Yu Liao
813f3662c2 ibmvnic: remove unused rc variable
gcc with W=1 reports
drivers/net/ethernet/ibm/ibmvnic.c:194:13: warning: variable 'rc' set but not used [-Wunused-but-set-variable]
This variable is not used so remove it.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202308040609.zQsSXWXI-lkp@intel.com/
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Nick Child <nnac123@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 13:20:44 +01:00
Haiyang Zhang
b1d13f7a3b net: mana: Add page pool for RX buffers
Add a page pool for RX buffers, for faster buffer cycling and reduced
CPU usage.

The standard page pool API is used.

In an iperf test with 128 threads, this patch improved the throughput
by 12-15%, and decreased the IRQ-associated CPU's usage from 99-100% to
10-50%.
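
A hedged sketch of the standard page pool API in an RX setup path
(field values are illustrative; the mana specifics differ):

  struct page_pool_params pprm = {
          .pool_size = rx_ring_size,  /* pages the ring keeps in flight */
          .nid       = NUMA_NO_NODE,
          .napi      = &rxq->napi,    /* enables lockless NAPI recycling */
  };
  struct page_pool *pool = page_pool_create(&pprm);

  struct page *page = page_pool_dev_alloc_pages(pool);  /* RX refill */
  /* recycle a buffer the stack did not consume (NAPI context) */
  page_pool_put_full_page(pool, page, true);

  page_pool_destroy(pool);                        /* queue teardown */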

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 08:36:06 +01:00
David S. Miller
48ae409aaf Merge branch 'gve-desc'
Rushil Gupta says:

====================
gve: Add QPL mode for DQO descriptor format

GVE supports QPL ("queue-page-list") mode where
all data is communicated through a set of pre-registered
pages. Adding this mode to DQO.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 08:34:37 +01:00
Rushil Gupta
5a3f8d1231 gve: update gve.rst
Add a note about QPL and RDA modes.

Signed-off-by: Rushil Gupta <rushilg@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Bailey Forrest <bcf@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 08:34:37 +01:00
Rushil Gupta
e7075ab4fb gve: RX path for DQO-QPL
The RX path allocates the QPL page pool at queue creation, and
tries to reuse these pages through page recycling. This patch
ensures that on refill no non-QPL pages are posted to the device.

When the driver is running low on free buffers, an on-demand
allocation step kicks in that allocates a non-QPL page for the SKB to
free up the QPL page in use.

gve_try_recycle_buf was moved to gve_rx_append_frags so that the driver
does not attempt to mark a buffer as used if a non-QPL page was
allocated on demand.

Signed-off-by: Rushil Gupta <rushilg@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Bailey Forrest <bcf@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 08:34:36 +01:00
Rushil Gupta
a6fb8d5a8b gve: Tx path for DQO-QPL
Each QPL page is divided into GVE_TX_BUFS_PER_PAGE_DQO buffers.
When a packet needs to be transmitted, we break it into chunks of at
most GVE_TX_BUF_SIZE_DQO bytes and transmit each chunk using a TX
descriptor.
We allocate the TX buffers from the free list in dqo_tx.
We store these TX buffer indices in an array in the pending_packet
structure.

The TX buffers are returned to the free list in dqo_compl after
receiving packet completion or when removing packets from miss
completions list.
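
Roughly, the chunking works out as below (helper names are illustrative
stand-ins; only the GVE_* macros come from the driver):

  int nbufs = DIV_ROUND_UP(skb->len, GVE_TX_BUF_SIZE_DQO);

  for (i = 0; i < nbufs; i++) {
          s16 idx = tx_buf_alloc(tx);          /* from dqo_tx free list */
          u32 len = min_t(u32, skb->len - i * GVE_TX_BUF_SIZE_DQO,
                          GVE_TX_BUF_SIZE_DQO);

          tx_buf_fill(tx, idx, skb, i * GVE_TX_BUF_SIZE_DQO, len);
          tx_post_desc(tx, idx, len);          /* one TX descriptor */
          pending_packet->tx_buf_ids[i] = idx; /* freed in dqo_compl */
  }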

Signed-off-by: Rushil Gupta <rushilg@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Bailey Forrest <bcf@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 08:34:36 +01:00
Rushil Gupta
66ce8e6b49 gve: Control path for DQO-QPL
GVE supports QPL ("queue-page-list") mode where
all data is communicated through a set of pre-registered
pages. Adding this mode to DQO descriptor format.

Add checks, ABI changes and device options to support
QPL mode for DQO in addition to GQI. Also, use the
pages-per-qpl value supplied by the device option to control the
size of the "queue-page-list".

Signed-off-by: Rushil Gupta <rushilg@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Bailey Forrest <bcf@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 08:34:36 +01:00
David S. Miller
16fd753995 Merge branch 'tcp-options-lockless'
Eric Dumazet says:

====================
tcp: set few options locklessly

This series avoids the socket lock for six TCP options.

They are not heavily used, but this exercise can give
ideas for other parts of the TCP/IP stack :)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 08:24:56 +01:00
Eric Dumazet
6e97ba552b tcp: set TCP_DEFER_ACCEPT locklessly
The rskq_defer_accept field can be read/written without
holding the socket lock.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 08:24:56 +01:00
Eric Dumazet
a81722ddd7 tcp: set TCP_LINGER2 locklessly
tp->linger2 can be set locklessly as long as readers
use READ_ONCE().
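
The pattern used throughout this series is the usual one-sided
store/load annotation; a sketch (not the exact patch hunks):

  /* setsockopt(TCP_LINGER2) path: lockless annotated store */
  WRITE_ONCE(tp->linger2, val * HZ);

  /* readers pair with it */
  int linger2 = READ_ONCE(tp->linger2);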

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 08:24:55 +01:00
Eric Dumazet
84485080cb tcp: set TCP_KEEPCNT locklessly
tp->keepalive_probes can be set locklessly, readers
are already taking care of this field being potentially
set by other threads.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 08:24:55 +01:00
Eric Dumazet
6fd70a6b4e tcp: set TCP_KEEPINTVL locklessly
tp->keepalive_intvl can be set locklessly, readers
are already taking care of this field being potentially
set by other threads.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 08:24:55 +01:00
Eric Dumazet
d58f2e15aa tcp: set TCP_USER_TIMEOUT locklessly
icsk->icsk_user_timeout can be set locklessly,
if all read sides use READ_ONCE().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 08:24:55 +01:00
Eric Dumazet
d44fd4a767 tcp: set TCP_SYNCNT locklessly
icsk->icsk_syn_retries can safely be set without locking the socket.

We have to add READ_ONCE() annotations in tcp_fastopen_synack_timer()
and tcp_write_timeout().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-06 08:24:55 +01:00
Jakub Kicinski
81083076a0 wireless-next patches for v6.6

Merge tag 'wireless-next-2023-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.6

The first pull request for v6.6 and only driver patches this time.
Nothing special really standing out, it has been quiet most likely due
to vacations.

Major changes:

rtl8xxxu
 - enable AP mode for: RTL8192FU, RTL8710BU (RTL8188GU),
   RTL8192EU and RTL8723BU

mwifiex
 - allow moving to a different namespace

mt76
 - preparation for mt7925 support
 - mt7981 support

ath12k
 - Extremely High Throughput (EHT) PHY support for Wi-Fi 7

* tag 'wireless-next-2023-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (172 commits)
  wifi: rtw89: return failure if needed firmware elements are not recognized
  wifi: rtw89: add to parse firmware elements of BB and RF tables
  wifi: rtw89: introduce infrastructure of firmware elements
  wifi: rtw89: add firmware suit for BB MCU 0/1
  wifi: rtw89: add firmware parser for v1 format
  wifi: rtw89: introduce v1 format of firmware header
  wifi: rtw89: support firmware log with formatted text
  wifi: rtw89: recognize log format from firmware file
  wifi: ath12k: avoid deadlock by change ieee80211_queue_work for regd_update_work
  wifi: ath12k: add handler for scan event WMI_SCAN_EVENT_DEQUEUED
  wifi: ath12k: relax list iteration in ath12k_mac_vif_unref()
  wifi: ath12k: configure puncturing bitmap
  wifi: ath12k: parse WMI service ready ext2 event
  wifi: ath12k: add MLO header in peer association
  wifi: ath12k: peer assoc for 320 MHz
  wifi: ath12k: add WMI support for EHT peer
  wifi: ath12k: prepare EHT peer assoc parameters
  wifi: ath12k: add EHT PHY modes
  wifi: ath12k: propagate EHT capabilities to userspace
  wifi: ath12k: WMI support to process EHT capabilities
  ...
====================

Link: https://lore.kernel.org/r/87msz7j942.fsf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04 18:34:25 -07:00
Jakub Kicinski
90cd5467d1 Merge branch 'tcp-disable-header-prediction-for-md5'
Kuniyuki Iwashima says:

====================
tcp: Disable header prediction for MD5.

The 1st patch disables header prediction for MD5 flows and the 2nd
patch updates the stale comment in tcp_parse_options().
====================

Link: https://lore.kernel.org/r/20230803224552.69398-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04 18:28:38 -07:00
Kuniyuki Iwashima
b205153689 tcp: Update stale comment for MD5 in tcp_parse_options().
Since commit 9ea88a153001 ("tcp: md5: check md5 signature without socket
lock"), the MD5 option is checked in tcp_v[46]_rcv().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230803224552.69398-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04 18:28:36 -07:00
Kuniyuki Iwashima
d0f2b7a9ca tcp: Disable header prediction for MD5 flow.
A TCP socket saves the minimum required header length in tcp_header_len
of struct tcp_sock, and later the value is used in __tcp_fast_path_on()
to generate a part of the TCP header in tcp_sk(sk)->pred_flags.

In tcp_rcv_established(), if the incoming packet has the same pattern
as pred_flags, we enter the fast path and skip full option parsing.

The MD5 option is parsed in tcp_v[46]_rcv(), so we need not parse it
again later in tcp_rcv_established() unless other options exist.  We
add TCPOLEN_MD5SIG_ALIGNED to tcp_header_len in two paths to avoid the
slow path.

For passive open connections with MD5, we add TCPOLEN_MD5SIG_ALIGNED
to tcp_header_len in tcp_create_openreq_child() after 3WHS.

On the other hand, we do it in tcp_connect_init() for active open
connections.  However, the value is overwritten while processing
SYN+ACK or crossed SYN in tcp_rcv_synsent_state_process().

These two cases will have the wrong value in pred_flags and never go
into the fast path.

We could update tcp_header_len in tcp_rcv_synsent_state_process(), but
a test with slightly modified netperf which uses MD5 for each flow shows
that the slow path is actually a bit faster than the fast path.

  On c5.4xlarge EC2 instance (16 vCPU, 32 GiB mem)

  $ for i in {1..10}; do
  ./super_netperf $(nproc) -H localhost -l 10 -- -m 256 -M 256;
  done

  Avg of 10
  * 36e68eadd303  : 10.376 Gbps
  * all fast path : 10.374 Gbps (patch v2, See Link)
  * all slow path : 10.394 Gbps

The header prediction is not worth adding complexity for MD5, so let's
disable it for MD5.
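
For reference, pred_flags is derived from tcp_header_len in
__tcp_fast_path_on() (include/net/tcp.h), which is why a stale
tcp_header_len keeps the connection on the slow path:

  static inline void __tcp_fast_path_on(struct tcp_sock *tp, u32 snd_wnd)
  {
          /* encodes the data offset (header length), ACK flag and
           * expected window; compared against incoming headers in
           * tcp_rcv_established()
           */
          tp->pred_flags = htonl((tp->tcp_header_len << 26) |
                                 ntohl(TCP_FLAG_ACK) |
                                 snd_wnd);
  }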

Link: https://lore.kernel.org/netdev/20230803042214.38309-1-kuniyu@amazon.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230803224552.69398-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04 18:28:36 -07:00
Russell King (Oracle)
f4bf467883 net: phy: move marking PHY on SFP module into SFP code
Move marking the PHY as being on a SFP module into the SFP code between
getting the PHY device (and thus initialising the phy_device structure)
and registering the discovered device.

This means that PHY drivers can use phy_on_sfp() in their match and
get_features methods.
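
A hedged sketch of what this enables (the MYDRV_* identifiers are
hypothetical):

  static int mydrv_match_phy_device(struct phy_device *phydev)
  {
          /* bind this driver entry only when the PHY sits on an SFP */
          return (phydev->phy_id & MYDRV_PHY_ID_MASK) == MYDRV_PHY_ID &&
                 phy_on_sfp(phydev);
  }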

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qRaga-001vKt-8X@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04 18:25:52 -07:00
Yue Haibing
852c18d561 mlxsw: spectrum: Remove unused function declarations
Commit c3d2ed93b14d ("mlxsw: Remove old parsing depth infrastructure")
left behind mlxsw_sp_nve_inc_parsing_depth_get()/mlxsw_sp_nve_inc_parsing_depth_put().
And commit 532b49e41e64 ("mlxsw: spectrum_span: Derive SBIB from maximum port speed & MTU")
removed mlxsw_sp_span_port_mtu_update()/mlxsw_sp_span_speed_update_work() but left the
declarations.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20230803142047.42660-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-04 18:03:05 -07:00