1061203 Commits

Author SHA1 Message Date
Amit Cohen
bf0a8b9bf2 selftests: forwarding: Add Q-in-VNI test for IPv6
Add test to check Q-in-VNI traffic with IPv6 underlay and overlay.
The test is similar to the existing IPv4 test.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 17:14:34 -08:00
Amit Cohen
6c6ea78a11 selftests: forwarding: Add a test for VxLAN symmetric routing with IPv6
In a similar fashion to the asymmetric test, add a test for symmetric
routing. In symmetric routing both the ingress and egress VTEPs perform
routing in the overlay network into / from the VxLAN tunnel. Packets in
different directions use the same VNI - the L3 VNI.
Different tenants (VRFs) use different L3 VNIs.

Add a test which is similar to the existing IPv4 test to check IPv6.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 17:14:34 -08:00
Amit Cohen
2902bae465 selftests: forwarding: Add a test for VxLAN asymmetric routing with IPv6
In asymmetric routing the ingress VTEP routes the packet into the
correct VxLAN tunnel, whereas the egress VTEP only bridges the packet to
the correct host. Therefore, packets in different directions use
different VNIs - the target VNI.

Add a test which is similar to the existing IPv4 test to check IPv6.

The test uses a simple topology with two VTEPs and two VNIs and verifies
that ping passes between hosts (local / remote) in the same VLAN (VNI)
and in different VLANs belonging to the same tenant (VRF).

While the test does not check VM mobility, it does configure an anycast
gateway using a macvlan device on both VTEPs.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 17:14:34 -08:00
Amit Cohen
dc498cdda0 selftests: forwarding: vxlan_bridge_1q: Remove unused function
Remove `vxlan_ping_test()` which is not used and probably was copied
mistakenly from vxlan_bridge_1d.sh.

This was found while adding an equivalent test for IPv6.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 17:14:33 -08:00
Amit Cohen
728b35259e selftests: forwarding: Add VxLAN tests with a VLAN-aware bridge for IPv6
The tests are very similar to their VLAN-unaware counterpart
(vxlan_bridge_1d_ipv6.sh and vxlan_bridge_1d_port_8472_ipv6.sh),
but instead of using multiple VLAN-unaware bridges, a single VLAN-aware
bridge is used with multiple VLANs.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 17:14:33 -08:00
Amit Cohen
b07e9957f2 selftests: forwarding: Add VxLAN tests with a VLAN-unaware bridge for IPv6
Add tests similar to vxlan_bridge_1d.sh and vxlan_bridge_1d_port_8472.sh.

The tests set up a topology with three VxLAN endpoints: one
"local", possibly offloaded, and two "remote", formed using veth pairs
and likely purely software bridges. The "local" endpoint is connected to
host systems by a VLAN-unaware bridge.

Since VxLAN tunnels must be unique per namespace, each of the "remote"
endpoints is in its own namespace. H3 forms the bridge between the three
domains.

Send IPv4 packets and IPv6 packets with IPv6 underlay.
Use `TC_FLAG`, which is defined in `forwarding.config` file, for TC
checks. `TC_FLAG` allows testing that on HW datapath, the traffic
actually goes through HW.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 17:14:32 -08:00
Amit Cohen
0cd0b1f7a6 selftests: lib.sh: Add PING_COUNT to allow sending configurable amount of packets
Currently `ping_do()` and `ping6_do()` send 10 packets.

There are cases that it is not possible to catch only the interesting
packets using tc rule, so then, it is possible to send many packets and
verify that at least this amount of packets hit the rule.

Add `PING_COUNT` variable, which is set to 10 by default, to allow tests
sending more than 10 packets using the existing ping API.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 17:14:32 -08:00
Amit Cohen
70ec72d5b6 mlxsw: spectrum_flower: Make vlan_id limitation more specific
Spectrum ASICs do not support matching of VLAN ID at egress.
Currently, mlxsw driver forbids matching of all VLAN related fields at
egress, which is too strict check.

For example, the following filter is not supported by the driver:
$ tc filter add dev swpX egress protocol 802.1q pref 1 handle 101 flower
vlan_ethtype ipv4 src_ip .. dst_ip .. skip_sw action pass
Error: mlxsw_spectrum: vlan_id key is not supported on egress.
We have an error talking to the kernel

The filter above does not match on VLAN ID, but is bounced anyway.

Make the check more specific, forbid only matching of 'vlan_id' at egress.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 17:14:32 -08:00
Jakub Kicinski
5de24da1b3 mlx5-updates-2021-12-21
1) From Shay Drory: Devlink user knobs to control device's EQ size
 
 This series provides knobs which will enable users to
 minimize memory consumption of mlx5 Functions (PF/VF/SF).
 mlx5 exposes two new generic devlink params for EQ size
 configuration and uses devlink generic param max_macs.
 
 LINK: https://lore.kernel.org/netdev/20211208141722.13646-1-shayd@nvidia.com/
 
 2) From Tariq and Lama, allocate software channel objects and statistics
   of a mlx5 netdevice private data dynamically upon first demand to save on
   memory.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmHClsoACgkQSD+KveBX
 +j5d6gf+NKH8mQd6Aa/Gt4Y2DtS7GzN+dPD+2MokuT2YSWU8kVGlMnC01MTE2V2s
 jiOUC6ZEbayx2ORzd58XlcfAMEZz2WH8VGLXTdmM3niv13D7AueSsUP5SVK7Oamk
 h0bwzOV8CE1Ru5s3Q3zPaLBqTWN+TmyG42HNwYyD8GZT7O5q8iBfor97N2KN5U5u
 rhxzFcD2jfhtYxACkjea6RTN9CTM7l9FS4+DIjhu53PAOBaOZLj8NKSYUqP6I17L
 ITJDd9za4Nq/YiB2yU4UVNkDEXsuXWoqnTNsL1oEUPForEomJsnikHV9ywzyKVMn
 SbWdZOr5C14UBH9wyyauEBBv5oLB7w==
 =cotW
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2021-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-12-21

1) From Shay Drory: Devlink user knobs to control device's EQ size

This series provides knobs which will enable users to
minimize memory consumption of mlx5 Functions (PF/VF/SF).
mlx5 exposes two new generic devlink params for EQ size
configuration and uses devlink generic param max_macs.

LINK: https://lore.kernel.org/netdev/20211208141722.13646-1-shayd@nvidia.com/

2) From Tariq and Lama, allocate software channel objects and statistics
  of a mlx5 netdevice private data dynamically upon first demand to save on
  memory.

* tag 'mlx5-updates-2021-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: Take packet_merge params directly from the RX res struct
  net/mlx5e: Allocate per-channel stats dynamically at first usage
  net/mlx5e: Use dynamic per-channel allocations in stats
  net/mlx5e: Allow profile-specific limitation on max num of channels
  net/mlx5e: Save memory by using dynamic allocation in netdev priv
  net/mlx5e: Add profile indications for PTP and QOS HTB features
  net/mlx5e: Use bitmap field for profile features
  net/mlx5: Remove the repeated declaration
  net/mlx5: Let user configure max_macs generic param
  devlink: Clarifies max_macs generic devlink param
  net/mlx5: Let user configure event_eq_size param
  devlink: Add new "event_eq_size" generic device param
  net/mlx5: Let user configure io_eq_size param
  devlink: Add new "io_eq_size" generic device param
====================

Link: https://lore.kernel.org/r/20211222031604.14540-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 17:13:01 -08:00
Jakub Kicinski
e6e5904455 codel: remove unnecessary pkt_sched.h include
Commit d068ca2ae2e6 ("codel: split into multiple files") moved all
Qdisc-related code to codel_qdisc.h, move the include of pkt_sched.h
as well.

This is similar to the previous commit, although we don't care as
much about incremental builds after pkt_sched.h was touched itself
it is included by net/sch_generic.h which is modified ~20 times
a year.

This decreases the incremental build size after touching pkt_sched.h
from 1592 to 617 objects.

Fix unmasked missing includes in WiFi drivers.

Acked-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20211221193941.3805147-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 15:03:51 -08:00
Jakub Kicinski
15fcb10311 codel: remove unnecessary sock.h include
Since sock.h is modified relatively often (60 times in the last
12 months) it seems worthwhile to decrease the incremental build
work.

CoDel's header includes net/inet_ecn.h which in turn includes net/sock.h.
codel.h is itself included by mac80211 which is included by much of
the WiFi stack and drivers. Removing the net/inet_ecn.h include from
CoDel breaks the dependecy between WiFi and sock.h.

Commit d068ca2ae2e6 ("codel: split into multiple files") moved all
the code which actually needs ECN helpers out to net/codel_impl.h,
the include can be moved there as well.

This decreases the incremental build size after touching sock.h
from 4999 objects to 4051 objects.

Fix unmasked missing includes in WiFi drivers.

Acked-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20211221193941.3805147-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 15:03:47 -08:00
Colin Ian King
62a3106697 net: broadcom: bcm4908enet: remove redundant variable bytes
The variable bytes is being used to summate slot lengths,
however the value is never used afterwards. The summation
is redundant so remove variable bytes.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20211222003937.727325-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 14:58:43 -08:00
Jesse Brandeburg
0092db5fac ice: trivial: fix odd indenting
Fix an odd indent where some code was left indented, and causes smatch
to warn:
ice_log_pkg_init() warn: inconsistent indenting

While here, for consistency, add a break after the default case.

This commit has a Fixes: but we caught this while it was only in net-next.

Fixes: 247dd97d713c ("ice: Refactor status flow for DDP load")
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20211221230538.2546315-1-jesse.brandeburg@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 14:57:55 -08:00
Jakub Kicinski
2030eddced Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2021-12-21

This series contains updates to ice driver only.

Karol modifies the reset flow to correct issues with PTP reset.

Jake extends PTP support for E822 based devices. This includes a few
cleanup patches, that fix some minor issues. In addition, there are some
slight refactors to ease the addition of E822 support, followed by adding
the new hardware implementation ice_ptp_hw.c.

There are a few major differences with E822 support compared to E810
support:

*) The E822 device has a Clock Generation Unit which must be initialized in
order to generate proper clock frequencies on the output that drives the PTP
hardware clock registers

*) The E822 PHY is a bit different and requires a more complex
initialization procedure which must be rerun any time the link configuration
changes.

*) The E822 devices support enhanced timestamp calibration by making use of
a process called Vernier offset measurement. This allows the hardware to
measure phase offset related to the PHY clocks for Serdes and FEC, reducing
the inaccuracy of the timestamp relative to the actual packet transmission
and receipt. Making use of this requires data gathered from the first
transmitted and received packets, and waiting for the PHY to complete the
calibration measurements. This is done as part of a new kthread, ov_work.
Note that to avoid delay in enabling timestamps, we start the PHY in
'bypass' mode which allows timestamps to be captured without the Vernier
calibration measurement. Once the first packets have been sent and received,
we then complete the calibration setup and exit bypass mode and begin using
the more precise timestamps. According to the datasheet, timestamps without
calibration data can be incorrect relative to actual receipt or transmission
by up to 1 clock cycle (~1.25 nanoseconds), while calibrated timestamps
should be correct to within 1/8th of a clock cycle (~0.15 nanoseconds).

*) E822 devices support crosstimestamping via PCIe PTM, which we enable when
available on the platform.

There is a fair amount of logic required to perform PHY and CGU
initialization, which is the vast majority of the new code, but it is fairly
self contained within ice_ptp_hw.c, with the exception of monitoring for
offset validity being handled by a kthread.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: support crosstimestamping on E822 devices if supported
  ice: exit bypass mode once hardware finishes timestamp calibration
  ice: ensure the hardware Clock Generation Unit is configured
  ice: implement basic E822 PTP support
  ice: convert clk_freq capability into time_ref
  ice: introduce ice_ptp_init_phc function
  ice: use 'int err' instead of 'int status' in ice_ptp_hw.c
  ice: PTP: move setting of tstamp_config
  ice: introduce ice_base_incval function
  ice: Fix E810 PTP reset flow
====================

Link: https://lore.kernel.org/r/20211221174845.3063640-1-anthony.l.nguyen@intel.com
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 14:25:42 -08:00
Tariq Toukan
1f08917ab9 net/mlx5e: Take packet_merge params directly from the RX res struct
As packet_merge params structure is saved on the RX resources structure, there
is no need to pass it separately.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:58 -08:00
Lama Kayal
fa691d0c9c net/mlx5e: Allocate per-channel stats dynamically at first usage
Make stats allocation per-channel dynamic on demand, at channel open
operation.

Previously the stats array was pre-allocated for the maximum possible
number of channels. Here we defer the per-channel stats instance allocation
upon its first usage, so that it's allocated only if really needed.

Allocating stats on demand helps maintain a more memory-efficient code,
as we're saving memory when the used number of channels is smaller than
the maximum.

The stats memory instances are still freed in mlx5e_priv_arrays_free(),
so that they are persistent to channels' closure.

Memory size allocated for struct mlx5e_channel_stats is 3648 bytes.
If maximum number of channel stands for 64, the total memory space
allocated for stats is 3648x64 = 228K bytes. In scenarios where the
number of channels in use is significantly smaller than maximum number,
the memory saved can be remarkable.

Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:58 -08:00
Tariq Toukan
be98737a4f net/mlx5e: Use dynamic per-channel allocations in stats
Make stats array an array of pointer. This patch comes in to prepare for
the next patch where allocations of the stats are to be performed
dynamically on first usage.

Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:57 -08:00
Tariq Toukan
473baf2e9e net/mlx5e: Allow profile-specific limitation on max num of channels
Let SF/VF representor's netdev use profile-specific limitation on
max_nch to reduce its memory and HW resources consumption.

This is particularly important for environments with limited memory
and high number of SFs.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Vu Pham <vuhuong@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:57 -08:00
Tariq Toukan
0246a57ab5 net/mlx5e: Save memory by using dynamic allocation in netdev priv
Many arrays in priv are statically allocated with a pre-defined maximum
(for num channels, num TCs, etc...), that is in some cases significantly
larger than the actual maximum. Examples:
- The more VFs are supported, the less MSIX vectors each of them could
  have. This limits the max_nch for each.
- Systems with limited number of cores or MSIX (< 64).
- Netdev profiles that do not support: QoS (DCB / HTB), PTP TX port
  timestamping.

Here we save some amount of memory by moving several structures
and arrays to follow the actual maximum instead.
This patch also prepares the code for even more savings to follow.

For example, on a system where the maximum num of channel is 8,
the channels stats structs alone go down from 3648*64 = 228 KB to
3648*8 = 28.5 KB per interface.

This is important for environments with high number of VFs/SFs or
limited memory.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:57 -08:00
Tariq Toukan
1958c2bddf net/mlx5e: Add profile indications for PTP and QOS HTB features
Let the profile indicate support of the PTP and HTB (QOS) features.
This unifies the logic that calculates the number of netdev queues needed
for the features, and allows simplification of mlx5e_create_netdev(),
which no longer requires number of rx/tx queues as parameters.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:56 -08:00
Tariq Toukan
6c72cb05d4 net/mlx5e: Use bitmap field for profile features
Use a features bitmap field in mlx5e_profile to declare profile support
state of the different features.  Let it replace the existing
rx_ptp_support boolean. It will be extended to cover more features in a
downstream patch.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:56 -08:00
Shaokun Zhang
08ab0ff47b net/mlx5: Remove the repeated declaration
Function 'mlx5_esw_vport_match_metadata_supported' and
'mlx5_esw_offloads_vport_metadata_set' are declared twice, so remove
the repeated declaration and blank line.

Cc: Saeed Mahameed <saeedm@nvidia.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:56 -08:00
Shay Drory
8680a60fc1 net/mlx5: Let user configure max_macs generic param
Currently, max_macs is taking 70Kbytes of memory per function. This
size is not needed in all use cases, and is critical with large scale.
Hence, allow user to configure the number of max_macs.

For example, to reduce the number of max_macs to 1, execute::
$ devlink dev param set pci/0000:00:0b.0 name max_macs value 1 \
              cmode driverinit
$ devlink dev reload pci/0000:00:0b.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:55 -08:00
Shay Drory
0ad598d0be devlink: Clarifies max_macs generic devlink param
The generic param max_macs documentation isn't clear.
Replace it with a more descriptive documentation

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:55 -08:00
Shay Drory
57ca767820 net/mlx5: Let user configure event_eq_size param
Event EQ is an EQ which received the notification of almost all the
events generated by the NIC.
Currently, each event EQ is taking 512KB of memory. This size is not
needed in most use cases, and is critical with large scale. Hence,
allow user to configure the size of the event EQ.

For example to reduce event EQ size to 64, execute::
$ devlink dev param set pci/0000:00:0b.0 name event_eq_size value 64 \
              cmode driverinit
$ devlink dev reload pci/0000:00:0b.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:55 -08:00
Shay Drory
0b5705ebc3 devlink: Add new "event_eq_size" generic device param
Add new device generic parameter to determine the size of the
asynchronous control events EQ.

For example, to reduce event EQ size to 64, execute:
$ devlink dev param set pci/0000:06:00.0 \
              name event_eq_size value 64 cmode driverinit
$ devlink dev reload pci/0000:06:00.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:54 -08:00
Shay Drory
0844fa5f7b net/mlx5: Let user configure io_eq_size param
Currently, each I/O EQ is taking 128KB of memory. This size
is not needed in all use cases, and is critical with large scale.
Hence, allow user to configure the size of I/O EQs.

For example, to reduce I/O EQ size to 64, execute:
$ devlink dev param set pci/0000:00:0b.0 name io_eq_size value 64 \
              cmode driverinit
$ devlink dev reload pci/0000:00:0b.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:54 -08:00
Shay Drory
47402385d0 devlink: Add new "io_eq_size" generic device param
Add new device generic parameter to determine the size of the
I/O completion EQs.

For example, to reduce I/O EQ size to 64, execute:
$ devlink dev param set pci/0000:06:00.0 \
              name io_eq_size value 64 cmode driverinit
$ devlink dev reload pci/0000:06:00.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21 19:08:54 -08:00
Jakub Kicinski
f4f2970dfd Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
1GbE Intel Wired LAN Driver Updates 2021-12-21

This series contains updates to igc, igb, igbvf, and fm10k drivers.

Sasha removes unused defines and enum values from igc driver.

Jason Wang removes a variable whose value never changes and, instead,
returns the value directly for igb.

Karen adjusts a reset message from warning to info for igbvf.

Xiang wangx removes a repeated word for fm10k.

* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  fm10k: Fix syntax errors in comments
  igbvf: Refactor trace
  igb: remove never changed variable `ret_val'
  igc: Remove obsolete define
  igc: Remove obsolete mask
  igc: Remove obsolete nvm type
  igc: Remove unused phy type
  igc: Remove unused _I_PHY_ID define
====================

Link: https://lore.kernel.org/r/20211221180200.3176851-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-21 17:20:31 -08:00
Divya Koppera
b3ec7248f1 net: phy: micrel: Adding interrupt support for Link up/Link down in LAN8814 Quad phy
This patch add support for Link up or Link down
interrupt support in LAN8814 Quad phy

Signed-off-by: Divya Koppera <Divya.Koppera@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20211221112217.9502-1-Divya.Koppera@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-21 16:57:25 -08:00
Xiang wangx
37cf276df1 fm10k: Fix syntax errors in comments
Delete the redundant word 'by'.

Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:17:47 -08:00
Karen Sornek
630f6edc48 igbvf: Refactor trace
Refactoring "PF still resetting" message, because previous version looked
like a bug - it informed about changes that worked as designed but might
confuse users. Changes requested to make message more user-friendly.

Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:17:47 -08:00
Jason Wang
890781af31 igb: remove never changed variable `ret_val'
The variable used for return status in `igb_write_xmdio_reg' function
is never changed  and this function is just need return 0. Thus, the
`ret_val' can be removed and return 0 at the end of the
`igb_write_xmdio_reg' function.

Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:17:47 -08:00
Sasha Neftin
b8773a66f6 igc: Remove obsolete define
'MII_CR_FULL_DUPLEX' define not in use. This patch comes to tidy up
 obsolete define.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:17:47 -08:00
Sasha Neftin
d2a66dd3fd igc: Remove obsolete mask
'IGC_CTRL_EXT_LINK_MODE_MASK' not in use. This patch comes to tidy up
obsolete define.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:17:47 -08:00
Sasha Neftin
2a8807a765 igc: Remove obsolete nvm type
i225 devices use only spi nvm type. This patch comes to tidy up
obsolete nvm types.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:17:47 -08:00
Sasha Neftin
8e153faf58 igc: Remove unused phy type
_phy_none type not in use. Clean up the code accordingly,
and get rid of the unused enum line

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:17:47 -08:00
Sasha Neftin
7a34cda1ee igc: Remove unused _I_PHY_ID define
_I_PHY_ID not in use. Clean up the code accordingly,
and get rid of the unused define

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:17:47 -08:00
Jacob Keller
13a64f0b98 ice: support crosstimestamping on E822 devices if supported
E822 devices on supported platforms can generate a cross timestamp
between the platform ART and the device time. This process allows for
very precise measurement of the difference between the PTP hardware
clock and the platform time.

This is only supported if we know the TSC frequency relative to ART, so
we do not enable this unless the boot CPU has a known TSC frequency (as
required by convert_art_ns_to_tsc).

Because PCIe PTM support is not available on all platforms, introduce
CONFIG_ICE_HWTS and make it depend on X86 where we know the support
exists.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:40 -08:00
Jacob Keller
a69f1cb62a ice: exit bypass mode once hardware finishes timestamp calibration
Once the E822 device has sent and received one packet, the hardware
computes the internal delay of the PHY using a process known as Vernier
calibration. This calibration calculates a more accurate offset for the
Tx and Rx timestamps. To make use of this offset, we need to exit the
bypass mode. This cannot be done until the PHY has completed offset
calibration, as indicated by the offset valid bits.

To handle this, introduce a kthread work item which will poll the offset
valid bits every few milliseconds seeing if it is safe to exit bypass
mode.

Once we have finished calibrating the offsets, we can program the total
Tx and Rx offset registers and turn off the bypass bit. This allows the
hardware to include the more precise vernier calibration offset, and
improves the timestamp precision.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:40 -08:00
Jacob Keller
b111ab5a11 ice: ensure the hardware Clock Generation Unit is configured
The E822 device has a Clock Generation Unit (CGU) responsible for
determining the clock frequency that drives the timers.

Ensure this function is initialized when bringing up the PTP support, so
that the clock has a known frequency.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:40 -08:00
Jacob Keller
3a7496234d ice: implement basic E822 PTP support
Implement support for the basic operations needed to enable the PTP
hardware clock on E822 devices.

This includes implementations for the various PHY access functions, as
well as the ability to start and stop the PHY timers. This is different
from the E810 device because the configuration depends on link speed, so
we cannot just start the PHYs immediately. We must wait until the link
is up to get proper values for the speed based initialization.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:37 -08:00
Jacob Keller
405efa49b5 ice: convert clk_freq capability into time_ref
Convert the clk_freq value into the associated time_ref frequency value
for E822 devices. This simplifies determining the time reference value
for the clock.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:02 -08:00
Jacob Keller
b2ee72565c ice: introduce ice_ptp_init_phc function
When we enable support for E822 devices, there are some additional
steps required to initialize the PTP hardware clock. To make this easier
to implement as device-specific behavior, refactor the register setups
in ice_ptp_init_owner to a new ice_ptp_init_phc function defined in
ice_ptp_hw.c

This function will have a common section, and an e810 specific
sub-implementation.

This will enable easily extending the functionality to cover the E822
specific setup required to initialize the hardware clock generation
unit. It also makes it clear which steps are E810 specific vs which ones
are necessary for all ice devices.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:02 -08:00
Jacob Keller
39b2810642 ice: use 'int err' instead of 'int status' in ice_ptp_hw.c
The ice_ptp_hw.c file introduced a bunch of uses of "int status" instead
of the more traditional "int err" or "int ret". These are actually
traditional Linux error codes (as opposed to the recently removed
ice_status enumeration values).

We're about to add a bunch of new functions to ice_ptp_hw.c. It's
normally preferred in the ice driver to use "int ret" or "int err" when
dealing with error code values.

Instead of making the new functions use "int status" lets just fix all
of ice_ptp_hw.c to use "int err". This will match the new functions and
ensures a consistent style across at least the PTP related files.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:02 -08:00
Jacob Keller
e59d75dd41 ice: PTP: move setting of tstamp_config
The tstamp_config structure is being set inside of
ice_ptp_cfg_timestamp, which is the function used to set Tx and
Rx timestamping during initialization.

This function is also used in order to set the PHY port timestamping
status. However, it makes sense to always set the tstamp_config directly
whenever the ice_set_tx_tstamp or ice_set_rx_tstamp functions are
called.

Move assignment of tstamp_config into the related functions and out of
ice_ptp_cfg_timestamp.

Now that we assign the timestamp mode in the relevant functions, we no
longer modify the config value in ice_set_timestamp_mode. In turn, we
no longer want to copy that config value into the PF cached structure.
Instead, this is now the source of truth for actual configuration. On
success of ice_set_timestamp_mode, copy the real configured mode back to
report it out to userspace.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:02 -08:00
Jacob Keller
78267d0c9c ice: introduce ice_base_incval function
A future change will add additional possible increment values for the
E822 device support. To handle this, we want to look up the increment
value to use instead of hard coding it to the nominal value for E810
devices. Introduce ice_base_incval as a function to get the best nominal
increment value to use.

For now, it just returns the E810 value, but will be refactored in the
future to look up the value based on the device type and configured
clock frequency.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:11:02 -08:00
Karol Kolacinski
4809671015 ice: Fix E810 PTP reset flow
The PF reset does not reset PHC and PHY clocks so it's unnecessary to
stop them and reinitialize after the reset.
Configuring timestamping changes the VSI fields so it needs to be
performed after VSIs are initialized, which was not done in case of a
reset.

Suggested-by: Patrick Talbert <ptalbert@redhat.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Pasi Vaananen <pvaanane@redhat.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-21 09:09:57 -08:00
Jakub Kicinski
294e70c952 This time we have:
* ndo_fill_forward_path support in mac80211, to let
    drivers use it
  * association comeback notification for userspace,
    to be able to react more sensibly to long delays
  * support for background radar detection hardware
    in some chipsets
  * SA Query Procedures offload on the AP side
  * more logging if we find problems with HT/VHT/HE
  * various cleanups and minor fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAmHBuIoACgkQB8qZga/f
 l8SDNQ//bWl1fnVTzXcva16NGXNtQc8ufOdDfEHsusTA0qP1EfCDfhiMmRZ+jUQH
 Xdg7F3Yube0fgij1sEgpcoVOFm5wr7p861nljR8m71t9FI832gfd+qdCJicNxGGI
 B3zEhHCkcZ4yBhT35+cKG/H3WBysI8RO65dC6NVlzCyY1iM9TVkHBtbEKrdNljcM
 cKKWRp/fk7lCRVqLtunUd5kJauwJxjwHOm4GTH5BajbT/06m91GLoj/tZEjr9rQL
 aSsBa1nR0/LcMyYbbQYIxLikTZnkzILIJGLakb7k5ZJ2W4/hUv0Zn6LUCyMDM1mK
 7+Bt6qvB3Wz/TwjKYDm2qOniaD4IDVOtEpVPaXGau8c5Cj6rjnJ/cgF3ydBk4+xB
 5xngZBCk6Y4+epg9V7EWfqmV0vVqlWqfUfARwPulLWA1X15mVVBmcrafGEaLvGrC
 mvkq0n0XZzf+ObrILK7yjafOdLC4ATCj8j6RW85mH4yU+PqKrx3gOCrWn3Zm+6BN
 n6y7vs5x6zEitqjap4zsiVxqJf3jtAVcdVy7k52VF2BBpF8xoyrIMYZw5CNUG2Jv
 aTmW5aE8X9mQ2VT88JewZst0IX4jjfK/B8wOj24tokC2mXRdM5uKTOWK7uTFQJfM
 lLFcRYzo6n6epHrA5oBN4SnQ3/QpZNJOEsRxyROXemDxnQ9de+w=
 =u1jf
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-net-next-2021-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
This time we have:
 * ndo_fill_forward_path support in mac80211, to let drivers use it
 * association comeback notification for userspace, to be able
   to react more sensibly to long delays
 * support for background radar detection hardware in some chipsets
 * SA Query Procedures offload on the AP side
 * more logging if we find problems with HT/VHT/HE
 * various cleanups and minor fixes

Conflicts:

net/wireless/reg.c:
  e08ebd6d7b90 ("cfg80211: Acquire wiphy mutex on regulatory work")
  701fdfe348f7 ("cfg80211: Enable regulatory enforcement checks for drivers supporting mesh iface")
  https://lore.kernel.org/r/20211221111950.57ecc6a7@canb.auug.org.au

drivers/net/wireless/ath/ath10k/wmi.c:
  7f599aeccbd2 ("cfg80211: Use the HE operation IE to determine a 6GHz BSS channel")
  3bf2537ec2e3 ("ath10k: drop beacon and probe response which leak from other channel")
  https://lore.kernel.org/r/20211221115004.1cd6b262@canb.auug.org.au

* tag 'mac80211-next-for-net-next-2021-12-21' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next: (32 commits)
  cfg80211: Enable regulatory enforcement checks for drivers supporting mesh iface
  rfkill: allow to get the software rfkill state
  cfg80211: refactor cfg80211_get_ies_channel_number()
  nl82011: clarify interface combinations wrt. channels
  nl80211: Add support to offload SA Query procedures for AP SME device
  nl80211: Add support to set AP settings flags with single attribute
  mac80211: add more HT/VHT/HE state logging
  cfg80211: Use the HE operation IE to determine a 6GHz BSS channel
  cfg80211: rename offchannel_chain structs to background_chain to avoid confusion with ETSI standard
  mac80211: Notify cfg80211 about association comeback
  cfg80211: Add support for notifying association comeback
  mac80211: introduce channel switch disconnect function
  cfg80211: Fix order of enum nl80211_band_iftype_attr documentation
  cfg80211: simplify cfg80211_chandef_valid()
  mac80211: Remove a couple of obsolete TODO
  mac80211: fix FEC flag in radio tap header
  mac80211: use coarse boottime for airtime fairness code
  ieee80211: change HE nominal packet padding value defines
  cfg80211: use ieee80211_bss_get_elem() instead of _get_ie()
  mac80211: Use memset_after() to clear tx status
  ...
====================

Link: https://lore.kernel.org/r/20211221112532.28708-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-21 07:41:52 -08:00
Yang Li
c48c94b0ab net/sched: use min() macro instead of doing it manually
Fix following coccicheck warnings:
./net/sched/cls_api.c:3333:17-18: WARNING opportunity for min()
./net/sched/cls_api.c:3389:17-18: WARNING opportunity for min()
./net/sched/cls_api.c:3427:17-18: WARNING opportunity for min()

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-21 10:16:47 +00:00