Commit Graph

38923 Commits

Author SHA1 Message Date
Biju Das
8912ed25da ravb: Add net_features and net_hw_features to struct ravb_hw_info
On R-Car the checksum calculation on RX frames is done by the E-MAC
module, whereas on RZ/G2L it is done by the TOE.

TOE calculates the checksum of received frames from E-MAC and outputs it to
DMAC. TOE also calculates the checksum of transmission frames from DMAC and
outputs it E-MAC.

Add net_features and net_hw_features to struct ravb_hw_info, to support
subsequent SoCs without any code changes in the ravb_probe function.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19 12:05:16 +01:00
Biju Das
896a818e0e ravb: Add gstrings_stats and gstrings_size to struct ravb_hw_info
The device stats strings for R-Car and RZ/G2L are different.

R-Car provides 30 device stats, whereas RZ/G2L provides only 15. In
addition, RZ/G2L has stats "rx_queue_0_csum_offload_errors" instead of
"rx_queue_0_missed_errors".

Add structure variables gstrings_stats and gstrings_size to struct
ravb_hw_info, so that subsequent SoCs can be added without any code
changes in the ravb_get_strings function.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19 12:05:16 +01:00
Biju Das
25154301fc ravb: Add stats_len to struct ravb_hw_info
R-Car provides 30 device stats, whereas RZ/G2L provides only 15. In
addition, RZ/G2L has stats "rx_queue_0_csum_offload_errors" instead of
"rx_queue_0_missed_errors".

Replace RAVB_STATS_LEN macro with a structure variable stats_len to
struct ravb_hw_info, to support subsequent SoCs without any code changes
to the ravb_get_sset_count function.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19 12:05:15 +01:00
Biju Das
cb01c672c2 ravb: Add max_rx_len to struct ravb_hw_info
The maximum descriptor size that can be specified on the reception side for
R-Car is 2048 bytes, whereas for RZ/G2L it is 8096.

Add the max_rx_len variable to struct ravb_hw_info for allocating different
RX skb buffer sizes for R-Car and RZ/G2L using the netdev_alloc_skb
function.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19 12:05:15 +01:00
Biju Das
68ca3c9232 ravb: Add aligned_tx to struct ravb_hw_info
R-Car Gen2 needs a 4byte aligned address for the transmission buffer,
whereas R-Car Gen3 doesn't have any such restriction.

Add aligned_tx to struct ravb_hw_info to select the driver to choose
between aligned and unaligned tx buffers.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19 12:05:15 +01:00
Biju Das
ebb091461a ravb: Add struct ravb_hw_info to driver data
The DMAC and EMAC blocks of Gigabit Ethernet IP found on RZ/G2L SoC are
similar to the R-Car Ethernet AVB IP. With a few changes in the driver we
can support both IPs.

This patch adds the struct ravb_hw_info to hold hw features, driver data
and function pointers to support both the IPs. It also replaces the driver
data chip type with struct ravb_hw_info by moving chip type to it.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19 12:05:15 +01:00
Biju Das
cb537b2417 ravb: Use unsigned int for num_tx_desc variable in struct ravb_private
The number of TX descriptors per packet is an unsigned value and
the variable for holding this information should be unsigned.

This patch replaces the data type of num_tx_desc variable in struct
ravb_private from 'int' to 'unsigned int'.
This patch also updates the data type of local variables to unsigned int,
where the local variables are evaluated using num_tx_desc.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19 12:05:15 +01:00
Jason Wang
19b8ece42c net/mlx4: Use ARRAY_SIZE to get an array's size
The ARRAY_SIZE macro is defined to get an array's size which is
more compact and more formal in linux source. Thus, we can replace
the long sizeof(arr)/sizeof(arr[0]) with the compact ARRAY_SIZE.

Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20210817121106.44189-1-wangborong@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-18 15:16:54 -07:00
Subbaraya Sundeep
ab44035d30 octeontx2-pf: Allow VLAN priority also in ntuple filters
VLAN TCI is a 16 bit field which includes Priority(3 bits),
CFI(1 bit) and VID(12 bits). Currently ntuple filters support
installing rules to steer packets based on VID only.
This patch extends that support such that filters can
be installed for entire VLAN TCI.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-18 11:29:11 +01:00
Colin Ian King
6e9078a667 i40e: Fix spelling mistake "dissable" -> "disable"
There is a spelling mistake in a dev_info message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-08-17 10:35:01 -07:00
Stefan Assmann
5ac49f3c27 iavf: use mutexes for locking of critical sections
As follow-up to the discussion with Jakub Kicinski about iavf locking
being insufficient [1] convert iavf to use mutexes instead of bitops.
The locking logic is kept as is, just a drop-in replacement of
enum iavf_critical_section_t with separate mutexes.
The only difference is that the mutexes will be destroyed before the
module is unloaded.

[1] https://lwn.net/ml/netdev/20210316150210.00007249%40intel.com/

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-08-17 09:45:45 -07:00
Vidya
aee5122491 octeontx2-af: configure npc for cn10k to allow packets from cpt
On CN10K, the higher bits in the channel number represents the CPT
channel number. Mask out these higher bits in the npc configuration
to allow packets from cpt for parsing.

Signed-off-by: Vidya <vvelumuri@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17 10:06:33 +01:00
Hariprasad Kelam
99b8e5479d octeontx2-af: cn10K: Get NPC counters value
The way SW can identify the number NPC counters supported by silicon
has changed for CN10K. This patch addresses this reading appropriate
registers to find out number of counters available.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17 10:06:33 +01:00
Subbaraya Sundeep
7df5b4b260 octeontx2-af: Allocate low priority entries for PF
If the mcam entry allocation request is from PF
and NOT a priority allocation request then allocate
low priority entries so that PF entries always have
lower priority than its VFs. This is required so
that entries with (base) MCAM match criteria have lower
priority compared to entries with (base + additional)
match criteria. This patch considers only best case
scenario where PF entries are allocated from low
priority zone if low priority zone has free space.
There are worst case scenarios like:
1. VFs allocating hundreds of MCAM entries leading to VFs
using all mid priority zone and low priority zone entries
hence no entries free from low priority zone for PF.
2. All the PFs and VFs in the system allocating and freeing
entries causing fragmentation in MCAM space and all the
entries requested by PF could not fit in low priority
zone for allocation.
This patch do not handle worst case scenarios.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17 10:06:33 +01:00
Sunil Goutham
2da4894327 octeontx2-pf: devlink params support to set mcam entry count
Added support for setting or modifying MCAM entry count at
runtime via devlink params.

commands:
  devlink dev param show
pci/0002:02:00.0:
  name mcam_count type driver-specific
    values:
      cmode runtime value 16

  devlink dev param set pci/0002:02:00.0 name mcam_count
				value 64 cmode runtime

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17 10:06:33 +01:00
Sunil Goutham
2e2a8126ff octeontx2-pf: Unify flow management variables
Variables used for TC flow management like maximum number
of flows, number of flows installed etc are a copy of ntuple
flow management variables. Since both TC and NTUPLE are not
supported at the same time, it's better to unify these with
common variables.

This patch addresses this unification and also does cleanup of
other minor stuff wrt TC.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17 10:06:33 +01:00
Sunil Goutham
cc65fcab88 octeontx2-pf: Sort the allocated MCAM entry indices
Per single mailbox request a maximum of 256 MCAM entries
can be allocated. If more than 256 are being allocated, then
the mcam indices in the final list could get jumbled. Hence
sort the indices.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17 10:06:33 +01:00
Rakesh Babu
3cffaed213 octeontx2-pf: Ntuple filters support for VF netdev
Add packet flow classification support for both LMAC mapped virtual
functions and loopback VFs. This patch adds supports for ntuple
offload feature.

Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17 10:06:33 +01:00
Sunil Goutham
0b3834aeaf octeontx2-pf: Enable NETIF_F_RXALL support for VF driver
Enabled NETIF_F_RXALL support for VF driver.
Also removed MTU range comments which are no longer valid.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17 10:06:32 +01:00
Sunil Goutham
a83bdada06 octeontx2-af: Add debug messages for failures
Added debug messages for various failures during probe.
This will help in quickly identifying the API where the failure
is happening.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17 10:06:32 +01:00
Naveen Mamindlapalli
7278c359e5 octeontx2-af: add proper return codes for AF mailbox handlers
Add appropriate error codes to be used when returning from AF
mailbox handlers due to some error condition.

Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17 10:06:32 +01:00
Subbaraya Sundeep
9cfc580956 octeontx2-af: Modify install flow error codes
When installing a flow using npc_install_flow
mailbox there are number of reasons to reject
the request like caller is not permitted,
invalid channel specified in request, flow
not supported in extraction profile and so on.
Hence define new error codes for npc flows and use
them instead of generic error codes.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17 10:06:32 +01:00
David S. Miller
354e1f9d88 mlx5-updates-2021-08-16
The following patchset provides two separate mlx5 updates
 1) Ethtool RSS context and MQPRIO channel mode support:
   1.1) enable mlx5e netdev driver to allow creating Transport Interface RX
        (TIRs) objects on the fly to be used for ethtool RSS contexts and
        TX MQPRIO channel mode
   1.2) Introduce mlx5e_rss object to manage such TIRs.
   1.3) Ethtool support for RSS context
   1.4) Support MQPRIO channel mode
 
 2) Bridge offloads Lag support:
    to allow adding bond net devices to mlx5 bridge
   2.1) Address bridge port by (vport_num, esw_owner_vhca_id) pair
        since vport_num is only unique per eswitch and in lag mode we
        need to manage ports from both eswitches.
   2.2) Allow connectivity between representors of different eswitch
        instances that are attached to same bridge
   2.3) Bridge LAG, Require representors to be in shared FDB mode and
        introduce local and peer ports representors,
        match on paired eswitch metadata in peer FDB entries,
        And finally support addition/deletion and aging of peer flows.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmEa8gwACgkQSD+KveBX
 +j56lAf/esRKDA2kKdtYr7AUNCDmRf/Fj5jBRuASuEWOGgoBaGNZprBw7So5MZdc
 kZpkTnrvFZ5KXsq8nfCJsFkmepHMfJYmTc4VWXpMOyxYlsTwQc9UQ3MBboylMYqL
 23KJKxurFN0kyMIOGciqXaetHgIhf+iQx1osJZd4WGk1soiWX7JiDV6gXXEZu4Ge
 VH261CwOCLwpJ4STTMdjgcGnARStMcr0I3LJm7BWoMVls6FcLXcNslhvcRYYyUiX
 B5UXYrZPLA39CegL/y/jP1vdH6pStb453x1dtjIoJiM+iOIjTaGkYOLF6w2QQWqd
 /8R4MPeAYJ4ENsWmrEbPG+F/5GIN1Q==
 =KWry
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2021-08-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-08-16

The following patchset provides two separate mlx5 updates
1) Ethtool RSS context and MQPRIO channel mode support:
  1.1) enable mlx5e netdev driver to allow creating Transport Interface RX
       (TIRs) objects on the fly to be used for ethtool RSS contexts and
       TX MQPRIO channel mode
  1.2) Introduce mlx5e_rss object to manage such TIRs.
  1.3) Ethtool support for RSS context
  1.4) Support MQPRIO channel mode

2) Bridge offloads Lag support:
   to allow adding bond net devices to mlx5 bridge
  2.1) Address bridge port by (vport_num, esw_owner_vhca_id) pair
       since vport_num is only unique per eswitch and in lag mode we
       need to manage ports from both eswitches.
  2.2) Allow connectivity between representors of different eswitch
       instances that are attached to same bridge
  2.3) Bridge LAG, Require representors to be in shared FDB mode and
       introduce local and peer ports representors,
       match on paired eswitch metadata in peer FDB entries,
       And finally support addition/deletion and aging of peer flows.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17 09:51:19 +01:00
Vlad Buslov
ff9b752146 net/mlx5: Bridge, support LAG
Allow adding bond net devices to mlx5 bridge with following changes:

- Modify bridge representor code to obtain uplink represetor that belongs
to eswitch that is registered for notification. Require representor to be
in shared FDB mode. If representor is the lag master, then consider its
port as local, otherwise treat it as peer.

- Use devcom to match on paired eswitch metadata in peer FDB entries. This
is necessary for shared FDB LAG to function since packets are always
received on active eswitch instance as opposed to parent eswitch of port.

- Support for deleting peer flows when receiving
SWITCHDEV_FDB_DEL_TO_BRIDGE notification was implemented in one of previous
patches in series. Now also implement support for handling
SWITCHDEV_FDB_ADD_TO_BRIDGE which can be generated on peer by bridge update
workqueue task in LAG configuration. Refresh the flow 'lastuse' timestamp
to current jiffies when receiving such notification on eswitch that manages
the local FDB entry. This allows peer entries to prevent ageing of the FDB.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:32 -07:00
Vlad Buslov
c358ea1741 net/mlx5: Bridge, allow merged eswitch connectivity
Allow connectivity between representors of different eswitch instances that
are attached to same bridge when merged_eswitch capability is enabled. Add
ports of peer eswitch to bridge instance and mark them with
MLX5_ESW_BRIDGE_PORT_FLAG_PEER. Mark FDBs offloaded on peer ports with
MLX5_ESW_BRIDGE_FLAG_PEER flag. Such FDBs can only be aged out on their
local eswitch instance, which then sends SWITCHDEV_FDB_DEL_TO_BRIDGE event.
Listen to the event on mlx5 bridge implementation and delete peer FDBs in
event handler.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:31 -07:00
Vlad Buslov
bf3d56d8f5 net/mlx5: Bridge, extract FDB delete notification to function
SWITCHDEV_FDB_DEL_TO_BRIDGE notification is generated in multiple places in
bridge code. Following patch in series changes the condition for the
notification. Extract the notification into dedicated helper function
mlx5_esw_bridge_fdb_del_notify() to only modify it in single place in the
future changes.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:31 -07:00
Vlad Buslov
3ee6233e61 net/mlx5: Bridge, identify port by vport_num+esw_owner_vhca_id pair
Following patches in series allow traffic between vports of different
eswitch instances, which requires addressing bridge port by
vport_num+esw_owner_vhca_id pair since vport_num is only unique
per-eswitch. As a preparation, extend struct mlx5_esw_bridge_port with
'esw_owner_vhca_id' field and use it as part of key for
mlx5_esw_bridge->vports xarray.

With this change we can't rely on switchdev_handle_port_obj_add() helper to
get mlx5 representor from stacked device because we need specifically
representor from parent eswitch that registered the callback to obtain
correct esw_owner_vhca_id. The helper doesn't allow passing additional
parameters to predicate function and doesn't provide access to the notifier
block to obtain eswitch through br_offloads. Implement custom helpers to
obtain mlx5 representor and use them in
mlx5_esw_bridge_port_obj_{add|del|attr_set}() implementations.

Remove direct pointer to parent bridge from struct mlx5_vport as it is no
longer needed.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:30 -07:00
Vlad Buslov
a514d17350 net/mlx5: Bridge, obtain core device from eswitch instead of priv
Following patches in series will pass bond device to bridge, which means
the code can't assume the device is mlx5 representor. Moreover, the core
device can be easily obtained from eswitch instance, so there is no reason
for more complex code that obtains struct mlx5_priv from net_device in
order to use its mdev. Refactor the code to use esw->dev instead of
priv->mdev.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:30 -07:00
Vlad Buslov
4de20e9a12 net/mlx5: Bridge, release bridge in same function where it is taken
Refactor mlx5_esw_bridge_vport_link() to release the bridge instance if
mlx5_esw_bridge_vport_init() returned an error instead of relying on it to
release the bridge. This improves the design because object instance is
taken and released in same layer and simplifies following patches that add
more logic to mlx5_esw_bridge_vport_link().

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:30 -07:00
Tariq Toukan
ec60c4581b net/mlx5e: Support MQPRIO channel mode
Add support for MQPRIO channel mode, in which a partition to TCs
is defined over the channels. We allow partitions with contiguous
queue indices, with no holes within. We do not allow modification
to the num of channels while this MQPRIO mode is active.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:29 -07:00
Tariq Toukan
21ecfcb83a net/mlx5e: Handle errors of netdev_set_num_tc()
Add handling for failures in netdev_set_num_tc().
Let mlx5e_netdev_set_tcs return an int.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:29 -07:00
Tariq Toukan
e2aeac448f net/mlx5e: Maintain MQPRIO mode parameter
This is in preparation for supporting MQPRIO CHANNEL mode in
downstream patch, in addition to DCB mode that's supported today.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:29 -07:00
Tariq Toukan
86d747a3f9 net/mlx5e: Abstract MQPRIO params
Abstract the MQPRIO params into a struct.
Use a getter for DCB mode num_tcs.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:28 -07:00
Tariq Toukan
248d3b4c9a net/mlx5e: Support flow classification into RSS contexts
Extend the existing flow classification support, to steer
flows not only directly to a receive ring, but also into
the new RSS contexts.

Create needed TIR objects on demand, and hold reference
on the RSS context.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:28 -07:00
Tariq Toukan
f01cc58c18 net/mlx5e: Support multiple RSS contexts
Add support to multiple RSS contexts. Resources of the non-default
RSS contexts are allocated and created on demand. Each RSS context
can be controlled and configured separately, via the implemented
ethtool ops. Here we limit the num of total contexts to 16.

We do not enforce any kind of new limitation over the indirection table
content. More specifically, two separate contexts can be configured to
fully or partially point to the same set of receive rings.

The default RSS context (index 0) is created with its full set of TIRs.
All other contexts are created with an empty set, then TIRs are added
upon first usage when steering rules are added.
We use a reference counting mechanism to make sure an RSS context is
not removed before the rules pointing to it.

Block ethtool set_channels operations when multiple RSS contexts exist,
as currently the kernel doesn't protect against inconsistent channels
configs that break non-default RSS contexts.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:28 -07:00
Tariq Toukan
49095f641b net/mlx5e: Dynamically allocate TIRs in RSS contexts
Move from static to dynamic memory allocations for TIR.
This is in preparation to supporting on-demand TIR operations in
downstream patches, where every RSS context will be init with an
empty set of TIRs.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:27 -07:00
Tariq Toukan
25307a91cb net/mlx5e: Convert RSS to a dedicated object
Code related to RSS is now encapsulated into a dedicated object and put
into new files en/rss.{c,h}. All usages are converted.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:27 -07:00
Tariq Toukan
713ba5e5f6 net/mlx5e: Introduce abstraction of RSS context
Bring all fields that define and maintain RSS behavior together
into a new structure.
Align all usages with this new structure. Keep it hidden within
rx_res.c.
This helps supporting multiple RSS contexts in downstream patch.

Use dynamic allocations for the RSS context.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:27 -07:00
Tariq Toukan
fc651ff910 net/mlx5e: Introduce TIR create/destroy API in rx_res
Take TIR control operations in rx_res into functions.
This is in preparation to supporting on-demand TIR operations in
downstream patches.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:26 -07:00
Tariq Toukan
6e5fea5196 net/mlx5e: Do not try enable RSS when resetting indir table
All calls to mlx5e_rx_res_rss_set_indir_uniform() occur while the RSS
state is inactive, i.e. the RQT is pointing to the drop RQ, not to the
channels' RQs.
It means that the "apply" part of the function is not called.
Remove this part from the function, and document the change. It will be
useful for next patches in the series, allows code simplifications when
multiple RSS contexts are introduced.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-08-16 16:17:26 -07:00
Guangbin Huang
f5c2b9f0fc net: hns3: add support ethtool extended link state
In order to know the reason of link up failure, add supporting ethtool
extended link state. Driver reads the link status code from firmware if
in link down state and converts it to ethtool extended link state.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-16 15:12:14 -07:00
Guangbin Huang
edb40bbc17 net: hns3: add header file hns3_ethtoo.h
Add a new file hns3_ethtool.h, and move struct type definitions from
hns3_ethtool.c to hns3_ethtool.h.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-16 15:12:13 -07:00
Vijayakannan Ayyathurai
af9bf70154 net: stmmac: add ethtool per-queue irq statistic support
Adding ethtool per-queue statistics support to show number of interrupts
generated at DMA tx and DMA rx. All the counters are incremented at
dwmac4_dma_interrupt function.

Signed-off-by: Vijayakannan Ayyathurai <vijayakannan.ayyathurai@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16 11:36:04 +01:00
Vijayakannan Ayyathurai
68e9c5dee1 net: stmmac: add ethtool per-queue statistic framework
Adding generic ethtool per-queue statistic framework to display the
statistics for each rx/tx queue. In future, users can avail it to add
more per-queue specific counters. Number of rx/tx queues displayed is
depending on the available rx/tx queues in that particular MAC config
and this number is limited up to the MTL_MAX_{RX|TX}_QUEUES defined
in the driver.

Ethtool per-queue statistic display will look like below, when users
start adding more counters.

Example:
 q0_tx_statA:
 q0_tx_statB:
 q0_tx_statC:
 |
 q0_tx_statX:
 .
 .
 .
 qMAX_tx_statA:
 qMAX_tx_statB:
 qMAX_tx_statC:
 |
 qMAX_tx_statX:

 q0_rx_statA:
 q0_rx_statB:
 q0_rx_statC:
 |
 q0_rx_statX:
 .
 .
 .
 qMAX_rx_statA:
 qMAX_rx_statB:
 qMAX_rx_statC:
 |
 qMAX_rx_statX:

In addition, this patch has the support on displaying the number of
packets received and transmitted per queue.

Signed-off-by: Vijayakannan Ayyathurai <vijayakannan.ayyathurai@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16 11:36:04 +01:00
Voon Weifeng
1975df880b net: stmmac: fix INTR TBU status affecting irq count statistic
DMA channel status "Transmit buffer unavailable(TBU)" bit is not
considered as a successful dma tx. Hence, it should not affect
all the irq count statistic.

Fixes: 1103d3a553 ("net: stmmac: dwmac4: Also use TBU interrupt to clean TX path")
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Vijayakannan Ayyathurai <vijayakannan.ayyathurai@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16 11:36:04 +01:00
Heiner Kallweit
c07c8ffc70 r8169: rename rtl_csi_access_enable to rtl_set_aspm_entry_latency
Rename the function to reflect what it's doing. Also add a description
of the register values as kindly provided by Realtek.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16 11:21:54 +01:00
Vladimir Oltean
e6e12df625 net: mscc: ocelot: convert to phylink
The felix DSA driver, which is a wrapper over the same hardware class as
ocelot, is integrated with phylink, but ocelot is using the plain PHY
library. It makes sense to bring together the two implementations, which
is what this patch achieves.

This is a large patch and hard to break up, but it does the following:

The existing ocelot_adjust_link writes some registers, and
felix_phylink_mac_link_up writes some registers, some of them are
common, but both functions write to some registers to which the other
doesn't.

The main reasons for this are:
- Felix switches so far have used an NXP PCS so they had no need to
  write the PCS1G registers that ocelot_adjust_link writes
- Felix switches have the MAC fixed at 1G, so some of the MAC speed
  changes actually break the link and must be avoided.

The naming conventions for the functions introduced in this patch are:
- vsc7514_phylink_{mac_config,validate} are specific to the Ocelot
  instantiations and placed in ocelot_net.c which is built only for the
  ocelot switchdev driver.
- ocelot_phylink_mac_link_{up,down} are shared between the ocelot
  switchdev driver and the felix DSA driver (they are put in the common
  lib).

One by one, the registers written by ocelot_adjust_link are:

DEV_MAC_MODE_CFG - felix_phylink_mac_link_up had no need to write this
                   register since its out-of-reset value was fine and
                   did not need changing. The write is moved to the
                   common ocelot_phylink_mac_link_up and on felix it is
                   guarded by a quirk bit that makes the written value
                   identical with the out-of-reset one
DEV_PORT_MISC - runtime invariant, was moved to vsc7514_phylink_mac_config
PCS1G_MODE_CFG - same as above
PCS1G_SD_CFG - same as above
PCS1G_CFG - same as above
PCS1G_ANEG_CFG - same as above
PCS1G_LB_CFG - same as above
DEV_MAC_ENA_CFG - both ocelot_adjust_link and ocelot_port_disable
                  touched this. felix_phylink_mac_link_{up,down} also
                  do. We go with what felix does and put it in
                  ocelot_phylink_mac_link_up.
DEV_CLOCK_CFG - ocelot_adjust_link and felix_phylink_mac_link_up both
                write this, but to different values. Move to the common
                ocelot_phylink_mac_link_up and make sure via the quirk
                that the old values are preserved for both.
ANA_PFC_PFC_CFG - ocelot_adjust_link wrote this, felix_phylink_mac_link_up
                  did not. Runtime invariant, speed does not matter since
                  PFC is disabled via the RX_PFC_ENA bits which are cleared.
                  Move to vsc7514_phylink_mac_config.
QSYS_SWITCH_PORT_MODE_PORT_ENA - both ocelot_adjust_link and
                                 felix_phylink_mac_link_{up,down} wrote
                                 this. Ocelot also wrote this register
                                 from ocelot_port_disable. Keep what
                                 felix did, move in ocelot_phylink_mac_link_{up,down}
                                 and delete ocelot_port_disable.
ANA_POL_FLOWC - same as above
SYS_MAC_FC_CFG - same as above, except slight behavior change. Whereas
                 ocelot always enabled RX and TX flow control, felix
                 listened to phylink (for the most part, at least - see
                 the 2500base-X comment).

The registers which only felix_phylink_mac_link_up wrote are:

SYS_PAUSE_CFG_PAUSE_ENA - this is why I am not sure that flow control
                          worked on ocelot. Not it should, since the
                          code is shared with felix where it does.
ANA_PORT_PORT_CFG - this is a Frame Analyzer block register, phylink
                    should be the one touching them, deleted.

Other changes:

- The old phylib registration code was in mscc_ocelot_init_ports. It is
  hard to work with 2 levels of indentation already in, and with hard to
  follow teardown logic. The new phylink registration code was moved
  inside ocelot_probe_port(), right between alloc_etherdev() and
  register_netdev(). It could not be done before (=> outside of)
  ocelot_probe_port() because ocelot_probe_port() allocates the struct
  ocelot_port which we then use to assign ocelot_port->phy_mode to. It
  is more preferable to me to have all PHY handling logic inside the
  same function.
- On the same topic: struct ocelot_port_private :: serdes is only used
  in ocelot_port_open to set the SERDES protocol to Ethernet. This is
  logically a runtime invariant and can be done just once, when the port
  registers with phylink. We therefore don't even need to keep the
  serdes reference inside struct ocelot_port_private, or to use the devm
  variant of of_phy_get().
- Phylink needs a valid phy-mode for phylink_create() to succeed, and
  the existing device tree bindings in arch/mips/boot/dts/mscc/ocelot_pcb120.dts
  don't define one for the internal PHY ports. So we patch
  PHY_INTERFACE_MODE_NA into PHY_INTERFACE_MODE_INTERNAL.
- There was a strategically placed:

	switch (priv->phy_mode) {
	case PHY_INTERFACE_MODE_NA:
	        continue;

  which made the code skip the serdes initialization for the internal
  PHY ports. Frankly that is not all that obvious, so now we explicitly
  initialize the serdes under an "if" condition and not rely on code
  jumps, so everything is clearer.
- There was a write of OCELOT_SPEED_1000 to DEV_CLOCK_CFG for QSGMII
  ports. Since that is in fact the default value for the register field
  DEV_CLOCK_CFG_LINK_SPEED, I can only guess the intention was to clear
  the adjacent fields, MAC_TX_RST and MAC_RX_RST, aka take the port out
  of reset, which does match the comment. I don't even want to know why
  this code is placed there, but if there is indeed an issue that all
  ports that share a QSGMII lane must all be up, then this logic is
  already buggy, since mscc_ocelot_init_ports iterates using
  for_each_available_child_of_node, so nobody prevents the user from
  putting a 'status = "disabled";' for some QSGMII ports which would
  break the driver's assumption.
  In any case, in the eventuality that I'm right, we would have yet
  another issue if ocelot_phylink_mac_link_down would reset those ports
  and that would be forbidden, so since the ocelot_adjust_link logic did
  not do that (maybe for a reason), add another quirk to preserve the
  old logic.

The ocelot driver teardown goes through all ports in one fell swoop.
When initialization of one port fails, the ocelot->ports[port] pointer
for that is reset to NULL, and teardown is done only for non-NULL ports,
so there is no reason to do partial teardowns, let the central
mscc_ocelot_release_ports() do its job.

Tested bind, unbind, rebind, link up, link down, speed change on mock-up
hardware (modified the driver to probe on Felix VSC9959). Also
regression tested the felix DSA driver. Could not test the Ocelot
specific bits (PCS1G, SERDES, device tree bindings).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16 11:19:34 +01:00
Vladimir Oltean
46efe4efb9 net: dsa: felix: stop calling ocelot_port_{enable,disable}
ocelot_port_enable touches ANA_PORT_PORT_CFG, which has the following
fields:

- LOCKED_PORTMOVE_CPU, LEARNDROP, LEARNCPU, LEARNAUTO, RECV_ENA, all of
  which are written with their hardware default values, also runtime
  invariants. So it makes no sense to write these during every .ndo_open.

- PORTID_VAL: this field has an out-of-reset value of zero for all ports
  and must be initialized by software. Additionally, the
  ocelot_setup_logical_port_ids() code path sets up different logical
  port IDs for the ports in a hardware LAG, and we absolutely don't want
  .ndo_open to interfere there and reset those values.

So in fact the write from ocelot_port_enable can better be moved to
ocelot_init_port, and the .ndo_open hook deleted.

ocelot_port_disable touches DEV_MAC_ENA_CFG and QSYS_SWITCH_PORT_MODE_PORT_ENA,
in an attempt to undo what ocelot_adjust_link did. But since .ndo_stop
does not get called each time the link falls (i.e. this isn't a
substitute for .phylink_mac_link_down), felix already does better at
this by writing those registers already in felix_phylink_mac_link_down.

So keep ocelot_port_disable (for now, until ocelot is converted to
phylink too), and just delete the felix call to it, which is not
necessary.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-16 11:19:34 +01:00
Leon Romanovsky
a1fcb106ae net: hns3: remove always exist devlink pointer check
The devlink pointer always exists after hclge_devlink_init() succeed.
Remove that check together with NULL setting after release and ensure
that devlink_register is last command prior to call to devlink_reload_enable().

Fixes: b741269b27 ("net: hns3: add support for registering devlink for PF")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-14 13:59:10 +01:00
Arnd Bergmann
e5f3155267 ethernet: fix PTP_1588_CLOCK dependencies
The 'imply' keyword does not do what most people think it does, it only
politely asks Kconfig to turn on another symbol, but does not prevent
it from being disabled manually or built as a loadable module when the
user is built-in. In the ICE driver, the latter now causes a link failure:

aarch64-linux-ld: drivers/net/ethernet/intel/ice/ice_main.o: in function `ice_eth_ioctl':
ice_main.c:(.text+0x13b0): undefined reference to `ice_ptp_get_ts_config'
ice_main.c:(.text+0x13b0): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `ice_ptp_get_ts_config'
aarch64-linux-ld: ice_main.c:(.text+0x13bc): undefined reference to `ice_ptp_set_ts_config'
ice_main.c:(.text+0x13bc): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `ice_ptp_set_ts_config'
aarch64-linux-ld: drivers/net/ethernet/intel/ice/ice_main.o: in function `ice_prepare_for_reset':
ice_main.c:(.text+0x31fc): undefined reference to `ice_ptp_release'
ice_main.c:(.text+0x31fc): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `ice_ptp_release'
aarch64-linux-ld: drivers/net/ethernet/intel/ice/ice_main.o: in function `ice_rebuild':

This is a recurring problem in many drivers, and we have discussed
it several times befores, without reaching a consensus. I'm providing
a link to the previous email thread for reference, which discusses
some related problems.

To solve the dependency issue better than the 'imply' keyword, introduce a
separate Kconfig symbol "CONFIG_PTP_1588_CLOCK_OPTIONAL" that any driver
can depend on if it is able to use PTP support when available, but works
fine without it. Whenever CONFIG_PTP_1588_CLOCK=m, those drivers are
then prevented from being built-in, the same way as with a 'depends on
PTP_1588_CLOCK || !PTP_1588_CLOCK' dependency that does the same trick,
but that can be rather confusing when you first see it.

Since this should cover the dependencies correctly, the IS_REACHABLE()
hack in the header is no longer needed now, and can be turned back
into a normal IS_ENABLED() check. Any driver that gets the dependency
wrong will now cause a link time failure rather than being unable to use
PTP support when that is in a loadable module.

However, the two recently added ptp_get_vclocks_index() and
ptp_convert_timestamp() interfaces are only called from builtin code with
ethtool and socket timestamps, so keep the current behavior by stubbing
those out completely when PTP is in a loadable module. This should be
addressed properly in a follow-up.

As Richard suggested, we may want to actually turn PTP support into a
'bool' option later on, preventing it from being a loadable module
altogether, which would be one way to solve the problem with the ethtool
interface.

Fixes: 06c16d89d2 ("ice: register 1588 PTP clock device object for E810 devices")
Link: https://lore.kernel.org/netdev/20210804121318.337276-1-arnd@kernel.org/
Link: https://lore.kernel.org/netdev/CAK8P3a06enZOf=XyZ+zcAwBczv41UuCTz+=0FMf2gBz1_cOnZQ@mail.gmail.com/
Link: https://lore.kernel.org/netdev/CAK8P3a3=eOxE-K25754+fB_-i_0BZzf9a9RfPTX3ppSwu9WZXw@mail.gmail.com/
Link: https://lore.kernel.org/netdev/20210726084540.3282344-1-arnd@kernel.org/
Acked-by: Shannon Nelson <snelson@pensando.io>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210812183509.1362782-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-13 17:49:05 -07:00