1092342 Commits

Author SHA1 Message Date
Arun Ramadoss
008db08b64 net: dsa: microchip: remove unused members in ksz_device
The name, regs_size and overrides members in struct ksz_device are
unused. Hence remove them.
host_mask is used in only one place, in ksz8795.c, where it can be
replaced by dev->info->cpu_ports.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-18 12:51:00 +01:00
Arun Ramadoss
65ac79e181 net: dsa: microchip: add the phylink get_caps
This patch adds support for phylink_get_caps for the ksz8795 and ksz9477
series switches. It updates struct ksz_switch_chip with the details of
the internal PHYs and the xmii interface. During get_caps, the
corresponding PHY modes are set based on the bits set in the structure.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-18 12:50:59 +01:00
Prasanna Vengateshan
b094c67966 net: dsa: move mib->cnt_ptr reset code to ksz_common.c
Resetting mib->cnt_ptr is handled in multiple places as part of
port_init_cnt(). Hence move the mib->cnt_ptr code to the ksz common
layer and remove it from the individual product files.

Signed-off-by: Prasanna Vengateshan <prasanna.vengateshan@microchip.com>
Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-18 12:50:59 +01:00
Arun Ramadoss
997d2126ac net: dsa: microchip: move get_strings to ksz_common
ksz8795 and ksz9477 use the same algorithm for copying the ethtool
strings. Hence move it to ksz_common to remove the redundant code.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-18 12:50:59 +01:00
Arun Ramadoss
198b34783a net: dsa: microchip: move port memory allocation to ksz_common
The ksz8795 and ksz9477 init functions allocate memory for dev->ports
and the mib counters, and assign the real number of ports to ds. Since
both routines are the same, move the allocation of port memory to
ksz_switch_register, after init.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-18 12:50:59 +01:00
Arun Ramadoss
a530e6f220 net: dsa: microchip: move struct mib_names to ksz_chip_data
The ksz88xx family has one set of mib_names. The ksz87xx, ksz9477 and
LAN937x based switches share another set. To remove the redundant
declarations, move struct mib_names into the ksz_chip_data structure.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-18 12:50:59 +01:00
Arun Ramadoss
eee16b1471 net: dsa: microchip: perform the compatibility check for dev probed
This patch performs the compatibility check for the device after chip
detection is done. This prevents a mismatch between the device
compatible specified in the device tree and the actual device found
during detection. The ksz9477 device doesn't use any .data in the
of_device_id, but ksz8795_spi uses .data to assign the regmap for the
8830 family and 87xx family switches. The regmap assignment is changed
to be based on the chip_id from the .data.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-18 12:50:59 +01:00
Arun Ramadoss
462d525018 net: dsa: microchip: move ksz_chip_data to ksz_common
This patch moves the ksz_chip_data in ksz8795 and ksz9477 to ksz_common.
At present, dev->chip_id is looked up in ksz_chip_data and the matching
values are copied to the ksz_dev structure. These values are declared as
constant.
Instead of copying the values and referencing the copies, this patch
points dev->info at the ksz_chip_data entry that matches the chip_id in
the init function. It also updates the ksz_chip_data values for the
LAN937x based switches.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-18 12:50:59 +01:00
Arun Ramadoss
a30bf80559 net: dsa: microchip: ksz8795: update the port_cnt value in ksz_chip_data
The port_cnt value in the structure is not used in switch_init.
Instead, switch_init uses fls(chip->cpu_port), because one of the ports
in the ksz8794 is unavailable. The cpu_port for the 8794 is 0x10 and
fls(0x10) = 5, hence update port_cnt directly in the ksz_chip_data
structure so that it is handled the same way as for all the other
switches in ksz8795.c and ksz9477.c.

Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-18 12:50:59 +01:00
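
The fls() arithmetic quoted in a30bf80559 can be checked with a small sketch. It models the kernel's fls() ("find last set", a 1-based bit index that returns 0 for an input of 0) using Python's int.bit_length(). The helper and the variable names are illustrative assumptions, not driver code.

```python
def fls(x: int) -> int:
    """Model of fls(): 1-based index of the highest set bit, 0 if x == 0."""
    if x < 0:
        raise ValueError("fls() is modelled for non-negative values only")
    return x.bit_length()


# cpu_port value for the ksz8794 as quoted in the commit message.
cpu_port = 0x10

# 0x10 is binary 10000, so the highest set bit is bit 5 (1-based).
port_cnt = fls(cpu_port)
print(port_cnt)  # prints 5
```

Storing 5 directly as port_cnt in the chip data gives the same value without computing it from the CPU port mask at init time.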
David S. Miller
089403a3f7 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2022-05-18

1) Fix "disable_policy" flag use when arriving from different devices.
   From Eyal Birger.

2) Fix error handling of pfkey_broadcast in function pfkey_process.
   From Jiasheng Jiang.

3) Check the encryption module availability consistency in pfkey.
   From Thomas Bartschies.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-18 12:47:36 +01:00
Johannes Berg
78488a64ae iwlwifi: mei: fix potential NULL-ptr deref
If SKB allocation fails, continue rather than using the NULL
pointer.

Coverity CID: 1497650

Cc: stable@vger.kernel.org
Fixes: 2da4366f9e2c ("iwlwifi: mei: add the driver to allow cooperation with CSME")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20220517120045.90c1b1fd534e.Ibb42463e74d0ec7d36ec81df22e171ae1f6268b0@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-05-18 12:58:59 +02:00
Avraham Stern
55cf10488d iwlwifi: mei: clear the sap data header before sending
The SAP data header has some fields that are marked as reserved
but are actually in use by CSME. Clear those fields before sending
the data to avoid having random values in those fields.

Cc: stable@vger.kernel.org
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20220517120045.8dd3423cf683.I02976028eaa6aab395cb2e701fa7127212762eb7@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-05-18 12:54:09 +02:00
Miri Korenblit
98c0de7b26 iwlwifi: mvm: remove vif_count
We used to count the number of ieee80211_vifs in mvm.
This was needed for the legacy PM API, which is no longer
supported. Remove it.

Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20220517120045.8c91ae023b15.Ia6145e4930b1d28f3fcedc316b4f177295b00557@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-05-18 12:54:09 +02:00
Emmanuel Grumbach
147eb05f24 iwlwifi: mvm: always tell the firmware to accept MCAST frames in BSS
Make the firmware's life easier and always accept MCAST frames. If
needed, drop them in the driver. We need to filter out MCAST frames
in order not to have false positives in the decryption check. If we
accept MCAST frames before we have the GTK installed, we'll end up
complaining that we can't decrypt the frame.
Implement the same filtering, but in the driver.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20220517120045.479956a46317.I21fac7ede9eca85a662671d694872898df884f0b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-05-18 12:54:08 +02:00
Mordechay Goodstein
184f10db5f iwlwifi: mvm: add OTP info in case of init failure
This helps to understand HW issues that can happen while
initializing the nic.

Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20220517120045.48464938b27a.I9b381f0da5e0636ad6a5f6c13f98edb9031b50fb@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-05-18 12:54:08 +02:00
Emmanuel Grumbach
9d096e3d30 iwlwifi: mvm: fix assert 1F04 upon reconfig
When we reconfig we must not send the MAC_POWER command that relates to
a MAC that was not yet added to the firmware.

Ignore those in the iterator.

Cc: stable@vger.kernel.org
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20220517120044.ed2ffc8ce732.If786e19512d0da4334a6382ea6148703422c7d7b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-05-18 12:54:04 +02:00
Johannes Berg
d1f6530c3e iwlwifi: fw: init SAR GEO table only if data is present
When no table data was read from ACPI, then filling the data
and returning success here will fill zero values, which means
transmit power will be limited to 0 dBm. This is clearly not
intended.

Return an error from iwl_sar_geo_init() if there's no data to
fill into the command structure.

Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Fixes: 78a19d5285d9 ("iwlwifi: mvm: Read the PPAG and SAR tables at INIT stage")
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20220517120044.bc45923b74e9.Id2b4362234b7f8ced82c591b95d4075dd2ec12f4@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-05-18 12:53:53 +02:00
Johannes Berg
51e073c23b iwlwifi: mvm: clean up authorized condition
We track in mvmvif->authorized when the AP STA becomes authorized
and no longer authorized, so we don't need the complex condition
with station lookup. Simplify the code.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20220517120044.41f528383a6b.I1cdf165581b781c53c8e6ac8779a2282b1f67c59@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-05-18 12:51:25 +02:00
Haim Dreyfuss
537b76d26c iwlwifi: mvm: use NULL instead of ERR_PTR when parsing wowlan status
We don't differentiate between the errors anyway, so using ERR_PTR is
pointless; returning NULL is simpler in this case.

Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20220517120044.78a7651327bb.I77480de7c26db850680f96a3440fb6a1b45dd9d2@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-05-18 12:51:11 +02:00
Johannes Berg
c191819642 iwlwifi: pcie: simplify MSI-X cause mapping
We're currently manually encoding a calculation here since the HW
just maps all the bits of specific registers to specific offsets,
which led to the bug fixed here previously with the Bz SW_ERROR
interrupt.

Clean up the code to only know about the mapping offset (-16 or
16 depending on the register) to avoid such issues in the future.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20220517120044.19abe9a4d171.I934356911277f9b2a955808763f317986f69a461@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-05-18 12:50:46 +02:00
David S. Miller
6431ce6cd3 mlx5-updates-2022-05-17
Merge tag 'mlx5-updates-2022-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2022-05-17

MISC updates to mlx5 driver

1) Aya Levin allows relaxed ordering over VFs

2) Gal Pressman Adds support XDP SQs for uplink representors in switchdev mode

3) Add debugfs TC stats and command failure syndrome for debuggability

4) Tariq uses variants of vzalloc where it could help

5) Multiport eswitch support from Eli Cohen:

Eli Cohen Says:
===============

The multiport eswitch feature allows forwarding traffic from a
representor net device to the uplink port of an associated eswitch.

This feature requires creating a LAG object. Since LAG can be created
only once for a function, the feature is mutually exclusive with both
bonding and multipath.

Multiport eswitch mode is entered automatically when these conditions
are met:
1. No other LAG related mode is active.
2. A rule that explicitly forwards to an uplink port is inserted.

The implementation maintains a reference count on such rules. When the
reference count reaches zero, the LAG is released and other modes may be
used.

When a rule that explicitly forwards to an uplink port is inserted
while another LAG mode is active, that rule will not be offloaded by
the hardware, since the hardware cannot guarantee that the traffic
will actually be forwarded to that port.

Example rules that forward to an uplink port:

$ tc filter add dev rep0 root flower action mirred egress \
  redirect dev uplinkrep0

$ tc filter add dev rep0 root flower action mirred egress \
  redirect dev uplinkrep1

This feature is supported only if LAG_RESOURCE_ALLOCATION firmware
configuration parameter is set to true.

The series consists of three patches:
1. Lag state machine refactor
   This patch does not add new functionality but rather changes the way
   the state of the LAG is maintained.
2. Small fix to remove unused argument.
3. The actual implementation of the feature.
===============

====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-18 11:35:27 +01:00
David S. Miller
765d121600 mlx5-fixes-2022-05-17
Merge tag 'mlx5-fixes-2022-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2022-05-17

This series provides bug fixes to mlx5 driver.
Please pull and let me know if there is any problem.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-18 11:33:44 +01:00
Thomas Bartschies
015c44d7bf net: af_key: check encryption module availability consistency
Since the recent introduction of support for the SM3 and SM4 algorithms
for IPsec, the kernel produces invalid pfkey acquire messages when these
encryption modules are disabled. This happens because the availability
of the algorithms wasn't checked in all necessary functions.
This patch adds these checks.

Signed-off-by: Thomas Bartschies <thomas.bartschies@cvk.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-05-18 09:42:16 +02:00
Jiasheng Jiang
4dc2a5a8f6 net: af_key: add check for pfkey_broadcast in function pfkey_process
If skb_clone() returns a NULL pointer, pfkey_broadcast() will
return an error.
Therefore, check the return value of pfkey_broadcast() and return
the error if it fails.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-05-18 09:23:54 +02:00
Stephen Rothwell
58a94a62a5 netfilter: ctnetlink: fix up for "netfilter: conntrack: remove unconfirmed list"
After merging the net-next tree, today's linux-next build (powerpc
ppc64_defconfig) produced this warning:

nf_conntrack_netlink.c:1717 warning: 'ctnetlink_dump_one_entry' defined but not used

Fixes: 8a75a2c17410 ("netfilter: conntrack: remove unconfirmed list")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Florian Westphal <fw@strlen.de>
2022-05-18 09:21:59 +02:00
Eli Cohen
94db331778 net/mlx5: Support multiport eswitch mode
Multiport eswitch mode is a LAG mode that allows adding rules that
forward traffic to a specific physical port without being affected by LAG
affinity configuration.

This mode of operation is mutually exclusive with the other LAG modes
used by multipath and bonding.

To make the transition between the modes, we maintain a counter on the
number of rules specifying one of the uplink representors as the target
of mirred egress redirect action.

An example of such rule would be:

$ tc filter add dev enp8s0f0_0 prot all root flower dst_mac \
  00:11:22:33:44:55 action mirred egress redirect dev enp8s0f0

If the reference count just grows to one and LAG is not in use, we
create the LAG in multiport eswitch mode. Other mode changes are not
allowed while in this mode. When the reference count reaches zero, we
destroy the LAG and let other modes be used if needed.

The logic also changed such that if forwarding to some uplink
destination cannot be guaranteed, we fail the operation so the rule will
eventually be in software and not in hardware.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:41:51 -07:00
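
The mode transitions described in 94db331778 can be modelled as a reference-counted state machine. The sketch below is a simplified model and not the mlx5 implementation: the class, method and mode names are hypothetical, and a single OTHER mode stands in for the bonding and multipath LAG modes.

```python
from enum import Enum


class LagMode(Enum):
    NONE = "none"
    MPESW = "multiport_eswitch"
    OTHER = "other"  # hypothetical stand-in for bonding/multipath modes


class LagModel:
    """Toy model of the rule-counted multiport eswitch mode."""

    def __init__(self, mode: LagMode = LagMode.NONE) -> None:
        self.mode = mode
        self.uplink_rules = 0  # rules that redirect to an uplink representor

    def add_uplink_rule(self) -> bool:
        """Return True if the rule can be offloaded, False if it stays in software."""
        if self.mode is LagMode.OTHER:
            # Forwarding to a specific uplink cannot be guaranteed.
            return False
        if self.mode is LagMode.NONE:
            # First rule and LAG not in use: create the LAG in this mode.
            self.mode = LagMode.MPESW
        self.uplink_rules += 1
        return True

    def del_uplink_rule(self) -> None:
        if self.mode is not LagMode.MPESW or self.uplink_rules == 0:
            raise RuntimeError("no offloaded uplink rule to remove")
        self.uplink_rules -= 1
        if self.uplink_rules == 0:
            # Last rule removed: destroy the LAG so other modes can be used.
            self.mode = LagMode.NONE

    def request_mode(self, mode: LagMode) -> bool:
        """Other mode changes are refused while in multiport eswitch mode."""
        if self.mode is LagMode.MPESW:
            return False
        self.mode = mode
        return True


lag = LagModel()
lag.add_uplink_rule()                       # enters multiport eswitch mode
blocked = lag.request_mode(LagMode.OTHER)   # refused while rules exist
lag.del_uplink_rule()                       # count reaches zero, LAG released
```

Keeping the count on the rules rather than on the mode itself means the mode is entered and left purely as a side effect of rule insertion and removal, which matches the automatic behavior the commit message describes.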
Eli Cohen
a4a9c87ebb net/mlx5: Remove unused argument
Argument ndev is not used in mlx5_handle_changeupper_event().
Remove it.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:41:50 -07:00
Eli Cohen
ef9a3a4a81 net/mlx5: Lag, refactor lag state machine
The LAG state machine is implemented using bit flags. However, all these
bit flags, except for MLX5_LAG_FLAG_HASH_BASED, are really mutually
exclusive.

In addition, MLX5_LAG_FLAG_READY is used by bonding to mark if we have
our netdevices successfully added to lag and does not really belong in
the same flags variable as the other flags.

Rename MLX5_LAG_FLAG_READY to MLX5_LAG_FLAG_NDEVS_READY to better
reflect its purpose and put it in a new flags variable.

For the rest of the flags, we introduce a mode enum to hold the state
of the LAG.

Remove the shared fdb boolean flag from struct mlx5_lag and store this
configuration as a mode flag.

Change all flag related operations to use standard Linux APIs.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:41:50 -07:00
Gal Pressman
65810a2d2a net/mlx5e: Add XDP SQs to uplink representors steering tables
This patch adds the XDP SQs to the uplink representors steering tables
in switchdev mode and enables XDP usage on them.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:41:49 -07:00
Moshe Tal
6d0ba49321 net/mlx5e: Correct the calculation of max channels for rep
Correct the calculation of maximum channels of rep to better utilize
the hardware resources and allow a larger scale of reps.

This will allow creation of all virtual ports configured.

Fixes: 473baf2e9e8c ("net/mlx5e: Allow profile-specific limitation on max num of channels")
Signed-off-by: Moshe Tal <moshet@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:41:48 -07:00
Saeed Mahameed
77422a8f6f net/mlx5e: CT: Add ct driver counters
Connection offload is translated to multiple rules over several
hardware flow tables. Unhandled end-cases may cause a hardware
resource leak causing multiple system symptoms such as a host
memory leak, decreased performance and other scale related issues.

Export the current number of firmware FTEs related to the CT table
as a debugfs counter. Also add a dropped packets counter to help
debug packets dropped on restore failure.

To show the offloaded count:
cat /sys/kernel/debug/mlx5/<PCI>/ct_nic/offloaded

To show the dropped count:
cat /sys/kernel/debug/mlx5/<PCI>/ct_nic/rx_dropped

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Roi Dayan <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
2022-05-17 23:41:48 -07:00
Aya Levin
f05ec8d9d0 net/mlx5e: Allow relaxed ordering over VFs
By PCI spec, the config space of the VF always reports relaxed ordering
as not supported, while the VF inherits this property from its PF. Hence
using pcie_relaxed_ordering_enable() always disables relaxed ordering on
all VFs. Remove this check and rely on the firmware, which queries the
config space of the PF and sets the capability bit accordingly.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Marina Varshaver <marinav@nvidia.com>
Reviewed-by: Gal Shalom <galshalom@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:41:47 -07:00
Gal Pressman
682adfa6ca net/mlx5e: Support partial GSO for tunnels over vlans
Offloading outer checksum on tunnels requires GSO partial, add it to
'vlan_features' to allow offloading tunnels over vlans.
For example, running GENEVE over vlan & ipv6 (mandatory UDP checksum)
now allows for hardware TSO instead of software segmentation in GSO
only.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:41:47 -07:00
Gal Pressman
675b9d51d6 net/mlx5e: IPoIB, Improve ethtool rxnfc callback structure in IPoIB
Followup commit
79ce39be1d63 ("net/mlx5e: Improve ethtool rxnfc callback structure")
and handle CONFIG_MLX5_EN_RXNFC enabled/disabled inside the fs layer so
the ethtool callbacks are always available. The fs layer will provide
stubs when CONFIG_MLX5_EN_RXNFC is compiled out.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:41:47 -07:00
Tariq Toukan
597c112326 net/mlx5e: Allocate virtually contiguous memory for reps structures
Physical continuity is not necessary, and requested allocation size might
be larger than PAGE_SIZE.
Hence, use v-alloc/free API.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:41:46 -07:00
Tariq Toukan
035e0dd573 net/mlx5e: Allocate virtually contiguous memory for VLANs list
Physical continuity is not necessary, and requested allocation size might
be larger than PAGE_SIZE.
Hence, use v-alloc/free API.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:41:46 -07:00
Tariq Toukan
88468311c0 net/mlx5: Allocate virtually contiguous memory in pci_irq.c
Physical continuity is not necessary, and requested allocation size might
be larger than PAGE_SIZE.
Hence, use v-alloc/free API.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:41:45 -07:00
Tariq Toukan
773c104d53 net/mlx5: Allocate virtually contiguous memory in vport.c
Physical continuity is not necessary, and requested allocation size might
be larger than PAGE_SIZE.
Hence, use v-alloc/free API.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:41:45 -07:00
Tariq Toukan
9b45bde82c net/mlx5: Inline db alloc API function
Take the wrapper version which picks default node into a header file.
This reduces the number of exported functions.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:41:45 -07:00
Moshe Shemesh
1d2c717bc7 net/mlx5: Add last command failure syndrome to debugfs
Add syndrome of last command failure per command type to debugfs to ease
debugging of such failure.
last_failed_syndrome - last command failed syndrome returned by FW.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:41:44 -07:00
Saeed Mahameed
4c7c8a6d87 net/mlx5: sparse: error: context imbalance in 'mlx5_vf_get_core_dev'
Removing the annotation resolves the issue for some reason.

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:41:43 -07:00
Al Viro
a91714312e percpu_ref_init(): clean ->percpu_count_ref on failure
That way percpu_ref_exit() is safe after failing percpu_ref_init().
At least one user (cgroup_create()) had a double-free that way;
there might be other similar bugs.  Easier to fix in percpu_ref_init(),
rather than playing whack-a-mole in sloppy users...

Usual symptoms look like a messed refcounting in one of subsystems
that use percpu allocations (might be percpu-refcount, might be
something else).  Having refcounts for two different objects share
memory is Not Nice(tm)...

Reported-by: syzbot+5b1e53987f858500ec00@syzkaller.appspotmail.com
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-05-18 02:20:17 -04:00
Shay Drory
16d42d3133 net/mlx5: Drain fw_reset when removing device
In case fw sync reset is called in parallel to device removal, the
device might get stuck in the following deadlock:
         CPU 0                        CPU 1
         -----                        -----
                                  remove_one
                                   uninit_one (locks intf_state_mutex)
mlx5_sync_reset_now_event()
work in fw_reset->wq.
 mlx5_enter_error_state()
  mutex_lock (intf_state_mutex)
                                   cleanup_once
                                    fw_reset_cleanup()
                                     destroy_workqueue(fw_reset->wq)

Drain the fw_reset WQ, and make sure no new work is being queued, before
entering uninit_one().
The drain is done before devlink_unregister() since fw_reset, in some
flows, uses the devlink API devlink_remote_reload_actions_performed().

Fixes: 38b9f903f22b ("net/mlx5: Handle sync reset request event")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:03:57 -07:00
Paul Blakey
04c551bad3 net/mlx5e: CT: Fix setting flow_source for smfs ct tuples
Cited patch sets flow_source to ANY overriding the provided spec
flow_source, avoiding the optimization done by commit c9c079b4deaa
("net/mlx5: CT: Set flow source hint from provided tuple device").

To fix the above, set the dr_rule flow_source from provided flow spec.

Fixes: 3ee61ebb0df1 ("net/mlx5: CT: Add software steering ct flow steering provider")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:03:56 -07:00
Paul Blakey
8e1dcf499a net/mlx5e: CT: Fix support for GRE tuples
The cited commit removed support for GRE tuples when software steering was enabled.

To bring back support for GRE tuples, add GRE ipv4/ipv6 matchers.

Fixes: 3ee61ebb0df1 ("net/mlx5: CT: Add software steering ct flow steering provider")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:03:56 -07:00
Gal Pressman
6bbd723035 net/mlx5e: Remove HW-GRO from reported features
We got reports of certain HW-GRO flows causing kernel call traces, which
might be related to firmware. To be on the safe side, disable the
feature for now and re-enable it once a driver/firmware fix is found.

Fixes: 83439f3c37aa ("net/mlx5e: Add HW-GRO offload")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:03:56 -07:00
Maxim Mikityanskiy
b0617e7b35 net/mlx5e: Properly block HW GRO when XDP is enabled
HW GRO is incompatible and mutually exclusive with XDP and XSK. However,
the needed checks are only made when enabling XDP. If HW GRO is enabled
when XDP is already active, the command will succeed, and XDP will be
skipped in the data path, although still enabled.

This commit fixes the bug by checking the XDP and XSK status in
mlx5e_fix_features and disabling HW GRO if XDP is enabled.

Fixes: 83439f3c37aa ("net/mlx5e: Add HW-GRO offload")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:03:55 -07:00
Maxim Mikityanskiy
cf6e34c8c2 net/mlx5e: Properly block LRO when XDP is enabled
LRO is incompatible and mutually exclusive with XDP. However, the needed
checks are only made when enabling XDP. If LRO is enabled when XDP is
already active, the command will succeed, and XDP will be skipped in the
data path, although still enabled.

This commit fixes the bug by checking the XDP status in
mlx5e_fix_features and disabling LRO if XDP is enabled.

Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:03:55 -07:00
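
Commits b0617e7b35 and cf6e34c8c2 above describe the same pattern: the check has to run whenever features are re-evaluated, not only when XDP is enabled. A minimal sketch of that pattern follows. The Feature flags and the fix_features() signature are illustrative assumptions and do not reproduce the kernel's netdev feature bits or the mlx5e_fix_features() interface.

```python
from enum import Flag, auto


class Feature(Flag):
    NONE = 0
    LRO = auto()
    GRO_HW = auto()
    RXCSUM = auto()  # hypothetical unrelated feature, kept untouched


def fix_features(requested: Feature, xdp_enabled: bool) -> Feature:
    """Return the feature set that may actually be enabled.

    LRO and HW GRO are mutually exclusive with XDP, so both are
    cleared from the request whenever XDP is active.
    """
    if xdp_enabled:
        requested &= ~(Feature.LRO | Feature.GRO_HW)
    return requested


# Enabling LRO and HW GRO while XDP is already active is filtered out.
result = fix_features(Feature.LRO | Feature.GRO_HW | Feature.RXCSUM, xdp_enabled=True)
print(result == Feature.RXCSUM)  # prints True
```

Filtering in the feature-fixing step covers both orderings (XDP first or offload first), whereas a check made only when XDP is enabled misses the case the commits fix.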
Aya Levin
15a5078cab net/mlx5e: Block rx-gro-hw feature in switchdev mode
When the driver is in switchdev mode and rx-gro-hw is set, the RQ needs
special CQE handling. Until that handling is in place, block setting the
rx-gro-hw feature in switchdev mode, so that setting the feature does
not fail because opening the RQ fails.

Fixes: f97d5c2a453e ("net/mlx5e: Add handle SHAMPO cqe support")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:03:54 -07:00
Maxim Mikityanskiy
379169740b net/mlx5e: Wrap mlx5e_trap_napi_poll into rcu_read_lock
The body of mlx5e_napi_poll is wrapped into rcu_read_lock to be able to
read the XDP program pointer using rcu_dereference. However, the trap RQ
NAPI doesn't use rcu_read_lock, because the trap RQ works only in the
non-linear mode, and mlx5e_skb_from_cqe_nonlinear, until recently,
didn't support XDP and didn't call rcu_dereference.

Starting from the cited commit, mlx5e_skb_from_cqe_nonlinear supports
XDP and calls rcu_dereference, but mlx5e_trap_napi_poll doesn't wrap it
into rcu_read_lock. It leads to RCU-lockdep warnings like this:

    WARNING: suspicious RCU usage

This commit fixes the issue by adding an rcu_read_lock to
mlx5e_trap_napi_poll, similarly to mlx5e_napi_poll.

Fixes: ea5d49bdae8b ("net/mlx5e: Add XDP multi buffer support to the non-linear legacy RQ")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-17 23:03:54 -07:00