Commit Graph

996723 Commits

Author SHA1 Message Date
Baowen Zheng
2ffe039528 net/sched: act_police: add support for packet-per-second policing
Allow a policer action to enforce a rate-limit based on packets-per-second,
configurable using a packet-per-second rate and burst parameters.

e.g.
tc filter add dev tap1 parent ffff: u32 match \
        u32 0 0 police pkts_rate 3000 pkts_burst 1000

Testing was unable to uncover a performance impact of this change on
existing features.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-13 14:18:09 -08:00
Baowen Zheng
6a56e19902 flow_offload: reject configuration of packet-per-second policing in offload drivers
A follow-up patch will allow users to configures packet-per-second policing
in the software datapath. In preparation for this, teach all drivers that
support offload of the policer action to reject such configuration as
currently none of them support it.

Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-13 14:18:09 -08:00
Xingfeng Hu
25660156f4 flow_offload: add support for packet-per-second policing
Allow flow_offload API to configure packet-per-second policing using rate
and burst parameters.

Dummy implementations of tcf_police_rate_pkt_ps() and
tcf_police_burst_pkt() are supplied which return 0, the unconfigured state.
This is to facilitate splitting the offload, driver, and TC code portion of
this feature into separate patches with the aim of providing a logical flow
for review. And the implementation of these helpers will be filled out by a
follow-up patch.

Signed-off-by: Xingfeng Hu <xingfeng.hu@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Louis Peens <louis.peens@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-13 14:18:09 -08:00
David S. Miller
4849d9beb8 Merge branch 'hns3-imp-phys'
Huazhong Tan says:

====================
net: hns3: support imp-controlled PHYs

This series adds support for imp-controlled PHYs in the HNS3
ethernet driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-13 14:11:29 -08:00
Guangbin Huang
b47cfe1f40 net: hns3: add phy loopback support for imp-controlled PHYs
If the imp-controlled PHYs feature is enabled, driver can not
call phy driver interface to set loopback anymore and needs
to send command to firmware to start phy loopback.

Driver reuses the existing firmware command 0x0315 to start
phy loopback, just add a setting bit in this command. As this
command is not only for serdes loopback anymore, rename this
command to "xxx_COMMON_LOOPBACK", and modify function name,
macro name and logs related to it.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-13 14:11:29 -08:00
Guangbin Huang
024712f51e net: hns3: add ioctl support for imp-controlled PHYs
When the imp-controlled PHYs feature is enabled, driver will not
register mdio bus. In order to support ioctl ops for phy tool to
read or write phy register in this case, the firmware implement
a new command for driver and driver implement ioctl by using this
new command.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-13 14:11:29 -08:00
Guangbin Huang
57a8f46b1b net: hns3: add get/set pause parameters support for imp-controlled PHYs
When the imp-controlled PHYs feature is enabled, phydev is NULL.
In this case, the autoneg is always off when user uses ethtool -a
command to get pause parameters because  hclge_get_pauseparam()
uses phydev to check whether device is TP port. To fit this new
feature, use media type to check whether device is TP port.

And when user set pause parameters, these parameters need to
always set to mac, no matter whether autoneg is off.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-13 14:11:28 -08:00
Guangbin Huang
f5f2b3e4dc net: hns3: add support for imp-controlled PHYs
IMP(Intelligent Management Processor) firmware add a new feature
to take control of PHYs for some new devices, PF driver adds
support for this feature.

Driver queries device's capability to check whether IMP supports
this feature, it will tell IMP to enable this feature by firmware
compatible command if it is supported.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-13 14:11:28 -08:00
David S. Miller
5ab6f96a12 Merge branch 'sh_eth-reg-defs'
Sergey Shtylyov says:

====================
sh_eth: Improve the register/bit definitions in the Ether driver

Here are 4 patches against DaveM's 'net-next' repo. Mainly I'm renaming the register *enum*
tags/entries to match the SoC manuals,and also moving the RX-TX descriptor *enum*s closer to
the corresponding *struct*s...
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:50:42 -08:00
Sergey Shtylyov
0deaeabf27 sh_eth: place RX/TX descriptor *enum*s after their *struct*s
Place the RX/TX descriptor bit *enum*s where they belong -- after the
corresponding RX/TX descriptor *struct*s and, while at it, switch to
declaring one *enum* entry per line...

Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:50:42 -08:00
Sergey Shtylyov
e2dccaf194 sh_eth: rename *enum*s still not matching register names
Finally, rename the rest of the *enum* tags still not (exactly) matching
the abbreviated register names from the manuals...

Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:50:42 -08:00
Sergey Shtylyov
4585b72d97 sh_eth: rename PSR bits
In all the SoC manuals (except R-Car gen2) the PHY status register's name
is abbreviated to  PSR with the only valid bit 0 named LMON.  Follow the
suit and rename the corresponding *enum* tag/entry.

Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:50:42 -08:00
Sergey Shtylyov
bc9d992ca4 sh_eth: rename TRSCER bits
In all the SoC manuals the TRSCER register bits match the corresponding
EESR registers's bits, but only on the R-Car gen2 SoC those are named
RINT<n> and TINT<n>.  Follow the suit and rename the *enum* tag/entries
from DESC_I_* to TRSCER_*.

Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:50:42 -08:00
David S. Miller
7c678829ef Merge branch 'mptcp-Include-multiple-address-ids-in-RM_ADDR'
Mat Martineau says:

====================
mptcp: Include multiple address ids in RM_ADDR

Here's a patch series from the MPTCP tree that extends the capabilities
of the MPTCP RM_ADDR header.

MPTCP peers can exchange information about their IP addresses that are
available for additional MPTCP subflows. IP addresses are advertised
with an ADD_ADDR header type, and those advertisements are revoked with
the RM_ADDR header type. RFC 8684 allows the RM_ADDR header to include
more than one address ID, so multiple advertisements can be revoked in a
single header. Previous kernel versions have only used RM_ADDR with a
single address ID, so multiple removals required multiple packets.

Patches 1-4 plumb address id list structures around the MPTCP code,
where before only a single address ID was passed.

Patches 5-8 make use of the address lists at the path manager layer that
tracks available addresses for both peers.

Patches 9-11 update the selftests to cover the new use of RM_ADDR with
multiple address IDs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:47:46 -08:00
Geliang Tang
d2c4333a80 selftests: mptcp: add testcases for removing addrs
This patch added the testcases for removing a list of addresses. Used
the netlink to flush the addresses in the testcases.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:47:45 -08:00
Geliang Tang
f87744ad42 selftests: mptcp: set addr id for removing testcases
The removing testcases can only delete the addresses from id 1, this
patch added the support for deleting the addresses from any id that user
set.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:47:45 -08:00
Geliang Tang
7028ba8ac9 selftests: mptcp: add invert argument for chk_rm_nr
Some of the removing testcases used two zeros as arguments for chk_rm_nr
like this: chk_rm_nr 0 0. This doesn't mean that no RM_ADDR has been sent.
It only means that RM_ADDR had been sent in the opposite direction that
chk_rm_nr is checking.

This patch added a new argument invert for chk_rm_nr to allow it can
check the RM_ADDR from the opposite direction.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:47:45 -08:00
Geliang Tang
0e4a3e6886 mptcp: remove a list of addrs when flushing
This patch invoked mptcp_nl_remove_addrs_list to remove a list of addresses
when the netlink flushes addresses, instead of using
mptcp_nl_remove_subflow_and_signal_addr to remove them one by one.

And dropped the unused parameter net in __flush_addrs too.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:47:45 -08:00
Geliang Tang
06faa22710 mptcp: remove multi addresses and subflows in PM
This patch implemented the function to remove a list of addresses and
subflows, named mptcp_nl_remove_addrs_list, which had a input parameter
rm_list as the removing addresses list.

In mptcp_nl_remove_addrs_list, traverse all the existing msk sockets to
invoke mptcp_pm_remove_addrs_and_subflows to remove a list of addresses
for each msk socket.

In mptcp_pm_remove_addrs_and_subflows, traverse all the addresses in the
removing addresses list, to find whether this address is in the conn_list
or anno_list. If it is, put the address ID into the removing address list
or the removing subflow list, and pass the two lists to
mptcp_pm_remove_addr and mptcp_pm_remove_subflow.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:47:45 -08:00
Geliang Tang
ddd14bb85d mptcp: remove multi subflows in PM
This patch dealt with removing multi subflows in PM:

In mptcp_pm_remove_subflow, changed the input parameter local_id as an
list of removing address ids, and passed the list to
mptcp_pm_nl_rm_subflow_received.

In mptcp_pm_nl_rm_subflow_received, iterated each address id from the
received ids list. Then shut down and closed each address id's subsocket.

In mptcp_nl_remove_subflow_and_signal_addr, put the single address id into
an ids list, and passed it to mptcp_pm_remove_subflow.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:47:45 -08:00
Geliang Tang
d0b698ca9a mptcp: remove multi addresses in PM
This patch dropped the member rm_id of struct mptcp_pm_data. Use
rm_list_rx in mptcp_pm_nl_rm_addr_received instead of using rm_id.

In mptcp_pm_nl_rm_addr_received, iterated each address id from
pm.rm_list_rx, then shut down and closed each address id's subsocket.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:47:45 -08:00
Geliang Tang
b5c55f334c mptcp: add rm_list_rx in mptcp_pm_data
This patch added a new member rm_list_rx for struct mptcp_pm_data as an
list of the removing address ids on the incoming direction. Initialized
its nr field to zero in mptcp_pm_data_init.

In mptcp_pm_rm_addr_received, set it as the input rm_list.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:47:45 -08:00
Geliang Tang
5c4a824dcb mptcp: add rm_list in mptcp_options_received
This patch changed the member rm_id in struct mptcp_options_received as a
list of the removing address ids, and renamed it to rm_list.

In mptcp_parse_option, parsed the RM_ADDR suboption and filled them into
the rm_list in struct mptcp_options_received.

In mptcp_incoming_options, passed this rm_list to the function
mptcp_pm_rm_addr_received.

It also changed the parameter type of mptcp_pm_rm_addr_received.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:47:45 -08:00
Geliang Tang
cbde278718 mptcp: add rm_list_tx in mptcp_pm_data
This patch added a new member rm_list_tx for struct mptcp_pm_data as the
removing address list on the outgoing direction. Initialize its nr field
to zero in mptcp_pm_data_init.

In mptcp_pm_remove_anno_addr, put the single address id into an removing
list, and passed it to mptcp_pm_remove_addr.

In mptcp_pm_remove_addr, save the input rm_list to rm_list_tx in struct
mptcp_pm_data.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:47:45 -08:00
Geliang Tang
6445e17af7 mptcp: add rm_list in mptcp_out_options
This patch defined a new struct mptcp_rm_list, the ids field was an
array of the removing address ids, the nr field was the valid number of
removing address ids in the array. The array size was definced as a new
macro MPTCP_RM_IDS_MAX. Changed the member rm_id of struct
mptcp_out_options to rm_list.

In mptcp_established_options_rm_addr, invoked mptcp_pm_rm_addr_signal to
get the rm_list. According the number of addresses in it, calculated
the padded RM_ADDR suboption length. And saved the ids array in struct
mptcp_out_options's rm_list member.

In mptcp_write_options, iterated each address id from struct
mptcp_out_options's rm_list member, set the invalid ones as TCPOPT_NOP,
then filled them into the RM_ADDR suboption.

Changed TCPOLEN_MPTCP_RM_ADDR_BASE from 4 to 3.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:47:45 -08:00
David S. Miller
e9e90a70cc Merge branch 'resil-nhgroups-netdevsim-selftests'
Petr Machata says:

====================
net: Resilient NH groups: netdevsim, selftests

Support for resilient next-hop groups was added in a previous patch set.
Resilient next hop groups add a layer of indirection between the SKB hash
and the next hop. Thus the hash is used to reference a hash table bucket,
which is then used to reference a particular next hop. This allows the
system more flexibility when assigning SKB hash space to next hops.
Previously, each next hop had to be assigned a continuous range of SKB hash
space. With a hash table as an intermediate layer, it is possible to
reassign next hops with a hash table bucket granularity. In turn, this
mends issues with traffic flow redirection resulting from next hop removal
or adjustments in next-hop weights.

This patch set introduces mock offloading of resilient next hop groups by
the netdevsim driver, and a suite of selftests.

- Patch #1 adds a netdevsim-specific lock to protect next-hop hashtable.
  Previously, netdevsim relied on RTNL to maintain mutual exclusion.
  Patch #2 extracts a helper to make the following patches clearer.

- Patch #3 implements the support for offloading of resilient next-hop
  groups.

- Patch #4 introduces a new debugfs interface to set activity on a selected
  next-hop bucket. This simulates how HW can periodically report bucket
  activity, and buckets thus marked are expected to be exempt from
  migration to new next hops when the group changes.

- Patches #5 and #6 clean up the fib_nexthop selftests.

- Patches #7, #8 and #9 add tests for resilient next hop groups. Patch #7
  adds resilient-hashing counterparts to fib_nexthops.sh. Patch #8 adds a
  new traffic test for resilient next-hop groups. Patch #9 adds a new
  traffic test for tunneling.

- Patch #10 actually leverages the netdevsim offload to implement a suite
  of algorithmic tests that verify how and when buckets are migrated under
  various simulated workload scenarios.

The overall plan is to contribute approximately the following patchsets:

1) Nexthop policy refactoring (already pushed)
2) Preparations for resilient next hop groups (already pushed)
3) Implementation of resilient next hop group (already pushed)
4) Netdevsim offload plus a suite of selftests (this patchset)
5) Preparations for mlxsw offload of resilient next-hop groups
6) mlxsw offload including selftests

Interested parties can look at the complete code at [2].

[1] https://tools.ietf.org/html/rfc2992
[2] https://github.com/idosch/linux/commits/submit/res_integ_v1
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:44:10 -08:00
Ido Schimmel
b8a07c4cea selftests: netdevsim: Add test for resilient nexthop groups offload API
Test various aspects of the resilient nexthop group offload API on top
of the netdevsim implementation. Both good and bad flows are tested.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Co-developed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:44:10 -08:00
Ido Schimmel
902280cacc selftests: forwarding: Add resilient multipath tunneling nexthop test
Add a resilient nexthop objects version of gre_multipath_nh.sh. Test
that both IPv4 and IPv6 overlays work with resilient nexthop groups
where the nexthops are two GRE tunnels.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:44:10 -08:00
Ido Schimmel
386e3792b5 selftests: forwarding: Add resilient hashing test
Verify that IPv4 and IPv6 multipath forwarding works correctly with
resilient nexthop groups and with different weights.

Test that when the idle timer is not zero, the resilient groups are not
rebalanced - because the nexthop buckets are considered active - and the
initial weights (1:1) are used.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:44:10 -08:00
Ido Schimmel
557205f47d selftests: fib_nexthops: Test resilient nexthop groups
Add test cases for resilient nexthop groups. Exhaustive forwarding tests
are added separately under net/forwarding/.

Examples:

 # ./fib_nexthops.sh -t basic_res

Basic resilient nexthop group functional tests
----------------------------------------------
TEST: Add a nexthop group with default parameters                   [ OK ]
TEST: Get a nexthop group with default parameters                   [ OK ]
TEST: Get a nexthop group with non-default parameters               [ OK ]
TEST: Add a nexthop group with 0 buckets                            [ OK ]
TEST: Replace nexthop group parameters                              [ OK ]
TEST: Get a nexthop group after replacing parameters                [ OK ]
TEST: Replace idle timer                                            [ OK ]
TEST: Get a nexthop group after replacing idle timer                [ OK ]
TEST: Replace unbalanced timer                                      [ OK ]
TEST: Get a nexthop group after replacing unbalanced timer          [ OK ]
TEST: Replace with no parameters                                    [ OK ]
TEST: Get a nexthop group after replacing no parameters             [ OK ]
TEST: Replace nexthop group type - implicit                         [ OK ]
TEST: Replace nexthop group type - explicit                         [ OK ]
TEST: Replace number of nexthop buckets                             [ OK ]
TEST: Get a nexthop group after replacing with invalid parameters   [ OK ]
TEST: Dump all nexthop buckets                                      [ OK ]
TEST: Dump all nexthop buckets in a group                           [ OK ]
TEST: Dump all nexthop buckets with a specific nexthop device       [ OK ]
TEST: Dump all nexthop buckets with a specific nexthop identifier   [ OK ]
TEST: Dump all nexthop buckets in a non-existent group              [ OK ]
TEST: Dump all nexthop buckets in a non-resilient group             [ OK ]
TEST: Dump all nexthop buckets using a non-existent device          [ OK ]
TEST: Dump all nexthop buckets with invalid 'groups' keyword        [ OK ]
TEST: Dump all nexthop buckets with invalid 'fdb' keyword           [ OK ]
TEST: Get a valid nexthop bucket                                    [ OK ]
TEST: Get a nexthop bucket with valid group, but invalid index      [ OK ]
TEST: Get a nexthop bucket from a non-resilient group               [ OK ]
TEST: Get a nexthop bucket from a non-existent group                [ OK ]

Tests passed:  29
Tests failed:   0

 # ./fib_nexthops.sh -t ipv4_large_res_grp

IPv4 large resilient group (128k buckets)
-----------------------------------------
TEST: Dump large (x131072) nexthop buckets                          [ OK ]

Tests passed:   1
Tests failed:   0

 # ./fib_nexthops.sh -t ipv6_large_res_grp

IPv6 large resilient group (128k buckets)
-----------------------------------------
TEST: Dump large (x131072) nexthop buckets                          [ OK ]

Tests passed:   1
Tests failed:   0

 # ./fib_nexthops.sh -t ipv4_res_torture

IPv4 runtime resilient nexthop group torture
--------------------------------------------
TEST: IPv4 resilient nexthop group torture test                     [ OK ]

Tests passed:   1
Tests failed:   0

 # ./fib_nexthops.sh -t ipv6_res_torture

IPv6 runtime resilient nexthop group torture
--------------------------------------------
TEST: IPv6 resilient nexthop group torture test                     [ OK ]

Tests passed:   1
Tests failed:   0

 # ./fib_nexthops.sh -t ipv4_res_grp_fcnal

IPv4 resilient groups functional
--------------------------------
TEST: Nexthop group updated when entry is deleted                   [ OK ]
TEST: Nexthop buckets updated when entry is deleted                 [ OK ]
TEST: Nexthop group updated after replace                           [ OK ]
TEST: Nexthop buckets updated after replace                         [ OK ]
TEST: Nexthop group updated when entry is deleted - nECMP           [ OK ]
TEST: Nexthop buckets updated when entry is deleted - nECMP         [ OK ]
TEST: Nexthop group updated after replace - nECMP                   [ OK ]
TEST: Nexthop buckets updated after replace - nECMP                 [ OK ]

Tests passed:   8
Tests failed:   0

 # ./fib_nexthops.sh -t ipv6_res_grp_fcnal

IPv6 resilient groups functional
--------------------------------
TEST: Nexthop group updated when entry is deleted                   [ OK ]
TEST: Nexthop buckets updated when entry is deleted                 [ OK ]
TEST: Nexthop group updated after replace                           [ OK ]
TEST: Nexthop buckets updated after replace                         [ OK ]
TEST: Nexthop group updated when entry is deleted - nECMP           [ OK ]
TEST: Nexthop buckets updated when entry is deleted - nECMP         [ OK ]
TEST: Nexthop group updated after replace - nECMP                   [ OK ]
TEST: Nexthop buckets updated after replace - nECMP                 [ OK ]

Tests passed:   8
Tests failed:   0

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Co-developed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:44:10 -08:00
Ido Schimmel
a8f9952d21 selftests: fib_nexthops: List each test case in a different line
The lines with the IPv4 and IPv6 test cases are already very long and
more test cases will be added in subsequent patches.

List each test case in a different line to make it easier to extend the
test with more test cases.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:44:10 -08:00
Ido Schimmel
8e815284a5 selftests: fib_nexthops: Declutter test output
Before:

 # ./fib_nexthops.sh -t ipv4_torture

IPv4 runtime torture
--------------------
TEST: IPv4 torture test                                             [ OK ]
./fib_nexthops.sh: line 213: 19376 Killed                  ipv4_del_add_loop1
./fib_nexthops.sh: line 213: 19377 Killed                  ipv4_grp_replace_loop
./fib_nexthops.sh: line 213: 19378 Killed                  ip netns exec me ping -f 172.16.101.1 > /dev/null 2>&1
./fib_nexthops.sh: line 213: 19380 Killed                  ip netns exec me ping -f 172.16.101.2 > /dev/null 2>&1
./fib_nexthops.sh: line 213: 19381 Killed                  ip netns exec me mausezahn veth1 -B 172.16.101.2 -A 172.16.1.1 -c 0 -t tcp "dp=1-1023, flags=syn" > /dev/null 2>&1

Tests passed:   1
Tests failed:   0

 # ./fib_nexthops.sh -t ipv6_torture

IPv6 runtime torture
--------------------
TEST: IPv6 torture test                                             [ OK ]
./fib_nexthops.sh: line 213: 24453 Killed                  ipv6_del_add_loop1
./fib_nexthops.sh: line 213: 24454 Killed                  ipv6_grp_replace_loop
./fib_nexthops.sh: line 213: 24456 Killed                  ip netns exec me ping -f 2001:db8:101::1 > /dev/null 2>&1
./fib_nexthops.sh: line 213: 24457 Killed                  ip netns exec me ping -f 2001:db8:101::2 > /dev/null 2>&1
./fib_nexthops.sh: line 213: 24458 Killed                  ip netns exec me mausezahn -6 veth1 -B 2001:db8:101::2 -A 2001:db8:91::1 -c 0 -t tcp "dp=1-1023, flags=syn" > /dev/null 2>&1

Tests passed:   1
Tests failed:   0

After:

 # ./fib_nexthops.sh -t ipv4_torture

IPv4 runtime torture
--------------------
TEST: IPv4 torture test                                             [ OK ]

Tests passed:   1
Tests failed:   0

 # ./fib_nexthops.sh -t ipv6_torture

IPv6 runtime torture
--------------------
TEST: IPv6 torture test                                             [ OK ]

Tests passed:   1
Tests failed:   0

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:44:10 -08:00
Ido Schimmel
c6385c0b67 netdevsim: Allow reporting activity on nexthop buckets
A key component of the resilient hashing algorithm is the hash buckets'
activity. If a bucket is active, it will not be populated with a new
nexthop in order not to break existing flows. Therefore, in order to
easily and thoroughly test the algorithm, we need to be in full control
over the reported activity.

Add a debugfs interface that allows user space to have netdevsim report
a nexthop bucket within a resilient nexthop group as active. For
example:

 # echo 10 23 > /sys/kernel/debug/netdevsim/netdevsim10/fib/nexthop_bucket_activity

Will mark bucket 23 in nexthop group 10 as active.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:44:10 -08:00
Ido Schimmel
d8eaa4faca netdevsim: Add support for resilient nexthop groups
Allow resilient nexthop groups to be programmed and account their
occupancy according to their number of buckets. The nexthop group itself
as well as its buckets are marked with hardware flags (i.e.,
'RTNH_F_TRAP').

Replacement of a single nexthop bucket can fail using the following
debugfs knob:

 # cat /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_nexthop_bucket_replace
 N
 # echo 1 > /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_nexthop_bucket_replace
 # cat /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_nexthop_bucket_replace
 Y

Replacement of a resilient nexthop group can fail using the following
debugfs knob:

 # cat /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_res_nexthop_group_replace
 N
 # echo 1 > /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_res_nexthop_group_replace
 # cat /sys/kernel/debug/netdevsim/netdevsim10/fib/fail_res_nexthop_group_replace
 Y

This enables testing of various error paths.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:44:10 -08:00
Ido Schimmel
40ff83711f netdevsim: Create a helper for setting nexthop hardware flags
Instead of calling nexthop_set_hw_flags(), call a helper. It will be
used to also set nexthop bucket flags in a subsequent patch.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:44:10 -08:00
Petr Machata
86927c9c4d netdevsim: fib: Introduce a lock to guard nexthop hashtable
Currently netdevsim relies on RTNL to maintain exclusivity in accessing the
nexthop hash table. However, bucket notification may be called without RTNL
having been held. Instead, introduce a custom lock to guard the table.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:44:10 -08:00
David S. Miller
b202923d3a Merge branch 'ptp-warnings'
Lee Jones says:

====================
Rid W=1 warnings from PTP

This set is part of a larger effort attempting to clean-up W=1
kernel builds, which are currently overwhelmingly riddled with
niggly little warnings.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:09:34 -08:00
Lee Jones
287f93ded6 ptp: ptp_p: Demote non-conformant kernel-doc headers and supply a param description
Fixes the following W=1 kernel build warning(s):

 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'control' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'event' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'addend' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'accum' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'test' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'ts_compare' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'rsystime_lo' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'rsystime_hi' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'systime_lo' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'systime_hi' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'trgt_lo' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'trgt_hi' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'asms_lo' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'asms_hi' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'amms_lo' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'amms_hi' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'ch_control' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'ch_event' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'tx_snap_lo' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'tx_snap_hi' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'rx_snap_lo' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'rx_snap_hi' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'src_uuid_lo' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'src_uuid_hi' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'can_status' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'can_snap_lo' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'can_snap_hi' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'ts_sel' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'ts_st' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'reserve1' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'stl_max_set_en' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'stl_max_set' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'reserve2' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:78: warning: Function parameter or member 'srst' not described in 'pch_ts_regs'
 drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'regs' not described in 'pch_dev'
 drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'ptp_clock' not described in 'pch_dev'
 drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'caps' not described in 'pch_dev'
 drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'exts0_enabled' not described in 'pch_dev'
 drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'exts1_enabled' not described in 'pch_dev'
 drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'mem_base' not described in 'pch_dev'
 drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'mem_size' not described in 'pch_dev'
 drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'irq' not described in 'pch_dev'
 drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'pdev' not described in 'pch_dev'
 drivers/ptp/ptp_pch.c:121: warning: Function parameter or member 'register_lock' not described in 'pch_dev'
 drivers/ptp/ptp_pch.c:128: warning: Function parameter or member 'station' not described in 'pch_params'
 drivers/ptp/ptp_pch.c:291: warning: Function parameter or member 'pdev' not described in 'pch_set_station_address'

Cc: Richard Cochran <richardcochran@gmail.com>
Cc: LAPIS SEMICONDUCTOR <tshimizu818@gmail.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:09:34 -08:00
Lee Jones
9ec04c71ab ptp: ptp_clockmatrix: Demote non-kernel-doc header to standard comment
Fixes the following W=1 kernel build warning(s):

 drivers/ptp/ptp_clockmatrix.c:1408: warning: Cannot understand  * @brief Maximum absolute value for write phase offset in picoseconds
 drivers/ptp/ptp_clockmatrix.c:1408: warning: Cannot understand  * @brief Maximum absolute value for write phase offset in picoseconds
 drivers/ptp/ptp_clockmatrix.c:1408: warning: Cannot understand  * @brief Maximum absolute value for write phase offset in picoseconds
 drivers/ptp/ptp_clockmatrix.c:1408: warning: Cannot understand  * @brief Maximum absolute value for write phase offset in picoseconds
 drivers/ptp/ptp_clockmatrix.c:1408: warning: Cannot understand  * @brief Maximum absolute value for write phase offset in picoseconds

Cc: Richard Cochran <richardcochran@gmail.com>
Cc: IDT-support-1588@lm.renesas.com
Cc: netdev@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:09:34 -08:00
Lee Jones
f90fc37f28 ptp_pch: Move 'pch_*()' prototypes to shared header
Fixes the following W=1 kernel build warning(s):

 drivers/ptp/ptp_pch.c:193:6: warning: no previous prototype for ‘pch_ch_control_write’ [-Wmissing-prototypes]
 drivers/ptp/ptp_pch.c:201:5: warning: no previous prototype for ‘pch_ch_event_read’ [-Wmissing-prototypes]
 drivers/ptp/ptp_pch.c:212:6: warning: no previous prototype for ‘pch_ch_event_write’ [-Wmissing-prototypes]
 drivers/ptp/ptp_pch.c:220:5: warning: no previous prototype for ‘pch_src_uuid_lo_read’ [-Wmissing-prototypes]
 drivers/ptp/ptp_pch.c:231:5: warning: no previous prototype for ‘pch_src_uuid_hi_read’ [-Wmissing-prototypes]
 drivers/ptp/ptp_pch.c:242:5: warning: no previous prototype for ‘pch_rx_snap_read’ [-Wmissing-prototypes]
 drivers/ptp/ptp_pch.c:259:5: warning: no previous prototype for ‘pch_tx_snap_read’ [-Wmissing-prototypes]
 drivers/ptp/ptp_pch.c:300:5: warning: no previous prototype for ‘pch_set_station_address’ [-Wmissing-prototypes]

Cc: Richard Cochran <richardcochran@gmail.com> (maintainer:PTP HARDWARE CLOCK SUPPORT)
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Flavio Suligoi <f.suligoi@asem.it>
Cc: netdev@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:09:34 -08:00
Lee Jones
257382c54e ptp_pch: Remove unused function 'pch_ch_control_read()'
Fixes the following W=1 kernel build warning(s):

 drivers/ptp/ptp_pch.c:182:5: warning: no previous prototype for ‘pch_ch_control_read’ [-Wmissing-prototypes]

Cc: Richard Cochran <richardcochran@gmail.com> (maintainer:PTP HARDWARE CLOCK SUPPORT)
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Flavio Suligoi <f.suligoi@asem.it>
Cc: netdev@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:09:34 -08:00
Rafał Miłecki
a9349f08ec net: dsa: bcm_sf2: setup BCM4908 internal crossbar
On some SoCs (e.g. BCM4908, BCM631[345]8) SF2 has an integrated
crossbar. It allows connecting its selected external ports to internal
ports. It's used by vendors to handle custom Ethernet setups.

BCM4908 has following 3x2 crossbar. On Asus GT-AC5300 rgmii is used for
connecting external BCM53134S switch. GPHY4 is usually used for WAN
port. More fancy devices use SerDes for 2.5 Gbps Ethernet.

              ┌──────────┐
SerDes ─── 0 ─┤          │
              │   3x2    ├─ 0 ─── switch port 7
 GPHY4 ─── 1 ─┤          │
              │ crossbar ├─ 1 ─── runner (accelerator)
 rgmii ─── 2 ─┤          │
              └──────────┘

Use setup data based on DT info to configure BCM4908's switch port 7.
Right now only GPHY and rgmii variants are supported. Handling SerDes
can be implemented later.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:06:37 -08:00
Rafał Miłecki
01488a0ccd net: dsa: bcm_sf2: store PHY interface/mode in port structure
It's needed later for proper switch / crossbar setup.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:06:37 -08:00
Shubhankar Kuranagatti
6ad086009f net: ipv4: route.c: Fix indentation of multi line comment.
All comment lines inside the comment block have been aligned.
Every line of comment starts with a * (uniformity in code).

Signed-off-by: Shubhankar Kuranagatti <shubhankarvk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 17:02:30 -08:00
Rafał Miłecki
12bb508bfe net: broadcom: bcm4908_enet: support TX interrupt
It appears that each DMA channel has its own interrupt and both rings
can be configured (the same way) to handle interrupts.

1. Make ring interrupts code generic (make it operate on given ring)
2. Move napi to ring (so each has its own)
3. Make IRQ handler generic (match ring against received IRQ number)
4. Add (optional) support for TX interrupt

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 16:48:38 -08:00
Rafał Miłecki
ab4dda7a8c dt-bindings: net: bcm4908-enet: add optional TX interrupt
I discovered that hardware actually supports two interrupts, one per DMA
channel (RX and TX).

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 16:48:38 -08:00
David S. Miller
26d2e0426a Merge branch 'macb-fixed-link-fixes'
Robert Hancock says:

====================
macb SGMII fixed-link fixes

Some fixes to the macb driver for use in SGMII mode with a fixed-link (such as
for chip-to-chip connectivity).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 16:44:46 -08:00
Robert Hancock
e276e5e40e net: macb: Disable PCS auto-negotiation for SGMII fixed-link mode
When using a fixed-link configuration in SGMII mode, it's not really
sensible to have auto-negotiation enabled since the link settings are
fixed by definition. In other configurations, such as an SGMII
connection to a PHY, it should generally be enabled.

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 16:44:45 -08:00
Robert Hancock
8fab174b78 net: macb: poll for fixed link state in SGMII mode
When using a fixed-link configuration with GEM in SGMII mode, such as
for a chip-to-chip interconnect, the link state was always showing as
established regardless of the actual connectivity state. We can monitor
the pcs_link_state bit in the Network Status register to determine
whether the PCS link state is actually up.

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 16:44:45 -08:00
David S. Miller
c232f81b0a mlx5-updates-2021-03-12
1) TC support for ICMP parameters
 2) TC connection tracking with mirroring
 3) A round of trivial fixups and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmBL+V4ACgkQSD+KveBX
 +j68tgf+Lzq6QkpGl4dpg87pjuKIwyykGVJ0NW/Yig+bNe9V3pbFCv785xymD16N
 ALcLKNtxUovE6p+Gy6JgzgBN7V9ZsM6bn564487ycYIqewz9c2KIQneJdGEUUDgi
 S8NkuCaLQst36Wl8gdiygApZmw8TAafOQYYzKOZ7HdZ+aRH+fIwbVABYreAYwjGa
 2iz5yu1AEzG5/10bTEx69rQHY+309EPy4N4na/jo/tXRL3fWQSQz6hzEnoi3EjAU
 YsktBB72eBX/NogqLIHpSzzjPgm5IFRQbNSHyL71TH9KEr0KVpzJd9VCbaic6goL
 gQn0TqoE74X7r4ZPWP3AHcGOL9L5Rg==
 =L2PO
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2021-03-12' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-03-12

1) TC support for ICMP parameters
2) TC connection tracking with mirroring
3) A round of trivial fixups and cleanups
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-12 16:36:07 -08:00