1236928 Commits

Author SHA1 Message Date
David S. Miller
8179cc4764 Merge branch 'mptcp-mib-counters'
Matthieu Baerts says:

====================
mptcp: add CurrEstab MIB counter

This MIB counter is similar to the one of TCP -- CurrEstab -- available
in /proc/net/snmp. This is useful to quickly list the number of MPTCP
connections without having to iterate over all of them.

Patch 1 prepares its support by adding new helper functions:

 - MPTCP_DEC_STATS(): similar to MPTCP_INC_STATS(), but this time to
   decrement a counter.

 - mptcp_set_state(): similar to tcp_set_state(), to change the state of
   an MPTCP socket, and to inc/decrement the new counter when needed.

Patch 2 uses mptcp_set_state() instead of directly calling
inet_sk_state_store() to change the state of MPTCP sockets.

Patch 3 and 4 validate the new feature in MPTCP "join" and "diag"
selftests.
====================

Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02 13:33:58 +00:00
Geliang Tang
81ab772819 selftests: mptcp: diag: check CURRESTAB counters
This patch adds a new helper chk_msk_cestab() to check the current
established connections counter MIB_CURRESTAB in diag.sh. Invoke it
to check the counter during the connection after every chk_msk_inuse().

Signed-off-by: Geliang Tang <geliang.tang@linux.dev>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02 13:33:33 +00:00
Geliang Tang
0bd962dd86 selftests: mptcp: join: check CURRESTAB counters
This patch adds a new helper chk_cestab_nr() to check the current
established connections counter MIB_CURRESTAB. Set the newly added
variables cestab_ns1 and cestab_ns2 to indicate how many connections
are expected in ns1 or ns2.

Invoke check_cestab() to check the counter during the connection in
do_transfer() and invoke chk_cestab_nr() to re-check it when the
connection closed. These checks are embedded in add_tests().

Signed-off-by: Geliang Tang <geliang.tang@linux.dev>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02 13:33:33 +00:00
Geliang Tang
c693a85164 mptcp: use mptcp_set_state
This patch replaces all the 'inet_sk_state_store()' calls under net/mptcp
with the new helper mptcp_set_state().

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/460
Signed-off-by: Geliang Tang <geliang.tang@linux.dev>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02 13:33:33 +00:00
Geliang Tang
d9cd27b8cd mptcp: add CurrEstab MIB counter support
Add a new MIB counter named MPTCP_MIB_CURRESTAB to count current
established MPTCP connections, similar to TCP_MIB_CURRESTAB. This is
useful to quickly list the number of MPTCP connections without having to
iterate over all of them.

This patch adds a new helper function mptcp_set_state(): if the state
switches from or to ESTABLISHED state, this newly added counter is
incremented. This helper is going to be used in the following patch.

Similar to MPTCP_INC_STATS(), a new helper called MPTCP_DEC_STATS() is
also needed to decrement a MIB counter.

Signed-off-by: Geliang Tang <geliang.tang@linux.dev>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02 13:32:45 +00:00
David S. Miller
42a7889a19 Merge branch 'selftests-tcp-ao'
Dmitry Safonov says:

====================
selftest/net: Some more TCP-AO selftest post-merge fixups

Note that there's another post-merge fix for TCP-AO selftests, but that
doesn't conflict with these, so I don't resend that:

https://lore.kernel.org/all/20231219-b4-tcp-ao-selftests-out-of-tree-v1-1-0fff92d26eac@arista.com/T/#u
====================

Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
2024-01-02 13:27:48 +00:00
Dmitry Safonov
80057b2080 selftest/tcp-ao: Work on namespace-ified sysctl_optmem_max
Since commit f5769faeec36 ("net: Namespace-ify sysctl_optmem_max")
optmem_max is per-netns, so need of switching to root namespace.
It seems trivial to keep the old logic working, so going to keep it for
a while (at least, until kernel with netns-optmem_max will be release).

Currently, there is a test that checks that optmem_max limit applies to
TCP-AO keys and a little benchmark that measures linked-list TCP-AO keys
scaling, those are fixed by this.

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02 13:27:48 +00:00
Dmitry Safonov
72cd9f8d5a selftest/tcp-ao: Set routes in a proper VRF table id
In unsigned-md5 selftests ip_route_add() is not needed in
client_add_ip(): the route was pre-setup in __test_init() => link_init()
for subnet, rather than a specific ip-address.

Currently, __ip_route_add() mistakenly always sets VRF table
to RT_TABLE_MAIN - this seems to have sneaked in during unsigned-md5
tests debugging. That also explains, why ip_route_add_vrf() ignored
EEXIST, returned by fib6.

Yet, keep EEXIST ignoring in bench-lookups selftests as it's expected
that those selftests may add the same (duplicate) routes.

Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02 13:27:48 +00:00
David S. Miller
a27359abc8 wireless-next patches for v6.8
The third "new features" pull request for v6.8. This is a smaller one
 to clear up our tree before the break and nothing really noteworthy
 this time.
 
 Major changes:
 
 stack
 
 * cfg80211: introduce cfg80211_ssid_eq() for SSID matching
 
 * cfg80211: support P2P operation on DFS channels
 
 * mac80211: allow 64-bit radiotap timestamps
 
 iwlwifi
 
 * AX210: allow concurrent P2P operation on DFS channels
 -----BEGIN PGP SIGNATURE-----
 
 iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmWFbnIRHGt2YWxvQGtl
 cm5lbC5vcmcACgkQbhckVSbrbZs5hQf/aCvvTjqeRoMkmO+ZPFMSO+YquZNCJi1M
 TP8Fce2ALKj7woPad8vdJNNStMa9k4bu2NvShMXhoYM3xOA/4o0P9yeb5OfyYkTk
 Y6JF+SoBGzABtB3m/a3i5J19F+oC+6yKN6/OY8byfK4jqZdrAprc3qXwodC5zb9n
 blC16KKlldjoj5AWe/b6Vn/LI9P7mVhZIaWxI9IaktK0eIgfsfcgIZLuuMJPq5DJ
 NjvhmK++qCcTQrJo/4TMVoWmcZZKR3XzcSs++HYRELNCwcM2q9s07R4KkV81aB0t
 RpCaCWa2KVUCrKdk3FlnG5pS7A6US5KGP4g6sSQRnq8t3IYDUbDo5Q==
 =Bte8
 -----END PGP SIGNATURE-----

Merge tag 'wireless-next-2023-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.8

The third "new features" pull request for v6.8. This is a smaller one
to clear up our tree before the break and nothing really noteworthy
this time.

Major changes:

stack

* cfg80211: introduce cfg80211_ssid_eq() for SSID matching

* cfg80211: support P2P operation on DFS channels

* mac80211: allow 64-bit radiotap timestamps

iwlwifi

* AX210: allow concurrent P2P operation on DFS channels
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02 12:46:10 +00:00
David S. Miller
8146c3f9e6 Merge branch 'net-tc-ipt-retire'
Jamal Hadi Salim says:

====================
net/sched: retire tc ipt action

In keeping up with my status as a hero who removes code: another one bites the
dust.
The tc ipt action was intended to run all netfilter/iptables target.
Unfortunately it has not benefitted over the years from proper updates when
netfilter changes, and for that reason it has remained rudimentary.
Pinging a bunch of people that i was aware were using this indicates that
removing it wont affect them.
Retire it to reduce maintenance efforts.
So Long, ipt, and Thanks for all the Fish.
====================

Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02 12:41:16 +00:00
Jamal Hadi Salim
6d6d80e4f6 net/sched: Remove CONFIG_NET_ACT_IPT from default configs
Now that we are retiring the IPT action.

Reviewed-by: Victor Noguiera <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02 12:41:16 +00:00
Jamal Hadi Salim
ba24ea1291 net/sched: Retire ipt action
The tc ipt action was intended to run all netfilter/iptables target.
Unfortunately it has not benefitted over the years from proper updates when
netfilter changes, and for that reason it has remained rudimentary.
Pinging a bunch of people that i was aware were using this indicates that
removing it wont affect them.
Retire it to reduce maintenance efforts. Buh-bye.

Reviewed-by: Victor Noguiera <victor@mojatatu.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02 12:41:16 +00:00
Eric Dumazet
993498e537 net-device: move gso_partial_features to net_device_read_tx
dev->gso_partial_features is read from tx fast path for GSO packets.

Move it to appropriate section to avoid a cache line miss.

Fixes: 43a71cd66b9c ("net-device: reorganize net_device fast path variables")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Coco Li <lixiaoyan@google.com>
Cc: David Ahern <dsahern@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02 12:35:42 +00:00
Andy Shevchenko
07938d774f ptp: ocp: Use DEFINE_RES_*() in place
There is no need to have an intermediate functions as DEFINE_RES_*()
macros are represented by compound literals. Just use them in place.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-02 12:34:10 +00:00
David S. Miller
9fb3dc1e9a Merge branch 'phy-listing-link_topology-tracking'
Maxime Chevallier says:

====================
Introduce PHY listing and link_topology tracking

Here's a V5 of the multi-PHY support series.

At a glance, besides some minor fixes and R'd-by from Andrew, one of the
thing this series does is remove the ASSERT_RTNL() from the
topo_add_phy/del_phy operations.

These operations will take a PHY device and put it into the list of
devices associated to a netdevice. The main thing to protect here is the
list itself, but since we use xarrays, my naive understanding of it is
that it contains its own protection scheme. There shouldn't be a need
for more locking, as the insertion/deletion paths are already hooked
into the PHY connection to a netdev, or disconnection from it.

Now for the rest of the cover :

As a remainder, this ongoing work aims ultimately at supporting complex
link topologies that involve multiplexing multiple PHYs/SFPs on a single
netdevice. As a first step, it's required that we are able to enumerate the
PHYs on a given ethernet interface.

By just doing so, we also improve already-existing use-cases, namely the
copper SFP modules support when a media-converter is used (as we have 2
PHYs on the link, but only one is referenced by net_device.phydev, which
is used on a variety of netlink commands).

The series is architectured as follows :

- The first patch adds the notion of phy_link_topology, which tracks
all PHYs attached to a netdevice.

- Patches 2, 3 and 4 adds some plumbing into SFP and phylib to be able
  to connect the dots when building the topology tree, to know which PHY
  is connected to which SFP bus, trying not to be too invasive on phylib.

- Patch 5 allows passing a PHY_INDEX to ethnl commands. I'm uncertain about
  this, as there are at least 4 netlink commands ( 5 with the one introduced
  in patch 7 ) that targets PHYs directly or indirectly, which to me makes
  it worth-it to have a generic way to pass a PHY index to commands, however
  the approach taken may be too generic.

- Patch 6 is the netlink spec update + ethtool-user.c|h autogenerated code
update (the autogenerated code triggers checkpatch warning though)

- Patch 7 introduces a new netlink command set to list PHYs on a netdevice.
It implements a custom DUMP and GET operation to allow filtered dumps,
that lists all PHYs on a given netdevice. I couldn't use most of ethnl's
plumbing though.

- Patch 8 is the netlink spec update + ethtool-user.c|h update for that
new command

- Patch 8,9,10 and 11 updates the PLCA, strset, cable-test and pse netlink
commands to use the user-provided PHY instead of net_device.phydev.

- Finally patch 12 adds some documentation for this whole work.

Examples
========

Here's a short overview of the kind of operations you can have regarding
the PHY topology. These tests were performed on a MacchiatoBin, which
has 3 interfaces :

eth0 and eth1 have the following layout:

MAC - PHY - SFP

eth2 has this more classic topology :

MAC - PHY - RJ45

finally eth3 has the following topology :

MAC - SFP

When performing a dump with all interfaces down, we don't get any
result, as no PHY has been attached to their respective net_device :

None

The following output is with eth0, eth2 and eth3 up, but no SFP module
inserted in none of the interfaces :

[{'downstream-sfp-name': 'sfp-eth0',
  'drvname': 'mv88x3310',
  'header': {'dev-index': 2, 'dev-name': 'eth0'},
  'id': 0,
  'index': 1,
  'name': 'f212a600.mdio-mii:00',
  'upstream-type': 'mac'},
 {'drvname': 'Marvell 88E1510',
  'header': {'dev-index': 4, 'dev-name': 'eth2'},
  'id': 21040593,
  'index': 1,
  'name': 'f212a200.mdio-mii:00',
  'upstream-type': 'mac'}]

And now is a dump operation with a copper SFP in the eth0 port :

[{'downstream-sfp-name': 'sfp-eth0',
  'drvname': 'mv88x3310',
  'header': {'dev-index': 2, 'dev-name': 'eth0'},
  'id': 0,
  'index': 1,
  'name': 'f212a600.mdio-mii:00',
  'upstream-type': 'mac'},
 {'drvname': 'Marvell 88E1111',
  'header': {'dev-index': 2, 'dev-name': 'eth0'},
  'id': 21040322,
  'index': 2,
  'name': 'i2c:sfp-eth0:16',
  'upstream': {'index': 1, 'sfp-name': 'sfp-eth0'},
  'upstream-type': 'phy'},
 {'drvname': 'Marvell 88E1510',
  'header': {'dev-index': 4, 'dev-name': 'eth2'},
  'id': 21040593,
  'index': 1,
  'name': 'f212a200.mdio-mii:00',
  'upstream-type': 'mac'}]

 -- Note that this shouldn't actually work as the 88x3310 PHY doesn't allow
a 1G SFP to be connected to its SFP interface, and I don't have a 10G copper SFP,
so for the sake of the demo I applied the following modification, which
of courses gives a non-functionnal link, but the PHY attach still works,
which is what I want to demonstrate :

@@ -488,7 +488,7 @@ static int mv3310_sfp_insert(void *upstream, const struct sfp_eeprom_id *id)

        if (iface != PHY_INTERFACE_MODE_10GBASER) {
                dev_err(&phydev->mdio.dev, "incompatible SFP module inserted\n");
-               return -EINVAL;
+               //return -EINVAL;
        }
        return 0;
 }

Finally an example of the filtered DUMP operation that Jakub suggested
in V1 :

[{'downstream-sfp-name': 'sfp-eth0',
  'drvname': 'mv88x3310',
  'header': {'dev-index': 2, 'dev-name': 'eth0'},
  'id': 0,
  'index': 1,
  'name': 'f212a600.mdio-mii:00',
  'upstream-type': 'mac'},
 {'drvname': 'Marvell 88E1111',
  'header': {'dev-index': 2, 'dev-name': 'eth0'},
  'id': 21040322,
  'index': 2,
  'name': 'i2c:sfp-eth0:16',
  'upstream': {'index': 1, 'sfp-name': 'sfp-eth0'},
  'upstream-type': 'phy'}]

And a classic GET operation allows querying a single PHY's info :

{'drvname': 'Marvell 88E1111',
 'header': {'dev-index': 2, 'dev-name': 'eth0'},
 'id': 21040322,
 'index': 2,
 'name': 'i2c:sfp-eth0:16',
 'upstream': {'index': 1, 'sfp-name': 'sfp-eth0'},
 'upstream-type': 'phy'}

Changed in V5:
- Removed the RTNL assertion in the topology ops
- Made the phy_topo_get_phy inline
- Fixed the PSE-PD multi-PHY support by re-adding a wrongly dropped
  check
- Fixed some typos in the documentation
- Fixed reverse xmas trees

Changes in V4:
- Dropped the RFC flag
- Made the net_device integration independent to having phylib enabled
- Removed the autogenerated ethtool-user code for the YNL specs

Changes in V3:
- Added RTNL assertions where needed
- Fixed issues in the DUMP code for PHY_GET, which crashed when running it
  twice in a row
- Added the documentation, and moved in-source docs around
- renamed link_topology to phy_link_topology

Changes in V2:
- Added the DUMP operation
- Added much more information in the reported data, to be able to reconstruct
  precisely the topology tree
- renamed phy_list to link_topology
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 18:38:57 +00:00
Maxime Chevallier
32bb4515e3 Documentation: networking: document phy_link_topology
The newly introduced phy_link_topology tracks all ethernet PHYs that are
attached to a netdevice. Document the base principle, internal and
external APIs. As the phy_link_topology is expected to be extended, this
documentation will hold any further improvements and additions made
relative to topology handling.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 18:38:57 +00:00
Maxime Chevallier
d078d48063 net: ethtool: strset: Allow querying phy stats by index
The ETH_SS_PHY_STATS command gets PHY statistics. Use the phydev pointer
from the ethnl request to allow query phy stats from each PHY on the
link.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 18:38:57 +00:00
Maxime Chevallier
fcc4b105ca net: ethtool: cable-test: Target the command to the requested PHY
Cable testing is a PHY-specific command. Instead of targeting the command
towards dev->phydev, use the request to pick the targeted PHY.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 18:38:57 +00:00
Maxime Chevallier
345237dbc1 net: ethtool: pse-pd: Target the command to the requested PHY
PSE and PD configuration is a PHY-specific command. Instead of targeting
the command towards dev->phydev, use the request to pick the targeted
PHY device.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 18:38:57 +00:00
Maxime Chevallier
7db69ec9cf net: ethtool: plca: Target the command to the requested PHY
PLCA is a PHY-specific command. Instead of targeting the command
towards dev->phydev, use the request to pick the targeted PHY.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 18:38:57 +00:00
Maxime Chevallier
95132a018f netlink: specs: add ethnl PHY_GET command set
The PHY_GET command, supporting both DUMP and GET operations, is used to
retrieve the list of PHYs connected to a netdevice, and get topology
information to know where exactly it sits on the physical link.

Add the netlink specs corresponding to that command.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 18:38:57 +00:00
Maxime Chevallier
63d5eaf35a net: ethtool: Introduce a command to list PHYs on an interface
As we have the ability to track the PHYs connected to a net_device
through the link_topology, we can expose this list to userspace. This
allows userspace to use these identifiers for phy-specific commands and
take the decision of which PHY to target by knowing the link topology.

Add PHY_GET and PHY_DUMP, which can be a filtered DUMP operation to list
devices on only one interface.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 18:38:57 +00:00
Maxime Chevallier
c29451aefc netlink: specs: add phy-index as a header parameter
Update the spec to take the newly introduced phy-index as a generic
request parameter.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 18:38:57 +00:00
Maxime Chevallier
2ab0edb505 net: ethtool: Allow passing a phy index for some commands
Some netlink commands are target towards ethernet PHYs, to control some
of their features. As there's several such commands, add the ability to
pass a PHY index in the ethnl request, which will populate the generic
ethnl_req_info with the relevant phydev when the command targets a PHY.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 18:38:57 +00:00
Maxime Chevallier
dedd702a35 net: sfp: Add helper to return the SFP bus name
Knowing the bus name is helpful when we want to expose the link topology
to userspace, add a helper to return the SFP bus name.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 18:38:56 +00:00
Maxime Chevallier
034fcc2103 net: phy: add helpers to handle sfp phy connect/disconnect
There are a few PHY drivers that can handle SFP modules through their
sfp_upstream_ops. Introduce Phylib helpers to keep track of connected
SFP PHYs in a netdevice's namespace, by adding the SFP PHY to the
upstream PHY's netdev's namespace.

By doing so, these SFP PHYs can be enumerated and exposed to users,
which will be able to use their capabilities.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 18:38:56 +00:00
Maxime Chevallier
9c5625f559 net: sfp: pass the phy_device when disconnecting an sfp module's PHY
Pass the phy_device as a parameter to the sfp upstream .disconnect_phy
operation. This is preparatory work to help track phy devices across
a net_device's link.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 18:38:56 +00:00
Maxime Chevallier
02018c544e net: phy: Introduce ethernet link topology representation
Link topologies containing multiple network PHYs attached to the same
net_device can be found when using a PHY as a media converter for use
with an SFP connector, on which an SFP transceiver containing a PHY can
be used.

With the current model, the transceiver's PHY can't be used for
operations such as cable testing, timestamping, macsec offload, etc.

The reason being that most of the logic for these configuration, coming
from either ethtool netlink or ioctls tend to use netdev->phydev, which
in multi-phy systems will reference the PHY closest to the MAC.

Introduce a numbering scheme allowing to enumerate PHY devices that
belong to any netdev, which can in turn allow userspace to take more
precise decisions with regard to each PHY's configuration.

The numbering is maintained per-netdev, in a phy_device_list.
The numbering works similarly to a netdevice's ifindex, with
identifiers that are only recycled once INT_MAX has been reached.

This prevents races that could occur between PHY listing and SFP
transceiver removal/insertion.

The identifiers are assigned at phy_attach time, as the numbering
depends on the netdevice the phy is attached to.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 18:38:56 +00:00
David S. Miller
109bf4cfe1 netfilter pull request 23-12-22
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmWFd9oACgkQ1V2XiooU
 IOTBog//auEj3LR5v2b5et6CLsuMESsCFkVko1AFdzzpu94sJ8TTkqlrW7SPSfPW
 CMNYz1EYFaRGK1GjfH0a49EUofeU/kPD/TmffmiYwuJlGyEIXp17ZIUQnGDrM6Mz
 UBHX8VrW73McJYuZJbxM4HX6Q3aFyshKy8iOPpffdFAQcCD5XtkYbW24Mzw6V4sI
 8fBz6C/iT+5Jq1wGtvM2xmGkNYuMLJxbf9qOpinRZCXQV/z5IWda/bFnv3kpJ9mR
 x2ZYXHnzFEr2gVFIhZ7G6FvMgQAkkKGVQE+n9zVZ9V78czqBk9depdeLaq/RYGyx
 8VYqJyaXHKQm7FqsBwSSxU964aQhi/zNrjF9SeJSfcRNhTmXgzdERndVWVjUPe9O
 rFLyQVoQ6JyKy5++1lL3+63cd6ZeGw8gUw8MCw9DX8rwamnNFunUsixKv4SLzQHA
 jfFgmAUq+HVWHXI4xmj+SDO8g+s7O3BpbZM09E3si6lwy4/LiQqHhV4sNgEVN/It
 hf76t/U6GRYZ0Hwy2dwQl8SZg1BxMSGFmzB7vyEZyGXaqxRZb1Jym4h9slOMKXWl
 gS5y4y0ZVTmzRCqJR+jUrMAKPCw/R964LrslDYwQpsQeP7nuUPeU5jH1VWD6kAbS
 Vc36uDRnfojoNiRcmAfuhvZIseM1+hI6UXCIsqYR6cekCFlBad8=
 =+Uvv
 -----END PGP SIGNATURE-----

Merge tag 'nf-next-23-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Pablo Neira Ayuso says:

====================
netfilter pull request 23-12-22

The following patchset contains Netfilter updates for net-next:

1) Add locking for NFT_MSG_GETSETELEM_RESET requests, to address a
   race scenario with two concurrent processes running a dump-and-reset
   which exposes negative counters to userspace, from Phil Sutter.

2) Use GFP_KERNEL in pipapo GC, from Florian Westphal.

3) Reorder nf_flowtable struct members, place the read-mostly parts
   accessed by the datapath first. From Florian Westphal.

4) Set on dead flag for NFT_MSG_NEWSET in abort path,
   from Florian Westphal.

5) Support filtering zone in ctnetlink, from Felix Huettner.

6) Bail out if user tries to redefine an existing chain with different
   type in nf_tables.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 16:15:40 +00:00
David S. Miller
240436c06c bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZYVEqQAKCRDbK58LschI
 gzH6AP9hVXLpHFTWMT0+2GK2lx69VX8zW1C0SmN7WHaxUbPN9QEAwzGnELfKk00P
 0IKRHSl5abhVMX7JOM3sSOhCILeKjQg=
 =wRLJ
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
bpf-next-for-netdev
The following pull-request contains BPF updates for your *net-next* tree.

We've added 22 non-merge commits during the last 3 day(s) which contain
a total of 23 files changed, 652 insertions(+), 431 deletions(-).

The main changes are:

1) Add verifier support for annotating user's global BPF subprogram arguments
   with few commonly requested annotations for a better developer experience,
   from Andrii Nakryiko.

   These tags are:
     - Ability to annotate a special PTR_TO_CTX argument
     - Ability to annotate a generic PTR_TO_MEM as non-NULL

2) Support BPF verifier tracking of BPF_JNE which helps cases when the compiler
   transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the like, from
   Menglong Dong.

3) Fix a warning in bpf_mem_cache's check_obj_size() as reported by LKP, from Hou Tao.

4) Re-support uid/gid options when mounting bpffs which had to be reverted with
   the prior token series revert to avoid conflicts, from Daniel Borkmann.

5) Fix a libbpf NULL pointer dereference in bpf_object__collect_prog_relos() found
   from fuzzing the library with malformed ELF files, from Mingyi Zhang.

6) Skip DWARF sections in libbpf's linker sanity check given compiler options to
   generate compressed debug sections can trigger a rejection due to misalignment,
   from Alyssa Ross.

7) Fix an unnecessary use of the comma operator in BPF verifier, from Simon Horman.

8) Fix format specifier for unsigned long values in cpustat sample, from Colin Ian King.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 14:45:21 +00:00
Luiz Angelo Daros de Luca
cff9c565e6 net: mdio: get/put device node during (un)registration
The __of_mdiobus_register() function was storing the device node in
dev.of_node without increasing its reference count. It implicitly relied
on the caller to maintain the allocated node until the mdiobus was
unregistered.

Now, __of_mdiobus_register() will acquire the node before assigning it,
and of_mdiobus_unregister_callback() will be called at the end of
mdio_unregister().

Drivers can now release the node immediately after MDIO registration.
Some of them are already doing that even before this patch.

Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-01 13:01:27 +00:00
David S. Miller
92de776d20 mlx5-updates-2023-12-20
mlx5 Socket direct support and management PF profile.
 
 Tariq Says:
 ===========
 Support Socket-Direct multi-dev netdev
 
 This series adds support for combining multiple devices (PFs) of the
 same port under one netdev instance. Passing traffic through different
 devices belonging to different NUMA sockets saves cross-numa traffic and
 allows apps running on the same netdev from different numas to still
 feel a sense of proximity to the device and achieve improved
 performance.
 
 We achieve this by grouping PFs together, and creating the netdev only
 once all group members are probed. Symmetrically, we destroy the netdev
 once any of the PFs is removed.
 
 The channels are distributed between all devices, a proper configuration
 would utilize the correct close numa when working on a certain app/cpu.
 
 We pick one device to be a primary (leader), and it fills a special
 role.  The other devices (secondaries) are disconnected from the network
 in the chip level (set to silent mode). All RX/TX traffic is steered
 through the primary to/from the secondaries.
 
 Currently, we limit the support to PFs only, and up to two devices
 (sockets).
 
 ===========
 
 Armen Says:
 ===========
 Management PF support and module integration
 
 This patch rolls out comprehensive support for the Management Physical
 Function (MGMT PF) within the mlx5 driver. It involves updating the
 mlx5 interface header to introduce necessary definitions for MGMT PF
 and adding a new management PF netdev profile, which will allow the host
 side to communicate with the embedded linux on Blue-field devices.
 
 ===========
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmWDjMMACgkQSD+KveBX
 +j5VFgf/b59UxW5f0aL08CI1XAY5cKHmsYADj2oNhujXGh/TsZAEwedIT5CUvzhH
 /hgmGLDPRTbEHjSWOXAV4m35VAO38wpTFVVbUfvPBri8bWc7OyuofDDVkt0LuU6n
 aa0lsF+BDnD5il3C1MjzsnTTThVJ9IoNEuE74MOQoYCNGpU0XANcb8aO0hEx+Xr0
 bJu9E/4e3qDHOQk+vAPY+ssoDv67gT6ED7OFfyP/v8r0eqwXBedbWEDq5Za10xdQ
 62z/9dkC8vxN7Q8KjbC6s9CA0eYf69muD4Ggeo2S6lPfO6UclAwQhYYMBjzpDEf3
 kQXdTJL0VWgPmWq0Sw8S2lyMmD7qKQ==
 =16bl
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2023-12-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-12-20

mlx5 Socket direct support and management PF profile.

Tariq Says:
===========
Support Socket-Direct multi-dev netdev

This series adds support for combining multiple devices (PFs) of the
same port under one netdev instance. Passing traffic through different
devices belonging to different NUMA sockets saves cross-numa traffic and
allows apps running on the same netdev from different numas to still
feel a sense of proximity to the device and achieve improved
performance.

We achieve this by grouping PFs together, and creating the netdev only
once all group members are probed. Symmetrically, we destroy the netdev
once any of the PFs is removed.

The channels are distributed between all devices, a proper configuration
would utilize the correct close numa when working on a certain app/cpu.

We pick one device to be a primary (leader), and it fills a special
role.  The other devices (secondaries) are disconnected from the network
in the chip level (set to silent mode). All RX/TX traffic is steered
through the primary to/from the secondaries.

Currently, we limit the support to PFs only, and up to two devices
(sockets).

===========

Armen Says:
===========
Management PF support and module integration

This patch rolls out comprehensive support for the Management Physical
Function (MGMT PF) within the mlx5 driver. It involves updating the
mlx5 interface header to introduce necessary definitions for MGMT PF
and adding a new management PF netdev profile, which will allow the host
side to communicate with the embedded linux on Blue-field devices.

===========
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-29 22:35:13 +00:00
Ido Schimmel
cd4d7263d5 genetlink: Use internal flags for multicast groups
As explained in commit e03781879a0d ("drop_monitor: Require
'CAP_SYS_ADMIN' when joining "events" group"), the "flags" field in the
multicast group structure reuses uAPI flags despite the field not being
exposed to user space. This makes it impossible to extend its use
without adding new uAPI flags, which is inappropriate for internal
kernel checks.

Solve this by adding internal flags (i.e., "GENL_MCAST_*") and convert
the existing users to use them instead of the uAPI flags.

Tested using the reproducers in commit 44ec98ea5ea9 ("psample: Require
'CAP_NET_ADMIN' when joining "packets" group") and commit e03781879a0d
("drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group").

No functional changes intended.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-29 08:43:59 +00:00
Greg Kroah-Hartman
f732ba4ac9 iucv: make iucv_bus const
Now that the driver core can properly handle constant struct bus_type,
move the iucv_bus variable to be a constant structure as well, placing
it into read-only memory which can not be modified at runtime.

Cc: Wenjia Zhang <wenjia@linux.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: linux-s390@vger.kernel.org
Cc: netdev@vger.kernel.org
Acked-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-29 07:46:38 +00:00
Jonathan Corbet
1271ca00aa ethtool: reformat kerneldoc for struct ethtool_fec_stats
The kerneldoc comment for struct ethtool_fec_stats attempts to describe the
"total" and "lanes" fields of the ethtool_fec_stat substructure in a way
leading to these warnings:

  ./include/linux/ethtool.h:424: warning: Excess struct member 'lane' description in 'ethtool_fec_stats'
  ./include/linux/ethtool.h:424: warning: Excess struct member 'total' description in 'ethtool_fec_stats'

Reformat the comment to retain the information while eliminating the
warnings.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-29 01:22:59 +00:00
Jonathan Corbet
d0c3891db2 ethtool: reformat kerneldoc for struct ethtool_link_settings
The kernel doc comments for struct ethtool_link_settings includes
documentation for three fields that were never present there, leading to
these docs-build warnings:

  ./include/uapi/linux/ethtool.h:2207: warning: Excess struct member 'supported' description in 'ethtool_link_settings'
  ./include/uapi/linux/ethtool.h:2207: warning: Excess struct member 'advertising' description in 'ethtool_link_settings'
  ./include/uapi/linux/ethtool.h:2207: warning: Excess struct member 'lp_advertising' description in 'ethtool_link_settings'

Remove the entries to make the warnings go away.  There was some
information there on how data in >link_mode_masks is formatted; move that
to the body of the comment to preserve it.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-29 01:22:09 +00:00
Jonathan Corbet
144377c340 net: sock: remove excess structure-member documentation
Remove a couple of kerneldoc entries for struct members that do not exist,
addressing these warnings:

  ./include/net/sock.h:548: warning: Excess struct member '__sk_flags_offset' description in 'sock'
  ./include/net/sock.h:548: warning: Excess struct member 'sk_padding' description in 'sock'

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-29 01:21:14 +00:00
Kevin Hao
3fb65f6bc7 net: pktgen: Use wait_event_freezable_timeout() for freezable kthread
A freezable kernel thread can enter frozen state during freezing by
either calling try_to_freeze() or using wait_event_freezable() and its
variants. So for the following snippet of code in a kernel thread loop:
  wait_event_interruptible_timeout();
  try_to_freeze();

We can change it to a simple wait_event_freezable_timeout() and then
eliminate a function call.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-27 14:34:52 +00:00
David S. Miller
2f7ccf1d88 Merge branch 'net-tja11xx-macsec-support'
Radu Pirea says:

====================
Add MACsec support for TJA11XX C45 PHYs

This is the MACsec support for TJA11XX PHYs. The MACsec block encrypts
the ethernet frames on the fly and has no buffering. This operation will
grow the frames by 32 bytes. If the frames are sent back to back, the
MACsec block will not have enough room to insert the SecTAG and the ICV
and the frames will be dropped.

To mitigate this, the PHY can parse a specific ethertype with some
padding bytes and replace them with the SecTAG and ICV. These padding
bytes might be dummy or might contain information about TX SC that must
be used to encrypt the frame.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-27 13:08:10 +00:00
Radu Pirea (NXP OSS)
dc1a00380a net: phy: nxp-c45-tja11xx: implement mdo_insert_tx_tag
Implement mdo_insert_tx_tag to insert the TLV header in the ethernet
frame.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-27 13:08:10 +00:00
Radu Pirea (NXP OSS)
31a99fc06b net: phy: nxp-c45-tja11xx: add MACsec statistics
Add MACsec statistics callbacks.
The statistic registers must be set to 0 if the SC/SA is
deleted to read relevant values next time when the SC/SA is used.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-27 13:08:10 +00:00
Radu Pirea (NXP OSS)
a868b486cb net: phy: nxp-c45-tja11xx: add MACsec support
Add MACsec support.
The MACsec block has four TX SCs and four RX SCs. The driver supports up
to four SecY. Each SecY with one TX SC and one RX SC.
The RX SCs can have two keys, key A and key B, written in hardware and
enabled at the same time.
The TX SCs can have two keys written in hardware, but only one can be
active at a given time.
On TX, the SC is selected using the MAC source address. Due of this
selection mechanism, each offloaded netdev must have a unique MAC
address.
On RX, the SC is selected by SCI(found in SecTAG or calculated using MAC
SA), or using RX SC 0 as implicit.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-27 13:08:10 +00:00
Radu Pirea (NXP OSS)
a73d8779d6 net: macsec: introduce mdo_insert_tx_tag
Offloading MACsec in PHYs requires inserting the SecTAG and the ICV in
the ethernet frame. This operation will increase the frame size with up
to 32 bytes. If the frames are sent at line rate, the PHY will not have
enough room to insert the SecTAG and the ICV.

Some PHYs use a hardware buffer to store a number of ethernet frames and,
if it fills up, a pause frame is sent to the MAC to control the flow.
This HW implementation does not need any modification in the stack.

Other PHYs might offer to use a specific ethertype with some padding
bytes present in the ethernet frame. This ethertype and its associated
bytes will be replaced by the SecTAG and ICV.

mdo_insert_tx_tag allows the PHY drivers to add any specific tag in the
skb.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-27 13:08:10 +00:00
Radu Pirea (NXP OSS)
25a00d0cd6 net: macsec: revert the MAC address if mdo_upd_secy fails
Revert the MAC address if mdo_upd_secy fails. Offloaded MACsec device
might be left in an inconsistent state.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-27 13:08:10 +00:00
Radu Pirea (NXP OSS)
eb97b9bd38 net: macsec: documentation for macsec_context and macsec_ops
Add description for fields of struct macsec_context and struct
macsec_ops.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-27 13:08:09 +00:00
Radu Pirea (NXP OSS)
b1c036e835 net: macsec: move sci_to_cpu to macsec header
Move sci_to_cpu to the MACsec header to use it in drivers.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-27 13:08:09 +00:00
Radu Pirea (NXP OSS)
b34ab3527b net: macsec: use skb_ensure_writable_head_tail to expand the skb
Use skb_ensure_writable_head_tail to expand the skb if needed instead of
reimplementing a similar operation.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-27 13:08:09 +00:00
Radu Pirea (NXP OSS)
90abde49ea net: rename dsa_realloc_skb to skb_ensure_writable_head_tail
Rename dsa_realloc_skb to skb_ensure_writable_head_tail and move it to
skbuff.c to use it as helper.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-27 13:08:09 +00:00
Lin Ma
c2b2ee3625 bridge: cfm: fix enum typo in br_cc_ccm_tx_parse
It appears that there is a typo in the code where the nlattr array is
being parsed with policy br_cfm_cc_ccm_tx_policy, but the instance is
being accessed via IFLA_BRIDGE_CFM_CC_RDI_INSTANCE, which is associated
with the policy br_cfm_cc_rdi_policy.

This problem was introduced by commit 2be665c3940d ("bridge: cfm: Netlink
SET configuration Interface.").

Though it seems like a harmless typo since these two enum owns the exact
same value (1 here), it is quite misleading hence fix it by using the
correct enum IFLA_BRIDGE_CFM_CC_CCM_TX_INSTANCE here.

Signed-off-by: Lin Ma <linma@zju.edu.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-26 22:38:13 +00:00
David S. Miller
1f62f58d50 Merge branch 'mptcp-cleanups-ephemeral-port-sockopts'
Matthieu Baerts says:

====================
mptcp: cleanup and support more ephemeral ports sockopts

Patch 1 is a cleanup one: mptcp_is_tcpsk() helper was modifying sock_ops
in some cases which is unexpected with that name.

Patch 2 to 4 add support for two socket options: IP_LOCAL_PORT_RANGE and
IP_BIND_ADDRESS_NO_PORT. The first one is a preparation patch, the
second one adds the support while the last one modifies an existing
selftest to validate the new features.
====================

Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-26 22:33:22 +00:00