Add test cases to verify that the bridge driver correctly marks layer 2
misses only when it should and that the flower classifier can match on
this metadata.
Example output:
# ./tc_flower_l2_miss.sh
TEST: L2 miss - Unicast [ OK ]
TEST: L2 miss - Multicast (IPv4) [ OK ]
TEST: L2 miss - Multicast (IPv6) [ OK ]
TEST: L2 miss - Link-local multicast (IPv4) [ OK ]
TEST: L2 miss - Link-local multicast (IPv6) [ OK ]
TEST: L2 miss - Broadcast [ OK ]
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add the 'fdb_miss' key element to supported key blocks and make use of
it to match on layer 2 miss.
The key is only supported on Spectrum-{2,3,4}. An error is returned for
Spectrum-1 since the key element is not present in any of its key
blocks.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, mlxsw only supports the 'ingress_ifindex' field in the
'FLOW_DISSECTOR_KEY_META' key, but subsequent patches are going to add
support for the 'l2_miss' field as well. It is valid to only match on
'l2_miss' without 'ingress_ifindex', so do not force matching on it.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, mlxsw only supports the 'ingress_ifindex' field in the
'FLOW_DISSECTOR_KEY_META' key, but subsequent patches are going to add
support for the 'l2_miss' field as well. Split the parsing of the
'ingress_ifindex' field to a separate function to avoid nesting. No
functional changes intended.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Adjust drivers that support the 'FLOW_DISSECTOR_KEY_META' key to reject
filters that try to match on the newly added layer 2 miss field. Add an
extack message to clearly communicate the failure reason to user space.
The following users were not patched:
1. mtk_flow_offload_replace(): Only checks that the key is present, but
does not do anything with it.
2. mlx5_tc_ct_set_tuple_match(): Used as part of netfilter offload,
which does not make use of the new field, unlike tc.
3. get_netdev_from_rule() in nfp: Likewise.
Example:
# tc filter add dev swp1 egress pref 1 proto all flower skip_sw l2_miss true action drop
Error: mlxsw_spectrum: Can't match on "l2_miss".
We have an error talking to the kernel
Acked-by: Elad Nachman <enachman@marvell.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add the 'TCA_FLOWER_L2_MISS' netlink attribute that allows user space to
match on packets that encountered a layer 2 miss. The miss indication is
set as metadata in the tc skb extension by the bridge driver upon FDB or
MDB lookup miss and dissected by the flow dissector to the
'FLOW_DISSECTOR_KEY_META' key.
The use of this skb extension is guarded by the 'tc_skb_ext_tc' static
key. As such, enable / disable this key when filters that match on layer
2 miss are added / deleted.
Tested:
# cat tc_skb_ext_tc.py
#!/usr/bin/env -S drgn -s vmlinux
refcount = prog["tc_skb_ext_tc"].key.enabled.counter.value_()
print(f"tc_skb_ext_tc reference count is {refcount}")
# ./tc_skb_ext_tc.py
tc_skb_ext_tc reference count is 0
# tc filter add dev swp1 egress proto all handle 101 pref 1 flower src_mac 00:11:22:33:44:55 action drop
# tc filter add dev swp1 egress proto all handle 102 pref 2 flower src_mac 00:11:22:33:44:55 l2_miss true action drop
# tc filter add dev swp1 egress proto all handle 103 pref 3 flower src_mac 00:11:22:33:44:55 l2_miss false action drop
# ./tc_skb_ext_tc.py
tc_skb_ext_tc reference count is 2
# tc filter replace dev swp1 egress proto all handle 102 pref 2 flower src_mac 00:01:02:03:04:05 l2_miss false action drop
# ./tc_skb_ext_tc.py
tc_skb_ext_tc reference count is 2
# tc filter del dev swp1 egress proto all handle 103 pref 3 flower
# tc filter del dev swp1 egress proto all handle 102 pref 2 flower
# tc filter del dev swp1 egress proto all handle 101 pref 1 flower
# ./tc_skb_ext_tc.py
tc_skb_ext_tc reference count is 0
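The reference counts observed in the test above can be reproduced with a small Python model. This is only a sketch: the class and method names are hypothetical, and it assumes that a replace installs the new filter before releasing the old one, which is why the count stays at 2 across the replace.

```python
class StaticKeyModel:
    """Toy stand-in for a reference-counted static key."""

    def __init__(self):
        self.count = 0

    def inc(self):
        self.count += 1

    def dec(self):
        assert self.count > 0, "unbalanced static key decrement"
        self.count -= 1


class FilterTable:
    """Tracks filters by handle; l2_miss=None means the filter does not match on it."""

    def __init__(self, key):
        self.key = key
        self.filters = {}

    def add(self, handle, l2_miss=None):
        if l2_miss is not None:
            self.key.inc()
        self.filters[handle] = l2_miss

    def replace(self, handle, l2_miss=None):
        # Assumption: the new filter takes its reference before the old one drops its own.
        old = self.filters.get(handle)
        if l2_miss is not None:
            self.key.inc()
        self.filters[handle] = l2_miss
        if old is not None:
            self.key.dec()

    def delete(self, handle):
        if self.filters.pop(handle) is not None:
            self.key.dec()


key = StaticKeyModel()
table = FilterTable(key)
table.add(101)                  # src_mac only, key untouched
table.add(102, l2_miss=True)
table.add(103, l2_miss=False)
print(key.count)                # 2
table.replace(102, l2_miss=False)
print(key.count)                # 2
for handle in (103, 102, 101):
    table.delete(handle)
print(key.count)                # 0
```

Note that a filter matching on "l2_miss false" still takes a reference, since matching on the cleared bit also requires the metadata to be tracked.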
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Extend the 'FLOW_DISSECTOR_KEY_META' key with a new 'l2_miss' field and
populate it from a field with the same name in the tc skb extension.
This field is set by the bridge driver for packets that incur an FDB or
MDB miss.
The next patch will extend the flower classifier to be able to match on
layer 2 misses.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For EVPN non-DF (Designated Forwarder) filtering we need to be able to
prevent decapsulated traffic from being flooded to a multi-homed host.
Filtering of multicast and broadcast traffic can be achieved using the
following flower filter:
# tc filter add dev bond0 egress pref 1 proto all flower indev vxlan0 dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 action drop
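The value/mask pair in the filter above selects only the group bit of the first octet of the destination MAC, which is set for both multicast and broadcast addresses. A minimal Python sketch of that masked comparison follows; the helper names are illustrative and are not part of tc or the kernel.

```python
def mac_to_int(mac: str) -> int:
    """Convert a colon-separated MAC address to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def masked_match(dst_mac: str, value: str, mask: str) -> bool:
    """Return True if dst_mac equals value on the bits selected by mask."""
    m = mac_to_int(mask)
    return (mac_to_int(dst_mac) & m) == (mac_to_int(value) & m)


GROUP_BIT = "01:00:00:00:00:00"

for mac in (
    "ff:ff:ff:ff:ff:ff",  # broadcast
    "01:00:5e:00:00:01",  # IPv4 multicast
    "33:33:00:00:00:01",  # IPv6 multicast
    "00:11:22:33:44:55",  # unicast
):
    print(mac, masked_match(mac, GROUP_BIT, GROUP_BIT))
```

Only the unicast address fails the match, which is why this filter cannot catch unknown unicast traffic.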
Unlike broadcast and multicast traffic, it is not currently possible to
filter unknown unicast traffic. The classification into unknown unicast
is performed by the bridge driver, but is not visible to other layers
such as tc.
Solve this by adding a new 'l2_miss' bit to the tc skb extension. Clear
the bit whenever a packet enters the bridge (received from a bridge port
or transmitted via the bridge) and set it if the packet did not match an
FDB or MDB entry. If there is no skb extension and the bit needs to be
cleared, then do not allocate one as no extension is equivalent to the
bit being cleared. The bit is not set for broadcast packets as they
never perform a lookup and therefore never incur a miss.
A bit that is set for every flooded packet would also work for the
current use case, but it does not allow us to differentiate between
registered and unregistered multicast traffic, which might be useful in
the future.
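The marking rules described above can be sketched as a small Python model. It is a simplification under stated assumptions: the names are hypothetical, the FDB and MDB are plain sets of MAC addresses, `None` stands for "no skb extension allocated", and neither the static key nor link-local handling is modeled.

```python
from dataclasses import dataclass
from typing import Optional, Set

BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass
class Packet:
    dst_mac: str
    # None models a packet without a tc skb extension.
    l2_miss: Optional[bool] = None


def bridge_forward(pkt: Packet, fdb: Set[str], mdb: Set[str]) -> str:
    # Entering the bridge clears the bit. Without an extension nothing is
    # allocated, because "no extension" already means "bit cleared".
    if pkt.l2_miss is not None:
        pkt.l2_miss = False

    # Broadcast packets never perform a lookup, so they never incur a miss.
    if pkt.dst_mac == BROADCAST:
        return "flood"

    is_multicast = int(pkt.dst_mac.split(":")[0], 16) & 1
    table = mdb if is_multicast else fdb
    if pkt.dst_mac in table:
        return "forward"

    # Lookup miss: mark the packet (this is where an extension would be added).
    pkt.l2_miss = True
    return "flood"


unknown = Packet("00:11:22:33:44:55")
print(bridge_forward(unknown, fdb=set(), mdb=set()), unknown.l2_miss)
bcast = Packet(BROADCAST)
print(bridge_forward(bcast, fdb=set(), mdb=set()), bcast.l2_miss)
```

Both packets are flooded, but only the unknown unicast one ends up with the bit set.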
To keep the performance impact to a minimum, the marking of packets is
guarded by the 'tc_skb_ext_tc' static key. When 'false', the skb is not
touched and an skb extension is not allocated. Instead, only a
5-byte nop is executed, as demonstrated below for the call site in
br_handle_frame().
Before the patch:
```
memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
c37b09: 49 c7 44 24 28 00 00 movq $0x0,0x28(%r12)
c37b10: 00 00
p = br_port_get_rcu(skb->dev);
c37b12: 49 8b 44 24 10 mov 0x10(%r12),%rax
memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
c37b17: 49 c7 44 24 30 00 00 movq $0x0,0x30(%r12)
c37b1e: 00 00
c37b20: 49 c7 44 24 38 00 00 movq $0x0,0x38(%r12)
c37b27: 00 00
```
After the patch (when static key is disabled):
```
memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
c37c29: 49 c7 44 24 28 00 00 movq $0x0,0x28(%r12)
c37c30: 00 00
c37c32: 49 8d 44 24 28 lea 0x28(%r12),%rax
c37c37: 48 c7 40 08 00 00 00 movq $0x0,0x8(%rax)
c37c3e: 00
c37c3f: 48 c7 40 10 00 00 00 movq $0x0,0x10(%rax)
c37c46: 00
#ifdef CONFIG_HAVE_JUMP_LABEL_HACK
static __always_inline bool arch_static_branch(struct static_key *key, bool branch)
{
asm_volatile_goto("1:"
c37c47: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
br_tc_skb_miss_set(skb, false);
p = br_port_get_rcu(skb->dev);
c37c4c: 49 8b 44 24 10 mov 0x10(%r12),%rax
```
Subsequent patches will extend the flower classifier to be able to match
on the new 'l2_miss' bit and enable / disable the static key when
filters that match on it are added / deleted.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko says:
====================
devlink: move port ops into separate structure
In devlink, some objects have separate ops registered alongside the
object itself. Ports, however, have their ops in the devlink_ops structure.
For drivers that register multiple kinds of ports with different ops,
this is not convenient.
This patchset makes the following changes:
1) Introduces devlink_port_ops with functions that allow devlink port
to be registered passing a pointer to driver port ops. (patch #1)
2) Converts drivers to define port_ops and register ports passing the
ops pointer. (patches #2, #3, #4, #6, #8, and #9)
3) Moves ops from devlink_ops struct to devlink_port_ops.
(patches #5, #7, #10-15)
No functional changes.
====================
Link: https://lore.kernel.org/r/20230526102841.2226553-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Now that the original ops variable has been removed, introduce it again,
but this time for devlink_port_ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move port_fn_hw_addr_get/set() from devlink_ops into the newly introduced
devlink_port_ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use the newly introduced devlink port registration function variant and
register the devlink port, passing ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use the newly introduced devlink port registration function variant and
register the devlink port, passing ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use the newly introduced devlink port registration function variant and
register the devlink port, passing ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use the newly introduced devlink port registration function variant and
register the devlink port, passing ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use the newly introduced devlink port registration function variant and
register the devlink port, passing ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use the newly introduced devlink port registration function variant and
register the devlink port, passing ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Michal Wilczynski <michal.wilczynski@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In devlink, some objects have separate ops registered alongside the
object itself. Ports, however, have their ops in the devlink_ops structure.
For drivers that register multiple kinds of ports with different ops,
this is not convenient. Introduce devlink_port_ops and a set
of functions that allow drivers to pass an ops pointer during
port registration.
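The idea of attaching an ops table to each port at registration time, rather than keeping one table for the whole device, can be illustrated with a short Python sketch. All names here are hypothetical and only loosely mirror the kernel structures; the "EOPNOTSUPP" string stands in for an error return.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class PortOps:
    """Hypothetical per-port ops table; unset callbacks are unsupported."""
    split: Optional[Callable[[int, int], str]] = None


@dataclass
class Port:
    index: int
    ops: PortOps


class Devlink:
    def __init__(self):
        self.ports: Dict[int, Port] = {}

    def port_register_with_ops(self, index: int, ops: PortOps) -> None:
        # Each port carries a pointer to its own ops table.
        self.ports[index] = Port(index, ops)

    def port_split(self, index: int, count: int) -> str:
        op = self.ports[index].ops.split
        if op is None:
            return "EOPNOTSUPP"
        return op(index, count)


def split_physical(index: int, count: int) -> str:
    return f"port {index} split into {count}"


devlink = Devlink()
devlink.port_register_with_ops(1, PortOps(split=split_physical))  # physical port
devlink.port_register_with_ops(2, PortOps())                      # port kind without split
print(devlink.port_split(1, 4))
print(devlink.port_split(2, 4))
```

With a single device-wide table, the driver callback itself would have to work out which kind of port it was called for; per-port tables move that decision to registration time.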
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
last_bdp is initialized to bdp, and neither last_bdp nor bdp is changed
afterwards, so the two are always equal. Therefore bdp can be used
directly.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230529022615.669589-1-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Check whether the first PCI read returns 0xffffffff. Currently, if this is
the case, the user sees the following misleading message:
unknown chip XID fcf, contact r8169 maintainers (see MAINTAINERS file)
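The "fcf" in that message follows directly from an all-ones register value. The sketch below assumes the XID is taken from bits 20 and up of the register and masked with 0xfcf; the function names are illustrative, not the driver's.

```python
def xid_from_txconfig(txconfig: int) -> int:
    # Assumption: XID = (register >> 20) & 0xfcf.
    return (txconfig >> 20) & 0xFCF


def probe(txconfig: int) -> str:
    # An all-ones value is what a failed PCI read typically returns,
    # so report that instead of a bogus chip ID.
    if txconfig == 0xFFFFFFFF:
        return "PCI read failed"
    return f"XID {xid_from_txconfig(txconfig):03x}"


print(f"{xid_from_txconfig(0xFFFFFFFF):03x}")  # fcf
print(probe(0xFFFFFFFF))                       # PCI read failed
```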
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/75b54d23-fefe-2bf4-7e80-c9d3bc91af11@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
There is no gpiod_export() call, so the gpiod_unexport() call looks stray.
Neither gpiod_export() nor gpiod_unexport() should be used in the code
anyway, as GPIO sysfs is deprecated. That said, simply drop the stray call.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230528142531.38602-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Parthiban Veerasooran says:
====================
microchip_t1s: Update on Microchip 10BASE-T1S PHY driver
This patch series contains the following updates:
- Fixes on the Microchip LAN8670/1/2 10BASE-T1S PHYs support in the
net/phy/microchip_t1s.c driver.
- Adds support for the Microchip LAN8650/1 Rev.B0 10BASE-T1S Internal
PHYs in the net/phy/microchip_t1s.c driver.
====================
Link: https://lore.kernel.org/r/20230526152348.70781-1-Parthiban.Veerasooran@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add support for the Microchip LAN865x Rev.B0 10BASE-T1S Internal PHYs
(LAN8650/1). The LAN865x combines a Media Access Controller (MAC) and an
internal 10BASE-T1S Ethernet PHY to access 10BASE-T1S networks. As
LAN867X and LAN865X use the same read_status function, rename it to
lan86xx_read_status.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
By default, all interrupts in the Interrupt Mask 2 Register are
disabled/masked except the Reset Complete interrupt. As the Reset Complete
status is already handled, it doesn't make sense to disable it.
Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se>
Tested-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
As per the datasheet DS-LAN8670-1-2-60001573C.pdf, the Reset Complete
status bit in the STS2 register has to be checked before proceeding to
the initial configuration. Reading the STS2 register will also clear the
Reset Complete interrupt, which is non-maskable.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se>
Tested-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
As per AN1699, the initial configuration in the driver applies to the
LAN867x Rev.B1 hardware revision. 0x0007C160 (Rev.A0) and 0x0007C161 (Rev.B0)
were never released to production and hence don't need to be supported.
Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Replace read-modify-write code in the lan867x_config_init function to
avoid handling data type mismatch and to simplify the code.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Remove LAN867X from the driver description as this driver is common for
all the Microchip 10BASE-T1S PHYs.
Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Oleksij Rempel says:
====================
Microchip DSA Driver Improvements
changes v2:
- set .max_register = U8_MAX, it should be more readable
- clarify in the RMW error handling patch, logging behavior
expectation.
I'd like to share a set of patches for the Microchip DSA driver. These
patches were chosen from a bigger set because they are simpler and
should be easier to review. The goal is to make the code easier to read,
get rid of unused code, and handle errors better.
====================
Link: https://lore.kernel.org/r/20230526073445.668430-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This update introduces specific register access boundaries for the
KSZ8873 and KSZ8863 chips within the DSA Microchip driver. The outlined
ranges target global control registers, port registers, and advanced
control registers.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This patch prepares the ksz8863_smi part of the ksz8 driver to utilize the
regmap register access validation feature.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The only place where this variable would be set to false is the
ksz8_config_cpu_port() function. But it is done in a bogus way:
	for (i = 0; i < dev->phy_port_cnt; i++) {
		if (i == dev->phy_port_cnt) <--- will never be executed.
			break;
		p->on = 1;
So, we never have a situation where p->on = 0. In this case, we can just
remove it.
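The dead branch can be checked with a direct Python transliteration of the loop. This is an illustrative sketch only; `run_loop` is a hypothetical name and the list of ports stands in for the per-port structures.

```python
def run_loop(phy_port_cnt: int):
    """Mirror the loop shape: count how often the break is hit and which ports are set on."""
    break_hits = 0
    ports_on = []
    for i in range(phy_port_cnt):      # i runs from 0 to phy_port_cnt - 1
        if i == phy_port_cnt:          # can never be true inside the loop
            break_hits += 1
            break
        ports_on.append(i)             # corresponds to p->on = 1
    return break_hits, ports_on


print(run_loop(4))  # (0, [0, 1, 2, 3])
```

Because the loop condition already guarantees `i < phy_port_cnt`, every port is switched on and the flag carries no information.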
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
It is not immediately obvious that this driver allocates, via the
KSZ_REGMAP_TABLE() macro, 3 regmaps for register access: dev->regmap[0]
for 8-bit access, dev->regmap[1] for 16-bit and dev->regmap[2] for
32-bit access.
In future changes that add support for reg_fields, each field will have
to specify through which of the 3 regmaps it's going to go. Add an enum
now, to denote one of the 3 register access widths, and make the code go
through some wrapper functions for easier review and further
modification.
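As a rough illustration of the approach, the sketch below replaces bare indices with an enum of access widths and routes all accesses through wrapper functions. The names are hypothetical (they are not the identifiers used by the driver), and the regmaps are modeled as plain dictionaries.

```python
from enum import IntEnum


class RegmapWidth(IntEnum):
    """Hypothetical names for the three register access widths."""
    W8 = 0
    W16 = 1
    W32 = 2


class Device:
    def __init__(self):
        # One register map per access width, indexed by the enum.
        self.regmap = [dict() for _ in RegmapWidth]

    def regmap_for(self, width: RegmapWidth) -> dict:
        return self.regmap[width]

    def write(self, width: RegmapWidth, reg: int, val: int) -> None:
        bits = 8 << width  # 8, 16 or 32
        self.regmap_for(width)[reg] = val & ((1 << bits) - 1)

    def read(self, width: RegmapWidth, reg: int) -> int:
        return self.regmap_for(width).get(reg, 0)


dev = Device()
dev.write(RegmapWidth.W8, 0x10, 0x1FF)   # truncated to 8 bits
dev.write(RegmapWidth.W32, 0x20, 0xDEADBEEF)
print(hex(dev.read(RegmapWidth.W8, 0x10)), hex(dev.read(RegmapWidth.W32, 0x20)))
```

A later field description can then carry a `RegmapWidth` value instead of a magic index.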
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This patch refines the error handling mechanism for 8-bit register
read-modify-write operations. In case of a failure, it now logs an error
message detailing the problematic offset. This enhancement aids in
debugging by providing more precise information when these operations
encounter issues.
Furthermore, the ksz_prmw8() function has been updated to return error
values rather than void, enabling calling functions to appropriately
respond to errors.
Additionally, in case of an error that affects both the current and
future accesses, the PHY driver will log the errors consistently, akin
to the existing behavior in all ksz_read*/ksz_write* helpers.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Donald Hunter says:
====================
netlink: specs: add ynl spec for ovs_flow
Add a ynl specification for ovs_flow. The spec is sufficient to dump ovs
flows but some attrs have been left as binary blobs because ynl doesn't
support C arrays in struct definitions yet.
Patches 1-3 add features for genetlink-legacy specs
Patch 4 is the ovs_flow netlink spec
====================
Link: https://lore.kernel.org/r/20230527133107.68161-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a ynl specification for ovs_flow. This spec is sufficient to dump ovs
flows. Some attrs are left as binary blobs because ynl doesn't support C
arrays in struct definitions yet.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Support decoding scalars as enums in struct members for genetlink-legacy
specs.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This eliminates the need for e.g. --json '{"dp-ifindex":0}' which is not
too big a deal for ovs but will get tiresome for fixed header structs that
have many members.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Make it possible to document the meaning of struct member attributes in
genetlink-legacy specs.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The "len" variable needs to be signed for the error handling to work
correctly.
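Why signedness matters here can be shown by storing a negative error code in an unsigned and a signed size type. The sketch uses Python's ctypes to emulate the C types; `failing_helper` is a hypothetical stand-in for a call that returns a negative errno.

```python
import ctypes

EFAULT = 14


def failing_helper() -> int:
    """Hypothetical helper that reports failure as a negative errno."""
    return -EFAULT


# Stored in an unsigned type, the error wraps to a huge positive value,
# so a "len < 0" check can never fire.
unsigned_len = ctypes.c_size_t(failing_helper()).value
# Stored in a signed type, the error code is preserved.
signed_len = ctypes.c_ssize_t(failing_helper()).value

print(unsigned_len < 0, signed_len < 0)  # False True
```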
Fixes: 2e910b9532 ("net: Add a function to splice pages into an skbuff for MSG_SPLICE_PAGES")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/366861a7-87c8-4bbf-9101-69dd41021d07@kili.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rather than using put_device(&mdiodev->dev), use the proper interface
provided to dispose of the mdiodev - that being mdio_device_free().
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/E1q2VsB-008QlZ-El@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King says:
====================
net: pcs: add helpers to xpcs and lynx to manage mdiodev
This morning, we have had two instances where the destruction of the
MDIO device associated with XPCS and Lynx has been wrong. Rather than
allowing this pattern of errors to continue, let's make it easier for
driver authors to get this right by adding a helper.
The changes are essentially:
1. Add two new mdio device helpers to manage the underlying struct
device reference count. Note that the existing mdio_device_free()
doesn't actually free anything, it merely puts the reference count.
2. Make the existing _create() and _destroy() PCS driver methods
increment and decrement this refcount using these helpers. This
results in no overall change, although drivers may hang on to
the mdio device for a few cycles longer.
3. Add _create_mdiodev() which creates the mdio device before calling
the existing _create() method. Once the _create() method has
returned, we put the reference count on the mdio device.
If _create() was successful, then the reference count taken there
will "hold" the mdio device for the lifetime of the PCS (in other
words, until _destroy() is called.) However, if _create() failed,
then dropping the refcount at this point will free the mdio device.
This is the exact behaviour we desire.
4. Convert users that create a mdio device and then call the PCS's
_create() method over to the new _create_mdiodev() method, and
simplify the cleanup.
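The reference counting in steps 2 and 3 can be followed with a small Python model. It is a sketch with hypothetical names; "freed" simply records that the last reference was dropped.

```python
class MdioDevice:
    def __init__(self):
        self.refs = 1          # reference held by whoever created the device
        self.freed = False

    def get(self):
        self.refs += 1

    def put(self):
        self.refs -= 1
        if self.refs == 0:
            self.freed = True


class Pcs:
    def __init__(self, mdiodev):
        mdiodev.get()          # _create() takes its own reference
        self.mdiodev = mdiodev

    def destroy(self):
        self.mdiodev.put()     # _destroy() drops it again


def pcs_create(mdiodev, fail=False):
    return None if fail else Pcs(mdiodev)


def pcs_create_mdiodev(fail=False):
    mdiodev = MdioDevice()
    pcs = pcs_create(mdiodev, fail)
    # Drop the creation reference: on success the PCS still holds the
    # device, on failure this frees it.
    mdiodev.put()
    return pcs, mdiodev


pcs, dev = pcs_create_mdiodev()
print(dev.refs, dev.freed)     # 1 False
pcs.destroy()
print(dev.refs, dev.freed)     # 0 True

failed, dev2 = pcs_create_mdiodev(fail=True)
print(failed, dev2.freed)      # None True
```

In both the success and failure paths the caller has nothing left to clean up beyond calling the destroy method on a successfully created PCS.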
We also have DPAA2 and fmem_memac that look up their PCS rather than
creating it. These could also drop their reference count on the MDIO
device immediately after calling lynx_pcs_create(), which would then
mean we wouldn't need lynx_get_mdio_device() and the associated
complexity to put the device in dpaa2_pcs_destroy() and pcs_put().
Note that DPAA2 bypasses the mdio device's abstractions by calling
put_device() directly.
====================
Link: https://lore.kernel.org/r/ZHCGZ8IgAAwr8bla@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>