IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
In prevision to add new UAPI for hwtstamp we will be limited to the struct
ethtool_ts_info that is currently passed in fixed binary format through the
ETHTOOL_GET_TS_INFO ethtool ioctl. It would be good if new kernel code
already started operating on an extensible kernel variant of that
structure, similar in concept to struct kernel_hwtstamp_config vs struct
hwtstamp_config.
Since struct ethtool_ts_info is in include/uapi/linux/ethtool.h, here
we introduce the kernel-only structure in include/linux/ethtool.h.
The manual copy is then made in the function called by ETHTOOL_GET_TS_INFO.
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Acked-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20240709-feature_ptp_netnext-v17-6-b5317f50df2a@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Prepare for future support of saving hwtstamp source in PTP xarray by
introducing HWTSTAMP_SOURCE_UNSPEC to hwtstamp_source enum, setting it
to 0 to match old behavior of no source defined.
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20240709-feature_ptp_netnext-v17-5-b5317f50df2a@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Change the API to select MAC default time stamping instead of the PHY.
Indeed the PHY is closer to the wire therefore theoretically it has less
delay than the MAC timestamping but the reality is different. Due to lower
time stamping clock frequency, latency in the MDIO bus and no PHC hardware
synchronization between different PHY, the PHY PTP is often less precise
than the MAC. The exception is for PHY designed specially for PTP case but
these devices are not very widespread. For not breaking the compatibility
default_timestamp flag has been introduced in phy_device that is set by
the phy driver to know we are using the old API behavior.
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20240709-feature_ptp_netnext-v17-4-b5317f50df2a@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Timestamping software or hardware flags are often used as a group,
therefore adding these masks will ease future use.
I did not use SOF_TIMESTAMPING_SYS_HARDWARE flag as it is deprecated and
not used at all.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20240709-feature_ptp_netnext-v17-1-b5317f50df2a@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The same name is set when allocating the netdevice structure in the
alloc_etherdev_mq()->alloc_etherrdev_mqs() function. Therefore, there
is no need to manually set it.
This fixes CheckPatch warnings:
WARNING: Prefer strscpy over strcpy - see: https://github.com/KSPP/linux/issues/88
strcpy(dev->name, "eth%d");
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Link: https://patch.msgid.link/20240713170920.863171-1-olek2@wp.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pawel Dembicki says:
====================
net: dsa: vsc73xx: Implement VLAN operations
This patch series is a result of splitting a larger patch series [0],
where some parts was merged before.
The first patch implements port state configuration, which is required
for bridge functionality. STP frames are not forwarded at this moment.
BPDU frames are only forwarded from/to the PI/SI interface.
For more information, see chapter 2.7.1 (CPU Forwarding) in the
datasheet.
Patches 2, 7-9 and 11 provide a basic implementation of tag_8021q
functionality with QinQ support, without VLAN filtering in
the bridge and simple VLAN awareness in VLAN filtering mode.
Patches 3-6 came from Vladimir Oltean. They prepare for making
tag8021q more common. VSC73XX uses very similar tag recognition,
and some code from tag_sja1105 could be moved to tag_8021q for
common use.
Patch 10 is preparation for use tag_8021q bridge functions as generic
implementation of the 'ds->ops->port_bridge_*()'.
Patch 12 is required to avoid problem with learning on standalone ports.
[0] https://patchwork.kernel.org/project/netdevbpf/list/?series=841034&state=%2A&archive=both
====================
Link: https://patch.msgid.link/20240713211620.1125910-1-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch implements .port_pre_bridge_flags() and .port_bridge_flags(),
which are required for properly treating the BR_LEARNING flag. Also,
.port_stp_state_set() is tweaked and now disables learning for standalone
ports.
Disabling learning for standalone ports is required to avoid situations
where one port sees traffic originating from another, which could cause
packet drops.
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20240713211620.1125910-13-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch adds bridge support for the vsc73xx driver.
The vsc73xx requires minimal operations and ithe generic
dsa_tag_8021q_bridge_* API is sufficient.
The forwarding matrix is managed by vsc73xx_port_stp_state_set() ->
vsc73xx_refresh_fwd_map()i routine, which is called immediately after
.port_bridge_join() and .port_bridge_leave().
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Link: https://patch.msgid.link/20240713211620.1125910-12-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The 'dsa_tag_8021q_bridge_join' could be used as a generic implementation
of the 'ds->ops->port_bridge_join()' function. However, it is necessary
to synchronize their arguments.
This patch also moves the 'tx_fwd_offload' flag configuration line into
'dsa_tag_8021q_bridge_join' body. Currently, every (sja1105) driver sets
it, and the future vsc73xx implementation will also need it for
simplification.
Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20240713211620.1125910-11-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Max number of bridges in tag8021q implementation is strictly limited
by VBID size: 3 bits. But zero is reserved and only 7 values can be used.
This patch adds define which describe maximum possible value.
Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20240713211620.1125910-10-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch is a simple implementation of 802.1q tagging in the vsc73xx
driver. Currently, devices with DSA_TAG_PROTO_NONE are not functional.
The VSC73XX family doesn't provide any tag support for external Ethernet
ports.
The only option available is VLAN-based tagging, which requires constant
hardware VLAN filtering. While the VSC73XX family supports provider
bridging, it only supports QinQ without full implementation of 802.1AD.
This means it only allows the doubled 0x8100 TPID.
In the simple port mode, QinQ is enabled to preserve forwarding of
VLAN-tagged frames.
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20240713211620.1125910-9-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This commit introduces a new tagger based on 802.1q tagging.
It's designed for the vsc73xx driver. The VSC73xx family doesn't have
any tag support for the RGMII port, but it could be based on VLANs.
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patch.msgid.link/20240713211620.1125910-8-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A new tagging protocol implementation based on tag_8021q is on the
horizon, and it appears that it also has to open-code the complicated
logic of finding a source port based on a VLAN header.
Create a single dsa_tag_8021q_find_user() and make sja1105 call it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20240713211620.1125910-7-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Now that dsa_8021q_rcv() handles better the case where we don't
overwrite the precise source information if it comes from an external
(non-tag_8021q) source, we can now unify the call sequence between
sja1105_rcv() and sja1110_rcv().
This is a preparatory change for creating a higher-level wrapper for the
entire sequence which will live in tag_8021q.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20240713211620.1125910-6-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
tag_sja1105 has a wrapper over dsa_8021q_rcv(): sja1105_vlan_rcv(),
which determines whether the packet came from a bridge with
vlan_filtering=1 (the case resolved via
dsa_find_designated_bridge_port_by_vid()), or if it contains a tag_8021q
header.
Looking at a new tagger implementation for vsc73xx, based also on
tag_8021q, it is becoming clear that the logic is needed there as well.
So instead of forcing each tagger to wrap around dsa_8021q_rcv(), let's
merge the logic into the core.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Link: https://patch.msgid.link/20240713211620.1125910-5-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In both sja1105_rcv() and sja1110_rcv(), we may have precise source port
information coming from parallel hardware mechanisms, in addition to the
tag_8021q header.
Only sja1105_rcv() has extra logic to not overwrite that precise info
with what's present in the VLAN tag. This is because sja1110_rcv() gets
by, by having a reversed set of checks when assigning skb->dev. When the
source port is imprecise (vbid >=1), source_port and switch_id will be
set to zeroes by dsa_8021q_rcv(), which might be problematic. But by
checking for vbid >= 1 first, sja1110_rcv() fends that off.
We would like to make more code common between sja1105_rcv() and
sja1110_rcv(), and for that, we need to make sure that sja1110_rcv()
also goes through the precise source port preservation logic.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20240713211620.1125910-4-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch implements VLAN filtering for the vsc73xx driver.
After starting VLAN filtering, the switch is reconfigured from QinQ to
a simple VLAN aware mode. This is required because VSC73XX chips do not
support inner VLAN tag filtering.
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20240713211620.1125910-3-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This isn't a fully functional implementation of 802.1D, but
port_stp_state_set is required for a future tag8021q operations.
This implementation handles properly all states, but vsc73xx doesn't
forward STP packets.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Link: https://patch.msgid.link/20240713211620.1125910-2-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
icssg_prueth.c and icssg_prueth_sr1.c drivers use multiple common .c
files. These common objects are getting added to multiple modules. As a
result when both drivers are enabled in .config, below warning is seen.
drivers/net/ethernet/ti/Makefile: icssg/icssg_common.o is added to multiple modules: icssg-prueth icssg-prueth-sr1
drivers/net/ethernet/ti/Makefile: icssg/icssg_classifier.o is added to multiple modules: icssg-prueth icssg-prueth-sr1
drivers/net/ethernet/ti/Makefile: icssg/icssg_config.o is added to multiple modules: icssg-prueth icssg-prueth-sr1
drivers/net/ethernet/ti/Makefile: icssg/icssg_mii_cfg.o is added to multiple modules: icssg-prueth icssg-prueth-sr1
drivers/net/ethernet/ti/Makefile: icssg/icssg_stats.o is added to multiple modules: icssg-prueth icssg-prueth-sr1
drivers/net/ethernet/ti/Makefile: icssg/icssg_ethtool.o is added to multiple modules: icssg-prueth icssg-prueth-sr1
Fix this by building a new module (icssg.o) for all the common objects.
Both the driver can then depend on this common module.
Some APIs being exported have emac_ as the prefix which may result into
confusion with other existing APIs with emac_ prefix, to avoid
confusion, rename the APIs being exported with emac_ to icssg_ prefix.
This also fixes below error seen when both drivers are built.
ERROR: modpost: "icssg_queue_pop"
[drivers/net/ethernet/ti/icssg-prueth-sr1.ko] undefined!
ERROR: modpost: "icssg_queue_push"
[drivers/net/ethernet/ti/icssg-prueth-sr1.ko] undefined!
Reported-and-tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Closes: https://lore.kernel.org/oe-kbuild-all/202405182038.ncf1mL7Z-lkp@intel.com/
Fixes: 487f7323f39a ("net: ti: icssg-prueth: Add helper functions to configure FDB")
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Reviewed-by: Sai Krishna <saikrishnag@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the commit bdacf3e34945 ("net: Use nested-BH locking for
napi_alloc_cache.") was merged, the following warning began to appear:
WARNING: CPU: 5 PID: 1 at net/core/skbuff.c:1451 napi_skb_cache_put+0x82/0x4b0
__warn+0x12f/0x340
napi_skb_cache_put+0x82/0x4b0
napi_skb_cache_put+0x82/0x4b0
report_bug+0x165/0x370
handle_bug+0x3d/0x80
exc_invalid_op+0x1a/0x50
asm_exc_invalid_op+0x1a/0x20
__free_old_xmit+0x1c8/0x510
napi_skb_cache_put+0x82/0x4b0
__free_old_xmit+0x1c8/0x510
__free_old_xmit+0x1c8/0x510
__pfx___free_old_xmit+0x10/0x10
The issue arises because virtio is assuming it's running in NAPI context
even when it's not, such as in the netpoll case.
To resolve this, modify virtnet_poll_tx() to only set NAPI when budget
is available. Same for virtnet_poll_cleantx(), which always assumed that
it was in a NAPI context.
Fixes: df133f3f9625 ("virtio_net: bulk free tx skbs")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Link: https://patch.msgid.link/20240712115325.54175-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kamil Horák says:
====================
net: phy: bcm5481x: add support for BroadR-Reach mode
PATCH 1 - Add the 10baseT1BRR_Full link mode
PATCH 2 - Add the definitions of LRE registers, necessary to use
BroadR-Reach modes on the BCM5481x PHY
PATCH 3 - Add brr-mode flag to switch between IEEE802.3 and BroadR-Reach
PATCH 4 - Implementation of the BroadR-Reach modes for the Broadcom
PHYs
====================
Link: https://patch.msgid.link/20240712150709.3134474-1-kamilh@axis.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Implement single-pair BroadR-Reach modes on bcm5481x PHY by Broadcom.
Create set of functions alternative to IEEE 802.3 to handle
configuration of these modes on compatible Broadcom PHYs.
There is only subset of capabilities supported because of limited
collection of hardware available for the development.
For BroadR-Reach capable PHYs, the LRE (Long Reach Ethernet)
alternative register set is handled. Only bcm54811 PHY is verified,
for bcm54810, there is some support possible but untested. There
is no auto-negotiation of the link parameters (called LDS in the
Broadcom terminology, Long-Distance Signaling) for bcm54811.
It should be possible to enable LDS for bcm54810.
Signed-off-by: Kamil Horák (2N) <kamilh@axis.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20240712150709.3134474-5-kamilh@axis.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is a group of PHY chips supporting BroadR-Reach link modes in
a manner allowing for more or less identical register usage as standard
Clause 22 PHY.
These chips support standard Ethernet link modes as well, however, the
circuitry is mutually exclusive and cannot be auto-detected.
The link modes in question are 100Base-T1 as defined in IEEE802.3bw,
based on Broadcom's 1BR-100 link mode, and newly defined 10Base-T1BRR
(1BR-10 in Broadcom documents).
Add optional brr-mode flag to switch the PHY to BroadR-Reach mode.
Signed-off-by: Kamil Horák (2N) <kamilh@axis.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20240712150709.3134474-4-kamilh@axis.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Introduce a new link mode necessary for 10 MBit single-pair
connection in BroadR-Reach mode on bcm5481x PHY by Broadcom.
This new link mode, 10baseT1BRR, is known as 1BR10 in the Broadcom
terminology. Another link mode to be used is 1BR100 and it is already
present as 100baseT1, because Broadcom's 1BR100 became 100baseT1
(IEEE 802.3bw).
Signed-off-by: Kamil Horák (2N) <kamilh@axis.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20240712150709.3134474-2-kamilh@axis.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Xuan Zhuo says:
====================
virtio-net: support AF_XDP zero copy
v5: http://lore.kernel.org/all/20240611114147.31320-1-xuanzhuo@linux.alibaba.com
XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
copy feature of xsk (XDP socket) needs to be supported by the driver. The
performance of zero copy is very good. mlx5 and intel ixgbe already support
this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
feature.
At present, we have completed some preparation:
1. vq-reset (virtio spec and kernel code)
2. virtio-core premapped dma
3. virtio-net xdp refactor
So it is time for Virtio-Net to complete the support for the XDP Socket
Zerocopy.
Virtio-net can not increase the queue num at will, so xsk shares the queue with
kernel.
On the other hand, Virtio-Net does not support generate interrupt from driver
manually, so when we wakeup tx xmit, we used some tips. If the CPU run by TX
NAPI last time is other CPUs, use IPI to wake up NAPI on the remote CPU. If it
is also the local CPU, then we wake up napi directly.
This patch set includes some refactor to the virtio-net to let that to support
AF_XDP.
Because there are too many commits, the work of virtio net supporting af-xdp is
split to rx part and tx part. This patch set is for rx part.
So the flag NETDEV_XDP_ACT_XSK_ZEROCOPY is not added, if someone want to test
for af-xdp rx, the flag needs to be adding locally.
ENV: Qemu with vhost-user(polling mode).
Host CPU: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz
testpmd> show port stats all
######################## NIC statistics for port 0 ########################
RX-packets: 19531092064 RX-missed: 0 RX-bytes: 1093741155584
RX-errors: 0
RX-nombuf: 0
TX-packets: 5959955552 TX-errors: 0 TX-bytes: 371030645664
Throughput (since last show)
Rx-pps: 8861574 Rx-bps: 3969985208
Tx-pps: 8861493 Tx-bps: 3969962736
############################################################################
testpmd> show port stats all
######################## NIC statistics for port 0 ########################
RX-packets: 68152727 RX-missed: 0 RX-bytes: 3816552712
RX-errors: 0
RX-nombuf: 0
TX-packets: 68114967 TX-errors: 33216 TX-bytes: 3814438152
Throughput (since last show)
Rx-pps: 6333196 Rx-bps: 2837272088
Tx-pps: 6333227 Tx-bps: 2837285936
############################################################################
But AF_XDP consumes more CPU for tx and rx napi(100% and 86%).
====================
Link: https://patch.msgid.link/20240708112537.96291-1-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Support AF-XDP for merge mode.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20240708112537.96291-11-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In the process:
1. We may need to copy data to create skb for XDP_PASS.
2. We may need to call xsk_buff_free() to release the buffer.
3. The handle for xdp_buff is difference from the buffer.
If we pushed this logic into existing receive handle(merge and small),
we would have to maintain code scattered inside merge and small (and big).
So I think it is a good choice for us to put the xsk code into an
independent function.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20240708112537.96291-10-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Implement the logic of filling rq with XSK buffers.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20240708112537.96291-9-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
xsk wakeup is used to trigger the logic for xsk xmit by xsk framework or
user.
Virtio-net does not support to actively generate an interruption, so it
tries to trigger tx NAPI on the local cpu.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20240708112537.96291-8-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch implement the logic of bind/unbind xsk pool to rq.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20240708112537.96291-7-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This commit separates the function receive_mergeable(),
put the logic of appending frag to the skb as an independent function.
The subsequent commit will reuse it.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20240708112537.96291-6-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This commit separates the function receive_buf(), then we wrap the logic
of handling the skb to an independent function virtnet_receive_done().
The subsequent commit will reuse it.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20240708112537.96291-5-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch separates two sub-functions from virtnet_tx_resize():
* virtnet_tx_pause
* virtnet_tx_resume
Then the subsequent virtnet_tx_reset() can share these two functions.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20240708112537.96291-4-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch separates two sub-functions from virtnet_rx_resize():
* virtnet_rx_pause
* virtnet_rx_resume
Then the subsequent reset rx for xsk can share these two functions.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20240708112537.96291-3-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
virtio net has VIRTIO_XDP_HEADROOM that is equal to
XDP_PACKET_HEADROOM to calculate the headroom for xdp.
But here we should use the macro XDP_PACKET_HEADROOM from bpf.h to
calculate the headroom for xdp. So here we remove the
VIRTIO_XDP_HEADROOM, and use the XDP_PACKET_HEADROOM to replace it.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20240708112537.96291-2-xuanzhuo@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean says:
====================
Eliminate CONFIG_NR_CPUS dependency in dpaa-eth and enable COMPILE_TEST in fsl_qbman
Breno's previous attempt at enabling COMPILE_TEST for the fsl_qbman
driver (now included here as patch 5/5) triggered compilation warnings
for large CONFIG_NR_CPUS values:
https://lore.kernel.org/all/202406261920.l5pzM1rj-lkp@intel.com/
Patch 1/5 switches two NR_CPUS arrays in the dpaa-eth driver to dynamic
allocation to avoid that warning. There is more NR_CPUS usage in the
fsl-qbman driver, but that looks relatively harmless and I couldn't find
a good reason to change it.
I noticed, while testing, that the driver doesn't actually work properly
with high CONFIG_NR_CPUS values, and patch 2/5 addresses that.
During code analysis, I have identified two places which treat
conditions that can never happen. Patches 3/5 and 4/5 simplify the
probing code - dpaa_fq_setup() - just a little bit.
Finally we have at 5/5 the patch that triggered all of this. There is
an okay from Herbert to take it via netdev, despite it being on soc/qbman:
https://lore.kernel.org/all/Zns%2FeVVBc7pdv0yM@gondor.apana.org.au/
Link to v1:
https://lore.kernel.org/netdev/20240710230025.46487-1-vladimir.oltean@nxp.com/
====================
Link: https://patch.msgid.link/20240713225336.1746343-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As most of the drivers that depend on ARCH_LAYERSCAPE, make FSL_DPAA
depend on COMPILE_TEST for compilation and testing.
# grep -r depends.\*ARCH_LAYERSCAPE.\*COMPILE_TEST | wc -l
29
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Link: https://patch.msgid.link/20240713225336.1746343-6-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
dpaa_fq_setup() iterates through the &priv->dpaa_fq_list elements
allocated by dpaa_alloc_all_fqs(). This includes a call to:
if (!dpaa_fq_alloc(dev, 0, dpaa_max_num_txqs(), list, FQ_TYPE_TX))
goto fq_alloc_failed;
which gives us dpaa_max_num_txqs() elements of FQ_TYPE_TX type.
The code block which we are deleting runs after an earlier iteration
through &priv->dpaa_fq_list. So at the end of this iteration (for which
there is no early break), egress_cnt will be unconditionally equal to
dpaa_max_num_txqs().
In other words, dpaa_alloc_all_fqs() has already allocated TX queues for
all possible CPUs and the maximal number of traffic classes, and we've
already iterated once through them all.
The while() condition is dead code, remove it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Link: https://patch.msgid.link/20240713225336.1746343-5-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
dpaa_fq_setup() iterates through the queues allocated by dpaa_alloc_all_fqs()
and saved in &priv->dpaa_fq_list.
The allocation for FQ_TYPE_TX looks as follows:
if (!dpaa_fq_alloc(dev, 0, dpaa_max_num_txqs(), list, FQ_TYPE_TX))
goto fq_alloc_failed;
Thus, iterating again through FQ_TYPE_TX queues in dpaa_fq_setup() and
counting them will never yield an egress_cnt larger than the allocated
size, dpaa_max_num_txqs().
The comparison serves no purpose since it is always true; remove it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Link: https://patch.msgid.link/20240713225336.1746343-4-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The driver uses the DPAA_TC_TXQ_NUM and DPAA_ETH_TXQ_NUM macros for TX
queue handling, and they depend on CONFIG_NR_CPUS.
In generic .config files, these can go to very large (8096 CPUs) values
for the systems that DPAA1 is integrated in (1-24 CPUs). We allocate a
lot of resources that will never be used. Those are:
- system memory
- QMan FQIDs as managed by qman_alloc_fqid_range(). This is especially
painful since currently, when booting with CONFIG_NR_CPUS=8096, a
LS1046A-RDB system will only manage to probe 3 of its 6 interfaces.
The rest will run out of FQD ("/reserved-memory/qman-fqd" in the
device tree) and fail at the qman_create_fq() stage of the probing
process.
- netdev queues as alloc_etherdev_mq() argument. The high queue indices
are simply hidden from the network stack after the call to
netif_set_real_num_tx_queues().
With just a tiny bit more effort, we can replace the NR_CPUS
compile-time constant with the num_possible_cpus() run-time constant,
and dynamically allocate the egress_fqs[] and conf_fqs[] arrays.
Even on a system with a high CONFIG_NR_CPUS, num_possible_cpus() will
remain equal to the number of available cores on the SoC.
The replacement is as follows:
- DPAA_TC_TXQ_NUM -> dpaa_num_txqs_per_tc()
- DPAA_ETH_TXQ_NUM -> dpaa_max_num_txqs()
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Link: https://patch.msgid.link/20240713225336.1746343-3-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The dpaa-eth driver is written for PowerPC and Arm SoCs which have 1-24
CPUs. It depends on CONFIG_NR_CPUS having a reasonably small value in
Kconfig. Otherwise, there are 2 functions which allocate on-stack arrays
of NR_CPUS elements, and these can quickly explode in size, leading to
warnings such as:
drivers/net/ethernet/freescale/dpaa/dpaa_eth.c:3280:12: warning:
stack frame size (16664) exceeds limit (2048) in 'dpaa_eth_probe' [-Wframe-larger-than]
The problem is twofold:
- Reducing the array size to the boot-time num_possible_cpus() (rather
than the compile-time NR_CPUS) creates a variable-length array,
which should be avoided in the Linux kernel.
- Using NR_CPUS as an array size makes the driver blow up in stack
consumption with generic, as opposed to hand-crafted, .config files.
A simple solution is to use dynamic allocation for num_possible_cpus()
elements (aka a small number determined at runtime).
Link: https://lore.kernel.org/all/202406261920.l5pzM1rj-lkp@intel.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Link: https://patch.msgid.link/20240713225336.1746343-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH7ZpcWbFyOOp6OJbrB3Eaf9PW7cFAmaSU/QACgkQrB3Eaf9P
W7etjA/+I8bWTjMCCGFT7AXIisXWQhHbrRuaU6hpROxWUTAyjUuM4qhdXHYUyG6i
2mcg7Ppqn0etEnrvCDJqgWGPonSJuxKRMpRNiB2uRYZAKDK2X7d5gCVVK+xGyuYn
rXjAw3yQ9W6oV8lQvm7GqLYOFL5vj9UA5q8QEhyTxH11HDDRBjlHSgzgWovzGsjO
2qLHSh3wuBuuoWS6jhN5n0pA1mFiKxhzPRRvTV2Q8CEBt+JML0gGd08g0s6tSGMJ
qlEGdTHIkIGi/QsbOoRm14X5gYYrDz1EEATISZTA9/Pbb03MsQfxUp6EUZNZIM4O
/K9XO7LLXOYWXBcI3BDCHCOT1cJPw1WVvYwlwWzu4DpxelPAc+pk2/QZk9wV2cWd
MzScbhHKmZ5GnYnlfQAyOnC5tvQXUBG2OntyXMBGh9seh+H5Lcl1RJAflIwRvBx5
7cnR6HiTmLUlbBxKjSJF+xFPnTucp0J637DkY/ONtAA7qNHnOKh3LWqkIH80q/FI
7Ua0EpgTtzAzN6iR2ujMHusfAjJs4yhMGY5KFGcEHwqS2axYq+mpnaShYzNebzl6
9kOmj6UAVP0tivH2Ahmsz2HaNhZaJ3hXftZeF3zwcoN6XTc3jrQ4JuNyiDcsUdnf
ggyLMZ7VI6Jf38ep8LEnfpqQm5qFTVfto62goWWLlGgr4wsy66c=
=KyYL
-----END PGP SIGNATURE-----
Merge tag 'ipsec-next-2024-07-13' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2024-07-13
1) Support sending NAT keepalives in ESP in UDP states.
Userspace IKE daemon had to do this before, but the
kernel can better keep track of it.
From Eyal Birger.
2) Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated
ESP data paths. Currently, IPsec crypto offload is enabled for GRO
code path only. This patchset support UDP encapsulation for the non
GRO path. From Mike Yu.
* tag 'ipsec-next-2024-07-13' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next:
xfrm: Support crypto offload for outbound IPv4 UDP-encapsulated ESP packet
xfrm: Support crypto offload for inbound IPv4 UDP-encapsulated ESP packet
xfrm: Allow UDP encapsulation in crypto offload control path
xfrm: Support crypto offload for inbound IPv6 ESP packets not in GRO path
xfrm: support sending NAT keepalives in ESP in UDP states
====================
Link: https://patch.msgid.link/20240713102416.3272997-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lorenzo Bianconi says:
====================
Introduce EN7581 ethernet support
Add airoha_eth driver in order to introduce ethernet support for
Airoha EN7581 SoC available on EN7581 development board.
EN7581 mac controller is mainly composed by Frame Engine (FE) and
QoS-DMA (QDMA) modules. FE is used for traffic offloading (just basic
functionalities are supported now) while QDMA is used for DMA operation
and QOS functionalities between mac layer and the dsa switch (hw QoS is
not available yet and it will be added in the future).
Currently only hw lan features are available, hw wan will be added with
subsequent patches.
====================
Link: https://patch.msgid.link/cover.1720818878.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add airoha_eth driver in order to introduce ethernet support for
Airoha EN7581 SoC available on EN7581 development board (en7581-evb).
EN7581 mac controller is mainly composed by the Frame Engine (PSE+PPE)
and QoS-DMA (QDMA) modules. FE is used for traffic offloading (just
basic functionalities are currently supported) while QDMA is used for
DMA operations and QOS functionalities between the mac layer and the
external modules conncted to the FE GDM ports (e.g MT7530 DSA switch
or external phys).
A general overview of airoha_eth architecture is reported below:
┌───────┐ ┌───────┐
│ QDMA2 │ │ QDMA1 │
└───┬───┘ └───┬───┘
│ │
┌───────▼─────────────────────────────────────────────▼────────┐
│ │
│ P5 P0 │
│ │
│ │
│ │ ┌──────┐
│ P3 ├────► GDM3 │
│ │ └──────┘
│ │
│ │
┌─────┐ │ │
│ PPE ◄────┤ P4 PSE │
└─────┘ │ │
│ │
│ │
│ │ ┌──────┐
│ P9 ├────► GDM4 │
│ │ └──────┘
│ │
│ │
│ │
│ P2 P1 │
└─────────┬───────────────────────────────────────────┬────────┘
│ │
┌───▼──┐ ┌──▼───┐
│ GDM2 │ │ GDM1 │
└──────┘ └──┬───┘
│
┌────▼─────┐
│ MT7530 │
└──────────┘
Currently only hw LAN features (QDMA1+GDM1) are available while hw WAN
(QDMA2+GDM{2,3,4}) ones will be added with subsequent patches introducing
traffic offloading support.
Tested-by: Benjamin Larsson <benjamin.larsson@genexis.eu>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/274945d2391c195098ab180a46d0617b18b9e42c.1720818878.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
ice: Switch API optimizations
Marcin Szycik says:
Optimize the process of creating a recipe in the switch block by removing
duplicate switch ID words and changing how result indexes are fitted into
recipes. In many cases this can decrease the number of recipes required to
add a certain set of rules, potentially allowing a more varied set of rules
to be created. Total rule count will also increase, since less words will
be left unused/wasted. There are only 64 rules available in total, so every
one counts.
After this modification, many fields and some structs became unused or were
simplified, resulting in overall simpler implementation.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: Add tracepoint for adding and removing switch rules
ice: Remove unused members from switch API
ice: Optimize switch recipe creation
ice: remove unused recipe bookkeeping data
ice: Simplify bitmap setting in adding recipe
ice: Remove reading all recipes before adding a new one
ice: Remove unused struct ice_prot_lkup_ext members
====================
Link: https://patch.msgid.link/20240711181312.2019606-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-07-11 (net/intel)
This series contains updates to most Intel network drivers.
Tony removes MODULE_AUTHOR from drivers containing the entry.
Simon Horman corrects a kdoc entry for i40e.
Pawel adds implementation for devlink param "local_forwarding" on ice.
Michal removes unneeded call, and code, for eswitch rebuild for ice.
Sasha removed a no longer used field from igc.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
igc: Remove the internal 'eee_advert' field
ice: remove eswitch rebuild
ice: Add support for devlink local_forwarding param
i40e: correct i40e_addr_to_hkey() name in kdoc
net: intel: Remove MODULE_AUTHORs
====================
Link: https://patch.msgid.link/20240711201932.2019925-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
I2C v7, SMBus 3.2, and I3C 1.1.1 specifications have replaced "master/slave"
with more appropriate terms. Inspired by Wolfram's series to fix drivers/i2c/,
fix the terminology for users of I2C_ALGOBIT bitbanging interface, now that
the approved verbiage exists in the specification.
Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
Link: https://patch.msgid.link/20240711052734.1273652-5-eahariha@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>