IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Follow-up patchset introducing XMDR implementation is going to need
to distinguish write and update ops. Therefore introduce "update op"
and call "write op" only when new FIB entry is inserted.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In case bulking is used, the entry that was previously added may not
be yet committed to the HW as it waits in the queue for bulk send. For
such entries, skip the deletion.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Prepare for the low-level ops that need to store some data alongside
the fib_entry and introduce a per-fib_entry priv for ll ops.
The priv is reference counted as in the follow-up patch it is going
to be saved in pack() function and used later on in commit() even in
case the related fib_entry gets freed in the middle.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Get the max size needed for FIB entry op context and allocate it once
for the instance. Use it repeatedly from the scheduled work.
By this, allow to extend the context to hold more data than it is wise
to do when it was on the stack. Make sure to signalize that the context
needs to be initialized in case families of subsequent FIB entries differ.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For XMDR register it is possible to carry multiple FIB entry
operations in a single write. However the FW does not restrict mixing
the types of operations, make the code easier and indicate the bulking
is ok only in case the bulk contains FIB operations of the same family
and event.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
With follow-up introduction of XM implementation, XMDR register is
going to be optionally used instead of RALUE register. Push the RALUE
packing helpers and write call into low-level router ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Unify the RALUE register payload packing and use the
__mlxsw_sp_fib_entry_ralue_pack() helper from
__mlxsw_sp_router_set_abort_trap().
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In preparation for the change that is going to be done in the next
patch, allow to pass NULL pointer to mlxsw_reg_ralue_pack4() and
mlxsw_reg_ralue_pack6() helpers.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Instead of passing destination IP as a u32 value, pass it as pointer to
u32. Avoid using local variable for the pointer store.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As the RALUE packing is going to be put into op, make the user from
IPIP code use the same helper as the router code does.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As the RALUE packing is going to be pushed into an op, in preparation
for that push the code into a separate function in the meantime.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, RALUE payload is defined locally in the function that is
calling the register write. With introduction of alternative register to
RALUE, XMDR, it has to be possible to put multiple FIB entry
operations into single register write.
So in order to prepare for that, have per-work entry operation context
and propagate it all the way down to the functions writing RALUE.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, every FIB event is queued-up as a separate work to be
processed. However, that allows to process only one FIB entry per work
callback.
In preparation of future XMDR register bulking of multiple FIB entries,
convert to FIB event queue. Implement this by a list_head, adding new
events to the end of the list in the FIB notify callback. That allows to
process multiple events from the list inside the work callback.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since the write/delete of FIB entry is going to be implemented by XMDR
register for XM implementation, introduce RALUE-independent enum for op
so the enum could be used in both RALUE and XMDR.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Don't pass RALXX register enum and rather pass enum mlxsw_sp_l3proto
to __mlxsw_sp_router_set_abort_trap(). This is in preparation to fib
entry pack implementation by XMDR register.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Improve the build testing of these SMSC drivers by enabling them when
COMPILE_TEST is selected.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/net/ethernet/smsc/smc911x.c: In function ‘smc911x_hardware_send_pkt’:
drivers/net/ethernet/smsc/smc911x.c:471:11: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
471 | cmdA = (((u32)skb->data & 0x3) << 16) |
When built on 64bit targets, the skb->data pointer cannot be cast to a
u32 in a meaningful way. Use uintptr_t instead.
Suggested-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Now that the compiler always sees the parameters passed to the DBG()
macro, it gives an error message about wrong parameters. The comment
says it all:
/* ndev is not valid yet, so avoid passing it in. */
DBG(SMC_DEBUG_FUNC, "--> %s\n", __func__);
You cannot not just pass a parameter!
The DBG does not seem to have any real value, to just remove it.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/net/ethernet/smsc/smc911x.c: In function ‘smc911x_timeout’:
drivers/net/ethernet/smsc/smc911x.c:1251:6: warning: variable ‘status’ set but not used [-Wunused-but-set-variable]
1251 | int status, mask;
The status is read in order to print it via the DBG macro. However,
due to the way DBG is disabled, the compiler never sees it being used.
Change the DBG macro to actually make use of the passed parameters,
and the leave the optimiser to remove the unwanted code inside the
while (0).
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/net/ethernet/smsc/smc911x.c: In function ‘smc911x_phy_interrupt’:
drivers/net/ethernet/smsc/smc911x.c:976:6: warning: variable ‘status’ set but not used [-Wunused-but-set-variable]
976 | int status;
A comment indicates the status needs to be read from the PHY,
otherwise bad things happen. But due to the macro magic, it is hard to
perform the read without assigning it to a variable. So add
_always_unused attribute to status to tell the compiler we don't
expect to use the value.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/net/ethernet/smsc/smc91x.c:2199: warning: Function parameter or member 'dev' not described in 'try_toggle_control_gpio'
drivers/net/ethernet/smsc/smc91x.c:2199: warning: Function parameter or member 'desc' not described in 'try_toggle_control_gpio'
drivers/net/ethernet/smsc/smc91x.c:2199: warning: Function parameter or member 'name' not described in 'try_toggle_control_gpio'
drivers/net/ethernet/smsc/smc91x.c:2199: warning: Function parameter or member 'index' not described in 'try_toggle_control_gpio'
drivers/net/ethernet/smsc/smc91x.c:2199: warning: Function parameter or member 'value' not described in 'try_toggle_control_gpio'
drivers/net/ethernet/smsc/smc91x.c:2199: warning: Function parameter or member 'nsdelay' not described in 'try_toggle_control_gpio'
Document these parameters.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/net/ethernet/smsc/smc91x.c:706:51: warning: variable ‘pkt_len’ set but not used [-Wunused-but-set-variable]
706 | unsigned int saved_packet, packet_no, tx_status, pkt_len;
The read of the packet length in the descriptor probably needs to be
kept in order to keep the hardware happy. So tell the compiler we
don't expect to use the value by using the __always_unused attribute.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To improve build testing of this driver, add COMPILE_TEST support.
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/net/ethernet//xilinx/xilinx_emaclite.c:341:35: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
341 | addr = (void __iomem __force *)((u32 __force)addr ^
Use uintptr_t instead of u32 to avoid problems on 64 bit systems.
Also, cast the address to an unsigned long for printing.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The txqueue parameter to the watchdog callback is unused in this
driver. But it still needs to be documented.
Reviewed-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
nfp_cpp_from_nfp6000_pcie() returns ERR_PTR() and never returns
NULL. The NULL test should be removed, also return correct err.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Link: https://lore.kernel.org/r/20201112145852.6580-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In function ‘strncpy’,
inlined from ‘sky2_name’ at drivers/net/ethernet/marvell/sky2.c:4903:3,
inlined from ‘sky2_probe’ at drivers/net/ethernet/marvell/sky2.c:5049:2:
./include/linux/string.h:297:30: warning: ‘__builtin_strncpy’ specified bound 16 equals destination size [-Wstringop-truncation]
None of the device names are 16 characters long, so it was never an
issue. But replace the strncpy with an snprintf() to prevent the
theoretical overflow.
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Link: https://lore.kernel.org/r/20201110023222.1479398-1-andrew@lunn.ch
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The return pointer of dwc_eth_dwmac_data's .probe isn't used, and
"probe" usually return int, so change the prototype to follow standard
way. Secondly, it can simplify the tegra_eqos_probe() code.
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Link: https://lore.kernel.org/r/20201109160440.3a736ee3@xhacker.debian
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The '!=' expression itself is bool, no need to convert it to bool.
Fix the following coccicheck warning:
./drivers/net/ethernet/aquantia/atlantic/aq_nic.c:1477:34-39: WARNING: conversion to bool not needed here
Reported-by: Tosk Robot <tencent_os_robot@tencent.com>
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Link: https://lore.kernel.org/r/1604797919-10157-1-git-send-email-kaixuxia@tencent.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Set all EHL/TGL phy_addr to -1 so that the driver will automatically
detect it at run-time by probing all the possible 32 addresses.
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com>
Link: https://lore.kernel.org/r/20201106094341.4241-1-vee.khee.wong@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix the gcc warning:
drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c:2673:9: warning: this 'for' clause does not guard... [-Wmisleading-indentation]
2673 | for (i = 0; i < n; ++i) \
Reported-by: Tosk Robot <tencent_os_robot@tencent.com>
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Link: https://lore.kernel.org/r/1604467444-23043-1-git-send-email-kaixuxia@tencent.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
MDIO spec does not require an MDC at all times, only when MDIO
transactions are occurring. This patch allows the xilinx_axienet
driver to disable the MDC when not in use, and re-enable it when
needed. It also simplifies the driver by removing MDC disable
and enable in device reset sequence.
Signed-off-by: Clayton Rayment <clayton.rayment@xilinx.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Introduce helper functions to enable/disable MDIO interface clock. This
change serves a preparatory patch for the coming feature to dynamically
control the management bus clock.
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This reverts commit 16b5f5ce351f8709a6b518cc3cbf240c378305bf
where it restructures do_reset. There are patches being tested that
would require major rework if this is committed first.
We will resend this after the other patches have been applied.
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Link: https://lore.kernel.org/r/20201106191745.1679846-1-drt@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
and netfilter subtrees.
Current release - bugs in new features:
- can: isotp: isotp_rcv_cf(): enable RX timeout handling in
listen-only mode
Previous release - regressions:
- mac80211:
- don't require VHT elements for HE on 2.4 GHz
- fix regression where EAPOL frames were sent in plaintext
- netfilter:
- ipset: Update byte and packet counters regardless of whether
they match
- ip_tunnel: fix over-mtu packet send by allowing fragmenting even
if inner packet has IP_DF (don't fragment) set in its header
(when TUNNEL_DONT_FRAGMENT flag is not set on the tunnel dev)
- net: fec: fix MDIO probing for some FEC hardware blocks
- ip6_tunnel: set inner ipproto before ip6_tnl_encap to un-break
gso support
- sctp: Fix COMM_LOST/CANT_STR_ASSOC err reporting on big-endian
platforms, sparse-related fix used the wrong integer size
Previous release - always broken:
- netfilter: use actual socket sk rather than skb sk when routing
harder
- r8169: work around short packet hw bug on RTL8125 by padding frames
- net: ethernet: ti: cpsw: disable PTPv1 hw timestamping
advertisement, the hardware does not support it
- chelsio/chtls: fix always leaking ctrl_skb and another leak caused
by a race condition
- fix drivers incorrectly writing into skbs on TX:
- cadence: force nonlinear buffers to be cloned
- gianfar: Account for Tx PTP timestamp in the skb headroom
- gianfar: Replace skb_realloc_headroom with skb_cow_head for PTP
- can: flexcan:
- remove FLEXCAN_QUIRK_DISABLE_MECR quirk for LS1021A
- add ECC initialization for VF610 and LX2160A
- flexcan_remove(): disable wakeup completely
- can: fix packet echo functionality:
- peak_canfd: fix echo management when loopback is on
- make sure skbs are not freed in IRQ context in case they need
to be dropped
- always clone the skbs to make sure they have a reference on
the socket, and prevent it from disappearing
- fix real payload length return value for RTR frames
- can: j1939: return failure on bind if netdev is down, rather than
waiting indefinitely
Misc:
- IPv6: reply ICMP error if the first fragment don't include all
headers to improve compliance with RFC 8200
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAl+kTDcACgkQMUZtbf5S
IrtC9A//f9rwNFI7sRaz9FYi6ljtWY7paPxdOxy3pWRoNzbfffjTGSPheNvy1pQb
IPaLsNwRrckQNSEPTbQqlUYcjzk1W74ffvq0sQOan4kNKxjX3uf78E6RuWARJsRC
dLqfcJctO6bFi6sEMwIFZ2tLOO5lUIA+Pd0GbjhSdObWzl3uqJ26v7wC6vVk29vS
116Mmhe8/TDVtCOzwlZnBPHqBJkTAirB+MAEX4Sp6FB9YirlcNZbWyHX5L6ejGqC
WQVjU2tPBBugeo0j72tc+y0mD3iK0aLcPL+dk0EQQYHRDMVTebl+gxNPUXCo9Out
HGe5z4e4qrR4Rx1W6MQ3pKwTYuCdwKjMRGd72JAi428/l4NN3y9W/HkI2Zuppd2l
7ifURkNQllYjGCSoHBviJbajyFBeA1nkFJgMSJiRs4T167K3zTbsyjNnfa4LnsvS
B3SrYMGqIH+oR20R9EoV8prVX+Alj1hh/jX02J8zsCcHmBqF2yZi17NarVAWoarm
v/AAqehlP+D1vjAmbCG9DeborrjaNi+v6zFTKK6ZadvLXRJX/N+wEPIpG4KjiK8W
DWKIVlee0R+kgCXE1n9AuZaZLWb7VwrAjkG1Pmfi3vkZhWeAhOW4X98ehhi/hVR/
Gq+e48ZECW5yuOA1q4hbsCYkGr2qAn/LPbsXxhEmW8qwkJHZYkI=
=5R2w
-----END PGP SIGNATURE-----
Merge tag 'net-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.10-rc3, including fixes from wireless, can, and
netfilter subtrees.
Current merge window - bugs in new features:
- can: isotp: isotp_rcv_cf(): enable RX timeout handling in
listen-only mode
Previous releases - regressions:
- mac80211:
- don't require VHT elements for HE on 2.4 GHz
- fix regression where EAPOL frames were sent in plaintext
- netfilter:
- ipset: Update byte and packet counters regardless of whether
they match
- ip_tunnel: fix over-mtu packet send by allowing fragmenting even if
inner packet has IP_DF (don't fragment) set in its header (when
TUNNEL_DONT_FRAGMENT flag is not set on the tunnel dev)
- net: fec: fix MDIO probing for some FEC hardware blocks
- ip6_tunnel: set inner ipproto before ip6_tnl_encap to un-break gso
support
- sctp: Fix COMM_LOST/CANT_STR_ASSOC err reporting on big-endian
platforms, sparse-related fix used the wrong integer size
Previous releases - always broken:
- netfilter: use actual socket sk rather than skb sk when routing
harder
- r8169: work around short packet hw bug on RTL8125 by padding frames
- net: ethernet: ti: cpsw: disable PTPv1 hw timestamping
advertisement, the hardware does not support it
- chelsio/chtls: fix always leaking ctrl_skb and another leak caused
by a race condition
- fix drivers incorrectly writing into skbs on TX:
- cadence: force nonlinear buffers to be cloned
- gianfar: Account for Tx PTP timestamp in the skb headroom
- gianfar: Replace skb_realloc_headroom with skb_cow_head for PTP
- can: flexcan:
- remove FLEXCAN_QUIRK_DISABLE_MECR quirk for LS1021A
- add ECC initialization for VF610 and LX2160A
- flexcan_remove(): disable wakeup completely
- can: fix packet echo functionality:
- peak_canfd: fix echo management when loopback is on
- make sure skbs are not freed in IRQ context in case they need to
be dropped
- always clone the skbs to make sure they have a reference on the
socket, and prevent it from disappearing
- fix real payload length return value for RTR frames
- can: j1939: return failure on bind if netdev is down, rather than
waiting indefinitely
Misc:
- IPv6: reply ICMP error if the first fragment don't include all
headers to improve compliance with RFC 8200"
* tag 'net-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits)
ionic: check port ptr before use
r8169: work around short packet hw bug on RTL8125
net: openvswitch: silence suspicious RCU usage warning
chelsio/chtls: fix always leaking ctrl_skb
chelsio/chtls: fix memory leaks caused by a race
can: flexcan: flexcan_remove(): disable wakeup completely
can: flexcan: add ECC initialization for VF610
can: flexcan: add ECC initialization for LX2160A
can: flexcan: remove FLEXCAN_QUIRK_DISABLE_MECR quirk for LS1021A
can: mcp251xfd: remove unneeded break
can: mcp251xfd: mcp251xfd_regmap_nocrc_read(): fix semicolon.cocci warnings
can: mcp251xfd: mcp251xfd_regmap_crc_read(): increase severity of CRC read error messages
can: peak_canfd: pucan_handle_can_rx(): fix echo management when loopback is on
can: peak_usb: peak_usb_get_ts_time(): fix timestamp wrapping
can: peak_usb: add range checking in decode operations
can: xilinx_can: handle failure cases of pm_runtime_get_sync
can: ti_hecc: ti_hecc_probe(): add missed clk_disable_unprepare() in error path
can: isotp: padlen(): make const array static, makes object smaller
can: isotp: isotp_rcv_cf(): enable RX timeout handling in listen-only mode
can: isotp: Explain PDU in CAN_ISOTP help text
...
This series includes updates to mlx5 software steering component.
1) Few improvements in the DR area, such as removing unneeded checks,
renaming to better general names, refactor in some places, etc.
2) Software steering (DR) Memory management improvements
This patch series contains SW Steering memory management improvements:
using buddy allocator instead of an existing bucket allocator, and
several other optimizations.
The buddy system is a memory allocation and management algorithm
that manages memory in power of two increments.
The algorithm is well-known and well-described, such as here:
https://en.wikipedia.org/wiki/Buddy_memory_allocation
Linux uses this algorithm for managing and allocating physical pages,
as described here:
https://www.kernel.org/doc/gorman/html/understand/understand009.html
In our case, although the algorithm in principal is similar to the
Linux physical page allocator, the "building blocks" and the circumstances
are different: in SW steering, buddy allocator doesn't really allocates
a memory, but rather manages ICM (Interconnect Context Memory) that was
previously allocated and registered.
The ICM memory that is used in SW steering is always power
of 2 (order), so buddy system is a good fit for this.
Patches in this series:
[PATH 4] net/mlx5: DR, Add buddy allocator utilities
This patch adds a modified implementation of a well-known buddy allocator,
adjusted for SW steering needs: the algorithm in principal is similar to
the Linux physical page allocator, but in our case buddy allocator doesn't
really allocate a memory, but rather manages ICM memory that was previously
allocated and registered.
[PATH 5] net/mlx5: DR, Handle ICM memory via buddy allocation instead of bucket management
This patch changes ICM management of SW steering to use buddy-system mechanism
Instead of the previous bucket management.
[PATH 6] net/mlx5: DR, Sync chunks only during free
This patch makes syncing happen only when freeing memory chunks.
[PATH 7] net/mlx5: DR, ICM memory pools sync optimization
This patch adds tracking of pool's "hot" memory and makes the
check whether steering sync is required much shorter and faster.
[PATH 8] net/mlx5: DR, Free buddy ICM memory if it is unused
This patch adds tracking buddy's used ICM memory,
and frees the buddy if all its memory becomes unused.
3) Misc code cleanups
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl+kW/sACgkQSD+KveBX
+j7pRwf+KH6vIkwKlt0I2cYUmUINAkz0o1E82wXGx5Q81iMJLeIeMPKiatai6/0r
BIYD8t8oagOVw5OU+H9DbVIR47wz6pB90bkQIrIJk0S3ocVcDfLlN0ssbwCdEDlH
jj56SB6jjVVj1LlTXSVfUrXEKJ2FBjTxPbNtVL/NLW0GQkoJM5RaYjK++ZB58o+O
YI1Gb2w+FT5vVHdRVxs/2a/NTy71VQOrYctkhql4/8P0SsstPvhgOD/oRU26l8Il
Qpl68H/0N9KM04SlOlD91AwaWhPEIBP1qFSgRnfStmqxAfQlAfqajOoSntK8Vkz0
0yRF7pVLELhzxx+IR/W9Vpdf7uzjeA==
=rQcg
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2020-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2020-11-03
This series includes updates to mlx5 software steering component.
1) Few improvements in the DR area, such as removing unneeded checks,
renaming to better general names, refactor in some places, etc.
2) Software steering (DR) Memory management improvements
This patch series contains SW Steering memory management improvements:
using buddy allocator instead of an existing bucket allocator, and
several other optimizations.
The buddy system is a memory allocation and management algorithm
that manages memory in power of two increments.
The algorithm is well-known and well-described, such as here:
https://en.wikipedia.org/wiki/Buddy_memory_allocation
Linux uses this algorithm for managing and allocating physical pages,
as described here:
https://www.kernel.org/doc/gorman/html/understand/understand009.html
In our case, although the algorithm in principal is similar to the
Linux physical page allocator, the "building blocks" and the circumstances
are different: in SW steering, buddy allocator doesn't really allocates
a memory, but rather manages ICM (Interconnect Context Memory) that was
previously allocated and registered.
The ICM memory that is used in SW steering is always power
of 2 (order), so buddy system is a good fit for this.
Patches in this series:
[PATH 4] net/mlx5: DR, Add buddy allocator utilities
This patch adds a modified implementation of a well-known buddy allocator,
adjusted for SW steering needs: the algorithm in principal is similar to
the Linux physical page allocator, but in our case buddy allocator doesn't
really allocate a memory, but rather manages ICM memory that was previously
allocated and registered.
[PATH 5] net/mlx5: DR, Handle ICM memory via buddy allocation instead of bucket management
This patch changes ICM management of SW steering to use buddy-system mechanism
Instead of the previous bucket management.
[PATH 6] net/mlx5: DR, Sync chunks only during free
This patch makes syncing happen only when freeing memory chunks.
[PATH 7] net/mlx5: DR, ICM memory pools sync optimization
This patch adds tracking of pool's "hot" memory and makes the
check whether steering sync is required much shorter and faster.
[PATH 8] net/mlx5: DR, Free buddy ICM memory if it is unused
This patch adds tracking buddy's used ICM memory,
and frees the buddy if all its memory becomes unused.
3) Misc code cleanups
* tag 'mlx5-updates-2020-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net: mlx5: Replace in_irq() usage
net/mlx5: Cleanup kernel-doc warnings
net/mlx4: Cleanup kernel-doc warnings
net/mlx5e: Validate stop_room size upon user input
net/mlx5: DR, Free unused buddy ICM memory
net/mlx5: DR, ICM memory pools sync optimization
net/mlx5: DR, Sync chunks only during free
net/mlx5: DR, Handle ICM memory via buddy allocation instead of buckets
net/mlx5: DR, Add buddy allocator utilities
net/mlx5: DR, Rename matcher functions to be more HW agnostic
net/mlx5: DR, Rename builders HW specific names
net/mlx5: DR, Remove unused member of action struct
====================
Link: https://lore.kernel.org/r/20201105201242.21716-1-saeedm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
mlx5_eq_async_int() uses in_irq() to decide whether eq::lock needs to be
acquired and released with spin_[un]lock() or the irq saving/restoring
variants.
The usage of in_*() in drivers is phased out and Linus clearly requested
that code which changes behaviour depending on context should either be
seperated or the context be conveyed in an argument passed by the caller,
which usually knows the context.
mlx5_eq_async_int() knows the context via the action argument already so
using it for the lock variant decision is a straight forward replacement
for in_irq().
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
$ git ls-files *.[ch] | egrep drivers/net/ethernet/mellanox/ | \
xargs scripts/kernel-doc -none
drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.h:57:
warning: Enum value 'MLX5_FPGA_ACCESS_TYPE_I2C' not described ...
drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.h:57:
warning: Enum value 'MLX5_FPGA_ACCESS_TYPE_DONTCARE' not described ...
drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.h:118:
warning: Function parameter or member 'cb_arg' not described ...
drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.h:160:
warning: Function parameter or member 'conn' not described ...
drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.h:160:
warning: Excess function parameter 'fdev' description ...
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reported-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Stop room is a space that may be taken by WQEs in the SQ during a packet
transmit. It is used to check if next packet has enough room in the SQ.
Stop room guarantees this packet can be served and if not, the queue is
stopped, so no more packets are passed to the driver until it's ready.
Currently, stop_room size is calculated and validated upon tx queues
allocation. This makes it impossible to know if user provided valid
input for certain parameters when interface is down.
Instead, store stop_room in mlx5e_sq_param and create
mlx5e_validate_params(), to validate its fields upon user input even
when the interface is down.
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Track buddy's used ICM memory, and free it if all
of the buddy's memory bacame unused.
Do this only for STEs.
MODIFY_ACTION buddies are much smaller, so in case there
is a large amount of modify_header actions, which result
in large amount of MODIFY_ACTION buddies, doing this
cleanup during sync will result in performance hit while
not freeing significant amount of memory.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Track the pool's hot ICM memory when freeing/allocating
chunk, so that when checking if the sync is required, just
check if the pool hot memory has reached the sync threshold.
Signed-off-by: Hamdan Igbaria <hamdani@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
When freeing chunks, we want to sync the steering
so that all the "hot" memory will be written to ICM
and all the chunks that are in the hot_list will be
actually destroyed.
When allocating from the pool, we don't have a need
to sync the steering, as we're not freeing anything,
and sync might just hurt the performance in terms of
flow-per-second offloaded.
Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Till now in order to manage the ICM memory we used bucket
mechanism, which kept a bucket per specified size (sizes were
between 1 block to 2^21 blocks).
Now changing that with buddy-system mechanism, which gives us much
more flexible way to manage the ICM memory.
Its biggest advantage over the bucket is by using the same ICM memory
area for all the sizes of blocks, which reduces the memory consumption.
Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add implementation of SW Steering variation of buddy allocator.
The buddy system for ICM memory uses 2 main data structures:
- Bitmap per order, that keeps the current state of allocated
blocks for this order
- Indicator for the number of available blocks per each order
Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Remove flex parser from the matcher function names since
the matcher should not be aware of such HW specific details.
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>