1015294 Commits

Author SHA1 Message Date
Sergey Ryazanov
9ee23f48f6 wwan_hwsim: add debugfs management interface
wwan_hwsim creates and removes simulated control ports on module loading
and unloading. It would be helpful to be able to create/remove devices
and ports at run-time to trigger wwan port (un-)register actions without
module reloading.

Some simulator objects (e.g. ports) do not have the underling device and
it is not possible to fully manage the simulator via sysfs. wwan_hsim
intend for developers, so implement it as a self-contained debugfs based
management interface.

Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 14:33:43 -07:00
Sergey Ryazanov
f36a111a74 wwan_hwsim: WWAN device simulator
This driver simulates a set of WWAN device with a set of AT control
ports. It can be used to test WWAN kernel framework as well as user
space tools.

Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 14:33:43 -07:00
David S. Miller
95848099a3 Merge branch 'stmmac-25gbps'
Michael Sit Wei Hong says:

====================
Enable 2.5Gbps speed for stmmac

Intel mGbE supports 2.5Gbps link speed by overclocking the clock rate
by 2.5 times to support 2.5Gbps link speed. In this mode, the serdes/PHY
operates at a serial baud rate of 3.125 Gbps and the PCS data path and
GMII interface of the MAC operate at 312.5 MHz instead of 125 MHz.
This is configured in the BIOS during boot up. The kernel driver is not able
access to modify the clock rate for 1Gbps/2.5G mode on the fly. The way to
determine the current 1G/2.5G mode is by reading a dedicated adhoc
register through mdio bus.

Changes:
v5 -> v6
 patch 1/3
 - Check if mdio_bus_data is populated to prevent NULL pointer dereferencing
   when accesing mdio_bus_data member

v4 -> v5
 patch 1/3
 - Rebase to latest code changes after Vladimir's code is merged and fix
   build warnings

v3 -> v4
 patch 1/3
 - Rebase to latest code and Initialize 'found' to 0 to avoid build warning

 patch 2/3
 - Fix indentation issue from v3

v2 -> v3
 patch 1/3
 -New patch added to restructure the code. enabling reading the dedicated
  adhoc register to determine link speed mode.

 patch 2/3
 -Restructure for 2.5G speed to use 2500BaseX configuration as the
  PHY interface.

 patch 3/3
 -Restructure to read serdes registers to set max_speed and configure to
  use 2500BaseX in 2.5G speeds.

v1 -> v2
 patch 1/2
 -Remove MAC supported link speed masking

 patch 2/2
 -Add supported link speed masking in the PCS

iperf3 and ping for 2.5Gbps and regression test on 10M/100M/1000Mbps
is done to prevent regresson issues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 14:31:43 -07:00
Voon Weifeng
46682cb86a net: stmmac: enable Intel mGbE 2.5Gbps link speed
The Intel mGbE supports 2.5Gbps link speed by increasing the clock rate by
2.5 times of the original rate. In this mode, the serdes/PHY operates at a
serial baud rate of 3.125 Gbps and the PCS data path and GMII interface of
the MAC operate at 312.5 MHz instead of 125 MHz.

For Intel mGbE, the overclocking of 2.5 times clock rate to support 2.5G is
only able to be configured in the BIOS during boot time. Kernel driver has
no access to modify the clock rate for 1Gbps/2.5G mode. The way to
determined the current 1G/2.5G mode is by reading a dedicated adhoc
register through mdio bus. In short, after the system boot up, it is either
in 1G mode or 2.5G mode which not able to be changed on the fly.

Compared to 1G mode, the 2.5G mode selects the 2500BASEX as PHY interface and
disables the xpcs_an_inband. This is to cater for some PHYs that only
supports 2500BASEX PHY interface with no autonegotiation.

v2: remove MAC supported link speed masking
v3: Restructure  to introduce intel_speed_mode_2500() to read serdes registers
    for max speed supported and select the appropritate configuration.
    Use max_speed to determine the supported link speed mask.

Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 14:31:43 -07:00
Voon Weifeng
f27abde304 net: pcs: add 2500BASEX support for Intel mGbE controller
XPCS IP supports 2500BASEX as PHY interface. It is configured as
autonegotiation disable to cater for PHYs that does not supports 2500BASEX
autonegotiation.

v2: Add supported link speed masking.
v3: Restructure to introduce xpcs_config_2500basex() used to configure the
    xpcs for 2.5G speeds. Added 2500BASEX specific information for
    configuration.
v4: Fix indentation error

Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 14:31:43 -07:00
Voon Weifeng
597a68ce32 net: stmmac: split xPCS setup from mdio register
This patch is a preparation patch for the enabling of Intel mGbE 2.5Gbps
link speed. The Intel mGbR link speed configuration (1G/2.5G) is depends on
a mdio ADHOC register which can be configured in the bios menu.
As PHY interface might be different for 1G and 2.5G, the mdio bus need be
ready to check the link speed and select the PHY interface before probing
the xPCS.

Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 14:31:43 -07:00
David S. Miller
303597e49b This feature/cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
 
  - consistently send iface index/name in genlmsg, by Sven Eckelmann
 
  - improve broadcast queueing, by Linus Lüssing (2 patches)
 
  - add support for routable IPv4 multicast with bridged setups,
    by Linus Lüssing
 
  - remove repeated declarations, by Shaokun Zhang
 
  - fix spelling mistakes, by Zheng Yongjun
 
  - clean up hard interface handling after dropping sysfs support,
    by Sven Eckelmann (4 patches)
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAmC/izUWHHN3QHNpbW9u
 d3VuZGVybGljaC5kZQAKCRChK+OYQpKeofECEADS3YF/XZw/CpLPETGLFfEg+hlE
 jYKPTq0enAgaEbiJuiHAI4hvdPAxD3bOGUhOU/QIMyfzJ5NVRkX1hc7A04v7HAZk
 fxsKI6OUQShL0ld3pFH3lNE+gnsDsHnAF8eIL/dJRVCLekxWO3jSI0VlPasGn0zG
 /kXugSrWV7z/fGDGmDAVE47LvTq0sCsiqDVwNZH869biYpIRh7D03EAvPkBx2UY/
 StdgBC5jUED0DgJgtlmxil8Kei8gHIme1iRXnBR951vKAboyRo3bR72QkRQPoIZk
 l9lvFnaSAMcL9B/pcCZElmFVkkQY7h3nVw4vsyc7MbR5CKAnLMDJHyqXnnXAmStG
 gxR2pa54uD4hwcxax6CaTR1xhmSSFnAz8vZCaEpAyPjeU2Lt05Et1Ikv+y0Vlp3m
 te0M+aHu1leMjRtNye1nNDem7ubZ4kjhOFJM8YGfsQgcK0pZz0GayWiafheIguSW
 CNDCe52FvNzIOyjB6CeXS3iAWHckjIpBRXaBMg0uGWb7LFNXRfhg5HX02827usrS
 GOZ0+wq09RIc3vbgQ6c2JXSNGJ6ICa2NdNxosukzOQRquqEkGO8y+SlSKSSHZMbB
 mbkPJ/BnhEiTH9fkDMnPkNYYKy/6TuiFejxsmNrwBAd+MHYHJ8fBaGgNoWGRCfLY
 ygrNafDNE+Visz3wCw==
 =+IDL
 -----END PGP SIGNATURE-----

Merge tag 'batadv-next-pullrequest-20210608' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
pull request for net-next: batman-adv 2021-06-08

here is a feature/cleanup pull request of batman-adv to go into net-next.

Please pull or let me know of any problem!

This feature/cleanup patchset includes the following patches:

 - bump version strings, by Simon Wunderlich

 - consistently send iface index/name in genlmsg, by Sven Eckelmann

 - improve broadcast queueing, by Linus Lüssing (2 patches)

 - add support for routable IPv4 multicast with bridged setups,
   by Linus Lüssing

 - remove repeated declarations, by Shaokun Zhang

 - fix spelling mistakes, by Zheng Yongjun

 - clean up hard interface handling after dropping sysfs support,
   by Sven Eckelmann (4 patches)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:10:26 -07:00
Geert Uytterhoeven
7624115420 nvme: NVME_TCP_OFFLOAD should not default to m
The help text for the symbol controlling support for the NVM Express
over Fabrics TCP offload common layer suggests to not enable this
support when unsure.

Hence drop the "default m", which actually means "default y" if
CONFIG_MODULES is not enabled.

Fixes: f0e8cb6106da2703 ("nvme-tcp-offload: Add nvme-tcp-offload - NVMeTCP HW offload ULP")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:06:19 -07:00
David S. Miller
1a61fed9f7 Merge branch 'farsync-cleanups'
Peng Li says:

====================
net: farsync: clean up some code style issues

This patchset clean up some code style issues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:05 -07:00
Peng Li
f23a3da78a net: farsync: replace comparison to NULL with "fst_card_array[i]"
According to the chackpatch.pl, comparison to NULL could
be written "fst_card_array[i]".

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:05 -07:00
Peng Li
f01f906ffe net: farsync: remove redundant return
Void function return statements are not generally useful.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:05 -07:00
Peng Li
d2a1054b8b net: farsync: fix the alignment issue
Alignment should match open parenthesis.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:05 -07:00
Peng Li
ae1be3fad5 net: farsync: remove redundant parentheses
Unnecessary parentheses around 'port->hwif == X21'.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:05 -07:00
Peng Li
b64b5aee73 net: farsync: remove redundant spaces
According to the chackpatch.pl,
space prohibited between function name and open parenthesis '(',
no space is necessary after a cast.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:05 -07:00
Peng Li
fa8d10b547 net: farsync: remove redundant braces {}
This patch removes redundant braces {}, to fix the
checkpatch.pl warning:
"braces {} are not necessary for single statement blocks".

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:05 -07:00
Peng Li
37947a9be3 net: farsync: add some required spaces
Add spaces required around that '=' and '*'.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:05 -07:00
Peng Li
7619ab1618 net: farsync: fix the code style issue about macros
Macros with complex values should be enclosed in parentheses.
space required after that ',' .

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:05 -07:00
Peng Li
3a950181f6 net: farsync: code indent use tabs where possible
Code indent should use tabs where possible.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:05 -07:00
Peng Li
d70711da30 net: farsync: remove trailing whitespaces
This patch removes trailing whitespaces.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:05 -07:00
Peng Li
14b9764ccf net: farsync: fix the comments style issue
Networking block comments don't use an empty /* line,
use /* Comment...

Block comments use * on subsequent lines.
Block comments use a trailing */ on a separate line.

This patch fixes the comments style issues.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:05 -07:00
Peng Li
8ccac4a58a net: farsync: remove redundant initialization for statics
Should not initialise statics to 0.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:05 -07:00
Peng Li
40996bcfe9 net: farsync: move out assignment in if condition
Should not use assignment in if condition.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:05 -07:00
Peng Li
8ea4bfb30a net: farsync: fix the code style issue about "foo* bar"
Fix the checkpatch error as "foo * bar" should be "foo *bar".

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:04 -07:00
Peng Li
50d4c36336 net: farsync: add blank line after declarations
This patch fixes the checkpatch error about missing a blank line
after declarations.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:04 -07:00
Peng Li
34de4c85f3 net: farsync: remove redundant blank lines
This patch removes some redundant blank lines.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:04:04 -07:00
David S. Miller
5552571c65 Merge branch 'realtek-dt'
Joakim Zhang says:

====================
net: phy: add dt property for realtek phy

Add dt property for realtek phy.

---
ChangeLogs:
V1->V2:
	* store the desired PHYCR1/2 register value in "priv" rather than
	using "quirks", per Russell King suggestion, as well as can
	cover the bootloader setting.
	* change the behavior of ALDPS mode, default is disabled, add dt
	property for users to enable it.
	* fix dt binding yaml build issues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 11:41:24 -07:00
Joakim Zhang
6813cc8cfd net: phy: realtek: add delay to fix RXC generation issue
PHY will delay about 11.5ms to generate RXC clock when switching from
power down to normal operation. Read/write registers would also cause RXC
become unstable and stop for a while during this process. Realtek engineer
suggests 15ms or more delay can workaround this issue.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 11:41:24 -07:00
Joakim Zhang
d90db36a9e net: phy: realtek: add dt property to enable ALDPS mode
If enable Advance Link Down Power Saving (ALDPS) mode, it will change
crystal/clock behavior, which cause RXC clock stop for dozens to hundreds
of miliseconds. This is comfirmed by Realtek engineer. For some MACs, it
needs RXC clock to support RX logic, after this patch, PHY can generate
continuous RXC clock during auto-negotiation.

ALDPS default is disabled after hardware reset, it's more reasonable to
add a property to enable this feature, since ALDPS would introduce side effect.
This patch adds dt property "realtek,aldps-enable" to enable ALDPS mode
per users' requirement.

Jisheng Zhang enables this feature, changes the default behavior. Since
mine patch breaks the rule that new implementation should not break
existing design, so Cc'ed let him know to see if it can be accepted.

Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 11:41:24 -07:00
Joakim Zhang
0a4355c2b7 net: phy: realtek: add dt property to disable CLKOUT clock
CLKOUT is enabled by default after PHY hardware reset, this patch adds
"realtek,clkout-disable" property for user to disable CLKOUT clock
to save PHY power.

Per RTL8211F guide, a PHY reset should be issued after setting these
bits in PHYCR2 register. After this patch, CLKOUT clock output to be
disabled.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 11:41:23 -07:00
Joakim Zhang
a9f15dc2b9 dt-bindings: net: add dt binding for realtek rtl82xx phy
Add binding for realtek rtl82xx phy.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 11:41:23 -07:00
Marek Behún
d6dd33ffa3 net: Kconfig: indent with tabs instead of spaces
The BAREUDP config option uses spaces instead of tabs for indentation.
The rest of this file uses tabs. Fix this.

Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 11:34:37 -07:00
David S. Miller
dc8cf7550a Merge branch 'page_pool-recycling'
Matteo Croce says:

====================
page_pool: recycle buffers

This is a respin of [1]

This patchset shows the plans for allowing page_pool to handle and
maintain DMA map/unmap of the pages it serves to the driver. For this
to work a return hook in the network core is introduced.

The overall purpose is to simplify drivers, by providing a page
allocation API that does recycling, such that each driver doesn't have
to reinvent its own recycling scheme. Using page_pool in a driver
does not require implementing XDP support, but it makes it trivially
easy to do so. Instead of allocating buffers specifically for SKBs
we now allocate a generic buffer and either wrap it on an SKB
(via build_skb) or create an XDP frame.
The recycling code leverages the XDP recycle APIs.

The Marvell mvpp2 and mvneta drivers are used in this patchset to
demonstrate how to use the API, and tested on a MacchiatoBIN
and EspressoBIN boards respectively.

Please let this going in on a future -rc1 so to allow enough time
to have wider tests.

v7 -> v8:
- use page->lru.next instead of page->index for pfmemalloc
- remove conditional include
- rework page_pool_return_skb_page() so to have less conversions
  between page and addresses, and call compound_head() only once
- move some code from skb_free_head() to a new helper skb_pp_recycle()
- misc fixes

v6 -> v7:
- refresh patches against net-next
- remove a redundant call to virt_to_head_page()
- update mvneta benchmarks

v5 -> v6:
- preserve pfmemalloc bit when setting signature
- fix typo in mvneta
- rebase on next-next with the new cache
- don't clear the skb->pp_recycle in pskb_expand_head()

v4 -> v5:
- move the signature so it doesn't alias with page->mapping
- use an invalid pointer as magic
- incorporate Matthew Wilcox's changes for pfmemalloc pages
- move the __skb_frag_unref() changes to a preliminary patch
- refactor some cpp directives
- only attempt recycling if skb->head_frag
- clear skb->pp_recycle in pskb_expand_head()

v3 -> v4:
- store a pointer to page_pool instead of xdp_mem_info
- drop a patch which reduces xdp_mem_info size
- do the recycling in the page_pool code instead of xdp_return
- remove some unused headers include
- remove some useless forward declaration

v2 -> v3:
- added missing SOBs
- CCed the MM people

v1 -> v2:
- fix a commit message
- avoid setting pp_recycle multiple times on mvneta
- squash two patches to avoid breaking bisect
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:11:47 -07:00
Matteo Croce
e4017570da mvneta: recycle buffers
Use the new recycling API for page_pool.
In a drop rate test, the packet rate increased by 10%,
from 296 Kpps to 326 Kpps.

perf top on a stock system shows:

Overhead  Shared Object     Symbol
  23.66%  [kernel]          [k] __pi___inval_dcache_area
  22.85%  [mvneta]          [k] mvneta_rx_swbm
   7.54%  [kernel]          [k] kmem_cache_alloc
   6.49%  [kernel]          [k] eth_type_trans
   3.94%  [kernel]          [k] dev_gro_receive
   3.91%  [kernel]          [k] __netif_receive_skb_core
   3.91%  [kernel]          [k] kmem_cache_free
   3.76%  [kernel]          [k] page_pool_release_page
   3.56%  [kernel]          [k] free_unref_page
   2.40%  [kernel]          [k] build_skb
   1.49%  [kernel]          [k] skb_release_data
   1.45%  [kernel]          [k] __alloc_pages_bulk
   1.30%  [kernel]          [k] page_frag_free

And this is the same output with recycling enabled:

Overhead  Shared Object     Symbol
  26.41%  [kernel]          [k] __pi___inval_dcache_area
  25.00%  [mvneta]          [k] mvneta_rx_swbm
   8.14%  [kernel]          [k] kmem_cache_alloc
   6.84%  [kernel]          [k] eth_type_trans
   4.44%  [kernel]          [k] __netif_receive_skb_core
   4.38%  [kernel]          [k] kmem_cache_free
   4.16%  [kernel]          [k] dev_gro_receive
   3.21%  [kernel]          [k] page_pool_put_page
   2.41%  [kernel]          [k] build_skb
   1.82%  [kernel]          [k] skb_release_data
   1.61%  [kernel]          [k] napi_gro_receive
   1.25%  [kernel]          [k] page_pool_refill_alloc_cache
   1.16%  [kernel]          [k] __netif_receive_skb_list_core

We can see that page_pool_release_page(), free_unref_page() and
__alloc_pages_bulk() are no longer on top of the list when receiving
traffic.

The test was done with mausezahn on the TX side with 64 byte raw
ethernet frames.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:11:47 -07:00
Matteo Croce
133637fcfa mvpp2: recycle buffers
Use the new recycling API for page_pool.
In a drop rate test, the packet rate is almost doubled,
from 1110 Kpps to 2128 Kpps.

perf top on a stock system shows:

Overhead  Shared Object     Symbol
  34.88%  [kernel]          [k] page_pool_release_page
   8.06%  [kernel]          [k] free_unref_page
   6.42%  [mvpp2]           [k] mvpp2_rx
   6.07%  [kernel]          [k] eth_type_trans
   5.18%  [kernel]          [k] __netif_receive_skb_core
   4.95%  [kernel]          [k] build_skb
   4.88%  [kernel]          [k] kmem_cache_free
   3.97%  [kernel]          [k] kmem_cache_alloc
   3.45%  [kernel]          [k] dev_gro_receive
   2.73%  [kernel]          [k] page_frag_free
   2.07%  [kernel]          [k] __alloc_pages_bulk
   1.99%  [kernel]          [k] arch_local_irq_save
   1.84%  [kernel]          [k] skb_release_data
   1.20%  [kernel]          [k] netif_receive_skb_list_internal

With packet rate stable at 1100 Kpps:

tx: 0 bps 0 pps rx: 532.7 Mbps 1110 Kpps
tx: 0 bps 0 pps rx: 532.6 Mbps 1110 Kpps
tx: 0 bps 0 pps rx: 532.4 Mbps 1109 Kpps
tx: 0 bps 0 pps rx: 532.1 Mbps 1109 Kpps
tx: 0 bps 0 pps rx: 531.9 Mbps 1108 Kpps
tx: 0 bps 0 pps rx: 531.9 Mbps 1108 Kpps

And this is the same output with recycling enabled:

Overhead  Shared Object     Symbol
  12.91%  [kernel]          [k] eth_type_trans
  12.54%  [mvpp2]           [k] mvpp2_rx
   9.67%  [kernel]          [k] build_skb
   9.63%  [kernel]          [k] __netif_receive_skb_core
   8.44%  [kernel]          [k] page_pool_put_page
   8.07%  [kernel]          [k] kmem_cache_free
   7.79%  [kernel]          [k] kmem_cache_alloc
   6.86%  [kernel]          [k] dev_gro_receive
   3.19%  [kernel]          [k] skb_release_data
   2.41%  [kernel]          [k] netif_receive_skb_list_internal
   2.18%  [kernel]          [k] page_pool_refill_alloc_cache
   1.76%  [kernel]          [k] napi_gro_receive
   1.61%  [kernel]          [k] kfree_skb
   1.20%  [kernel]          [k] dma_sync_single_for_device
   1.16%  [mvpp2]           [k] mvpp2_poll
   1.12%  [mvpp2]           [k] mvpp2_read

With packet rate above 2100 Kpps:

tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps
tx: 0 bps 0 pps rx: 1021 Mbps 2127 Kpps
tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps
tx: 0 bps 0 pps rx: 1021 Mbps 2128 Kpps
tx: 0 bps 0 pps rx: 1022 Mbps 2128 Kpps
tx: 0 bps 0 pps rx: 1022 Mbps 2129 Kpps

The major performance increase is explained by the fact that the most CPU
consuming functions (page_pool_release_page, page_frag_free and
free_unref_page) are no longer called on a per packet basis.

The test was done by sending to the macchiatobin 64 byte ethernet frames
with an invalid ethertype, so the packets are dropped early in the RX path.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:11:47 -07:00
Ilias Apalodimas
6a5bcd84e8 page_pool: Allow drivers to hint on SKB recycling
Up to now several high speed NICs have custom mechanisms of recycling
the allocated memory they use for their payloads.
Our page_pool API already has recycling capabilities that are always
used when we are running in 'XDP mode'. So let's tweak the API and the
kernel network stack slightly and allow the recycling to happen even
during the standard operation.
The API doesn't take into account 'split page' policies used by those
drivers currently, but can be extended once we have users for that.

The idea is to be able to intercept the packet on skb_release_data().
If it's a buffer coming from our page_pool API recycle it back to the
pool for further usage or just release the packet entirely.

To achieve that we introduce a bit in struct sk_buff (pp_recycle:1) and
a field in struct page (page->pp) to store the page_pool pointer.
Storing the information in page->pp allows us to recycle both SKBs and
their fragments.
We could have skipped the skb bit entirely, since identical information
can bederived from struct page. However, in an effort to affect the free path
as less as possible, reading a single bit in the skb which is already
in cache, is better that trying to derive identical information for the
page stored data.

The driver or page_pool has to take care of the sync operations on it's own
during the buffer recycling since the buffer is, after opting-in to the
recycling, never unmapped.

Since the gain on the drivers depends on the architecture, we are not
enabling recycling by default if the page_pool API is used on a driver.
In order to enable recycling the driver must call skb_mark_for_recycle()
to store the information we need for recycling in page->pp and
enabling the recycling bit, or page_pool_store_mem_info() for a fragment.

Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Co-developed-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:11:47 -07:00
Matteo Croce
c420c98982 skbuff: add a parameter to __skb_frag_unref
This is a prerequisite patch, the next one is enabling recycling of
skbs and fragments. Add an extra argument on __skb_frag_unref() to
handle recycling, and update the current users of the function with that.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:11:47 -07:00
Matteo Croce
c07aea3ef4 mm: add a signature in struct page
This is needed by the page_pool to avoid recycling a page not allocated
via page_pool.

The page->signature field is aliased to page->lru.next and
page->compound_head, but it can't be set by mistake because the
signature value is a bad pointer, and can't trigger a false positive
in PageTail() because the last bit is 0.

Co-developed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:11:47 -07:00
Yang Yingliang
35cba15a50 net: moxa: Use devm_platform_get_and_ioremap_resource()
Use devm_platform_get_and_ioremap_resource() to simplify
code and avoid a null-ptr-deref by checking 'res' in it.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:08:30 -07:00
Zheng Yongjun
7f553ff214 l2tp: Fix spelling mistakes
Fix some spelling mistakes in comments:
negociated  ==> negotiated
dont  ==> don't

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:08:30 -07:00
Zheng Yongjun
4fb3ebbf7e net/ncsi: Fix spelling mistakes
Fix some spelling mistakes in comments:
constuct  ==> construct
chanels  ==> channels
Detination  ==> Destination

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:08:30 -07:00
Zheng Yongjun
974d8f86cd ipv4: Fix spelling mistakes
Fix some spelling mistakes in comments:
Dont  ==> Don't
timout  ==> timeout
incomming  ==> incoming
necesarry  ==> necessary
substract  ==> subtract

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:08:30 -07:00
Zheng Yongjun
84a57ae96b netlabel: Fix spelling mistakes
Fix some spelling mistakes in comments:
Interate  ==> Iterate
sucess  ==> success

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:08:30 -07:00
Yang Yingliang
20f1932e22 net: micrel: check return value after calling platform_get_resource()
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:07:22 -07:00
Yang Yingliang
0bb51a3a38 net: mvpp2: check return value after calling platform_get_resource()
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:07:22 -07:00
Yang Yingliang
3710e80952 net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname
Use the devm_platform_ioremap_resource_byname() helper instead of
calling platform_get_resource_byname() and devm_ioremap_resource()
separately.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:07:22 -07:00
Yang Yingliang
b5d64b43f8 net: enetc: Use devm_platform_get_and_ioremap_resource()
Use devm_platform_get_and_ioremap_resource() to simplify
code.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:07:22 -07:00
Yang Yingliang
809660cbc8 net: macb: Use devm_platform_get_and_ioremap_resource()
Use devm_platform_get_and_ioremap_resource() to simplify
code.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:07:22 -07:00
Yang Yingliang
74325bf010 net: bcmgenet: check return value after calling platform_get_resource()
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:07:22 -07:00
Shaokun Zhang
90fdd89f6c net: tulip: Remove the repeated declaration
Function 'pnic2_lnk_change' is declared twice, so remove the
repeated declaration.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:03:11 -07:00
Yang Yingliang
f1fe19c2cb net: mscc: ocelot: check return value after calling platform_get_resource()
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07 14:02:25 -07:00