linux

iv/linux

Jakub Kicinski 7079d5e61a mlx5-updates-2023-03-28

Merge tag 'mlx5-updates-2023-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-03-28

Dragos Tatulea says:
====================

net/mlx5e: RX, Drop page_cache and fully use page_pool

For page allocation on the rx path, the mlx5e driver has been using an
internal page cache in tandem with the page pool. The internal page
cache uses a queue for page recycling which has the issue of head of
queue blocking.
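
The head of queue blocking described above can be illustrated with a toy
model. This is only a sketch under simplifying assumptions, not the mlx5e
or page_pool code: the function names `fifo_recycle` and `pool_recycle`
are hypothetical, and pages are represented by plain strings.

```python
from collections import deque


def fifo_recycle(pages, in_use):
    """Toy FIFO page cache: only the entry at the head may be reused.

    `pages` is the queue order and `in_use` is the set of pages that
    still have outstanding references. Returns the recyclable pages.
    """
    queue = deque(pages)
    recycled = []
    # Recycling stops as soon as the head page is still referenced.
    while queue and queue[0] not in in_use:
        recycled.append(queue.popleft())
    return recycled


def pool_recycle(pages, in_use):
    """Toy cache without ordering (assumption): any free page is reusable."""
    return [p for p in pages if p not in in_use]


pages = ["p0", "p1", "p2", "p3"]
in_use = {"p0"}  # the head page is still referenced, e.g. by an skb

print(fifo_recycle(pages, in_use))  # []
print(pool_recycle(pages, in_use))  # ['p1', 'p2', 'p3']
```

In this model a single busy page at the head prevents three free pages
from being reused, which is the low-recycling case the series targets.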

This patch series drops the internal page_cache altogether and uses the
page_pool to implement everything that was done by the page_cache
before:
* Let the page_pool handle dma mapping and unmapping.
* Use fragmented pages with fragment counter instead of tracking via
  page ref.
* Enable skb recycling.

The patch series has the following effects on the rx path:

* Improved performance for the cases when there was low page recycling
  due to head of queue blocking in the internal page_cache. The test
  for this was running a single iperf TCP stream to a rx queue
  which is bound on the same cpu as the application.

  |-------------+--------+--------+------+---------|
  | rq type     | before | after  | unit |   diff  |
  |-------------+--------+--------+------+---------|
  | striding rq |  30.1  |  31.4  | Gbps |  4.14 % |
  | legacy rq   |  30.2  |  33.0  | Gbps |  8.48 % |
  |-------------+--------+--------+------+---------|

* Small XDP performance degradation. The test was an XDP drop
  program running on a single rx queue with small incoming packets.
  The results look like this:

  |-------------+----------+----------+------+---------|
  | rq type     | before   | after    | unit |   diff  |
  |-------------+----------+----------+------+---------|
  | striding rq | 19725449 | 18544617 | pps  | -6.37 % |
  | legacy rq   | 19879931 | 18631841 | pps  | -6.70 % |
  |-------------+----------+----------+------+---------|

  This will be handled in a different patch series by adding support for
  multiple packets per page.

* For other cases the performance is roughly the same.
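
The diff columns in the two tables above appear to be normalized by the
"after" value rather than the "before" value. That is an inference from
the numbers, not something the cover letter states. A minimal sketch that
reproduces the four percentages under this assumption (the helper name
`diff_percent` is hypothetical):

```python
def diff_percent(before, after):
    """Relative change in percent, divided by the 'after' value (assumption)."""
    return (after - before) / after * 100


rows = [
    ("striding rq, Gbps", 30.1, 31.4),
    ("legacy rq, Gbps", 30.2, 33.0),
    ("striding rq, pps", 19725449, 18544617),
    ("legacy rq, pps", 19879931, 18631841),
]

for name, before, after in rows:
    print(f"{name}: {diff_percent(before, after):.2f} %")
```

Dividing by the "before" value instead would give 4.32 % for the first
row, so the choice of baseline matters when comparing against other
reports.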

The above numbers were obtained on the following system:
  24 core Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
  32 GB RAM
  ConnectX-7 single port

The patch series breaks down as follows:
* Preparations for introducing the mlx5e_frag_page struct.
* Delete the mlx5e_page_cache struct.
* Enable dma mapping from page_pool.
* Enable skb recycling and fragment counting.
* Do deferred release of pages (just before alloc) to ensure better
  page_pool cache utilization.

====================

* tag 'mlx5-updates-2023-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: RX, Remove unnecessary recycle parameter and page_cache stats
  net/mlx5e: RX, Break the wqe bulk refill in smaller chunks
  net/mlx5e: RX, Increase WQE bulk size for legacy rq
  net/mlx5e: RX, Split off release path for xsk buffers for legacy rq
  net/mlx5e: RX, Defer page release in legacy rq for better recycling
  net/mlx5e: RX, Change wqe last_in_page field from bool to bit flags
  net/mlx5e: RX, Defer page release in striding rq for better recycling
  net/mlx5e: RX, Rename xdp_xmit_bitmap to a more generic name
  net/mlx5e: RX, Enable skb page recycling through the page_pool
  net/mlx5e: RX, Enable dma map and sync from page_pool allocator
  net/mlx5e: RX, Remove internal page_cache
  net/mlx5e: RX, Store SHAMPO header pages in array
  net/mlx5e: RX, Remove alloc unit layout constraint for striding rq
  net/mlx5e: RX, Remove alloc unit layout constraint for legacy rq
  net/mlx5e: RX, Remove mlx5e_alloc_unit argument in page allocation
====================

Link: https://lore.kernel.org/r/20230328205623.142075-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

2023-03-29 22:15:24 -07:00

accel

accel: Build sub-directories based on config options

2023-03-13 12:44:53 +01:00

accessibility

…

acpi

Merge branches 'acpi-video', 'acpi-x86', 'acpi-tools' and 'acpi-docs'

2023-03-17 16:44:41 +01:00

amba

…

android

Char/Misc and other driver subsystem changes for 6.3-rc1

2023-02-24 12:47:33 -08:00

ata

ata: pata_parport: fix memory leaks

2023-03-16 16:54:38 +09:00

atm

atm: idt77252: fix kmemleak when rmmod idt77252

2023-03-21 20:19:28 -07:00

auxdisplay

…

base

A set of updates for the interrupt subsystem:

2023-03-05 11:19:16 -08:00

bcma

…

block

block: sunvdc: add check for mdesc_grab() returning NULL

2023-03-15 08:48:58 -06:00

bluetooth

Bluetooth: btsdio: fix use after free bug in btsdio_remove due to unfinished work

2023-03-23 13:09:38 -07:00

bus

ARM: SoC drivers for 6.3

2023-02-27 10:04:49 -08:00

cdrom

…

char

tpm: disable hwrng for fTPM on some AMD designs

2023-03-12 23:28:10 +02:00

clk

clk: k210: remove an implicit 64-bit division

2023-03-06 14:41:20 -08:00

clocksource

Updates for timekeeping, timers and clockevent/source drivers:

2023-02-21 09:45:13 -08:00

comedi

…

connector

…

counter

…

cpufreq

More power management updates for 6.3-rc1

2023-03-03 10:30:58 -08:00

cpuidle

cpuidle: psci: Iterate backwards over list in psci_pd_remove()

2023-03-07 14:04:13 +01:00

crypto

This push fixes a regression in the caam driver.

2023-03-05 11:32:30 -08:00

cxl

cxl for v6.3

2023-02-25 09:19:23 -08:00

dax

cxl for v6.3

2023-02-25 09:19:23 -08:00

dca

…

devfreq

…

dio

…

dma

dmaengine updates for v6.3

2023-02-24 17:18:54 -08:00

dma-buf

dma-buf: make kobj_type structure constant

2023-02-17 09:16:34 +01:00

edac

- Add a driver for the RAS functionality on Xilinx's on chip memory

2023-02-21 08:10:03 -08:00

eisa

…

extcon

extcon: intel-cht-wc: Add support for Lenovo Yoga Tab 3 Pro YT3-X90F

2023-02-04 13:05:42 +00:00

firewire

Driver core changes for 6.3-rc1

2023-02-24 12:58:55 -08:00

firmware

firmware: xilinx: don't make a sleepable memory allocation from an atomic context

2023-03-09 18:00:31 +01:00

fpga

Driver core changes for 6.3-rc1

2023-02-24 12:58:55 -08:00

fsi

…

gnss

…

gpio

ACPI: x86: Introduce an acpi_quirk_skip_gpio_event_handlers() helper

2023-03-07 14:15:10 +01:00

gpu

gpu: host1x: fix uninitialized variable use

2023-03-20 11:12:37 -07:00

greybus

…

hid

for-linus-2023030901

2023-03-09 10:17:23 -08:00

hsi

Driver core changes for 6.3-rc1

2023-02-24 12:58:55 -08:00

hte

…

Driver core changes for 6.3-rc1

2023-02-24 12:58:55 -08:00

hwmon

hwmon: (ltc2992) Set can_sleep flag for GPIO chip

2023-03-15 19:15:00 -07:00

hwspinlock

…

hwtracing

Driver core changes for 6.3-rc1

2023-02-24 12:58:55 -08:00

i2c

i2c: dev: Fix bus callback return values

2023-03-09 22:07:52 +01:00

i3c

I3C for 6.3

2023-02-28 16:05:01 -08:00

idle

Power management updates for 6.3-rc1

2023-02-21 12:13:58 -08:00

iio

Char/Misc and other driver subsystem changes for 6.3-rc1

2023-02-24 12:47:33 -08:00

infiniband

v6.3 RDMA pull request

2023-02-24 15:11:03 -08:00

input

ARM: SoC drivers for 6.3

2023-02-27 10:04:49 -08:00

interconnect

interconnect: exynos: drop redundant link destroy

2023-03-13 21:13:48 +02:00

iommu

ARM: SoC drivers for 6.3

2023-02-27 10:04:49 -08:00

ipack

Driver core changes for 6.3-rc1

2023-02-24 12:58:55 -08:00

irqchip

ARM:

2023-02-25 11:30:21 -08:00

isdn

mISDN: remove unused vpm_read_address and cpld_read_reg functions

2023-03-24 19:09:57 -07:00

leds

- Remove Drivers

2023-02-23 15:09:31 -08:00

macintosh

powerpc updates for 6.3

2023-02-25 11:00:06 -08:00

mailbox

mailbox: qcom-apcs-ipc: add IPQ5332 APSS clock support

2023-02-23 14:47:13 -06:00

mcb

…

Merge branch 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-6.3

2023-03-15 12:18:07 -06:00

media

media: m5mols: fix off-by-one loop termination error

2023-03-18 11:07:15 -07:00

memory

memory: tegra30-emc: fix interconnect registration race

2023-03-13 21:13:49 +02:00

memstick

MMC core:

2023-02-27 09:47:26 -08:00

message

…

mfd

mfd: ocelot: add ocelot-serdes capability

2023-03-20 09:08:48 +00:00

misc

misc: ad525x_dpot-i2c: Convert to i2c's .probe_new()

2023-03-09 21:58:45 +01:00

mmc

mmc: dw_mmc-starfive: Fix initialization of prev_err

2023-03-09 15:33:51 +01:00

most

…

mtd

* regression fix for the notifier handling of the I2C core

2023-03-11 09:24:05 -08:00

mux

…

net

mlx5-updates-2023-03-28

2023-03-29 22:15:24 -07:00

nfc

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

2023-03-17 16:29:25 -07:00

ntb

…

nubus

…

nvdimm

virtio,vhost,vdpa: features, fixes

2023-02-25 11:48:02 -08:00

nvme

nvme fixes for Linux 6.3

2023-03-16 07:01:48 -06:00

nvmem

nvmem: core: return -ENOENT if nvmem cell is not found

2023-03-10 10:55:49 +01:00

IOMMU Updates for Linux v6.3:

2023-02-24 13:40:13 -08:00

opp

OPP: fix error checking in opp_migrate_dentry()

2023-02-16 13:48:53 +01:00

parisc

…

parport

Char/Misc and other driver subsystem changes for 6.3-rc1

2023-02-24 12:47:33 -08:00

pci

PCI: s390: Fix use-after-free of PCI resources with per-function hotplug

2023-03-13 09:15:11 +01:00

pcmcia

Driver core changes for 6.3-rc1

2023-02-24 12:58:55 -08:00

peci

…

perf

RISC-V Patches for the 6.3 Merge Window, Part 2

2023-03-03 09:32:51 -08:00

phy

phy: phy-ocelot-serdes: add ability to be used in a non-syscon configuration

2023-03-20 09:08:48 +00:00

pinctrl

ARM: SoC drivers for 6.3

2023-02-27 10:04:49 -08:00

platform

platform: mellanox: mlx-platform: Initialize shift variable to 0

2023-03-07 12:08:30 +01:00

pnp

…

power

power supply changes for the v6.3 series (part 2)

2023-03-03 16:33:28 -08:00

powercap

More power management updates for 6.3-rc1

2023-03-03 10:30:58 -08:00

pps

…

ps3

…

ptp

ptp: add ToD device driver for Intel FPGA cards

2023-03-29 21:25:48 -07:00

pwm

pwm: dwc: Use devm_pwmchip_add()

2023-02-20 12:26:35 +01:00

rapidio

…

ras

…

regulator

regulator: Fixes for v6.3

2023-03-02 09:21:25 -08:00

remoteproc

ARM: SoC drivers for 6.3

2023-02-27 10:04:49 -08:00

reset

…

rpmsg

rpmsg updates for v6.3

2023-02-26 12:10:28 -08:00

rtc

RTC for 6.3

2023-03-03 09:15:50 -08:00

s390

net/ism: Remove redundant pci_clear_master

2023-03-24 09:13:42 +00:00

sbus

mm: replace vma->vm_flags direct modifications with modifier calls

2023-02-09 16:51:39 -08:00

scsi

net: introduce a config option to tweak MAX_SKB_FRAGS

2023-03-27 19:29:22 -07:00

sh updates for v6.3

2023-03-01 09:44:22 -08:00

siox

…

slimbus

…

soc

ARM: SoC drivers for 6.3

2023-02-27 10:04:49 -08:00

soundwire

soundwire updates for 6.3

2023-02-24 17:29:52 -08:00

spi

spi: Fixes for v6.3

2023-03-02 09:25:38 -08:00

spmi

…

ssb

…

staging

staging: r8188eu: delete driver

2023-03-09 10:06:28 +01:00

target

scsi: target: iscsi: Fix an error message in iscsi_check_key()

2023-03-06 16:50:42 -05:00

…

tee

Driver core changes for 6.3-rc1

2023-02-24 12:58:55 -08:00

thermal

thermal: intel: int340x: processor_thermal: Fix deadlock

2023-03-03 20:34:49 +01:00

thunderbolt

Driver core changes for 6.3-rc1

2023-02-24 12:58:55 -08:00

tty

TTY/Serial driver fixes for 6.3-rc3

2023-03-19 10:09:58 -07:00

ufs

scsi: ufs: mcq: Use active_reqs to check busy in clock scaling

2023-03-09 21:09:28 -05:00

uio

- Daniel Verkamp has contributed a memfd series ("mm/memfd: add

2023-02-23 17:09:35 -08:00

usb

wwan: core: Support slicing in port TX flow of WWAN subsystem

2023-03-17 22:38:31 -07:00

vdpa

vdpa_sim: set last_used_idx as last_avail_idx in vdpasim_queue_ready

2023-03-13 02:29:12 -04:00

vfio

vfio/mlx5: Fix the report of dirty_bytes upon pre-copy

2023-03-13 12:50:59 -06:00

vhost

vsock: support sockmap

2023-03-29 08:19:38 +01:00

video

fbdev updates for kernel 6.3-rc3:

2023-03-18 16:01:34 -07:00

virt

virt/coco/sev-guest: Add throttling awareness

2023-03-13 13:29:27 +01:00

virtio

virtio,vhost,vdpa: features, fixes

2023-02-25 11:48:02 -08:00

vlynq

…

w1: ds2482: Convert to i2c's .probe_new()

2023-03-09 21:58:57 +01:00

watchdog

linux-watchdog 6.3-rc1 tag

2023-03-02 11:12:01 -08:00

xen

xen: branch for v6.3-rc3

2023-03-17 10:45:49 -07:00

zorro

…

Kconfig

…

Makefile

Kbuild updates for v6.3

2023-02-26 11:53:25 -08:00