linux/include
Vladimir Oltean 2f530df76c net/sched: taprio: give higher priority to higher TCs in software dequeue mode
The current taprio software implementation is haunted by the shadow of the
igb/igc hardware model. It iterates over child qdiscs in increasing
order of TXQ index, therefore giving higher xmit priority to TXQ 0 and
lower to TXQ N. According to discussions with Vinicius, that is the
default (perhaps even unchangeable) prioritization scheme used for the
NICs that taprio was first written for (igb, igc), and we have a case of
two bugs canceling out, resulting in a functional setup on igb/igc, but
a less sane one on other NICs.

To the best of my understanding, taprio should prioritize based on the
traffic class, so it should really dequeue starting with the highest
traffic class and going down from there. We get to the TXQ using the
tc_to_txq[] netdev property.

TXQs within the same TC have the same (strict) priority, so we should
pick from them as fairly as we can. We can achieve that by implementing
something very similar to q->curband from multiq_dequeue().
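
The dequeue order described above can be modelled in plain userspace C.
This is only a sketch of the idea, not the code from the patch: struct
sched_model, model_init() and model_dequeue() are hypothetical names,
the per-TXQ backlog counters stand in for the child qdiscs, and the gate
mask check that taprio performs per traffic class is left out.

```c
#include <assert.h>
#include <string.h>

#define MAX_TC  8
#define MAX_TXQ 16

/* Hypothetical stand-in for one tc_to_txq[] entry: a contiguous TXQ range. */
struct txq_range {
	int offset;
	int count;
};

/* Hypothetical model of the scheduler state; not the kernel's structures. */
struct sched_model {
	int num_tc;
	struct txq_range tc_to_txq[MAX_TC];
	int cur_txq[MAX_TC];	/* next TXQ to try per TC, in the spirit of q->curband */
	int backlog[MAX_TXQ];	/* packets waiting on each TXQ */
};

void model_init(struct sched_model *m, int num_tc, const struct txq_range *map)
{
	memset(m, 0, sizeof(*m));
	m->num_tc = num_tc;
	for (int tc = 0; tc < num_tc; tc++) {
		m->tc_to_txq[tc] = map[tc];
		m->cur_txq[tc] = map[tc].offset;
	}
}

/*
 * Walk the traffic classes from highest to lowest. Within a TC, start at
 * the remembered TXQ and rotate, so TXQs of equal priority take turns.
 * Returns the TXQ a packet was taken from, or -1 if everything is empty.
 */
int model_dequeue(struct sched_model *m)
{
	for (int tc = m->num_tc - 1; tc >= 0; tc--) {
		int first = m->tc_to_txq[tc].offset;
		int count = m->tc_to_txq[tc].count;
		int start, txq;

		if (count == 0)
			continue;

		start = m->cur_txq[tc];
		assert(start >= first && start < first + count);
		txq = start;

		do {
			int cand = txq;

			/* Advance with wraparound inside this TC's range. */
			if (++txq == first + count)
				txq = first;

			if (m->backlog[cand] > 0) {
				m->backlog[cand]--;
				m->cur_txq[tc] = txq;	/* resume after the winner */
				return cand;
			}
		} while (txq != start);
	}

	return -1;
}
```

With TC 0 mapped to TXQ 0 and TC 1 mapped to TXQs 1 and 2, the model
drains TXQs 1 and 2 alternately before it ever touches TXQ 0, which is
the opposite of the increasing-TXQ-index order.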

Since igb/igc really do give TXQ 0 higher hardware priority than
TXQ 1 etc., we need to preserve the behavior for them as well. We really
have no choice, because in txtime-assist mode, taprio is essentially a
software scheduler towards offloaded child tc-etf qdiscs, so the TXQ
selection really does matter (not all igb TXQs support ETF/SO_TXTIME,
says Kurt Kanzenbach).

To preserve the behavior, we need a capability bit so that taprio can
determine if it's running on igb/igc, or on something else. Because igb
doesn't offload taprio at all, we can't piggyback on the
qdisc_offload_query_caps() call from taprio_enable_offload(), but
instead we need a separate call which is also made for software
scheduling.

Introduce two static keys to minimize the performance penalty on systems
which only have igb/igc NICs, and on systems which only have other NICs.
For mixed systems, taprio will have to dynamically check whether to
dequeue using one prioritization algorithm or using the other.
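
The selection between the two algorithms can be sketched the same way.
In this userspace model, plain counters stand in for the two static
keys, and every name (struct key_model, key_model_attach(), pick_algo(),
the enum values) is hypothetical. It only illustrates the three cases
named above and assumes each taprio instance registers itself as running
on either an igb/igc-style NIC or another NIC.

```c
#include <stdbool.h>

enum dequeue_algo {
	DEQUEUE_TXQ_PRIORITY,	/* igb/igc style: lower TXQ index wins */
	DEQUEUE_TC_PRIORITY,	/* other NICs: higher traffic class wins */
};

/* Counters standing in for the two static keys (hypothetical model). */
struct key_model {
	int txq_prio_users;	/* instances on igb/igc-style NICs */
	int tc_prio_users;	/* instances on all other NICs */
};

void key_model_attach(struct key_model *k, bool dev_is_txq_prio)
{
	if (dev_is_txq_prio)
		k->txq_prio_users++;
	else
		k->tc_prio_users++;
}

/*
 * When only one kind of NIC is in use, the answer is known without
 * looking at the device. Only mixed systems need the per-device check.
 */
enum dequeue_algo pick_algo(const struct key_model *k, bool dev_is_txq_prio)
{
	bool have_txq_prio = k->txq_prio_users > 0;
	bool have_tc_prio = k->tc_prio_users > 0;

	if (have_txq_prio && !have_tc_prio)
		return DEQUEUE_TXQ_PRIORITY;
	if (have_tc_prio && !have_txq_prio)
		return DEQUEUE_TC_PRIORITY;

	/* Mixed system: decide per device. */
	return dev_is_txq_prio ? DEQUEUE_TXQ_PRIORITY : DEQUEUE_TC_PRIORITY;
}
```

In the kernel the first two branches would be patched in or out by the
static keys, so single-kind systems would not pay for the per-device
test; the counters here only mimic that decision, not its cost.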

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08 09:48:52 +00:00
..
acpi ACPI: Fix selecting wrong ACPI fwnode for the iGPU on some Dell laptops 2023-01-10 20:23:48 +01:00
asm-generic arch: fix broken BuildID for arm64 and riscv 2022-12-30 17:21:51 +09:00
clocksource
crypto
drm drm/fb-helper: Use a per-driver FB deferred I/O handler 2023-01-24 11:13:08 +01:00
dt-bindings remoteproc updates for v6.2 2022-12-21 09:37:14 -08:00
keys
kunit kunit: fix kunit_test_init_section_suites(...) 2023-01-31 09:10:38 -07:00
kvm
linux net: micrel: Add support for lan8841 PHY 2023-02-08 09:16:07 +00:00
math-emu
media
memory
misc
net net/sched: taprio: give higher priority to higher TCs in software dequeue mode 2023-02-08 09:48:52 +00:00
pcmcia
ras
rdma
rv
scsi scsi: iscsi_tcp: Fix UAF during logout when accessing the shost ipaddress 2023-01-18 19:14:56 -05:00
soc net: mscc: ocelot: un-export unused regmap symbols 2023-02-06 22:33:15 -08:00
sound
target
trace net: bridge: Add a tracepoint for MDB overflows 2023-02-06 08:48:25 +00:00
uapi net: bridge: Add netlink knobs for number / maximum MDB entries 2023-02-06 08:48:26 +00:00
ufs scsi: ufs: core: Fix devfreq deadlocks 2023-01-18 19:08:37 -05:00
vdso
video fbdev: omapfb: connector-analog-tv: remove support for platform data 2022-12-14 20:01:49 +01:00
xen xen: make remove callback of xen driver void returned 2022-12-15 16:06:10 +01:00