linux/net
Björn Töpel 7fd3253a7d net: Introduce preferred busy-polling
The existing busy-polling mode, enabled by the SO_BUSY_POLL socket
option or system-wide using the /proc/sys/net/core/busy_read knob, is
an opportunistic. That means that if the NAPI context is not
scheduled, it will poll it. If, after busy-polling, the budget is
exceeded the busy-polling logic will schedule the NAPI onto the
regular softirq handling.

One implication of the behavior above is that a busy/heavy loaded NAPI
context will never enter/allow for busy-polling. Some applications
prefer that most NAPI processing would be done by busy-polling.

This series adds a new socket option, SO_PREFER_BUSY_POLL, that works
in concert with the napi_defer_hard_irqs and gro_flush_timeout
knobs. The napi_defer_hard_irqs and gro_flush_timeout knobs were
introduced in commit 6f8b12d661 ("net: napi: add hard irqs deferral
feature"), and allows for a user to defer interrupts to be enabled and
instead schedule the NAPI context from a watchdog timer. When a user
enables the SO_PREFER_BUSY_POLL, again with the other knobs enabled,
and the NAPI context is being processed by a softirq, the softirq NAPI
processing will exit early to allow the busy-polling to be performed.

If the application stops performing busy-polling via a system call,
the watchdog timer defined by gro_flush_timeout will timeout, and
regular softirq handling will resume.

In summary; Heavy traffic applications that prefer busy-polling over
softirq processing should use this option.

Example usage:

  $ echo 2 | sudo tee /sys/class/net/ens785f1/napi_defer_hard_irqs
  $ echo 200000 | sudo tee /sys/class/net/ens785f1/gro_flush_timeout

Note that the timeout should be larger than the userspace processing
window, otherwise the watchdog will timeout and fall back to regular
softirq processing.

Enable the SO_BUSY_POLL/SO_PREFER_BUSY_POLL options on your socket.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/bpf/20201130185205.196029-2-bjorn.topel@gmail.com
2020-12-01 00:09:25 +01:00
..
6lowpan
9p net: 9p: Fix kerneldoc warnings of missing parameters etc 2020-11-02 12:25:52 -08:00
802
8021q net: vlan: Fixed signedness in vlan_group_prealloc_vid() 2020-09-28 00:51:39 -07:00
appletalk net: appletalk: fix kerneldoc warnings 2020-10-30 11:48:17 -07:00
atm net: atm: fix update of position index in lec_seq_next 2020-10-31 12:26:30 -07:00
ax25 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-07-25 17:49:04 -07:00
batman-adv genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
bluetooth Bluetooth: Replace zero-length array with flexible-array member 2020-10-29 17:22:59 -05:00
bpf bpf: fix raw_tp test run in preempt kernel 2020-09-30 08:34:08 -07:00
bpfilter Revert "bpfilter: Fix build error with CONFIG_BPFILTER_UMH" 2020-10-15 12:33:24 -07:00
bridge bridge: mrp: Use hlist_head instead of list_head for mrp 2020-11-09 16:42:12 -08:00
caif caif: Remove duplicate macro SRVL_CTRL_PKT_SIZE 2020-09-05 15:57:05 -07:00
can can: isotp: padlen(): make const array static, makes object smaller 2020-11-03 22:30:32 +01:00
ceph libceph: clear con->out_msg on Policy::stateful_server faults 2020-10-12 15:29:27 +02:00
core net: Introduce preferred busy-polling 2020-12-01 00:09:25 +01:00
dcb net: dcb: Fix kerneldoc warnings 2020-10-30 11:59:54 -07:00
dccp net: dccp: convert tasklets to use new tasklet_setup() API 2020-11-07 10:40:56 -08:00
decnet treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
dns_resolver
dsa net: dsa: use net core stats64 handling 2020-11-09 17:50:27 -08:00
ethernet
ethtool Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-11-12 16:54:48 -08:00
hsr genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
ieee802154 genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
ife
ipv4 Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-11-12 16:54:48 -08:00
ipv6 ipv6: remove unused function ipv6_skb_idev() 2020-11-14 12:00:27 -08:00
iucv net/af_iucv: fix null pointer dereference on shutdown 2020-11-10 18:08:17 -08:00
kcm net: pass a sockptr_t into ->setsockopt 2020-07-24 15:41:54 -07:00
key Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-08-02 01:02:12 -07:00
l2tp genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
l3mdev net: l3mdev: Fix kerneldoc warning 2020-10-30 11:43:42 -07:00
lapb
llc net: llc: Fix kerneldoc warnings 2020-10-30 11:34:09 -07:00
mac80211 Some updates: 2020-11-13 12:03:22 -08:00
mac802154 net: mac802154: convert tasklets to use new tasklet_setup() API 2020-11-07 10:40:56 -08:00
mpls mpls: drop skb's dst in mpls_forward() 2020-11-03 12:55:53 -08:00
mptcp Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-11-12 16:54:48 -08:00
ncsi genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-11-06 17:33:38 -08:00
netlabel Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-11-12 16:54:48 -08:00
netlink netlink: export policy in extended ACK 2020-10-09 20:22:32 -07:00
netrom treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
nfc net: nfc: Fix kerneldoc warnings 2020-10-30 11:57:56 -07:00
nsh
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-11-06 17:33:38 -08:00
packet net/packet: make packet_fanout.arr size configurable up to 64K 2020-11-09 16:41:40 -08:00
phonet treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
psample genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
qrtr net: qrtr: Release distant nodes along the bridge node 2020-11-11 15:29:34 -08:00
rds RDMA: Add rdma_connect_locked() 2020-10-28 09:14:49 -03:00
rfkill
rose treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
rxrpc rxrpc: Fix loss of final ack on shutdown 2020-10-15 13:28:00 +01:00
sched net: sched: fix misspellings using misspell-fixer tool 2020-11-10 17:00:28 -08:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-11-06 17:33:38 -08:00
smc net: smc: convert tasklets to use new tasklet_setup() API 2020-11-07 10:41:15 -08:00
strparser
sunrpc net/sunrpc: fix useless comparison in proc_do_xprt() 2020-11-08 16:28:25 -05:00
switchdev net: switchdev: Fixed kerneldoc warning 2020-09-23 17:46:31 -07:00
tipc tipc: fix -Wstringop-truncation warnings 2020-11-13 14:17:49 -08:00
tls Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-15 12:43:21 -07:00
unix networking changes for the 5.10 merge window 2020-10-15 18:42:13 -07:00
vmw_vsock vsock: fix the error return when an invalid ioctl command is used 2020-10-29 17:43:20 -07:00
wireless Some updates: 2020-11-13 12:03:22 -08:00
x25 net/x25: Fix null-ptr-deref in x25_connect 2020-11-11 14:53:56 -08:00
xdp xdp: Remove the functions xsk_map_inc and xsk_map_put 2020-11-27 23:00:51 +01:00
xfrm net: xfrm: convert tasklets to use new tasklet_setup() API 2020-11-07 10:41:15 -08:00
compat.c iov_iter: transparently handle compat iovecs in import_iovec 2020-10-03 00:02:13 -04:00
devres.c net: devres: rename the release callback of devm_register_netdev() 2020-06-30 15:57:34 -07:00
Kconfig wimax: move out to staging 2020-10-29 19:27:45 +01:00
Makefile wimax: move out to staging 2020-10-29 19:27:45 +01:00
socket.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-05 18:40:01 -07:00
sysctl_net.c