linux/net
Toke Høiland-Jørgensen b530e9e106 bpf: Add "live packet" mode for XDP in BPF_PROG_RUN
This adds support for running XDP programs through BPF_PROG_RUN in a mode
that enables live packet processing of the resulting frames. Previous uses
of BPF_PROG_RUN for XDP returned the XDP program return code and the
modified packet data to userspace, which is useful for unit testing of XDP
programs.

The existing BPF_PROG_RUN for XDP allows userspace to set the ingress
ifindex and RXQ number as part of the context object being passed to the
kernel. This patch reuses that code, but adds a new mode with different
semantics, which can be selected with the new BPF_F_TEST_XDP_LIVE_FRAMES
flag.

When running BPF_PROG_RUN in this mode, the XDP program return codes will
be honoured: returning XDP_PASS will result in the frame being injected
into the networking stack as if it came from the selected networking
interface, while returning XDP_TX and XDP_REDIRECT will result in the frame
being transmitted out that interface. XDP_TX is translated into an
XDP_REDIRECT operation to the same interface, since the real XDP_TX action
is only possible from within the network drivers themselves, not from the
process context where BPF_PROG_RUN is executed.

Internally, this new mode of operation creates a page pool instance while
setting up the test run, and feeds pages from that into the XDP program.
The setup cost of this is amortised over the number of repetitions
specified by userspace.

To support the performance testing use case, we further optimise the setup
step so that all pages in the pool are pre-initialised with the packet
data, and pre-computed context and xdp_frame objects stored at the start of
each page. This makes it possible to entirely avoid touching the page
content on each XDP program invocation, and enables sending up to 9
Mpps/core on my test box.

Because the data pages are recycled by the page pool, and the test runner
doesn't re-initialise them for each run, subsequent invocations of the XDP
program will see the packet data in the state it was after the last time it
ran on that particular page. This means that an XDP program that modifies
the packet before redirecting it has to be careful about which assumptions
it makes about the packet content, but that is only an issue for the most
naively written programs.

Enabling the new flag is only allowed when not setting ctx_out and data_out
in the test specification, since using it means frames will be redirected
somewhere else, so they can't be returned.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220309105346.100053-2-toke@redhat.com
2022-03-09 14:19:22 -08:00
..
6lowpan net: don't include ndisc.h from ipv6.h 2022-02-04 14:15:11 -08:00
9p virtio,vdpa,qemu_fw_cfg: features, cleanups, fixes 2022-01-18 10:05:48 +02:00
802 net: 802: Use memset_startat() to clear struct fields 2021-11-19 11:23:23 +00:00
8021q Revert "vlan: move dev_put into vlan_dev_uninit" 2022-02-23 15:21:13 -08:00
appletalk
atm proc: remove PDE_DATA() completely 2022-01-22 08:33:37 +02:00
ax25 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-02-10 17:29:56 -08:00
batman-adv Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-03-03 11:55:12 -08:00
bluetooth Bluetooth: use memset avoid memory leaks 2022-03-04 16:55:38 +01:00
bpf bpf: Add "live packet" mode for XDP in BPF_PROG_RUN 2022-03-09 14:19:22 -08:00
bpfilter
bridge net: bridge: Use netif_rx(). 2022-03-04 12:02:19 +00:00
caif net: caif: Use netif_rx(). 2022-03-04 12:02:19 +00:00
can Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-02-24 17:54:25 -08:00
ceph libceph: optionally use bounce buffer on recv path in crc mode 2022-02-02 18:50:36 +01:00
core net: dev: use kfree_skb_reason() for __netif_receive_skb_core() 2022-03-04 12:17:11 +00:00
dcb net: dcb: disable softirqs in dcbnl_flush_dev() 2022-03-03 08:01:55 -08:00
dccp dccp: remove max48() 2022-01-27 13:53:27 +00:00
decnet net: decnet: use time_is_before_jiffies() instead of open coding it 2022-02-28 13:21:32 +00:00
dns_resolver
dsa net: dsa: tag_rtl8_4: add rtl8_4t trailing variant 2022-03-05 11:04:25 +00:00
ethernet gro: remove rcu_read_lock/rcu_read_unlock from gro_complete handlers 2021-11-24 17:21:42 -08:00
ethtool ethtool: add support to set/get completion queue event size 2022-02-23 20:33:05 -08:00
hsr flow_dissector: Add support for HSR 2022-03-02 22:44:49 -08:00
ieee802154 net: ipv6: Handle delivery_time in ipv6 defrag 2022-03-03 14:38:48 +00:00
ife
ipv4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-03-03 11:55:12 -08:00
ipv6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-03-03 11:55:12 -08:00
iucv s390/iucv: sort out physical vs virtual pointers usage 2022-02-22 16:09:13 -08:00
kcm net: Don't include filter.h from net/sock.h 2021-12-29 08:48:14 -08:00
key xfrm: Check if_id in xfrm_migrate 2022-01-26 07:44:01 +01:00
l2tp l2tp: add netns refcount tracker to l2tp_dfs_seq_data 2021-12-10 06:38:27 -08:00
l3mdev
lapb
llc sock: Use sock_owned_by_user_nocheck() instead of sk_lock.owned. 2021-12-10 19:43:00 -08:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-03-03 11:55:12 -08:00
mac802154
mctp mctp: Avoid warning if unregister notifies twice 2022-02-25 22:23:23 -08:00
mpls net: mpls: Fix GCC 12 warning 2022-02-10 15:29:39 +00:00
mptcp mptcp: add the mibs for MP_RST 2022-03-04 21:54:30 -08:00
ncsi all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate 2022-01-15 08:47:31 -08:00
netfilter bpf: Replace __diag_ignore with unified __diag_ignore_all 2022-03-05 15:29:36 -08:00
netlabel lsm: security_task_getsecid_subj() -> security_current_getsecid_subj() 2021-11-22 17:52:47 -05:00
netlink net: Don't include filter.h from net/sock.h 2021-12-29 08:48:14 -08:00
netrom netrom: fix api breakage in nr_setsockopt() 2022-01-07 14:11:05 +00:00
nfc nfc: llcp: Revert "NFC: Keep socket alive until the DISC PDU is actually sent" 2022-03-03 10:43:37 +00:00
nsh
openvswitch net: Add skb_clear_tstamp() to keep the mono delivery_time 2022-03-03 14:38:48 +00:00
packet net: Handle delivery_time in skb->tstamp during network tapping with af_packet 2022-03-03 14:38:48 +00:00
phonet phonet/pep: refuse to enable an unbound pipe 2021-12-20 11:49:51 +00:00
psample
qrtr bus: mhi: core: Add an API for auto queueing buffers for DL channel 2021-12-17 17:17:14 +01:00
rds Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-12-16 16:13:19 -08:00
rfkill rfkill: allow to get the software rfkill state 2021-12-20 11:02:38 +01:00
rose net: Don't include filter.h from net/sock.h 2021-12-29 08:48:14 -08:00
rxrpc rxrpc: Adjust retransmission backoff 2022-01-22 02:03:24 +00:00
sched Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-03-03 11:55:12 -08:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-01-05 14:36:10 -08:00
smc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-03-03 11:55:12 -08:00
strparser bpf: sockmap, strparser, and tls are reusing qdisc_skb_cb and colliding 2021-11-09 01:05:28 +01:00
sunrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-02-10 17:29:56 -08:00
switchdev net: switchdev: remove lag_mod_cb from switchdev_handle_fdb_event_to_device 2022-02-24 21:31:43 -08:00
tipc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-02-24 17:54:25 -08:00
tls tls: cap the output scatter list to something reasonable 2022-02-04 10:14:07 +00:00
unix Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2022-01-24 15:42:29 -08:00
vmw_vsock vsock: remove vsock from connected table when connect is interrupted by a signal 2022-02-17 08:56:02 -08:00
wireless Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-03-03 11:55:12 -08:00
x25 net: x25: drop harmless check of !more 2021-12-09 18:35:11 -08:00
xdp i40e: xsk: Move tmp desc array from driver to pool 2022-01-27 17:25:32 +01:00
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-03-03 11:55:12 -08:00
compat.c
devres.c
Kconfig page_pool: Add allocation stats 2022-03-03 09:55:28 +00:00
Kconfig.debug net: add networking namespace refcount tracker 2021-12-10 06:38:26 -08:00
Makefile
socket.c net: fix documentation for kernel_getsockname 2022-02-14 14:01:19 +00:00
sysctl_net.c sections: move and rename core_kernel_data() to is_kernel_core_data() 2021-11-09 10:02:50 -08:00