linux/net
Daniel Borkmann b4ab314149 bpf: Add redirect_neigh helper as redirect drop-in
Add a redirect_neigh() helper as a drop-in replacement for redirect()
on the xmit side. The helper is meant to be very similar in semantics
to the latter, except that the skb gets injected into the neighboring
subsystem so that the stack does the work it knows best anyway, which
is populating the L2 addresses of the packet, and then hands the skb
over to dev_queue_xmit() as redirect() does.
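
As a rough illustration of the difference described above, the following
is a toy user-space model in C and not kernel code. Every name in it
(toy_skb, toy_neigh_table, toy_redirect, toy_redirect_neigh) is
hypothetical and exists only for this sketch. It assumes a static
neighbor table and treats a missing entry as a plain failure, which
simplifies what a real neighbor lookup would do.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy user-space model only; none of these names exist in the kernel. */
#define TOY_ETH_ALEN 6

struct toy_skb {
    uint32_t next_hop;               /* L3 next hop the packet is sent to */
    uint8_t  eth_dst[TOY_ETH_ALEN];  /* L2 destination address */
    bool     queued;                 /* models the hand-off to dev_queue_xmit() */
};

struct toy_neigh {
    uint32_t ip;
    uint8_t  mac[TOY_ETH_ALEN];
};

/* Hypothetical static table standing in for the neighboring subsystem. */
static const struct toy_neigh toy_neigh_table[] = {
    { 0x0a000001u, { 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 } },
    { 0x0a000002u, { 0x02, 0x00, 0x00, 0x00, 0x00, 0x02 } },
};

/* Models redirect(): the frame is queued as is, L2 addresses untouched. */
static int toy_redirect(struct toy_skb *skb)
{
    skb->queued = true;
    return 0;
}

/* Models redirect_neigh(): resolve the next hop, fill in L2, then queue. */
static int toy_redirect_neigh(struct toy_skb *skb)
{
    size_t n = sizeof(toy_neigh_table) / sizeof(toy_neigh_table[0]);

    for (size_t i = 0; i < n; i++) {
        if (toy_neigh_table[i].ip != skb->next_hop)
            continue;
        memcpy(skb->eth_dst, toy_neigh_table[i].mac, TOY_ETH_ALEN);
        return toy_redirect(skb);  /* same final hand-off as redirect() */
    }
    return -1;                     /* simplification: no entry, nothing queued */
}
```

In this model a caller of toy_redirect() has to have written eth_dst
itself beforehand, while a caller of toy_redirect_neigh() only supplies
the next hop and leaves the L2 addresses to the lookup.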

This solves two bigger items: i) skbs don't need to go up to the
stack on the host-facing veth ingress side for traffic egressing
the container just to get their L2 addresses populated, which also
has the huge advantage that ii) skb->sk won't get orphaned in
ip_rcv_core() when entering the IP routing layer on the host stack.

Given that skb->sk does not get orphaned when crossing the netns
either, as per 9c4c325252 ("skbuff: preserve sock reference when
scrubbing the skb."), the helper can then push the skbs directly to
the phys device, where the FQ scheduler can do its work and the TCP
stack gets proper backpressure, given that we hold on to skb->sk as
long as the skb is still residing in queues.

With the helper used in the BPF data path to push the skb to the
phys device, I observed a stable/consistent TCP_STREAM improvement
on veth devices for traffic going container -> host -> host ->
container, from ~10Gbps to ~15Gbps for a single stream in my test
environment.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/bpf/f207de81629e1724899b73b8112e0013be782d35.1601477936.git.daniel@iogearbox.net
2020-09-30 11:50:35 -07:00
6lowpan
9p treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
802
8021q treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
appletalk appletalk: Fix atalk_proc_init() return path 2020-08-03 15:48:32 -07:00
atm net: atm: delete duplicated words 2020-09-18 14:12:43 -07:00
ax25 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-07-25 17:49:04 -07:00
batman-adv net: bridge: mcast: rename br_ip's u member to dst 2020-09-23 13:24:34 -07:00
bluetooth net: bluetooth: delete duplicated words 2020-09-18 14:12:43 -07:00
bpf bpf: fix raw_tp test run in preempt kernel 2020-09-30 08:34:08 -07:00
bpfilter bpf: Add kernel module with user mode driver that populates bpffs. 2020-08-20 16:02:36 +02:00
bridge net: bridge: mcast: when forwarding handle filter mode and blocked flag 2020-09-23 13:24:35 -07:00
caif caif: Remove duplicate macro SRVL_CTRL_PKT_SIZE 2020-09-05 15:57:05 -07:00
can can: remove "WITH Linux-syscall-note" from SPDX tag of C files 2020-09-21 10:13:16 +02:00
ceph treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
core bpf: Add redirect_neigh helper as redirect drop-in 2020-09-30 11:50:35 -07:00
dcb net: DCB: Validate DCB_ATTR_DCB_BUFFER argument 2020-09-10 15:09:08 -07:00
dccp ip: pass tos into ip_build_and_send_pkt() 2020-09-10 13:15:40 -07:00
decnet treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
dns_resolver
dsa Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
ethernet
ethtool Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
hsr Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
ieee802154 treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
ife
ipv4 bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON 2020-09-25 13:58:01 -07:00
ipv6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
iucv treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
kcm net: pass a sockptr_t into ->setsockopt 2020-07-24 15:41:54 -07:00
key Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-08-02 01:02:12 -07:00
l2tp l2tp: fix up inconsistent rx/tx statistics 2020-09-18 14:36:54 -07:00
l3mdev net: Fix some comments 2020-08-27 07:55:59 -07:00
lapb
llc net: pass a sockptr_t into ->setsockopt 2020-07-24 15:41:54 -07:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
mac802154 Merge tag 'ieee802154-for-davem-2020-09-08' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan 2020-09-08 20:12:58 -07:00
mpls treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
mptcp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
ncsi treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
netlabel netlabel: Fix some kernel-doc warnings 2020-09-08 20:04:27 -07:00
netlink netlink: add spaces around '&' in netlink_recv/sendmsg() 2020-09-17 16:53:47 -07:00
netrom treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
nfc NFC: digital: Remove two unused macroes 2020-09-05 16:01:52 -07:00
nsh
openvswitch net: openswitch: reuse the helper variable to improve the code readablity 2020-09-18 14:24:08 -07:00
packet net/packet: Fix a comment about network_header 2020-09-19 16:40:48 -07:00
phonet treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
psample
qrtr net: qrtr: check skb_put_padto() return value 2020-09-09 11:04:39 -07:00
rds RDS: drop double zeroing 2020-09-20 19:09:11 -07:00
rfkill
rose treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
rxrpc rxrpc: Fix an overget of the conn bundle when setting up a client conn 2020-09-14 16:18:59 +01:00
sched Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
smc net/smc: fix double kfree in smc_listen_work() 2020-09-17 18:03:56 -07:00
strparser
sunrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
switchdev net: switchdev: kerneldoc fixes 2020-07-13 17:20:40 -07:00
tipc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
tls net/tls: Implement getsockopt SOL_TLS TLS_RX 2020-09-01 11:47:12 -07:00
unix net: unix: remove redundant assignment to variable 'err' 2020-09-21 14:51:37 -07:00
vmw_vsock vsock: fix potential null pointer dereference in vsock_poll() 2020-08-12 12:56:06 -07:00
wimax
wireless Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-22 16:45:34 -07:00
x25 treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
xdp xsk: Fix a documentation mistake in xsk_queue.h 2020-09-29 11:25:56 -07:00
xfrm treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
compat.c net/scm: Fix typo in SCM_RIGHTS compat refactoring 2020-08-07 12:43:25 -07:00
devres.c
Kconfig net: ethtool: Remove PHYLIB direct dependency 2020-07-07 15:41:05 -07:00
Makefile
socket.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-04 21:28:59 -07:00
sysctl_net.c