linux/net
Daniel Borkmann fdadd04931 bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K
Michael and Sandipan report:

  Commit ede95a63b5 introduced a bpf_jit_limit tuneable to limit BPF
  JIT allocations. At compile time it defaults to PAGE_SIZE * 40000,
  and is adjusted again at init time if MODULES_VADDR is defined.

  For ppc64 kernels, MODULES_VADDR isn't defined, so we're stuck with
  the compile-time default at boot-time, which is 0x9c400000 when
  using 64K page size. This overflows the signed 32-bit bpf_jit_limit
  value:

  root@ubuntu:/tmp# cat /proc/sys/net/core/bpf_jit_limit
  -1673527296

  and can cause various unexpected failures throughout the network
  stack. In one case `strace dhclient eth0` reported:

  setsockopt(5, SOL_SOCKET, SO_ATTACH_FILTER, {len=11, filter=0x105dd27f8},
             16) = -1 ENOTSUPP (Unknown error 524)

  and similar failures can be seen with tools like tcpdump. This doesn't
  always reproduce however, and I'm not sure why. The more consistent
  failure I've seen is an Ubuntu 18.04 KVM guest booted on a POWER9
  host would time out on systemd/netplan configuring a virtio-net NIC
  with no noticeable errors in the logs.

Given this and also given that in near future some architectures like
arm64 will have a custom area for BPF JIT image allocations we should
get rid of the BPF_JIT_LIMIT_DEFAULT fallback / default entirely. For
4.21, we have an overridable bpf_jit_alloc_exec(), bpf_jit_free_exec()
so therefore add another overridable bpf_jit_alloc_exec_limit() helper
function which returns the possible size of the memory area for deriving
the default heuristic in bpf_jit_charge_init().

Like bpf_jit_alloc_exec() and bpf_jit_free_exec(), the new
bpf_jit_alloc_exec_limit() assumes that module_alloc() is the default
JIT memory provider, and therefore in case archs implement their custom
module_alloc() we use MODULES_{END,_VADDR} for limits and otherwise for
vmalloc_exec() cases like on ppc64 we use VMALLOC_{END,_START}.

Additionally, for archs supporting large page sizes, we should change
the sysctl to be handled as long to not run into sysctl restrictions
in future.

Fixes: ede95a63b5 ("bpf: add bpf_jit_limit knob to restrict unpriv allocations")
Reported-by: Sandipan Das <sandipan@linux.ibm.com>
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-12-11 19:12:21 -08:00
..
6lowpan 6lowpan: iphc: reset mac_header after decompress to fix panic 2018-07-06 12:32:12 +02:00
9p Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-11-03 10:35:52 -07:00
802
8021q netpoll: allow cleanup to be synchronous 2018-10-19 17:01:43 -07:00
appletalk
atm Revert "net: simplify sock_poll_wait" 2018-10-23 10:57:06 -07:00
ax25 ax25: remove blank line at EOF 2018-07-24 14:10:42 -07:00
batman-adv batman-adv: Expand merged fragment buffer for full packet 2018-11-12 10:41:29 +01:00
bluetooth Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-11-01 19:58:52 -07:00
bpf bpf: refactor bpf_test_run() to separate own failures and test program result 2018-12-01 12:33:58 -08:00
bpfilter net: bpfilter: Set user mode helper's command line 2018-10-22 19:37:36 -07:00
bridge net: bridge: fix vlan stats use-after-free on destruction 2018-11-17 21:38:44 -08:00
caif Revert "net: simplify sock_poll_wait" 2018-10-23 10:57:06 -07:00
can can: raw: check for CAN FD capable netdev in raw_sendmsg() 2018-11-09 17:19:34 +01:00
ceph libceph: fall back to sendmsg for slab pages 2018-11-19 17:59:47 +01:00
core bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K 2018-12-11 19:12:21 -08:00
dcb net: dcb: Add priority-to-DSCP map getters 2018-07-27 13:17:50 -07:00
dccp Revert "net: simplify sock_poll_wait" 2018-10-23 10:57:06 -07:00
decnet decnet: Remove unnecessary check for dev->name 2018-09-21 19:48:36 -07:00
dns_resolver dns: Allow the dns resolver to retrieve a server set 2018-10-04 09:40:52 -07:00
dsa net: dsa: Fix tagging attribute location 2018-11-30 17:17:39 -08:00
ethernet
hsr
ieee802154 net/ipfrag: let ip[6]frag_high_thresh in ns be higher than in init_net 2018-09-21 19:45:52 -07:00
ife
ipv4 ipv4: ipv6: netfilter: Adjust the frag mem limit when truesize changes 2018-12-05 20:44:46 -08:00
ipv6 ipv6: sr: properly initialize flowi6 prior passing to ip6_route_output 2018-12-07 12:22:39 -08:00
iucv Revert "net: simplify sock_poll_wait" 2018-10-23 10:57:06 -07:00
kcm Revert "kcm: remove any offset before parsing messages" 2018-09-17 18:43:42 -07:00
key Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2018-07-27 09:33:37 -07:00
l2tp l2tp: fix a sock refcnt leak in l2tp_tunnel_register 2018-11-14 22:49:31 -08:00
l3mdev
lapb
llc llc: do not use sk_eat_skb() 2018-10-22 19:59:20 -07:00
mac80211 mac80211: ignore NullFunc frames in the duplicate detection 2018-12-05 12:34:49 +01:00
mac802154 mac802154: Remove VLA usage of skcipher 2018-09-28 12:46:07 +08:00
mpls net/mpls: Handle kernel side filtering of route dumps 2018-10-16 00:14:07 -07:00
ncsi net/ncsi: Add NCSI Broadcom OEM command 2018-10-17 22:14:54 -07:00
netfilter netfilter: nf_tables: deactivate expressions in rule replecement routine 2018-11-28 10:56:40 +01:00
netlabel netlabel: check for IPV4MASK in addrinfo_get 2018-09-21 18:58:34 -07:00
netlink netlink: Add answer_flags to netlink_callback 2018-10-16 00:13:12 -07:00
netrom
nfc Merge branch 'work.tty-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-10-24 14:43:41 +01:00
nsh nsh: set mac len based on inner packet 2018-07-12 16:55:29 -07:00
openvswitch openvswitch: fix spelling mistake "execeeds" -> "exceeds" 2018-11-30 13:18:09 -08:00
packet packet: copy user buffers before orphan or clone 2018-11-23 11:08:03 -08:00
phonet
psample
qrtr net: qrtr: Reset the node and port ID of broadcast messages 2018-07-05 20:20:03 +09:00
rds Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-10-12 21:38:46 -07:00
rfkill Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-09-04 21:33:03 -07:00
rose
rxrpc rxrpc: Fix life check 2018-11-15 11:35:40 -08:00
sched net/sched: act_police: fix memory leak in case of invalid control action 2018-11-30 17:14:06 -08:00
sctp sctp: frag_point sanity check 2018-12-05 20:37:52 -08:00
smc net/smc: use after free fix in smc_wr_tx_put_slot() 2018-11-21 16:14:56 -08:00
strparser bpf, sockmap: convert to generic sk_msg interface 2018-10-15 12:23:19 -07:00
sunrpc NFS client bugfixes for Linux 4.20 2018-11-15 10:59:37 -06:00
switchdev
tipc tipc: fix lockdep warning during node delete 2018-11-27 16:30:39 -08:00
tls Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-11-01 19:58:52 -07:00
unix Revert "net: simplify sock_poll_wait" 2018-10-23 10:57:06 -07:00
vmw_vsock vsock: split dwork to avoid reinitializations 2018-08-07 12:39:13 -07:00
wimax wimax: remove blank lines at EOF 2018-07-24 14:10:42 -07:00
wireless cfg80211: Fix busy loop regression in ieee80211_ie_split_ric() 2018-12-05 12:51:29 +01:00
x25 net/x25: handle call collisions 2018-11-29 14:25:36 -08:00
xdp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-10-19 11:03:06 -07:00
xfrm Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-11-03 18:25:17 -07:00
compat.c y2038: socket: Change recvmmsg to use __kernel_timespec 2018-08-29 15:42:24 +02:00
Kconfig bpf, sockmap: convert to generic sk_msg interface 2018-10-15 12:23:19 -07:00
Makefile
socket.c socket: do a generic_file_splice_read when proto_ops has no splice_read 2018-11-17 21:34:11 -08:00
sysctl_net.c