linux/net/core
Martin KaFai Lau 1b98658968 bpf: Fix bpf_tcp_sock and bpf_sk_fullsock issue related to bpf_sk_release
Lorenz Bauer [thanks!] reported that a ptr returned by bpf_tcp_sock(sk)
can still be accessed after bpf_sk_release(sk).
Both bpf_tcp_sock() and bpf_sk_fullsock() have the same issue.
This patch addresses them together.

A simple reproducer looks like this:

	sk = bpf_sk_lookup_tcp();
	/* if (!sk) ... */
	tp = bpf_tcp_sock(sk);
	/* if (!tp) ... */
	bpf_sk_release(sk);
	snd_cwnd = tp->snd_cwnd; /* oops! The verifier does not complain. */

The problem is the verifier did not scrub the register's states of
the tcp_sock ptr (tp) after bpf_sk_release(sk).

[ Note that when calling bpf_tcp_sock(sk), the sk is not always
  refcount-acquired. e.g. bpf_tcp_sock(skb->sk). The verifier works
  fine for this case. ]

Currently, the verifier does not track if a helper's return ptr (in REG_0)
is "carry"-ing one of its argument's refcount status. To carry this info,
the reg1->id needs to be stored in reg0.

One approach was tried, like "reg0->id = reg1->id", when calling
"bpf_tcp_sock()".  The main idea was to avoid adding another "ref_obj_id"
for the same reg.  However, overlapping the NULL marking and ref
tracking purpose in one "id" does not work well:

	ref_sk = bpf_sk_lookup_tcp();
	fullsock = bpf_sk_fullsock(ref_sk);
	tp = bpf_tcp_sock(ref_sk);
	if (!fullsock) {
	     bpf_sk_release(ref_sk);
	     return 0;
	}
	/* fullsock_reg->id is marked for NOT-NULL.
	 * Same for tp_reg->id because they have the same id.
	 */

	/* oops. verifier did not complain about the missing !tp check */
	snd_cwnd = tp->snd_cwnd;

Hence, a new "ref_obj_id" is needed in "struct bpf_reg_state".
With a new ref_obj_id, when bpf_sk_release(sk) is called, the verifier can
scrub all reg states which has a ref_obj_id match.  It is done with the
changes in release_reg_references() in this patch.

While fixing it, sk_to_full_sk() is removed from bpf_tcp_sock() and
bpf_sk_fullsock() to avoid these helpers from returning
another ptr. It will make bpf_sk_release(tp) possible:

	sk = bpf_sk_lookup_tcp();
	/* if (!sk) ... */
	tp = bpf_tcp_sock(sk);
	/* if (!tp) ... */
	bpf_sk_release(tp);

A separate helper "bpf_get_listener_sock()" will be added in a later
patch to do sk_to_full_sk().

Misc change notes:
- To allow bpf_sk_release(tp), the arg of bpf_sk_release() is changed
  from ARG_PTR_TO_SOCKET to ARG_PTR_TO_SOCK_COMMON.  ARG_PTR_TO_SOCKET
  is removed from bpf.h since no helper is using it.

- arg_type_is_refcounted() is renamed to arg_type_may_be_refcounted()
  because ARG_PTR_TO_SOCK_COMMON is the only one and skb->sk is not
  refcounted.  All bpf_sk_release(), bpf_sk_fullsock() and bpf_tcp_sock()
  take ARG_PTR_TO_SOCK_COMMON.

- check_refcount_ok() ensures is_acquire_function() cannot take
  arg_type_may_be_refcounted() as its argument.

- The check_func_arg() can only allow one refcount-ed arg.  It is
  guaranteed by check_refcount_ok() which ensures at most one arg can be
  refcounted.  Hence, it is a verifier internal error if >1 refcount arg
  found in check_func_arg().

- In release_reference(), release_reference_state() is called
  first to ensure a match on "reg->ref_obj_id" can be found before
  scrubbing the reg states with release_reg_references().

- reg_is_refcounted() is no longer needed.
  1. In mark_ptr_or_null_regs(), its usage is replaced by
     "ref_obj_id && ref_obj_id == id" because,
     when is_null == true, release_reference_state() should only be
     called on the ref_obj_id obtained by a acquire helper (i.e.
     is_acquire_function() == true).  Otherwise, the following
     would happen:

	sk = bpf_sk_lookup_tcp();
	/* if (!sk) { ... } */
	fullsock = bpf_sk_fullsock(sk);
	if (!fullsock) {
		/*
		 * release_reference_state(fullsock_reg->ref_obj_id)
		 * where fullsock_reg->ref_obj_id == sk_reg->ref_obj_id.
		 *
		 * Hence, the following bpf_sk_release(sk) will fail
		 * because the ref state has already been released in the
		 * earlier release_reference_state(fullsock_reg->ref_obj_id).
		 */
		bpf_sk_release(sk);
	}

  2. In release_reg_references(), the current reg_is_refcounted() call
     is unnecessary because the id check is enough.

- The type_is_refcounted() and type_is_refcounted_or_null()
  are no longer needed also because reg_is_refcounted() is removed.

Fixes: 655a51e536 ("bpf: Add struct bpf_tcp_sock and BPF_FUNC_tcp_sock")
Reported-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-03-13 12:04:35 -07:00
..
datagram.c for-4.21/block-20181221 2018-12-28 13:19:59 -08:00
dev_addr_lists.c net: dev: Issue NETDEV_PRE_CHANGEADDR 2018-12-13 18:41:38 -08:00
dev_ioctl.c net: dev: Add extack argument to dev_set_mac_address() 2018-12-13 18:41:38 -08:00
dev.c net: dev: add generic protodown handler 2019-02-24 13:01:04 -08:00
devlink.c devlink: Add support for direct reporter health state update 2019-03-04 11:00:43 -08:00
drop_monitor.c
dst_cache.c
dst.c net: add a route cache full diagnostic message 2019-01-17 15:37:25 -08:00
ethtool.c ethtool: reduce stack usage with clang 2019-03-07 09:45:21 -08:00
failover.c
fib_notifier.c
fib_rules.c net/fib_rules: Update fib_nl_dumprule for strict data checking 2018-10-08 10:39:05 -07:00
filter.c bpf: Fix bpf_tcp_sock and bpf_sk_fullsock issue related to bpf_sk_release 2019-03-13 12:04:35 -07:00
flow_dissector.c net/flow_dissector: move bpf case into __skb_flow_bpf_dissect 2019-01-29 01:08:29 +01:00
flow_offload.c flow_offload: add flow action infrastructure 2019-02-06 10:38:25 -08:00
gen_estimator.c
gen_stats.c net: sched: put back q.qlen into a single location 2019-03-02 14:10:18 -08:00
gro_cells.c gro_cells: make sure device is up in gro_cells_receive() 2019-03-10 11:07:14 -07:00
hwbm.c
link_watch.c net: linkwatch: add check for netdevice being present to linkwatch_do_dev 2018-09-19 21:06:46 -07:00
lwt_bpf.c net: fix GSO in bpf_lwt_push_ip_encap 2019-03-07 10:41:29 +01:00
lwtunnel.c ip_tunnel: Add dst_cache support in lwtunnel_state of ip tunnel 2019-02-24 22:13:49 -08:00
Makefile flow_offload: add flow_rule and flow_match structures and use them 2019-02-06 10:38:25 -08:00
neighbour.c neigh: hook tracepoints in neigh update code 2019-02-17 10:33:39 -08:00
net_namespace.c net: namespace: perform strict checks also for doit handlers 2019-01-19 10:09:58 -08:00
net-procfs.c
net-sysfs.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-03-04 13:26:15 -08:00
net-sysfs.h
net-traces.c trace: events: add a few neigh tracepoints 2019-02-17 10:33:39 -08:00
netclassid_cgroup.c cgroup, netclassid: add a preemption point to write_classid 2018-10-23 12:58:17 -07:00
netevent.c
netpoll.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-12-27 13:04:52 -08:00
netprio_cgroup.c
page_pool.c page_pool: use DMA_ATTR_SKIP_CPU_SYNC for DMA mappings 2019-02-13 22:00:16 -08:00
pktgen.c pktgen: Fix fall-through annotation 2018-09-13 15:36:41 -07:00
ptp_classifier.c
request_sock.c
rtnetlink.c net: Get rid of SWITCHDEV_ATTR_ID_PORT_PARENT_ID 2019-02-06 14:17:03 -08:00
scm.c socket: Add SO_TIMESTAMPING_NEW 2019-02-03 11:17:31 -08:00
secure_seq.c
skbuff.c net: Do not allocate page fragments that are not skb aligned 2019-02-17 15:48:43 -08:00
skmsg.c bpf: Stop the psock parser before canceling its work 2019-03-07 15:16:20 +01:00
sock_diag.c
sock_map.c bpf: skmsg, fix psock create on existing kcm/tls port 2018-10-20 00:40:45 +02:00
sock_reuseport.c sctp: add sock_reuseport for the sock in __sctp_hash_endpoint 2018-11-12 09:09:51 -08:00
sock.c net: support 64bit rates for getsockopt(SO_MAX_PACING_RATE) 2019-03-01 23:08:30 -08:00
stream.c tcp: reduce POLLOUT events caused by TCP_NOTSENT_LOWAT 2018-12-04 21:21:18 -08:00
sysctl_net_core.c net: introduce a knob to control whether to inherit devconf config 2019-01-22 11:07:21 -08:00
timestamping.c
tso.c
utils.c
xdp.c xdp: remove redundant variable 'headroom' 2018-09-01 01:35:53 +02:00