linux/net/ipv6
Kuniyuki Iwashima 69421bf984 udp: Update reuse->has_conns under reuseport_lock.
When we call connect() for a UDP socket in a reuseport group, we have
to update sk->sk_reuseport_cb->has_conns to 1.  Otherwise, the kernel
could select a unconnected socket wrongly for packets sent to the
connected socket.

However, the current way to set has_conns is illegal and possible to
trigger that problem.  reuseport_has_conns() changes has_conns under
rcu_read_lock(), which upgrades the RCU reader to the updater.  Then,
it must do the update under the updater's lock, reuseport_lock, but
it doesn't for now.

For this reason, there is a race below where we fail to set has_conns
resulting in the wrong socket selection.  To avoid the race, let's split
the reader and updater with proper locking.

 cpu1                               cpu2
+----+                             +----+

__ip[46]_datagram_connect()        reuseport_grow()
.                                  .
|- reuseport_has_conns(sk, true)   |- more_reuse = __reuseport_alloc(more_socks_size)
|  .                               |
|  |- rcu_read_lock()
|  |- reuse = rcu_dereference(sk->sk_reuseport_cb)
|  |
|  |                               |  /* reuse->has_conns == 0 here */
|  |                               |- more_reuse->has_conns = reuse->has_conns
|  |- reuse->has_conns = 1         |  /* more_reuse->has_conns SHOULD BE 1 HERE */
|  |                               |
|  |                               |- rcu_assign_pointer(reuse->socks[i]->sk_reuseport_cb,
|  |                               |                     more_reuse)
|  `- rcu_read_unlock()            `- kfree_rcu(reuse, rcu)
|
|- sk->sk_state = TCP_ESTABLISHED

Note the likely(reuse) in reuseport_has_conns_set() is always true,
but we put the test there for ease of review.  [0]

For the record, usually, sk_reuseport_cb is changed under lock_sock().
The only exception is reuseport_grow() & TCP reqsk migration case.

  1) shutdown() TCP listener, which is moved into the latter part of
     reuse->socks[] to migrate reqsk.

  2) New listen() overflows reuse->socks[] and call reuseport_grow().

  3) reuse->max_socks overflows u16 with the new listener.

  4) reuseport_grow() pops the old shutdown()ed listener from the array
     and update its sk->sk_reuseport_cb as NULL without lock_sock().

shutdown()ed TCP sk->sk_reuseport_cb can be changed without lock_sock(),
but, reuseport_has_conns_set() is called only for UDP under lock_sock(),
so likely(reuse) never be false in reuseport_has_conns_set().

[0]: https://lore.kernel.org/netdev/CANn89iLja=eQHbsM_Ta2sQF0tOGU8vAGrh_izRuuHjuO1ouUag@mail.gmail.com/

Fixes: acdcecc612 ("udp: correct reuseport selection with connected sockets")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20221014182625.89913-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-10-18 10:17:18 +02:00
..
ila genetlink: start to validate reserved header bytes 2022-08-29 12:47:15 +01:00
netfilter netfilter: rpfilter/fib: Populate flowic_l3mdev field 2022-10-12 14:08:15 +02:00
addrconf_core.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
addrconf.c bonding: add all node mcast address when slave up 2022-09-05 10:07:05 +01:00
addrlabel.c ipv6: addrlabel: fix possible memory leak in ip6addrlbl_net_init 2020-11-25 11:20:16 -08:00
af_inet6.c tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct(). 2022-10-12 17:50:37 -07:00
ah6.c xfrm: ah: add extack to ah_init_state, ah6_init_state 2022-09-29 07:17:59 +02:00
anycast.c
calipso.c cipso,calipso: resolve a number of problems with the DOI refcounts 2021-03-04 15:26:57 -08:00
datagram.c udp: Update reuse->has_conns under reuseport_lock. 2022-10-18 10:17:18 +02:00
esp6_offload.c esp: choose the correct inner protocol for GSO on inter address family tunnels 2022-08-29 10:20:58 +02:00
esp6.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2022-10-03 07:52:13 +01:00
exthdrs_core.c
exthdrs_offload.c
exthdrs.c net: ipv6: add skb drop reasons to TLV parse 2022-04-13 13:09:57 +01:00
fib6_notifier.c
fib6_rules.c ipv6: change fib6_rules_net_exit() to batch mode 2022-02-08 20:41:34 -08:00
fou6.c
icmp.c icmp: Fix data-races around sysctl_icmp_echo_enable_probe. 2022-07-13 12:56:49 +01:00
inet6_connection_sock.c lsm,selinux: pass flowi_common instead of flowi to the LSM hooks 2020-11-23 18:36:21 -05:00
inet6_hashtables.c tcp: Access &tcp_hashinfo via net. 2022-09-20 10:21:49 -07:00
ioam6_iptunnel.c ipv6: ioam: Insertion frequency in lwtunnel output 2022-02-04 20:24:45 -08:00
ioam6.c genetlink: start to validate reserved header bytes 2022-08-29 12:47:15 +01:00
ip6_checksum.c
ip6_fib.c ipv6: annotate accesses to fn->fn_sernum 2022-01-20 20:18:37 -08:00
ip6_flowlabel.c ipv6: per-netns exclusive flowlabel checks 2022-02-16 20:37:47 -08:00
ip6_gre.c ipv6: move from strlcpy with unused retval to strscpy 2022-08-22 17:59:42 -07:00
ip6_icmp.c net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending 2021-02-23 11:29:52 -08:00
ip6_input.c tcp/udp: Make early_demux back namespacified. 2022-07-15 18:50:35 -07:00
ip6_offload.c net-next: skbuff: refactor pskb_pull 2022-09-30 12:31:46 +01:00
ip6_offload.h
ip6_output.c net: shrink struct ubuf_info 2022-09-28 18:51:25 -07:00
ip6_tunnel.c net: Add helper function to parse netlink msg of ip_tunnel_encap 2022-10-03 07:59:06 +01:00
ip6_udp_tunnel.c
ip6_vti.c ip6_vti:Remove the space before the comma 2022-09-30 13:10:44 +01:00
ip6mr.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-09-22 13:02:10 -07:00
ipcomp6.c xfrm: ipcomp: add extack to ipcomp{4,6}_init_state 2022-09-29 07:18:00 +02:00
ipv6_sockglue.c tcp: Fix data races around icsk->icsk_af_ops. 2022-10-12 17:50:37 -07:00
Kconfig crypto: lib - make the sha1 library optional 2022-07-15 16:43:59 +08:00
Makefile net: ipv6: use ipv6-y directly instead of ipv6-objs 2021-09-28 13:13:40 +01:00
mcast_snoop.c net: bridge: mcast: fix broken length + header check for MRDv6 Adv. 2021-04-27 14:02:06 -07:00
mcast.c bpf: net: Change do_ipv6_getsockopt() to take the sockptr_t argument 2022-09-02 20:34:31 -07:00
mip6.c xfrm: mip6: add extack to mip6_destopt_init_state, mip6_rthdr_init_state 2022-09-29 07:18:01 +02:00
ndisc.c net: fix potential refcount leak in ndisc_router_discovery() 2022-08-15 11:40:28 +01:00
netfilter.c netfilter: Use l3mdev flow key when re-routing mangled packets 2022-05-16 13:03:29 +02:00
output_core.c ipv6: use prandom_u32() for ID generation 2021-05-31 22:12:08 -07:00
ping.c inet: ping: fix recent breakage 2022-10-12 09:10:02 +01:00
proc.c
protocol.c
raw.c raw: remove unused variables from raw6_icmp_error() 2022-06-22 18:48:08 -07:00
reassembly.c net: ipv6: Handle delivery_time in ipv6 defrag 2022-03-03 14:38:48 +00:00
route.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-07-14 15:27:35 -07:00
rpl_iptunnel.c net: ipv6: rpl_iptunnel: simplify the return expression of rpl_do_srh() 2020-12-08 16:22:54 -08:00
rpl.c
seg6_hmac.c net: ipv6: unexport __init-annotated seg6_hmac_net_init() 2022-06-28 21:23:30 -07:00
seg6_iptunnel.c seg6: add support for SRv6 H.L2Encaps.Red behavior 2022-07-29 12:14:03 +01:00
seg6_local.c seg6: add NEXT-C-SID support for SRv6 End behavior 2022-09-20 12:33:22 +02:00
seg6.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-09-08 18:38:30 +02:00
sit.c net: Add helper function to parse netlink msg of ip_tunnel_parm 2022-10-03 07:59:06 +01:00
syncookies.c tcp: Fix data-races around sysctl_tcp_syncookies. 2022-07-18 12:21:54 +01:00
sysctl_net_ipv6.c net: sysctl: introduce sysctl SYSCTL_THREE 2022-05-03 10:15:06 +02:00
tcp_ipv6.c tcp: Fix data races around icsk->icsk_af_ops. 2022-10-12 17:50:37 -07:00
tcpv6_offload.c net: move gro definitions to include/net/gro.h 2021-11-16 13:16:54 +00:00
tunnel6.c
udp_impl.h tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct(). 2022-10-12 17:50:37 -07:00
udp_offload.c gro: remove rcu_read_lock/rcu_read_unlock from gro_receive handlers 2021-11-24 17:21:42 -08:00
udp.c udp: Update reuse->has_conns under reuseport_lock. 2022-10-18 10:17:18 +02:00
udplite.c tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct(). 2022-10-12 17:50:37 -07:00
xfrm6_input.c
xfrm6_output.c xfrm: fix tunnel model fragmentation behavior 2022-03-01 12:08:40 +01:00
xfrm6_policy.c net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
xfrm6_protocol.c
xfrm6_state.c
xfrm6_tunnel.c xfrm: tunnel: add extack to ipip_init_state, xfrm6_tunnel_init_state 2022-09-29 07:18:00 +02:00