linux/net/ipv4
Ricardo Dias 01770a1661 tcp: fix race condition when creating child sockets from syncookies
When the TCP stack is in SYN flood mode, the server child socket is
created from the SYN cookie received in a TCP packet with the ACK flag
set.

The child socket is created when the server receives the first TCP
packet with a valid SYN cookie from the client. Usually, this packet
corresponds to the final step of the TCP 3-way handshake, the ACK
packet. But is also possible to receive a valid SYN cookie from the
first TCP data packet sent by the client, and thus create a child socket
from that SYN cookie.

Since a client socket is ready to send data as soon as it receives the
SYN+ACK packet from the server, the client can send the ACK packet (sent
by the TCP stack code), and the first data packet (sent by the userspace
program) almost at the same time, and thus the server will equally
receive the two TCP packets with valid SYN cookies almost at the same
instant.

When such event happens, the TCP stack code has a race condition that
occurs between the momement a lookup is done to the established
connections hashtable to check for the existence of a connection for the
same client, and the moment that the child socket is added to the
established connections hashtable. As a consequence, this race condition
can lead to a situation where we add two child sockets to the
established connections hashtable and deliver two sockets to the
userspace program to the same client.

This patch fixes the race condition by checking if an existing child
socket exists for the same client when we are adding the second child
socket to the established connections socket. If an existing child
socket exists, we drop the packet and discard the second child socket
to the same client.

Signed-off-by: Ricardo Dias <rdias@singlestore.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20201120111133.GA67501@rdias-suse-pc.lan
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-23 16:32:33 -08:00
..
bpfilter net: Revert "net: optimize the sockptr_t for unified kernel/user address spaces" 2020-08-10 12:06:44 -07:00
netfilter netfilter: use actual socket sk rather than skb sk when routing harder 2020-10-30 12:57:39 +01:00
af_inet.c io_uring: allow tcp ancillary data for __sys_recvmsg_sock() 2020-08-24 16:16:06 -07:00
ah4.c inet: Use fallthrough; 2020-03-12 15:55:00 -07:00
arp.c net: Exempt multicast addresses from five-second neighbor lifetime 2020-11-13 14:24:39 -08:00
bpf_tcp_ca.c bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON 2020-09-25 13:58:01 -07:00
cipso_ipv4.c cipso: fix 'audit_secid' kernel-doc warning in cipso_ipv4.c 2020-09-08 20:03:36 -07:00
datagram.c inet: stop leaking jiffies on the wire 2019-11-01 14:57:52 -07:00
devinet.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-05-31 17:48:46 -07:00
esp4_offload.c net: Add MODULE_DESCRIPTION entries to network modules 2020-06-20 21:33:57 -07:00
esp4.c ESP: Export esp_output_fill_trailer function 2020-02-19 13:52:32 +01:00
fib_frontend.c ipv4: use IS_ENABLED instead of ifdef 2020-11-17 17:02:03 -08:00
fib_lookup.h net: add net available in build_state 2020-03-29 22:30:57 -07:00
fib_notifier.c net: fib_notifier: propagate extack down to the notifier block callback 2019-10-04 11:10:56 -07:00
fib_rules.c fib: use indirect call wrappers in the most common fib_rules_ops 2020-07-28 17:42:31 -07:00
fib_semantics.c net: Fix the arp error in some cases 2020-06-18 20:21:51 -07:00
fib_trie.c ipv4: Silence suspicious RCU usage warning 2020-08-26 15:58:48 -07:00
fou.c genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
gre_demux.c gre: fix uninit-value in __iptunnel_pull_header 2020-03-08 21:25:37 -07:00
gre_offload.c net: gre: recompute gre csum for sctp over gre tunnels 2020-08-03 15:29:44 -07:00
icmp.c icmp: randomize the global rate limiter 2020-10-16 16:47:09 -07:00
igmp.c ip*_mc_gsfget(): lift copyout of struct group_filter into callers 2020-05-20 20:31:27 -04:00
inet_connection_sock.c tcp: fix race condition when creating child sockets from syncookies 2020-11-23 16:32:33 -08:00
inet_diag.c inet_diag: Fix error path to cancel the meseage in inet_req_diag_fill() 2020-11-17 16:08:36 -08:00
inet_fragment.c inet: frags: re-introduce skb coalescing for local delivery 2019-08-08 15:55:10 -07:00
inet_hashtables.c tcp: fix race condition when creating child sockets from syncookies 2020-11-23 16:32:33 -08:00
inet_timewait_sock.c
inetpeer.c inetpeer: fix data-race in inet_putpeer / inet_putpeer 2019-11-07 16:15:56 -08:00
ip_forward.c ipv4: Revert removal of rt_uses_gateway 2019-09-20 18:23:33 -07:00
ip_fragment.c inet: frags: re-introduce skb coalescing for local delivery 2019-08-08 15:55:10 -07:00
ip_gre.c ip_gre: set dev->hard_header_len and dev->needed_headroom properly 2020-10-13 18:35:29 -07:00
ip_input.c bpf: Add socket assign support 2020-03-30 13:45:04 -07:00
ip_options.c net: clean up codestyle for net/ipv4 2020-08-25 06:28:02 -07:00
ip_output.c networking changes for the 5.10 merge window 2020-10-15 18:42:13 -07:00
ip_sockglue.c net: Remove duplicated midx check against 0 2020-08-25 06:23:59 -07:00
ip_tunnel_core.c tunnels: Fix off-by-one in lower MTU bounds for ICMP/ICMPv6 replies 2020-11-09 15:39:39 -08:00
ip_tunnel.c ip_tunnel: fix over-mtu packet send fail without TUNNEL_DONT_FRAGMENT flags 2020-10-31 17:19:02 -07:00
ip_vti.c ipv4: use dev_sw_netstats_rx_add() 2020-10-06 06:23:21 -07:00
ipcomp.c ipcomp: assign if_id to child tunnel from parent tunnel 2020-07-09 12:55:37 +02:00
ipconfig.c Documentation: nfsroot.rst: Fix references to nfsroot.rst 2020-03-02 13:11:46 -07:00
ipip.c net: ipip: implement header_ops->parse_protocol for AF_PACKET 2020-06-30 12:29:39 -07:00
ipmr_base.c net: fib_notifier: propagate extack down to the notifier block callback 2019-10-04 11:10:56 -07:00
ipmr.c ipmr: Use full VIF ID in netlink cache reports 2020-09-10 12:25:51 -07:00
Kconfig net: ipv4: remove duplicate "the the" phrase in Kconfig text 2020-08-18 16:02:16 -07:00
Makefile udp_tunnel: add central NIC RX port offload infrastructure 2020-07-10 13:54:00 -07:00
metrics.c
netfilter.c netfilter: use actual socket sk rather than skb sk when routing harder 2020-10-30 12:57:39 +01:00
netlink.c
nexthop.c nexthop: Fix performance regression in nexthop deletion 2020-10-19 20:07:15 -07:00
ping.c net: clean up codestyle 2020-08-31 12:33:34 -07:00
proc.c tcp: skip DSACKs with dubious sequence ranges 2020-09-24 20:15:45 -07:00
protocol.c
raw_diag.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-03-12 22:34:48 -07:00
raw.c networking changes for the 5.10 merge window 2020-10-15 18:42:13 -07:00
route.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-15 12:43:21 -07:00
syncookies.c net: Update window_clamp if SOCK_RCVBUF is set 2020-11-10 17:42:35 -08:00
sysctl_net_ipv4.c tcp: reflect tos value received in SYN to the socket 2020-09-10 13:15:40 -07:00
tcp_bbr.c tcp: only postpone PROBE_RTT if RTT is < current min_rtt estimate 2020-11-17 11:03:22 -08:00
tcp_bic.c tcp: fix stretch ACK bugs in BIC 2020-03-16 18:26:54 -07:00
tcp_bpf.c bpf, sockmap: Ensure SO_RCVBUF memory is observed on ingress redirect 2020-11-18 00:12:34 +01:00
tcp_cdg.c
tcp_cong.c tcp: Set INET_ECN_xmit configuration in tcp_reinit_congestion_control 2020-11-20 18:09:47 -08:00
tcp_cubic.c tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTT 2020-06-25 16:08:47 -07:00
tcp_dctcp.c
tcp_dctcp.h
tcp_diag.c inet_diag: Move the INET_DIAG_REQ_BYTECODE nlattr to cb->data 2020-02-27 18:50:19 -08:00
tcp_fastopen.c bpf: tcp: Add bpf_skops_established() 2020-08-24 14:35:00 -07:00
tcp_highspeed.c Replace HTTP links with HTTPS ones: IPv* 2020-07-06 13:23:03 -07:00
tcp_htcp.c Replace HTTP links with HTTPS ones: IPv* 2020-07-06 13:23:03 -07:00
tcp_hybla.c
tcp_illinois.c
tcp_input.c tcp: Prevent low rmem stalls with SO_RCVLOWAT. 2020-10-23 19:11:20 -07:00
tcp_ipv4.c tcp: fix race condition when creating child sockets from syncookies 2020-11-23 16:32:33 -08:00
tcp_lp.c
tcp_metrics.c genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
tcp_minisocks.c bpf: tcp: Do not limit cb_flags when creating child sk from listen sk 2020-10-02 11:34:48 -07:00
tcp_nv.c
tcp_offload.c
tcp_output.c tcp: add exponential backoff in __tcp_send_ack() 2020-09-30 14:21:30 -07:00
tcp_rate.c
tcp_recovery.c tcp: move tcp_mark_skb_lost 2020-09-25 17:17:14 -07:00
tcp_scalable.c net: ipv4: delete repeated words 2020-08-24 17:31:20 -07:00
tcp_timer.c inet: remove icsk_ack.blocked 2020-09-30 14:21:30 -07:00
tcp_ulp.c bpf: sockmap: Only check ULP for TCP sockets 2020-03-09 22:34:58 +01:00
tcp_vegas.c tcp: use semicolons rather than commas to separate statements 2020-10-13 17:11:52 -07:00
tcp_vegas.h
tcp_veno.c Replace HTTP links with HTTPS ones: IPv* 2020-07-06 13:23:03 -07:00
tcp_westwood.c
tcp_yeah.c tcp: fix stretch ACK bugs in Yeah 2020-03-16 18:26:55 -07:00
tcp.c tcp: Prevent low rmem stalls with SO_RCVLOWAT. 2020-10-23 19:11:20 -07:00
tunnel4.c tunnel4: add cb_handler to struct xfrm_tunnel 2020-07-09 12:51:36 +02:00
udp_bpf.c net: sk_msg: Simplify sk_psock initialization 2020-08-21 15:16:11 -07:00
udp_diag.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-03-12 22:34:48 -07:00
udp_impl.h net: pass a sockptr_t into ->setsockopt 2020-07-24 15:41:54 -07:00
udp_offload.c net: udp: fix IP header access and skb lookup on Fast/frag0 UDP GRO 2020-11-12 09:55:51 -08:00
udp_tunnel_core.c udp_tunnel: add central NIC RX port offload infrastructure 2020-07-10 13:54:00 -07:00
udp_tunnel_nic.c udp_tunnel: add the ability to share port tables 2020-09-28 12:50:12 -07:00
udp_tunnel_stub.c udp_tunnel: add central NIC RX port offload infrastructure 2020-07-10 13:54:00 -07:00
udp.c net: ipv4: delete repeated words 2020-08-24 17:31:20 -07:00
udplite.c net/ipv4: remove compat_ip_{get,set}sockopt 2020-07-19 18:16:41 -07:00
xfrm4_input.c xfrm: state: remove extract_input indirection from xfrm_state_afinfo 2020-05-06 09:40:08 +02:00
xfrm4_output.c xfrm: fix unused variable warning if CONFIG_NETFILTER=n 2020-05-11 15:12:27 +02:00
xfrm4_policy.c net: add bool confirm_neigh parameter for dst_ops.update_pmtu 2019-12-24 22:28:54 -08:00
xfrm4_protocol.c xfrm: add route lookup to xfrm4_rcv_encap 2019-12-09 09:59:07 +01:00
xfrm4_state.c xfrm: remove output_finish indirection from xfrm_state_afinfo 2020-05-06 09:40:08 +02:00
xfrm4_tunnel.c xfrm: interface: fix the priorities for ipip and ipv6 tunnels 2020-10-09 12:29:48 +02:00