linux/include/net/netns
Kuniyuki Iwashima 9804985bf2 udp: Introduce optional per-netns hash table.
The maximum UDP hash table size is 64K due to the nature of the protocol. [0]
This is smaller than TCP's, so a performance drop can appear with fewer sockets.

On an EC2 c5.24xlarge instance (192 GiB memory), after starting iperf3 in a
different netns, creating 32Mi sockets without data transfer in the root
netns degrades the throughput of the iperf3 connection.

  uhash_entries    sockets    length    Gbps
            64K          1         1    5.69
                       1Mi        16    5.27
                       2Mi        32    4.90
                       4Mi        64    4.09
                       8Mi       128    2.96
                      16Mi       256    2.06
                      32Mi       512    1.12
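
The "length" column matches the number of sockets divided by the number of
buckets.  A minimal sketch of that arithmetic, assuming sockets spread
evenly over the table; avg_chain_length() is a hypothetical helper for
illustration and not a kernel function:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: average sockets per bucket under an even spread. */
static uint64_t avg_chain_length(uint64_t sockets, uint64_t buckets)
{
	assert(buckets > 0);
	/* Round up so that a single socket still counts as a chain of 1. */
	return (sockets + buckets - 1) / buckets;
}
```

For example, avg_chain_length(32ULL << 20, 65536) corresponds to the last
row of the table (32Mi sockets over 64K buckets).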

The per-netns hash table breaks the lengthy lists into shorter ones.  It is
useful on a multi-tenant system with thousands of netns.  With smaller hash
tables, we can look up sockets faster, isolate noisy neighbours, and reduce
lock contention.

The max size of the per-netns table is 64K as well.  This is because the
hash values udp_hashfn() can produce within a single netns always span at
most 64K, so a table with more than 64K buckets could never use all of them.

  /* 0 < num < 64K  ->  X < hash < X + 64K */
  (num + net_hash_mix(net)) & mask;
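
The effect can be checked in userspace.  The sketch below is an
illustration only, not kernel code: bucket_of() mirrors the expression
above with a fixed "mix" standing in for net_hash_mix(net), and
reachable_buckets() is a hypothetical helper that counts how many buckets
the 64K port numbers can reach.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PORT_COUNT 65536u	/* UDP port numbers are 16 bits wide */

/* Same shape as the expression in the text: (num + mix) & mask. */
static uint32_t bucket_of(uint16_t num, uint32_t mix, uint32_t mask)
{
	return (num + mix) & mask;
}

/*
 * Hypothetical helper: count the distinct buckets reachable within one
 * netns, i.e. with a fixed mix.  table_size must be a power of two.
 */
static uint32_t reachable_buckets(uint32_t table_size, uint32_t mix)
{
	unsigned char *seen;
	uint32_t num, b, count = 0;

	assert(table_size && (table_size & (table_size - 1)) == 0);
	seen = calloc(table_size, 1);
	assert(seen);

	for (num = 0; num < PORT_COUNT; num++) {
		b = bucket_of((uint16_t)num, mix, table_size - 1);
		if (!seen[b]) {
			seen[b] = 1;
			count++;
		}
	}

	free(seen);
	return count;
}
```

With a 1Mi-bucket table, the count stays at 64K regardless of the mix, so
the remaining buckets would sit unused within that netns.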

Also, the min size is 128.  We use a bitmap to search for an available
port in udp_lib_get_port().  To keep the bitmap on the stack and not
fire the CONFIG_FRAME_WARN error at build time, we round up the table
size to 128.
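
A rough userspace sketch of the sizing rule follows.  It is not the
kernel implementation: pernet_table_size() and port_bitmap_bytes() are
hypothetical names, clamping to the maximum is an assumption (the kernel
may instead reject larger values), and the bitmap estimate assumes one bit
per port that can land in a single bucket.

```c
#include <assert.h>
#include <stdint.h>

#define PERNET_MIN_BUCKETS 128u		/* min size stated in the text */
#define PERNET_MAX_BUCKETS 65536u	/* max size stated in the text */
#define UDP_PORTS 65536u		/* 16-bit port space */

/* Smallest power of two that is >= n, for 1 <= n <= 2^31. */
static uint32_t round_up_pow2(uint32_t n)
{
	uint32_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Hypothetical: bucket count for a non-zero requested size. */
static uint32_t pernet_table_size(uint32_t requested)
{
	assert(requested > 0);	/* 0 means "share the global table" */

	if (requested < PERNET_MIN_BUCKETS)
		return PERNET_MIN_BUCKETS;
	if (requested > PERNET_MAX_BUCKETS)
		return PERNET_MAX_BUCKETS;	/* assumption: clamp */
	return round_up_pow2(requested);
}

/* Assumed bitmap cost: one bit per port sharing a bucket, in bytes. */
static uint32_t port_bitmap_bytes(uint32_t buckets)
{
	assert(buckets > 0);
	return (UDP_PORTS / buckets + 7) / 8;
}
```

Under these assumptions, a 128-bucket table needs a 512-bit (64-byte)
bitmap, while a smaller table would need proportionally more stack space.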

The sysctl usage is the same as for TCP:

  $ dmesg | cut -d ' ' -f 6- | grep "UDP hash"
  UDP hash table entries: 65536 (order: 9, 2097152 bytes, vmalloc)

  # sysctl net.ipv4.udp_hash_entries
  net.ipv4.udp_hash_entries = 65536  # can be changed by uhash_entries

  # sysctl net.ipv4.udp_child_hash_entries
  net.ipv4.udp_child_hash_entries = 0  # disabled by default

  # ip netns add test1
  # ip netns exec test1 sysctl net.ipv4.udp_hash_entries
  net.ipv4.udp_hash_entries = -65536  # share the global table

  # sysctl -w net.ipv4.udp_child_hash_entries=100
  net.ipv4.udp_child_hash_entries = 100

  # ip netns add test2
  # ip netns exec test2 sysctl net.ipv4.udp_hash_entries
  net.ipv4.udp_hash_entries = 128  # own a per-netns table with 2^n buckets

We could optimise the hash table lookup/iteration further by removing
the netns comparison for the per-netns one in the future.  Also, we
could optimise the sparse udp_hslot layout by putting it in udp_table.

[0]: https://lore.kernel.org/netdev/4ACC2815.7010101@gmail.com/

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-16 09:43:35 +00:00
bpf.h bpf: Invert the dependency between bpf-netns.h and netns/bpf.h 2021-12-29 20:03:05 -08:00
can.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
conntrack.h netfilter: remove nf_conntrack_helper sysctl and modparam toggles 2022-08-31 12:12:32 +02:00
core.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
flow_table.h netfilter: nf_flow_table: count pending offload workqueue tasks 2022-07-11 16:25:14 +02:00
generic.h netns: Replace zero-length array with DECLARE_FLEX_ARRAY() helper 2022-09-28 18:51:47 -07:00
hash.h netns: provide pure entropy for net_hash_mix() 2019-03-28 17:00:45 -07:00
ieee802154_6lowpan.h net: dynamically allocate fqdir structures 2019-05-26 14:08:05 -07:00
ipv4.h udp: Introduce optional per-netns hash table. 2022-11-16 09:43:35 +00:00
ipv6.h ipv6: make ip6_rt_gc_expire an atomic_t 2022-04-15 14:28:50 -07:00
mctp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
mib.h net: reorganize fields in netns_mib 2021-04-02 14:31:44 -07:00
mpls.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
netfilter.h Remove DECnet support from kernel 2022-08-22 14:26:30 +01:00
nexthop.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
nftables.h net: remove obsolete members from struct net 2021-04-06 00:34:53 +02:00
packet.h
sctp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
smc.h net/smc: Unbind r/w buffer size from clcsock and make them tunable 2022-09-22 12:58:21 +02:00
unix.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
xdp.h net: xsk: track AF_XDP sockets on a per-netns list 2019-01-25 01:50:03 +01:00
xfrm.h xfrm: rework default policy structure 2022-03-18 07:23:12 +01:00