linux/net
Eric Dumazet 8ee602c635 tcp: try to send bigger TSO packets
While investigating TCP performance, I found that TCP would
sometimes send big skbs followed by a single MSS skb,
in a 'locked' pattern.

For instance, BIG TCP is enabled, MSS is set to carry 4096 bytes
of payload per segment, and gso_max_size is set to 181000.

This means that an optimal TCP packet should contain
44 * 4096 = 180224 bytes of payload.

However, I was seeing packet sizes interleaved in this pattern:

172032, 8192, 172032, 8192, 172032, 8192, <repeat>
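
The arithmetic behind these numbers can be checked with a short sketch. It is only an illustration, not kernel code: the helper names are hypothetical, and it assumes the optimal payload is simply the largest whole number of MSS-sized segments that fits under gso_max_size, ignoring header overhead.

```python
# Values quoted in the commit message above.
GSO_MAX_SIZE = 181000  # gso_max_size, in bytes
MSS_PAYLOAD = 4096     # payload bytes per segment


def optimal_payload(gso_max_size: int, mss: int) -> tuple[int, int]:
    """Return (segment count, payload bytes) for the largest whole-MSS packet.

    Hypothetical helper: headers are ignored for simplicity.
    """
    segs = gso_max_size // mss
    return segs, segs * mss


def in_segments(payload: int, mss: int) -> int:
    """Return how many full MSS segments a payload holds."""
    segs, rem = divmod(payload, mss)
    if rem:
        raise ValueError(f"{payload} is not a multiple of {mss}")
    return segs


segs, payload = optimal_payload(GSO_MAX_SIZE, MSS_PAYLOAD)
print(segs, payload)  # 44 180224

# The observed 'locked' pattern: a big skb followed by a small one.
observed = [172032, 8192]
print([in_segments(p, MSS_PAYLOAD) for p in observed])  # [42, 2]
print(sum(observed) == payload)  # True
```

Each pair in the pattern carries 42 + 2 = 44 segments, so the sender moves one optimal packet's worth of payload but spends two skbs doing it.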

The tcp_tso_should_defer() heuristic is defeated because, after a
packet in the write queue is split for whatever reason (this might be
a too small CWND or a small enough pacing_rate),
the leftover packet in the queue is smaller than the optimal size.

It is time to try to make 'leftover packets' bigger so that
tcp_tso_should_defer() can reach its full potential.

After this patch, we can see the following output:

14:13:34.009273 IP6 sender > receiver: Flags [P.], seq 4048380:4098360, ack 1, win 256, options [nop,nop,TS val 3425678144 ecr 1561784500], length 49980
14:13:34.010272 IP6 sender > receiver: Flags [P.], seq 4098360:4148340, ack 1, win 256, options [nop,nop,TS val 3425678145 ecr 1561784501], length 49980
14:13:34.011271 IP6 sender > receiver: Flags [P.], seq 4148340:4198320, ack 1, win 256, options [nop,nop,TS val 3425678146 ecr 1561784502], length 49980
14:13:34.012271 IP6 sender > receiver: Flags [P.], seq 4198320:4248300, ack 1, win 256, options [nop,nop,TS val 3425678147 ecr 1561784503], length 49980
14:13:34.013272 IP6 sender > receiver: Flags [P.], seq 4248300:4298280, ack 1, win 256, options [nop,nop,TS val 3425678148 ecr 1561784504], length 49980
14:13:34.014271 IP6 sender > receiver: Flags [P.], seq 4298280:4348260, ack 1, win 256, options [nop,nop,TS val 3425678149 ecr 1561784505], length 49980
14:13:34.015272 IP6 sender > receiver: Flags [P.], seq 4348260:4398240, ack 1, win 256, options [nop,nop,TS val 3425678150 ecr 1561784506], length 49980
14:13:34.016270 IP6 sender > receiver: Flags [P.], seq 4398240:4448220, ack 1, win 256, options [nop,nop,TS val 3425678151 ecr 1561784507], length 49980
14:13:34.017269 IP6 sender > receiver: Flags [P.], seq 4448220:4498200, ack 1, win 256, options [nop,nop,TS val 3425678152 ecr 1561784508], length 49980
14:13:34.018276 IP6 sender > receiver: Flags [P.], seq 4498200:4548180, ack 1, win 256, options [nop,nop,TS val 3425678153 ecr 1561784509], length 49980
14:13:34.019259 IP6 sender > receiver: Flags [P.], seq 4548180:4598160, ack 1, win 256, options [nop,nop,TS val 3425678154 ecr 1561784510], length 49980

With 200 concurrent flows on a 100Gbit NIC, we can see a reduction
of about 30% in TSO packets (and ACK packets).

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240418214600.1291486-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-22 14:25:28 -07:00
6lowpan net: fill in MODULE_DESCRIPTION()s for 6LoWPAN 2024-02-09 14:12:01 -08:00
9p 9p: Fix read/write debug statements to report server reply 2024-02-12 21:18:54 +09:00
802
8021q netlink: introduce type-checking attribute iteration 2024-03-29 15:06:02 -07:00
appletalk
atm ipv4: Set scope explicitly in ip_route_output(). 2024-04-08 13:20:51 +01:00
ax25 sysctl: treewide: constify ctl_table_header::ctl_table_arg 2024-04-22 08:56:31 +01:00
batman-adv Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-04-11 14:23:47 -07:00
bluetooth Bluetooth: l2cap: Don't double set the HCI_CONN_MGMT_CONNECTED bit 2024-04-10 15:10:16 -04:00
bpf bpf: Check return from set_memory_rox() 2024-03-18 14:18:47 -07:00
bridge sysctl: treewide: constify ctl_table_header::ctl_table_arg 2024-04-22 08:56:31 +01:00
caif caif: Use UTILITY_NAME_LENGTH instead of hard-coding 16 2024-04-02 18:20:00 -07:00
can linux-can-next-for-6.9-20240220 2024-02-20 15:32:45 +01:00
ceph libceph: init the cursor when preparing sparse read in msgr2 2024-03-06 12:43:01 +01:00
core sysctl: treewide: constify ctl_table_header::ctl_table_arg 2024-04-22 08:56:31 +01:00
dcb
dccp tcp/dccp: do not care about families in inet_twsk_purge() 2024-04-01 21:27:58 -07:00
devlink devlink: Support setting max_io_eqs 2024-04-08 14:10:45 +01:00
dns_resolver
dsa net: dsa: convert dsa_user_phylink_fixed_state() to use dsa_phylink_to_port() 2024-04-15 10:48:41 +01:00
ethernet
ethtool net: ethtool: pse-pd: Expand pse commands with the PSE PoE interface 2024-04-18 18:27:02 -07:00
handshake net/handshake: remove redundant assignment to variable ret 2024-04-16 17:14:55 -07:00
hsr net: hsr: Use full string description when opening HSR network device 2024-03-29 10:42:21 +00:00
ieee802154 sysctl: treewide: constify ctl_table_header::ctl_table_arg 2024-04-22 08:56:31 +01:00
ife
ipv4 tcp: try to send bigger TSO packets 2024-04-22 14:25:28 -07:00
ipv6 sysctl: treewide: constify ctl_table_header::ctl_table_arg 2024-04-22 08:56:31 +01:00
iucv net/iucv: Avoid explicit cpumask var allocation on stack 2024-04-02 18:19:09 -07:00
kcm net: kcm: fix incorrect parameter validation in the kcm_getsockopt) function 2024-03-11 09:53:22 +00:00
key net: fill in MODULE_DESCRIPTION()s for af_key 2024-02-09 14:12:01 -08:00
l2tp l2tp: fix incorrect parameter validation in the pppol2tp_getsockopt() function 2024-03-11 09:53:22 +00:00
l3mdev
lapb
llc
mac80211 wireless-next patches for v6.10 2024-04-03 19:36:57 -07:00
mac802154 mac802154: fix llsec key resources release in mac802154_llsec_key_del 2024-03-06 21:01:26 +01:00
mctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-02-29 14:24:56 -08:00
mpls sysctl: treewide: constify ctl_table_header::ctl_table_arg 2024-04-22 08:56:31 +01:00
mptcp sysctl: treewide: constify ctl_table_header::ctl_table_arg 2024-04-22 08:56:31 +01:00
ncsi
netfilter sysctl: treewide: constify ctl_table_header::ctl_table_arg 2024-04-22 08:56:31 +01:00
netlabel netlabel: remove impossible return value in netlbl_bitmap_walk 2024-02-28 19:37:34 -08:00
netlink netlink: create a new header for internal genetlink symbols 2024-04-01 21:44:34 -07:00
netrom netrom: Fix data-races around sysctl_net_busy_read 2024-03-07 10:36:58 +01:00
nfc net: nfc: remove inappropriate attrs check 2024-04-12 18:52:35 -07:00
nsh
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-04-11 14:23:47 -07:00
packet af_packet: avoid a false positive warning in packet_setsockopt() 2024-04-08 13:19:01 +01:00
phonet phonet/pep: fix racy skb_queue_empty() use 2024-02-22 09:05:50 +01:00
psample ip_tunnel: convert __be16 tunnel flags to bitmaps 2024-04-01 10:49:28 +01:00
qrtr
rds net/rds: fix possible cp null dereference 2024-03-29 12:04:09 -07:00
rfkill net: rfkill: gpio: Convert to platform remove callback returning void 2024-03-25 15:40:22 +01:00
rose
rxrpc net: add sk_wake_async_rcu() helper 2024-03-29 15:03:11 -07:00
sched net_sched: sch_skbprio: implement lockless skbprio_dump() 2024-04-19 11:34:08 +01:00
sctp sysctl: treewide: constify ctl_table_header::ctl_table_arg 2024-04-22 08:56:31 +01:00
smc sysctl: treewide: constify ctl_table_header::ctl_table_arg 2024-04-22 08:56:31 +01:00
strparser
sunrpc nfsd-6.9 fixes: 2024-04-06 09:37:50 -07:00
switchdev net: bridge: switchdev: Skip MDB replays of deferred events on offload 2024-02-16 09:36:37 +00:00
tipc tipc: remove redundant assignment to ret, simplify code 2024-04-12 19:07:31 -07:00
tls tls: remove redundant assignment to variable decrypted 2024-04-11 20:00:22 -07:00
unix sysctl: treewide: constify ctl_table_header::ctl_table_arg 2024-04-22 08:56:31 +01:00
vmw_vsock vsock/virtio: fix packet delivery to tap device 2024-04-02 18:00:24 -07:00
wireless wireless-next patches for v6.10 2024-04-03 19:36:57 -07:00
x25 net/x25: fix incorrect parameter validation in the x25_getsockopt() function 2024-03-11 09:53:22 +00:00
xdp xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING 2024-04-05 22:47:22 -07:00
xfrm sysctl: treewide: constify ctl_table_header::ctl_table_arg 2024-04-22 08:56:31 +01:00
compat.c
devres.c
Kconfig net: skbuff: generalize the skb->decrypted bit 2024-04-06 17:34:31 +01:00
Kconfig.debug
Makefile
socket.c net: remove {revc,send}msg_copy_msghdr() from exports 2024-03-14 16:48:53 -07:00
sysctl_net.c