Jussi Maki 848ca9182a net: bonding: Use per-cpu rr_tx_counter
The round-robin rr_tx_counter was shared across CPUs, leading to
significant cache thrashing at high packet rates. This patch switches
the round-robin packet counter to a per-cpu variable that is used to
choose the destination slave.
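
The idea can be illustrated with a minimal userspace sketch. It is not the
driver code: the names NR_CPUS_SKETCH, CACHE_LINE, struct rr_counter and
rr_next_slave() are hypothetical, the 64-byte cache line is an assumption,
and the caller is assumed to pass the index of the CPU it is running on.
Each CPU increments only its own counter, so no cache line is written by
more than one CPU.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4   /* hypothetical CPU count for this sketch */
#define CACHE_LINE     64  /* assumed cache-line size in bytes */

/* One counter per CPU; the alignment pads each entry to a full cache
 * line so that two CPUs never write to the same line. */
struct rr_counter {
    _Alignas(CACHE_LINE) uint32_t value;
};

static struct rr_counter rr_tx_counter[NR_CPUS_SKETCH];

/* Return the slave index for the next packet sent from `cpu`.
 * Only this CPU's counter is read and written. */
static unsigned int rr_next_slave(unsigned int cpu, unsigned int slave_cnt)
{
    assert(cpu < NR_CPUS_SKETCH && slave_cnt > 0);
    uint32_t id = rr_tx_counter[cpu].value++;
    return id % slave_cnt;
}
```

With two slaves, successive calls to rr_next_slave(0, 2) alternate between
0 and 1, and calls for another CPU start from that CPU's own counter. The
trade-off in this sketch is that the rotation is kept per CPU rather than
globally across all senders.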

In a test with a 2x100Gbit ICE NIC running pktgen_sample_04_many_flows.sh
(-s 64 -t 32), the tx rate was 19.6Mpps before and 22.3Mpps after
this patch, an increase of about 14%.

"perf top -e cache_misses" before:
    12.31%  [bonding]       [k] bond_xmit_roundrobin_slave_get
    10.59%  [sch_fq_codel]  [k] fq_codel_dequeue
     9.34%  [kernel]        [k] skb_release_data
after:
    15.42%  [sch_fq_codel]  [k] fq_codel_dequeue
    10.06%  [kernel]        [k] __memset
     9.12%  [kernel]        [k] skb_release_data

Signed-off-by: Jussi Maki <joamaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-15 11:26:15 -07:00