linux/include
David Ahern 9ab948a91b ipv4: Allow amount of dirty memory from fib resizing to be controllable
fib_trie implementation calls synchronize_rcu when a certain amount of
pages are dirty from freed entries. The number of pages was determined
experimentally in 2009 (commit c3059477fc).

At the current setting, synchronize_rcu is called often -- 51 times in a
second in one test with an average of an 8 msec delay adding a fib entry.
The total impact is a lot of slow down modifying the fib. This is seen
in the output of 'time' - the difference between real time and sys+user.
For example, using 720,022 single path routes and 'ip -batch'[1]:

    $ time ./ip -batch ipv4/routes-1-hops
    real    0m14.214s
    user    0m2.513s
    sys     0m6.783s

So roughly 35% of the actual time to install the routes is from the ip
command getting scheduled out, most notably due to synchronize_rcu (this
is observed using 'perf sched timehist').

This patch makes the amount of dirty memory configurable between 64k where
the synchronize_rcu is called often (small, low end systems that are memory
sensitive) to 64M where synchronize_rcu is called rarely during a large
FIB change (for high end systems with lots of memory). The default is 512kB
which corresponds to the current setting of 128 pages with a 4kB page size.

As an example, at 16MB the worst interval shows 4 calls to synchronize_rcu
in a second blocking for up to 30 msec in a single instance, and a total
of almost 100 msec across the 4 calls in the second. The trade off is
allowing FIB entries to consume more memory in a given time window but
but with much better fib insertion rates (~30% increase in prefixes/sec).
With this patch and net.ipv4.fib_sync_mem set to 16MB, the same batch
file runs in:

    $ time ./ip -batch ipv4/routes-1-hops
    real    0m9.692s
    user    0m2.491s
    sys     0m6.769s

So the dead time is reduced to about 1/2 second or <5% of the real time.

[1] 'ip' modified to not request ACK messages which improves route
    insertion times by about 20%

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-21 13:29:53 -07:00
..
acpi ACPI updates for 5.1-rc1 2019-03-06 13:33:11 -08:00
asm-generic Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-03-10 14:46:56 -07:00
clocksource
crypto
drm drm next pull request for 5.1 2019-03-08 08:23:15 -08:00
dt-bindings We have a fairly balanced mix of clk driver updates and clk framework 2019-03-14 08:46:17 -07:00
keys Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2019-03-10 17:32:04 -07:00
kvm
linux net: remove 'fallback' argument from dev->ndo_select_queue() 2019-03-20 11:18:55 -07:00
math-emu
media media: include: fix several typos 2019-03-01 09:45:52 -05:00
memory
misc
net ipv4: Allow amount of dirty memory from fib resizing to be controllable 2019-03-21 13:29:53 -07:00
pcmcia
ras
rdma
scsi scsi: kill command serial number 2019-02-27 09:19:24 -05:00
soc IOMMU Updates for Linux v5.1 2019-03-10 12:29:52 -07:00
sound Char/Misc driver patches for 5.1-rc1 2019-03-06 14:18:59 -08:00
target
trace dmaengine updates for v5.1-rc1 2019-03-14 09:11:54 -07:00
uapi tun: Add ioctl() TUNGETDEVNETNS cmd to allow obtaining real net ns of tun device 2019-03-21 13:19:15 -07:00
video media updates for v5.1-rc1 2019-03-09 14:45:54 -08:00
xen