78c1d78488
A series of radix tree cleanups, and usage of them in the core pagecache
code.
Micro-benchmark:
lookup 14 slots (typical page-vector size)
in radix-tree there earch <step> slot filled and tagged
before/after - nsec per full scan through tree
* Intel Sandy Bridge i7-2620M 4Mb L3
New code always faster
* AMD Athlon 6000+ 2x1Mb L2, without L3
New code generally faster,
Minor degradation (marked with "*") for huge sparse trees
* i386 on Sandy Bridge
New code faster for common cases: tagged and dense trees.
Some degradations for non-tagged lookup on sparse trees.
Ideally, there might help __ffs() analog for searching first non-zero
long element in array, gcc sometimes cannot optimize this loop corretly.
Numbers:
CPU: Intel Sandy Bridge i7-2620M 4Mb L3
radix-tree with 1024 slots:
tagged lookup
step 1 before 7156 after 3613
step 2 before 5399 after 2696
step 3 before 4779 after 1928
step 4 before 4456 after 1429
step 5 before 4292 after 1213
step 6 before 4183 after 1052
step 7 before 4157 after 951
step 8 before 4016 after 812
step 9 before 3952 after 851
step 10 before 3937 after 732
step 11 before 4023 after 709
step 12 before 3872 after 657
step 13 before 3892 after 633
step 14 before 3720 after 591
step 15 before 3879 after 578
step 16 before 3561 after 513
normal lookup
step 1 before 4266 after 3301
step 2 before 2695 after 2129
step 3 before 2083 after 1712
step 4 before 1801 after 1534
step 5 before 1628 after 1313
step 6 before 1551 after 1263
step 7 before 1475 after 1185
step 8 before 1432 after 1167
step 9 before 1373 after 1092
step 10 before 1339 after 1134
step 11 before 1292 after 1056
step 12 before 1319 after 1030
step 13 before 1276 after 1004
step 14 before 1256 after 987
step 15 before 1228 after 992
step 16 before 1247 after 999
radix-tree with 1024*1024*128 slots:
tagged lookup
step 1 before 1086102841 after 674196409
step 2 before 816839155 after 498138306
step 7 before 599728907 after 240676762
step 15 before 555729253 after 185219677
step 63 before 606637748 after 128585664
step 64 before 608384432 after 102945089
step 65 before 596987114 after 123996019
step 128 before 304459225 after 56783056
step 256 before 158846855 after 31232481
step 512 before 86085652 after 18950595
step 12345 before 6517189 after 1674057
normal lookup
step 1 before 626064869 after 544418266
step 2 before 418809975 after 336321473
step 7 before 242303598 after 207755560
step 15 before 208380563 after 176496355
step 63 before 186854206 after 167283638
step 64 before 176188060 after 170143976
step 65 before 185139608 after 167487116
step 128 before 88181865 after 86913490
step 256 before 45733628 after 45143534
step 512 before 24506038 after 23859036
step 12345 before 2177425 after 2018662
* AMD Athlon 6000+ 2x1Mb L2, without L3
radix-tree with 1024 slots:
tag-lookup
step 1 before 8164 after 5379
step 2 before 5818 after 5581
step 3 before 4959 after 4213
step 4 before 4371 after 3386
step 5 before 4204 after 2997
step 6 before 4950 after 2744
step 7 before 4598 after 2480
step 8 before 4251 after 2288
step 9 before 4262 after 2243
step 10 before 4175 after 2131
step 11 before 3999 after 2024
step 12 before 3979 after 1994
step 13 before 3842 after 1929
step 14 before 3750 after 1810
step 15 before 3735 after 1810
step 16 before 3532 after 1660
normal-lookup
step 1 before 7875 after 5847
step 2 before 4808 after 4071
step 3 before 4073 after 3462
step 4 before 3677 after 3074
step 5 before 4308 after 2978
step 6 before 3911 after 3807
step 7 before 3635 after 3522
step 8 before 3313 after 3202
step 9 before 3280 after 3257
step 10 before 3166 after 3083
step 11 before 3066 after 3026
step 12 before 2985 after 2982
step 13 before 2925 after 2924
step 14 before 2834 after 2808
step 15 before 2805 after 2803
step 16 before 2647 after 2622
radix-tree with 1024*1024*128 slots:
tag-lookup
step 1 before 1288059720 after 951736580
step 2 before 961292300 after 884212140
step 7 before 768905140 after 547267580
step 15 before 771319480 after 456550640
step 63 before 504847640 after 242704304
step 64 before 392484800 after 177920786
step 65 before 491162160 after 246895264
step 128 before 208084064 after 97348392
step 256 before 112401035 after 51408126
step 512 before 75825834 after 29145070
step 12345 before 5603166 after 2847330
normal-lookup
step 1 before 1025677120 after 861375100
step 2 before 647220080 after 572258540
step 7 before 505518960 after 484041813
step 15 before 430483053 after 444815320 *
step 63 before 388113453 after 404250546 *
step 64 before 374154666 after 396027440 *
step 65 before 381423973 after 396704853 *
step 128 before 190078700 after 202619384 *
step 256 before 100886756 after 102829108 *
step 512 before 64074505 after 56158720
step 12345 before 4237289 after 4422299 *
* i686 on Sandy bridge
radix-tree with 1024 slots:
tagged lookup
step 1 before 7990 after 4019
step 2 before 5698 after 2897
step 3 before 5013 after 2475
step 4 before 4630 after 1721
step 5 before 4346 after 1759
step 6 before 4299 after 1556
step 7 before 4098 after 1513
step 8 before 4115 after 1222
step 9 before 3983 after 1390
step 10 before 4077 after 1207
step 11 before 3921 after 1231
step 12 before 3894 after 1116
step 13 before 3840 after 1147
step 14 before 3799 after 1090
step 15 before 3797 after 1059
step 16 before 3783 after 745
normal lookup
step 1 before 5103 after 3499
step 2 before 3299 after 2550
step 3 before 2489 after 2370
step 4 before 2034 after 2302 *
step 5 before 1846 after 2268 *
step 6 before 1752 after 2249 *
step 7 before 1679 after 2164 *
step 8 before 1627 after 2153 *
step 9 before 1542 after 2095 *
step 10 before 1479 after 2109 *
step 11 before 1469 after 2009 *
step 12 before 1445 after 2039 *
step 13 before 1411 after 2013 *
step 14 before 1374 after 2046 *
step 15 before 1340 after 1975 *
step 16 before 1331 after 2000 *
radix-tree with 1024*1024*128 slots:
tagged lookup
step 1 before 1225865377 after 667153553
step 2 before 842427423 after 471533007
step 7 before 609296153 after 276260116
step 15 before 544232060 after 226859105
step 63 before 519209199 after 141343043
step 64 before 588980279 after 141951339
step 65 before 521099710 after 138282060
step 128 before 298476778 after 83390628
step 256 before 149358342 after 43602609
step 512 before 76994713 after 22911077
step 12345 before
|
||
---|---|---|
.. | ||
acpi | ||
asm-generic | ||
crypto | ||
drm | ||
keys | ||
linux | ||
math-emu | ||
media | ||
misc | ||
mtd | ||
net | ||
pcmcia | ||
rdma | ||
rxrpc | ||
scsi | ||
sound | ||
target | ||
trace | ||
video | ||
xen | ||
Kbuild |