b78b27d029
Optimizing the initialization speed of 1G huge pages through parallelization. 1G hugetlbs are allocated from bootmem, a process that is already very fast and does not currently require optimization. Therefore, we focus on parallelizing only the initialization phase in `gather_bootmem_prealloc`. Here are some test results: test case no patch(ms) patched(ms) saved ------------------- -------------- ------------- -------- 256c2T(4 node) 1G 4745 2024 57.34% 128c1T(2 node) 1G 3358 1712 49.02% 12T 1G 77000 18300 76.23% [akpm@linux-foundation.org: s/initialied/initialized/, per Alexey] Link: https://lkml.kernel.org/r/20240222140422.393911-9-gang.li@linux.dev Signed-off-by: Gang Li <ligang.bdlg@bytedance.com> Tested-by: David Rientjes <rientjes@google.com> Reviewed-by: Muchun Song <muchun.song@linux.dev> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
---|---|---|
.. | ||
book3s32 | ||
book3s64 | ||
kasan | ||
nohash | ||
ptdump | ||
cacheflush.c | ||
copro_fault.c | ||
dma-noncoherent.c | ||
drmem.c | ||
fault.c | ||
hugetlbpage.c | ||
init_32.c | ||
init_64.c | ||
init-common.c | ||
ioremap_32.c | ||
ioremap_64.c | ||
ioremap.c | ||
maccess.c | ||
Makefile | ||
mem.c | ||
mmu_context.c | ||
mmu_decl.h | ||
numa.c | ||
pageattr.c | ||
pgtable_32.c | ||
pgtable_64.c | ||
pgtable-frag.c | ||
pgtable.c |