linux/arch
Nicholas Piggin 2655421ae6 lazy tlb: shoot lazies, non-refcounting lazy tlb mm reference handling scheme
On big systems, the mm refcount can become highly contented when doing a
lot of context switching with threaded applications.  user<->idle switch
is one of the important cases.  Abandoning lazy tlb entirely slows this
switching down quite a bit in the common uncontended case, so that is not
viable.

Implement a scheme where lazy tlb mm references do not contribute to the
refcount, instead they get explicitly removed when the refcount reaches
zero.

The final mmdrop() sends IPIs to all CPUs in the mm_cpumask and they
switch away from this mm to init_mm if it was being used as the lazy tlb
mm.  Enabling the shoot lazies option therefore requires that the arch
ensures that mm_cpumask contains all CPUs that could possibly be using mm.
A DEBUG_VM option IPIs every CPU in the system after this to ensure there
are no references remaining before the mm is freed.

Shootdown IPIs cost could be an issue, but they have not been observed to
be a serious problem with this scheme, because short-lived processes tend
not to migrate CPUs much, therefore they don't get much chance to leave
lazy tlb mm references on remote CPUs.  There are a lot of options to
reduce them if necessary, described in comments.

The near-worst-case can be benchmarked with will-it-scale:

  context_switch1_threads -t $(($(nproc) / 2))

This will create nproc threads (nproc / 2 switching pairs) all sharing the
same mm that spread over all CPUs so each CPU does thread->idle->thread
switching.

[ Rik came up with basically the same idea a few years ago, so credit
  to him for that. ]

Link: https://lore.kernel.org/linux-mm/20230118080011.2258375-1-npiggin@gmail.com/
Link: https://lore.kernel.org/all/20180728215357.3249-11-riel@surriel.com/
Link: https://lkml.kernel.org/r/20230203071837.1136453-5-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-03-28 16:20:08 -07:00
..
alpha alpha: fix lazy-FPU mis(merged/applied/whatnot) 2023-03-06 20:13:49 -05:00
arc - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
arm lazy tlb: introduce lazy tlb mm refcount helper functions 2023-03-28 16:20:08 -07:00
arm64 ARM: SoC fixes for 6.3, part 2 2023-03-24 15:38:13 -07:00
csky rch/csky patches for 6.3 2023-02-27 09:27:31 -08:00
hexagon VM_FAULT_RETRY fixes 2023-03-05 11:07:58 -08:00
ia64 cpumask: re-introduce constant-sized cpumask optimizations 2023-03-05 14:30:34 -08:00
loongarch LoongArch changes for v6.3 2023-03-01 09:27:00 -08:00
m68k m68k: Only force 030 bus error if PC not in exception table 2023-03-06 14:09:42 +01:00
microblaze VM_FAULT_RETRY fixes 2023-03-05 11:07:58 -08:00
mips Networking fixes for 6.3-rc2, including fixes from netfilter, bpf 2023-03-09 10:56:58 -08:00
nios2 VM_FAULT_RETRY fixes 2023-03-05 11:07:58 -08:00
openrisc VM_FAULT_RETRY fixes 2023-03-05 11:07:58 -08:00
parisc VM_FAULT_RETRY fixes 2023-03-05 11:07:58 -08:00
powerpc lazy tlb: introduce lazy tlb mm refcount helper functions 2023-03-28 16:20:08 -07:00
riscv RISC-V Fixes for 6.3-rc4 2023-03-24 09:52:26 -07:00
s390 s390: update defconfigs 2023-03-13 09:15:11 +01:00
sh sh: sanitize the flags on sigreturn 2023-03-09 10:01:59 -08:00
sparc VM_FAULT_RETRY fixes 2023-03-05 11:07:58 -08:00
um This pull request contains the following changes for UML: 2023-03-01 09:13:00 -08:00
x86 x86/mm/pat: clear VM_PAT if copy_p4d_range failed 2023-03-28 16:20:07 -07:00
xtensa - Daniel Verkamp has contributed a memfd series ("mm/memfd: add 2023-02-23 17:09:35 -08:00
.gitignore
Kconfig lazy tlb: shoot lazies, non-refcounting lazy tlb mm reference handling scheme 2023-03-28 16:20:08 -07:00