linux

Linus Torvalds 9244724fbf A large update for SMP management:


Merge tag 'smp-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull SMP updates from Thomas Gleixner:
 "A large update for SMP management:

   - Parallel CPU bringup

     The reason why people are interested in parallel bringup is to
     shorten the (kexec) reboot time of cloud servers to reduce the
     downtime of the VM tenants.

     The current fully serialized bringup does the following per AP:

       1) Prepare callbacks (allocate, initialize, create threads)
       2) Kick the AP alive (e.g. INIT/SIPI on x86)
       3) Wait for the AP to report alive state
       4) Let the AP continue through the atomic bringup
       5) Let the AP run the threaded bringup to full online state

     There are two significant delays:

       #3 The time for an AP to report alive state in start_secondary()
          on x86 has been measured in the range between 350us and 3.5ms
          depending on vendor and CPU type, BIOS microcode size etc.

       #4 The atomic bringup does the microcode update. This has been
          measured to take up to ~8ms on the primary threads depending
          on the microcode patch size to apply.

     On a two socket SKL server with 56 cores (112 threads) the boot CPU
     on current mainline spends about 800ms busy waiting for the APs to
     come up and apply microcode. That's more than 80% of the actual
     onlining procedure.

     This can be reduced significantly by splitting the bringup
     mechanism into two parts:

       1) Run the prepare callbacks and kick the AP alive for each AP
          which needs to be brought up.

          The APs wake up, do their firmware initialization and run the
          low level kernel startup code including microcode loading in
          parallel up to the first synchronization point. (#1 and #2
          above)

       2) Run the rest of the bringup code strictly serialized per CPU
          (#3 - #5 above) as it's done today.

          Parallelizing that stage of the CPU bringup might be possible
          in theory, but it's questionable whether the required surgery
          would be justified for a pretty small gain.

     If the system is large enough the first AP is already waiting at
     the first synchronization point when the boot CPU has finished the
     wake-up of the last AP. That reduces the AP bringup time on that
     SKL from ~800ms to ~80ms, i.e. by a factor of ~10.

     The actual gain varies wildly depending on the system, CPU,
     microcode patch size and other factors. There are some
     opportunities to reduce the overhead further, but that needs some
     deep surgery in the x86 CPU bringup code.

     For now this is only enabled on x86, but the core functionality
     obviously works for all SMP capable architectures.

   - Enhancements for SMP function call tracing so it is possible to
     locate the scheduling and the actual execution points. That allows
     IPI delivery time to be measured precisely"

* tag 'smp-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
  trace,smp: Add tracepoints for scheduling remotelly called functions
  trace,smp: Add tracepoints around remotelly called functions
  MAINTAINERS: Add CPU HOTPLUG entry
  x86/smpboot: Fix the parallel bringup decision
  x86/realmode: Make stack lock work in trampoline_compat()
  x86/smp: Initialize cpu_primary_thread_mask late
  cpu/hotplug: Fix off by one in cpuhp_bringup_mask()
  x86/apic: Fix use of X{,2}APIC_ENABLE in asm with older binutils
  x86/smpboot/64: Implement arch_cpuhp_init_parallel_bringup() and enable it
  x86/smpboot: Support parallel startup of secondary CPUs
  x86/smpboot: Implement a bit spinlock to protect the realmode stack
  x86/apic: Save the APIC virtual base address
  cpu/hotplug: Allow "parallel" bringup up to CPUHP_BP_KICK_AP_STATE
  x86/apic: Provide cpu_primary_thread mask
  x86/smpboot: Enable split CPU startup
  cpu/hotplug: Provide a split up CPUHP_BRINGUP mechanism
  cpu/hotplug: Reset task stack state in _cpu_up()
  cpu/hotplug: Remove unused state functions
  riscv: Switch to hotplug core state synchronization
  parisc: Switch to hotplug core state synchronization
  ...

2023-06-26 13:59:56 -07:00
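
The effect of splitting the bringup, as described in the merge message above, can be illustrated with a small timing model. This is only a sketch, not kernel code: the struct, the function names and the three per-AP cost parameters are hypothetical, and the model ignores everything except the ordering of the steps. It assumes each AP needs a fixed time on the boot CPU for prepare + kick, a fixed AP-side latency until it reaches the first synchronization point (firmware init, low level startup, microcode load), and a fixed time for the remaining serialized steps.

```c
#include <stdint.h>

/* Hypothetical per-AP cost model; all values are in microseconds. */
struct bringup_cost {
	uint64_t kick_us;	/* prepare callbacks + kick, spent on the boot CPU */
	uint64_t wake_us;	/* AP-side latency until the first sync point */
	uint64_t rest_us;	/* remaining serialized bringup once the AP waits */
};

/*
 * Fully serialized bringup: the boot CPU kicks one AP, busy waits for it
 * and finishes it before touching the next one.
 */
uint64_t serial_bringup_us(unsigned int n_aps, const struct bringup_cost *c)
{
	return (uint64_t)n_aps * (c->kick_us + c->wake_us + c->rest_us);
}

/*
 * Split bringup: phase 1 kicks every AP back to back, so the APs start up
 * in parallel. Phase 2 then finishes them one at a time, waiting only if
 * the next AP has not yet reached the synchronization point.
 */
uint64_t split_bringup_us(unsigned int n_aps, const struct bringup_cost *c)
{
	/* Time at which the boot CPU has kicked the last AP. */
	uint64_t now = (uint64_t)n_aps * c->kick_us;

	for (unsigned int i = 0; i < n_aps; i++) {
		/* AP i was kicked at (i + 1) * kick_us and needs wake_us. */
		uint64_t ready = (uint64_t)(i + 1) * c->kick_us + c->wake_us;

		if (now < ready)
			now = ready;	/* boot CPU still has to wait */
		now += c->rest_us;
	}
	return now;
}
```

With made-up inputs of 111 APs, kick_us = 100, wake_us = 7000 and rest_us = 600, the model gives 854700us for the serialized case and 77700us for the split case. Those inputs are illustrative assumptions, not measurements; as the message notes, the actual gain varies widely with the system, CPU and microcode patch size.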

syscalls

arch: syscalls: simplify uapi/kapi directory creation

2022-03-31 12:03:46 +09:00

.gitignore

.gitignore: add SPDX License Identifier

2020-03-25 11:50:48 +01:00

access-helper.h

MIPS: Fix new sparse warnings

2021-04-07 16:11:05 +02:00

asm-offsets.c

MIPS: Octeon: Allow CVMSEG to be disabled

2023-04-05 09:45:09 +02:00

bmips_5xxx_init.S

MIPS: BCM5xxx: Remove dead init_fpu code

2018-11-08 11:20:57 -08:00

bmips_vec.S

…

branch.c

MIPS: kernel: include probes-common.h header in branch.c

2020-09-21 22:14:24 +02:00

cacheinfo.c

drivers: base: cacheinfo: Get rid of DEFINE_SMP_CALL_CACHE_FUNCTION()

2021-09-01 10:29:10 +02:00

cevt-bcm1480.c

MIPS: Replace setup_irq() by request_irq()

2020-03-05 16:47:35 +01:00

cevt-ds1287.c

MIPS: Replace setup_irq() by request_irq()

2020-03-05 16:47:35 +01:00

cevt-gt641xx.c

MIPS: Replace setup_irq() by request_irq()

2020-03-05 16:47:35 +01:00

cevt-r4k.c

MIPS: cevt-r4k: Offset the value used to clear compare interrupt

2023-02-27 23:45:17 +01:00

cevt-sb1250.c

MIPS: Replace setup_irq() by request_irq()

2020-03-05 16:47:35 +01:00

cevt-txx9.c

mips: kernel: convert comma to semicolon

2020-12-28 22:32:28 +01:00

cmpxchg.c

MIPS: fix typos in comments

2022-05-04 22:22:59 +02:00

cps-vec-ns16550.S

mips: Add CPS_NS16550_WIDTH config

2020-05-22 09:12:52 +02:00

cps-vec.S

MIPS: smp-cps: Disable coherence setup for unsupported ISA

2023-04-05 09:45:08 +02:00

cpu-probe.c

MIPS: Restore Au1300 support

2023-05-23 10:59:29 +02:00

cpu-r3k-probe.c

MIPS: Remove TX39XX support

2022-03-01 10:07:22 +01:00

crash_dump.c

vmcore: convert copy_oldmem_page() to take an iov_iter

2022-04-29 14:37:59 -07:00

crash.c

mm: remove include/linux/bootmem.h

2018-10-31 08:54:16 -07:00

csrc-bcm1480.c

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157

2019-05-30 11:26:37 -07:00

csrc-ioasic.c

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157

2019-05-30 11:26:37 -07:00

csrc-r4k.c

mips: csrc-r4k: Mark R4K timer as unstable if CPU freq changes

2020-05-22 09:14:06 +02:00

csrc-sb1250.c

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157

2019-05-30 11:26:37 -07:00

early_printk_8250.c

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 1

2019-05-21 11:28:39 +02:00

early_printk.c

mips: unify prom_putchar() declarations

2018-07-17 09:40:17 -07:00

elf.c

MIPS: Modernize READ_IMPLIES_EXEC

2022-02-23 13:08:30 +01:00

entry.S

MIPS: Remove TX39XX support

2022-03-01 10:07:22 +01:00

fpu-probe.c

MIPS: cpu-probe: move fpu probing/handling into its own file

2020-10-12 12:04:50 +02:00

fpu-probe.h

MIPS: cpu-probe: move fpu probing/handling into its own file

2020-10-12 12:04:50 +02:00

ftrace.c

MIPS: kernel: Remove not needed set_fs calls

2021-04-06 14:36:56 +02:00

genex.S

MIPS: Always use -Wa,-msoft-float and eliminate GAS_HAS_SET_HARDFLOAT

2023-01-26 12:41:16 +09:00

gpio_txx9.c

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500

2019-06-19 17:09:55 +02:00

head.S

MIPS: of: Introduce helper function to get DTB

2021-02-04 13:34:51 +01:00

i8253.c

MIPS: Replace setup_irq() by request_irq()

2020-03-05 16:47:35 +01:00

idle.c

cpuidle,arch: Mark all regular cpuidle_state::enter methods __cpuidle

2023-01-13 11:48:18 +01:00

irq_txx9.c

MIPS: Remove TX39XX support

2022-03-01 10:07:22 +01:00

irq-gt641xx.c

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 1

2019-05-21 11:28:39 +02:00

irq-msc01.c

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152

2019-05-30 11:26:32 -07:00

irq.c

MIPS: Only use current_stack_pointer on GCC

2022-03-14 15:02:53 +01:00

jump_label.c

MIPS: jump_label: Fix compat branch range check

2022-11-11 15:46:03 +01:00

kgdb.c

MIPS: kernel: Drop kgdb_call_nmi_hook

2021-02-15 12:23:54 +01:00

kprobes.c

MIPS: Use NOKPROBE_SYMBOL() instead of __kprobes annotation

2022-05-12 18:00:51 +02:00

linux32.c

MIPS: Delete unused code in linux32.c

2018-08-01 13:20:27 -07:00

machine_kexec.c

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 230

2019-06-19 17:09:06 +02:00

Makefile

MIPS: Remove deprecated CONFIG_MIPS_CMP

2023-04-12 15:01:09 +02:00

mcount.S

mips: ftrace: fix static function graph tracing

2018-06-19 15:00:12 -07:00

mips-cm.c

MIPS: mips-cm: Check availability of config registers

2023-04-05 09:45:08 +02:00

mips-cpc.c

mips: cpc: Fix refcount leak in mips_cpc_default_phys_base

2022-04-26 15:11:25 +02:00

mips-mt-fpaff.c

MIPS: Replace deprecated CPU-hotplug functions.

2021-08-05 10:57:01 +02:00

mips-mt.c

driver core: class: remove module * from class_create()

2023-03-17 15:16:33 +01:00

mips-r2-to-r6-emul.c

MIPS: Fix build error due to PTR used in more places

2022-01-27 09:04:19 +01:00

module.c

jump_label: mips: move module NOP patching into arch code

2022-06-24 09:48:55 +02:00

octeon_switch.S

MIPS: octeon_switch: Remove duplicated labels

2023-04-12 15:14:16 +02:00

perf_event_mipsxx.c

MIPS: fix typos in comments

2022-05-04 22:22:59 +02:00

perf_event.c

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500

2019-06-19 17:09:55 +02:00

perf_regs.c

MIPS: kernel: Support extracting off-line stack traces from user-space with perf

2021-02-04 21:55:45 +01:00

pm-cps.c

MIPS: barrier: Add __SYNC() infrastructure

2019-10-07 09:42:17 -07:00

pm.c

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152

2019-05-30 11:26:32 -07:00

probes-common.h

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152

2019-05-30 11:26:32 -07:00

proc.c

MIPS: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK

2022-07-14 11:49:40 +02:00

process.c

sched/idle: Mark arch_cpu_idle_dead() __noreturn

2023-03-08 08:44:28 -08:00

prom.c

MIPS: move from strlcpy with unused retval to strscpy

2022-09-12 15:34:04 +02:00

ptrace32.c

mm: don't include asm/pgtable.h if linux/mm.h is already included

2020-06-09 09:39:13 -07:00

ptrace.c

mips: ptrace: user_regset_copyin_ignore() always returns 0

2022-11-15 14:30:40 -08:00

r4k_fpu.S

MIPS: Always use -Wa,-msoft-float and eliminate GAS_HAS_SET_HARDFLOAT

2023-01-26 12:41:16 +09:00

r4k_switch.S

…

r4k-bugs64.c

MIPS: remove asm/war.h

2022-02-22 09:35:49 +01:00

r2300_fpu.S

MIPS: Always use -Wa,-msoft-float and eliminate GAS_HAS_SET_HARDFLOAT

2023-01-26 12:41:16 +09:00

r2300_switch.S

…

relocate_kernel.S

MIPS: fix duplicate definitions for exported symbols

2022-11-11 15:44:44 +01:00

relocate.c

MIPS: move from strlcpy with unused retval to strscpy

2022-09-12 15:34:04 +02:00

reset.c

mips: Use do_kernel_power_off()

2022-05-19 19:30:31 +02:00

rtlx-mt.c

MIPS: Replace setup_irq() by request_irq()

2020-03-05 16:47:35 +01:00

rtlx.c

…

scall32-o32.S

MIPS: remove asm/war.h

2022-02-22 09:35:49 +01:00

scall64-n32.S

MIPS: Fix build error due to PTR used in more places

2022-01-27 09:04:19 +01:00

scall64-n64.S

MIPS: remove asm/war.h

2022-02-22 09:35:49 +01:00

scall64-o32.S

MIPS: Fix build error due to PTR used in more places

2022-01-27 09:04:19 +01:00

segment.c

mips: kernel: convert to DEFINE_SHOW_ATTRIBUTE

2022-09-19 16:40:17 +02:00

setup.c

Updates for the x86 boot process:

2023-06-26 13:39:10 -07:00

signal32.c

Remove 'type' argument from access_ok() function

2019-01-03 18:57:57 -08:00

signal_n32.c

MIPS: remove asm/war.h

2022-02-22 09:35:49 +01:00

signal_o32.c

signal: Remove task parameter from force_sig

2019-05-27 09:36:28 -05:00

signal-common.h

…

signal.c

ptrace: Cleanups for v5.18

2022-03-28 17:29:53 -07:00

smp-bmips.c

MIPS: SMP_CPS: Switch to hotplug core state synchronization

2023-05-15 13:44:58 +02:00

smp-cps.c

MIPS: SMP_CPS: Switch to hotplug core state synchronization

2023-05-15 13:44:58 +02:00

smp-mt.c

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 182

2019-05-30 11:29:20 -07:00

smp-up.c

…

smp.c

MIPS: SMP_CPS: Switch to hotplug core state synchronization

2023-05-15 13:44:58 +02:00

spinlock_test.c

mips: kernel: use DEFINE_DEBUGFS_ATTRIBUTE with debugfs_create_file_unsafe()

2021-03-14 14:09:49 +01:00

spram.c

mips: Add CONFIG/CONFIG6/Cause reg fields macro

2020-05-22 09:12:22 +02:00

stacktrace.c

treewide: Add SPDX license identifier for missed files

2019-05-21 10:50:45 +02:00

sync-r4k.c

MIPS: sync-r4k: do slave counter synchronization with disabled HW interrupts

2020-01-22 10:16:18 -08:00

syscall.c

MIPS: Fix build error due to PTR used in more places

2022-01-27 09:04:19 +01:00

sysrq.c

MIPS: constify sysrq_key_op

2020-05-15 14:53:19 +02:00

time.c

MIPS: Fix CP0 counter erratum detection for R4k CPUs

2022-04-29 15:52:00 +02:00

topology.c

drivers/base/node: consolidate node device subsystem initialization in node_dev_init()

2022-03-22 15:57:10 -07:00

traps.c

MIPS: fix fortify panic when copying asm exception handlers

2022-03-07 13:09:28 +01:00

unaligned.c

MIPS: Handle address errors for accesses above CPU max virtual user address

2022-02-25 09:36:05 +01:00

uprobes.c

MIPS: uprobes: Restore thread.trap_nr

2023-04-24 13:31:44 +02:00

vdso.c

treewide: use get_random_u32_below() instead of deprecated function

2022-11-18 02:15:15 +01:00

vmlinux.lds.S

MIPS: Define RUNTIME_DISCARD_EXIT in LD script

2023-04-21 23:59:43 +02:00

vpe-mt.c

drivers: remove struct module * setting from struct class

2023-03-17 15:16:27 +01:00

vpe.c

- added support for Huawei B593u-12

2023-04-27 17:46:52 -07:00

watch.c

MIPS: Use fallthrough for arch/mips

2020-05-07 11:55:47 +02:00