a72bbec70d
The hotplug support for kexec_load() requires changes to the userspace
kexec-tools and a little extra help from the kernel.

Given a kdump capture kernel loaded via kexec_load(), and a subsequent
hotplug event, the crash hotplug handler finds the elfcorehdr and
rewrites it to reflect the hotplug change. That is the desired outcome.
However, at kernel panic time, the purgatory integrity check fails
(because the elfcorehdr changed), so the capture kernel does not boot
and no vmcore is generated. Therefore, the userspace kexec-tools/kexec
must indicate to the kernel that the elfcorehdr can be modified (because
kexec excluded the elfcorehdr from the digest, and sized the elfcorehdr
memory buffer appropriately).

To facilitate hotplug support with kexec_load():
 - a new kexec flag KEXEC_UPDATE_ELFCOREHDR indicates that it is
   safe for the kernel to modify the kexec_load()'d elfcorehdr
 - the /sys/kernel/crash_elfcorehdr_size node communicates the
   preferred size of the elfcorehdr memory buffer
 - the sysfs crash_hotplug nodes (i.e.
   /sys/devices/system/[cpu|memory]/crash_hotplug) dynamically take
   into account kexec_file_load() vs kexec_load() and
   KEXEC_UPDATE_ELFCOREHDR. This is critical so that the udev rule
   processing of crash_hotplug is all that is needed to determine
   whether the userspace unload-then-load of the kdump image is to be
   skipped.

The proposed udev rule change looks like:
 # The kernel updates the crash elfcorehdr for CPU and memory changes
 SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
 SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"

The table below indicates the behavior of kexec_load()'d kdump image
updates (with the new udev crash_hotplug rule in place):

 Kernel | Kexec Old | Kexec New
 -------+-----------+----------
 Old    |     a     |     a
 -------+-----------+----------
 New    |     a     |     b
 -------+-----------+----------

where kexec 'old' and 'new' indicate whether kexec-tools has the needed
modifications for the crash hotplug feature, and kernel 'old' and 'new'
indicate whether the kernel supports this crash hotplug feature.

Behavior 'a' indicates the unload-then-reload of the entire kdump image.
For the kexec 'old' column, the unload-then-reload occurs due to the
missing flag KEXEC_UPDATE_ELFCOREHDR. An 'old' kernel (with 'new' kexec)
does not present the crash_hotplug sysfs node, which leads to the
unload-then-reload of the kdump image.

Behavior 'b' indicates the desired optimized behavior of the kernel
directly modifying the elfcorehdr and avoiding the unload-then-reload of
the kdump image.

If the udev rule is not updated with the crash_hotplug node check, then
regardless of whether the kernel or kexec is new or old, the kdump image
continues to be unloaded and then reloaded on hotplug changes.

To fully support the crash hotplug feature, there needs to be a rollout
of kernel, kexec-tools and udev rule changes. However, the order of the
rollout of these pieces does not matter; kexec_load()'d kdump images
still function for hotplug as-is.

Link: https://lkml.kernel.org/r/20230814214446.6659-7-eric.devolder@oracle.com
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Suggested-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Akhil Raj <lf32.dev@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
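
The compatibility table in the commit message above can be restated as a
small decision function. This is only an illustrative sketch of the
described behavior, not kernel or kexec-tools code; the function name
and its boolean parameters are hypothetical and exist only for this
example.

```python
def kdump_hotplug_behavior(kernel_new: bool, kexec_new: bool,
                           udev_rule_new: bool = True) -> str:
    """Return 'a' (unload-then-reload) or 'b' (kernel updates elfcorehdr).

    Hypothetical model of the table in the commit message for a
    kexec_load()'d kdump image.
    """
    # An 'old' kernel does not present the crash_hotplug sysfs node.
    node_present = kernel_new
    # With kexec_load(), the node reports 1 only when kexec passed
    # KEXEC_UPDATE_ELFCOREHDR, which an 'old' kexec never does.
    crash_hotplug = 1 if (node_present and kexec_new) else 0
    # The udev rule skips the reload only when it checks crash_hotplug
    # and the node reads "1".
    if udev_rule_new and crash_hotplug == 1:
        return "b"
    return "a"
```

Only the combination of a new kernel, a new kexec and the updated udev
rule yields behavior 'b'; every other combination falls back to the
unload-then-reload path, which is why the rollout order does not matter.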
151 lines
5.1 KiB
Plaintext
# SPDX-License-Identifier: GPL-2.0-only

menu "Kexec and crash features"

config CRASH_CORE
	bool

config KEXEC_CORE
	select CRASH_CORE
	bool

config KEXEC_ELF
	bool

config HAVE_IMA_KEXEC
	bool

config KEXEC
	bool "Enable kexec system call"
	depends on ARCH_SUPPORTS_KEXEC
	select KEXEC_CORE
	help
	  kexec is a system call that implements the ability to shut down your
	  current kernel, and to start another kernel. It is like a reboot
	  but it is independent of the system firmware. And like a reboot
	  you can start any kernel with it, not just Linux.

	  The name comes from the similarity to the exec system call.

	  It is an ongoing process to be certain the hardware in a machine
	  is properly shut down, so do not be surprised if this code does not
	  initially work for you. As of this writing the exact hardware
	  interface is strongly in flux, so no good recommendation can be
	  made.

config KEXEC_FILE
	bool "Enable kexec file based system call"
	depends on ARCH_SUPPORTS_KEXEC_FILE
	select KEXEC_CORE
	help
	  This is a new version of the kexec system call. This system call is
	  file based and takes file descriptors as system call arguments
	  for kernel and initramfs as opposed to the list of segments
	  accepted by the kexec system call.

config KEXEC_SIG
	bool "Verify kernel signature during kexec_file_load() syscall"
	depends on ARCH_SUPPORTS_KEXEC_SIG
	depends on KEXEC_FILE
	help
	  This option makes the kexec_file_load() syscall check for a valid
	  signature of the kernel image. The image can still be loaded without
	  a valid signature unless you also enable KEXEC_SIG_FORCE, though if
	  there's a signature that we can check, then it must be valid.

	  In addition to this option, you need to enable signature
	  verification for the corresponding kernel image type being
	  loaded in order for this to work.

config KEXEC_SIG_FORCE
	bool "Require a valid signature in kexec_file_load() syscall"
	depends on ARCH_SUPPORTS_KEXEC_SIG_FORCE
	depends on KEXEC_SIG
	help
	  This option makes kernel signature verification mandatory for
	  the kexec_file_load() syscall.

config KEXEC_IMAGE_VERIFY_SIG
	bool "Enable Image signature verification support (ARM)"
	default ARCH_DEFAULT_KEXEC_IMAGE_VERIFY_SIG
	depends on ARCH_SUPPORTS_KEXEC_IMAGE_VERIFY_SIG
	depends on KEXEC_SIG
	depends on EFI && SIGNED_PE_FILE_VERIFICATION
	help
	  Enable Image signature verification support.

config KEXEC_BZIMAGE_VERIFY_SIG
	bool "Enable bzImage signature verification support"
	depends on ARCH_SUPPORTS_KEXEC_BZIMAGE_VERIFY_SIG
	depends on KEXEC_SIG
	depends on SIGNED_PE_FILE_VERIFICATION
	select SYSTEM_TRUSTED_KEYRING
	help
	  Enable bzImage signature verification support.

config KEXEC_JUMP
	bool "kexec jump"
	depends on ARCH_SUPPORTS_KEXEC_JUMP
	depends on KEXEC && HIBERNATION
	help
	  Jump between original kernel and kexeced kernel and invoke
	  code in physical address mode via KEXEC

config CRASH_DUMP
	bool "kernel crash dumps"
	depends on ARCH_SUPPORTS_CRASH_DUMP
	depends on ARCH_SUPPORTS_KEXEC
	select CRASH_CORE
	select KEXEC_CORE
	select KEXEC
	help
	  Generate crash dump after being started by kexec.
	  This should normally only be set in special crash dump kernels
	  which are loaded in the main kernel with kexec-tools into
	  a specially reserved region and then later executed after
	  a crash by kdump/kexec. The crash dump kernel must be compiled
	  to a memory address not used by the main kernel or BIOS using
	  PHYSICAL_START, or it must be built as a relocatable image
	  (CONFIG_RELOCATABLE=y).
	  For more details see Documentation/admin-guide/kdump/kdump.rst

	  For s390, this option also enables zfcpdump.
	  See also <file:Documentation/s390/zfcpdump.rst>

config CRASH_HOTPLUG
	bool "Update the crash elfcorehdr on system configuration changes"
	default y
	depends on CRASH_DUMP && (HOTPLUG_CPU || MEMORY_HOTPLUG)
	depends on ARCH_SUPPORTS_CRASH_HOTPLUG
	help
	  Enable direct update to the crash elfcorehdr (which contains
	  the list of CPUs and memory regions to be dumped upon a crash)
	  in response to hot plug/unplug or online/offline of CPUs or
	  memory. This is a much more advanced approach than userspace
	  attempting that.

	  If unsure, say Y.

config CRASH_MAX_MEMORY_RANGES
	int "Specify the maximum number of memory regions for the elfcorehdr"
	default 8192
	depends on CRASH_HOTPLUG
	help
	  For the kexec_file_load() syscall path, specify the maximum number of
	  memory regions that the elfcorehdr buffer/segment can accommodate.
	  These regions are obtained via walk_system_ram_res(); e.g. the
	  'System RAM' entries in /proc/iomem.
	  This value is combined with NR_CPUS_DEFAULT and multiplied by
	  sizeof(Elf64_Phdr) to determine the final elfcorehdr memory buffer/
	  segment size.
	  The value 8192, for example, covers a (sparsely populated) 1TiB system
	  consisting of 128MiB memblocks, while resulting in an elfcorehdr
	  memory buffer/segment size under 1MiB. This represents a sane choice
	  to accommodate both baremetal and virtual machine configurations.

	  For the kexec_load() syscall path, CRASH_MAX_MEMORY_RANGES is part of
	  the computation behind the value provided through the
	  /sys/kernel/crash_elfcorehdr_size attribute.

endmenu
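
The sizing arithmetic in the CRASH_MAX_MEMORY_RANGES help text can be
checked with a short calculation. This is a rough sketch, not the
kernel's actual computation: the function names are hypothetical, the
CPU count of 64 is an assumed stand-in for NR_CPUS_DEFAULT, and any
fixed header overhead beyond the program headers is ignored.

```python
import struct

# Elf64_Phdr layout: p_type and p_flags (u32 each) followed by six u64
# fields (p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align).
ELF64_PHDR_SIZE = struct.calcsize("<IIQQQQQQ")

MIB = 1 << 20
TIB = 1 << 40


def memory_ranges_needed(total_bytes: int, memblock_bytes: int) -> int:
    """Worst-case number of ranges: one per memblock (ceiling division)."""
    return -(-total_bytes // memblock_bytes)


def elfcorehdr_estimate(max_memory_ranges: int, nr_cpus: int) -> int:
    """Estimate the buffer size as described in the help text:
    (memory ranges + CPUs) * sizeof(Elf64_Phdr)."""
    return (max_memory_ranges + nr_cpus) * ELF64_PHDR_SIZE


# A 1TiB system made of 128MiB memblocks needs at most 8192 ranges.
ranges = memory_ranges_needed(1 * TIB, 128 * MIB)
# nr_cpus=64 is an assumption chosen only for illustration.
size = elfcorehdr_estimate(ranges, nr_cpus=64)
print(ranges, size, size < 1 * MIB)
```

With these assumed inputs the estimate stays well under 1MiB, which is
consistent with the claim in the help text for the default of 8192.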