IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
The total memory size we get in kernel is usually slightly less than the
actual memory size because BIOS/firmware will reserve some memory region.
So it won't export all memory as usable.
E.g, on my x86_64 kvm guest with 1G memory, the total_mem value shows:
UEFI boot with ovmf: 0x3faef000 Legacy boot kvm guest: 0x3ff7ec00
When specifying crashkernel=1G-2G:128M, if we have a 1G memory machine, we
get total size 1023M from firmware. Then it will not fall into 1G-2G,
thus no memory reserved. User will never know this, it is hard to let
user know the exact total value in kernel.
One way is to use dmi/smbios to get physical memory size, but it's not
reliable as well. According to Prarit hardware vendors sometimes screw
this up. Thus round up total size to 128M to work around this problem.
This patch is a resend of [1] and rebased onto v5.19-rc2, and the
original credit goes to Dave Young.
[1]: http://lists.infradead.org/pipermail/kexec/2018-April/020568.html
Link: https://lkml.kernel.org/r/20220627074440.187222-1-ltao@redhat.com
Signed-off-by: Tao Liu <ltao@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The internal kallsyms tables contain information which could be quite
useful to a debugging tool in the absence of other debuginfo. If kallsyms
is enabled, then a debugging tool could parse it and use it as a fallback
symbol table. Combined with BTF data, live & post-mortem debuggers can
support basic operations without needing a large DWARF debuginfo file
available. As many as five symbols are necessary to properly parse
kallsyms names and addresses. Add these to the vmcoreinfo note.
CONFIG_KALLSYMS_ABSOLUTE_PERCPU does impact the computation of symbol
addresses. However, a debugger can infer this configuration value by
comparing the address of _stext in the vmcoreinfo with the address
computed via kallsyms. So there's no need to include information about
this config value in the vmcoreinfo note.
To verify that we're still well below the maximum of 4096 bytes, I created
a script[1] to compute a rough upper bound on the possible size of
vmcoreinfo. On v5.18-rc7, the script reports 3106 bytes, and with this
patch, the maximum become 3370 bytes.
[1]: https://github.com/brenns10/kernel_stuff/blob/master/vmcoreinfosize/
Link: https://lkml.kernel.org/r/20220517000508.777145-3-stephen.s.brennan@oracle.com
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Bixuan Cui <cuibixuan@huawei.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Vernet <void@manifault.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Stephen Boyd <swboyd@chromium.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pull misc updates from Andrew Morton:
"The non-MM patch queue for this merge window.
Not a lot of material this cycle. Many singleton patches against
various subsystems. Most notably some maintenance work in ocfs2
and initramfs"
* tag 'mm-nonmm-stable-2022-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (65 commits)
kcov: update pos before writing pc in trace function
ocfs2: dlmfs: fix error handling of user_dlm_destroy_lock
ocfs2: dlmfs: don't clear USER_LOCK_ATTACHED when destroying lock
fs/ntfs: remove redundant variable idx
fat: remove time truncations in vfat_create/vfat_mkdir
fat: report creation time in statx
fat: ignore ctime updates, and keep ctime identical to mtime in memory
fat: split fat_truncate_time() into separate functions
MAINTAINERS: add Muchun as a memcg reviewer
proc/sysctl: make protected_* world readable
ia64: mca: drop redundant spinlock initialization
tty: fix deadlock caused by calling printk() under tty_port->lock
relay: remove redundant assignment to pointer buf
fs/ntfs3: validate BOOT sectors_per_clusters
lib/string_helpers: fix not adding strarray to device's resource list
kernel/crash_core.c: remove redundant check of ck_cmdline
ELF, uapi: fixup ELF_ST_TYPE definition
ipc/mqueue: use get_tree_nodev() in mqueue_get_tree()
ipc: update semtimedop() to use hrtimer
ipc/sem: remove redundant assignments
...
According to the current crashkernel=Y,low support in other ARCHes, it's
an optional command-line option. When it doesn't exist, kernel will try
to allocate minimum required memory below 4G automatically.
However, __parse_crashkernel() returns '-EINVAL' for all error cases. It
can't distinguish the nonexistent option from invalid option.
Change __parse_crashkernel() to return '-ENOENT' for the nonexistent option
case. With this change, crashkernel,low memory will take the default
value if crashkernel=,low is not specified; while crashkernel reservation
will fail and bail out if an invalid option is specified.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Baoquan He <bhe@redhat.com>
Link: https://lore.kernel.org/r/20220506114402.365-2-thunder.leizhen@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When booting with crashkernel= on the kernel command line a warning
similar to
Kernel command line: ro console=ttyS0 crashkernel=256M
Unknown kernel command line parameters "crashkernel=256M", will be passed to user space.
is printed.
This comes from crashkernel= being parsed independent from the kernel
parameter handling mechanism. So the code in init/main.c doesn't know
that crashkernel= is a valid kernel parameter and prints this incorrect
warning.
Suppress the warning by adding a dummy early_param handler for
crashkernel=.
Link: https://lkml.kernel.org/r/20211208133443.6867-1-prudo@redhat.com
Fixes: 86d1919a4f ("init: print out unknown kernel parameters")
Signed-off-by: Philipp Rudo <prudo@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently <crypto/sha.h> contains declarations for both SHA-1 and SHA-2,
and <crypto/sha3.h> contains declarations for SHA-3.
This organization is inconsistent, but more importantly SHA-1 is no
longer considered to be cryptographically secure. So to the extent
possible, SHA-1 shouldn't be grouped together with any of the other SHA
versions, and usage of it should be phased out.
Therefore, split <crypto/sha.h> into two headers <crypto/sha1.h> and
<crypto/sha2.h>, and make everyone explicitly specify whether they want
the declarations for SHA-1, SHA-2, or both.
This avoids making the SHA-1 declarations visible to files that don't
want anything to do with SHA-1. It also prepares for potentially moving
sha1.h into a new insecure/ or dangerous/ directory.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Make kernel GNU build-id available in VMCOREINFO. Having build-id in
VMCOREINFO facilitates presenting appropriate kernel namelist image with
debug information file to kernel crash dump analysis tools. Currently
VMCOREINFO lacks uniquely identifiable key for crash analysis automation.
Regarding if this patch is necessary or matching of linux_banner and
OSRELEASE in VMCOREINFO employed by crash(8) meets the need -- IMO,
build-id approach more foolproof, in most instances it is a cryptographic
hash generated using internal code/ELF bits unlike kernel version string
upon which linux_banner is based that is external to the code. I feel
each is intended for a different purpose. Also OSRELEASE is not suitable
when two different kernel builds from same version with different features
enabled.
Currently for most linux (and non-linux) systems build-id can be extracted
using standard methods for file types such as user mode crash dumps,
shared libraries, loadable kernel modules etc., This is an exception for
linux kernel dump. Having build-id in VMCOREINFO brings some uniformity
for automation tools.
Tyler said:
: I think this is a nice improvement over today's linux_banner approach for
: correlating vmlinux to a kernel dump.
:
: The elf notes parsing in this patch lines up with what is described in in
: the "Notes (Nhdr)" section of the elf(5) man page.
:
: BUILD_ID_MAX is sufficient to hold a sha1 build-id, which is the default
: build-id type today in GNU ld(2). It is also sufficient to hold the
: "fast" build-id, which is the default build-id type today in LLVM lld(2).
Signed-off-by: Vijay Balakrishna <vijayb@linux.microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Link: http://lkml.kernel.org/r/1591849672-34104-1-git-send-email-vijayb@linux.microsoft.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Right now user-space tools like 'makedumpfile' and 'crash' need to rely
on a best-guess method of determining value of 'MAX_PHYSMEM_BITS'
supported by underlying kernel.
This value is used in user-space code to calculate the bit-space
required to store a section for SPARESMEM (similar to the existing
calculation method used in the kernel implementation):
#define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)
Now, regressions have been reported in user-space utilities
like 'makedumpfile' and 'crash' on arm64, with the recently added
kernel support for 52-bit physical address space, as there is
no clear method of determining this value in user-space
(other than reading kernel CONFIG flags).
As per suggestion from makedumpfile maintainer (Kazu), it makes more
sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself
rather than in arch-specific code, so that the user-space code for other
archs can also benefit from this addition to the vmcoreinfo and use it
as a standard way of determining 'SECTIONS_SHIFT' value in user-land.
A reference 'makedumpfile' implementation which reads the
'MAX_PHYSMEM_BITS' value from vmcoreinfo in a arch-independent fashion
is available here:
While at it also update vmcoreinfo documentation for 'MAX_PHYSMEM_BITS'
variable being added to vmcoreinfo.
'MAX_PHYSMEM_BITS' defines the maximum supported physical address
space memory.
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Tested-by: John Donnelly <john.p.donnelly@oracle.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: Boris Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: James Morse <james.morse@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: x86@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: kexec@lists.infradead.org
Link: https://lore.kernel.org/r/1589395957-24628-2-git-send-email-bhsharma@redhat.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This is preparation for allowing CRASH_CORE to be enabled for any
architecture.
swapper_pg_dir is always either an array or a macro expanding to NULL.
In the latter case, VMCOREINFO_SYMBOL() won't work, as it tries to take
the address of the given symbol:
#define VMCOREINFO_SYMBOL(name) \
vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
Instead, use VMCOREINFO_SYMBOL_ARRAY(), which uses the value:
#define VMCOREINFO_SYMBOL_ARRAY(name) \
vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
This is the same thing for the array case but isn't an error for the macro
case.
Link: http://lkml.kernel.org/r/c05f9781ec204f40fc96f95086e7b6de6a3eb2c3.1532563124.git.osandov@fb.com
Signed-off-by: Omar Sandoval <osandov@fb.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: James Morse <james.morse@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently vmcoreinfo data is updated at boot time subsys_initcall(), it
has the risk of being modified by some wrong code during system is
running.
As a result, vmcore dumped may contain the wrong vmcoreinfo. Later on,
when using "crash", "makedumpfile", etc utility to parse this vmcore, we
probably will get "Segmentation fault" or other unexpected errors.
E.g. 1) wrong code overwrites vmcoreinfo_data; 2) further crashes the
system; 3) trigger kdump, then we obviously will fail to recognize the
crash context correctly due to the corrupted vmcoreinfo.
Now except for vmcoreinfo, all the crash data is well
protected(including the cpu note which is fully updated in the crash
path, thus its correctness is guaranteed). Given that vmcoreinfo data
is a large chunk prepared for kdump, we better protect it as well.
To solve this, we relocate and copy vmcoreinfo_data to the crash memory
when kdump is loading via kexec syscalls. Because the whole crash
memory will be protected by existing arch_kexec_protect_crashkres()
mechanism, we naturally protect vmcoreinfo_data from write(even read)
access under kernel direct mapping after kdump is loaded.
Since kdump is usually loaded at the very early stage after boot, we can
trust the correctness of the vmcoreinfo data copied.
On the other hand, we still need to operate the vmcoreinfo safe copy
when crash happens to generate vmcoreinfo_note again, we rely on vmap()
to map out a new kernel virtual address and update to use this new one
instead in the following crash_save_vmcoreinfo().
BTW, we do not touch vmcoreinfo_note, because it will be fully updated
using the protected vmcoreinfo_data after crash which is surely correct
just like the cpu crash note.
Link: http://lkml.kernel.org/r/1493281021-20737-3-git-send-email-xlpang@redhat.com
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As Eric said,
"what we need to do is move the variable vmcoreinfo_note out of the
kernel's .bss section. And modify the code to regenerate and keep this
information in something like the control page.
Definitely something like this needs a page all to itself, and ideally
far away from any other kernel data structures. I clearly was not
watching closely the data someone decided to keep this silly thing in
the kernel's .bss section."
This patch allocates extra pages for these vmcoreinfo_XXX variables, one
advantage is that it enhances some safety of vmcoreinfo, because
vmcoreinfo now is kept far away from other kernel data structures.
Link: http://lkml.kernel.org/r/1493281021-20737-1-git-send-email-xlpang@redhat.com
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Suggested-by: Eric Biederman <ebiederm@xmission.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "kexec/fadump: remove dependency with CONFIG_KEXEC and
reuse crashkernel parameter for fadump", v4.
Traditionally, kdump is used to save vmcore in case of a crash. Some
architectures like powerpc can save vmcore using architecture specific
support instead of kexec/kdump mechanism. Such architecture specific
support also needs to reserve memory, to be used by dump capture kernel.
crashkernel parameter can be a reused, for memory reservation, by such
architecture specific infrastructure.
This patchset removes dependency with CONFIG_KEXEC for crashkernel
parameter and vmcoreinfo related code as it can be reused without kexec
support. Also, crashkernel parameter is reused instead of
fadump_reserve_mem to reserve memory for fadump.
The first patch moves crashkernel parameter parsing and vmcoreinfo
related code under CONFIG_CRASH_CORE instead of CONFIG_KEXEC_CORE. The
second patch reuses the definitions of append_elf_note() & final_note()
functions under CONFIG_CRASH_CORE in IA64 arch code. The third patch
removes dependency on CONFIG_KEXEC for firmware-assisted dump (fadump)
in powerpc. The next patch reuses crashkernel parameter for reserving
memory for fadump, instead of the fadump_reserve_mem parameter. This
has the advantage of using all syntaxes crashkernel parameter supports,
for fadump as well. The last patch updates fadump kernel documentation
about use of crashkernel parameter.
This patch (of 5):
Traditionally, kdump is used to save vmcore in case of a crash. Some
architectures like powerpc can save vmcore using architecture specific
support instead of kexec/kdump mechanism. Such architecture specific
support also needs to reserve memory, to be used by dump capture kernel.
crashkernel parameter can be a reused, for memory reservation, by such
architecture specific infrastructure.
But currently, code related to vmcoreinfo and parsing of crashkernel
parameter is built under CONFIG_KEXEC_CORE. This patch introduces
CONFIG_CRASH_CORE and moves the above mentioned code under this config,
allowing code reuse without dependency on CONFIG_KEXEC. There is no
functional change with this patch.
Link: http://lkml.kernel.org/r/149035338104.6881.4550894432615189948.stgit@hbathini.in.ibm.com
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>