License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to apply to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0                                               11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note", otherwise it was "GPL-2.0". Results of that were:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier                             # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note                        270
GPL-2.0+ WITH Linux-syscall-note                       169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)     21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)     17
LGPL-2.1+ WITH Linux-syscall-note                       15
GPL-1.0+ WITH Linux-syscall-note                        14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)     5
LGPL-2.0+ WITH Linux-syscall-note                        4
LGPL-2.1 WITH Linux-syscall-note                         3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)               3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)              1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were any new insights.
The Windriver scanner is based on an older version of FOSSology in part,
so they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to
have copy/paste license identifier errors, and have been fixed to reflect
the correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types). Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
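The commit message above mentions that header and source .c files need different comment types for the SPDX tag. The following is a minimal sketch of that selection step only, not the script Thomas and Greg actually used. The function name `spdx_comment` and the extension table are assumptions made for illustration; the three comment styles follow the kernel's documented SPDX conventions for C source, C headers and assembly, and Kconfig/Makefile-style files.

```python
import os


def spdx_comment(path: str, license_id: str = "GPL-2.0") -> str:
    """Return an SPDX tag line in the comment style suited to `path`.

    Hypothetical helper: the mapping below is an illustrative subset,
    not the table used by the original tagging script.
    """
    tag = f"SPDX-License-Identifier: {license_id}"
    name = os.path.basename(path)
    ext = os.path.splitext(name)[1]
    if ext == ".c":
        # C source files take a C++-style line comment.
        return f"// {tag}"
    if ext in (".h", ".S"):
        # Headers and assembly take a classic block comment.
        return f"/* {tag} */"
    if name.startswith(("Kconfig", "Makefile")) or ext in (".sh", ".py"):
        # Kconfig, Makefiles and scripts take a hash comment.
        return f"# {tag}"
    raise ValueError(f"no comment style known for {path}")


for p in ("kernel/fork.c", "include/linux/sched.h", "arch/powerpc/Kconfig"):
    print(p, "->", spdx_comment(p))
```

The hash-comment branch yields the same form as the `# SPDX-License-Identifier: GPL-2.0` line at the top of this Kconfig file.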
2017-11-01 15:07:57 +01:00
# SPDX-License-Identifier: GPL-2.0
2007-06-13 02:30:17 +10:00
source "arch/powerpc/platforms/Kconfig.cputype"
2007-03-19 11:53:53 +01:00
2010-10-22 00:17:55 +00:00
config 32BIT
bool
default y if PPC32
2005-09-26 16:04:21 +10:00
config 64BIT
bool
default y if PPC64
config MMU
bool
default y
2017-04-21 00:36:20 +10:00
config ARCH_MMAP_RND_BITS_MAX
# On Book3S 64, the default virtual address space for 64-bit processes
# is 2^47 (128TB). As a maximum, allow randomisation to consume up to
# 32T of address space (2^45), which should ensure a reasonable gap
# between bottom-up and top-down allocations for applications that
# consume "normal" amounts of address space. Book3S 64 only supports 64K
# and 4K page sizes.
default 29 if PPC_BOOK3S_64 && PPC_64K_PAGES # 29 = 45 (32T) - 16 (64K)
default 33 if PPC_BOOK3S_64 # 33 = 45 (32T) - 12 (4K)
#
# On all other 64-bit platforms (currently only Book3E), the virtual
# address space is 2^46 (64TB). Allow randomisation to consume up to 16T
# of address space (2^44). Only 4K page sizes are supported.
default 32 if 64BIT # 32 = 44 (16T) - 12 (4K)
#
# For 32-bit, use the compat values, as they're the same.
default ARCH_MMAP_RND_COMPAT_BITS_MAX
config ARCH_MMAP_RND_BITS_MIN
# Allow randomisation to consume up to 1GB of address space (2^30).
default 14 if 64BIT && PPC_64K_PAGES # 14 = 30 (1GB) - 16 (64K)
default 18 if 64BIT # 18 = 30 (1GB) - 12 (4K)
#
# For 32-bit, use the compat values, as they're the same.
default ARCH_MMAP_RND_COMPAT_BITS_MIN
config ARCH_MMAP_RND_COMPAT_BITS_MAX
# Total virtual address space for 32-bit processes is 2^31 (2GB).
# Allow randomisation to consume up to 512MB of address space (2^29).
default 11 if PPC_256K_PAGES # 11 = 29 (512MB) - 18 (256K)
default 13 if PPC_64K_PAGES # 13 = 29 (512MB) - 16 (64K)
2019-07-03 18:04:13 +02:00
default 15 if PPC_16K_PAGES # 15 = 29 (512MB) - 14 (16K)
2017-04-21 00:36:20 +10:00
default 17 # 17 = 29 (512MB) - 12 (4K)
config ARCH_MMAP_RND_COMPAT_BITS_MIN
# Total virtual address space for 32-bit processes is 2^31 (2GB).
# Allow randomisation to consume up to 8MB of address space (2^23).
default 5 if PPC_256K_PAGES # 5 = 23 (8MB) - 18 (256K)
default 7 if PPC_64K_PAGES # 7 = 23 (8MB) - 16 (64K)
default 9 if PPC_16K_PAGES # 9 = 23 (8MB) - 14 (16K)
default 11 # 11 = 23 (8MB) - 12 (4K)
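The defaults in the four ARCH_MMAP_RND_* entries above all follow one rule stated in their trailing comments: the number of random bits is log2 of the address range to randomise minus the page shift. The sketch below only re-derives those numbers. The helper name `rnd_bits` and the `PAGE_SHIFT` table are illustrative assumptions, not kernel code.

```python
# Page shift (log2 of the page size) for each page size named in the comments.
PAGE_SHIFT = {"4K": 12, "16K": 14, "64K": 16, "256K": 18}


def rnd_bits(range_log2: int, page_size: str) -> int:
    """Random bits = log2(randomised range) - log2(page size)."""
    return range_log2 - PAGE_SHIFT[page_size]


# 32-bit (compat) limits: up to 512MB (2^29), at least 8MB (2^23).
compat_max = {size: rnd_bits(29, size) for size in PAGE_SHIFT}
compat_min = {size: rnd_bits(23, size) for size in PAGE_SHIFT}

# 64-bit limits: 32T (2^45) on Book3S 64, 16T (2^44) elsewhere, 1GB (2^30) minimum.
book3s64_max = {size: rnd_bits(45, size) for size in ("4K", "64K")}
other64_max = rnd_bits(44, "4K")
min64 = {size: rnd_bits(30, size) for size in ("4K", "64K")}

print("compat max:", compat_max)
print("compat min:", compat_min)
print("Book3S 64 max:", book3s64_max, "other 64-bit max:", other64_max)
print("64-bit min:", min64)
```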
2009-08-14 15:00:53 +09:00
config HAVE_SETUP_PER_CPU_AREA
2009-03-30 19:07:44 +09:00
def_bool PPC64
2009-08-14 15:00:53 +09:00
config NEED_PER_CPU_EMBED_FIRST_CHUNK
2020-06-08 12:39:02 +05:30
def_bool y if PPC64
config NEED_PER_CPU_PAGE_FIRST_CHUNK
def_bool y if PPC64
2008-01-30 13:32:51 +01:00
2009-10-13 19:44:44 +00:00
config NR_IRQS
int "Number of virtual interrupt numbers"
2020-12-10 18:14:44 +01:00
range 32 1048576
2009-10-13 19:44:44 +00:00
default "512"
help
This defines the number of virtual interrupt numbers the kernel
can manage. Virtual interrupt numbers are what you see in
/proc/interrupts. If you configure your system to have too few,
drivers will fail to load or worse - handle with care.
2016-12-20 04:30:08 +10:00
config NMI_IPI
bool
2017-07-12 14:35:52 -07:00
depends on SMP && (DEBUGGER || KEXEC_CORE || HARDLOCKUP_DETECTOR)
2016-12-20 04:30:08 +10:00
default y
2017-08-01 22:00:52 +10:00
config PPC_WATCHDOG
bool
depends on HARDLOCKUP_DETECTOR
depends on HAVE_HARDLOCKUP_DETECTOR_ARCH
default y
help
This is a placeholder when the powerpc hardlockup detector
watchdog is selected (arch/powerpc/kernel/watchdog.c). It is
2020-12-07 15:54:20 +00:00
selected via the generic lockup detector menu which is why we
2017-08-01 22:00:52 +10:00
have no standalone config option for it here.
2008-04-17 14:35:00 +10:00
config STACKTRACE_SUPPORT
bool
default y
2008-04-17 14:35:01 +10:00
config TRACE_IRQFLAGS_SUPPORT
bool
default y
config LOCKDEP_SUPPORT
bool
default y
2008-01-30 13:31:20 +01:00
config GENERIC_LOCKBREAK
bool
default y
2019-10-24 18:04:58 +02:00
depends on SMP && PREEMPTION
2008-01-30 13:31:20 +01:00
2006-03-26 01:39:33 -08:00
config GENERIC_HWEIGHT
bool
default y
2005-09-26 16:04:21 +10:00
config PPC
bool
default y
2017-03-06 22:53:59 +11:00
#
# Please keep this list sorted alphabetically.
#
32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option
All new 32-bit architectures should have a 64-bit userspace off_t type, but
existing architectures have 32-bit ones.
To enforce the rule, a new config option is added to arch/Kconfig that defaults
ARCH_32BIT_OFF_T to be disabled for new 32-bit architectures. All existing
32-bit architectures enable it explicitly.
The new option affects force_o_largefile() behaviour. Namely, if userspace
off_t is 64 bits long, there is no reason to prevent users from opening big files.
Note that even if architectures have only 64-bit off_t in the kernel
(arc, c6x, h8300, hexagon, nios2, openrisc, and unicore32),
a libc may use 32-bit off_t, and therefore want to limit the file size
to 4GB unless specified differently in the open flags.
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Yury Norov <ynorov@marvell.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-05-16 11:18:49 +03:00
select ARCH_32BIT_OFF_T if PPC32
2021-05-04 18:38:17 -07:00
select ARCH_ENABLE_MEMORY_HOTPLUG
select ARCH_ENABLE_MEMORY_HOTREMOVE
2021-04-21 17:06:42 +00:00
select ARCH_HAS_COPY_MC if PPC64
2018-12-19 07:09:39 +00:00
select ARCH_HAS_DEBUG_VIRTUAL
2021-03-18 09:18:55 +05:30
select ARCH_HAS_DEBUG_VM_PGTABLE
2017-03-06 22:53:59 +11:00
select ARCH_HAS_DEVMEM_IS_ALLOWED
2021-04-21 17:06:42 +00:00
select ARCH_HAS_DMA_MAP_DIRECT if PPC_PSERIES
2017-03-06 22:53:59 +11:00
select ARCH_HAS_ELF_RANDOMIZE
include/linux/string.h: add the option of fortified string.h functions
This adds support for compiling with a rough equivalent to the glibc
_FORTIFY_SOURCE=1 feature, providing compile-time and runtime buffer
overflow checks for string.h functions when the compiler determines the
size of the source or destination buffer at compile-time. Unlike glibc,
it covers buffer reads in addition to writes.
GNU C __builtin_*_chk intrinsics are avoided because they would force a
much more complex implementation. They aren't designed to detect read
overflows and offer no real benefit when using an implementation based
on inline checks. Inline checks don't add up to much code size and
allow full use of the regular string intrinsics while avoiding the need
for a bunch of _chk functions and per-arch assembly to avoid wrapper
overhead.
This detects various overflows at compile-time in various drivers and
some non-x86 core kernel code. There will likely be issues caught in
regular use at runtime too.
Future improvements left out of initial implementation for simplicity,
as it's all quite optional and can be done incrementally:
* Some of the fortified string functions (strncpy, strcat) don't yet
place a limit on reads from the source based on __builtin_object_size of
the source buffer.
* Extending coverage to more string functions like strlcat.
* It should be possible to optionally use __builtin_object_size(x, 1) for
some functions (C strings) to detect intra-object overflows (like
glibc's _FORTIFY_SOURCE=2), but for now this takes the conservative
approach to avoid likely compatibility issues.
* The compile-time checks should be made available via a separate config
option which can be enabled by default (or always enabled) once enough
time has passed to get the issues it catches fixed.
Kees said:
"This is great to have. While it was out-of-tree code, it would have
blocked at least CVE-2016-3858 from being exploitable (improper size
argument to strlcpy()). I've sent a number of fixes for
out-of-bounds-reads that this detected upstream already"
[arnd@arndb.de: x86: fix fortified memcpy]
Link: http://lkml.kernel.org/r/20170627150047.660360-1-arnd@arndb.de
[keescook@chromium.org: avoid panic() in favor of BUG()]
Link: http://lkml.kernel.org/r/20170626235122.GA25261@beast
[keescook@chromium.org: move from -mm, add ARCH_HAS_FORTIFY_SOURCE, tweak Kconfig help]
Link: http://lkml.kernel.org/r/20170526095404.20439-1-danielmicay@gmail.com
Link: http://lkml.kernel.org/r/1497903987-21002-8-git-send-email-keescook@chromium.org
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-12 14:36:10 -07:00
select ARCH_HAS_FORTIFY_SOURCE
2017-03-06 22:53:59 +11:00
select ARCH_HAS_GCOV_PROFILE_ALL
2019-07-11 20:57:28 -07:00
select ARCH_HAS_HUGEPD if HUGETLB_PAGE
2021-04-21 17:06:42 +00:00
select ARCH_HAS_KCOV
select ARCH_HAS_MEMBARRIER_CALLBACKS
select ARCH_HAS_MEMBARRIER_SYNC_CORE
2020-01-30 12:06:07 -08:00
select ARCH_HAS_MEMREMAP_COMPAT_ALIGN
2019-02-22 14:45:42 +00:00
select ARCH_HAS_MMIOWB if PPC64
2021-04-21 17:06:42 +00:00
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
2018-01-10 16:21:13 +01:00
select ARCH_HAS_PHYS_TO_DMA
2019-07-31 06:31:41 +00:00
select ARCH_HAS_PMEM_API
2019-07-16 16:30:47 -07:00
select ARCH_HAS_PTE_DEVMAP if PPC_BOOK3S_64
2018-06-07 17:06:08 -07:00
select ARCH_HAS_PTE_SPECIAL
2019-08-22 16:44:05 +00:00
select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
2021-03-31 11:38:45 +11:00
select ARCH_HAS_STRICT_KERNEL_RWX if ((PPC_BOOK3S_64 || PPC32) && !HIBERNATION)
2017-03-06 22:53:59 +11:00
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
2019-07-31 06:31:41 +00:00
select ARCH_HAS_UACCESS_FLUSHCACHE
2017-03-06 22:53:59 +11:00
select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARCH_HAVE_NMI_SAFE_CMPXCHG
2019-05-13 17:22:59 -07:00
select ARCH_KEEP_MEMBLOCK
2013-10-07 22:15:32 -04:00
select ARCH_MIGHT_HAVE_PC_PARPORT
2014-01-01 11:32:26 -08:00
select ARCH_MIGHT_HAVE_PC_SERIO
2018-01-04 16:35:25 +01:00
select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX
2021-03-16 07:57:15 +00:00
select ARCH_STACKWALK
2017-03-06 22:53:59 +11:00
select ARCH_SUPPORTS_ATOMIC_RMW
2020-12-14 19:10:30 -08:00
select ARCH_SUPPORTS_DEBUG_PAGEALLOC if PPC32 || PPC_BOOK3S_64
2017-03-06 22:53:59 +11:00
select ARCH_USE_BUILTIN_BSWAP
select ARCH_USE_CMPXCHG_LOCKREF if PPC64
2021-04-29 22:55:15 -07:00
select ARCH_USE_MEMTEST
powerpc/64s: Implement queued spinlocks and rwlocks
These have shown significantly improved performance and fairness when
spinlock contention is moderate to high on very large systems.
With this series including subsequent patches, on a 16 socket 1536
thread POWER9, a stress test such as same-file open/close from all
CPUs gets big speedups, 11620op/s aggregate with simple spinlocks vs
384158op/s (33x faster), where the difference in throughput between
the fastest and slowest thread goes from 7x to 1.4x.
Thanks to the fast path being identical in terms of atomics and
barriers (after a subsequent optimisation patch), single threaded
performance is not changed (no measurable difference).
On smaller systems, performance and fairness seem to be generally
improved. Using dbench on tmpfs as a test (that starts to run into
kernel spinlock contention), a 2-socket OpenPOWER POWER9 system was
tested with bare metal and KVM guest configurations. Results can be
found here:
https://github.com/linuxppc/issues/issues/305#issuecomment-663487453
Observations are:
- Queued spinlocks are equal when contention is insignificant, as
expected and as measured with microbenchmarks.
- When there is contention, on bare metal queued spinlocks have better
throughput and max latency at all points.
- When virtualised, queued spinlocks are slightly worse approaching
peak throughput, but significantly better throughput and max latency
at all points beyond peak, until queued spinlock maximum latency
rises when clients are 2x vCPUs.
The regressions haven't been analysed very well yet; there are a lot
of things that can be tuned, particularly the paravirtualised locking,
but the numbers already look like a good net win even on relatively
small systems.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200724131423.1362108-4-npiggin@gmail.com
2020-07-24 23:14:20 +10:00
select ARCH_USE_QUEUED_RWLOCKS if PPC_QUEUED_SPINLOCKS
select ARCH_USE_QUEUED_SPINLOCKS if PPC_QUEUED_SPINLOCKS
2017-03-06 22:53:59 +11:00
select ARCH_WANT_IPC_PARSE_VERSION
2020-09-14 14:52:17 +10:00
select ARCH_WANT_IRQS_OFF_ACTIVATE_MM
2020-11-19 13:46:56 -07:00
select ARCH_WANT_LD_ORPHAN_WARN
2017-01-14 13:32:50 -08:00
select ARCH_WEAK_RELEASE_ACQUIRE
2013-03-06 18:11:51 +00:00
select BINFMT_ELF
2019-12-04 08:46:31 +08:00
select BUILDTIME_TABLE_SORT
2017-03-06 22:53:59 +11:00
select CLONE_BACKWARDS
select DCACHE_WORD_ACCESS if PPC64 && CPU_LITTLE_ENDIAN
2020-07-08 12:22:47 +02:00
select DMA_OPS_BYPASS if PPC64
2021-04-21 17:06:42 +00:00
select DMA_OPS if PPC64
2018-03-27 15:29:06 +11:00
select DYNAMIC_FTRACE if FUNCTION_TRACER
2017-03-06 22:53:59 +11:00
select EDAC_ATOMIC_SCRUB
select EDAC_SUPPORT
select GENERIC_ATOMIC64 if PPC32
select GENERIC_CLOCKEVENTS_BROADCAST if SMP
select GENERIC_CMOS_UPDATE
select GENERIC_CPU_AUTOPROBE
2018-07-28 09:06:34 +10:00
select GENERIC_CPU_VULNERABILITIES if PPC_BARRIER_NOSPEC
2019-09-12 13:49:43 +00:00
select GENERIC_EARLY_IOREMAP
2021-03-31 16:48:47 +00:00
select GENERIC_GETTIMEOFDAY
2017-03-06 22:53:59 +11:00
select GENERIC_IRQ_SHOW
select GENERIC_IRQ_SHOW_LEVEL
2018-11-15 20:05:32 +01:00
select GENERIC_PCI_IOMAP if PCI
2017-03-06 22:53:59 +11:00
select GENERIC_SMP_IDLE_THREAD
select GENERIC_STRNCPY_FROM_USER
select GENERIC_STRNLEN_USER
powerpc: Convert VDSO update function to use new update_vsyscall interface
This converts the powerpc VDSO time update function to use the new
interface introduced in commit 576094b7f0aa ("time: Introduce new
GENERIC_TIME_VSYSCALL", 2012-09-11). Where the old interface gave
us the time as of the last update in seconds and whole nanoseconds,
with the new interface we get the nanoseconds part effectively in
a binary fixed-point format with tk->tkr_mono.shift bits to the
right of the binary point.
With the old interface, the fractional nanoseconds got truncated,
meaning that the value returned by the VDSO clock_gettime function
would have about 1ns of jitter in it compared to the value computed
by the generic timekeeping code in the kernel.
The powerpc VDSO time functions (clock_gettime and gettimeofday)
already work in units of 2^-32 seconds, or 0.23283 ns, because that
makes it simple to split the result into seconds and fractional
seconds, and represent the fractional seconds in either microseconds
or nanoseconds. This is good enough accuracy for now, so this patch
avoids changing how the VDSO works or the interface in the VDSO data
page.
This patch converts the powerpc update_vsyscall_old to be called
update_vsyscall and use the new interface. We convert the fractional
second to units of 2^-32 seconds without truncating to whole nanoseconds.
(There is still a conversion to whole nanoseconds for any legacy users
of the vdso_data/systemcfg stamp_xtime field.)
In addition, this improves the accuracy of the computation of tb_to_xs
for those systems with high-frequency timebase clocks (>= 268.5 MHz)
by doing the right shift in two parts, one before the multiplication and
one after, rather than doing the right shift before the multiplication.
(We can't do all of the right shift after the multiplication unless we
use 128-bit arithmetic.)
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
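The commit message above works in units of 2^-32 seconds and converts a nanosecond value that carries extra fractional bits without first truncating it to whole nanoseconds. The following is a rough sketch of that arithmetic, assuming a hypothetical shift of 8 bits and a made-up sample value. The function name `to_frac_sec_units` is illustrative and this is not the kernel's implementation.

```python
from fractions import Fraction

NSEC_PER_SEC = 10**9

# One unit of 2^-32 seconds, expressed in nanoseconds (about 0.23283 ns).
UNIT_NS = Fraction(NSEC_PER_SEC, 2**32)
print("2^-32 s in ns:", float(UNIT_NS))


def to_frac_sec_units(shifted_ns: int, shift: int) -> int:
    """Convert nanoseconds with `shift` fractional bits to units of 2^-32 s.

    Python integers are arbitrary precision, so the whole multiplication
    can happen before the division; C code has to split the shift to
    stay within 64 bits, as the commit message describes.
    """
    return (shifted_ns << 32) // (NSEC_PER_SEC << shift)


shift = 8                               # hypothetical number of fractional bits
shifted = (500 << shift) + 200          # 500 + 200/256 ns, a made-up sample
whole_ns_only = (shifted >> shift) << shift  # same value truncated to whole ns

exact = to_frac_sec_units(shifted, shift)
truncated = to_frac_sec_units(whole_ns_only, shift)
print("keeping fractional ns:", exact)
print("truncating to whole ns first:", truncated)
```

The gap between the two results is the sub-nanosecond part that the old interface discarded.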
2017-05-27 18:04:52 +10:00
select GENERIC_TIME_VSYSCALL
2021-03-31 16:48:47 +00:00
select GENERIC_VDSO_TIME_NS
2017-03-06 22:53:59 +11:00
select HAVE_ARCH_AUDITSYSCALL
2021-05-03 19:17:55 +10:00
select HAVE_ARCH_HUGE_VMALLOC if HAVE_ARCH_HUGE_VMAP
2019-06-10 13:08:18 +10:00
select HAVE_ARCH_HUGE_VMAP if PPC_BOOK3S_64 && PPC_RADIX_MMU
2017-03-06 22:53:59 +11:00
select HAVE_ARCH_JUMP_LABEL
2021-03-23 15:47:59 +00:00
select HAVE_ARCH_JUMP_LABEL_RELATIVE
2020-05-28 10:17:04 +00:00
select HAVE_ARCH_KASAN if PPC32 && PPC_PAGE_SHIFT <= 14
select HAVE_ARCH_KASAN_VMALLOC if PPC32 && PPC_PAGE_SHIFT <= 14
2021-03-04 14:35:09 +00:00
select HAVE_ARCH_KFENCE if PPC32
2021-04-21 17:06:42 +00:00
select HAVE_ARCH_KGDB
2017-04-21 00:36:20 +10:00
select HAVE_ARCH_MMAP_RND_BITS
select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
2019-01-15 15:18:56 +11:00
select HAVE_ARCH_NVRAM_OPS
2017-03-06 22:53:59 +11:00
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_TRACEHOOK
2019-08-19 14:54:20 +09:00
select HAVE_ASM_MODVERSIONS
2017-03-06 22:53:59 +11:00
select HAVE_CONTEXT_TRACKING if PPC64
2021-04-21 17:06:42 +00:00
select HAVE_C_RECORDMCOUNT
2017-03-06 22:53:59 +11:00
select HAVE_DEBUG_KMEMLEAK
select HAVE_DEBUG_STACKOVERFLOW
2009-01-06 18:49:17 +00:00
select HAVE_DYNAMIC_FTRACE
2017-03-06 22:53:59 +11:00
select HAVE_DYNAMIC_FTRACE_WITH_REGS if MPROFILE_KERNEL
2021-03-22 16:37:52 +00:00
select HAVE_EBPF_JIT
2017-03-06 22:53:59 +11:00
select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU)
2019-07-11 20:57:14 -07:00
select HAVE_FAST_GUP
2017-03-06 22:53:59 +11:00
select HAVE_FTRACE_MCOUNT_RECORD
2018-06-07 15:22:02 +05:30
select HAVE_FUNCTION_ERROR_INJECTION
2009-02-11 20:06:43 -05:00
select HAVE_FUNCTION_GRAPH_TRACER
2017-03-06 22:53:59 +11:00
select HAVE_FUNCTION_TRACER
2018-05-28 18:22:05 +09:00
select HAVE_GCC_PLUGINS if GCC_VERSION >= 50200 # plugin support on gcc <= 5.1 is buggy on PPC
2020-11-27 00:10:05 +11:00
select HAVE_GENERIC_VDSO
2021-04-21 17:06:42 +00:00
select HAVE_HARDLOCKUP_DETECTOR_ARCH if PPC_BOOK3S_64 && SMP
select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && HAVE_PERF_EVENTS_NMI && !HAVE_HARDLOCKUP_DETECTOR_ARCH
2017-03-06 22:53:59 +11:00
select HAVE_HW_BREAKPOINT if PERF_EVENTS && (PPC_BOOK3S || PPC_8xx)
2008-02-09 10:46:40 +01:00
select HAVE_IDE
2008-07-23 21:27:08 -07:00
select HAVE_IOREMAP_PROT
2017-03-06 22:53:59 +11:00
select HAVE_IRQ_EXIT_ON_IRQ_STACK
2021-04-21 17:06:42 +00:00
select HAVE_IRQ_TIME_ACCOUNTING
2017-03-06 22:53:59 +11:00
select HAVE_KERNEL_GZIP
2019-06-14 10:16:24 +00:00
select HAVE_KERNEL_LZMA if DEFAULT_UIMAGE
2019-06-14 10:16:25 +00:00
select HAVE_KERNEL_LZO if DEFAULT_UIMAGE
2019-01-31 21:59:04 +01:00
select HAVE_KERNEL_XZ if PPC_BOOK3S || 44x
2008-02-02 15:10:35 -05:00
select HAVE_KPROBES
2017-04-19 18:22:26 +05:30
select HAVE_KPROBES_ON_FTRACE
2008-03-04 14:28:37 -08:00
select HAVE_KRETPROBES
2018-05-09 23:00:01 +10:00
select HAVE_LD_DEAD_CODE_DATA_ELIMINATION
2017-03-06 22:53:59 +11:00
select HAVE_LIVEPATCH if HAVE_DYNAMIC_FTRACE_WITH_REGS
select HAVE_MOD_ARCH_SPECIFIC
2017-07-12 14:35:52 -07:00
select HAVE_NMI if PERF_EVENTS || (PPC64 && PPC_BOOK3S)
2021-04-20 14:02:07 +00:00
select HAVE_OPTPROBES
perf: Do the big rename: Performance Counters -> Performance Events
Bye-bye Performance Counters, welcome Performance Events!
In the past few months the perfcounters subsystem has outgrown its
initial role of counting hardware events, and has become (and is
becoming) a much broader generic event enumeration, reporting, logging,
monitoring, analysis facility.
Naming its core object 'perf_counter' and naming the subsystem
'perfcounters' has become more and more of a misnomer. With pending
code like hw-breakpoints support the 'counter' name is less and
less appropriate.
All in all, we've decided to rename the subsystem to 'performance
events' and to propagate this rename through all fields, variables
and API names. (in an ABI compatible fashion)
The word 'event' is also a bit shorter than 'counter' - which makes
it slightly more convenient to write/handle as well.
Thanks go to Stephane Eranian who first observed this misnomer and
suggested a rename.
User-space tooling and ABI compatibility is not affected - this patch
should be function-invariant. (Also, defconfigs were not touched to
keep the size down.)
This patch has been generated via the following script:
FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
sed -i \
-e 's/PERF_EVENT_/PERF_RECORD_/g' \
-e 's/PERF_COUNTER/PERF_EVENT/g' \
-e 's/perf_counter/perf_event/g' \
-e 's/nb_counters/nb_events/g' \
-e 's/swcounter/swevent/g' \
-e 's/tpcounter_event/tp_event/g' \
$FILES
for N in $(find . -name perf_counter.[ch]); do
M=$(echo $N | sed 's/perf_counter/perf_event/g')
mv $N $M
done
FILES=$(find . -name perf_event.*)
sed -i \
-e 's/COUNTER_MASK/REG_MASK/g' \
-e 's/COUNTER/EVENT/g' \
-e 's/\<event\>/event_id/g' \
-e 's/counter/event/g' \
-e 's/Counter/Event/g' \
$FILES
... to keep it as correct as possible. This script can also be
used by anyone who has pending perfcounters patches - it converts
a Linux kernel tree over to the new naming. We tried to time this
change to the point in time where the amount of pending patches
is the smallest: the end of the merge window.
Namespace clashes were fixed up in a preparatory patch - and some
stylistic fallout will be fixed up in a subsequent patch.
( NOTE: 'counters' are still the proper terminology when we deal
with hardware registers - and these sed scripts are a bit
over-eager in renaming them. I've undone some of that, but
in case there's something left where 'counter' would be
better than 'event' we can undo that on an individual basis
instead of touching an otherwise nicely automated patch. )
Suggested-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-21 12:02:48 +02:00
select HAVE_PERF_EVENTS
2017-03-06 22:53:59 +11:00
select HAVE_PERF_EVENTS_NMI if PPC64
2016-02-20 10:32:46 +05:30
select HAVE_PERF_REGS
2016-04-28 15:01:08 +05:30
select HAVE_PERF_USER_STACK_DUMP
2010-04-07 18:10:20 +10:00
select HAVE_REGS_AND_STACK_ACCESS_API
2021-03-16 07:57:13 +00:00
select HAVE_RELIABLE_STACKTRACE
2021-04-21 17:06:42 +00:00
select HAVE_RSEQ
2021-02-10 00:40:52 +01:00
select HAVE_SOFTIRQ_ON_OWN_STACK
2021-04-21 17:06:42 +00:00
select HAVE_STACKPROTECTOR if PPC32 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r2)
select HAVE_STACKPROTECTOR if PPC64 && $(cc-option,-mstack-protector-guard=tls -mstack-protector-guard-reg=r13)
2017-03-06 22:53:59 +11:00
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_VIRT_CPU_ACCOUNTING
2021-05-08 21:12:55 +10:00
select HUGETLB_PAGE_SIZE_VARIABLE if PPC_BOOK3S_64 && HUGETLB_PAGE
2018-04-03 15:47:59 +02:00
select IOMMU_HELPER if PPC64
2012-02-16 01:37:49 -07:00
select IRQ_DOMAIN
2011-10-05 02:30:51 +00:00
select IRQ_FORCED_THREADING
2021-04-21 17:06:42 +00:00
select MMU_GATHER_PAGE_SIZE
select MMU_GATHER_RCU_TABLE_FREE
2012-09-28 14:31:03 +09:30
select MODULES_USE_ELF_RELA
2018-07-30 09:37:21 +02:00
select NEED_DMA_MAP_STATE if PPC64 || NOT_COHERENT_CACHE
2018-04-05 09:44:52 +02:00
select NEED_SG_DMA_LENGTH
2017-03-06 22:53:59 +11:00
select OF
2020-01-26 22:52:47 +11:00
select OF_DMA_DEFAULT_COHERENT if !NOT_COHERENT_CACHE
2017-03-06 22:53:59 +11:00
select OF_EARLY_FLATTREE
select OLD_SIGACTION if PPC32
select OLD_SIGSUSPEND
2018-11-15 20:05:33 +01:00
select PCI_DOMAINS if PCI
2020-09-28 12:13:07 +02:00
select PCI_MSI_ARCH_FALLBACKS if PCI_MSI
2018-11-15 20:05:34 +01:00
select PCI_SYSCALL if PCI
2019-06-04 13:00:37 +10:00
select PPC_DAWR if PPC64
2018-04-23 10:36:38 +02:00
select RTC_LIB
2017-03-06 22:53:59 +11:00
select SPARSE_IRQ
select SYSCTL_EXCEPTION_TRACE
2019-01-31 10:08:58 +00:00
select THREAD_INFO_IN_TASK
2017-03-06 22:53:59 +11:00
select VIRT_TO_BUS if !PPC64
#
# Please keep this list sorted alphabetically.
#
2005-09-26 16:04:21 +10:00
2018-07-28 09:06:34 +10:00
config PPC_BARRIER_NOSPEC
2019-07-03 18:04:13 +02:00
bool
default y
depends on PPC_BOOK3S_64 || PPC_FSL_BOOK3E
2018-07-28 09:06:34 +10:00
2005-09-26 16:04:21 +10:00
config EARLY_PRINTK
bool
2005-11-23 17:57:25 +11:00
default y
2005-09-26 16:04:21 +10:00
2013-11-25 23:23:11 +00:00
config PANIC_TIMEOUT
int
default 180
2005-09-26 16:04:21 +10:00
config COMPAT
2020-03-20 11:20:17 +01:00
bool "Enable support for 32bit binaries"
depends on PPC64
default y if !CPU_LITTLE_ENDIAN
[PATCH v3] ipc: provide generic compat versions of IPC syscalls
When using the "compat" APIs, architectures will generally want to
be able to make direct syscalls to msgsnd(), shmctl(), etc., and
in the kernel we would want them to be handled directly by
compat_sys_xxx() functions, as is true for other compat syscalls.
However, for historical reasons, several of the existing compat IPC
syscalls do not do this. semctl() expects a pointer to the fourth
argument, instead of the fourth argument itself. msgsnd(), msgrcv()
and shmat() expect arguments in different order.
This change adds an ARCH_WANT_OLD_COMPAT_IPC config option that can be
set to preserve this behavior for ports that use it (x86, sparc, powerpc,
s390, and mips). No actual semantics are changed for those architectures,
and there is only a minimal amount of code refactoring in ipc/compat.c.
Newer architectures like tile (and perhaps future architectures such
as arm64 and unicore64) should not select this option, and thus can
avoid having any IPC-specific code at all in their architecture-specific
compat layer. In the same vein, if this option is not selected, IPC_64
mode is assumed, since that's what the <asm-generic> headers expect.
The workaround code in "tile" for msgsnd() and msgrcv() is removed
with this change; it also fixes the bug that shmat() and semctl() were
not being properly handled.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2012-03-15 13:13:38 -04:00
select ARCH_WANT_OLD_COMPAT_IPC
2012-12-25 19:27:42 -05:00
select COMPAT_OLD_SIGACTION
2005-09-26 16:04:21 +10:00
config SYSVIPC_COMPAT
bool
depends on COMPAT && SYSVIPC
default y
2008-11-11 09:05:16 +01:00
config SCHED_OMIT_FRAME_POINTER
2005-09-26 16:04:21 +10:00
bool
default y
config ARCH_MAY_HAVE_PC_FDC
bool
2014-08-18 17:13:41 -04:00
default PCI
2005-09-26 16:04:21 +10:00
2006-01-10 21:43:56 -06:00
config PPC_UDBG_16550
bool
config GENERIC_TBSYNC
bool
default y if PPC32 && SMP
2006-09-12 03:04:40 -04:00
config AUDIT_ARCH
bool
default y
2006-12-08 03:30:41 -08:00
config GENERIC_BUG
bool
default y
depends on BUG
2020-12-01 11:52:03 +11:00
config GENERIC_BUG_RELATIVE_POINTERS
def_bool y
depends on GENERIC_BUG
2007-03-20 05:18:02 +11:00
config SYS_SUPPORTS_APM_EMULATION
2007-05-23 09:51:46 -05:00
default y if PMAC_APM_EMU
2007-03-20 05:18:02 +11:00
bool
2011-04-14 18:29:16 +00:00
config EPAPR_BOOT
bool
help
Used to allow a board to specify it wants an ePAPR compliant wrapper.
2006-01-16 10:53:22 -06:00
config DEFAULT_UIMAGE
bool
help
Used to allow a board to specify it wants a uImage built by default
2007-12-08 02:12:39 +01:00
config ARCH_HIBERNATION_POSSIBLE
bool
2007-05-03 22:31:38 +10:00
default y
2007-12-08 02:14:00 +01:00
config ARCH_SUSPEND_POSSIBLE
def_bool y
2009-09-16 01:43:57 +04:00
depends on ADB_PMU || PPC_EFIKA || PPC_LITE5200 || PPC_83xx || \
2012-07-20 20:42:36 +08:00
(PPC_85xx && !PPC_E500MC) || PPC_86xx || PPC_PSERIES \
|| 44x || 40x
2007-12-08 02:14:00 +01:00
2019-04-11 13:34:46 +10:00
config ARCH_SUSPEND_NONZERO_CPU
def_bool y
depends on PPC_POWERNV || PPC_PSERIES
2006-11-11 17:24:53 +11:00
config PPC_DCR_NATIVE
bool
config PPC_DCR_MMIO
bool
config PPC_DCR
bool
depends on PPC_DCR_NATIVE || PPC_DCR_MMIO
default y
2006-11-11 17:25:08 +11:00
config PPC_OF_PLATFORM_PCI
bool
2007-12-21 15:37:07 +11:00
depends on PCI
2006-11-11 17:25:08 +11:00
depends on PPC64 # not supported on 32 bits yet
2012-08-23 21:31:32 +00:00
config ARCH_SUPPORTS_UPROBES
def_bool y
2010-02-08 11:50:57 +00:00
config PPC_ADV_DEBUG_REGS
bool
depends on 40x || BOOKE
default y
config PPC_ADV_DEBUG_IACS
int
depends on PPC_ADV_DEBUG_REGS
default 4 if 44x
default 2
config PPC_ADV_DEBUG_DACS
int
depends on PPC_ADV_DEBUG_REGS
default 2
config PPC_ADV_DEBUG_DVCS
int
depends on PPC_ADV_DEBUG_REGS
default 2 if 44x
default 0
config PPC_ADV_DEBUG_DAC_RANGE
bool
depends on PPC_ADV_DEBUG_REGS && 44x
default y
2019-06-04 13:00:37 +10:00
config PPC_DAWR
bool
2018-12-16 17:53:49 +01:00
config ZONE_DMA
2014-08-08 18:40:42 -05:00
bool
2018-12-16 17:53:49 +01:00
default y if PPC_BOOK3E_64
2014-08-08 18:40:42 -05:00
2015-04-14 15:45:57 -07:00
config PGTABLE_LEVELS
int
default 2 if !PPC64
default 4
[POWERPC] 4xx: PLB to PCI Express support
This adds to the previous 2 patches the support for the 4xx PCI Express
cells as found in the 440SPe revA, revB and 405EX.
Unfortunately, due to significant differences between these, and other
interesting "features" of those pieces of HW, the code isn't as simple
as it is for PCI and PCI-X and some of the functions differ significantly
between the 3 implementations. Thus, not only this code can only support
those 3 implementations for now and will refuse to operate on any other,
but there are added ifdef's to avoid the bloat of building a fairly large
amount of code on platforms that don't need it.
Also, this code currently only supports fully initializing root complex
nodes, not endpoint. Some more code will have to be lifted from the
arch/ppc implementation to add the endpoint support, though it's mostly
differences in memory mapping, and the question on how to represent
endpoint mode PCI in the device-tree is thus open.
Many thanks to Stefan Roese for testing & fixing up the 405EX bits !
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2007-12-21 15:39:24 +11:00
source "arch/powerpc/sysdev/Kconfig"
2007-03-16 09:32:17 -05:00
source "arch/powerpc/platforms/Kconfig"
2005-09-26 16:04:21 +10:00
menu "Kernel options"
config HIGHMEM
bool "High memory support"
depends on PPC32
2020-11-03 10:27:27 +01:00
select KMAP_LOCAL
2005-09-26 16:04:21 +10:00
2018-12-11 20:01:04 +09:00
source "kernel/Kconfig.hz"
2005-09-26 16:04:21 +10:00
config MATH_EMULATION
bool "Math emulation"
2017-08-08 13:58:54 +02:00
depends on 4xx || PPC_8xx || PPC_MPC832x || BOOKE
2020-08-18 17:19:17 +00:00
select PPC_FPU_REGS
2019-07-03 18:04:13 +02:00
help
2005-09-26 16:04:21 +10:00
Some PowerPC chips designed for embedded applications do not have
a floating-point unit and therefore do not implement the
floating-point instructions in the PowerPC instruction set. If you
say Y here, the kernel will include code to emulate a floating-point
unit, which will allow programs that use floating-point
instructions to run.
2013-06-09 17:01:24 +10:00
This is also useful to emulate missing (optional) instructions
such as fsqrt on cores that do have an FPU but do not implement
them (such as Freescale BookE).
2013-07-16 19:57:15 +08:00
choice
prompt "Math emulation options"
default MATH_EMULATION_FULL
depends on MATH_EMULATION
config MATH_EMULATION_FULL
bool "Emulate all the floating point instructions"
2019-07-03 18:04:13 +02:00
help
2013-07-16 19:57:15 +08:00
Selecting this option enables the kernel to emulate all floating
point instructions. If your SoC doesn't have an FPU, you should
select this.
config MATH_EMULATION_HW_UNIMPLEMENTED
bool "Just emulate the FPU unimplemented instructions"
2019-07-03 18:04:13 +02:00
help
2013-07-16 19:57:15 +08:00
Select this if you know that your SoC has a hardware FPU that does
not implement some floating point instructions.
endchoice
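# Illustrative only (not part of the option definitions): on a core that
# has an FPU but lacks some optional instructions such as fsqrt, the
# resulting .config would be expected to contain something like:
#
#   CONFIG_MATH_EMULATION=y
#   # CONFIG_MATH_EMULATION_FULL is not set
#   CONFIG_MATH_EMULATION_HW_UNIMPLEMENTED=y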
2013-02-13 16:21:43 +00:00
config PPC_TRANSACTIONAL_MEM
2019-07-03 18:04:13 +02:00
bool "Transactional Memory support for POWERPC"
depends on PPC_BOOK3S_64
depends on SMP
select ALTIVEC
select VSX
help
Support user-mode Transactional Memory on POWERPC.
2013-02-13 16:21:43 +00:00
2019-11-25 08:36:31 +05:30
config PPC_UV
bool "Ultravisor support"
depends on KVM_BOOK3S_HV_POSSIBLE
2020-01-09 14:50:47 +05:30
depends on DEVICE_PRIVATE
2019-11-25 08:36:31 +05:30
default n
help
This option paravirtualizes the kernel to run on POWER platforms that
support the Protected Execution Facility (PEF). On such platforms,
the ultravisor firmware runs at a privilege level above the
hypervisor.
If unsure, say "N".
2017-05-29 17:39:40 +10:00
config LD_HEAD_STUB_CATCH
bool "Reserve 256 bytes to cope with linker stubs in HEAD text" if EXPERT
depends on PPC64
help
Very large kernels can cause linker branch stubs to be generated by
code in head_64.S, which moves the head text sections out of their
specified location. This option can work around the problem.
If unsure, say "N".
2016-03-03 15:27:00 +11:00
config MPROFILE_KERNEL
2020-04-22 14:56:12 +05:30
depends on PPC64 && CPU_LITTLE_ENDIAN && FUNCTION_TRACER
2018-05-30 22:19:22 +10:00
def_bool $(success,$(srctree)/arch/powerpc/tools/gcc-check-mprofile-kernel.sh $(CC) -I$(srctree)/include -D__KERNEL__)
2016-03-03 15:27:00 +11:00
2005-09-26 16:04:21 +10:00
config HOTPLUG_CPU
bool "Support for enabling/disabling CPUs"
2013-05-21 13:49:35 +10:00
depends on SMP && (PPC_PSERIES || \
2020-01-28 18:22:25 -08:00
PPC_PMAC || PPC_POWERNV || FSL_SOC_BOOKE)
2019-07-03 18:04:13 +02:00
help
2005-09-26 16:04:21 +10:00
Say Y here to be able to disable and re-enable individual
CPUs at runtime on SMP machines.
Say N if you are unsure.
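# Illustrative usage, assuming the standard sysfs CPU hotplug interface
# and a machine where cpu1 exists and is allowed to go offline:
#
#   echo 0 > /sys/devices/system/cpu/cpu1/online    # take cpu1 offline
#   echo 1 > /sys/devices/system/cpu/cpu1/online    # bring it back online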
powerpc/64s: Implement queued spinlocks and rwlocks
These have shown significantly improved performance and fairness when
spinlock contention is moderate to high on very large systems.
With this series including subsequent patches, on a 16 socket 1536
thread POWER9, a stress test such as same-file open/close from all
CPUs gets big speedups, 11620op/s aggregate with simple spinlocks vs
384158op/s (33x faster), where the difference in throughput between
the fastest and slowest thread goes from 7x to 1.4x.
Thanks to the fast path being identical in terms of atomics and
barriers (after a subsequent optimisation patch), single threaded
performance is not changed (no measurable difference).
On smaller systems, performance and fairness seems to be generally
improved. Using dbench on tmpfs as a test (that starts to run into
kernel spinlock contention), a 2-socket OpenPOWER POWER9 system was
tested with bare metal and KVM guest configurations. Results can be
found here:
https://github.com/linuxppc/issues/issues/305#issuecomment-663487453
Observations are:
- Queued spinlocks are equal when contention is insignificant, as
expected and as measured with microbenchmarks.
- When there is contention, on bare metal queued spinlocks have better
throughput and max latency at all points.
- When virtualised, queued spinlocks are slightly worse approaching
peak throughput, but significantly better throughput and max latency
at all points beyond peak, until queued spinlock maximum latency
rises when clients are 2x vCPUs.
The regressions haven't been analysed very well yet, there are a lot
of things that can be tuned, particularly the paravirtualised locking,
but the numbers already look like a good net win even on relatively
small systems.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200724131423.1362108-4-npiggin@gmail.com
2020-07-24 23:14:20 +10:00
config PPC_QUEUED_SPINLOCKS
2021-01-18 22:34:51 +10:00
bool "Queued spinlocks" if EXPERT
2020-07-24 23:14:20 +10:00
depends on SMP
2021-01-18 22:34:51 +10:00
default PPC_BOOK3S_64
2020-07-24 23:14:20 +10:00
help
Say Y here to use queued spinlocks which give better scalability and
fairness on large SMP and NUMA systems without harming single threaded
performance.
2009-11-25 17:23:25 +00:00
config ARCH_CPU_PROBE_RELEASE
def_bool y
depends on HOTPLUG_CPU
2013-11-15 09:50:50 +05:30
config PPC64_SUPPORTS_MEMORY_FAILURE
bool "Add support for memory hwpoison"
depends on PPC_BOOK3S_64
default "y" if PPC_POWERNV
select ARCH_SUPPORTS_MEMORY_FAILURE
2005-09-26 16:04:21 +10:00
config KEXEC
2013-01-16 18:53:25 -08:00
bool "kexec system call"
2015-10-06 22:48:22 -05:00
depends on (PPC_BOOK3S || FSL_BOOKE || (44x && !SMP)) || PPC_BOOK3E
2015-09-09 15:38:55 -07:00
select KEXEC_CORE
2005-09-26 16:04:21 +10:00
help
kexec is a system call that implements the ability to shut down your
current kernel, and to start another kernel. It is like a reboot
2006-06-29 01:32:47 -04:00
but it is independent of the system firmware. And like a reboot
2005-09-26 16:04:21 +10:00
you can start any kernel with it, not just Linux.
2006-06-29 01:32:47 -04:00
The name comes from the similarity to the exec system call.
2005-09-26 16:04:21 +10:00
It is an ongoing process to be certain the hardware in a machine
is properly shut down, so do not be surprised if this code does not
2013-08-20 21:38:03 +02:00
initially work for you. As of this writing the exact hardware
interface is strongly in flux, so no good recommendation can be
made.
2005-09-26 16:04:21 +10:00
2016-11-29 23:45:53 +11:00
config KEXEC_FILE
bool "kexec file based system call"
select KEXEC_CORE
2021-02-21 09:49:26 -08:00
select HAVE_IMA_KEXEC if IMA
2016-11-29 23:45:53 +11:00
select BUILD_BIN2C
2019-08-23 21:49:13 +02:00
select KEXEC_ELF
2016-11-29 23:45:53 +11:00
depends on PPC64
depends on CRYPTO=y
depends on CRYPTO_SHA256=y
help
This is a new version of the kexec system call. This call is
file based and takes in file descriptors as system call arguments
for kernel and initramfs as opposed to a list of segments as is the
case for the older kexec call.
kexec_file: make use of purgatory optional
Patch series "kexec_file, x86, powerpc: refactoring for other
architecutres", v2.
This is a preparatory patchset for adding kexec_file support on arm64.
It was originally included in a arm64 patch set[1], but Philipp is also
working on their kexec_file support on s390[2] and some changes are now
conflicting.
So these common parts were extracted and put into a separate patch set
for better integration. What's more, my original patch#4 was split into
a few small chunks for easier review after Dave's comment.
As such, the resulting code is basically identical with my original, and
the only *visible* differences are:
- renaming of _kexec_kernel_image_probe() and _kimage_file_post_load_cleanup()
- change one of types of arguments at prepare_elf64_headers()
Those, unfortunately, require a couple of trivial changes on the rest
(#1, #6 to #13) of my arm64 kexec_file patch set[1].
Patch #1 allows making a use of purgatory optional, particularly useful
for arm64.
Patch #2 commonalizes arch_kexec_kernel_{image_probe, image_load,
verify_sig}() and arch_kimage_file_post_load_cleanup() across
architectures.
Patches #3-#7 are also intended to generalize parse_elf64_headers(),
along with exclude_mem_range(), to be made best re-use of.
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-February/561182.html
[2] http://lkml.iu.edu//hypermail/linux/kernel/1802.1/02596.html
This patch (of 7):
On arm64, crash dump kernel's usable memory is protected by *unmapping*
it from kernel virtual space unlike other architectures where the region
is just made read-only. It is highly unlikely that the region is
accidentally corrupted and this observation rationalizes that digest
check code can also be dropped from purgatory. The resulting code is so
simple as it doesn't require a bit ugly re-linking/relocation stuff,
i.e. arch_kexec_apply_relocations_add().
Please see:
http://lists.infradead.org/pipermail/linux-arm-kernel/2017-December/545428.html
All that the purgatory does is to shuffle arguments and jump into a new
kernel, while we still need to have some space for a hash value
(purgatory_sha256_digest) which is never checked against.
As such, it doesn't make sense to have trampline code between old kernel
and new kernel on arm64.
This patch introduces a new configuration, ARCH_HAS_KEXEC_PURGATORY, and
allows related code to be compiled in only if necessary.
[takahiro.akashi@linaro.org: fix trivial screwup]
Link: http://lkml.kernel.org/r/20180309093346.GF25863@linaro.org
Link: http://lkml.kernel.org/r/20180306102303.9063-2-takahiro.akashi@linaro.org
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-13 15:35:45 -07:00
config ARCH_HAS_KEXEC_PURGATORY
def_bool KEXEC_FILE
2016-07-13 09:14:39 +08:00
config RELOCATABLE
bool "Build a relocatable kernel"
2016-10-19 14:16:00 +11:00
depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE))
2016-07-13 09:14:39 +08:00
select NONSTATIC_KERNEL
modversions: treat symbol CRCs as 32 bit quantities
The modversion symbol CRCs are emitted as ELF symbols, which allows us
to easily populate the kcrctab sections by relying on the linker to
associate each kcrctab slot with the correct value.
This has a couple of downsides:
- Given that the CRCs are treated as memory addresses, we waste 4 bytes
for each CRC on 64 bit architectures,
- On architectures that support runtime relocation, a R_<arch>_RELATIVE
relocation entry is emitted for each CRC value, which identifies it
as a quantity that requires fixing up based on the actual runtime
load offset of the kernel. This results in corrupted CRCs unless we
explicitly undo the fixup (and this is currently being handled in the
core module code)
- Such runtime relocation entries take up 24 bytes of __init space
each, resulting in a x8 overhead in [uncompressed] kernel size for
CRCs.
Switching to explicit 32 bit values on 64 bit architectures fixes most
of these issues, given that 32 bit values are not treated as quantities
that require fixing up based on the actual runtime load offset. Note
that on some ELF64 architectures [such as PPC64], these 32-bit values
are still emitted as [absolute] runtime relocatable quantities, even if
the value resolves to a build time constant. Since relative relocations
are always resolved at build time, this patch enables MODULE_REL_CRCS on
powerpc when CONFIG_RELOCATABLE=y, which turns the absolute CRC
references into relative references into .rodata where the actual CRC
value is stored.
So redefine all CRC fields and variables as u32, and redefine the
__CRC_SYMBOL() macro for 64 bit builds to emit the CRC reference using
inline assembler (which is necessary since 64-bit C code cannot use
32-bit types to hold memory addresses, even if they are ultimately
resolved using values that do not exceed 0xffffffff). To avoid
potential problems with legacy 32-bit architectures using legacy
toolchains, the equivalent C definition of the kcrctab entry is retained
for 32-bit architectures.
Note that this mostly reverts commit d4703aefdbc8 ("module: handle ppc64
relocating kcrctabs when CONFIG_RELOCATABLE=y")
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-03 09:54:06 +00:00
select MODULE_REL_CRCS if MODVERSIONS
2016-07-13 09:14:39 +08:00
help
This builds a kernel image that is capable of running at the
location the kernel is loaded at. For ppc32, there are no
alignment restrictions, and this feature is a superset of
DYNAMIC_MEMSTART and hence overrides it. For ppc64, a 16k-aligned
base address should be used. The kernel is linked as a
position-independent executable (PIE) and contains dynamic relocations
which are processed early in the bootup process.
One use is for the kexec on panic case where the recovery kernel
must live at a different physical address than the primary
kernel.
Note: If CONFIG_RELOCATABLE=y, then the kernel runs from the address
it has been loaded at and the compile-time physical address
CONFIG_PHYSICAL_START is ignored. However, the CONFIG_PHYSICAL_START
setting can still be useful to bootwrappers that need to know the
load address of the kernel (e.g. u-boot/mkimage).
2019-09-20 17:45:40 +08:00
config RANDOMIZE_BASE
bool "Randomize the address of the kernel image"
depends on (FSL_BOOKE && FLATMEM && PPC32)
depends on RELOCATABLE
help
Randomizes the virtual address at which the kernel image is
loaded, as a security feature that deters exploit attempts
relying on knowledge of the location of kernel internals.
If unsure, say Y.
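# Illustrative only: given the dependencies above, a .config that enables
# this option would be expected to contain at least the following:
#
#   CONFIG_PPC32=y
#   CONFIG_FSL_BOOKE=y
#   CONFIG_FLATMEM=y
#   CONFIG_RELOCATABLE=y
#   CONFIG_RANDOMIZE_BASE=y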
2016-10-14 18:31:33 +11:00
config RELOCATABLE_TEST
bool "Test relocatable kernel"
depends on (PPC64 && RELOCATABLE)
help
This runs the relocatable kernel at the address it was initially
loaded at, which tends to be non-zero and therefore tests the
relocation code.
2006-01-14 13:48:25 -08:00
config CRASH_DUMP
2017-05-08 15:56:24 -07:00
bool "Build a dump capture kernel"
2018-11-17 10:24:58 +00:00
depends on PPC64 || PPC_BOOK3S_32 || FSL_BOOKE || (44x && !SMP)
2016-10-19 14:16:00 +11:00
select RELOCATABLE if PPC64 || 44x || FSL_BOOKE
2006-01-14 13:48:25 -08:00
help
2017-05-08 15:56:24 -07:00
Build a kernel suitable for use as a dump capture kernel.
2008-10-21 17:38:10 +00:00
The same kernel binary can be used as production kernel and dump
capture kernel.
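# Illustrative only: a PPC64 kernel meant to serve as both production and
# dump capture kernel would be expected to carry a fragment like the one
# below. CONFIG_KEXEC is an assumption here (it lets the production kernel
# load the capture kernel); CONFIG_RELOCATABLE follows from the "select"
# above.
#
#   CONFIG_KEXEC=y
#   CONFIG_RELOCATABLE=y
#   CONFIG_CRASH_DUMP=y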
2006-01-14 13:48:25 -08:00
2012-02-16 01:14:22 +00:00
config FA_DUMP
bool "Firmware-assisted dump"
2019-09-11 20:20:26 +05:30
depends on PPC64 && (PPC_RTAS || PPC_POWERNV)
2017-05-08 15:56:24 -07:00
select CRASH_CORE
select CRASH_DUMP
2008-03-22 10:50:50 +11:00
help
2012-02-16 01:14:22 +00:00
A robust mechanism to get a reliable kernel crash dump with
assistance from firmware. This approach does not use kexec;
2017-05-08 15:56:24 -07:00
instead firmware assists in booting the capture kernel
2012-02-16 01:14:22 +00:00
while preserving memory contents. Firmware-assisted dump
is meant to be a kdump replacement offering robustness and
speed not possible without system firmware assistance.
2008-03-22 10:50:50 +11:00
2019-09-11 20:20:26 +05:30
If unsure, say "Y". Only special kernels like petitboot may
need to say "N" here.
2008-03-22 10:50:50 +11:00
2019-09-11 20:26:03 +05:30
config PRESERVE_FA_DUMP
bool "Preserve Firmware-assisted dump"
depends on PPC64 && PPC_POWERNV && !FA_DUMP
help
On a kernel with FA_DUMP disabled, this option helps to preserve
crash data from a previously crashed kernel. Useful when the next
memory preserving kernel boot would process this crash data.
The petitboot kernel is the typical use case for this option.
2019-09-11 20:26:33 +05:30
config OPAL_CORE
bool "Export OPAL memory as /sys/firmware/opal/core"
depends on PPC64 && PPC_POWERNV
help
This option uses the MPIPL support in firmware to provide an
ELF core of OPAL memory after a crash. The ELF core is exported
as /sys/firmware/opal/core file which is helpful in debugging
OPAL crashes using GDB.
2008-03-22 10:50:50 +11:00
2005-09-26 16:04:21 +10:00
config IRQ_ALL_CPUS
bool "Distribute interrupts on all CPUs by default"
2013-05-15 11:21:01 +02:00
depends on SMP
2005-09-26 16:04:21 +10:00
help
This option gives the kernel permission to distribute IRQs across
multiple CPUs. Saying N here will route all IRQs to the first
CPU. Generally saying Y is safe, although some problems have been
reported with SMP Power Macintoshes with this option enabled.
2005-10-28 17:46:58 -07:00
config NUMA
2020-11-24 23:05:47 +11:00
bool "NUMA Memory Allocation and Scheduler Support"
2020-11-24 23:05:45 +11:00
depends on PPC64 && SMP
2020-11-24 23:05:46 +11:00
default y if PPC_PSERIES || PPC_POWERNV
2020-11-24 23:05:47 +11:00
help
Enable NUMA (Non-Uniform Memory Access) support.
The kernel will try to allocate memory used by a CPU on the
local memory controller of the CPU and add some more
NUMA awareness to the kernel.
2005-10-28 17:46:58 -07:00
2006-04-10 22:53:53 -07:00
config NODES_SHIFT
int
2009-09-21 19:56:43 +00:00
default "8" if PPC64
2006-04-10 22:53:53 -07:00
default "4"
depends on NEED_MULTIPLE_NODES
2014-05-19 11:14:23 -07:00
config USE_PERCPU_NUMA_NODE_ID
def_bool y
depends on NUMA
2014-05-16 16:41:20 -07:00
config HAVE_MEMORYLESS_NODES
def_bool y
depends on NUMA
2005-09-26 16:04:21 +10:00
config ARCH_SELECT_MEMORY_MODEL
def_bool y
depends on PPC64
config ARCH_FLATMEM_ENABLE
2005-11-29 19:20:55 +00:00
def_bool y
depends on (PPC64 && !NUMA) || PPC32
2005-09-26 16:04:21 +10:00
2005-11-11 14:22:35 +11:00
config ARCH_SPARSEMEM_ENABLE
2005-09-26 16:04:21 +10:00
def_bool y
2005-11-29 19:20:55 +00:00
depends on PPC64
2007-10-16 01:24:17 -07:00
select SPARSEMEM_VMEMMAP_ENABLE
2005-09-26 16:04:21 +10:00
2005-11-11 14:22:35 +11:00
config ARCH_SPARSEMEM_DEFAULT
2005-09-26 16:04:21 +10:00
def_bool y
2017-04-05 16:10:48 +10:00
depends on PPC_BOOK3S_64
2005-09-26 16:04:21 +10:00
2016-11-15 21:59:38 +11:00
config ILLEGAL_POINTER_VALUE
hex
# This is roughly half way between the top of user space and the bottom
# of kernel space, which seems about as good as we can get.
default 0x5deadbeef0000000 if PPC64
default 0
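# Illustrative only, assuming the generic definitions in
# include/linux/poison.h, where POISON_POINTER_DELTA takes this value and
# LIST_POISON1 is defined as 0x100 + POISON_POINTER_DELTA:
#
#   PPC64:  LIST_POISON1 = 0x5deadbeef0000000 + 0x100 = 0x5deadbeef0000100
#   PPC32:  LIST_POISON1 = 0x0 + 0x100 = 0x100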
2005-11-07 09:39:48 -08:00
config ARCH_MEMORY_PROBE
def_bool y
depends on MEMORY_HOTPLUG
2008-12-11 04:55:41 +03:00
choice
prompt "Page size"
default PPC_4K_PAGES
2005-11-07 11:06:55 +11:00
help
2008-12-11 04:55:41 +03:00
Select the kernel logical page size. Increasing the page size
will reduce software overhead at each page boundary, allow
hardware prefetch mechanisms to be more effective, and allow
larger DMA transfers, increasing I/O efficiency and reducing
overhead. However the utilization of memory will increase.
For example, each cached file will use a multiple of the
page size to hold its contents and the difference between the
end of file and the end of page is wasted.
Some dedicated systems, such as software RAID serving with
accelerated calculations, have shown significant increases.
If you configure a 64 bit kernel for 64k pages but the
processor does not support them, then the kernel will simulate
them with 4k pages, loading them on demand, but with the
reduced software overhead and larger internal fragmentation.
For the 32 bit kernel, a large page option will not be offered
unless it is supported by the configured processor.
If unsure, choose 4K_PAGES.
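# Worked example of the memory trade-off described in the help text above
# (illustrative figures; the 5000-byte file size is an arbitrary choice):
#
#   4k pages:   ceil(5000 / 4096)  = 2 pages = 8192 bytes,   3192 wasted
#   64k pages:  ceil(5000 / 65536) = 1 page  = 65536 bytes, 60536 wasted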
config PPC_4K_PAGES
bool "4k page size"
2016-01-29 22:32:49 +05:30
select HAVE_ARCH_SOFT_DIRTY if PPC_BOOK3S_64
2008-12-11 04:55:41 +03:00
config PPC_16K_PAGES
2015-08-07 16:19:46 +10:00
bool "16k page size"
2018-11-29 14:07:21 +00:00
depends on 44x || PPC_8xx
2008-12-11 04:55:41 +03:00
config PPC_64K_PAGES
2015-08-07 16:19:46 +10:00
bool "64k page size"
2019-02-08 23:34:16 +11:00
depends on 44x || PPC_BOOK3S_64
2016-01-29 22:32:49 +05:30
select HAVE_ARCH_SOFT_DIRTY if PPC_BOOK3S_64
2008-12-11 04:55:41 +03:00
powerpc/44x: Support for 256KB PAGE_SIZE
This patch adds support for 256KB pages on ppc44x-based boards.
For simplification of implementation with 256KB pages we still assume
2-level paging. As a side effect this leads to wasting extra memory space
reserved for PTE tables: only 1/4 of pages allocated for PTEs are
actually used. But this may be an acceptable trade-off to achieve the
high performance we have with big PAGE_SIZEs in some applications (e.g.
RAID).
Also with 256KB PAGE_SIZE we increase THREAD_SIZE up to 32KB to minimize
the risk of stack overflows in the cases of on-stack arrays, which size
depends on the page size (e.g. multipage BIOs, NTFS, etc.).
With 256KB PAGE_SIZE we need to decrease the PKMAP_ORDER at least down
to 9, otherwise all high memory (2 ^ 10 * PAGE_SIZE == 256MB) we'll be
occupied by PKMAP addresses leaving no place for vmalloc. We do not
separate PKMAP_ORDER for 256K from 16K/64K PAGE_SIZE here; actually that
value of 10 in support for 16K/64K had been selected rather intuitively.
Thus now for all cases of PAGE_SIZE on ppc44x (including the default, 4KB,
one) we have 512 pages for PKMAP.
Because ELF standard supports only page sizes up to 64K, then you should
use binutils later than 2.17.50.0.3 with '-zmax-page-size' set to 256K
for building applications, which are to be run with the 256KB-page sized
kernel. If using the older binutils, then you should patch them like follows:
--- binutils/bfd/elf32-ppc.c.orig
+++ binutils/bfd/elf32-ppc.c
-#define ELF_MAXPAGESIZE 0x10000
+#define ELF_MAXPAGESIZE 0x40000
One more restriction we currently have with 256KB page sizes is the inability
to use shmem safely, so, for now, 256KB pages are available only if you turn
the CONFIG_SHMEM option off (another variant is to use BROKEN).
However, if you need shmem with 256KB pages, you can always remove the !SHMEM
dependency in 'config PPC_256K_PAGES', and use the workaround available here:
http://lkml.org/lkml/2008/12/19/20
Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
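The PKMAP arithmetic in the commit message above can be checked with a short sketch. This is only an illustration in Python, not kernel code; the helper name `pkmap_window_bytes` is hypothetical, and the page shift of 18 for 256KB pages is taken from the PPC_PAGE_SHIFT defaults.

```python
MIB = 1024 * 1024


def pkmap_window_bytes(pkmap_order: int, page_shift: int) -> int:
    """Bytes of address space covered by 2**pkmap_order pages of 2**page_shift bytes."""
    return (1 << pkmap_order) << page_shift


# PKMAP_ORDER 10 with 256KB pages: the 256MB quoted in the commit message.
order_10_mib = pkmap_window_bytes(10, 18) // MIB

# PKMAP_ORDER 9 gives 512 PKMAP pages and halves the window.
order_9_pages = 1 << 9
order_9_mib = pkmap_window_bytes(9, 18) // MIB
```

Under these assumptions an order of 10 covers 256MB and an order of 9 covers 128MB, which is why the lower order leaves room for vmalloc.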
2009-01-29 01:40:44 +00:00
config PPC_256K_PAGES
2021-01-20 07:49:14 +00:00
bool "256k page size (Requires non-standard binutils settings)"
depends on 44x && !PPC_47x
2009-01-29 01:40:44 +00:00
help
Make the page size 256k.
2021-01-20 07:49:14 +00:00
The kernel will only be able to run applications that have been
compiled with '-zmax-page-size' set to 256K (the default is 64K) using
binutils later than 2.17.50.0.3, or by patching the ELF_MAXPAGESIZE
definition from 0x10000 to 0x40000 in older versions.
2009-01-29 01:40:44 +00:00
2008-12-11 04:55:41 +03:00
endchoice
2005-11-07 11:06:55 +11:00
2019-02-21 19:08:46 +00:00
config PPC_PAGE_SHIFT
int
default 18 if PPC_256K_PAGES
default 16 if PPC_64K_PAGES
default 14 if PPC_16K_PAGES
default 12
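As a rough illustration of how the PPC_PAGE_SHIFT defaults resolve, the sketch below picks the first matching default in the order listed and converts the shift to a page size. It is a Python model for reading purposes only; the function names are hypothetical and the set of enabled options is passed in by hand.

```python
def default_page_shift(enabled: set) -> int:
    """Return the first default whose condition holds, in the order listed."""
    defaults = [
        ("PPC_256K_PAGES", 18),
        ("PPC_64K_PAGES", 16),
        ("PPC_16K_PAGES", 14),
    ]
    for option, shift in defaults:
        if option in enabled:
            return shift
    return 12  # unconditional fallback: 4KB pages


def page_size_bytes(shift: int) -> int:
    """A page shift of n means pages of 2**n bytes."""
    return 1 << shift
```

For example, `page_size_bytes(default_page_shift({"PPC_256K_PAGES"}))` yields 262144 bytes, and an empty set falls through to 4096.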
2017-02-24 13:52:09 +13:00
config THREAD_SHIFT
int "Thread shift" if EXPERT
range 13 15
default "15" if PPC_256K_PAGES
default "14" if PPC64
2020-04-08 15:58:49 +00:00
default "14" if KASAN
2017-02-24 13:52:09 +13:00
default "13"
help
Used to define the stack size. The default is almost always what you
want. Only change this if you know what you are doing.
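To make the relationship between THREAD_SHIFT and the stack size concrete, here is a small Python sketch. It assumes the stack size is 2 to the power of THREAD_SHIFT bytes, which matches the 32KB THREAD_SIZE that the 256KB page commit describes for a shift of 15; the function name is hypothetical.

```python
def thread_size_bytes(thread_shift: int) -> int:
    """Stack size implied by a THREAD_SHIFT value, assuming size = 2**shift."""
    if not 13 <= thread_shift <= 15:
        # The entry restricts the value to the range 13..15.
        raise ValueError("THREAD_SHIFT must be between 13 and 15")
    return 1 << thread_shift


sizes_kib = {shift: thread_size_bytes(shift) // 1024 for shift in (13, 14, 15)}
```

With this assumption the three permitted values correspond to 8KB, 16KB, and 32KB stacks.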
2019-02-21 19:08:50 +00:00
config DATA_SHIFT_BOOL
2020-05-19 05:49:25 +00:00
bool "Set custom data alignment"
2019-02-21 19:08:50 +00:00
depends on ADVANCED_OPTIONS
2021-03-04 14:35:09 +00:00
depends on STRICT_KERNEL_RWX || DEBUG_PAGEALLOC || KFENCE
2020-11-24 15:24:55 +00:00
depends on PPC_BOOK3S_32 || (PPC_8xx && !PIN_TLB_DATA && !STRICT_KERNEL_RWX)
2019-02-21 19:08:50 +00:00
help
This option allows you to set the kernel data alignment. When
RAM is mapped by blocks, the alignment needs to fit the size and
number of possible blocks. The default should be OK for most configs.
Say N here unless you know what you are doing.
2019-02-21 19:08:47 +00:00
config DATA_SHIFT
2019-02-21 19:08:50 +00:00
int "Data shift" if DATA_SHIFT_BOOL
2019-02-21 19:08:47 +00:00
default 24 if STRICT_KERNEL_RWX && PPC64
2021-03-04 14:35:09 +00:00
range 17 28 if (STRICT_KERNEL_RWX || DEBUG_PAGEALLOC || KFENCE) && PPC_BOOK3S_32
range 19 23 if (STRICT_KERNEL_RWX || DEBUG_PAGEALLOC || KFENCE) && PPC_8xx
2019-02-21 19:08:49 +00:00
default 22 if STRICT_KERNEL_RWX && PPC_BOOK3S_32
2021-03-04 14:35:09 +00:00
default 18 if (DEBUG_PAGEALLOC || KFENCE) && PPC_BOOK3S_32
2019-02-21 19:08:52 +00:00
default 23 if STRICT_KERNEL_RWX && PPC_8xx
2021-03-04 14:35:09 +00:00
default 23 if (DEBUG_PAGEALLOC || KFENCE) && PPC_8xx && PIN_TLB_DATA
default 19 if (DEBUG_PAGEALLOC || KFENCE) && PPC_8xx
2019-02-21 19:08:47 +00:00
default PPC_PAGE_SHIFT
2019-02-21 19:08:50 +00:00
help
On Book3S 32 (603+), DBATs are used to map kernel text and rodata RO.
The smaller the alignment, the greater the number of DBATs needed.
2019-02-21 19:08:47 +00:00
2019-02-21 19:08:52 +00:00
On 8xx, large pages (512kb or 8M) are used to map kernel linear
memory. Aligning to 8M reduces TLB misses as only 8M pages are used
2020-05-19 05:49:25 +00:00
in that case. If PIN_TLB is selected, it must be aligned to 8M as
8M pages will be pinned.
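The DATA_SHIFT values for 8xx line up with the large page sizes named in the help text. The Python sketch below shows the arithmetic, assuming the alignment is 2 to the power of DATA_SHIFT bytes; the helper names and the example address are hypothetical.

```python
def data_alignment_bytes(data_shift: int) -> int:
    """Alignment implied by DATA_SHIFT, assuming alignment = 2**shift bytes."""
    return 1 << data_shift


def align_up(address: int, data_shift: int) -> int:
    """Round an address up to the next multiple of the alignment."""
    alignment = 1 << data_shift
    return (address + alignment - 1) & ~(alignment - 1)


KIB = 1024
MIB = 1024 * KIB
shift_19_kib = data_alignment_bytes(19) // KIB  # 512
shift_23_mib = data_alignment_bytes(23) // MIB  # 8
```

So a shift of 19 corresponds to the 512k large page and a shift of 23 to the 8M large page; `align_up(0x00900000, 23)` rounds a 9MB address up to 16MB.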
2019-02-21 19:08:52 +00:00
2008-04-11 11:11:56 +10:00
config FORCE_MAX_ZONEORDER
int "Maximum zone order"
2016-02-19 16:38:47 +11:00
range 8 9 if PPC64 && PPC_64K_PAGES
2009-07-21 15:25:53 +00:00
default "9" if PPC64 && PPC_64K_PAGES
powerpc/mm: Update FORCE_MAX_ZONEORDER range to allow hugetlb w/4K
For hugetlb to work with 4K page size, we need MAX_ORDER to be 13 or
more. When switching from a 64K page size to 4K linux page size using
make oldconfig, we end up with a CONFIG_FORCE_MAX_ZONEORDER value of 9.
This results in a 16M hugepage being considered as a gigantic huge page,
which in turn results in failure to set up hugepages if gigantic hugepage
support is not enabled.
This also results in a kernel crash with 4K radix configuration. We
hit the below BUG_ON on radix:
kernel BUG at mm/huge_memory.c:364!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=2048 NUMA PowerNV
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc1-00006-gbae9cc6 #1
task: c0000000f1af8000 task.stack: c0000000f1aec000
NIP: c000000000c5fa0c LR: c000000000c5f9d8 CTR: c000000000c5f9a4
REGS: c0000000f1aef920 TRAP: 0700 Not tainted (4.8.0-rc1-00006-gbae9cc6)
MSR: 9000000102029033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE,TM[E]> CR: 24000844 XER: 00000000
CFAR: c000000000c5f9e0 SOFTE: 1
....
NIP [c000000000c5fa0c] hugepage_init+0x68/0x238
LR [c000000000c5f9d8] hugepage_init+0x34/0x238
Fixes: a7ee539584acf ("powerpc/Kconfig: Update config option based on page size")
Cc: stable@vger.kernel.org # v4.7+
Reported-by: Santhosh <santhog4@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-19 23:01:33 +05:30
range 13 13 if PPC64 && !PPC_64K_PAGES
2009-07-21 15:25:53 +00:00
default "13" if PPC64 && !PPC_64K_PAGES
range 9 64 if PPC32 && PPC_16K_PAGES
default "9" if PPC32 && PPC_16K_PAGES
range 7 64 if PPC32 && PPC_64K_PAGES
default "7" if PPC32 && PPC_64K_PAGES
range 5 64 if PPC32 && PPC_256K_PAGES
default "5" if PPC32 && PPC_256K_PAGES
2008-09-24 04:29:08 +00:00
range 11 64
2008-04-11 11:11:56 +10:00
default "11"
help
The kernel memory allocator divides physically contiguous memory
blocks into "zones", where each zone is a power of two number of
pages. This option selects the largest power of two that the kernel
keeps in the memory allocator. If you need to allocate very large
blocks of physically contiguous memory, then you may need to
increase this value.
This config option is actually maximum order plus one. For example,
a value of 11 means that the largest free memory block is 2^10 pages.
The page size is not necessarily 4KB. For example, on 64-bit
systems, 64KB pages can be enabled via CONFIG_PPC_64K_PAGES. Keep
this in mind when choosing a value for this option.
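The "maximum order plus one" rule in the help text can be worked through with a short Python sketch. The function name is hypothetical; the formula follows the help text, where a value of 11 means the largest free block is 2^10 pages.

```python
def largest_block_bytes(force_max_zoneorder: int, page_shift: int) -> int:
    """Largest contiguous block: 2**(value - 1) pages of 2**page_shift bytes."""
    return (1 << (force_max_zoneorder - 1)) << page_shift


MIB = 1024 * 1024

default_4k = largest_block_bytes(11, 12) // MIB   # 4MB with 4KB pages
ppc64_64k = largest_block_bytes(9, 16) // MIB     # 16MB with 64KB pages
ppc64_4k = largest_block_bytes(13, 12) // MIB     # 16MB with 4KB pages
stale_value = largest_block_bytes(9, 12) // MIB   # 1MB: value 9 kept with 4KB pages
```

The last line illustrates the problem described in the commit message above: a value of 9 carried over to a 4K configuration caps blocks at 1MB, so a 16M hugepage exceeds the largest order and is treated as gigantic, whereas 13 allows exactly 16MB.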
[POWERPC] Provide a way to protect 4k subpages when using 64k pages
Using 64k pages on 64-bit PowerPC systems makes life difficult for
emulators that are trying to emulate an ISA, such as x86, which uses a
smaller page size, since the emulator can no longer use the MMU and
the normal system calls for controlling page protections. Of course,
the emulator can emulate the MMU by checking and possibly remapping
the address for each memory access in software, but that is pretty
slow.
This provides a facility for such programs to control the access
permissions on individual 4k sub-pages of 64k pages. The idea is
that the emulator supplies an array of protection masks to apply to a
specified range of virtual addresses. These masks are applied at the
level where hardware PTEs are inserted into the hardware page table
based on the Linux PTEs, so the Linux PTEs are not affected. Note
that this new mechanism does not allow any access that would otherwise
be prohibited; it can only prohibit accesses that would otherwise be
allowed. This new facility is only available on 64-bit PowerPC and
only when the kernel is configured for 64k pages.
The masks are supplied using a new subpage_prot system call, which
takes a starting virtual address and length, and a pointer to an array
of protection masks in memory. The array has a 32-bit word per 64k
page to be protected; each 32-bit word consists of 16 2-bit fields,
for which 0 allows any access (that is otherwise allowed), 1 prevents
write accesses, and 2 or 3 prevent any access.
Implicit in this is that the regions of the address space that are
protected are switched to use 4k hardware pages rather than 64k
hardware pages (on machines with hardware 64k page support). In fact
the whole process is switched to use 4k hardware pages when the
subpage_prot system call is used, but this could be improved in future
to switch only the affected segments.
The subpage protection bits are stored in a 3 level tree akin to the
page table tree. The top level of this tree is stored in a structure
that is appended to the top level of the page table tree, i.e., the
pgd array. Since it will often only be 32-bit addresses (below 4GB)
that are protected, the pointers to the first four bottom level pages
are also stored in this structure (each bottom level page contains the
protection bits for 1GB of address space), so the protection bits for
addresses below 4GB can be accessed with one fewer load than those
for higher addresses.
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-01-24 08:35:13 +11:00
config PPC_SUBPAGE_PROT
2020-07-03 11:19:58 +10:00
bool "Support setting protections for 4k subpages (subpage_prot syscall)"
default n
2017-10-19 15:08:43 +11:00
depends on PPC_BOOK3S_64 && PPC_64K_PAGES
2008-01-24 08:35:13 +11:00
help
2020-07-03 11:19:58 +10:00
This option adds support for a system call to allow user programs
2008-01-24 08:35:13 +11:00
to set access permissions (read/write, readonly, or no access)
on the 4k subpages of each 64k page.
2020-07-03 11:19:58 +10:00
If unsure, say N here.
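The mask layout described in the subpage_prot commit message (one 32-bit word per 64k page, made of sixteen 2-bit fields) can be sketched as follows. This is an illustrative Python model, not the syscall interface itself: the function and constant names are hypothetical, and the bit ordering (first 4k subpage in the most significant two bits) is an assumption, since the text above does not state it.

```python
ALLOW_ANY = 0      # any access that is otherwise allowed
NO_WRITE = 1       # write accesses prevented
NO_ACCESS = 2      # 2 or 3 prevent any access


def pack_subpage_word(fields: list) -> int:
    """Pack sixteen 2-bit protection fields into one 32-bit word.

    Assumption: subpage 0 occupies the most significant two bits.
    """
    if len(fields) != 16:
        raise ValueError("a 64k page has exactly sixteen 4k subpages")
    word = 0
    for index, value in enumerate(fields):
        if not 0 <= value <= 3:
            raise ValueError("each field is 2 bits wide")
        word |= value << (30 - 2 * index)
    return word


# Make only the first 4k subpage read-only, leave the rest unrestricted.
read_only_first = pack_subpage_word([NO_WRITE] + [ALLOW_ANY] * 15)
```

A caller would build one such word per 64k page in the range and pass the array to the system call; how the array is handed to the kernel is outside this sketch.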
2020-08-21 13:55:57 -05:00
config PPC_PROT_SAO_LPAR
bool "Support PROT_SAO mappings in LPARs"
depends on PPC_BOOK3S_64
help
This option adds support for PROT_SAO mappings from userspace
inside LPARs on supported CPUs.
This may cause issues when performing guest migration from
a CPU that supports SAO to one that does not.
If unsure, say N here.
2014-10-08 19:54:50 +11:00
config PPC_COPRO_BASE
bool
2005-09-26 16:04:21 +10:00
config SCHED_SMT
bool "SMT (Hyperthreading) scheduler support"
depends on PPC64 && SMP
help
SMT scheduler support improves the CPU scheduler's decision making
when dealing with POWER5 CPUs at a cost of slightly increased
overhead in some places. If unsure, say N here.
2012-09-10 00:35:26 +00:00
config PPC_DENORMALISATION
bool "PowerPC denormalisation exception handling"
depends on PPC_BOOK3S_64
2013-07-31 16:31:26 +10:00
default "y" if PPC_POWERNV
2019-07-03 18:04:13 +02:00
help
2012-09-10 00:35:26 +00:00
Add support for handling denormalisation of single precision
values. Useful for bare metal only. If unsure say Y here.
2005-09-26 16:04:21 +10:00
config CMDLINE
2020-06-12 10:42:19 +12:00
string "Initial kernel command string"
2019-04-26 16:23:27 +00:00
default ""
2005-09-26 16:04:21 +10:00
help
On some platforms, there is currently no way for the boot loader to
pass arguments to the kernel. For these platforms, you can supply
some command-line options at build time by entering them here. In
most cases you will need to specify the root device here.
2019-08-02 10:50:06 +12:00
choice
prompt "Kernel command line type" if CMDLINE != ""
default CMDLINE_FROM_BOOTLOADER
config CMDLINE_FROM_BOOTLOADER
bool "Use bootloader kernel arguments if available"
help
Uses the command-line options passed by the boot loader. If
the boot loader doesn't provide any, the default kernel command
string provided in CMDLINE will be used.
config CMDLINE_EXTEND
bool "Extend bootloader kernel arguments"
help
The command-line arguments provided by the boot loader will be
appended to the default kernel command string.
2014-02-20 21:48:17 +01:00
config CMDLINE_FORCE
bool "Always use the default kernel command string"
help
Always use the default kernel command string, even if the boot
loader passes other arguments to the kernel.
This is useful if you cannot or don't want to change the
command-line options your boot loader passes to the kernel.
2019-08-02 10:50:06 +12:00
endchoice
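The three command line modes in the choice above can be summarized in a small Python model. It follows the help texts only; the function name, the mode strings, and the single-space joining in the extend case are assumptions made for illustration.

```python
def effective_cmdline(mode: str, builtin: str, bootloader: str) -> str:
    """Model which command line the kernel ends up with for each mode."""
    if mode == "from_bootloader":
        # Use the boot loader's arguments if it provided any, else CMDLINE.
        return bootloader if bootloader else builtin
    if mode == "extend":
        # Boot loader arguments are appended to the default command string.
        return f"{builtin} {bootloader}".strip()
    if mode == "force":
        # The default command string always wins.
        return builtin
    raise ValueError(f"unknown mode: {mode}")
```

For instance, with a built-in string of `root=/dev/sda1` and boot loader arguments `console=ttyS0`, the extend mode yields `root=/dev/sda1 console=ttyS0`, while the force mode ignores the boot loader arguments entirely.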
2008-07-09 09:41:52 -06:00
config EXTRA_TARGETS
string "Additional default image types"
help
List additional targets to be built by the bootwrapper here (separated
by spaces). This is useful for targets that depend on device tree
files in the .dts directory.
Targets in this list will be built as part of the default build
target, or when the user does a 'make zImage' or a
'make zImage.initrd'.
If unsure, leave blank.
2008-01-15 23:17:00 -05:00
config ARCH_WANTS_FREEZER_CONTROL
def_bool y
depends on ADB_PMU
2018-12-11 20:01:04 +09:00
source "kernel/power/Kconfig"
2005-09-26 16:04:21 +10:00
2018-01-18 17:50:24 -08:00
config PPC_MEM_KEYS
prompt "PowerPC Memory Protection Keys"
def_bool y
depends on PPC_BOOK3S_64
select ARCH_USES_HIGH_VMA_FLAGS
select ARCH_HAS_PKEYS
help
Memory Protection Keys provides a mechanism for enforcing
page-based protections, but without requiring modification of the
page tables when an application changes protection domains.
2019-06-07 15:54:31 -03:00
For details, see Documentation/core-api/protection-keys.rst
2018-01-18 17:50:24 -08:00
If unsure, say y.
2019-11-05 17:00:22 -06:00
config PPC_SECURE_BOOT
prompt "Enable secure boot support"
bool
2020-09-24 11:49:22 +10:00
depends on PPC_POWERNV || PPC_PSERIES
2019-10-30 23:31:27 -04:00
depends on IMA_ARCH_POLICY
2020-03-08 20:57:51 -04:00
imply IMA_SECURE_AND_OR_TRUSTED_BOOT
2019-11-05 17:00:22 -06:00
help
Systems with firmware secure boot enabled need to define security
policies to extend secure boot to the OS. This config allows a user
to enable OS secure boot on systems that have firmware support for
it. If in doubt say N.
2019-11-10 21:10:34 -06:00
config PPC_SECVAR_SYSFS
bool "Enable sysfs interface for POWER secure variables"
default y
depends on PPC_SECURE_BOOT
depends on SYSFS
help
POWER secure variables are managed and controlled by firmware.
These variables are exposed to userspace via sysfs to enable
read/write operations on these variables. Say Y if you have
secure boot enabled and want to expose variables to userspace.
powerpc/rtas: Restrict RTAS requests from userspace
A number of userspace utilities depend on making calls to RTAS to retrieve
information and update various things.
The existing API through which we expose RTAS to userspace exposes more
RTAS functionality than we actually need, through the sys_rtas syscall,
which allows root (or anyone with CAP_SYS_ADMIN) to make any RTAS call they
want with arbitrary arguments.
Many RTAS calls take the address of a buffer as an argument, and it's up to
the caller to specify the physical address of the buffer as an argument. We
allocate a buffer (the "RMO buffer") in the Real Memory Area that RTAS can
access, and then expose the physical address and size of this buffer in
/proc/powerpc/rtas/rmo_buffer. Userspace is expected to read this address,
poke at the buffer using /dev/mem, and pass an address in the RMO buffer to
the RTAS call.
However, there's nothing stopping the caller from specifying whatever
address they want in the RTAS call, and it's easy to construct a series of
RTAS calls that can overwrite arbitrary bytes (even without /dev/mem
access).
Additionally, there are some RTAS calls that do potentially dangerous
things and for which there are no legitimate userspace use cases.
In the past, this would not have been a particularly big deal as it was
assumed that root could modify all system state freely, but with Secure
Boot and lockdown we need to care about this.
We can't fundamentally change the ABI at this point, however we can address
this by implementing a filter that checks RTAS calls against a list
of permitted calls and forces the caller to use addresses within the RMO
buffer.
The list is based off the list of calls that are used by the librtas
userspace library, and has been tested with a number of existing userspace
RTAS utilities. For compatibility with any applications we are not aware of
that require other calls, the filter can be turned off at build time.
Cc: stable@vger.kernel.org
Reported-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200820044512.7543-1-ajd@linux.ibm.com
2020-08-20 14:45:12 +10:00
config PPC_RTAS_FILTER
bool "Enable filtering of RTAS syscalls"
default y
depends on PPC_RTAS
help
The RTAS syscall API has security issues that could be used to
compromise system integrity. This option enforces restrictions on the
RTAS calls and arguments passed by userspace programs to mitigate
these issues.
Say Y unless you know what you are doing and the filter is causing
problems for you.
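The filtering idea from the RTAS commit message (check the call against a permitted list, and force buffer addresses into the RMO buffer) can be sketched in Python. This is a conceptual model only, not the kernel's implementation; the function name, parameters, call names, and addresses are all hypothetical.

```python
from typing import Optional


def rtas_call_permitted(
    call_name: str,
    permitted_calls: set,
    rmo_base: int,
    rmo_size: int,
    buffer_addr: Optional[int] = None,
    buffer_len: int = 0,
) -> bool:
    """Allow a call only if it is on the list and its buffer lies in the RMO buffer."""
    if call_name not in permitted_calls:
        return False
    if buffer_addr is None:
        return True  # this call takes no buffer argument
    if buffer_len < 0:
        return False
    buffer_end = buffer_addr + buffer_len
    return rmo_base <= buffer_addr and buffer_end <= rmo_base + rmo_size


# Hypothetical values for illustration.
permitted = {"example-query"}
inside = rtas_call_permitted("example-query", permitted, 0x1000, 0x1000, 0x1800, 0x100)
outside = rtas_call_permitted("example-query", permitted, 0x1000, 0x1000, 0x3000, 0x100)
unlisted = rtas_call_permitted("example-danger", permitted, 0x1000, 0x1000)
```

In this model only `inside` is allowed: the second call points outside the buffer and the third is not on the permitted list.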
2005-09-26 16:04:21 +10:00
endmenu
config ISA_DMA_API
bool
2012-02-22 14:10:12 +00:00
default PCI
2005-09-26 16:04:21 +10:00
menu "Bus options"
config ISA
bool "Support for ISA-bus hardware"
2013-03-27 00:47:03 +00:00
depends on PPC_CHRP
2005-10-26 16:47:42 +10:00
select PPC_I8259
2005-09-26 16:04:21 +10:00
help
Find out whether you have ISA slots on your motherboard. ISA is the
name of a bus system, i.e. the way the CPU talks to the other stuff
inside your box. If you have an Apple machine, say N here; if you
2013-03-27 00:47:03 +00:00
have an IBM RS/6000 or pSeries machine, say Y. If you have an
embedded board, consult your board documentation.
2005-09-26 16:04:21 +10:00
config GENERIC_ISA_DMA
bool
2010-07-15 07:38:16 +00:00
depends on ISA_DMA_API
2005-09-26 16:04:21 +10:00
default y
2005-10-26 16:36:55 +10:00
config PPC_INDIRECT_PCI
bool
depends on PCI
2006-01-14 16:57:39 -06:00
default y if 40x || 44x
2005-10-26 16:36:55 +10:00
2005-09-26 16:04:21 +10:00
config SBUS
bool
2006-01-10 21:43:56 -06:00
config FSL_SOC
bool
2007-07-10 18:44:34 +08:00
config FSL_PCI
2019-07-03 18:04:13 +02:00
bool
2019-02-13 08:01:22 +01:00
select ARCH_HAS_DMA_SET_MASK
2007-07-10 18:44:34 +08:00
select PPC_INDIRECT_PCI
2009-01-28 13:25:29 -06:00
select PCI_QUIRKS
2007-07-10 18:44:34 +08:00
2009-09-16 01:43:57 +04:00
config FSL_PMC
bool
default y
depends on SUSPEND && (PPC_85xx || PPC_86xx)
help
Freescale MPC85xx/MPC86xx power management controller support
(suspend/resume). For MPC83xx see platforms/83xx/suspend.c
2010-10-08 10:25:27 +00:00
config PPC4xx_CPM
bool
default y
depends on SUSPEND && (44x || 40x)
help
PPC4xx Clock Power Management (CPM) support (suspend/resume).
It also enables support for two different idle states (idle-wait
and idle-doze).
2008-03-26 22:39:50 +11:00
config 4xx_SOC
bool
2008-04-11 21:03:40 +04:00
config FSL_LBC
2010-10-18 15:22:31 +08:00
bool "Freescale Local Bus support"
2008-04-11 21:03:40 +04:00
help
2010-10-18 15:22:31 +08:00
Enables reporting of errors from the Freescale local bus
controller. Also contains some common code used by
drivers for specific local bus peripherals.
2008-04-11 21:03:40 +04:00
2008-05-23 20:38:54 +04:00
config FSL_GTM
bool
depends on PPC_83xx || QUICC_ENGINE || CPM2
help
Freescale General-purpose Timers support
2005-09-26 16:04:21 +10:00
config PCI_8260
bool
depends on PCI && 8260
2005-10-26 16:36:55 +10:00
select PPC_INDIRECT_PCI
2005-09-26 16:04:21 +10:00
default y
2011-03-23 16:43:03 -07:00
config FSL_RIO
bool "Freescale Embedded SRIO Controller support"
2018-11-15 20:05:36 +01:00
depends on RAPIDIO = y && HAVE_RAPIDIO
2011-03-23 16:43:03 -07:00
default "n"
2019-07-03 18:04:13 +02:00
help
2011-03-23 16:43:03 -07:00
Include support for RapidIO controller on Freescale embedded
processors (MPC8548, MPC8641, etc).
2005-09-26 16:04:21 +10:00
endmenu
2011-12-14 22:57:15 +00:00
config NONSTATIC_KERNEL
bool
2005-09-26 16:04:21 +10:00
menu "Advanced setup"
depends on PPC32
config ADVANCED_OPTIONS
bool "Prompt for advanced kernel configuration options"
help
This option will enable prompting for a variety of advanced kernel
configuration options. These options can cause the kernel to not
work if they are set incorrectly, but can be used to optimize certain
aspects of kernel memory management.
Unless you know what you are doing, say N here.
comment "Default settings for advanced configuration options are used"
depends on !ADVANCED_OPTIONS
config LOWMEM_SIZE_BOOL
bool "Set maximum low memory"
depends on ADVANCED_OPTIONS
help
This option allows you to set the maximum amount of memory which
will be used as "low memory", that is, memory which the kernel can
access directly, without having to set up a kernel virtual mapping.
This can be useful in optimizing the layout of kernel virtual
memory.
Say N here unless you know what you are doing.
config LOWMEM_SIZE
hex "Maximum low memory size (in bytes)" if LOWMEM_SIZE_BOOL
default "0x30000000"
2008-12-08 19:34:58 -08:00
config LOWMEM_CAM_NUM_BOOL
bool "Set number of CAMs to use to map low memory"
depends on ADVANCED_OPTIONS && FSL_BOOKE
help
This option allows you to set the maximum number of CAM slots that
will be used to map low memory. There are a limited number of slots
available and an even more limited number that will fit in the L1 MMU.
However, using more entries will allow mapping more low memory. This
can be useful in optimizing the layout of kernel virtual memory.
Say N here unless you know what you are doing.
config LOWMEM_CAM_NUM
2009-03-31 08:05:50 -04:00
depends on FSL_BOOKE
2008-12-08 19:34:58 -08:00
int "Number of CAMs to use to map low memory" if LOWMEM_CAM_NUM_BOOL
default 3
2011-12-14 22:57:15 +00:00
config DYNAMIC_MEMSTART
2013-01-16 18:53:25 -08:00
bool "Enable page aligned dynamic load address for kernel"
depends on ADVANCED_OPTIONS && FLATMEM && (FSL_BOOKE || 44x)
2011-12-14 22:57:15 +00:00
select NONSTATIC_KERNEL
help
This option enables the kernel to be loaded at any page aligned
2019-07-03 18:04:13 +02:00
physical address. The kernel creates a mapping from KERNELBASE to
2011-12-14 22:57:15 +00:00
the address where the kernel is loaded. The page size here implies
the TLB page size of the mapping for kernel on the particular platform.
Please refer to the init code for finding the TLB page size.
DYNAMIC_MEMSTART is an easy way of implementing a pseudo-RELOCATABLE
kernel image, where the only restriction is the page aligned kernel
2019-07-03 18:04:13 +02:00
load address. When this option is enabled, the compile time physical
2011-12-14 22:57:15 +00:00
address CONFIG_PHYSICAL_START is ignored.
2011-12-14 22:58:12 +00:00
This option is overridden by CONFIG_RELOCATABLE.
2008-04-22 04:22:34 +10:00
config PAGE_OFFSET_BOOL
bool "Set custom page offset address"
depends on ADVANCED_OPTIONS
help
This option allows you to set the kernel virtual address at which
the kernel will map low memory. This can be useful in optimizing
the virtual memory layout of the system.
Say N here unless you know what you are doing.
config PAGE_OFFSET
hex "Virtual address of memory base" if PAGE_OFFSET_BOOL
default "0xc0000000"
2005-09-26 16:04:21 +10:00
config KERNEL_START_BOOL
bool "Set custom kernel base address"
depends on ADVANCED_OPTIONS
help
This option allows you to set the kernel virtual address at which
2008-04-22 04:22:34 +10:00
the kernel will be loaded. Normally this should match PAGE_OFFSET;
however, there are times (like kdump) when one might not want them
to be the same.
2005-09-26 16:04:21 +10:00
Say N here unless you know what you are doing.
config KERNEL_START
hex "Virtual address of kernel base" if KERNEL_START_BOOL
2008-04-22 04:22:34 +10:00
default PAGE_OFFSET if PAGE_OFFSET_BOOL
2011-12-14 22:57:15 +00:00
default "0xc2000000" if CRASH_DUMP && !NONSTATIC_KERNEL
2005-09-26 16:04:21 +10:00
default "0xc0000000"
2008-04-22 04:22:34 +10:00
config PHYSICAL_START_BOOL
bool "Set physical address where the kernel is loaded"
depends on ADVANCED_OPTIONS && FLATMEM && FSL_BOOKE
help
This gives the physical address where the kernel is loaded.
Say N here unless you know what you are doing.
config PHYSICAL_START
hex "Physical address where the kernel is loaded" if PHYSICAL_START_BOOL
2018-11-17 10:25:07 +00:00
default "0x02000000" if PPC_BOOK3S && CRASH_DUMP && !NONSTATIC_KERNEL
2008-04-22 04:22:34 +10:00
default "0x00000000"
config PHYSICAL_ALIGN
hex
2008-12-08 19:34:59 -08:00
default "0x04000000" if FSL_BOOKE
2008-04-22 04:22:34 +10:00
help
This value puts the alignment restrictions on the physical address
where the kernel is loaded and run from. The kernel is compiled for an
address which meets the above alignment restriction.
2005-09-26 16:04:21 +10:00
config TASK_SIZE_BOOL
bool "Set custom user task size"
depends on ADVANCED_OPTIONS
help
This option allows you to set the amount of virtual address space
allocated to user tasks. This can be useful in optimizing the
virtual memory layout of the system.
Say N here unless you know what you are doing.
config TASK_SIZE
hex "Size of user task space" if TASK_SIZE_BOOL
2013-03-27 00:47:03 +00:00
default "0x80000000" if PPC_8xx
2021-04-01 13:30:43 +00:00
default "0xb0000000" if PPC_BOOK3S_32
2007-10-11 13:40:21 -05:00
default "0xc0000000"
2005-09-26 16:04:21 +10:00
endmenu
2005-09-30 16:16:52 +10:00
if PPC64
powerpc: Work around gcc miscompilation of __pa() on 64-bit
On 64-bit, __pa(&static_var) gets miscompiled by recent versions of
gcc as something like:
addis 3,2,.LANCHOR1+4611686018427387904@toc@ha
addi 3,3,.LANCHOR1+4611686018427387904@toc@l
This ends up effectively ignoring the offset, since its bottom 32 bits
are zero, and means that the result of __pa() still has 0xC in the top
nibble. This happens with gcc 4.8.1, at least.
To work around this, for 64-bit we make __pa() use an AND operator,
and for symmetry, we make __va() use an OR operator. Using an AND
operator rather than a subtraction ends up with slightly shorter code
since it can be done with a single clrldi instruction, whereas it
takes three instructions to form the constant (-PAGE_OFFSET) and add
it on. (Note that MEMORY_START is always 0 on 64-bit.)
CC: <stable@vger.kernel.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-08-27 16:07:49 +10:00
# This value must have zeroes in the bottom 60 bits, otherwise lots will break
2008-04-22 04:22:34 +10:00
config PAGE_OFFSET
hex
default "0xc000000000000000"
2005-09-30 16:16:52 +10:00
config KERNEL_START
hex
2005-09-30 17:24:15 +10:00
default "0xc000000000000000"
2008-04-22 04:22:34 +10:00
config PHYSICAL_START
hex
default "0x00000000"
2005-09-30 16:16:52 +10:00
endif
2013-10-11 14:07:57 +11:00
config ARCH_RANDOM
def_bool n
2007-09-16 20:53:25 +10:00
config PPC_LIB_RHEAP
bool
2008-04-16 23:28:09 -05:00
source "arch/powerpc/kvm/Kconfig"
powerpc/livepatch: Add live patching support on ppc64le
Add the kconfig logic & assembly support for handling live patched
functions. This depends on DYNAMIC_FTRACE_WITH_REGS, which in turn
depends on the new -mprofile-kernel ftrace ABI, which is only supported
currently on ppc64le.
Live patching is handled by a special ftrace handler. This means it runs
from ftrace_caller(). The live patch handler modifies the NIP so as to
redirect the return from ftrace_caller() to the new patched function.
However there is one particularly tricky case we need to handle.
If a function A calls another function B, and it is known at link time
that they share the same TOC, then A will not save or restore its TOC,
and will call the local entry point of B.
When we live patch B, we replace it with a new function C, which may
not have the same TOC as A. At live patch time it's too late to modify A
to do the TOC save/restore, so the live patching code must interpose
itself between A and C, and do the TOC save/restore that A omitted.
An additional complication is that the livepatch code cannot create a
stack frame in order to save the TOC. That is because if C takes > 8
arguments, or is varargs, A will have written the arguments for C in
A's stack frame.
To solve this, we introduce a "livepatch stack" which grows upward from
the base of the regular stack, and is used to store the TOC & LR when
calling a live patched function.
When the patched function returns, we retrieve the real LR & TOC from
the livepatch stack, restore them, and pop the livepatch "stack frame".
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
2016-03-24 22:04:05 +11:00
source "kernel/livepatch/Kconfig"