Commit Graph

7971 Commits

Author SHA1 Message Date
ed5b00a05c powerpc/prom: Fix "ibm,arch-vec-5-platform-support" scan
The "ibm,arch-vec-5-platform-support" property is a list of pairs of
bytes representing the options and values supported by the platform
firmware. At boot time, Linux scans this list and activates the
available features it recognizes : Radix and XIVE.

A recent change modified the number of entries to loop on and 8 bytes,
4 pairs of { options, values } entries are always scanned. This is
fine on KVM but not on PowerVM which can advertises less. As a
consequence on this platform, Linux reads extra entries pointing to
random data, interprets these as available features and tries to
activate them, leading to a firmware crash in
ibm,client-architecture-support.

Fix that by using the property length of "ibm,arch-vec-5-platform-support".

Fixes: ab91239942 ("powerpc/prom: Remove VLA in prom_check_platform_support()")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210122075029.797013-1-clg@kaod.org
2021-01-31 22:35:48 +11:00
7bd2b120f3 powerpc/pci: Delete traverse_pci_dn()
Nothing uses it.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200902035121.1762475-1-oohall@gmail.com
2021-01-31 22:35:48 +11:00
9e85741683 powerpc/eeh: Add a debugfs interface to check if a driver supports recovery
If a PCI device's current driver implements the error handling callbacks
EEH can use them to recover the device after an error occurs. For devices
without the error handling callbacks we recover them by removing the device
and re-scanning it so the PCI core puts the device back into a known good
state.

Currently there's no way for userspace to determine if the driver supports
recovery or not which makes it difficult to write automated tests for EEH.
This patch addressing that by adding a debugfs interface for querying if
a specific device can be recovered or not.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201103051512.919333-2-oohall@gmail.com
2021-01-31 22:35:47 +11:00
b5e904b830 powerpc/eeh: Rework pci_dev lookup in debugfs attributes
Pull the string -> pci_dev lookup stuff into a helper function. No functional change.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201103051512.919333-1-oohall@gmail.com
2021-01-31 22:35:47 +11:00
66f0a9e058 powerpc/vdso64: remove meaningless vgettimeofday.o build rule
VDSO64 is only built for the 64-bit kernel, hence vgettimeofday.o is
built by the generic rule in scripts/Makefile.build.

This line does not provide anything useful.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201223171142.707053-2-masahiroy@kernel.org
2021-01-30 22:23:42 +11:00
bce74491c3 powerpc/vdso: fix unnecessary rebuilds of vgettimeofday.o
vgettimeofday.o is unnecessarily rebuilt. Adding it to 'targets' is not
enough to fix the issue. Kbuild is correctly rebuilding it because the
command line is changed.

PowerPC builds each vdso directory twice; first in vdso_prepare to
generate vdso{32,64}-offsets.h, second as part of the ordinary build
process to embed vdso{32,64}.so.dbg into the kernel.

The problem shows up when CONFIG_PPC_WERROR=y due to the following line
in arch/powerpc/Kbuild:

  subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror

In the preparation stage, Kbuild directly visits the vdso directories,
hence it does not inherit subdir-ccflags-y. In the second descend,
Kbuild adds -Werror, which results in the command line flipping
with/without -Werror.

It implies a potential danger; if a more critical flag that would impact
the resulted vdso, the offsets recorded in the headers might be different
from real offsets in the embedded vdso images.

Removing the unneeded second descend solves the problem.

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/linuxppc-dev/87tuslxhry.fsf@mpe.ellerman.id.au/
Link: https://lore.kernel.org/r/20201223171142.707053-1-masahiroy@kernel.org
2021-01-30 22:23:42 +11:00
691602aab9 powerpc/iommu/debug: Add debugfs entries for IOMMU tables
This adds a folder per LIOBN under /sys/kernel/debug/iommu with IOMMU
table parameters.

This is enabled by CONFIG_IOMMU_DEBUGFS.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210113102014.124452-1-aik@ozlabs.ru
2021-01-30 11:39:31 +11:00
9ae440fb3d powerpc/watchdog: Declare soft_nmi_interrupt() prototype
soft_nmi_interrupt() usage requires PPC_WATCHDOG to be configured.
Check the CONFIG definition to declare the prototype.

It fixes this W=1 compile error :

../arch/powerpc/kernel/watchdog.c:250:6: error: no previous prototype for ‘soft_nmi_interrupt’ [-Werror=missing-prototypes]
  250 | void soft_nmi_interrupt(struct pt_regs *regs)
      |      ^~~~~~~~~~~~~~~~~~

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210104143206.695198-18-clg@kaod.org
2021-01-30 11:39:30 +11:00
bb21e1b6c5 powerpc/optprobes: Make patch_imm64_load_insns() static
patch_imm64_load_insns() is only used locally in
arch_prepare_optimized_kprobe() and does not need to be external.

It fixes this W=1 compile error :

../arch/powerpc/kernel/optprobes.c:149:6: error: no previous prototype for ‘patch_imm64_load_insns’ [-Werror=missing-prototypes]
  149 | void patch_imm64_load_insns(unsigned int val, kprobe_opcode_t *addr)

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210104143206.695198-12-clg@kaod.org
2021-01-30 11:39:29 +11:00
d47d307f10 powerpc/optprobes: Remove unused routine patch_imm32_load_insns()
Commit 650b55b707 ("powerpc: Add prefixed instructions to instruction
data type") removed the use of patch_imm32_load_insns(). Clean it up
to fix this W=1 compile error :

../arch/powerpc/kernel/optprobes.c:149:6: error: no previous prototype for ‘patch_imm32_load_insns’ [-Werror=missing-prototypes]
  149 | void patch_imm32_load_insns(unsigned int val, kprobe_opcode_t *addr)

Fixes: 650b55b707 ("powerpc: Add prefixed instructions to instruction data type")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210104143206.695198-11-clg@kaod.org
2021-01-30 11:39:29 +11:00
157c9f409d powerpc/smp: Make debugger_ipi_callback() static
debugger_ipi_callback() is a local routine used as a NMI IPI handler and
does not need to be external.

It fixes this W=1 compile error :

../arch/powerpc/kernel/smp.c:579:6: error: no previous prototype for ‘debugger_ipi_callback’ [-Werror=missing-prototypes]
  579 | void debugger_ipi_callback(struct pt_regs *regs)
      |      ^~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210104143206.695198-10-clg@kaod.org
2021-01-30 11:39:29 +11:00
cd7aa5d2fa powerpc/smp: Include tick_broadcast() prototype
It fixes this W=1 compile error :

../arch/powerpc/kernel/smp.c:569:6: error: no previous prototype for ‘tick_broadcast’ [-Werror=missing-prototypes]
  569 | void tick_broadcast(const struct cpumask *mask)
      |      ^~~~~~~~~~~~~~

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210104143206.695198-9-clg@kaod.org
2021-01-30 11:39:29 +11:00
1cc2fd7593 powerpc/mce: Include prototypes
It fixes these W=1 compile errors :

../arch/powerpc/kernel/mce.c:591:14: error: no previous prototype for ‘machine_check_early’ [-Werror=missing-prototypes]
  591 | long notrace machine_check_early(struct pt_regs *regs)
      |              ^~~~~~~~~~~~~~~~~~~
../arch/powerpc/kernel/mce.c:725:6: error: no previous prototype for ‘hmi_exception_realmode’ [-Werror=missing-prototypes]
  725 | long hmi_exception_realmode(struct pt_regs *regs)
      |      ^~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210104143206.695198-8-clg@kaod.org
2021-01-30 11:39:29 +11:00
692e592895 powerpc/setup_64: Make some routines static
The following routines are only called by local services and do not
need to be external symbols.

It fixes these W=1 errors :

../arch/powerpc/kernel/setup_64.c:261:13: error: no previous prototype for ‘record_spr_defaults’ [-Werror=missing-prototypes]
  261 | void __init record_spr_defaults(void)
      |             ^~~~~~~~~~~~~~~~~~~
../arch/powerpc/kernel/setup_64.c:1011:6: error: no previous prototype for ‘entry_flush_enable’ [-Werror=missing-prototypes]
 1011 | void entry_flush_enable(bool enable)
      |      ^~~~~~~~~~~~~~~~~~
../arch/powerpc/kernel/setup_64.c:1023:6: error: no previous prototype for ‘uaccess_flush_enable’ [-Werror=missing-prototypes]
 1023 | void uaccess_flush_enable(bool enable)
      |      ^~~~~~~~~~~~~~~~~~~~

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210104143206.695198-7-clg@kaod.org
2021-01-30 11:39:28 +11:00
7a3c90df20 arch: powerpc: Stop building and using oprofile
The "oprofile" user-space tools don't use the kernel OPROFILE support
any more, and haven't in a long time. User-space has been converted to
the perf interfaces.

This commits stops building oprofile for powerpc and removes any
reference to it from directories in arch/powerpc/ apart from
arch/powerpc/oprofile, which will be removed in the next commit (this is
broken into two commits as the size of the commit became very big, ~5k
lines).

Note that the member "oprofile_cpu_type" in "struct cpu_spec" isn't
removed as it was also used by other parts of the code.

Suggested-by: Christoph Hellwig <hch@infradead.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Robert Richter <rric@kernel.org>
Acked-by: William Cohen <wcohen@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
2021-01-29 10:05:51 +05:30
2a1867219c fs: add mount_setattr()
This implements the missing mount_setattr() syscall. While the new mount
api allows to change the properties of a superblock there is currently
no way to change the properties of a mount or a mount tree using file
descriptors which the new mount api is based on. In addition the old
mount api has the restriction that mount options cannot be applied
recursively. This hasn't changed since changing mount options on a
per-mount basis was implemented in [1] and has been a frequent request
not just for convenience but also for security reasons. The legacy
mount syscall is unable to accommodate this behavior without introducing
a whole new set of flags because MS_REC | MS_REMOUNT | MS_BIND |
MS_RDONLY | MS_NOEXEC | [...] only apply the mount option to the topmost
mount. Changing MS_REC to apply to the whole mount tree would mean
introducing a significant uapi change and would likely cause significant
regressions.

The new mount_setattr() syscall allows to recursively clear and set
mount options in one shot. Multiple calls to change mount options
requesting the same changes are idempotent:

int mount_setattr(int dfd, const char *path, unsigned flags,
                  struct mount_attr *uattr, size_t usize);

Flags to modify path resolution behavior are specified in the @flags
argument. Currently, AT_EMPTY_PATH, AT_RECURSIVE, AT_SYMLINK_NOFOLLOW,
and AT_NO_AUTOMOUNT are supported. If useful, additional lookup flags to
restrict path resolution as introduced with openat2() might be supported
in the future.

The mount_setattr() syscall can be expected to grow over time and is
designed with extensibility in mind. It follows the extensible syscall
pattern we have used with other syscalls such as openat2(), clone3(),
sched_{set,get}attr(), and others.
The set of mount options is passed in the uapi struct mount_attr which
currently has the following layout:

struct mount_attr {
	__u64 attr_set;
	__u64 attr_clr;
	__u64 propagation;
	__u64 userns_fd;
};

The @attr_set and @attr_clr members are used to clear and set mount
options. This way a user can e.g. request that a set of flags is to be
raised such as turning mounts readonly by raising MOUNT_ATTR_RDONLY in
@attr_set while at the same time requesting that another set of flags is
to be lowered such as removing noexec from a mount tree by specifying
MOUNT_ATTR_NOEXEC in @attr_clr.

Note, since the MOUNT_ATTR_<atime> values are an enum starting from 0,
not a bitmap, users wanting to transition to a different atime setting
cannot simply specify the atime setting in @attr_set, but must also
specify MOUNT_ATTR__ATIME in the @attr_clr field. So we ensure that
MOUNT_ATTR__ATIME can't be partially set in @attr_clr and that @attr_set
can't have any atime bits set if MOUNT_ATTR__ATIME isn't set in
@attr_clr.

The @propagation field lets callers specify the propagation type of a
mount tree. Propagation is a single property that has four different
settings and as such is not really a flag argument but an enum.
Specifically, it would be unclear what setting and clearing propagation
settings in combination would amount to. The legacy mount() syscall thus
forbids the combination of multiple propagation settings too. The goal
is to keep the semantics of mount propagation somewhat simple as they
are overly complex as it is.

The @userns_fd field lets user specify a user namespace whose idmapping
becomes the idmapping of the mount. This is implemented and explained in
detail in the next patch.

[1]: commit 2e4b7fcd92 ("[PATCH] r/o bind mounts: honor mount writer counts at remount")

Link: https://lore.kernel.org/r/20210121131959.646623-35-christian.brauner@ubuntu.com
Cc: David Howells <dhowells@redhat.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-api@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-24 14:42:45 +01:00
4025c784c5 powerpc/64s: prevent recursive replay_soft_interrupts causing superfluous interrupt
When an asynchronous interrupt calls irq_exit, it checks for softirqs
that may have been created, and runs them. Running softirqs enables
local irqs, which can replay pending interrupts causing recursion in
replay_soft_interrupts. This abridged trace shows how this can occur:

! NIP replay_soft_interrupts
  LR  interrupt_exit_kernel_prepare
  Call Trace:
    interrupt_exit_kernel_prepare (unreliable)
    interrupt_return
  --- interrupt: ea0 at __rb_reserve_next
  NIP __rb_reserve_next
  LR __rb_reserve_next
  Call Trace:
    ring_buffer_lock_reserve
    trace_function
    function_trace_call
    ftrace_call
    __do_softirq
    irq_exit
    timer_interrupt
!   replay_soft_interrupts
    interrupt_exit_kernel_prepare
    interrupt_return
  --- interrupt: ea0 at arch_local_irq_restore

This can not be prevented easily, because softirqs must not block hard
irqs, so it has to be dealt with.

The recursion is bounded by design in the softirq code because softirq
replay disables softirqs and loops around again to check for new
softirqs created while it ran, so that's not a problem.

However it does mess up interrupt replay state, causing superfluous
interrupts when the second replay_soft_interrupts clears a pending
interrupt, leaving it still set in the first call in the 'happened'
local variable.

Fix this by not caching a copy of irqs_happened across interrupt
handler calls.

Fixes: 3282a3da25 ("powerpc/64: Implement soft interrupt replay in C")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210123061244.2076145-1-npiggin@gmail.com
2021-01-24 22:27:24 +11:00
08685be776 powerpc/64s: fix scv entry fallback flush vs interrupt
The L1D flush fallback functions are not recoverable vs interrupts,
yet the scv entry flush runs with MSR[EE]=1. This can result in a
timer (soft-NMI) or MCE or SRESET interrupt hitting here and overwriting
the EXRFI save area, which ends up corrupting userspace registers for
scv return.

Fix this by disabling RI and EE for the scv entry fallback flush.

Fixes: f79643787e ("powerpc/64s: flush L1D on kernel entry")
Cc: stable@vger.kernel.org # 5.9+ which also have flush L1D patch backport
Reported-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210111062408.287092-1-npiggin@gmail.com
2021-01-20 15:58:19 +11:00
2225a8dda2 powerpc: Fix alignment bug within the init sections
This is a bug that causes early crashes in builds with an .exit.text
section smaller than a page and an .init.text section that ends in the
beginning of a physical page (this is kinda random, which might
explain why this wasn't really encountered before).

The init sections are ordered like this:
  .init.text
  .exit.text
  .init.data

Currently, these sections aren't page aligned.

Because the init code might become read-only at runtime and because
the .init.text section can potentially reside on the same physical
page as .init.data, the beginning of .init.data might be mapped
read-only along with .init.text.

Then when the kernel tries to modify a variable in .init.data (like
kthreadd_done, used in kernel_init()) the kernel panics.

To avoid this, make _einittext page aligned and also align .exit.text
to make sure .init.data is always seperated from the text segments.

Fixes: 060ef9d89d ("powerpc32: PAGE_EXEC required for inittext")
Signed-off-by: Ariel Marcovitch <ariel.marcovitch@gmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210102201156.10805-1-ariel.marcovitch@gmail.com
2021-01-12 15:06:23 +11:00
3ce47d95b7 powerpc: Handle .text.{hot,unlikely}.* in linker script
Commit eff8728fe6 ("vmlinux.lds.h: Add PGO and AutoFDO input
sections") added ".text.unlikely.*" and ".text.hot.*" due to an LLVM
change [1].

After another LLVM change [2], these sections are seen in some PowerPC
builds, where there is a orphan section warning then build failure:

$ make -skj"$(nproc)" \
       ARCH=powerpc CROSS_COMPILE=powerpc64le-linux-gnu- LLVM=1 O=out \
       distclean powernv_defconfig zImage.epapr
ld.lld: warning: kernel/built-in.a(panic.o):(.text.unlikely.) is being placed in '.text.unlikely.'
...
ld.lld: warning: address (0xc000000000009314) of section .text is not a multiple of alignment (256)
...
ERROR: start_text address is c000000000009400, should be c000000000008000
ERROR: try to enable LD_HEAD_STUB_CATCH config option
ERROR: see comments in arch/powerpc/tools/head_check.sh
...

Explicitly handle these sections like in the main linker script so
there is no more build failure.

[1]: https://reviews.llvm.org/D79600
[2]: https://reviews.llvm.org/D92493

Fixes: 83a092cf95 ("powerpc: Link warning for orphan sections")
Cc: stable@vger.kernel.org
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://github.com/ClangBuiltLinux/linux/issues/1218
Link: https://lore.kernel.org/r/20210104205952.1399409-1-natechancellor@gmail.com
2021-01-06 21:59:04 +11:00
98bf2d3f49 powerpc/32s: Fix RTAS machine check with VMAP stack
When we have VMAP stack, exception prolog 1 sets r1, not r11.

When it is not an RTAS machine check, don't trash r1 because it is
needed by prolog 1.

Fixes: da7bb43ab9 ("powerpc/32: Fix vmap stack - Properly set r1 before activating MMU")
Fixes: d2e0060360 ("powerpc/32: Use SPRN_SPRG_SCRATCH2 in exception prologs")
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Squash in fixup for RTAS machine check from Christophe]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/bc77d61d1c18940e456a2dee464f1e2eda65a3f0.1608621048.git.christophe.leroy@csgroup.eu
2021-01-04 23:59:25 +11:00
9b3f7f1b84 Merge tag 'powerpc-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:

 - Four commits fixing various things in the new C VDSO code

 - One fix for a 32-bit VMAP stack bug

 - Two minor build fixes

Thanks to Cédric Le Goater, Christophe Leroy, and Will Springer.

* tag 'powerpc-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/32: Fix vmap stack - Properly set r1 before activating MMU on syscall too
  powerpc/vdso: Fix DOTSYM for 32-bit LE VDSO
  powerpc/vdso: Don't pass 64-bit ABI cflags to 32-bit VDSO
  powerpc/vdso: Block R_PPC_REL24 relocations
  powerpc/smp: Add __init to init_big_cores()
  powerpc/time: Force inlining of get_tb()
  powerpc/boot: Fix build of dts/fsl
2020-12-24 14:02:00 -08:00
347d81b68b Merge tag 'dma-mapping-5.11' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping updates from Christoph Hellwig:

 - support for a partial IOMMU bypass (Alexey Kardashevskiy)

 - add a DMA API benchmark (Barry Song)

 - misc fixes (Tiezhu Yang, tangjianqiang)

* tag 'dma-mapping-5.11' of git://git.infradead.org/users/hch/dma-mapping:
  selftests/dma: add test application for DMA_MAP_BENCHMARK
  dma-mapping: add benchmark support for streaming DMA APIs
  dma-contiguous: fix a typo error in a comment
  dma-pool: no need to check return value of debugfs_create functions
  powerpc/dma: Fallback to dma_ops when persistent memory present
  dma-mapping: Allow mixing bypass and mapped DMA operation
2020-12-22 13:19:43 -08:00
d5c243989f powerpc/32: Fix vmap stack - Properly set r1 before activating MMU on syscall too
We need r1 to be properly set before activating MMU, otherwise any new
exception taken while saving registers into the stack in syscall
prologs will use the user stack, which is wrong and will even lockup
or crash when KUAP is selected.

Do that by switching the meaning of r11 and r1 until we have saved r1
to the stack: copy r1 into r11 and setup the new stack pointer in r1.
To avoid complicating and impacting all generic and specific prolog
code (and more), copy back r1 into r11 once r11 is save onto
the stack.

We could get rid of copying r1 back and forth at the cost of rewriting
everything to use r1 instead of r11 all the way when CONFIG_VMAP_STACK
is set, but the effort is probably not worth it for now.

Fixes: da7bb43ab9 ("powerpc/32: Fix vmap stack - Properly set r1 before activating MMU")
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a3d819d5c348cee9783a311d5d3f3ba9b48fd219.1608531452.git.christophe.leroy@csgroup.eu
2020-12-21 22:24:00 +11:00
107521e803 powerpc/vdso: Don't pass 64-bit ABI cflags to 32-bit VDSO
When building the 32-bit VDSO, we are building 32-bit code as part of
a 64-bit kernel build. That requires us to tweak the cflags to trick
the compiler into building 32-bit code for us. The main way we do that
is by passing -m32, but there are other options that affect code
generation and ABI selection.

In particular when building vgettimeofday.c, we end up passing
-mcall-aixdesc because it's in KBUILD_CFLAGS, which causes the
compiler to generate function descriptors, and dot symbols, eg:

  $ nm arch/powerpc/kernel/vdso32/vgettimeofday.o
  000005d0 T .__c_kernel_clock_getres
  00000024 D __c_kernel_clock_getres
  ...

We get away with that at the moment because we also use the DOTSYM
macro, and that is also incorrectly prepending a '.' in 32-bit VDSO
code due to a separate bug.

But we shouldn't be generating function descriptors for this file,
there's no 32-bit ABI that includes function descriptors, so the
resulting object file is some frankenstein and it's surprising that it
even links.

So filter out all the ABI-related options we add to CFLAGS for 64-bit
builds, so that they're not used when building 32-bit code. With that
we only see regular text symbols:

  $ nm arch/powerpc/kernel/vdso32/vgettimeofday.o                                                                                                                                     michael@alpine1-p1
  000005d0 T __c_kernel_clock_getres
  00000000 T __c_kernel_clock_gettime
  00000200 T __c_kernel_clock_gettime64
  00000410 T __c_kernel_gettimeofday
  00000650 T __c_kernel_time

Fixes: ab037dd87a ("powerpc/vdso: Switch VDSO to generic C implementation.")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201218111619.1206391-2-mpe@ellerman.id.au
2020-12-21 22:06:26 +11:00
42ed6d56ad powerpc/vdso: Block R_PPC_REL24 relocations
Add R_PPC_REL24 relocations to the list of relocations we do NOT
support in the VDSO.

These are generated in some cases and we do not support relocating
them at runtime, so if they appear then the VDSO will not work at
runtime, therefore it's preferable to break the build if we see them.

Fixes: ab037dd87a ("powerpc/vdso: Switch VDSO to generic C implementation.")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201218111619.1206391-1-mpe@ellerman.id.au
2020-12-21 22:06:25 +11:00
9014eab6a3 powerpc/smp: Add __init to init_big_cores()
It fixes this link warning:

WARNING: modpost: vmlinux.o(.text.unlikely+0x2d98): Section mismatch in reference from the function init_big_cores.isra.0() to the function .init.text:init_thread_group_cache_map()
The function init_big_cores.isra.0() references
the function __init init_thread_group_cache_map().
This is often because init_big_cores.isra.0 lacks a __init
annotation or the annotation of init_thread_group_cache_map is wrong.

Fixes: 425752c63b ("powerpc: Detect the presence of big-cores via "ibm, thread-groups"")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201221074154.403779-1-clg@kaod.org
2020-12-21 22:06:10 +11:00
b0a0c2615f epoll: wire up syscall epoll_pwait2
Split off from prev patch in the series that implements the syscall.

Link: https://lkml.kernel.org/r/20201121144401.3727659-4-willemdebruijn.kernel@gmail.com
Signed-off-by: Willem de Bruijn <willemb@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-19 11:18:38 -08:00
8a5be36b93 Merge tag 'powerpc-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:

 - Switch to the generic C VDSO, as well as some cleanups of our VDSO
   setup/handling code.

 - Support for KUAP (Kernel User Access Prevention) on systems using the
   hashed page table MMU, using memory protection keys.

 - Better handling of PowerVM SMT8 systems where all threads of a core
   do not share an L2, allowing the scheduler to make better scheduling
   decisions.

 - Further improvements to our machine check handling.

 - Show registers when unwinding interrupt frames during stack traces.

 - Improvements to our pseries (PowerVM) partition migration code.

 - Several series from Christophe refactoring and cleaning up various
   parts of the 32-bit code.

 - Other smaller features, fixes & cleanups.

Thanks to: Alan Modra, Alexey Kardashevskiy, Andrew Donnellan, Aneesh
Kumar K.V, Ard Biesheuvel, Athira Rajeev, Balamuruhan S, Bill Wendling,
Cédric Le Goater, Christophe Leroy, Christophe Lombard, Colin Ian King,
Daniel Axtens, David Hildenbrand, Frederic Barrat, Ganesh Goudar,
Gautham R. Shenoy, Geert Uytterhoeven, Giuseppe Sacco, Greg Kurz,
Harish, Jan Kratochvil, Jordan Niethe, Kaixu Xia, Laurent Dufour,
Leonardo Bras, Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu
Desnoyers, Nathan Lynch, Nicholas Piggin, Oleg Nesterov, Oliver
O'Halloran, Oscar Salvador, Po-Hsu Lin, Qian Cai, Qinglang Miao, Randy
Dunlap, Ravi Bangoria, Sachin Sant, Sandipan Das, Sebastian Andrzej
Siewior , Segher Boessenkool, Srikar Dronamraju, Tyrel Datwyler, Uwe
Kleine-König, Vincent Stehlé, Youling Tang, and Zhang Xiaoxu.

* tag 'powerpc-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (304 commits)
  powerpc/32s: Fix cleanup_cpu_mmu_context() compile bug
  powerpc: Add config fragment for disabling -Werror
  powerpc/configs: Add ppc64le_allnoconfig target
  powerpc/powernv: Rate limit opal-elog read failure message
  powerpc/pseries/memhotplug: Quieten some DLPAR operations
  powerpc/ps3: use dma_mapping_error()
  powerpc: force inlining of csum_partial() to avoid multiple csum_partial() with GCC10
  powerpc/perf: Fix Threshold Event Counter Multiplier width for P10
  powerpc/mm: Fix hugetlb_free_pmd_range() and hugetlb_free_pud_range()
  KVM: PPC: Book3S HV: Fix mask size for emulated msgsndp
  KVM: PPC: fix comparison to bool warning
  KVM: PPC: Book3S: Assign boolean values to a bool variable
  powerpc: Inline setup_kup()
  powerpc/64s: Mark the kuap/kuep functions non __init
  KVM: PPC: Book3S HV: XIVE: Add a comment regarding VP numbering
  powerpc/xive: Improve error reporting of OPAL calls
  powerpc/xive: Simplify xive_do_source_eoi()
  powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FW
  powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FW
  powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG
  ...
2020-12-17 13:34:25 -08:00
09c0796adf Merge tag 'trace-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
 "The major update to this release is that there's a new arch config
  option called CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS.

  Currently, only x86_64 enables it. All the ftrace callbacks now take a
  struct ftrace_regs instead of a struct pt_regs. If the architecture
  has HAVE_DYNAMIC_FTRACE_WITH_ARGS enabled, then the ftrace_regs will
  have enough information to read the arguments of the function being
  traced, as well as access to the stack pointer.

  This way, if a user (like live kernel patching) only cares about the
  arguments, then it can avoid using the heavier weight "regs" callback,
  that puts in enough information in the struct ftrace_regs to simulate
  a breakpoint exception (needed for kprobes).

  A new config option that audits the timestamps of the ftrace ring
  buffer at most every event recorded.

  Ftrace recursion protection has been cleaned up to move the protection
  to the callback itself (this saves on an extra function call for those
  callbacks).

  Perf now handles its own RCU protection and does not depend on ftrace
  to do it for it (saving on that extra function call).

  New debug option to add "recursed_functions" file to tracefs that
  lists all the places that triggered the recursion protection of the
  function tracer. This will show where things need to be fixed as
  recursion slows down the function tracer.

  The eval enum mapping updates done at boot up are now offloaded to a
  work queue, as it caused a noticeable pause on slow embedded boards.

  Various clean ups and last minute fixes"

* tag 'trace-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits)
  tracing: Offload eval map updates to a work queue
  Revert: "ring-buffer: Remove HAVE_64BIT_ALIGNED_ACCESS"
  ring-buffer: Add rb_check_bpage in __rb_allocate_pages
  ring-buffer: Fix two typos in comments
  tracing: Drop unneeded assignment in ring_buffer_resize()
  tracing: Disable ftrace selftests when any tracer is running
  seq_buf: Avoid type mismatch for seq_buf_init
  ring-buffer: Fix a typo in function description
  ring-buffer: Remove obsolete rb_event_is_commit()
  ring-buffer: Add test to validate the time stamp deltas
  ftrace/documentation: Fix RST C code blocks
  tracing: Clean up after filter logic rewriting
  tracing: Remove the useless value assignment in test_create_synth_event()
  livepatch: Use the default ftrace_ops instead of REGS when ARGS is available
  ftrace/x86: Allow for arguments to be passed in to ftrace_regs by default
  ftrace: Have the callbacks receive a struct ftrace_regs instead of pt_regs
  MAINTAINERS: assign ./fs/tracefs to TRACING
  tracing: Fix some typos in comments
  ftrace: Remove unused varible 'ret'
  ring-buffer: Add recording of ring buffer recursion into recursed_functions
  ...
2020-12-17 13:22:17 -08:00
005b2a9dc8 Merge tag 'tif-task_work.arch-2020-12-14' of git://git.kernel.dk/linux-block
Pull TIF_NOTIFY_SIGNAL updates from Jens Axboe:
 "This sits on top of of the core entry/exit and x86 entry branch from
  the tip tree, which contains the generic and x86 parts of this work.

  Here we convert the rest of the archs to support TIF_NOTIFY_SIGNAL.

  With that done, we can get rid of JOBCTL_TASK_WORK from task_work and
  signal.c, and also remove a deadlock work-around in io_uring around
  knowing that signal based task_work waking is invoked with the sighand
  wait queue head lock.

  The motivation for this work is to decouple signal notify based
  task_work, of which io_uring is a heavy user of, from sighand. The
  sighand lock becomes a huge contention point, particularly for
  threaded workloads where it's shared between threads. Even outside of
  threaded applications it's slower than it needs to be.

  Roman Gershman <romger@amazon.com> reported that his networked
  workload dropped from 1.6M QPS at 80% CPU to 1.0M QPS at 100% CPU
  after io_uring was changed to use TIF_NOTIFY_SIGNAL. The time was all
  spent hammering on the sighand lock, showing 57% of the CPU time there
  [1].

  There are further cleanups possible on top of this. One example is
  TIF_PATCH_PENDING, where a patch already exists to use
  TIF_NOTIFY_SIGNAL instead. Hopefully this will also lead to more
  consolidation, but the work stands on its own as well"

[1] https://github.com/axboe/liburing/issues/215

* tag 'tif-task_work.arch-2020-12-14' of git://git.kernel.dk/linux-block: (28 commits)
  io_uring: remove 'twa_signal_ok' deadlock work-around
  kernel: remove checking for TIF_NOTIFY_SIGNAL
  signal: kill JOBCTL_TASK_WORK
  io_uring: JOBCTL_TASK_WORK is no longer used by task_work
  task_work: remove legacy TWA_SIGNAL path
  sparc: add support for TIF_NOTIFY_SIGNAL
  riscv: add support for TIF_NOTIFY_SIGNAL
  nds32: add support for TIF_NOTIFY_SIGNAL
  ia64: add support for TIF_NOTIFY_SIGNAL
  h8300: add support for TIF_NOTIFY_SIGNAL
  c6x: add support for TIF_NOTIFY_SIGNAL
  alpha: add support for TIF_NOTIFY_SIGNAL
  xtensa: add support for TIF_NOTIFY_SIGNAL
  arm: add support for TIF_NOTIFY_SIGNAL
  microblaze: add support for TIF_NOTIFY_SIGNAL
  hexagon: add support for TIF_NOTIFY_SIGNAL
  csky: add support for TIF_NOTIFY_SIGNAL
  openrisc: add support for TIF_NOTIFY_SIGNAL
  sh: add support for TIF_NOTIFY_SIGNAL
  um: add support for TIF_NOTIFY_SIGNAL
  ...
2020-12-16 12:33:35 -08:00
5e60366d56 Merge tag 'fallthrough-fixes-clang-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
Pull fallthrough fixes from Gustavo A. R. Silva:
 "Fix many fall-through warnings when building with Clang 12.0.0
  using -Wimplicit-fallthrough.

   - powerpc: boot: include compiler_attributes.h (Nick Desaulniers)

   - Revert "lib: Revert use of fallthrough pseudo-keyword in lib/"
     (Nick Desaulniers)

   - powerpc: fix -Wimplicit-fallthrough (Nick Desaulniers)

   - lib: Fix fall-through warnings for Clang (Gustavo A. R. Silva)"

* tag 'fallthrough-fixes-clang-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  lib: Fix fall-through warnings for Clang
  powerpc: fix -Wimplicit-fallthrough
  Revert "lib: Revert use of fallthrough pseudo-keyword in lib/"
  powerpc: boot: include compiler_attributes.h
2020-12-16 00:24:16 -08:00
d0a3ac549f ubsan: enable for all*config builds
With UBSAN_OBJECT_SIZE disabled for GCC, only UBSAN_ALIGNMENT remained a
noisy UBSAN option.  Disable it for COMPILE_TEST so the rest of UBSAN can
be used for full all*config builds or other large combinations.

[sfr@canb.auug.org.au: add .data..Lubsan_data*/.data..Lubsan_type* sections explicitly]
  Link: https://lkml.kernel.org/r/20201208230157.42c42789@canb.auug.org.au

Link: https://lore.kernel.org/lkml/CAHk-=wgXW=YLxGN0QVpp-1w5GDd2pf1W-FqY15poKzoVfik2qA@mail.gmail.com/
Link: https://lkml.kernel.org/r/20201203004437.389959-6-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: George Popescu <georgepope@android.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Marco Elver <elver@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15 22:46:18 -08:00
3c41e57a1e Merge tag 'irqchip-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
Pull irqchip updates for 5.11 from Marc Zyngier:

  - Preliminary support for managed interrupts on platform devices
  - Correctly identify allocation of MSIs proxyied by another device
  - Remove the fasteoi IPI flow which has been proved useless
  - Generalise the Ocelot support to new SoCs
  - Improve GICv4.1 vcpu entry, matching the corresponding KVM optimisation
  - Work around spurious interrupts on Qualcomm PDC
  - Random fixes and cleanups

Link: https://lore.kernel.org/r/20201212135626.1479884-1-maz@kernel.org
2020-12-15 10:48:07 +01:00
0be47634db powerpc/cacheinfo: Print correct cache-sibling map/list for L2 cache
On POWER platforms where only some groups of threads within a core
share the L2-cache (indicated by the ibm,thread-groups device-tree
property), we currently print the incorrect shared_cpu_map/list for
L2-cache in the sysfs.

This patch reports the correct shared_cpu_map/list on such platforms.

Example:
On a platform with "ibm,thread-groups" set to
                 00000001 00000002 00000004 00000000
                 00000002 00000004 00000006 00000001
                 00000003 00000005 00000007 00000002
                 00000002 00000004 00000000 00000002
                 00000004 00000006 00000001 00000003
                 00000005 00000007

This indicates that threads {0,2,4,6} in the core share the L2-cache
and threads {1,3,5,7} in the core share the L2 cache.

However, without the patch, the shared_cpu_map/list for L2 for CPUs 0,
1 is reported in the sysfs as follows:

/sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_list:0-7
/sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_map:000000,000000ff

/sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_list:0-7
/sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map:000000,000000ff

With the patch, the shared_cpu_map/list for L2 cache for CPUs 0, 1 is
correctly reported as follows:

/sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_list:0,2,4,6
/sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_map:000000,00000055

/sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_list:1,3,5,7
/sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map:000000,000000aa

This patch also defines cpu_l2_cache_mask() for !CONFIG_SMP case.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1607596739-32439-6-git-send-email-ego@linux.vnet.ibm.com
2020-12-11 00:10:25 +11:00
9538abee18 powerpc/smp: Add support detecting thread-groups sharing L2 cache
On POWER systems, groups of threads within a core sharing the L2-cache
can be indicated by the "ibm,thread-groups" property array with the
identifier "2".

This patch adds support for detecting this, and when present, populate
the populating the cpu_l2_cache_mask of every CPU to the core-siblings
which share L2 with the CPU as specified in the by the
"ibm,thread-groups" property array.

On a platform with the following "ibm,thread-group" configuration
		 00000001 00000002 00000004 00000000
		 00000002 00000004 00000006 00000001
		 00000003 00000005 00000007 00000002
		 00000002 00000004 00000000 00000002
		 00000004 00000006 00000001 00000003
		 00000005 00000007

Without this patch, the sched-domain hierarchy for CPUs 0,1 would be
	CPU0 attaching sched-domain(s):
	domain-0: span=0,2,4,6 level=SMT
	domain-1: span=0-7 level=CACHE
	domain-2: span=0-15,24-39,48-55 level=MC
	domain-3: span=0-55 level=DIE

	CPU1 attaching sched-domain(s):
	domain-0: span=1,3,5,7 level=SMT
	domain-1: span=0-7 level=CACHE
	domain-2: span=0-15,24-39,48-55 level=MC
	domain-3: span=0-55 level=DIE

The CACHE domain at 0-7 is incorrect since the ibm,thread-groups
sub-array
[00000002 00000002 00000004
 00000000 00000002 00000004 00000006
 00000001 00000003 00000005 00000007]
indicates that L2 (Property "2") is shared only between the threads of a single
group. There are "2" groups of threads where each group contains "4"
threads each. The groups being {0,2,4,6} and {1,3,5,7}.

With this patch, the sched-domain hierarchy for CPUs 0,1 would be
     	CPU0 attaching sched-domain(s):
	domain-0: span=0,2,4,6 level=SMT
	domain-1: span=0-15,24-39,48-55 level=MC
	domain-2: span=0-55 level=DIE

	CPU1 attaching sched-domain(s):
	domain-0: span=1,3,5,7 level=SMT
	domain-1: span=0-15,24-39,48-55 level=MC
	domain-2: span=0-55 level=DIE

The CACHE domain with span=0,2,4,6 for CPU 0 (span=1,3,5,7 for CPU 1
resp.) gets degenerated into the SMT domain. Furthermore, the
last-level-cache domain gets correctly set to the SMT sched-domain.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1607596739-32439-5-git-send-email-ego@linux.vnet.ibm.com
2020-12-11 00:10:25 +11:00
fbd2b672e9 powerpc/smp: Rename init_thread_group_l1_cache_map() to make it generic
init_thread_group_l1_cache_map() initializes the per-cpu cpumask
thread_group_l1_cache_map with the core-siblings which share L1 cache
with the CPU. Make this function generic to the cache-property (L1 or
L2) and update a suitable mask. This is a preparatory patch for the
next patch where we will introduce discovery of thread-groups that
share L2-cache.

No functional change.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1607596739-32439-4-git-send-email-ego@linux.vnet.ibm.com
2020-12-11 00:10:25 +11:00
1fdc1d6632 powerpc/smp: Rename cpu_l1_cache_map as thread_group_l1_cache_map
On platforms which have the "ibm,thread-groups" property, the per-cpu
variable cpu_l1_cache_map keeps a track of which group of threads
within the same core share the L1 cache, Instruction and Data flow.

This patch renames the variable to "thread_group_l1_cache_map" to make
it consistent with a subsequent patch which will introduce
thread_group_l2_cache_map.

This patch introduces no functional change.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1607596739-32439-3-git-send-email-ego@linux.vnet.ibm.com
2020-12-11 00:10:24 +11:00
790a1662d3 powerpc/smp: Parse ibm,thread-groups with multiple properties
The "ibm,thread-groups" device-tree property is an array that is used
to indicate if groups of threads within a core share certain
properties. It provides details of which property is being shared by
which groups of threads. This array can encode information about
multiple properties being shared by different thread-groups within the
core.

Example: Suppose,
"ibm,thread-groups" = [1,2,4,8,10,12,14,9,11,13,15,2,2,4,8,10,12,14,9,11,13,15]

This can be decomposed up into two consecutive arrays:

a) [1,2,4,8,10,12,14,9,11,13,15]
b) [2,2,4,8,10,12,14,9,11,13,15]

where in,

a) provides information of Property "1" being shared by "2" groups,
   each with "4" threads each. The "ibm,ppc-interrupt-server#s" of the
   first group is {8,10,12,14} and the "ibm,ppc-interrupt-server#s" of
   the second group is {9,11,13,15}. Property "1" is indicative of
   the thread in the group sharing L1 cache, translation cache and
   Instruction Data flow.

b) provides information of Property "2" being shared by "2" groups,
   each group with "4" threads. The "ibm,ppc-interrupt-server#s" of
   the first group is {8,10,12,14} and the
   "ibm,ppc-interrupt-server#s" of the second group is
   {9,11,13,15}. Property "2" indicates that the threads in each group
   share the L2-cache.

The existing code assumes that the "ibm,thread-groups" encodes
information about only one property. Hence even on platforms which
encode information about multiple properties being shared by the
corresponding groups of threads, the current code will only pick the
first one. (In the above example, it will only consider
[1,2,4,8,10,12,14,9,11,13,15] but not [2,2,4,8,10,12,14,9,11,13,15]).

This patch extends the parsing support on platforms which encode
information about multiple properties being shared by the
corresponding groups of threads.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1607596739-32439-2-git-send-email-ego@linux.vnet.ibm.com
2020-12-11 00:10:16 +11:00
3d2ffcdd2a powerpc/watchpoint: Workaround P10 DD1 issue with VSX-32 byte instructions
POWER10 DD1 has an issue where it generates watchpoint exceptions when
it shouldn't. The conditions where this occur are:

 - octword op
 - ending address of DAWR range is less than starting address of op
 - those addresses need to be in the same or in two consecutive 512B
   blocks
 - 'op address + 64B' generates an address that has a carry into bit
   52 (crosses 2K boundary)

Handle such spurious exception by considering them as extraneous and
emulating/single-steeping instruction without generating an event.

[ravi: Fixed build warning reported by lkp@intel.com]
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201106045650.278987-1-ravi.bangoria@linux.ibm.com
2020-12-11 00:09:10 +11:00
02b02ee1b0 powerpc/64s: Remove idle workaround code from restore_cpu_cpufeatures
Idle code no longer uses the .cpu_restore CPU operation to restore
SPRs, so this workaround is no longer required.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190711022404.18132-2-npiggin@gmail.com
2020-12-10 23:13:26 +11:00
59d512e437 powerpc/64: irq replay remove decrementer overflow check
This is way to catch some cases of decrementer overflow, when the
decrementer has underflowed an odd number of times, while MSR[EE] was
disabled.

With a typical small decrementer, a timer that fires when MSR[EE] is
disabled will be "lost" if MSR[EE] remains disabled for between 4.3 and
8.6 seconds after the timer expires. In any case, the decrementer
interrupt would be taken at 8.6 seconds and the timer would be found at
that point.

So this check is for catching extreme latency events, and it prevents
those latencies from being a further few seconds long.  It's not obvious
this is a good tradeoff. This is already a watchdog magnitude event and
that situation is not improved a significantly with this check. For
large decrementers, it's useless.

Therefore remove this check, which avoids a mftb when enabling hard
disabled interrupts (e.g., when enabling after coming from hardware
interrupt handlers). Perhaps more importantly, it also removes the
clunky MSR[EE] vs PACA_IRQ_HARD_DIS incoherency in soft-interrupt replay
which simplifies the code.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201107014336.2337337-1-npiggin@gmail.com
2020-12-09 23:48:15 +11:00
e89a8ca94b powerpc/64s: Remove MSR[ISF] bit
No supported processor implements this mode. Setting the bit in
MSR values can be a bit confusing (and would prevent the bit from
ever being reused). Remove it.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201106045340.1935841-1-npiggin@gmail.com
2020-12-09 23:48:14 +11:00
5f1888a077 powerpc/fault: Perform exception fixup in do_page_fault()
Exception fixup doesn't require the heady full regs saving,
do it from do_page_fault() directly.

For that, split bad_page_fault() in two parts.

As bad_page_fault() can also be called from other places than
handle_page_fault(), it will still perform exception fixup and
fallback on __bad_page_fault().

handle_page_fault() directly calls __bad_page_fault() as the
exception fixup will now be done by do_page_fault()

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/bd07d6fef9237614cd6d318d8f19faeeadaa816b.1607491748.git.christophe.leroy@csgroup.eu
2020-12-09 23:48:14 +11:00
89eecd938c powerpc/8xx: Use SPRN_SPRG_SCRATCH2 in DTLB miss exception
Use SPRN_SPRG_SCRATCH2 in DTLB miss exception instead of DAR
in order to be similar to ITLB miss exception.

This also simplifies mpc8xx_pmu_del()

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e3cc8f023ef40e1e8ae144e4dd1330a5ff022528.1606231483.git.christophe.leroy@csgroup.eu
2020-12-09 23:48:13 +11:00
a314ea5abf powerpc/8xx: Use SPRN_SPRG_SCRATCH2 in ITLB miss exception
In order to re-enable MMU earlier, ensure ITLB miss exception
cannot clobber SPRN_SPRG_SCRATCH0 and SPRN_SPRG_SCRATCH1.
Do so by using SPRN_SPRG_SCRATCH2 and SPRN_M_TW instead, like
the DTLB miss exception.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/abc78e8e9577d473691ebb9996c6413b37bfd9ca.1606231483.git.christophe.leroy@csgroup.eu
2020-12-09 23:48:12 +11:00
576e02bbf1 powerpc/8xx: Simplify INVALIDATE_ADJACENT_PAGES_CPU15
We now have r11 available as a scratch register so
INVALIDATE_ADJACENT_PAGES_CPU15() can be simplified.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/bdafd651b4ac3a851fd09249f5f3699c50da29f2.1606231483.git.christophe.leroy@csgroup.eu
2020-12-09 23:48:12 +11:00
bccc58986a powerpc/8xx: Always pin kernel text TLB
There is no big poing in not pinning kernel text anymore, as now
we can keep pinned TLB even with things like DEBUG_PAGEALLOC.

Remove CONFIG_PIN_TLB_TEXT, making it always right.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Drop ifdef around mmu_pin_tlb() to fix build errors]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/203b89de491e1379f1677a2685211b7c32adfff0.1606231483.git.christophe.leroy@csgroup.eu
2020-12-09 23:47:45 +11:00
613df979da powerpc/8xx: DEBUG_PAGEALLOC doesn't require an ITLB miss exception handler
Since commit e611939fc8 ("powerpc/mm: Ensure change_page_attr()
doesn't invalidate pinned TLBs"), pinned TLBs are not anymore
invalidated by __kernel_map_pages() when CONFIG_DEBUG_PAGEALLOC is
selected.

Remove the dependency on CONFIG_DEBUG_PAGEALLOC.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e796c5fcb5898de827c803cf1ab8ba1d7a5d4b76.1606231483.git.christophe.leroy@csgroup.eu
2020-12-09 17:01:59 +11:00
ad3ed15cd0 powerpc/process: Remove target specific __set_dabr()
__set_dabr() are simple functions that can be inline directly
inside set_dabr() and using IS_ENABLED() instead of #ifdef

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c10b263668e137236c71d76648b03cf2cd1ee66f.1607076733.git.christophe.leroy@csgroup.eu
2020-12-09 17:01:40 +11:00