IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Currently powerpc's asm/io.h includes linux/io.h, and linux/io.h
includes asm/io.h.
This can cause problems because depending on which is included first the
order of definitions between the two files will change.
The include of linux/io.h was added back in 2008 in commit b41e5fffe8b8
("[POWERPC] devres: Add devm_ioremap_prot()"). It's not entirely clear
it was needed then, but devm_ioremap_prot() has since been removed
entirely as unused, in dedd24a12fe6 ("powerpc: Remove unused
devm_ioremap_prot()").
So it seems to be unnecessary and can potentially cause problems, so
remove the include of linux/io.h from asm/io.h
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
POWER9 requires msgsync for receiver-side synchronization, and a DD1
workaround restricts IPIs to core-local.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Drop no longer needed asm feature macro changes]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
IPIs are a pretty hot path and we already have the ability to do asm feature
patching, so use it.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Change log detail]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
POWER9 changes requirements and adds new instructions for
synchronization.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Change the doorbell callers to know about their msgsnd addressing,
rather than have them set a per-cpu target data tag at boot that gets
sent to the cause_ipi functions. The data is only used for doorbell IPI
functions, no other IPI types, so it makes sense to keep that detail
local to doorbell.
Have the platform code understand doorbell IPIs, rather than the
interrupt controller code understand them. Platform code can look at
capabilities it has available and decide which to use.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Add the bit definition and use it in facility_unavailable_exception() so we can
intelligently report the cause if we take a fault for SCV. This doesn't actually
enable SCV.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Drop whitespace changes to the existing entries, flush out change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We have a #define for it, so use it.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The current behaviour of the hash table dump assumes that memory is contiguous
and iterates from the start of memory to (start + size of memory). When memory
isn't physically contiguous, this doesn't work.
If memory exists at 0-5 GB and 6-10 GB then the current approach will check if
entries exist in the hash table from 0GB to 9GB. This patch changes the
behaviour to iterate over any holes up to the end of memory.
Fixes: 1515ab932156 ("powerpc/mm: Dump hash table")
Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The current page table dumper scans the Linux page tables and coalesces mappings
with adjacent virtual addresses and similar PTE flags. This behaviour is
somewhat broken when you consider the IOREMAP space where entirely unrelated
mappings will appear to be virtually contiguous. This patch modifies the range
coalescing so that only ranges that are both physically and virtually contiguous
are combined. This patch also adds to the dump output the physical address at
the start of each range.
Fixes: 8eb07b187000 ("powerpc/mm: Dump linux pagetables")
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
[mpe: Print the physicall address with 0x like the other addresses]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
On Book3s we have two PTE flags used to mark cache-inhibited mappings:
_PAGE_TOLERANT and _PAGE_NON_IDEMPOTENT. Currently the kernel page table dumper
only looks at the generic _PAGE_NO_CACHE which is defined to be _PAGE_TOLERANT.
This patch modifies the dumper so both flags are shown in the dump.
Fixes: 8eb07b187000 ("powerpc/mm: Dump linux pagetables")
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently sys_mmap() and sys_mmap2() (32-bit only), are not visible to the
syscall tracing machinery. This means users are not able to see the execution of
mmap() syscalls using the syscall tracer.
Fix that by using SYSCALL_DEFINE6 for sys_mmap() and sys_mmap2() so that the
meta-data associated with these syscalls is visible to the syscall tracer.
A side-effect of this change is that the return type has changed from unsigned
long to long. However this should have no effect, the only code in the kernel
which uses the result of these syscalls is in the syscall return path, which is
written in asm and treats the result as unsigned regardless.
Example output:
cat-3399 [001] .... 196.542410: sys_mmap(addr: 7fff922a0000, len: 20000, prot: 3, flags: 812, fd: 3, offset: 1b0000)
cat-3399 [001] .... 196.542443: sys_mmap -> 0x7fff922a0000
cat-3399 [001] .... 196.542668: sys_munmap(addr: 7fff922c0000, len: 6d2c)
cat-3399 [001] .... 196.542677: sys_munmap -> 0x0
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[mpe: Massage change log, add detail on return type change]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Recently in commit f6eedbba7a26 ("powerpc/mm/hash: Increase VA range to 128TB"),
we increased H_PGD_INDEX_SIZE to 15 when we're building with 64K pages. This
makes it larger than RADIX_PGD_INDEX_SIZE (13), which means the logic to
calculate MAX_PGD_INDEX_SIZE in book3s/64/pgtable.h is wrong.
The end result is that the PGD (Page Global Directory, ie top level page table)
of the kernel (aka. swapper_pg_dir), is too small.
This generally doesn't lead to a crash, as we don't use the full range in normal
operation. However if we try to dump the kernel pagetables we can trigger a
crash because we walk off the end of the pgd into other memory and eventually
try to dereference something bogus:
$ cat /sys/kernel/debug/kernel_pagetables
Unable to handle kernel paging request for data at address 0xe8fece0000000000
Faulting instruction address: 0xc000000000072314
cpu 0xc: Vector: 380 (Data SLB Access) at [c0000000daa13890]
pc: c000000000072314: ptdump_show+0x164/0x430
lr: c000000000072550: ptdump_show+0x3a0/0x430
dar: e802cf0000000000
seq_read+0xf8/0x560
full_proxy_read+0x84/0xc0
__vfs_read+0x6c/0x1d0
vfs_read+0xbc/0x1b0
SyS_read+0x6c/0x110
system_call+0x38/0xfc
The root cause is that MAX_PGD_INDEX_SIZE isn't actually computed to be
the max of H_PGD_INDEX_SIZE or RADIX_PGD_INDEX_SIZE. To fix that move
the calculation into asm-offsets.c where we can do it easily using
max().
Fixes: f6eedbba7a26 ("powerpc/mm/hash: Increase VA range to 128TB")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This merges the arch part of the XIVE support, leaving the final commit
with the KVM specific pieces dangling on the branch for Paul to merge
via the kvm-ppc tree.
POWER9 DD1.0 hardware has a bug where the SPRs of a thread waking up
from stop 0,1,2 with ESL=1 can endup being misplaced in the core. Thus
the HSPRG0 of a thread waking up from can contain the paca pointer of
its sibling.
This patch implements a context recovery framework within threads of a
core, by provisioning space in paca_struct for saving every sibling
threads's paca pointers. Basically, we should be able to arrive at the
right paca pointer from any of the thread's existing paca pointer.
At bootup, during powernv idle-init, we save the paca address of every
CPU in each one its siblings paca_struct in the slot corresponding to
this CPU's index in the core.
On wakeup from a stop, the thread will determine its index in the core
from the TIR register and recover its PACA pointer by indexing into
the correct slot in the provisioned space in the current PACA.
Furthermore, ensure that the NVGPRs are restored from the stack on the
way out by setting the NAPSTATELOST in paca.
[Changelog written with inputs from svaidy@linux.vnet.ibm.com]
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Call it a bug]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently during idle-init on power9, if we don't find suitable stop
states in the device tree that can be used as the
default_stop/deepest_stop, we set stop0 (ESL=1,EC=1) as the default
stop state psscr to be used by power9_idle and deepest stop state
which is used by CPU-Hotplug.
However, if the platform firmware has not configured or enabled a stop
state, the kernel should not make any assumptions and fallback to a
default choice.
If the kernel uses a stop state that is not configured by the platform
firmware, it may lead to further failures which should be avoided.
In this patch, we modify the init code to ensure that the kernel uses
only the stop states exposed by the firmware through the device
tree. When a suitable default stop state isn't found, we disable
ppc_md.power_save for power9. Similarly, when a suitable
deepest_stop_state is not found in the device tree exported by the
firmware, fall back to the default busy-wait loop in the CPU-Hotplug
code.
[Changelog written with inputs from svaidy@linux.vnet.ibm.com]
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently, the powernv cpu-offline function assumes that platform idle
states such as stop on POWER9, winkle/sleep/nap on POWER8 are always
available. On POWER8, it picks nap as the default state if other deep
idle states like sleep/winkle are not available and enabled in the
platform.
On POWER9, nap is not available and all idle states are managed by
STOP instruction. The parameters to the idle state are passed through
processor stop status control register (PSSCR). Hence as such
executing STOP would take parameters from current PSSCR. We do not
want to make any assumptions in kernel on what STOP states and PSSCR
features are configured by the platform.
Ideally platform will configure a good set of stop states that can be
used in the kernel. We would like to start with a clean slate, if the
platform choose to not configure any state or there is an error in
platform firmware that lead to no stop states being configured or
allowed to be requested.
This patch adds a fallback method for CPU-Hotplug that is similar to
snooze loop at idle where the threads are left to spin at low priority
and hence reduce the cycles consumed.
This is a safe fallback mechanism in the case when no stop state would
be requested if the platform firmware did not configure them most
likely due to an error condition.
Requesting a stop state when the platform has not configured them or
enabled them would lead to further error conditions which could be
difficult to debug.
[Changelog written with inputs from svaidy@linux.vnet.ibm.com]
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Move the piece of code in powernv/smp.c::pnv_smp_cpu_kill_self() which
transitions the CPU to the deepest available platform idle state to a
new function named pnv_cpu_offline() in powernv/idle.c. The rationale
behind this code movement is that the data required to determine the
deepest available platform state resides in powernv/idle.c.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Add user space exported API definitions for 512KB, 1MB, 2MB, 8MB, 16MB,
1GB, 16GB non default huge page sizes to be used with mmap() system
call.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
[mpe: Reword the comment to emphasise that these are only needed to use
the non-default huge page size, and updated the change log.]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Make sparsemem the default on all 64-bit Book3S platforms. It already is
for pseries and ps3, and we need to enable it for powernv because on
POWER9 memory between chips is discontiguous.
For the other platforms sparsemem should work fine, though it might add
a small amount of overhead. We can always force FLATMEM in the
defconfigs if necessary.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
setup_initial_memory_limit() is called from early_init_devtree(), which
runs prior to feature patching. If the kernel is built with CONFIG_JUMP_LABEL=y
and CONFIG_JUMP_LABEL_FEATURE_CHECKS=y then we will potentially get the
wrong value.
If we also have CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG=y we get a warning
and backtrace:
Warning! mmu_has_feature() used prior to jump label init!
CPU: 0 PID: 0 Comm: swapper Not tainted 4.11.0-rc4-gccN-next-20170331-g6af2434 #1
Call Trace:
[c000000000fc3d50] [c000000000a26c30] .dump_stack+0xa8/0xe8 (unreliable)
[c000000000fc3de0] [c00000000002e6b8] .setup_initial_memory_limit+0xa4/0x104
[c000000000fc3e60] [c000000000d5c23c] .early_init_devtree+0xd0/0x2f8
[c000000000fc3f00] [c000000000d5d3b0] .early_setup+0x90/0x11c
[c000000000fc3f90] [c000000000000520] start_here_multiplatform+0x68/0x80
Fix it by using early_mmu_has_feature().
Fixes: c12e6f24d413 ("powerpc: Add option to use jump label for mmu_has_feature()")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
These files don't seem to have any need for asm/debug.h, now that all it
includes are the debugger hooks and breakpoint definitions.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
powerpc_debugfs_root is the dentry representing the root of the
"powerpc" directory tree in debugfs.
Currently it sits in asm/debug.h, a long with some other things that
have "debug" in the name, but are otherwise unrelated.
Pull it out into a separate header, which also includes linux/debugfs.h,
and convert all the users to include debugfs.h instead of debug.h.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In the recent commit 1ab66d1fbada ("powerpc/powernv: Introduce address
translation services for Nvlink2") the NPU code gained a dependency on MMU
notifiers.
All our defconfigs have KVM enabled, which selects MMU_NOTIFIER, but if KVM is
not enabled then the build breaks.
Fix it by always selecting MMU_NOTIFIER when we're building powernv.
Fixes: 1ab66d1fbada ("powerpc/powernv: Introduce address translation services for Nvlink2")
Signed-off-by: Alistair Popple <alistair@popple.id.au>
[mpe: Reword change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
For a tlbiel with pid, we need to issue tlbiel with set number encoded. We
don't need to do ptesync for each of those. Instead we need one for the entire
tlbiel pid operation.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
For fullmm tlb flush, we do a flush with RIC_FLUSH_ALL which will invalidate all
related caches (radix__tlb_flush()). Hence the pwc flush is not needed.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Move all the EDAC core functionality behind CONFIG_EDAC and get rid of
that indirection. Update defconfigs which had it.
While at it, fix dependencies such that EDAC depends on RAS for the
tracepoints.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: linux-edac@vger.kernel.org
We need to set LPES in order for normal external interrupts (0x500)
to be directed to the guest while running in guest state.
We also need HEIC set to prevent them to be sent to the host while
in host state.
With XIVE the host never gets one of these and wouldn't know how to
handle it. All host external interrupts come in via the new
hypervisor virtualization interrupts vector.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We have all sort of variants of MMIO accessors for the real mode
instructions. This creates a clean set of accessors based on
Linux normal naming conventions, replacing all occurrences of
the old ones in the tree.
I have purposefully removed the "out/in" variants in favor of
only including __raw variants. Any code using these is already
pretty much hand tuned to operate in a very specific environment.
I've fixed up the 2 users (only one of them actually needed
a barrier in the first place).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The function doesn't exist anymore
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
It's only used within the same file it's defined
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We traditionally have linux/ before asm/
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The XIVE interrupt controller is the new interrupt controller
found in POWER9. It supports advanced virtualization capabilities
among other things.
Currently we use a set of firmware calls that simulate the old
"XICS" interrupt controller but this is fairly inefficient.
This adds the framework for using XIVE along with a native
backend which OPAL for configuration. Later, a backend allowing
the use in a KVM or PowerVM guest will also be provided.
This disables some fast path for interrupts in KVM when XIVE is
enabled as these rely on the firmware emulation code which is no
longer available when the XIVE is used natively by Linux.
A latter patch will make KVM also directly exploit the XIVE, thus
recovering the lost performance (and more).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Fixup pr_xxx("XIVE:"...), don't split pr_xxx() strings,
tweak Kconfig so XIVE_NATIVE selects XIVE and depends on POWERNV,
fix build errors when SMP=n, fold in fixes from Ben:
Don't call cpu_online() on an invalid CPU number
Fix irq target selection returning out of bounds cpu#
Extra sanity checks on cpu numbers
]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Headed to stable:
- disable HFSCR[TM] if TM is not supported, fixes a potential host kernel crash
triggered by a hostile guest, but only in configurations that no one uses
- don't try to fix up misaligned load-with-reservation instructions
- fix flush_(d|i)cache_range() called from modules on little endian kernels
- add missing global TLB invalidate if cxl is active
- fix missing preempt_disable() in crc32c-vpmsum
And a fix for selftests build changes that went in this release:
- selftests/powerpc: Fix standalone powerpc build
Thanks to:
Benjamin Herrenschmidt, Frederic Barrat, Oliver O'Halloran, Paul Mackerras.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJY6LIKAAoJEFHr6jzI4aWAhfcQAKORHx/tJf9w8KqcfSfKfeEL
O8cZEl5/N3ArNXVM5J5QK5KnMVHnoWWR3FWYwntOjt3RJywjJYJ02YvhOVvt4q+M
YinRS34KzAhnT1f526zx97v0BGqi//UJamrcFBUBTd4rLuHGbol7fdtWHVrsMYa0
KWQ+ooPLEpGDk4I3sDz37yeJBQXVpyhC/UF8vzHpvHGPvIQ8Dw8rfWwOZ0HooJuZ
ewKdkeIsYF8SrM461c1GhOI0VXB0q+CMn9mzIaEKMuZMhHDKyiaM5rm8mWXapzcT
HsCQKlF9X9YHAbhbSbz9DGvNCEYaW7T4vnudSNHjQaAJlA4HsmeRwWXy4+zqZuPc
rIbRIFZAyV3wYowN7j3P6Se3lLBDMmlHZvVkygJnwoaR4rmoujePGwdAv8ZH4Udn
hrbieC41HKVxcm5t3whIDOcHmxaAo1MDqmrVhyxJSjgnkdBtN/gnZXvHDb0VeOJV
9wFGGE8WvMXnTKEcjM2l+a14CuOrV/wRbHQ1B1O0Kfk613cPrukMYab6eLPqyJzF
lmkCm1o46bib5oBOmvlqK+5oVuwNyfHmJSzvL+VOylhLVbJPmFJUhHQFssCvsTUf
k36ZAUxH4fbz1TzAPipXl+wrkE/yzthGmA9FTC9hLkYE/rzvrZt9IKowFw1mq5n/
2zFabXQBl5JBQ4hdL54f
=bTuf
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 4.11:
Headed to stable:
- disable HFSCR[TM] if TM is not supported, fixes a potential host
kernel crash triggered by a hostile guest, but only in
configurations that no one uses
- don't try to fix up misaligned load-with-reservation instructions
- fix flush_(d|i)cache_range() called from modules on little endian
kernels
- add missing global TLB invalidate if cxl is active
- fix missing preempt_disable() in crc32c-vpmsum
And a fix for selftests build changes that went in this release:
- selftests/powerpc: Fix standalone powerpc build
Thanks to: Benjamin Herrenschmidt, Frederic Barrat, Oliver O'Halloran,
Paul Mackerras"
* tag 'powerpc-4.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/crypto/crc32c-vpmsum: Fix missing preempt_disable()
powerpc/mm: Add missing global TLB invalidate if cxl is active
powerpc/64: Fix flush_(d|i)cache_range() called from modules
powerpc: Don't try to fix up misaligned load-with-reservation instructions
powerpc: Disable HFSCR[TM] if TM is not supported
selftests/powerpc: Fix standalone powerpc build
Introduce a new getsockopt operation to retrieve the socket cookie
for a specific socket based on the socket fd. It returns a unique
non-decreasing cookie for each socket.
Tested: https://android-review.googlesource.com/#/c/358163/
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Chenbo Feng <fengc@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Its value has never changed; we might as well make it part of the ABI instead
of using the return value of KVM_CHECK_EXTENSION(KVM_CAP_COALESCED_MMIO).
Because PPC does not always make MMIO available, the code has to be made
dependent on CONFIG_KVM_MMIO rather than KVM_COALESCED_MMIO_PAGE_OFFSET.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Remove code from architecture files that can be moved to virt/kvm, since there
is already common code for coalesced MMIO.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[Removed a pointless 'break' after 'return'.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
In crc32c_vpmsum() we call enable_kernel_altivec() without first
disabling preemption, which is not allowed:
WARNING: CPU: 9 PID: 2949 at ../arch/powerpc/kernel/process.c:277 enable_kernel_altivec+0x100/0x120
Modules linked in: dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c vmx_crypto ...
CPU: 9 PID: 2949 Comm: docker Not tainted 4.11.0-rc5-compiler_gcc-6.3.1-00033-g308ac7563944 #381
...
NIP [c00000000001e320] enable_kernel_altivec+0x100/0x120
LR [d000000003df0910] crc32c_vpmsum+0x108/0x150 [crc32c_vpmsum]
Call Trace:
0xc138fd09 (unreliable)
crc32c_vpmsum+0x108/0x150 [crc32c_vpmsum]
crc32c_vpmsum_update+0x3c/0x60 [crc32c_vpmsum]
crypto_shash_update+0x88/0x1c0
crc32c+0x64/0x90 [libcrc32c]
dm_bm_checksum+0x48/0x80 [dm_persistent_data]
sb_check+0x84/0x120 [dm_thin_pool]
dm_bm_validate_buffer.isra.0+0xc0/0x1b0 [dm_persistent_data]
dm_bm_read_lock+0x80/0xf0 [dm_persistent_data]
__create_persistent_data_objects+0x16c/0x810 [dm_thin_pool]
dm_pool_metadata_open+0xb0/0x1a0 [dm_thin_pool]
pool_ctr+0x4cc/0xb60 [dm_thin_pool]
dm_table_add_target+0x16c/0x3c0
table_load+0x184/0x400
ctl_ioctl+0x2f0/0x560
dm_ctl_ioctl+0x38/0x50
do_vfs_ioctl+0xd8/0x920
SyS_ioctl+0x68/0xc0
system_call+0x38/0xfc
It used to be sufficient just to call pagefault_disable(), because that
also disabled preemption. But the two were decoupled in commit 8222dbe21e79
("sched/preempt, mm/fault: Decouple preemption from the page fault
logic") in mid 2015.
So add the missing preempt_disable/enable(). We should also call
disable_kernel_fp(), although it does nothing by default, there is a
debug switch to make it active and all enables should be paired with
disables.
Fixes: 6dd7a82cc54e ("crypto: powerpc - Add POWER8 optimised crc32c")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Some powerpc platforms use this to move IRQs away from a CPU being
unplugged. This function has several bugs such as not taking the right
locks or failing to NULL check pointers.
There's a new generic function doing exactly the same thing without all
the bugs, so let's use it instead.
mpe: The obvious place for the select of GENERIC_IRQ_MIGRATION is on
HOTPLUG_CPU, but that doesn't work. On some configs PM_SLEEP_SMP will
select HOTPLUG_CPU even though its dependencies are not met, which means
the select of GENERIC_IRQ_MIGRATION doesn't happen. That leads to the
build breaking. Fix it by moving the select of GENERIC_IRQ_MIGRATION to
SMP.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Mostly simple cases of overlapping changes (adding code nearby,
a function whose name changes, for example).
Signed-off-by: David S. Miller <davem@davemloft.net>
Some platforms (will) need to perform allocations before bringing
a new CPU online. Doing it from smp_ops->setup_cpu is the wrong
thing to do:
- It has no useful failure path (too late)
- Calling any allocator will enable interrupts prematurely
causing problems with large decrementer among others
Instead, add a new callback that is called from __cpu_up (so from
the context trying to online the new CPU) at a point where we
can safely allocate and handle failures.
This will be used by XIVE support.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
kzalloc() won't actually fail because sizeof(*resize) is small, but
static checkers complain.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Commit 4c6d9acce1f4 ("powerpc/mm: Add hooks for cxl") converted local
TLB invalidates to global if the cxl driver is active. This is necessary
because the CAPP snoops invalidations to forward them to the PSL on the
cxl adapter. However one path was forgotten. native_flush_hash_range()
still does local TLB invalidates, as found out the hard way recently.
This patch fixes it by following the same logic as previously: if the
cxl driver is active, the local TLB invalidates are 'upgraded' to
global.
Fixes: 4c6d9acce1f4 ("powerpc/mm: Add hooks for cxl")
Cc: stable@vger.kernel.org # v3.18+
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>