23469 Commits

Author SHA1 Message Date
Michael Ellerman
ee1bc694fb powerpc/kvm: Fix build error when PPC_MEM_KEYS/PPC_PSERIES=n
lkp reported a randconfig failure:

     In file included from arch/powerpc/include/asm/book3s/64/pkeys.h:6,
                    from arch/powerpc/kvm/book3s_64_mmu_host.c:15:
     arch/powerpc/include/asm/book3s/64/hash-pkey.h: In function 'hash__vmflag_to_pte_pkey_bits':
  >> arch/powerpc/include/asm/book3s/64/hash-pkey.h:10:23: error: 'VM_PKEY_BIT0' undeclared
        10 |  return (((vm_flags & VM_PKEY_BIT0) ? H_PTE_PKEY_BIT0 : 0x0UL) |
         |                       ^~~~~~~~~~~~

We added the include of book3s/64/pkeys.h for pte_to_hpte_pkey_bits(),
but that header on its own should only be included when PPC_MEM_KEYS=y.
Instead include linux/pkeys.h, which brings in the right definitions
when PPC_MEM_KEYS=y and also provides empty stubs when PPC_MEM_KEYS=n.

Fixes: e4e8bc1df691 ("powerpc/kvm: Fix PR KVM with KUAP/MEM_KEYS enabled")
Cc: stable@vger.kernel.org # v5.11+
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210425115831.2818434-1-mpe@ellerman.id.au
2021-04-27 10:48:37 +10:00
Christophe Leroy
30c400886b powerpc/kasan: Fix shadow start address with modules
Modules are now located before kernel, KASAN area has to
be extended accordingly.

Fixes: 80edc68e0479 ("powerpc/32s: Define a MODULE area below kernel text all the time")
Fixes: 9132a2e82adc ("powerpc/8xx: Define a MODULE area below kernel text")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c68163065163f303f5af1e4bbdd9f1ce69f0543e.1619260465.git.christophe.leroy@csgroup.eu
2021-04-25 21:29:04 +10:00
Leonardo Bras
fc5590fd56 powerpc/kernel/iommu: Use largepool as a last resort when !largealloc
As of today, doing iommu_range_alloc() only for !largealloc (npages <= 15)
will only be able to use 3/4 of the available pages, given pages on
largepool  not being available for !largealloc.

This could mean some drivers not being able to fully use all the available
pages for the DMA window.

Add pages on largepool as a last resort for !largealloc, making all pages
of the DMA window available.

Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210318174414.684630-2-leobras.c@gmail.com
2021-04-23 12:54:58 +10:00
Leonardo Bras
3c0468d445 powerpc/kernel/iommu: Align size for IOMMU_PAGE_SIZE() to save TCEs
Currently both iommu_alloc_coherent() and iommu_free_coherent() align the
desired allocation size to PAGE_SIZE, and gets system pages and IOMMU
mappings (TCEs) for that value.

When IOMMU_PAGE_SIZE < PAGE_SIZE, this behavior may cause unnecessary
TCEs to be created for mapping the whole system page.

Example:
- PAGE_SIZE = 64k, IOMMU_PAGE_SIZE() = 4k
- iommu_alloc_coherent() is called for 128 bytes
- 1 system page (64k) is allocated
- 16 IOMMU pages (16 x 4k) are allocated (16 TCEs used)

It would be enough to use a single TCE for this, so 15 TCEs are
wasted in the process.

Update iommu_*_coherent() to make sure the size alignment happens only
for IOMMU_PAGE_SIZE() before calling iommu_alloc() and iommu_free().

Also, on iommu_range_alloc(), replace ALIGN(n, 1 << tbl->it_page_shift)
with IOMMU_PAGE_ALIGN(n, tbl), which is easier to read and does the
same.

Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210318174414.684630-1-leobras.c@gmail.com
2021-04-23 12:54:50 +10:00
Colin Ian King
ee6b25fa7c powerpc/44x: fix spelling mistake in Kconfig "varients" -> "variants"
There is a spelling mistake in the Kconfig help text. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201216113608.11812-1-colin.king@canonical.com
2021-04-23 01:38:04 +10:00
Alexey Kardashevskiy
cc7130bf11 powerpc/iommu: Annotate nested lock for lockdep
The IOMMU table is divided into pools for concurrent mappings and each
pool has a separate spinlock. When taking the ownership of an IOMMU group
to pass through a device to a VM, we lock these spinlocks which triggers
a false negative warning in lockdep (below).

This fixes it by annotating the large pool's spinlock as a nest lock
which makes lockdep not complaining when locking nested locks if
the nest lock is locked already.

===
WARNING: possible recursive locking detected
5.11.0-le_syzkaller_a+fstn1 #100 Not tainted
--------------------------------------------
qemu-system-ppc/4129 is trying to acquire lock:
c0000000119bddb0 (&(p->lock)/1){....}-{2:2}, at: iommu_take_ownership+0xac/0x1e0

but task is already holding lock:
c0000000119bdd30 (&(p->lock)/1){....}-{2:2}, at: iommu_take_ownership+0xac/0x1e0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(p->lock)/1);
  lock(&(p->lock)/1);
===

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210301063653.51003-1-aik@ozlabs.ru
2021-04-23 01:38:04 +10:00
Alexey Kardashevskiy
4be518d838 powerpc/iommu: Do not immediately panic when failed IOMMU table allocation
Most platforms allocate IOMMU table structures (specifically it_map)
at the boot time and when this fails - it is a valid reason for panic().

However the powernv platform allocates it_map after a device is returned
to the host OS after being passed through and this happens long after
the host OS booted. It is quite possible to trigger the it_map allocation
panic() and kill the host even though it is not necessary - the host OS
can still use the DMA bypass mode (requires a tiny fraction of it_map's
memory) and even if that fails, the host OS is runnnable as it was without
the device for which allocating it_map causes the panic.

Instead of immediately crashing in a powernv/ioda2 system, this prints
an error and continues. All other platforms still call panic().

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Leonardo Bras <leobras.c@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210216033307.69863-3-aik@ozlabs.ru
2021-04-23 01:38:04 +10:00
Alexey Kardashevskiy
7f1fa82d79 powerpc/iommu: Allocate it_map by vmalloc
The IOMMU table uses the it_map bitmap to keep track of allocated DMA
pages. This has always been a contiguous array allocated at either
the boot time or when a passed through device is returned to the host OS.
The it_map memory is allocated by alloc_pages() which allocates
contiguous physical memory.

Such allocation method occasionally creates a problem when there is
no big chunk of memory available (no free memory or too fragmented).
On powernv/ioda2 the default DMA window requires 16MB for it_map.

This replaces alloc_pages_node() with vzalloc_node() which allocates
contiguous block but in virtual memory. This should reduce changes of
failure but should not cause other behavioral changes as it_map is only
used by the kernel's DMA hooks/api when MMU is on.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210216033307.69863-2-aik@ozlabs.ru
2021-04-23 01:38:04 +10:00
Yang Li
caea7b833d powerpc/64s: remove unneeded semicolon
Eliminate the following coccicheck warning:
./arch/powerpc/platforms/powernv/setup.c:160:2-3: Unneeded semicolon

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1612236877-104974-1-git-send-email-yang.lee@linux.alibaba.com
2021-04-23 01:38:04 +10:00
Yang Li
f3d03fc748 powerpc/eeh: remove unneeded semicolon
Eliminate the following coccicheck warning:
./arch/powerpc/kernel/eeh.c:782:2-3: Unneeded semicolon

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1612236096-91154-1-git-send-email-yang.lee@linux.alibaba.com
2021-04-23 01:38:04 +10:00
Michael Ellerman
421a748387 powerpc/configs: Add IBMVNIC to some 64-bit configs
This is an IBM specific driver that we should enable to get some
build/boot testing.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210302020954.2980046-1-mpe@ellerman.id.au
2021-04-23 01:38:03 +10:00
Christophe Leroy
8a87a50771 powerpc/52xx: Fix an invalid ASM expression ('addi' used instead of 'add')
AS      arch/powerpc/platforms/52xx/lite5200_sleep.o
arch/powerpc/platforms/52xx/lite5200_sleep.S: Assembler messages:
arch/powerpc/platforms/52xx/lite5200_sleep.S:184: Warning: invalid register expression

In the following code, 'addi' is wrong, has to be 'add'

	/* local udelay in sram is needed */
  udelay: /* r11 - tb_ticks_per_usec, r12 - usecs, overwrites r13 */
	mullw	r12, r12, r11
	mftb	r13	/* start */
	addi	r12, r13, r12 /* end */

Fixes: ee983079ce04 ("[POWERPC] MPC5200 low power mode")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/cb4cec9131c8577803367f1699209a7e104cec2a.1619025821.git.christophe.leroy@csgroup.eu
2021-04-23 01:38:03 +10:00
Nicholas Piggin
0f197ddce4 powerpc/64s: Fix mm_cpumask memory ordering comment
The memory ordering comment no longer applies, because mm_ctx_id is
no longer used anywhere. At best always been difficult to follow.

It's better to consider the load on which the slbmte depends on, which
the MMU depends on before it can start loading TLBs, rather than a
store which may or may not have a subsequent dependency chain to the
slbmte.

So update the comment and we use the load of the mm's user context ID.
This is much more analogous the radix ordering too, which is good.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210421151733.212858-1-npiggin@gmail.com
2021-04-23 01:38:03 +10:00
Athira Rajeev
66d9b74928 powerpc/perf: Fix the threshold event selection for memory events in power10
Memory events (mem-loads and mem-stores) currently use the threshold
event selection as issue to finish. Power10 supports issue to complete
as part of thresholding which is more appropriate for mem-loads and
mem-stores. Hence fix the event code for memory events to use issue
to complete.

Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1614840015-1535-1-git-send-email-atrajeev@linux.vnet.ibm.com
2021-04-23 01:38:02 +10:00
Athira Rajeev
b4ded42268 powerpc/perf: Fix sampled instruction type for larx/stcx
Sampled Instruction Event Register (SIER) field [46:48] identifies the
sampled instruction type. ISA v3.1 says value of 0b111 for this field as
reserved, but in POWER10 it denotes LARX/STCX type which will hopefully
be fixed in ISA v3.1 update.

Patch fixes the functions to handle type value 7 for CPU_FTR_ARCH_31.

Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
[mpe: Avoid reading mmcra until necessary, use early return to deindent if block]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1614858937-1485-1-git-send-email-atrajeev@linux.vnet.ibm.com
2021-04-23 01:38:02 +10:00
Christophe Leroy
0bd3f9e953 powerpc/legacy_serial: Use early_ioremap()
[    0.000000] ioremap() called early from find_legacy_serial_ports+0x3cc/0x474. Use early_ioremap() instead

find_legacy_serial_ports() is called early from setup_arch(), before
paging_init(). vmalloc is not available yet, ioremap shouldn't be
used that early.

Use early_ioremap() and switch to a regular ioremap() later.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/103ed8ee9e5973c958ec1da2d0b0764f69395d01.1618925560.git.christophe.leroy@csgroup.eu
2021-04-22 20:59:15 +10:00
Christophe Leroy
9ccba66d4d powerpc/64: Fix the definition of the fixmap area
At the time being, the fixmap area is defined at the top of
the address space or just below KASAN.

This definition is not valid for PPC64.

For PPC64, use the top of the I/O space.

Because of circular dependencies, it is not possible to include
asm/fixmap.h in asm/book3s/64/pgtable.h , so define a fixed size
AREA at the top of the I/O space for fixmap and ensure during
build that the size is big enough.

Fixes: 265c3491c4bc ("powerpc: Add support for GENERIC_EARLY_IOREMAP")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0d51620eacf036d683d1a3c41328f69adb601dc0.1618925560.git.christophe.leroy@csgroup.eu
2021-04-22 20:59:15 +10:00
Randy Dunlap
389586333c powerpc: make ALTIVEC select PPC_FPU
On a kernel config with ALTIVEC=y and PPC_FPU not set/enabled,
there are build errors:

drivers/cpufreq/pmac32-cpufreq.c:262:2: error: implicit declaration of function 'enable_kernel_fp' [-Werror,-Wimplicit-function-declaration]
           enable_kernel_fp();
../arch/powerpc/lib/sstep.c: In function 'do_vec_load':
../arch/powerpc/lib/sstep.c:637:3: error: implicit declaration of function 'put_vr' [-Werror=implicit-function-declaration]
  637 |   put_vr(rn, &u.v);
      |   ^~~~~~
../arch/powerpc/lib/sstep.c: In function 'do_vec_store':
../arch/powerpc/lib/sstep.c:660:3: error: implicit declaration of function 'get_vr'; did you mean 'get_oc'? [-Werror=implicit-function-declaration]
  660 |   get_vr(rn, &u.v);
      |   ^~~~~~

In theory ALTIVEC is independent of PPC_FPU but in practice nobody
is going to build such a machine, so make ALTIVEC require PPC_FPU
by selecting it.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210421210647.20836-1-rdunlap@infradead.org
2021-04-22 20:59:15 +10:00
Michael Ellerman
7d94627657 powerpc/64s: Add FA_DUMP to defconfig
FA_DUMP (Firmware Assisted Dump) is a powerpc only feature that should
be enabled in our defconfig to get some build / test coverage.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210420042209.1641634-1-mpe@ellerman.id.au
2021-04-22 20:59:12 +10:00
Michael Ellerman
d936f8182e powerpc/powernv: Fix type of opal_mpipl_query_tag() addr argument
opal_mpipl_query_tag() takes a pointer to a 64-bit value, which firmware
writes a value to. As OPAL is traditionally big endian this value will
be big endian.

This can be confirmed by looking at the implementation in skiboot:

  static uint64_t opal_mpipl_query_tag(enum opal_mpipl_tags tag, __be64 *tag_val)
  {
  	...
  	*tag_val = cpu_to_be64(opal_mpipl_tags[tag]);
  	return OPAL_SUCCESS;
  }

Fix the declaration to annotate that the value is big endian.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210421125402.1955013-2-mpe@ellerman.id.au
2021-04-22 20:59:09 +10:00
Michael Ellerman
2e341f56a1 powerpc/fadump: Fix sparse warnings
Sparse says:
  arch/powerpc/kernel/fadump.c:48:16: warning: symbol 'fadump_kobj' was not declared. Should it be static?
  arch/powerpc/kernel/fadump.c:55:27: warning: symbol 'crash_mrange_info' was not declared. Should it be static?
  arch/powerpc/kernel/fadump.c:61:27: warning: symbol 'reserved_mrange_info' was not declared. Should it be static?
  arch/powerpc/kernel/fadump.c:83:12: warning: symbol 'fadump_cma_init' was not declared. Should it be static?

And indeed none of them are used outside this file, they can all be made
static. Also fadump_kobj needs to be moved inside the ifdef where it's
used.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210421125402.1955013-1-mpe@ellerman.id.au
2021-04-22 20:59:04 +10:00
Christophe Leroy
39352430aa powerpc: Move copy_inst_from_kernel_nofault()
When probe_kernel_read_inst() was created, there was no good place to
put it, so a file called lib/inst.c was dedicated for it.

Since then, probe_kernel_read_inst() has been renamed
copy_inst_from_kernel_nofault(). And mm/maccess.h didn't exist at that
time. Today, mm/maccess.h is related to copy_from_kernel_nofault().

Move copy_inst_from_kernel_nofault() into mm/maccess.c

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/9655d8957313906b77b8db5700a0e33ce06f45e5.1618405715.git.christophe.leroy@csgroup.eu
2021-04-21 22:52:34 +10:00
Christophe Leroy
41d6cf68b5 powerpc: Rename probe_kernel_read_inst()
When probe_kernel_read_inst() was created, it was to mimic
probe_kernel_read() function.

Since then, probe_kernel_read() has been renamed
copy_from_kernel_nofault().

Rename probe_kernel_read_inst() into copy_inst_from_kernel_nofault().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b783d1f7cdb8914992384a669a2af57051b6bdcf.1618405715.git.christophe.leroy@csgroup.eu
2021-04-21 22:52:33 +10:00
Christophe Leroy
6449078d50 powerpc: Make probe_kernel_read_inst() common to PPC32 and PPC64
We have two independant versions of probe_kernel_read_inst(), one for
PPC32 and one for PPC64.

The PPC32 is identical to the first part of the PPC64 version.
The remaining part of PPC64 version is not relevant for PPC32, but
not contradictory, so we can easily have a common function with
the PPC64 part opted out via a IS_ENABLED(CONFIG_PPC64).

The only need is to add a version of ppc_inst_prefix() for PPC32.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f7b9dfddef3b3760182c7e5466356c121a293dc9.1618405715.git.christophe.leroy@csgroup.eu
2021-04-21 22:52:33 +10:00
Christophe Leroy
6ac7897f08 powerpc: Remove probe_user_read_inst()
Its name comes from former probe_user_read() function.
That function is now called copy_from_user_nofault().

probe_user_read_inst() uses copy_from_user_nofault() to read only
a few bytes. It is suboptimal.

It does the same as get_user_inst() but in addition disables
page faults.

But on the other hand, it is not used for the time being. So remove it
for now. If one day it is really needed, we can give it a new name
more in line with today's naming, and implement it using get_user_inst()

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/5f6f82572242a59bfee1e19a71194d8f7ef5fca4.1618405715.git.christophe.leroy@csgroup.eu
2021-04-21 22:52:33 +10:00
Christophe Leroy
ee7c3ec3b4 powerpc/ebpf32: Use standard function call for functions within 32M distance
If the target of a function call is within 32 Mbytes distance, use a
standard function call with 'bl' instead of the 'lis/ori/mtlr/blrl'
sequence.

In the first pass, no memory has been allocated yet and the code
position is not known yet (image pointer is NULL). This pass is there
to calculate the amount of memory to allocate for the EBPF code, so
assume the 4 instructions sequence is required, so that enough memory
is allocated.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/74944a1e3e5cfecc141e440a6ccd37920e186b70.1618227846.git.christophe.leroy@csgroup.eu
2021-04-21 22:52:33 +10:00
Christophe Leroy
e7de0023e1 powerpc/ebpf32: Rework 64 bits shifts to avoid tests and branches
Re-implement BPF_ALU64 | BPF_{LSH/RSH/ARSH} | BPF_X with branchless
implementation copied from misc_32.S.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/03167350b05b2fe8b741e53363ee37709d0f878d.1618227846.git.christophe.leroy@csgroup.eu
2021-04-21 22:52:33 +10:00
Christophe Leroy
d228cc4969 powerpc/ebpf32: Fix comment on BPF_ALU{64} | BPF_LSH | BPF_K
Replace <<== by <<=

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/34d12a4f75cb8b53a925fada5e7ddddd3b145203.1618227846.git.christophe.leroy@csgroup.eu
2021-04-21 22:52:33 +10:00
Christophe Leroy
867e762480 powerpc/32: Use r2 in wrtspr() instead of r0
wrtspr() is a function to write an arbitrary value in a special
register. It is used on 8xx to write to SPRN_NRI, SPRN_EID and
SPRN_EIE. Writing any value to one of those will play with MSR EE
and MSR RI regardless of that value.

r0 is used many places in the generated code and using r0 for
that creates an unnecessary dependency of this instruction with
preceding ones using r0 in a few places in vmlinux.

r2 is most likely the most stable register as it contains the
pointer to 'current'.

Using r2 instead of r0 avoids that unnecessary dependency.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/69f9968f4b592fefda55227f0f7430ea612cc950.1611299687.git.christophe.leroy@csgroup.eu
2021-04-21 22:52:32 +10:00
Ganesh Goudar
92d9d61be5 powerpc/mce: save ignore_event flag unconditionally for UE
When we hit an UE while using machine check safe copy routines,
ignore_event flag is set and the event is ignored by mce handler,
And the flag is also saved for defered handling and printing of
mce event information, But as of now saving of this flag is done
on checking if the effective address is provided or physical address
is calculated, which is not right.

Save ignore_event flag regardless of whether the effective address is
provided or physical address is calculated.

Without this change following log is seen, when the event is to be
ignored.

[  512.971365] MCE: CPU1: machine check (Severe)  UE Load/Store [Recovered]
[  512.971509] MCE: CPU1: NIP: [c0000000000b67c0] memcpy+0x40/0x90
[  512.971655] MCE: CPU1: Initiator CPU
[  512.971739] MCE: CPU1: Unknown
[  512.972209] MCE: CPU1: machine check (Severe)  UE Load/Store [Recovered]
[  512.972334] MCE: CPU1: NIP: [c0000000000b6808] memcpy+0x88/0x90
[  512.972456] MCE: CPU1: Initiator CPU
[  512.972534] MCE: CPU1: Unknown

Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Reviewed-by: Santosh Sivaraj <santosh@fossix.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210407045816.352276-1-ganeshgr@linux.ibm.com
2021-04-21 22:52:32 +10:00
Christophe Leroy
eacf4c0202 powerpc: Enable OPTPROBES on PPC32
For that, create a 32 bits version of patch_imm64_load_insns()
and create a patch_imm_load_insns() which calls
patch_imm32_load_insns() on PPC32 and patch_imm64_load_insns()
on PPC64.

Adapt optprobes_head.S for PPC32. Use PPC_LL/PPC_STL macros instead
of raw ld/std, opt out things linked to paca and use stmw/lmw to
save/restore registers.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/bad58c66859b2a475c0ad516b53164ae3b4853cd.1618927318.git.christophe.leroy@csgroup.eu
2021-04-21 22:52:32 +10:00
Christophe Leroy
693557ebf4 powerpc/inst: ppc_inst_as_u64() becomes ppc_inst_as_ulong()
In order to simplify use on PPC32, change ppc_inst_as_u64()
into ppc_inst_as_ulong() that returns the 32 bits instruction
on PPC32.

Will be used when porting OPTPROBES to PPC32.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/22cadf29620664b600b82026d2a72b8b23351777.1618927318.git.christophe.leroy@csgroup.eu
2021-04-21 22:52:32 +10:00
Christophe Leroy
e522331173 powerpc/irq: Enhance readability of trap types
This patch makes use of trap types in irq.c

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f7f8c9f98c33eaea316755c7fef150d1d77e047d.1618847273.git.christophe.leroy@csgroup.eu
2021-04-21 22:52:32 +10:00
Christophe Leroy
7fab639729 powerpc/32s: Enhance readability of trap types
This patch makes use of trap types in head_book3s_32.S

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/bd80ace67757f489fc4ecdb76dd1a71511daba94.1618847273.git.christophe.leroy@csgroup.eu
2021-04-21 22:52:31 +10:00
Christophe Leroy
0f5eb28a6c powerpc/8xx: Enhance readability of trap types
This patch makes use of trap types in head_8xx.S

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e1147287bf6f2fb0693048fe8db0298c7870e419.1618847273.git.christophe.leroy@csgroup.eu
2021-04-21 22:52:31 +10:00
Leonardo Bras
a9d2f9bb22 powerpc/pseries/iommu: Fix window size for direct mapping with pmem
As of today, if the DDW is big enough to fit (1 << MAX_PHYSMEM_BITS)
it's possible to use direct DMA mapping even with pmem region.

But, if that happens, the window size (len) is set to (MAX_PHYSMEM_BITS
- page_shift) instead of MAX_PHYSMEM_BITS, causing a pagesize times
smaller DDW to be created, being insufficient for correct usage.

Fix this so the correct window size is used in this case.

Fixes: bf6e2d562bbc4 ("powerpc/dma: Fallback to dma_ops when persistent memory present")
Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210420045404.438735-1-leobras.c@gmail.com
2021-04-21 22:52:31 +10:00
Michael Ellerman
e4e8bc1df6 powerpc/kvm: Fix PR KVM with KUAP/MEM_KEYS enabled
The changes to add KUAP support with the hash MMU broke booting of KVM
PR guests. The symptom is no visible progress of the guest, or possibly
just "SLOF" being printed to the qemu console.

Host code is still executing, but breaking into xmon might show a stack
trace such as:

  __might_fault+0x84/0xe0 (unreliable)
  kvm_read_guest+0x1c8/0x2f0 [kvm]
  kvmppc_ld+0x1b8/0x2d0 [kvm]
  kvmppc_load_last_inst+0x50/0xa0 [kvm]
  kvmppc_exit_pr_progint+0x178/0x220 [kvm_pr]
  kvmppc_handle_exit_pr+0x31c/0xe30 [kvm_pr]
  after_sprg3_load+0x80/0x90 [kvm_pr]
  kvmppc_vcpu_run_pr+0x104/0x260 [kvm_pr]
  kvmppc_vcpu_run+0x34/0x48 [kvm]
  kvm_arch_vcpu_ioctl_run+0x340/0x450 [kvm]
  kvm_vcpu_ioctl+0x2ac/0x8c0 [kvm]
  sys_ioctl+0x320/0x1060
  system_call_exception+0x160/0x270
  system_call_common+0xf0/0x27c

Bisect points to commit b2ff33a10c8b ("powerpc/book3s64/hash/kuap:
Enable kuap on hash"), but that's just the commit that enabled KUAP with
hash and made the bug visible.

The root cause seems to be that KVM PR is creating kernel mappings that
don't use the correct key, since we switched to using key 3.

We have a helper for adding the right key value, however it's designed
to take a pteflags variable, which the KVM code doesn't have. But we can
make it work by passing 0 for the pteflags, and tell it explicitly that
it should use the kernel key.

With that changed guests boot successfully.

Fixes: d94b827e89dc ("powerpc/book3s64/kuap: Use Key 3 for kernel mapping with hash translation")
Cc: stable@vger.kernel.org # v5.11+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210419120139.1455937-1-mpe@ellerman.id.au
2021-04-20 14:22:24 +10:00
Michael Ellerman
ed8029d7b4 powerpc/pseries: Stop calling printk in rtas_stop_self()
RCU complains about us calling printk() from an offline CPU:

  =============================
  WARNING: suspicious RCU usage
  5.12.0-rc7-02874-g7cf90e481cb8 #1 Not tainted
  -----------------------------
  kernel/locking/lockdep.c:3568 RCU-list traversed in non-reader section!!

  other info that might help us debug this:

  RCU used illegally from offline CPU!
  rcu_scheduler_active = 2, debug_locks = 1
  no locks held by swapper/0/0.

  stack backtrace:
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc7-02874-g7cf90e481cb8 #1
  Call Trace:
    dump_stack+0xec/0x144 (unreliable)
    lockdep_rcu_suspicious+0x124/0x144
    __lock_acquire+0x1098/0x28b0
    lock_acquire+0x128/0x600
    _raw_spin_lock_irqsave+0x6c/0xc0
    down_trylock+0x2c/0x70
    __down_trylock_console_sem+0x60/0x140
    vprintk_emit+0x1a8/0x4b0
    vprintk_func+0xcc/0x200
    printk+0x40/0x54
    pseries_cpu_offline_self+0xc0/0x120
    arch_cpu_idle_dead+0x54/0x70
    do_idle+0x174/0x4a0
    cpu_startup_entry+0x38/0x40
    rest_init+0x268/0x388
    start_kernel+0x748/0x790
    start_here_common+0x1c/0x614

Which happens because by the time we get to rtas_stop_self() we are
already offline. In addition the message can be spammy, and is not that
helpful for users, so remove it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210418135413.1204031-1-mpe@ellerman.id.au
2021-04-20 14:22:24 +10:00
Michael Ellerman
3027a37c06 powerpc: Only define _TASK_CPU for 32-bit
We have some interesting code in our Makefile to define _TASK_CPU, based
on awk'ing the value out of asm-offsets.h. It exists to circumvent some
circular header dependencies that prevent us from referring to
task_struct in the relevant code. See the comment around _TASK_CPU in
smp.h for more detail.

Maybe one day we can come up with a better solution, but for now we can
at least limit that logic to 32-bit, because it's not needed for 64-bit.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210418131641.1186227-1-mpe@ellerman.id.au
2021-04-20 14:22:24 +10:00
Tyrel Datwyler
39d0099f94 powerpc/pseries: Add shutdown() to vio_driver and vio_bus
Currently, neither the vio_bus or vio_driver structures provide support
for a shutdown() routine.

Add support for shutdown() by allowing drivers to provide a
implementation via function pointer in their vio_driver struct and
provide a proper implementation in the driver template for the vio_bus
that calls a vio drivers shutdown() if defined.

In the case that no shutdown() is defined by a vio driver and a kexec is
in progress we implement a big hammer that calls remove() to ensure no
further DMA for the devices is possible.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210402001325.939668-1-tyreld@linux.ibm.com
2021-04-20 14:22:24 +10:00
Athira Rajeev
af31fd0c91 powerpc/perf: Expose processor pipeline stage cycles using PERF_SAMPLE_WEIGHT_STRUCT
Performance Monitoring Unit (PMU) registers in powerpc provides
information on cycles elapsed between different stages in the
pipeline. This can be used for application tuning. On ISA v3.1
platform, this information is exposed by sampling registers.
Patch adds kernel support to capture two of the cycle counters
as part of perf sample using the sample type:
PERF_SAMPLE_WEIGHT_STRUCT.

The power PMU function 'get_mem_weight' currently uses 64 bit weight
field of perf_sample_data to capture memory latency. But following the
introduction of PERF_SAMPLE_WEIGHT_TYPE, weight field could contain
64-bit or 32-bit value depending on the architexture support for
PERF_SAMPLE_WEIGHT_STRUCT. Patches uses WEIGHT_STRUCT to expose the
pipeline stage cycles info. Hence update the ppmu functions to work for
64-bit and 32-bit weight values.

If the sample type is PERF_SAMPLE_WEIGHT, use the 64-bit weight field.
if the sample type is PERF_SAMPLE_WEIGHT_STRUCT, memory subsystem
latency is stored in the low 32bits of perf_sample_weight structure.
Also for CPU_FTR_ARCH_31, capture the two cycle counter information in
two 16 bit fields of perf_sample_weight structure.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1616425047-1666-2-git-send-email-atrajeev@linux.vnet.ibm.com
2021-04-20 14:22:23 +10:00
Daniel Henrique Barboza
29c9a2699e powerpc/pseries: Set UNISOLATE on dlpar_cpu_remove() failure
The RTAS set-indicator call, when attempting to UNISOLATE a DRC that is
already UNISOLATED or CONFIGURED, returns RTAS_OK and does nothing else
for both QEMU and phyp. This gives us an opportunity to use this
behavior to signal the hypervisor layer when an error during device
removal happens, allowing it to do a proper error handling, while not
breaking QEMU/phyp implementations that don't have this support.

This patch introduces this idea by unisolating all CPU DRCs that failed
to be removed by dlpar_cpu_remove_by_index(), when handling the
PSERIES_HP_ELOG_ID_DRC_INDEX event. This is being done for this event
only because its the only CPU removal event QEMU uses, and there's no
need at this moment to add this mechanism for phyp only code.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210416210216.380291-3-danielhb413@gmail.com
2021-04-20 14:22:23 +10:00
Daniel Henrique Barboza
0e3b3ff83c powerpc/pseries: Introduce dlpar_unisolate_drc()
Next patch will execute a set-indicator call in hotplug-cpu.c.

Create a dlpar_unisolate_drc() helper to avoid spreading more
rtas_set_indicator() calls outside of dlpar.c.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210416210216.380291-2-danielhb413@gmail.com
2021-04-20 14:22:23 +10:00
Ganesh Goudar
864ec4d40c powerpc/pseries/mce: Fix a typo in error type assignment
The error type is ICACHE not DCACHE, for case MCE_ERROR_TYPE_ICACHE.

Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210416125750.49550-1-ganeshgr@linux.ibm.com
2021-04-20 14:22:23 +10:00
Michael Ellerman
cbd3d5ba46 powerpc/fadump: Fix compile error since trap type change
sfr reports that the allyesconfig build fails with:

  arch/powerpc/kernel/fadump.c: In function 'crash_fadump':
  arch/powerpc/kernel/fadump.c:731:28: error: 'INTERRUPT_SYSTEM_RESET' undeclared
    731 |  if (TRAP(&(fdh->regs)) == INTERRUPT_SYSTEM_RESET) {

Add an include of interrupt.h to fix it.

Fixes: 7153d4bf0b37 ("powerpc/traps: Enhance readability for trap types")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
[mpe: Reformat change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210419191425.281dc58a@canb.auug.org.au
2021-04-19 22:35:40 +10:00
Madhavan Srinivasan
d8a1d6c589 powerpc/perf: Add platform specific check_attr_config
Add platform specific attr.config value checks. Patch
includes checks for both power9 and power10.

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210408074504.248211-2-maddy@linux.ibm.com
2021-04-19 12:22:09 +10:00
Michael Ellerman
a38cb41719 Merge branch 'topic/ppc-kvm' into next
Merge some powerpc KVM patches we are keeping in a topic branch just in
case anyone else needs to merge them.
2021-04-18 23:55:12 +10:00
Nicholas Piggin
49c1d07fd0 powerpc/powernv: Enable HAIL (HV AIL) for ISA v3.1 processors
Starting with ISA v3.1, LPCR[AIL] no longer controls the interrupt
mode for HV=1 interrupts. Instead, a new LPCR[HAIL] bit is defined
which behaves like AIL=3 for HV interrupts when set.

Set HAIL on bare metal to give us mmu-on interrupts and improve
performance.

This also fixes an scv bug: we don't implement scv real mode (AIL=0)
vectors because they are at an inconvenient location, so we just
disable scv support when AIL can not be set. However powernv assumes
that LPCR[AIL] will enable AIL mode so it enables scv support despite
HV interrupts being AIL=0, which causes scv interrupts to go off into
the weeds.

Fixes: 7fa95f9adaee ("powerpc/64s: system call support for scv/rfscv instructions")
Cc: stable@vger.kernel.org # v5.9+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210402024124.545826-1-npiggin@gmail.com
2021-04-18 23:19:29 +10:00
Srikar Dronamraju
6980d13f0d powerpc/smp: Set numa node before updating mask
Geethika reported a trace when doing a dlpar CPU add.

------------[ cut here ]------------
WARNING: CPU: 152 PID: 1134 at kernel/sched/topology.c:2057
CPU: 152 PID: 1134 Comm: kworker/152:1 Not tainted 5.12.0-rc5-master #5
Workqueue: events cpuset_hotplug_workfn
NIP:  c0000000001cfc14 LR: c0000000001cfc10 CTR: c0000000007e3420
REGS: c0000034a08eb260 TRAP: 0700   Not tainted  (5.12.0-rc5-master+)
MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28828422  XER: 00000020
CFAR: c0000000001fd888 IRQMASK: 0 #012GPR00: c0000000001cfc10
c0000034a08eb500 c000000001f35400 0000000000000027 #012GPR04:
c0000035abaa8010 c0000035abb30a00 0000000000000027 c0000035abaa8018
#012GPR08: 0000000000000023 c0000035abaaef48 00000035aa540000
c0000035a49dffe8 #012GPR12: 0000000028828424 c0000035bf1a1c80
0000000000000497 0000000000000004 #012GPR16: c00000000347a258
0000000000000140 c00000000203d468 c000000001a1a490 #012GPR20:
c000000001f9c160 c0000034adf70920 c0000034aec9fd20 0000000100087bd3
#012GPR24: 0000000100087bd3 c0000035b3de09f8 0000000000000030
c0000035b3de09f8 #012GPR28: 0000000000000028 c00000000347a280
c0000034aefe0b00 c0000000010a2a68
NIP [c0000000001cfc14] build_sched_domains+0x6a4/0x1500
LR [c0000000001cfc10] build_sched_domains+0x6a0/0x1500
Call Trace:
[c0000034a08eb500] [c0000000001cfc10] build_sched_domains+0x6a0/0x1500 (unreliable)
[c0000034a08eb640] [c0000000001d1e6c] partition_sched_domains_locked+0x3ec/0x530
[c0000034a08eb6e0] [c0000000002936d4] rebuild_sched_domains_locked+0x524/0xbf0
[c0000034a08eb7e0] [c000000000296bb0] rebuild_sched_domains+0x40/0x70
[c0000034a08eb810] [c000000000296e74] cpuset_hotplug_workfn+0x294/0xe20
[c0000034a08ebc30] [c000000000178dd0] process_one_work+0x300/0x670
[c0000034a08ebd10] [c0000000001791b8] worker_thread+0x78/0x520
[c0000034a08ebda0] [c000000000185090] kthread+0x1a0/0x1b0
[c0000034a08ebe10] [c00000000000ccec] ret_from_kernel_thread+0x5c/0x70
Instruction dump:
7d2903a6 4e800421 e8410018 7f67db78 7fe6fb78 7f45d378 7f84e378 7c681b78
3c62ff1a 3863c6f8 4802dc35 60000000 <0fe00000> 3920fff4 f9210070 e86100a0
---[ end trace 532d9066d3d4d7ec ]---

Some of the per-CPU masks use cpu_cpu_mask as a filter to limit the search
for related CPUs. On a dlpar add of a CPU, update cpu_cpu_mask before
updating the per-CPU masks. This will ensure the cpu_cpu_mask is updated
correctly before its used in setting the masks. Setting the numa_node will
ensure that when cpu_cpu_mask() gets called, the correct node number is
used. This code movement helped fix the above call trace.

Reported-by: Geetika Moolchandani <Geetika.Moolchandani1@ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210401154200.150077-1-srikar@linux.vnet.ibm.com
2021-04-17 22:46:31 +10:00
Xiongwei Song
7153d4bf0b powerpc/traps: Enhance readability for trap types
Define macros to list ppc interrupt types in interttupt.h, replace the
reference of the trap hex values with these macros.

Referred the hex numbers in arch/powerpc/kernel/exceptions-64e.S,
arch/powerpc/kernel/exceptions-64s.S, arch/powerpc/kernel/head_*.S,
arch/powerpc/kernel/head_booke.h and arch/powerpc/include/asm/kvm_asm.h.

Signed-off-by: Xiongwei Song <sxwjean@gmail.com>
[mpe: Resolve conflicts in nmi_disables_ftrace(), fix 40x build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1618398033-13025-1-git-send-email-sxwjean@me.com
2021-04-17 22:20:19 +10:00