22389 Commits

Author SHA1 Message Date
Michael Ellerman
b91eb51824 powerpc/64s: Fix crash in load_fp_state() due to fpexc_mode
The recent commit 01eb01877f33 ("powerpc/64s: Fix restore_math
unnecessarily changing MSR") changed some of the handling of floating
point/vector restore.

In particular it caused current->thread.fpexc_mode to be copied into
the current MSR (via msr_check_and_set()), rather than just into
regs->msr (which is moved into MSR on return to userspace).

This can lead to a crash in the kernel if we take a floating point
exception when restoring FPSCR:

  Oops: Exception in kernel mode, sig: 8 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
  Modules linked in:
  CPU: 3 PID: 101213 Comm: ld64.so.2 Not tainted 5.9.0-rc1-00098-g18445bf405cb-dirty #9
  NIP:  c00000000000fbb4 LR: c00000000001a7ac CTR: c000000000183570
  REGS: c0000016b7cfb3b0 TRAP: 0700   Not tainted  (5.9.0-rc1-00098-g18445bf405cb-dirty)
  MSR:  900000000290b933 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44002444  XER: 00000000
  CFAR: c00000000001a7a8 IRQMASK: 1
  GPR00: c00000000001ae40 c0000016b7cfb640 c0000000011b7f00 c000001542a0f740
  GPR04: c000001542a0f720 c000001542a0eb00 0000000000000900 c000001542a0eb00
  GPR08: 000000000000000a 0000000000002000 9000000000009033 0000000000000000
  GPR12: 0000000000004000 c0000017ffffd900 0000000000000001 c000000000df5a58
  GPR16: c000000000e19c18 c0000000010e1123 0000000000000001 c000000000e1a638
  GPR20: 0000000000000000 c0000000044b1d00 0000000000000000 c000001542a0f2a0
  GPR24: 00000016c7fe0000 c000001542a0f720 c000000001c93da0 c000000000fe5f28
  GPR28: c000001542a0f720 0000000000800000 c0000016b7cfbe90 0000000002802900
  NIP load_fp_state+0x4/0x214
  LR  restore_math+0x17c/0x1f0
  Call Trace:
    0xc0000016b7cfb680 (unreliable)
    __switch_to+0x330/0x460
    __schedule+0x318/0x920
    schedule+0x74/0x140
    schedule_timeout+0x318/0x3f0
    wait_for_completion+0xc8/0x210
    call_usermodehelper_exec+0x234/0x280
    do_coredump+0xedc/0x13c0
    get_signal+0x1d4/0xbe0
    do_notify_resume+0x1a0/0x490
    interrupt_exit_user_prepare+0x1c4/0x230
    interrupt_return+0x14/0x1c0
  Instruction dump:
  ebe10168 e88101a0 7c8ff120 382101e0 e8010010 7c0803a6 4e800020 790605c4
  782905c4 7c0008a8 7c0008a8 c8030200 <fffe058e> 48000088 c8030000 c8230010

Fix it by only loading the fpexc_mode value into regs->msr.

Also add a comment to explain that although VSX is subject to the
value of fpexc_mode, we don't have to handle that separately because
we only allow VSX to be enabled if FP is also enabled.

Fixes: 01eb01877f33 ("powerpc/64s: Fix restore_math unnecessarily changing MSR")
Reported-by: Milton Miller <miltonm@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Link: https://lore.kernel.org/r/20200825093424.3967813-1-mpe@ellerman.id.au
2020-08-27 17:41:39 +10:00
Nicholas Piggin
e5fe56092e powerpc/64s: scv entry should set PPR
Kernel entry sets PPR to HMT_MEDIUM by convention. The scv entry
path missed this.

Fixes: 7fa95f9adaee ("powerpc/64s: system call support for scv/rfscv instructions")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200825075309.224184-1-npiggin@gmail.com
2020-08-27 17:41:13 +10:00
Oliver O'Halloran
3ced132a05 powerpc/nx: Don't pack struct coprocessor_request_block
Building with W=1 results in the following warning:

In file included from arch/powerpc/platforms/powernv/vas-fault.c:16:
./arch/powerpc/include/asm/icswx.h:159:1: error: alignment 1 of ‘struct
	coprocessor_request_block’ is less than 16 [-Werror=packed-not-aligned]
  159 | } __packed;
      | ^
./arch/powerpc/include/asm/icswx.h:159:1: error: alignment 1 of ‘struct
	coprocessor_request_block’ is less than 16 [-Werror=packed-not-aligned]
./arch/powerpc/include/asm/icswx.h:159:1: error: alignment 1 of ‘struct
	coprocessor_request_block’ is less than 16 [-Werror=packed-not-aligned]
./arch/powerpc/include/asm/icswx.h:159:1: error: alignment 1 of ‘struct
	coprocessor_request_block’ is less than 16 [-Werror=packed-not-aligned]
cc1: all warnings being treated as errors

This happens because coprocessor_request_block includes several
sub-structures with an alignment specified using the __aligned(XX)
attribute. The problem comes from coprocessor_request_block having the
__packed attribute. Packing the structure causes the preferred alignment of
the nested structures to be ignored and we get the warnings as a result.

This isn't a problem in practice since the struct is defined with explicit
padding in the form of reserved fields, but we'd like to get rid of the
spurious warnings. The simplest solution is to remove the packed attribute
and use a BUILD_BUG_ON() to ensure the struct is the correct (expected by
HW) size compile time.

Also add a __aligned(128) to the request block structure since Book4 for P8
suggests the HW requires it to be aligned to a 128 byte boundary. There's a
similar requirement for P9 since the COPY and PASTE instructions used to
invoke VAS/NX accelerators operates on a cache line boundary.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200804005410.146094-7-oohall@gmail.com
2020-08-25 01:31:33 +10:00
Oliver O'Halloran
fb248c3121 powerpc/powernv: Fix spurious kerneldoc warnings in opal-prd.c
Comments opening with /** are parsed by kerneldoc and this causes the
following warning to be printed:

	arch/powerpc/platforms/powernv/opal-prd.c:31: warning: cannot understand
	function prototype: 'struct opal_prd_msg_queue_item '

opal_prd_mesg_queue_item is an internal data structure so there's no real
need for it to be documented at all. Fix up the comment to squash the
warning.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200804005410.146094-5-oohall@gmail.com
2020-08-25 01:31:33 +10:00
Oliver O'Halloran
3b70464aa7 powerpc/powernv: Staticify functions without prototypes
There's a few scattered in the powernv platform.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200804005410.146094-4-oohall@gmail.com
2020-08-25 01:31:33 +10:00
Oliver O'Halloran
8471c1dd93 powerpc/powernv: Include asm/powernv.h from the local powernv.h
The asm/powernv.h header provides prototypes for functions which need to be
called by non-powernv platform code. Also include it in the powernv.h
that's local to the platform directory to squash some warnings about
non-static functions missing prototypes.

Also include powernv.h since from opal-memcons.c since it has the
prototypes for the memcons wrangling functions which are used for the opal
and ultravisor msglog.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200804005410.146094-3-oohall@gmail.com
2020-08-25 01:31:33 +10:00
Oliver O'Halloran
f6bac19cf6 powerpc/powernv/smp: Fix spurious DBG() warning
When building with W=1 we get the following warning:

 arch/powerpc/platforms/powernv/smp.c: In function ‘pnv_smp_cpu_kill_self’:
 arch/powerpc/platforms/powernv/smp.c:276:16: error: suggest braces around
 	empty body in an ‘if’ statement [-Werror=empty-body]
   276 |      cpu, srr1);
       |                ^
 cc1: all warnings being treated as errors

The full context is this block:

 if (srr1 && !generic_check_cpu_restart(cpu))
 	DBG("CPU%d Unexpected exit while offline srr1=%lx!\n",
 			cpu, srr1);

When building with DEBUG undefined DBG() expands to nothing and GCC emits
the warning due to the lack of braces around an empty statement.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200804005410.146094-2-oohall@gmail.com
2020-08-25 01:31:33 +10:00
Colin Ian King
346427e668 powerpc/oprofile: fix spelling mistake "contex" -> "context"
There is a spelling mistake in a pr_debug message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200804174316.402425-1-colin.king@canonical.com
2020-08-25 01:31:33 +10:00
Aneesh Kumar K.V
1c0a7ac0ec powerpc/vmemmap: Don't warn if we don't find a mapping vmemmap list entry
Now that we are handling vmemmap list allocation failure correctly, don't
WARN in section deactivate when we don't find a mapping vmemmap list entry.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200731113500.248306-2-aneesh.kumar@linux.ibm.com
2020-08-25 01:31:33 +10:00
Aneesh Kumar K.V
ccaea15296 powerpc/vmemmap: Fix memory leak with vmemmap list allocation failures.
If we fail to allocate vmemmap list, we don't keep track of allocated
vmemmap block buf. Hence on section deactivate we skip vmemmap block
buf free. This results in memory leak.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200731113500.248306-1-aneesh.kumar@linux.ibm.com
2020-08-25 01:31:32 +10:00
zhengbin
18102e4bcc powerpc/powernv: Remove set but not used variable 'parent'
Fix gcc '-Wunused-but-set-variable' warning:

arch/powerpc/platforms/powernv/pci-ioda.c: In function pnv_ioda_configure_pe:
arch/powerpc/platforms/powernv/pci-ioda.c:867:18: warning: variable parent set but not used [-Wunused-but-set-variable]

It is not used since commit b131a8425c34 ("powerpc/powernv:
Set PELTV for compound PEs")

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1574144074-142032-6-git-send-email-zhengbin13@huawei.com
2020-08-25 01:31:32 +10:00
zhengbin
ef23cf9a89 powerpc/perf: Remove set but not used variable 'target'
Fix gcc '-Wunused-but-set-variable' warning:

arch/powerpc/perf/imc-pmu.c: In function trace_imc_event_init:
arch/powerpc/perf/imc-pmu.c:1292:22: warning: variable target set but not used [-Wunused-but-set-variable]

It is introduced by commit 012ae244845f ("powerpc/perf:
Trace imc PMU functions"), but never used, so remove it.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1574144074-142032-3-git-send-email-zhengbin13@huawei.com
2020-08-25 01:31:32 +10:00
zhengbin
738e6cad0a powerpc/fadump: Remove set but not used variable 'elf'
Fix gcc '-Wunused-but-set-variable' warning:

arch/powerpc/kernel/fadump.c: In function fadump_update_elfcore_header:
arch/powerpc/kernel/fadump.c:790:17: warning: variable elf set but not used [-Wunused-but-set-variable]

It is introduced by commit ebaeb5ae2437 ("fadump:
Convert firmware-assisted cpu state dump data into elf notes."),
but never used, so remove it.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1574144074-142032-2-git-send-email-zhengbin13@huawei.com
2020-08-25 01:31:32 +10:00
Biwen Li
843dc8ee23 powerc/dtc/t1024rdb: remove interrupts property
Since the interrupt pin for RTC DS1339 is not connected
to the CPU on T1024RDB, remove the interrupt property
from the device tree.

This also fix the following warning for hwclock.util-linux:
$ hwclock.util-linux
hwclock.util-linux: select() to /dev/rtc0
to wait for clock tick timed out

Signed-off-by: Biwen Li <biwen.li@nxp.com>
Acked-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200527034228.23793-2-biwen.li@oss.nxp.com
2020-08-25 01:31:32 +10:00
Biwen Li
8c7614d648 powerpc/dts/t4240rdb: remove interrupts property
Since the interrupt pin for RTC DS1374 is not connected
to the CPU on T4240RDB, remove the interrupt property
from the device tree.

This also fix the following warning for hwclock.util-linux:
$ hwclock.util-linux
hwclock.util-linux: select() to /dev/rtc0
to wait for clock tick timed out

Signed-off-by: Biwen Li <biwen.li@nxp.com>
Acked-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200527034228.23793-1-biwen.li@oss.nxp.com
2020-08-25 01:31:32 +10:00
Frederic Barrat
374f6178f3 ocxl: Remove custom service to allocate interrupts
We now allocate interrupts through xive directly.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200403153838.29224-5-fbarrat@linux.ibm.com
2020-08-25 01:31:31 +10:00
Nicholas Mc Guire
d3e669f31e powerpc/icp-hv: Fix missing of_node_put() in success path
Both of_find_compatible_node() and of_find_node_by_type() will return
a refcounted node on success - thus for the success path the node must
be explicitly released with a of_node_put().

Fixes: 0b05ac6e2480 ("powerpc/xics: Rewrite XICS driver")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1530691407-3991-1-git-send-email-hofrat@osadl.org
2020-08-25 01:31:31 +10:00
Nicholas Mc Guire
67c3e59443 powerpc/pseries: Fix missing of_node_put() in rng_init()
The call to of_find_compatible_node() returns a node pointer with
refcount incremented thus it must be explicitly decremented here
before returning.

Fixes: a489043f4626 ("powerpc/pseries: Implement arch_get_random_long() based on H_RANDOM")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1530522496-14816-1-git-send-email-hofrat@osadl.org
2020-08-25 01:31:31 +10:00
Shawn Anastasio
9b725a90a8 powerpc/64s: Disallow PROT_SAO in LPARs by default
Since migration of guests using SAO to ISA 3.1 hosts may cause issues,
disable PROT_SAO in LPARs by default and introduce a new Kconfig option
PPC_PROT_SAO_LPAR to allow users to enable it if desired.

Signed-off-by: Shawn Anastasio <shawn@anastas.io>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200821185558.35561-3-shawn@anastas.io
2020-08-24 14:12:54 +10:00
Shawn Anastasio
12564485ed Revert "powerpc/64s: Remove PROT_SAO support"
This reverts commit 5c9fa16e8abd342ce04dc830c1ebb2a03abf6c05.

Since PROT_SAO can still be useful for certain classes of software,
reintroduce it. Concerns about guest migration for LPARs using SAO
will be addressed next.

Signed-off-by: Shawn Anastasio <shawn@anastas.io>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200821185558.35561-2-shawn@anastas.io
2020-08-24 14:12:53 +10:00
Linus Torvalds
cb95712138 powerpc fixes for 5.9 #3
Add perf support for emitting extended registers for power10.
 
 A fix for CPU hotplug on pseries, where on large/loaded systems we may not wait
 long enough for the CPU to be offlined, leading to crashes.
 
 Addition of a raw cputable entry for Power10, which is not required to boot, but
 is required to make our PMU setup work correctly in guests.
 
 Three fixes for the recent changes on 32-bit Book3S to move modules into their
 own segment for strict RWX.
 
 A fix for a recent change in our powernv PCI code that could lead to crashes.
 
 A change to our perf interrupt accounting to avoid soft lockups when using some
 events, found by syzkaller.
 
 A change in the way we handle power loss events from the hypervisor on pseries.
 We no longer immediately shut down if we're told we're running on a UPS.
 
 A few other minor fixes.
 
 Thanks to:
   Alexey Kardashevskiy, Andreas Schwab, Aneesh Kumar K.V, Anju T Sudhakar,
   Athira Rajeev, Christophe Leroy, Frederic Barrat, Greg Kurz, Kajol Jain,
   Madhavan Srinivasan, Michael Neuling, Michael Roth, Nageswara R Sastry, Oliver
   O'Halloran, Thiago Jung Bauermann, Vaidyanathan Srinivasan, Vasant Hegde.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl9CYMwTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgC/wEACljEVnfHzUObmIgqn9Ru3JlfEI6Hlk
 ts7kajCgS/I/bV6DoDMZ8rlZX87QFOwiBkNM1I+vGHSLAuzsmFAnbFPyxw/idxpQ
 XUoNy8OCvbbzCPzChYdiU0PxW2h2i+QxkmktlWSN1SAPudJUWvoPS2Y4+sC4zksk
 B4B6tbW2DT8TFO1kKeZsU9r2t+EH5KwlIOi+uxbH8d76lJINKkBNSnjzMytl7drM
 TZx/HWr8+s/WJo1787x6bv8gxs5tV9b4vIKt2YZNTY2kvYsEDE+fBR1XfCAneXMw
 ASYnZV+/xCLIUpRF6DI4RAShLBT/Sfiy1yMTndZgfqAgquokFosszNx2zrk0IzCd
 AgqX93YGbGz/H72W3Y/B0W9+74XyO/u2D9zhNpkCRMpdcsM5MbvOQrQA5Ustu47E
 av5MOaF/nNCd8J+OC4Qjgt5VFb/s0h4FdtrwT80srOa2U6Of9cD/T6xAfOszSJ96
 cWdSb5qhn5wuD9pP32KjwdmWBiUw38/gnRGKpRlOVzyHL/GKZijyaBbWBlkoEmty
 0nbjWW/IVfsOb5Weuiybg541h/QOVuOkb2pOvPClITiH83MY/AciDJ+auo4M//hW
 haKz9IgV/KctmzDE+v9d0BD8sGmW03YUcQAPdRufI0eGXijDLcnHeuk2B3Nu84Pq
 8mtev+VQ+T6cZA==
 =sdJ1
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Add perf support for emitting extended registers for power10.

 - A fix for CPU hotplug on pseries, where on large/loaded systems we
   may not wait long enough for the CPU to be offlined, leading to
   crashes.

 - Addition of a raw cputable entry for Power10, which is not required
   to boot, but is required to make our PMU setup work correctly in
   guests.

 - Three fixes for the recent changes on 32-bit Book3S to move modules
   into their own segment for strict RWX.

 - A fix for a recent change in our powernv PCI code that could lead to
   crashes.

 - A change to our perf interrupt accounting to avoid soft lockups when
   using some events, found by syzkaller.

 - A change in the way we handle power loss events from the hypervisor
   on pseries. We no longer immediately shut down if we're told we're
   running on a UPS.

 - A few other minor fixes.

Thanks to Alexey Kardashevskiy, Andreas Schwab, Aneesh Kumar K.V, Anju T
Sudhakar, Athira Rajeev, Christophe Leroy, Frederic Barrat, Greg Kurz,
Kajol Jain, Madhavan Srinivasan, Michael Neuling, Michael Roth,
Nageswara R Sastry, Oliver O'Halloran, Thiago Jung Bauermann,
Vaidyanathan Srinivasan, Vasant Hegde.

* tag 'powerpc-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/perf/hv-24x7: Move cpumask file to top folder of hv-24x7 driver
  powerpc/32s: Fix module loading failure when VMALLOC_END is over 0xf0000000
  powerpc/pseries: Do not initiate shutdown when system is running on UPS
  powerpc/perf: Fix soft lockups due to missed interrupt accounting
  powerpc/powernv/pci: Fix possible crash when releasing DMA resources
  powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death
  powerpc/32s: Fix is_module_segment() when MODULES_VADDR is defined
  powerpc/kasan: Fix KASAN_SHADOW_START on BOOK3S_32
  powerpc/fixmap: Fix the size of the early debug area
  powerpc/pkeys: Fix build error with PPC_MEM_KEYS disabled
  powerpc/kernel: Cleanup machine check function declarations
  powerpc: Add POWER10 raw mode cputable entry
  powerpc/perf: Add extended regs support for power10 platform
  powerpc/perf: Add support for outputting extended regs in perf intr_regs
  powerpc: Fix P10 PVR revision in /proc/cpuinfo for SMT4 cores
2020-08-23 11:37:23 -07:00
Linus Torvalds
b2d9e99622 * PAE and PKU bugfixes for x86
* selftests fix for new binutils
 * MMU notifier fix for arm64
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl9ARnoUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroP2YAf/dgLrPm4y4jxm7Aiz3/txqrHEwogT
 ZtvnzqUPb6+vkFrkop8QMOPw7A8NCfkn3/6sWbyUN5ObgOG1pxKyPraeN3ZdsDoR
 KGwv6P0dKgI8B4UuGEMe9GazXv+oOv8+bSUJnE+HZiUHzJKlX4HJbxDwUhvSSatY
 qYCZb/Uzqundh79TYULa7oI1/3F15A2J1zQPe4QgkToH9tsVB8PVfkH5uPJPp64M
 DTm5+qgwwsBULFaAuuo3FTs9f3pWJxn8GOuico1Sm+RnR53mhbUJggUfFzP0rwzZ
 Emevunje5r1rluFs+JWeNtflGH0gI4CLak7jvlOOBjrNb5XJgUSbzLXxkA==
 =Jwic
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:

 - PAE and PKU bugfixes for x86

 - selftests fix for new binutils

 - MMU notifier fix for arm64

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: arm64: Only reschedule if MMU_NOTIFIER_RANGE_BLOCKABLE is not set
  KVM: Pass MMU notifier range flags to kvm_unmap_hva_range()
  kvm: x86: Toggling CR4.PKE does not load PDPTEs in PAE mode
  kvm: x86: Toggling CR4.SMAP does not load PDPTEs in PAE mode
  KVM: x86: fix access code passed to gva_to_gpa
  selftests: kvm: Use a shorter encoding to clear RAX
2020-08-22 10:03:05 -07:00
Will Deacon
fdfe7cbd58 KVM: Pass MMU notifier range flags to kvm_unmap_hva_range()
The 'flags' field of 'struct mmu_notifier_range' is used to indicate
whether invalidate_range_{start,end}() are permitted to block. In the
case of kvm_mmu_notifier_invalidate_range_start(), this field is not
forwarded on to the architecture-specific implementation of
kvm_unmap_hva_range() and therefore the backend cannot sensibly decide
whether or not to block.

Add an extra 'flags' parameter to kvm_unmap_hva_range() so that
architectures are aware as to whether or not they are permitted to block.

Cc: <stable@vger.kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Message-Id: <20200811102725.7121-2-will@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-08-21 18:03:47 -04:00
Kajol Jain
64ef8f2c47 powerpc/perf/hv-24x7: Move cpumask file to top folder of hv-24x7 driver
Commit 792f73f747b8 ("powerpc/hv-24x7: Add sysfs files inside hv-24x7
device to show cpumask") added cpumask file as part of hv-24x7 driver
inside the interface folder. The cpumask file is supposed to be in the
top folder of the PMU driver in order to make hotplug work.

This patch fixes that issue and creates new group 'cpumask_attr_group'
to add cpumask file and make sure it added in top folder.

  command:# cat /sys/devices/hv_24x7/cpumask
  0

Fixes: 792f73f747b8 ("powerpc/hv-24x7: Add sysfs files inside hv-24x7 device to show cpumask")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200821080610.123997-1-kjain@linux.ibm.com
2020-08-21 23:35:27 +10:00
Christophe Leroy
541cebb51f powerpc/32s: Fix module loading failure when VMALLOC_END is over 0xf0000000
In is_module_segment(), when VMALLOC_END is over 0xf0000000,
ALIGN(VMALLOC_END, SZ_256M) has value 0.

In that case, addr >= ALIGN(VMALLOC_END, SZ_256M) is always
true then is_module_segment() always returns false.

Use (ALIGN(VMALLOC_END, SZ_256M) - 1) which will have
value 0xffffffff and will be suitable for the comparison.

Fixes: c49643319715 ("powerpc/32s: Only leave NX unset on segments used for modules")
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/09fc73fe9c7423c6b4cf93f93df9bb0ed8eefab5.1597994047.git.christophe.leroy@csgroup.eu
2020-08-21 23:30:25 +10:00
Vasant Hegde
90a9b102ed powerpc/pseries: Do not initiate shutdown when system is running on UPS
As per PAPR we have to look for both EPOW sensor value and event
modifier to identify the type of event and take appropriate action.

In LoPAPR v1.1 section 10.2.2 includes table 136 "EPOW Action Codes":

  SYSTEM_SHUTDOWN 3

  The system must be shut down. An EPOW-aware OS logs the EPOW error
  log information, then schedules the system to be shut down to begin
  after an OS defined delay internal (default is 10 minutes.)

Then in section 10.3.2.2.8 there is table 146 "Platform Event Log
Format, Version 6, EPOW Section", which includes the "EPOW Event
Modifier":

  For EPOW sensor value = 3
  0x01 = Normal system shutdown with no additional delay
  0x02 = Loss of utility power, system is running on UPS/Battery
  0x03 = Loss of system critical functions, system should be shutdown
  0x04 = Ambient temperature too high
  All other values = reserved

We have a user space tool (rtas_errd) on LPAR to monitor for
EPOW_SHUTDOWN_ON_UPS. Once it gets an event it initiates shutdown
after predefined time. It also starts monitoring for any new EPOW
events. If it receives "Power restored" event before predefined time
it will cancel the shutdown. Otherwise after predefined time it will
shutdown the system.

Commit 79872e35469b ("powerpc/pseries: All events of
EPOW_SYSTEM_SHUTDOWN must initiate shutdown") changed our handling of
the "on UPS/Battery" case, to immediately shutdown the system. This
breaks existing setups that rely on the userspace tool to delay
shutdown and let the system run on the UPS.

Fixes: 79872e35469b ("powerpc/pseries: All events of EPOW_SYSTEM_SHUTDOWN must initiate shutdown")
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
[mpe: Massage change log and add PAPR references]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200820061844.306460-1-hegdevasant@linux.vnet.ibm.com
2020-08-20 22:55:54 +10:00
Athira Rajeev
17899eaf88 powerpc/perf: Fix soft lockups due to missed interrupt accounting
Performance monitor interrupt handler checks if any counter has
overflown and calls record_and_restart() in core-book3s which invokes
perf_event_overflow() to record the sample information. Apart from
creating sample, perf_event_overflow() also does the interrupt and
period checks via perf_event_account_interrupt().

Currently we record information only if the SIAR (Sampled Instruction
Address Register) valid bit is set (using siar_valid() check) and
hence the interrupt check.

But it is possible that we do sampling for some events that are not
generating valid SIAR, and hence there is no chance to disable the
event if interrupts are more than max_samples_per_tick. This leads to
soft lockup.

Fix this by adding perf_event_account_interrupt() in the invalid SIAR
code path for a sampling event. ie if SIAR is invalid, just do
interrupt check and don't record the sample information.

Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1596717992-7321-1-git-send-email-atrajeev@linux.vnet.ibm.com
2020-08-20 20:29:09 +10:00
Frederic Barrat
e17a7c0e0a powerpc/powernv/pci: Fix possible crash when releasing DMA resources
Fix a typo introduced during recent code cleanup, which could lead to
silently not freeing resources or an oops message (on PCI hotplug or
CAPI reset).

Only impacts ioda2, the code path for ioda1 is correct.

Fixes: 01e12629af4e ("powerpc/powernv/pci: Add explicit tracking of the DMA setup state")
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200819130741.16769-1-fbarrat@linux.ibm.com
2020-08-20 12:30:40 +10:00
Michael Roth
801980f649 powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death
For a power9 KVM guest with XIVE enabled, running a test loop
where we hotplug 384 vcpus and then unplug them, the following traces
can be seen (generally within a few loops) either from the unplugged
vcpu:

  cpu 65 (hwid 65) Ready to die...
  Querying DEAD? cpu 66 (66) shows 2
  list_del corruption. next->prev should be c00a000002470208, but was c00a000002470048
  ------------[ cut here ]------------
  kernel BUG at lib/list_debug.c:56!
  Oops: Exception in kernel mode, sig: 5 [#1]
  LE SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: fuse nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 ...
  CPU: 66 PID: 0 Comm: swapper/66 Kdump: loaded Not tainted 4.18.0-221.el8.ppc64le #1
  NIP:  c0000000007ab50c LR: c0000000007ab508 CTR: 00000000000003ac
  REGS: c0000009e5a17840 TRAP: 0700   Not tainted  (4.18.0-221.el8.ppc64le)
  MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28000842  XER: 20040000
  ...
  NIP __list_del_entry_valid+0xac/0x100
  LR  __list_del_entry_valid+0xa8/0x100
  Call Trace:
    __list_del_entry_valid+0xa8/0x100 (unreliable)
    free_pcppages_bulk+0x1f8/0x940
    free_unref_page+0xd0/0x100
    xive_spapr_cleanup_queue+0x148/0x1b0
    xive_teardown_cpu+0x1bc/0x240
    pseries_mach_cpu_die+0x78/0x2f0
    cpu_die+0x48/0x70
    arch_cpu_idle_dead+0x20/0x40
    do_idle+0x2f4/0x4c0
    cpu_startup_entry+0x38/0x40
    start_secondary+0x7bc/0x8f0
    start_secondary_prolog+0x10/0x14

or on the worker thread handling the unplug:

  pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a
  Querying DEAD? cpu 314 (314) shows 2
  BUG: Bad page state in process kworker/u768:3  pfn:95de1
  cpu 314 (hwid 314) Ready to die...
  page:c00a000002577840 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0
  flags: 0x5ffffc00000000()
  raw: 005ffffc00000000 5deadbeef0000100 5deadbeef0000200 0000000000000000
  raw: 0000000000000000 0000000000000000 00000000ffffff7f 0000000000000000
  page dumped because: nonzero mapcount
  Modules linked in: kvm xt_CHECKSUM ipt_MASQUERADE xt_conntrack ...
  CPU: 0 PID: 548 Comm: kworker/u768:3 Kdump: loaded Not tainted 4.18.0-224.el8.bz1856588.ppc64le #1
  Workqueue: pseries hotplug workque pseries_hp_work_fn
  Call Trace:
    dump_stack+0xb0/0xf4 (unreliable)
    bad_page+0x12c/0x1b0
    free_pcppages_bulk+0x5bc/0x940
    page_alloc_cpu_dead+0x118/0x120
    cpuhp_invoke_callback.constprop.5+0xb8/0x760
    _cpu_down+0x188/0x340
    cpu_down+0x5c/0xa0
    cpu_subsys_offline+0x24/0x40
    device_offline+0xf0/0x130
    dlpar_offline_cpu+0x1c4/0x2a0
    dlpar_cpu_remove+0xb8/0x190
    dlpar_cpu_remove_by_index+0x12c/0x150
    dlpar_cpu+0x94/0x800
    pseries_hp_work_fn+0x128/0x1e0
    process_one_work+0x304/0x5d0
    worker_thread+0xcc/0x7a0
    kthread+0x1ac/0x1c0
    ret_from_kernel_thread+0x5c/0x80

The latter trace is due to the following sequence:

  page_alloc_cpu_dead
    drain_pages
      drain_pages_zone
        free_pcppages_bulk

where drain_pages() in this case is called under the assumption that
the unplugged cpu is no longer executing. To ensure that is the case,
and early call is made to __cpu_die()->pseries_cpu_die(), which runs a
loop that waits for the cpu to reach a halted state by polling its
status via query-cpu-stopped-state RTAS calls. It only polls for 25
iterations before giving up, however, and in the trace above this
results in the following being printed only .1 seconds after the
hotplug worker thread begins processing the unplug request:

  pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a
  Querying DEAD? cpu 314 (314) shows 2

At that point the worker thread assumes the unplugged CPU is in some
unknown/dead state and procedes with the cleanup, causing the race
with the XIVE cleanup code executed by the unplugged CPU.

Fix this by waiting indefinitely, but also making an effort to avoid
spurious lockup messages by allowing for rescheduling after polling
the CPU status and printing a warning if we wait for longer than 120s.

Fixes: eac1e731b59ee ("powerpc/xive: guest exploitation of the XIVE interrupt controller")
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
[mpe: Trim oopses in change log slightly for readability]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200811161544.10513-1-mdroth@linux.vnet.ibm.com
2020-08-18 17:09:52 +10:00
Christophe Leroy
7bee31ad8e powerpc/32s: Fix is_module_segment() when MODULES_VADDR is defined
When MODULES_VADDR is defined, is_module_segment() shall check the
address against it instead of checking agains VMALLOC_START.

Fixes: 6ca055322da8 ("powerpc/32s: Use dedicated segment for modules with STRICT_KERNEL_RWX")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/07884ed033c31e074747b7eb8eaa329d15db07ec.1596641219.git.christophe.leroy@csgroup.eu
2020-08-18 13:40:15 +10:00
Christophe Leroy
48d2f0407b powerpc/kasan: Fix KASAN_SHADOW_START on BOOK3S_32
On BOOK3S_32, when we have modules and strict kernel RWX, modules
are not in vmalloc space but in a dedicated segment that is
below PAGE_OFFSET.

So KASAN_SHADOW_START must take it into account.

MODULES_VADDR can't be used because it is not defined yet
in kasan.h

Fixes: 6ca055322da8 ("powerpc/32s: Use dedicated segment for modules with STRICT_KERNEL_RWX")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/6eddca2d5611fd57312a88eae31278c87a8fc99d.1596641224.git.christophe.leroy@csgroup.eu
2020-08-18 13:39:52 +10:00
Christophe Leroy
fdc6edbb31 powerpc/fixmap: Fix the size of the early debug area
Commit ("03fd42d458fb powerpc/fixmap: Fix FIX_EARLY_DEBUG_BASE when
page size is 256k") reworked the setup of the early debug area and
mistakenly replaced 128 * 1024 by SZ_128.

Change to SZ_128K to restore the original 128 kbytes size of the area.

Fixes: 03fd42d458fb ("powerpc/fixmap: Fix FIX_EARLY_DEBUG_BASE when page size is 256k")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/996184974d674ff984643778cf1cdd7fe58cc065.1597644194.git.christophe.leroy@csgroup.eu
2020-08-17 23:35:58 +10:00
Aneesh Kumar K.V
1e4e4bcaf7 powerpc/pkeys: Fix build error with PPC_MEM_KEYS disabled
IS_ENABLED() instead of #ifdef still requires variable declaration.
In this specific case, default_uamor is declared in asm/pkeys.h which
is only included if PPC_MEM_KEYS is enabled.

arch/powerpc/mm/book3s64/hash_utils.c: In function ‘hash__early_init_mmu_secondary’:
arch/powerpc/mm/book3s64/hash_utils.c:1119:21: error: ‘default_uamor’ undeclared (first use in this function)
 1119 |   mtspr(SPRN_UAMOR, default_uamor);
      |                     ^~~~~~~~~~~~~

Fixes: 6553fb799f60 ("powerpc/pkeys: Fix boot failures with Nemo board (A-EON AmigaOne X1000)")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200817103301.158836-1-aneesh.kumar@linux.ibm.com
2020-08-17 23:33:00 +10:00
Madhavan Srinivasan
388692e943 powerpc/kernel: Cleanup machine check function declarations
__machine_check_early_realmode_p*() are currently declared as extern
in cputable.c and because of this when compiled with "C=1" (which
enables semantic checker) produces these warnings.

    CHECK   arch/powerpc/kernel/mce_power.c
  arch/powerpc/kernel/mce_power.c:709:6: warning: symbol '__machine_check_early_realmode_p7' was not declared. Should it be static?
  arch/powerpc/kernel/mce_power.c:717:6: warning: symbol '__machine_check_early_realmode_p8' was not declared. Should it be static?
  arch/powerpc/kernel/mce_power.c:722:6: warning: symbol '__machine_check_early_realmode_p9' was not declared. Should it be static?
  arch/powerpc/kernel/mce_power.c:740:6: warning: symbol '__machine_check_early_realmode_p10' was not declared. Should it be static?

Patch here moves the declaration to asm/mce.h and includes the same in
cputable.c

Fixes: ae744f3432d3 ("powerpc/book3s: Flush SLB/TLBs if we get SLB/TLB machine check errors on power8")
Fixes: 7b9f71f974a1 ("powerpc/64s: POWER9 machine check handler")
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200817005618.3305028-1-maddy@linux.ibm.com
2020-08-17 14:13:18 +10:00
Madhavan Srinivasan
327da008e6 powerpc: Add POWER10 raw mode cputable entry
Add a raw mode cputable entry for POWER10. Copies most of the fields
from commit a3ea40d5c736 ("powerpc: Add POWER10 architected mode")
except for oprofile_cpu_type, machine_check_early, pvr_mask and
pvr_mask fields. On bare metal systems we use DT CPU features, which
doesn't need a cputable entry. But in VMs we still rely on the raw
cputable entry to set the correct values for the PMU related fields.

Fixes: a3ea40d5c736 ("powerpc: Add POWER10 architected mode")
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
[mpe: Reorder vs cleanup patch and add Fixes tag]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200817005618.3305028-2-maddy@linux.ibm.com
2020-08-17 14:11:26 +10:00
Athira Rajeev
d735599a06 powerpc/perf: Add extended regs support for power10 platform
Include capability flag PERF_PMU_CAP_EXTENDED_REGS for power10 and
expose MMCR3, SIER2, SIER3 registers as part of extended regs. Also
introduce PERF_REG_PMU_MASK_31 to define extended mask value at
runtime for power10.

Suggested-by: Ryan Grimm <grimm@linux.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Nageswara R Sastry <nasastry@in.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-and-tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1596794701-23530-3-git-send-email-atrajeev@linux.vnet.ibm.com
2020-08-17 13:11:22 +10:00
Anju T Sudhakar
781fa4811d powerpc/perf: Add support for outputting extended regs in perf intr_regs
Add support for perf extended register capability in powerpc. The
capability flag PERF_PMU_CAP_EXTENDED_REGS, is used to indicate the
PMU which support extended registers. The generic code define the mask
of extended registers as 0 for non supported architectures.

Patch adds extended regs support for power9 platform by exposing
MMCR0, MMCR1 and MMCR2 registers.

REG_RESERVED mask needs update to include extended regs.
PERF_REG_EXTENDED_MASK, contains mask value of the supported
registers, is defined at runtime in the kernel based on platform since
the supported registers may differ from one processor version to
another and hence the MASK value.

With the patch:

  available registers: r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11
  r12 r13 r14 r15 r16 r17 r18 r19 r20 r21 r22 r23 r24 r25 r26
  r27 r28 r29 r30 r31 nip msr orig_r3 ctr link xer ccr softe
  trap dar dsisr sier mmcra mmcr0 mmcr1 mmcr2

  PERF_RECORD_SAMPLE(IP, 0x1): 4784/4784: 0 period: 1 addr: 0
  ... intr regs: mask 0xffffffffffff ABI 64-bit
  .... r0    0xc00000000012b77c
  .... r1    0xc000003fe5e03930
  .... r2    0xc000000001b0e000
  .... r3    0xc000003fdcddf800
  .... r4    0xc000003fc7880000
  .... r5    0x9c422724be
  .... r6    0xc000003fe5e03908
  .... r7    0xffffff63bddc8706
  .... r8    0x9e4
  .... r9    0x0
  .... r10   0x1
  .... r11   0x0
  .... r12   0xc0000000001299c0
  .... r13   0xc000003ffffc4800
  .... r14   0x0
  .... r15   0x7fffdd8b8b00
  .... r16   0x0
  .... r17   0x7fffdd8be6b8
  .... r18   0x7e7076607730
  .... r19   0x2f
  .... r20   0xc00000001fc26c68
  .... r21   0xc0002041e4227e00
  .... r22   0xc00000002018fb60
  .... r23   0x1
  .... r24   0xc000003ffec4d900
  .... r25   0x80000000
  .... r26   0x0
  .... r27   0x1
  .... r28   0x1
  .... r29   0xc000000001be1260
  .... r30   0x6008010
  .... r31   0xc000003ffebb7218
  .... nip   0xc00000000012b910
  .... msr   0x9000000000009033
  .... orig_r3 0xc00000000012b86c
  .... ctr   0xc0000000001299c0
  .... link  0xc00000000012b77c
  .... xer   0x0
  .... ccr   0x28002222
  .... softe 0x1
  .... trap  0xf00
  .... dar   0x0
  .... dsisr 0x80000000000
  .... sier  0x0
  .... mmcra 0x80000000000
  .... mmcr0 0x82008090
  .... mmcr1 0x1e000000
  .... mmcr2 0x0
   ... thread: perf:4784

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Nageswara R Sastry <nasastry@in.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-and-tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1596794701-23530-2-git-send-email-atrajeev@linux.vnet.ibm.com
2020-08-17 13:11:22 +10:00
Michael Neuling
030a2c689f powerpc: Fix P10 PVR revision in /proc/cpuinfo for SMT4 cores
On POWER10 bit 12 in the PVR indicates if the core is SMT4 or SMT8.
Bit 12 is set for SMT4.

Without this patch, /proc/cpuinfo on a SMT4 DD1 POWER10 looks like
this:
  cpu             : POWER10, altivec supported
  revision        : 17.0 (pvr 0080 1100)

Fixes: a3ea40d5c736 ("powerpc: Add POWER10 architected mode")
Cc: stable@vger.kernel.org # v5.8
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200803035600.1820371-1-mikey@neuling.org
2020-08-17 13:11:22 +10:00
Krzysztof Kozlowski
8f28ca6bd8 iomap: constify ioreadX() iomem argument (as in generic implementation)
Patch series "iomap: Constify ioreadX() iomem argument", v3.

The ioread8/16/32() and others have inconsistent interface among the
architectures: some taking address as const, some not.

It seems there is nothing really stopping all of them to take pointer to
const.

This patch (of 4):

The ioreadX() and ioreadX_rep() helpers have inconsistent interface.  On
some architectures void *__iomem address argument is a pointer to const,
on some not.

Implementations of ioreadX() do not modify the memory under the address so
they can be converted to a "const" version for const-safety and
consistency among architectures.

[krzk@kernel.org: sh: clk: fix assignment from incompatible pointer type for ioreadX()]
  Link: http://lkml.kernel.org/r/20200723082017.24053-1-krzk@kernel.org
[akpm@linux-foundation.org: fix drivers/mailbox/bcm-pdc-mailbox.c]
  Link: http://lkml.kernel.org/r/202007132209.Rxmv4QyS%25lkp@intel.com

Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Allen Hubbe <allenbh@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Link: http://lkml.kernel.org/r/20200709072837.5869-1-krzk@kernel.org
Link: http://lkml.kernel.org/r/20200709072837.5869-2-krzk@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-14 19:56:57 -07:00
Xiaoming Ni
88db0aa242 all arch: remove system call sys_sysctl
Since commit 61a47c1ad3a4dc ("sysctl: Remove the sysctl system call"),
sys_sysctl is actually unavailable: any input can only return an error.

We have been warning about people using the sysctl system call for years
and believe there are no more users.  Even if there are users of this
interface if they have not complained or fixed their code by now they
probably are not going to, so there is no point in warning them any
longer.

So completely remove sys_sysctl on all architectures.

[nixiaoming@huawei.com: s390: fix build error for sys_call_table_emu]
 Link: http://lkml.kernel.org/r/20200618141426.16884-1-nixiaoming@huawei.com

Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Will Deacon <will@kernel.org>		[arm/arm64]
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bin Meng <bin.meng@windriver.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: chenzefeng <chenzefeng2@huawei.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christian Brauner <christian@brauner.io>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Diego Elio Pettenò <flameeyes@flameeyes.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kars de Jong <jongk@linux-m68k.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Paul Burton <paulburton@kernel.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Sven Schnelle <svens@stackframe.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Zhou Yanjie <zhouyanjie@wanyeetech.com>
Link: http://lkml.kernel.org/r/20200616030734.87257-1-nixiaoming@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-14 19:56:56 -07:00
Linus Torvalds
7fca4dee61 powerpc fixes for 5.9 #2
One fix for a boot crash on some platforms introduced by the recent pkey
 refactoring.
 
 Thanks to:
   Christian Zigotzky, Aneesh Kumar K.V.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl82hVUTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgI5GEACaEJDAJimcb7zqMJ7dQZR1hr5CoU50
 r7LIa0BTDDwP4gTfj+CrCp3XCZvTHeYFoYKMgQ2TU9qW/viF6DrXAoufLPQ8Ftyf
 sFSHvA7olz1ztPwAVLM4HkF61BsMCvrqMCRoDe3NRTWyLD7Ufs4dtiV1GtrjMSZt
 vQtXs0OWXhNas9hPnt4GR9ZgTSQQxp/VsgiwkhpLsW8dVUXQ12kcHjjKgD6qmXy3
 LXdK5NNvHKZ632shwIHHOp0gMd016AqwZAIKGtSD7A8ZRRVMX8StQw8W+pPKRCSA
 dksHtOUrM4reDx7B00Ce84rNkr2r+gOSzzOyIhnhcMbxhLj6lyb4PkXsCS1D+lB0
 c0IbAti7IJaPOIDVG+sjpMR7IetyAtGLRioLMeOcyQmd+C27DEy/NbUzl1yJF8LK
 htR0XvJNmBgJ3QPhyWw3NpcdLMYsnGZ6rJdHp0XnrXn9ngR1zb8xQU8BdaP/DtKd
 xeAwNBeiQ7EPeEWv67KXJUJYsEKSKSjZNdUyY6W8m9/qwOHrUrvbBnYzuVD1e4eE
 ELB1GF5DO10EeYdqFEhhnGUJIwjA7M3SU30LI5DO9x8ikiMCRoI0vLgT4eWU+aTe
 aRVYfQ750ZlYEGvDhvWKlpcvgRcdtT2rgnqrysyjkJ9bf24/prXa0MSfhwvPipXA
 A4gY681C1/2IOA==
 =d4vU
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:
 "One fix for a boot crash on some platforms introduced by the recent
  pkey refactoring.

  Thanks to Christian Zigotzky and Aneesh Kumar K.V"

* tag 'powerpc-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/pkeys: Fix boot failures with Nemo board (A-EON AmigaOne X1000)
2020-08-14 13:40:27 -07:00
Linus Torvalds
8cd84b7096 PPC:
* Improvements and bugfixes for secure VM support, giving reduced startup
   time and memory hotplug support.
 * Locking fixes in nested KVM code
 * Increase number of guests supported by HV KVM to 4094
 * Preliminary POWER10 support
 
 ARM:
 * Split the VHE and nVHE hypervisor code bases, build the EL2 code
   separately, allowing for the VHE code to now be built with instrumentation
 * Level-based TLB invalidation support
 * Restructure of the vcpu register storage to accomodate the NV code
 * Pointer Authentication available for guests on nVHE hosts
 * Simplification of the system register table parsing
 * MMU cleanups and fixes
 * A number of post-32bit cleanups and other fixes
 
 MIPS:
 * compilation fixes
 
 x86:
 * bugfixes
 * support for the SERIALIZE instruction
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl8yfuQUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNweQgAiEycRbpifAueihK3ScKwYcCFhbHg
 n6KLiFCY3sJRg+ORNb9EuFPJgGygV8DPKbEMvKaGDhNpX3rOpSIrpi5QQ5Hx+WOj
 WHg+aX8Eyy1ys7V84UbiMeZKUbKDDRr0/UOUtJEsF4hiD7s0FgobbQhC/3+awp5k
 sdSTMYlXelep+pjdFX7cNIgjrBNFtqH0ECeuDCcQzDg2zlH+poEPyLaC5+U4RF6r
 pfvcxd6xp50fobo8ro7kMuBeclG3JxLjqqdNrkkHcF1DxROMLLKN7CjHZchYC/BK
 c+S7JHLFnafxiTncMLhv3s4viey05mohW6SxeLw4qcWHfFlz+qyfZwMvZA==
 =d/GI
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more KVM updates from Paolo Bonzini:
 "PPC:
   - Improvements and bugfixes for secure VM support, giving reduced
     startup time and memory hotplug support.

   - Locking fixes in nested KVM code

   - Increase number of guests supported by HV KVM to 4094

   - Preliminary POWER10 support

  ARM:
   - Split the VHE and nVHE hypervisor code bases, build the EL2 code
     separately, allowing for the VHE code to now be built with
     instrumentation

   - Level-based TLB invalidation support

   - Restructure of the vcpu register storage to accomodate the NV code

   - Pointer Authentication available for guests on nVHE hosts

   - Simplification of the system register table parsing

   - MMU cleanups and fixes

   - A number of post-32bit cleanups and other fixes

  MIPS:
   - compilation fixes

  x86:
   - bugfixes

   - support for the SERIALIZE instruction"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (70 commits)
  KVM: MIPS/VZ: Fix build error caused by 'kvm_run' cleanup
  x86/kvm/hyper-v: Synic default SCONTROL MSR needs to be enabled
  MIPS: KVM: Convert a fallthrough comment to fallthrough
  MIPS: VZ: Only include loongson_regs.h for CPU_LOONGSON64
  x86: Expose SERIALIZE for supported cpuid
  KVM: x86: Don't attempt to load PDPTRs when 64-bit mode is enabled
  KVM: arm64: Move S1PTW S2 fault logic out of io_mem_abort()
  KVM: arm64: Don't skip cache maintenance for read-only memslots
  KVM: arm64: Handle data and instruction external aborts the same way
  KVM: arm64: Rename kvm_vcpu_dabt_isextabt()
  KVM: arm: Add trace name for ARM_NISV
  KVM: arm64: Ensure that all nVHE hyp code is in .hyp.text
  KVM: arm64: Substitute RANDOMIZE_BASE for HARDEN_EL2_VECTORS
  KVM: arm64: Make nVHE ASLR conditional on RANDOMIZE_BASE
  KVM: PPC: Book3S HV: Rework secure mem slot dropping
  KVM: PPC: Book3S HV: Move kvmppc_svm_page_out up
  KVM: PPC: Book3S HV: Migrate hot plugged memory
  KVM: PPC: Book3S HV: In H_SVM_INIT_DONE, migrate remaining normal-GFNs to secure-GFNs
  KVM: PPC: Book3S HV: Track the state GFNs associated with secure VMs
  KVM: PPC: Book3S HV: Disable page merging in H_SVM_INIT_START
  ...
2020-08-12 12:25:06 -07:00
Linus Torvalds
9ad57f6dfc Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - most of the rest of MM (memcg, hugetlb, vmscan, proc, compaction,
   mempolicy, oom-kill, hugetlbfs, migration, thp, cma, util,
   memory-hotplug, cleanups, uaccess, migration, gup, pagemap),

 - various other subsystems (alpha, misc, sparse, bitmap, lib, bitops,
   checkpatch, autofs, minix, nilfs, ufs, fat, signals, kmod, coredump,
   exec, kdump, rapidio, panic, kcov, kgdb, ipc).

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (164 commits)
  mm/gup: remove task_struct pointer for all gup code
  mm: clean up the last pieces of page fault accountings
  mm/xtensa: use general page fault accounting
  mm/x86: use general page fault accounting
  mm/sparc64: use general page fault accounting
  mm/sparc32: use general page fault accounting
  mm/sh: use general page fault accounting
  mm/s390: use general page fault accounting
  mm/riscv: use general page fault accounting
  mm/powerpc: use general page fault accounting
  mm/parisc: use general page fault accounting
  mm/openrisc: use general page fault accounting
  mm/nios2: use general page fault accounting
  mm/nds32: use general page fault accounting
  mm/mips: use general page fault accounting
  mm/microblaze: use general page fault accounting
  mm/m68k: use general page fault accounting
  mm/ia64: use general page fault accounting
  mm/hexagon: use general page fault accounting
  mm/csky: use general page fault accounting
  ...
2020-08-12 11:24:12 -07:00
Peter Xu
a2beb5f1ef mm: clean up the last pieces of page fault accountings
Here're the last pieces of page fault accounting that were still done
outside handle_mm_fault() where we still have regs==NULL when calling
handle_mm_fault():

arch/powerpc/mm/copro_fault.c:   copro_handle_mm_fault
arch/sparc/mm/fault_32.c:        force_user_fault
arch/um/kernel/trap.c:           handle_page_fault
mm/gup.c:                        faultin_page
                                 fixup_user_fault
mm/hmm.c:                        hmm_vma_fault
mm/ksm.c:                        break_ksm

Some of them has the issue of duplicated accounting for page fault
retries.  Some of them didn't do the accounting at all.

This patch cleans all these up by letting handle_mm_fault() to do per-task
page fault accounting even if regs==NULL (though we'll still skip the perf
event accountings).  With that, we can safely remove all the outliers now.

There's another functional change in that now we account the page faults
to the caller of gup, rather than the task_struct that passed into the gup
code.  More information of this can be found at [1].

After this patch, below things should never be touched again outside
handle_mm_fault():

  - task_struct.[maj|min]_flt
  - PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]

[1] https://lore.kernel.org/lkml/CAHk-=wj_V2Tps2QrMn20_W0OJF9xqNh52XSGA42s-ZJ8Y+GyKw@mail.gmail.com/

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200707225021.200906-25-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:04 -07:00
Peter Xu
428fdc0944 mm/powerpc: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault().

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Link: http://lkml.kernel.org/r/20200707225021.200906-17-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:03 -07:00
Peter Xu
bce617edec mm: do page fault accounting in handle_mm_fault
Patch series "mm: Page fault accounting cleanups", v5.

This is v5 of the pf accounting cleanup series.  It originates from Gerald
Schaefer's report on an issue a week ago regarding to incorrect page fault
accountings for retried page fault after commit 4064b9827063 ("mm: allow
VM_FAULT_RETRY for multiple times"):

  https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/

What this series did:

  - Correct page fault accounting: we do accounting for a page fault
    (no matter whether it's from #PF handling, or gup, or anything else)
    only with the one that completed the fault.  For example, page fault
    retries should not be counted in page fault counters.  Same to the
    perf events.

  - Unify definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
    event is used in an adhoc way across different archs.

    Case (1): for many archs it's done at the entry of a page fault
    handler, so that it will also cover e.g.  errornous faults.

    Case (2): for some other archs, it is only accounted when the page
    fault is resolved successfully.

    Case (3): there're still quite some archs that have not enabled
    this perf event.

    Since this series will touch merely all the archs, we unify this
    perf event to always follow case (1), which is the one that makes most
    sense.  And since we moved the accounting into handle_mm_fault, the
    other two MAJ/MIN perf events are well taken care of naturally.

  - Unify definition of "major faults": the definition of "major
    fault" is slightly changed when used in accounting (not
    VM_FAULT_MAJOR).  More information in patch 1.

  - Always account the page fault onto the one that triggered the page
    fault.  This does not matter much for #PF handlings, but mostly for
    gup.  More information on this in patch 25.

Patchset layout:

Patch 1:     Introduced the accounting in handle_mm_fault(), not enabled.
Patch 2-23:  Enable the new accounting for arch #PF handlers one by one.
Patch 24:    Enable the new accounting for the rest outliers (gup, iommu, etc.)
Patch 25:    Cleanup GUP task_struct pointer since it's not needed any more

This patch (of 25):

This is a preparation patch to move page fault accountings into the
general code in handle_mm_fault().  This includes both the per task
flt_maj/flt_min counters, and the major/minor page fault perf events.  To
do this, the pt_regs pointer is passed into handle_mm_fault().

PERF_COUNT_SW_PAGE_FAULTS should still be kept in per-arch page fault
handlers.

So far, all the pt_regs pointer that passed into handle_mm_fault() is
NULL, which means this patch should have no intented functional change.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:02 -07:00
Christoph Hellwig
428e2976a5 uaccess: remove segment_eq
segment_eq is only used to implement uaccess_kernel.  Just open code
uaccess_kernel in the arch uaccess headers and remove one layer of
indirection.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Greentime Hu <green.hu@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Link: http://lkml.kernel.org/r/20200710135706.537715-5-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:57:58 -07:00
Linus Torvalds
952ace797c IOMMU Updates for Linux v5.9
Including:
 
 	- Removal of the dev->archdata.iommu (or similar) pointers from
 	  most architectures. Only Sparc is left, but this is private to
 	  Sparc as their drivers don't use the IOMMU-API.
 
 	- ARM-SMMU Updates from Will Deacon:
 
 	  -  Support for SMMU-500 implementation in Marvell
 	     Armada-AP806 SoC
 
 	  - Support for SMMU-500 implementation in NVIDIA Tegra194 SoC
 
 	  - DT compatible string updates
 
 	  - Remove unused IOMMU_SYS_CACHE_ONLY flag
 
 	  - Move ARM-SMMU drivers into their own subdirectory
 
 	- Intel VT-d Updates from Lu Baolu:
 
 	  - Misc tweaks and fixes for vSVA
 
 	  - Report/response page request events
 
 	  - Cleanups
 
 	- Move the Kconfig and Makefile bits for the AMD and Intel
 	  drivers into their respective subdirectory.
 
 	- MT6779 IOMMU Support
 
 	- Support for new chipsets in the Renesas IOMMU driver
 
 	- Other misc cleanups and fixes (e.g. to improve compile test
 	  coverage)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl8ygTIACgkQK/BELZcB
 GuPZmRAAzSLuUNoQPWrFUbocNuZ/YHUCKdluKdYx26AgtYFwBrwzDAHPdq8HF8Hm
 y8w2xiUVVP9uZ8gnDkAuwXBtg+yOnG9sRNFZMNdtCy1Q0ehp0HNsn/6NabxVpSml
 QuAmd2PxMMopQRVLOR5YYvZl6JdiZx19W8X+trgwnR9Kghqq+7QXI9+D00jztRxQ
 Qvh/9NvIdX3k+5R4ZPJaV6OhaFvxzQzQZwKuO61VqFOWZRH1z9Oo+aXDCWTFUjYN
 IClTcG8qOK2W9/SOyYDXMoz30Yf0vcuDxhafi2JJVNcTPRmMWoeqff6yKslp76ea
 lTepDcIKld1Ul9NoqfYzhhKiEaLcgMEW2ua6vk5YFVxBBqJfg5qdtDZzBxa0FiNx
 TQrZFX3xjtZC6tRyy+eKWOj6vx7l0ONwwDxRc3HdvL+xE+KUdmsg82qHU4cAHRjp
 U2dgTdlkTEd56q4BEQxmJAHYMIUrx2QAp6pa2+Jv/Iqpi9PsZ2k+l9Gy6h+rM7dn
 Est/1gA4kDhKdCKfTx7g9EL6AAoU50WttxNmwMxrUrXX3fsstfY1fKgyZUPpkL7V
 V5iXbbsdMQLHzOF2qiqIIMxMGYxr/x/FJ1DnSJ7j+jAXMF77d2B9iQttzImOVN2c
 VXBxcVstWN7/xXjIy13C/83bRKwWqXaaS4cbv3Di0ZGFeD2oAF0=
 =3O2Z
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - Remove of the dev->archdata.iommu (or similar) pointers from most
   architectures. Only Sparc is left, but this is private to Sparc as
   their drivers don't use the IOMMU-API.

 - ARM-SMMU updates from Will Deacon:

     - Support for SMMU-500 implementation in Marvell Armada-AP806 SoC

     - Support for SMMU-500 implementation in NVIDIA Tegra194 SoC

     - DT compatible string updates

     - Remove unused IOMMU_SYS_CACHE_ONLY flag

     - Move ARM-SMMU drivers into their own subdirectory

 - Intel VT-d updates from Lu Baolu:

     - Misc tweaks and fixes for vSVA

     - Report/response page request events

     - Cleanups

 - Move the Kconfig and Makefile bits for the AMD and Intel drivers into
   their respective subdirectory.

 - MT6779 IOMMU Support

 - Support for new chipsets in the Renesas IOMMU driver

 - Other misc cleanups and fixes (e.g. to improve compile test coverage)

* tag 'iommu-updates-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (77 commits)
  iommu/amd: Move Kconfig and Makefile bits down into amd directory
  iommu/vt-d: Move Kconfig and Makefile bits down into intel directory
  iommu/arm-smmu: Move Arm SMMU drivers into their own subdirectory
  iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu
  iommu: Add gfp parameter to io_pgtable_ops->map()
  iommu: Mark __iommu_map_sg() as static
  iommu/vt-d: Rename intel-pasid.h to pasid.h
  iommu/vt-d: Add page response ops support
  iommu/vt-d: Report page request faults for guest SVA
  iommu/vt-d: Add a helper to get svm and sdev for pasid
  iommu/vt-d: Refactor device_to_iommu() helper
  iommu/vt-d: Disable multiple GPASID-dev bind
  iommu/vt-d: Warn on out-of-range invalidation address
  iommu/vt-d: Fix devTLB flush for vSVA
  iommu/vt-d: Handle non-page aligned address
  iommu/vt-d: Fix PASID devTLB invalidation
  iommu/vt-d: Remove global page support in devTLB flush
  iommu/vt-d: Enforce PASID devTLB field mask
  iommu: Make some functions static
  iommu/amd: Remove double zero check
  ...
2020-08-11 14:13:24 -07:00
Aneesh Kumar K.V
6553fb799f powerpc/pkeys: Fix boot failures with Nemo board (A-EON AmigaOne X1000)
On p6 and before we should avoid updating UAMOR SPRN. This resulted
in boot failure on Nemo board.

Fixes: 269e829f48a0 ("powerpc/book3s64/pkey: Disable pkey on POWER6 and before")
Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200810102623.685083-1-aneesh.kumar@linux.ibm.com
2020-08-10 23:07:21 +10:00
Linus Torvalds
fc80c51fd4 Kbuild updates for v5.9
- run the checker (e.g. sparse) after the compiler
 
  - remove unneeded cc-option tests for old compiler flags
 
  - fix tar-pkg to install dtbs
 
  - introduce ccflags-remove-y and asflags-remove-y syntax
 
  - allow to trace functions in sub-directories of lib/
 
  - introduce hostprogs-always-y and userprogs-always-y syntax
 
  - various Makefile cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl8wJXEVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGMGEP/0jDq/WafbfPN0aU83EqEWLt/sKg
 bluzmf/6HGx3XVRnuAzsHNNqysUx77WJiDsU/jbC/zdH8Iox3Sc1diE2sELLNAfY
 iJmQ8NBPggyU74aYG3OJdpDjz8T9EX/nVaYrjyFlbuXElM+Qvo8Z4Fz6NpWqKWlA
 gU+yGxEPPdX6MLHcSPSIu1hGWx7UT4fgfx3zDFTI2qvbQgQjKtzyTjAH5Cm3o87h
 rfomvHSSoAUg+Fh1LediRh1tJlkdVO+w7c+LNwCswmdBtkZuxecj1bQGUTS8GaLl
 CCWOKYfWp0KsVf1veXNNNaX/ecbp+Y34WErFq3V9Fdq5RmVlp+FPSGMyjDMRiQ/p
 LGvzbJLPpG586MnK8of0dOj6Es6tVPuq6WH2HuvsyTGcZJDpFTTxRcK3HDkE8ig6
 ZtuM3owB/Mep8IzwY2yWQiDrc7TX5Fz8S4hzGPU1zG9cfj4VT6TBqHGAy1Eql/0l
 txj6vJpnbQSdXiIX8MIU3yH35Y7eW3JYWgspTZH5Woj1S/wAWwuG93Fuuxq6mQIJ
 q6LSkMavtOfuCjOA9vJBZewpKXRU6yo0CzWNL/5EZ6z/r/I+DGtfb/qka8oYUDjX
 9H0cecL37AQxDHRPTxCZDQF0TpYiFJ6bmnMftK9NKNuIdvsk9DF7UBa3EdUNIj38
 yKS3rI7Lw55xWuY3
 =bkNQ
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - run the checker (e.g. sparse) after the compiler

 - remove unneeded cc-option tests for old compiler flags

 - fix tar-pkg to install dtbs

 - introduce ccflags-remove-y and asflags-remove-y syntax

 - allow to trace functions in sub-directories of lib/

 - introduce hostprogs-always-y and userprogs-always-y syntax

 - various Makefile cleanups

* tag 'kbuild-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: stop filtering out $(GCC_PLUGINS_CFLAGS) from cc-option base
  kbuild: include scripts/Makefile.* only when relevant CONFIG is enabled
  kbuild: introduce hostprogs-always-y and userprogs-always-y
  kbuild: sort hostprogs before passing it to ifneq
  kbuild: move host .so build rules to scripts/gcc-plugins/Makefile
  kbuild: Replace HTTP links with HTTPS ones
  kbuild: trace functions in subdirectories of lib/
  kbuild: introduce ccflags-remove-y and asflags-remove-y
  kbuild: do not export LDFLAGS_vmlinux
  kbuild: always create directories of targets
  powerpc/boot: add DTB to 'targets'
  kbuild: buildtar: add dtbs support
  kbuild: remove cc-option test of -ffreestanding
  kbuild: remove cc-option test of -fno-stack-protector
  Revert "kbuild: Create directory for target DTB"
  kbuild: run the checker after the compiler
2020-08-09 14:10:26 -07:00