12534 Commits

Author SHA1 Message Date
Peter Ujfalusi
9bcb631e99 arm64: dts: ti: k3-am654-main: Add McASP nodes
Add the nodes for McASP 0-2 and keep them disabled because several
required properties are not present as they are board specific.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Tero Kristo <t-kristo@ti.com>
2020-01-24 09:30:24 +02:00
Peter Ujfalusi
6f73c1e599 arm64: dts: ti: k3-j721e: DMA support
Add the ringacc and udmap nodes for main and mcu NAVSS.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Tero Kristo <t-kristo@ti.com>
2020-01-24 09:30:24 +02:00
Peter Ujfalusi
515c034013 arm64: dts: ti: k3-j721e-main: Move secure proxy and smmu under main_navss
Secure proxy (NAVSS0_SEC_PROXY0) and smmu (NAVSS0_TCU) is part of the
Navigator Subsystem.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Tero Kristo <t-kristo@ti.com>
2020-01-24 09:30:24 +02:00
Peter Ujfalusi
ab641f2811 arm64: dts: ti: k3-j721e-main: Correct main NAVSS representation
NAVSS is a subsystem containing different IPs, it is not really a bus.
Change the compatible from "simple-bus" to "simple-mfd" to reflect that.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Tero Kristo <t-kristo@ti.com>
2020-01-24 09:30:24 +02:00
Peter Ujfalusi
8c0deacaf4 arm64: dts: ti: k3-j721e: Correct the address for MAIN NAVSS
On am654 the MAIN NAVSS base address was 0x30800000, but in j721e it is
at 0x30000000

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Tero Kristo <t-kristo@ti.com>
2020-01-24 09:30:24 +02:00
Peter Ujfalusi
3d6230548c arm64: dts: ti: k3-am65: DMA support
Add the ringacc and udmap nodes for main and mcu NAVSS.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Tero Kristo <t-kristo@ti.com>
2020-01-24 09:30:24 +02:00
Peter Ujfalusi
12f207003c arm64: dts: ti: k3-am65-main: Move secure proxy under cbass_main_navss
Secure proxy (NAVSS0_SEC_PROXY0) is part of the Navigator Subsystem.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Tero Kristo <t-kristo@ti.com>
2020-01-24 09:30:23 +02:00
Peter Ujfalusi
2daaa18014 arm64: dts: ti: k3-am65-main: Correct main NAVSS representation
NAVSS is a subsystem containing different IPs, it is not really a bus.
Change the compatible from "simple-bus" to "simple-mfd" to reflect that.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Tero Kristo <t-kristo@ti.com>
2020-01-24 09:30:23 +02:00
Mark Brown
6645d8542e arm64: KVM: Annotate guest entry/exit as a single function
In an effort to clarify and simplify the annotations of assembly
functions in the kernel new macros have been introduced replacing ENTRY
and ENDPROC. There are separate annotations SYM_FUNC_ for normal C
functions and SYM_CODE_ for other code. Currently __guest_enter and
__guest_exit are annotated as standard functions but this is not
entirely correct as the former doesn't do a normal return and the latter
is not entered in a normal fashion. From the point of view of the
hypervisor the guest entry/exit may be viewed as a single
function which happens to have an eret in the middle of it so let's
annotate it as such.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200120124706.8681-1-broonie@kernel.org
2020-01-23 10:38:14 +00:00
Andrew Jones
290a6bb06d arm64: KVM: Add UAPI notes for swapped registers
Two UAPI system register IDs do not derive their values from the
ARM system register encodings. This is because their values were
accidentally swapped. As the IDs are API, they cannot be changed.
Add WARNING notes to point them out.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Andrew Jones <drjones@redhat.com>
[maz: turned XXX into WARNING]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200120130825.28838-1-drjones@redhat.com
2020-01-23 10:38:14 +00:00
Marc Zyngier
0e20f5e255 KVM: arm/arm64: Cleanup MMIO handling
Our MMIO handling is a bit odd, in the sense that it uses an
intermediate per-vcpu structure to store the various decoded
information that describe the access.

But the same information is readily available in the HSR/ESR_EL2
field, and we actually use this field to populate the structure.

Let's simplify the whole thing by getting rid of the superfluous
structure and save a (tiny) bit of space in the vcpu structure.

[32bit fix courtesy of Olof Johansson <olof@lixom.net>]
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-01-23 10:38:14 +00:00
Andrew Murray
4942dc6638 KVM: arm64: Write arch.mdcr_el2 changes since last vcpu_load on VHE
On VHE systems arch.mdcr_el2 is written to mdcr_el2 at vcpu_load time to
set options for self-hosted debug and the performance monitors
extension.

Unfortunately the value of arch.mdcr_el2 is not calculated until
kvm_arm_setup_debug() in the run loop after the vcpu has been loaded.
This means that the initial brief iterations of the run loop use a zero
value of mdcr_el2 - until the vcpu is preempted. This also results in a
delay between changes to vcpu->guest_debug taking effect.

Fix this by writing to mdcr_el2 in kvm_arm_setup_debug() on VHE systems
when a change to arch.mdcr_el2 has been detected.

Fixes: d5a21bcc2995 ("KVM: arm64: Move common VHE/non-VHE trap config in separate functions")
Cc: <stable@vger.kernel.org> # 4.17.x-
Suggested-by: James Morse <james.morse@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-01-22 18:38:04 +00:00
Mark Rutland
e533dbe9dc arm64: acpi: fix DAIF manipulation with pNMI
Since commit:

  d44f1b8dd7e66d80 ("arm64: KVM/mm: Move SEA handling behind a single 'claim' interface")

... the top-level APEI SEA handler has the shape:

1. current_flags = arch_local_save_flags()
2. local_daif_restore(DAIF_ERRCTX)
3. <GHES handler>
4. local_daif_restore(current_flags)

However, since commit:

  4a503217ce37e1f4 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking")

... when pseudo-NMIs (pNMIs) are in use, arch_local_save_flags() will save
the PMR value rather than the DAIF flags.

The combination of these two commits means that the APEI SEA handler will
erroneously attempt to restore the PMR value into DAIF. Fix this by
factoring local_daif_save_flags() out of local_daif_save(), so that we
can consistently save DAIF in step #1, regardless of whether pNMIs are in
use.

Both commits were introduced concurrently in v5.0.

Cc: <stable@vger.kernel.org>
Fixes: 4a503217ce37e1f4 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt masking")
Fixes: d44f1b8dd7e66d80 ("arm64: KVM/mm: Move SEA handling behind a single 'claim' interface")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-22 14:41:22 +00:00
Marc Zyngier
5e5168461c irqchip/gic-v4.1: VPE table (aka GICR_VPROPBASER) allocation
GICv4.1 defines a new VPE table that is potentially shared between
both the ITSs and the redistributors, following complicated affinity
rules.

To make things more confusing, the programming of this table at
the redistributor level is reusing the GICv4.0 GICR_VPROPBASER register
for something completely different.

The code flow is somewhat complexified by the need to respect the
affinities required by the HW, meaning that tables can either be
inherited from a previously discovered ITS or redistributor.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20191224111055.11836-6-maz@kernel.org
2020-01-22 14:22:19 +00:00
Will Deacon
bc20606594 Merge branch 'for-next/rng' into for-next/core
* for-next/rng: (2 commits)
  arm64: Use v8.5-RNG entropy for KASLR seed
  ...
2020-01-22 11:38:53 +00:00
Will Deacon
ab3906c531 Merge branch 'for-next/errata' into for-next/core
* for-next/errata: (3 commits)
  arm64: Workaround for Cortex-A55 erratum 1530923
  ...
2020-01-22 11:35:05 +00:00
Will Deacon
aa246c056c Merge branch 'for-next/asm-annotations' into for-next/core
* for-next/asm-annotations: (6 commits)
  arm64: kernel: Correct annotation of end of el0_sync
  ...
2020-01-22 11:34:21 +00:00
Will Deacon
4f6cdf296c Merge branches 'for-next/acpi', 'for-next/cpufeatures', 'for-next/csum', 'for-next/e0pd', 'for-next/entry', 'for-next/kbuild', 'for-next/kexec/cleanup', 'for-next/kexec/file-kdump', 'for-next/misc', 'for-next/nofpsimd', 'for-next/perf' and 'for-next/scs' into for-next/core
* for-next/acpi:
  ACPI/IORT: Fix 'Number of IDs' handling in iort_id_map()

* for-next/cpufeatures: (2 commits)
  arm64: Introduce ID_ISAR6 CPU register
  ...

* for-next/csum: (2 commits)
  arm64: csum: Fix pathological zero-length calls
  ...

* for-next/e0pd: (7 commits)
  arm64: kconfig: Fix alignment of E0PD help text
  ...

* for-next/entry: (5 commits)
  arm64: entry: cleanup sp_el0 manipulation
  ...

* for-next/kbuild: (4 commits)
  arm64: kbuild: remove compressed images on 'make ARCH=arm64 (dist)clean'
  ...

* for-next/kexec/cleanup: (11 commits)
  Revert "arm64: kexec: make dtb_mem always enabled"
  ...

* for-next/kexec/file-kdump: (2 commits)
  arm64: kexec_file: add crash dump support
  ...

* for-next/misc: (12 commits)
  arm64: entry: Avoid empty alternatives entries
  ...

* for-next/nofpsimd: (7 commits)
  arm64: nofpsmid: Handle TIF_FOREIGN_FPSTATE flag cleanly
  ...

* for-next/perf: (2 commits)
  perf/imx_ddr: Fix cpu hotplug state cleanup
  ...

* for-next/scs: (6 commits)
  arm64: kernel: avoid x18 in __cpu_soft_restart
  ...
2020-01-22 11:32:31 +00:00
Will Deacon
e717d93b1c arm64: kconfig: Fix alignment of E0PD help text
Remove the additional space.

Signed-off-by: Will Deacon <will@kernel.org>
2020-01-22 11:23:54 +00:00
Mark Brown
2e8e1ea88c arm64: Use v8.5-RNG entropy for KASLR seed
When seeding KALSR on a system where we have architecture level random
number generation make use of that entropy, mixing it in with the seed
passed by the bootloader. Since this is run very early in init before
feature detection is complete we open code rather than use archrandom.h.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-22 09:54:52 +00:00
Richard Henderson
1a50ec0b3b arm64: Implement archrandom.h for ARMv8.5-RNG
Expose the ID_AA64ISAR0.RNDR field to userspace, as the RNG system
registers are always available at EL0.

Implement arch_get_random_seed_long using RNDR.  Given that the
TRNG is likely to be a shared resource between cores, and VMs,
do not explicitly force re-seeding with RNDRRS.  In order to avoid
code complexity and potential issues with hetrogenous systems only
provide values after cpufeature has finalized the system capabilities.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
[Modified to only function after cpufeature has finalized the system
capabilities and move all the code into the header -- broonie]
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
[will: Advertise HWCAP via /proc/cpuinfo]
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-22 09:54:18 +00:00
Olof Johansson
498f2a4be6 arm64: dts: zynqmp: DT changes for v5.6
- Switch from fixed to firmware based clock driver
 - Wire power domain driver
 - Wire all ina226 chips through IIO and IIO hwmon drivers
 - Add missing dr_mode property to usb nodes
 - Use gpio-line-names property instead of comments
 - Use clock-output-names for si570 differentiation
 - Minor DT fixes
 -----BEGIN PGP SIGNATURE-----
 
 iF0EABECAB0WIQQbPNTMvXmYlBPRwx7KSWXLKUoMIQUCXibMLgAKCRDKSWXLKUoM
 IQW6AJ9Szg4yizku9Q6oi+yrR6L3eRxnyACcDiM4EeKIvrlK19cYVnd/58ENPs8=
 =k6/g
 -----END PGP SIGNATURE-----

Merge tag 'zynqmp-dt-for-v5.6' of https://github.com/Xilinx/linux-xlnx into arm/dt

arm64: dts: zynqmp: DT changes for v5.6

- Switch from fixed to firmware based clock driver
- Wire power domain driver
- Wire all ina226 chips through IIO and IIO hwmon drivers
- Add missing dr_mode property to usb nodes
- Use gpio-line-names property instead of comments
- Use clock-output-names for si570 differentiation
- Minor DT fixes

* tag 'zynqmp-dt-for-v5.6' of https://github.com/Xilinx/linux-xlnx: (21 commits)
  arm64: zynqmp: Add label property to all ina226 on zcu106
  arm64: zynqmp: Enable iio-hwmon for ina226 on zcu106
  arm64: zynqmp: Add label property to all ina226 on zcu102
  arm64: zynqmp: Enable iio-hwmon for ina226 on zcu102
  arm64: zynqmp: Add label property to all ina226 on zcu111
  arm64: zynqmp: Enable iio-hwmon for ina226 on zcu111
  arm64: zynqmp: Enable iio-hwmon for ina226 on zcu100
  arm64: zynqmp: Setup default number of chipselects for zcu100
  arm64: zynqmp: Remove broken-cd from zcu100-revC
  arm64: zynqmp: Fix the si570 clock frequency on zcu111
  arm64: zynqmp: Setup clock-output-names for si570 chips
  arm64: zynqmp: Turn comment to gpio-line-names
  arm64: zynqmp: Fix address for tca6416_u97 chip on zcu104
  arm64: zynqmp: Remove addition number in node name
  arm64: zynqmp: Use ethernet-phy as node name for ethernet phys
  arm64: dts: xilinx: Add the power nodes for zynqmp
  arm64: dts: xilinx: Remove dtsi for fixed clock
  arm64: dts: xilinx: Add the clock nodes for zynqmp
  arm64: zynqmp: Add dr_mode property to usb node
  arm64: dts: zynqmp: Use decimal values for drm-clock properties
  ...

Link: https://lore.kernel.org/r/c70d2efa-9ee2-a764-5248-0e5bfbf29f8a@monstr.eu
Signed-off-by: Olof Johansson <olof@lixom.net>
2020-01-21 15:06:11 -08:00
Dirk Behme
d7bbd6c1b0 arm64: kbuild: remove compressed images on 'make ARCH=arm64 (dist)clean'
Since v4.3-rc1 commit 0723c05fb75e44 ("arm64: enable more compressed
Image formats"), it is possible to build Image.{bz2,lz4,lzma,lzo}
AArch64 images. However, the commit missed adding support for removing
those images on 'make ARCH=arm64 (dist)clean'.

Fix this by adding them to the target list.
Make sure to match the order of the recipes in the makefile.

Cc: stable@vger.kernel.org # v4.3+
Fixes: 0723c05fb75e44 ("arm64: enable more compressed Image formats")
Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-21 16:28:36 +00:00
Julien Thierry
108eae2d4d arm64: entry: Avoid empty alternatives entries
kernel_ventry will create alternative entries to potentially replace
0 instructions with 0 instructions for EL1 vectors. While this does not
cause an issue, it pointlessly takes up some bytes in the alternatives
section.

Do not generate such entries.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Julien Thierry <jthierry@redhat.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-21 09:55:18 +00:00
Vladimir Murzin
9834602336 arm64: Kconfig: select HAVE_FUTEX_CMPXCHG
arm64 provides always working implementation of futex_atomic_cmpxchg_inatomic(),
so there is no need to check it runtime.

Reported-by: Piyush swami <Piyush.swami@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-21 09:14:01 +00:00
Ingo Molnar
a786810cc8 Linux 5.5-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl4k7i8eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGvk0IAKRenVOdiudY77SQ
 VZjsteyrYTTQtPPv494ToIRjR0XQ+gYp8vyWzXTUC5Nm9Y9U3VzDqUPUjWszrSXE
 6mU+tzcMc9qwuUxnIFn8zfg64ygw+37sn/w3xqeH4QmF9Z5Wl3EX3SdXTs7jp3RS
 VxiztkUNI5ZBV2GDtla5K/9qLPqCQnUYXIiyi5lAtBtiitZDVXFp7dy7hMgEiaEO
 +78K5Kh3xlt5ndDsBFOlwIb2Oof3KL7bBXntdbSBc/bjol6IRvAgln48HWCv59G2
 jzAp2tj2KobX9GRAEPj+v4TQZEW0SXDNDi8MgQsM+3DYVCTmANsv57CBKRuf01+F
 nB1kAys=
 =zSnJ
 -----END PGP SIGNATURE-----

Merge tag 'v5.5-rc7' into efi/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-01-20 08:05:16 +01:00
Olof Johansson
71acc94c49 DSI display for px30 evaluation board and a number of cleanups
accross multiple socs.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCAAuFiEE7v+35S2Q1vLNA3Lx86Z5yZzRHYEFAl4k5D4QHGhlaWtvQHNu
 dGVjaC5kZQAKCRDzpnnJnNEdgfEbCACOCZhDKEttjrJckHExJzyEm+NThYTy4Mmt
 FwKS10w3/Fp9Z+Pc7fZk/e1QZ6AlEQPjmy+Ls1r7D7MrkotmPy6YRI6R9OFoj/aw
 eUCeo+plJUl2ij8oGgHNnWq6DJEm9cXcmCILuk8K3aj51fWMXH30L0F70lGxAuYb
 G1Ta3lnZ22Yc46ie6rQuTU9xtpv2Oy7gtPnZ2uYPEs9T8GFHH/frnaJIyk0SaU4k
 n7vWajLqoASDawZJ162+Tq2xtUpNF8CWdYTwtHlheVEC28g9H8f+XX+66x+plrGe
 t0Prg62VWUaEn6w80JILxYwPEXh1eHXY7reM/bGwoIW+9U0UtxMX
 =8p9R
 -----END PGP SIGNATURE-----

Merge tag 'v5.6-rockchip-dts64-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/dt

DSI display for px30 evaluation board and a number of cleanups
accross multiple socs.

* tag 'v5.6-rockchip-dts64-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
  arm64: dts: rockchip: Kill off "simple-panel" compatibles
  arm64: dts: rockchip: rename dwmmc node names to mmc
  arm64: dts: rockchip: hook up the px30-evb dsi display
  arm64: dts: rockchip: Enable sdio0 and uart0 on rk3399-roc-pc-mezzanine
  arm64: dts: rockchip: add reg property to brcmf sub-nodes
  arm64: dts: rockchip: fix dwmmc clock name for rk3308
  arm64: dts: rockchip: fix dwmmc clock name for px30

Link: https://lore.kernel.org/r/7641353.lIegmeFAIi@phil
Signed-off-by: Olof Johansson <olof@lixom.net>
2020-01-19 22:47:47 -08:00
David S. Miller
b3f7e3f23a Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2020-01-19 22:10:04 +01:00
Mark Rutland
1cfbb484de KVM: arm/arm64: Correct AArch32 SPSR on exception entry
Confusingly, there are three SPSR layouts that a kernel may need to deal
with:

(1) An AArch64 SPSR_ELx view of an AArch64 pstate
(2) An AArch64 SPSR_ELx view of an AArch32 pstate
(3) An AArch32 SPSR_* view of an AArch32 pstate

When the KVM AArch32 support code deals with SPSR_{EL2,HYP}, it's either
dealing with #2 or #3 consistently. On arm64 the PSR_AA32_* definitions
match the AArch64 SPSR_ELx view, and on arm the PSR_AA32_* definitions
match the AArch32 SPSR_* view.

However, when we inject an exception into an AArch32 guest, we have to
synthesize the AArch32 SPSR_* that the guest will see. Thus, an AArch64
host needs to synthesize layout #3 from layout #2.

This patch adds a new host_spsr_to_spsr32() helper for this, and makes
use of it in the KVM AArch32 support code. For arm64 we need to shuffle
the DIT bit around, and remove the SS bit, while for arm we can use the
value as-is.

I've open-coded the bit manipulation for now to avoid having to rework
the existing PSR_* definitions into PSR64_AA32_* and PSR32_AA32_*
definitions. I hope to perform a more thorough refactoring in future so
that we can handle pstate view manipulation more consistently across the
kernel tree.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200108134324.46500-4-mark.rutland@arm.com
2020-01-19 18:06:14 +00:00
Mark Rutland
3c2483f154 KVM: arm/arm64: Correct CPSR on exception entry
When KVM injects an exception into a guest, it generates the CPSR value
from scratch, configuring CPSR.{M,A,I,T,E}, and setting all other
bits to zero.

This isn't correct, as the architecture specifies that some CPSR bits
are (conditionally) cleared or set upon an exception, and others are
unchanged from the original context.

This patch adds logic to match the architectural behaviour. To make this
simple to follow/audit/extend, documentation references are provided,
and bits are configured in order of their layout in SPSR_EL2. This
layout can be seen in the diagram on ARM DDI 0487E.a page C5-426.

Note that this code is used by both arm and arm64, and is intended to
fuction with the SPSR_EL2 and SPSR_HYP layouts.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200108134324.46500-3-mark.rutland@arm.com
2020-01-19 18:06:14 +00:00
Mark Rutland
a425372e73 KVM: arm64: Correct PSTATE on exception entry
When KVM injects an exception into a guest, it generates the PSTATE
value from scratch, configuring PSTATE.{M[4:0],DAIF}, and setting all
other bits to zero.

This isn't correct, as the architecture specifies that some PSTATE bits
are (conditionally) cleared or set upon an exception, and others are
unchanged from the original context.

This patch adds logic to match the architectural behaviour. To make this
simple to follow/audit/extend, documentation references are provided,
and bits are configured in order of their layout in SPSR_EL2. This
layout can be seen in the diagram on ARM DDI 0487E.a page C5-429.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200108134324.46500-2-mark.rutland@arm.com
2020-01-19 18:06:13 +00:00
Russell King
f5523423de arm64: kvm: Fix IDMAP overlap with HYP VA
Booting 5.4 on LX2160A reveals that KVM is non-functional:

kvm: Limiting the IPA size due to kernel Virtual Address limit
kvm [1]: IPA Size Limit: 43bits
kvm [1]: IDMAP intersecting with HYP VA, unable to continue
kvm [1]: error initializing Hyp mode: -22

Debugging shows:

kvm [1]: IDMAP page: 81a26000
kvm [1]: HYP VA range: 0:22ffffffff

as RAM is located at:

80000000-fbdfffff : System RAM
2080000000-237fffffff : System RAM

Comparing this with the same kernel on Armada 8040 shows:

kvm: Limiting the IPA size due to kernel Virtual Address limit
kvm [1]: IPA Size Limit: 43bits
kvm [1]: IDMAP page: 2a26000
kvm [1]: HYP VA range: 4800000000:493fffffff
...
kvm [1]: Hyp mode initialized successfully

which indicates that hyp_va_msb is set, and is always set to the
opposite value of the idmap page to avoid the overlap. This does not
happen with the LX2160A.

Further debugging shows vabits_actual = 39, kva_msb = 38 on LX2160A and
kva_msb = 33 on Armada 8040. Looking at the bit layout of the HYP VA,
there is still one bit available for hyp_va_msb. Set this bit
appropriately. This allows KVM to be functional on the LX2160A, but
without any HYP VA randomisation:

kvm: Limiting the IPA size due to kernel Virtual Address limit
kvm [1]: IPA Size Limit: 43bits
kvm [1]: IDMAP page: 81a24000
kvm [1]: HYP VA range: 4000000000:62ffffffff
...
kvm [1]: Hyp mode initialized successfully

Fixes: ed57cac83e05 ("arm64: KVM: Introduce EL2 VA randomisation")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
[maz: small additional cleanups, preserved case where the tag
 is legitimately 0 and we can just use the mask, Fixes tag]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/E1ilAiY-0000MA-RG@rmk-PC.armlinux.org.uk
2020-01-19 16:05:23 +00:00
Christoffer Dall
b6ae256afd KVM: arm64: Only sign-extend MMIO up to register width
On AArch64 you can do a sign-extended load to either a 32-bit or 64-bit
register, and we should only sign extend the register up to the width of
the register as specified in the operation (by using the 32-bit Wn or
64-bit Xn register specifier).

As it turns out, the architecture provides this decoding information in
the SF ("Sixty-Four" -- how cute...) bit.

Let's take advantage of this with the usual 32-bit/64-bit header file
dance and do the right thing on AArch64 hosts.

Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191212195055.5541-1-christoffer.dall@arm.com
2020-01-19 16:05:10 +00:00
Rob Herring
62b5efc919 arm64: dts: rockchip: Kill off "simple-panel" compatibles
"simple-panel" is a Linux driver and has never been an accepted upstream
compatible string, so remove it.

Cc: Heiko Stuebner <heiko@sntech.de>
Cc: linux-rockchip@lists.infradead.org
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20200117230851.25434-1-robh@kernel.org
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2020-01-18 23:58:56 +01:00
Johan Jonker
3ef7c2558f arm64: dts: rockchip: rename dwmmc node names to mmc
Current dts files with 'dwmmc' nodes are manually verified.
In order to automate this process rockchip-dw-mshc.txt
has to be converted to yaml. In the new setup
rockchip-dw-mshc.yaml will inherit properties from
mmc-controller.yaml and synopsys-dw-mshc-common.yaml.
'dwmmc' will no longer be a valid name for a node,
so change them all to 'mmc'

Signed-off-by: Johan Jonker <jbx6244@gmail.com>
Link: https://lore.kernel.org/r/20200115185244.18149-2-jbx6244@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2020-01-18 23:56:10 +01:00
Aleksa Sarai
fddb5d430a open: introduce openat2(2) syscall
/* Background. */
For a very long time, extending openat(2) with new features has been
incredibly frustrating. This stems from the fact that openat(2) is
possibly the most famous counter-example to the mantra "don't silently
accept garbage from userspace" -- it doesn't check whether unknown flags
are present[1].

This means that (generally) the addition of new flags to openat(2) has
been fraught with backwards-compatibility issues (O_TMPFILE has to be
defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old
kernels gave errors, since it's insecure to silently ignore the
flag[2]). All new security-related flags therefore have a tough road to
being added to openat(2).

Userspace also has a hard time figuring out whether a particular flag is
supported on a particular kernel. While it is now possible with
contemporary kernels (thanks to [3]), older kernels will expose unknown
flag bits through fcntl(F_GETFL). Giving a clear -EINVAL during
openat(2) time matches modern syscall designs and is far more
fool-proof.

In addition, the newly-added path resolution restriction LOOKUP flags
(which we would like to expose to user-space) don't feel related to the
pre-existing O_* flag set -- they affect all components of path lookup.
We'd therefore like to add a new flag argument.

Adding a new syscall allows us to finally fix the flag-ignoring problem,
and we can make it extensible enough so that we will hopefully never
need an openat3(2).

/* Syscall Prototype. */
  /*
   * open_how is an extensible structure (similar in interface to
   * clone3(2) or sched_setattr(2)). The size parameter must be set to
   * sizeof(struct open_how), to allow for future extensions. All future
   * extensions will be appended to open_how, with their zero value
   * acting as a no-op default.
   */
  struct open_how { /* ... */ };

  int openat2(int dfd, const char *pathname,
              struct open_how *how, size_t size);

/* Description. */
The initial version of 'struct open_how' contains the following fields:

  flags
    Used to specify openat(2)-style flags. However, any unknown flag
    bits or otherwise incorrect flag combinations (like O_PATH|O_RDWR)
    will result in -EINVAL. In addition, this field is 64-bits wide to
    allow for more O_ flags than currently permitted with openat(2).

  mode
    The file mode for O_CREAT or O_TMPFILE.

    Must be set to zero if flags does not contain O_CREAT or O_TMPFILE.

  resolve
    Restrict path resolution (in contrast to O_* flags they affect all
    path components). The current set of flags are as follows (at the
    moment, all of the RESOLVE_ flags are implemented as just passing
    the corresponding LOOKUP_ flag).

    RESOLVE_NO_XDEV       => LOOKUP_NO_XDEV
    RESOLVE_NO_SYMLINKS   => LOOKUP_NO_SYMLINKS
    RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS
    RESOLVE_BENEATH       => LOOKUP_BENEATH
    RESOLVE_IN_ROOT       => LOOKUP_IN_ROOT

open_how does not contain an embedded size field, because it is of
little benefit (userspace can figure out the kernel open_how size at
runtime fairly easily without it). It also only contains u64s (even
though ->mode arguably should be a u16) to avoid having padding fields
which are never used in the future.

Note that as a result of the new how->flags handling, O_PATH|O_TMPFILE
is no longer permitted for openat(2). As far as I can tell, this has
always been a bug and appears to not be used by userspace (and I've not
seen any problems on my machines by disallowing it). If it turns out
this breaks something, we can special-case it and only permit it for
openat(2) but not openat2(2).

After input from Florian Weimer, the new open_how and flag definitions
are inside a separate header from uapi/linux/fcntl.h, to avoid problems
that glibc has with importing that header.

/* Testing. */
In a follow-up patch there are over 200 selftests which ensure that this
syscall has the correct semantics and will correctly handle several
attack scenarios.

In addition, I've written a userspace library[4] which provides
convenient wrappers around openat2(RESOLVE_IN_ROOT) (this is necessary
because no other syscalls support RESOLVE_IN_ROOT, and thus lots of care
must be taken when using RESOLVE_IN_ROOT'd file descriptors with other
syscalls). During the development of this patch, I've run numerous
verification tests using libpathrs (showing that the API is reasonably
usable by userspace).

/* Future Work. */
Additional RESOLVE_ flags have been suggested during the review period.
These can be easily implemented separately (such as blocking auto-mount
during resolution).

Furthermore, there are some other proposed changes to the openat(2)
interface (the most obvious example is magic-link hardening[5]) which
would be a good opportunity to add a way for userspace to restrict how
O_PATH file descriptors can be re-opened.

Another possible avenue of future work would be some kind of
CHECK_FIELDS[6] flag which causes the kernel to indicate to userspace
which openat2(2) flags and fields are supported by the current kernel
(to avoid userspace having to go through several guesses to figure it
out).

[1]: https://lwn.net/Articles/588444/
[2]: https://lore.kernel.org/lkml/CA+55aFyyxJL1LyXZeBsf2ypriraj5ut1XkNDsunRBqgVjZU_6Q@mail.gmail.com
[3]: commit 629e014bb834 ("fs: completely ignore unknown open flags")
[4]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523
[5]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@cyphar.com/
[6]: https://youtu.be/ggD-eb3yPVs

Suggested-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-01-18 09:19:18 -05:00
Olof Johansson
bba9d2b163 This pull request contains Broadcom ARM64-based SoCs defconfig changes
for 5.6, please pull the following:
 
 - Nicolas enables the Broadcom GENET controller and Broadcom STB PCIe
   Root Complex driver as a module for the ARM64 defconfig. The PCIe RC
   driver will go through the PCIe maintainers pull request for 5.6.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEm+Rq3+YGJdiR9yuFh9CWnEQHBwQFAl4iFKAACgkQh9CWnEQH
 BwSiqRAAr57unOelMvv+Nd8H5x7m1JkfHQZquVkqlui3na+fssIOshRgQ+VXbtVR
 y2wDZlAH5j66VCsAv9yXBJJPOb8Ta0iXMCGxSycZmGpF9rn/RaUziz9JljJ4DvZN
 xOf+j13q1R/mjaleRIIwZGk6MwFZr8RCtZhAV/jQjTihJSEJj0+FiNfG/TjW3mgY
 L25JLHXtThFpADxXYHzI5LUsH9T2HiWn0IC3UEloBggjn7Vv2+1fZfeukOexRnt7
 SHzHOFADrPCvaSxK9CtRFR1n4FcB87Kvl0g+b0EKDrYLnZkpuZ1FAmt7tR8NZ6RC
 mZ2DCPuhPZ4W2ToM5OpzQVUUpBaMLNM0TtNwS6NFit7t3q+RYa7mvCNoQifvwh7L
 ruNNLXEaH0fl+/hPRlQw2jBuhfuvgz0W8dpgj/389Zhs5SnO5P/7jOp5DqNmQ4su
 rMDcbEV2sElC0ZJuhkEL7Z0STRBhf72D3KTrfOwZd4hrrWKKEdDHUc/3fX1YsQ1h
 /PLWt3K5VbqZCgH+cmuiO0afFJZa3unx3HoMI8fTpo6WLVmA57gWvJMDusKHSq3p
 j0wJtY5q8oTc36xe2Y0nuO6DpWDMuR0RBJGHBpbbWT7NgsJSj5BqS3JOG44DQSdF
 gJS8GhAGnY9PeJuVv4QElq0QWo31iv/3fjROaCh+IpA2QMthbxQ=
 =UsZS
 -----END PGP SIGNATURE-----

Merge tag 'arm-soc/for-5.6/defconfig-arm64' of https://github.com/Broadcom/stblinux into arm/defconfig

This pull request contains Broadcom ARM64-based SoCs defconfig changes
for 5.6, please pull the following:

- Nicolas enables the Broadcom GENET controller and Broadcom STB PCIe
  Root Complex driver as a module for the ARM64 defconfig. The PCIe RC
  driver will go through the PCIe maintainers pull request for 5.6.

* tag 'arm-soc/for-5.6/defconfig-arm64' of https://github.com/Broadcom/stblinux:
  arm64: defconfig: Enable Broadcom's GENET Ethernet controller
  arm64: defconfig: Enable Broadcom's STB PCIe controller

Link: https://lore.kernel.org/r/20200117222705.25391-1-f.fainelli@gmail.com
Signed-off-by: Olof Johansson <olof@lixom.net>
2020-01-17 17:07:35 -08:00
Krzysztof Kozlowski
b0e55fef62 arm64: dts: exynos: Rename Samsung and Exynos to lowercase
Fix up inconsistent usage of upper and lowercase letters in "Samsung"
and "Exynos" names.

"SAMSUNG" and "EXYNOS" are not abbreviations but regular trademarked
names.  Therefore they should be written with lowercase letters starting
with capital letter.

The lowercase "Exynos" name is promoted by its manufacturer Samsung
Electronics Co., Ltd., in advertisement materials and on website.

Although advertisement materials usually use uppercase "SAMSUNG", the
lowercase version is used in all legal aspects (e.g. on Wikipedia and in
privacy/legal statements on
https://www.samsung.com/semiconductor/privacy-global/).

Link: https://lore.kernel.org/r/20200117190305.5257-1-krzk@kernel.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
2020-01-17 12:08:49 -08:00
Nicolas Saenz Julienne
e926791a96 arm64: defconfig: Enable Broadcom's GENET Ethernet controller
Currently used on the Raspberry Pi 4 and various Broadcom STB SoCs.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
2020-01-17 12:03:18 -08:00
Olof Johansson
bd4d5488d3 Merge tag 'ti-k3-soc-for-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/kristo/linux into arm/dt
Texas Instruments K3 SoC family changes for 5.6

- Add missing power domains for smmu for J721e
- Add I2C, ADC, OSPI and UFS nodes for J721e
- Add OSPI and MCU syscon nodes for am65x
- Add IRQ line for GPIO expander on am65x

* tag 'ti-k3-soc-for-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/kristo/linux:
  arm64: dts: ti: k3-j721e-main: Add missing power-domains for smmu
  arm64: dts: ti: k3-am65-mcu: add system control module node
  arm64: dts: k3-am654-base-board: Add IRQ line for GPIO expander
  arm64: dts: ti: k3-am65: Add OSPI DT node
  arm64: dts: ti: k3-j721e: Add DT nodes for few peripherials

Link: https://lore.kernel.org/r/c5b74bfc-f2f0-1b72-4a3c-4c1d478a023a@ti.com
Signed-off-by: Olof Johansson <olof@lixom.net>
2020-01-17 12:01:31 -08:00
Robin Murphy
c2c24edb1d arm64: csum: Fix pathological zero-length calls
In validating the checksumming results of the new routine, I sadly
neglected to test its not-checksumming results. Thus it slipped through
that the one case where @buff is already dword-aligned and @len = 0
manages to defeat the tail-masking logic and behave as if @len = 8.
For a zero length it doesn't make much sense to deference @buff anyway,
so just add an early return (which has essentially zero impact on
performance).

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-17 16:05:50 +00:00
Masahiro Yamada
e98d5023fe arm64: dts: uniphier: add reset-names to NAND controller node
The Denali NAND controller IP has separate reset control for the
controller core and registers.

Add the reset-names, and one more phandle accordingly. This is the
approved DT-binding.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2020-01-18 00:56:18 +09:00
Mark Rutland
3e3934176a arm64: entry: cleanup sp_el0 manipulation
The kernel stashes the current task struct in sp_el0 so that this can be
acquired consistently/cheaply when required. When we take an exception
from EL0 we have to:

1) stash the original sp_el0 value
2) find the current task
3) update sp_el0 with the current task pointer

Currently steps #1 and #2 occur in one place, and step #3 a while later.
As the value of sp_el0 is immaterial between these points, let's move
them together to make the code clearer and minimize ifdeffery. This
necessitates moving the comment for MDSCR_EL1.SS.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-17 13:22:14 +00:00
Mark Rutland
7a2c094464 arm64: entry: cleanup el0 svc handler naming
For most of the exception entry code, <foo>_handler() is the first C
function called from the entry assembly in entry-common.c, and external
functions handling the bulk of the logic are called do_<foo>().

For consistency, apply this scheme to el0_svc_handler and
el0_svc_compat_handler, renaming them to do_el0_svc and
do_el0_svc_compat respectively.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-17 13:22:14 +00:00
Mark Rutland
2d226c1e1c arm64: entry: mark all entry code as notrace
Almost all functions in entry-common.c are marked notrace, with
el1_undef and el1_inv being the only exceptions. We appear to have done
this on the assumption that there were no exception registers that we
needed to snapshot, and thus it was safe to run trace code that might
result in further exceptions and clobber those registers.

However, until we inherit the DAIF flags, our irq flag tracing is stale,
and this discrepancy could set off warnings in some configurations. For
example if CONFIG_DEBUG_LOCKDEP is selected and a trace function calls
into any flag-checking locking routines. Given we don't expect to
trigger el1_undef or el1_inv unless something is already wrong, any
irqflag warnigns are liable to mask the information we'd actually care
about.

Let's keep things simple and mark el1_undef and el1_inv as notrace.
Developers can trace do_undefinstr and bad_mode if they really want to
monitor these cases.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-17 13:22:14 +00:00
Mark Rutland
ddb953f86c arm64: assembler: remove smp_dmb macro
These days arm64 kernels are always SMP, and thus smp_dmb is an
overly-long way of writing dmb. Naturally, no-one uses it.

Remove the unused macro.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-17 13:22:13 +00:00
Mark Rutland
170b25fa6a arm64: assembler: remove inherit_daif macro
We haven't needed the inherit_daif macro since commit:

  ed3768db588291dd ("arm64: entry: convert el1_sync to C")

... which converted all callers to C and the local_daif_inherit
function.

Remove the unused macro.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-17 13:22:13 +00:00
Catalin Marinas
95b3f74bec arm64: Use macros instead of hard-coded constants for MAIR_EL1
Currently, the arm64 __cpu_setup has hard-coded constants for the memory
attributes that go into the MAIR_EL1 register. Define proper macros in
asm/sysreg.h and make use of them in proc.S.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-17 12:48:33 +00:00
Sai Prakash Ranjan
83b0c36b8a arm64: Add KRYO{3,4}XX CPU cores to spectre-v2 safe list
The "silver" KRYO3XX and KRYO4XX CPU cores are not affected by Spectre
variant 2. Add them to spectre_v2 safe list to correct the spurious
ARM_SMCCC_ARCH_WORKAROUND_1 warning and vulnerability status reported
under sysfs.

Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Tested-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
[will: tweaked commit message to remove stale mention of "gold" cores]
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-17 12:46:41 +00:00
Waiman Long
f5bfdc8e39 locking/osq: Use optimized spinning loop for arm64
Arm64 has a more optimized spinning loop (atomic_cond_read_acquire)
using wfe for spinlock that can boost performance of sibling threads
by putting the current cpu to a wait state that is broken only when
the monitored variable changes or an external event happens.

OSQ has a more complicated spinning loop. Besides the lock value, it
also checks for need_resched() and vcpu_is_preempted(). The check for
need_resched() is not a problem as it is only set by the tick interrupt
handler. That will be detected by the spinning cpu right after iret.

The vcpu_is_preempted() check, however, is a problem as changes to the
preempt state of of previous node will not affect the wait state. For
ARM64, vcpu_is_preempted is not currently defined and so is a no-op.
Will has indicated that he is planning to para-virtualize wfe instead
of defining vcpu_is_preempted for PV support. So just add a comment in
arch/arm64/include/asm/spinlock.h to indicate that vcpu_is_preempted()
should not be defined as suggested.

On a 2-socket 56-core 224-thread ARM64 system, a kernel mutex locking
microbenchmark was run for 10s with and without the patch. The
performance numbers before patch were:

Running locktest with mutex [runtime = 10s, load = 1]
Threads = 224, Min/Mean/Max = 316/123,143/2,121,269
Threads = 224, Total Rate = 2,757 kop/s; Percpu Rate = 12 kop/s

After patch, the numbers were:

Running locktest with mutex [runtime = 10s, load = 1]
Threads = 224, Min/Mean/Max = 334/147,836/1,304,787
Threads = 224, Total Rate = 3,311 kop/s; Percpu Rate = 15 kop/s

So there was about 20% performance improvement.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200113150735.21956-1-longman@redhat.com
2020-01-17 10:19:30 +01:00