25527 Commits

Author SHA1 Message Date
Arnd Bergmann
931a2ca6a5 arm64: ftrace: fix build error with CONFIG_FUNCTION_GRAPH_TRACER=n
It appears that a merge conflict ended up hiding a newly added constant
in some configurations:

arch/arm64/kernel/entry-ftrace.S: Assembler messages:
arch/arm64/kernel/entry-ftrace.S:59: Error: undefined symbol FTRACE_OPS_DIRECT_CALL used as an immediate value

FTRACE_OPS_DIRECT_CALL is still used when CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
is enabled, even if CONFIG_FUNCTION_GRAPH_TRACER is disabled, so change the
ifdef accordingly.

Link: https://lkml.kernel.org/r/20230623152204.2216297-1-arnd@kernel.org

Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Donglin Peng <pengdonglin@sangfor.com.cn>
Fixes: 3646970322464 ("arm64: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Florent Revest <revest@chromium.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-05 09:46:19 -04:00
Donglin Peng
3646970322 arm64: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL
The previous patch ("function_graph: Support recording and printing
the return value of function") has laid the groundwork for the for
the funcgraph-retval, and this modification makes it available on
the ARM64 platform.

We introduce a new structure called fgraph_ret_regs for the ARM64
platform to hold return registers and the frame pointer. We then
fill its content in the return_to_handler and pass its address to
the function ftrace_return_to_handler to record the return value.

Link: https://lkml.kernel.org/r/c78366416ce93f704ae7000c4ee60eb4258c38f7.1680954589.git.pengdonglin@sangfor.com.cn

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Donglin Peng <pengdonglin@sangfor.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-06-20 18:38:37 -04:00
Linus Torvalds
b066935bf8 ARM:
* Address some fallout of the locking rework, this time affecting
   the way the vgic is configured
 
 * Fix an issue where the page table walker frees a subtree and
   then proceeds with walking what it has just freed...
 
 * Check that a given PA donated to the guest is actually memory
   (only affecting pKVM)
 
 * Correctly handle MTE CMOs by Set/Way
 
 * Fix the reported address of a watchpoint forwarded to userspace
 
 * Fix the freeing of the root of stage-2 page tables
 
 * Stop creating spurious PMU events to perform detection of the
   default PMU and use the existing PMU list instead.
 
 x86:
 
 * Fix a memslot lookup bug in the NX recovery thread that could
    theoretically let userspace bypass the NX hugepage mitigation
 
 * Fix a s/BLOCKING/PENDING bug in SVM's vNMI support
 
 * Account exit stats for fastpath VM-Exits that never leave the super
   tight run-loop
 
 * Fix an out-of-bounds bug in the optimized APIC map code, and add a
   regression test for the race.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmR7k1QUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNblwf/faUVOBMv7mQBGsGa7FNcmaNhYeIT
 U1k4pFNlo7dNNuNJrGdpo+sOGP5A8CRLNSVvlyjgCHF1Qc9gVtXNvZ9PnA6nAYmB
 qqvUz/TDw9/NLTlJEkbSs05B4am4yfd5pV6R/32jrPIbXOW++6ae2LpILS/NPBrB
 y0tGiVUJrO3zVXdBKa4PFmlO8jsXPmMEiicEJa5v2Boeo5SFyFfErw9zDNwSMsQc
 27bzbs3O2daXTNMFnwVCCpWUxt1EqWYUXGvBjsChAUI0K10F2/GW9f6YeFsGXqKI
 d8g1QuCukSt/CvN0pT+g/540mR6i0Azpek1myQfuCu2IhQ1jCJaSWOjoEw==
 =8VrO
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Address some fallout of the locking rework, this time affecting the
     way the vgic is configured

   - Fix an issue where the page table walker frees a subtree and then
     proceeds with walking what it has just freed...

   - Check that a given PA donated to the guest is actually memory (only
     affecting pKVM)

   - Correctly handle MTE CMOs by Set/Way

   - Fix the reported address of a watchpoint forwarded to userspace

   - Fix the freeing of the root of stage-2 page tables

   - Stop creating spurious PMU events to perform detection of the
     default PMU and use the existing PMU list instead

  x86:

   - Fix a memslot lookup bug in the NX recovery thread that could
     theoretically let userspace bypass the NX hugepage mitigation

   - Fix a s/BLOCKING/PENDING bug in SVM's vNMI support

   - Account exit stats for fastpath VM-Exits that never leave the super
     tight run-loop

   - Fix an out-of-bounds bug in the optimized APIC map code, and add a
     regression test for the race"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: Add test for race in kvm_recalculate_apic_map()
  KVM: x86: Bail from kvm_recalculate_phys_map() if x2APIC ID is out-of-bounds
  KVM: x86: Account fastpath-only VM-Exits in vCPU stats
  KVM: SVM: vNMI pending bit is V_NMI_PENDING_MASK not V_NMI_BLOCKING_MASK
  KVM: x86/mmu: Grab memslot for correct address space in NX recovery worker
  KVM: arm64: Document default vPMU behavior on heterogeneous systems
  KVM: arm64: Iterate arm_pmus list to probe for default PMU
  KVM: arm64: Drop last page ref in kvm_pgtable_stage2_free_removed()
  KVM: arm64: Populate fault info for watchpoint
  KVM: arm64: Reload PTE after invoking walker callback on preorder traversal
  KVM: arm64: Handle trap of tagged Set/Way CMOs
  arm64: Add missing Set/Way CMO encodings
  KVM: arm64: Prevent unconditional donation of unmapped regions from the host
  KVM: arm64: vgic: Fix a comment
  KVM: arm64: vgic: Fix locking comment
  KVM: arm64: vgic: Wrap vgic_its_create() with config_lock
  KVM: arm64: vgic: Fix a circular locking issue
2023-06-04 07:16:53 -04:00
Oliver Upton
40e54cad45 KVM: arm64: Document default vPMU behavior on heterogeneous systems
KVM maintains a mask of supported CPUs when a vPMU type is explicitly
selected by userspace and is used to reject any attempt to run the vCPU
on an unsupported CPU. This is great, but we're still beholden to the
default behavior where vCPUs can be scheduled anywhere and guest
counters may silently stop working.

Avoid confusing the next poor sod to look at this code and document the
intended behavior.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230525212723.3361524-3-oliver.upton@linux.dev
2023-05-31 10:29:56 +01:00
Oliver Upton
1c913a1c35 KVM: arm64: Iterate arm_pmus list to probe for default PMU
To date KVM has relied on using a perf event to probe the core PMU at
the time of vPMU initialization. Behind the scenes perf_event_init()
would iteratively walk the PMUs of the system and return the PMU that
could handle the event. However, an upcoming change in perf core will
drop the iterative walk, thereby breaking the fragile dance we do on the
KVM side.

Avoid the problem altogether by iterating over the list of supported
PMUs maintained in KVM, returning the core PMU that matches the CPU
we were called on.

Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230525212723.3361524-2-oliver.upton@linux.dev
2023-05-31 10:29:56 +01:00
Oliver Upton
f6a27d6dc5 KVM: arm64: Drop last page ref in kvm_pgtable_stage2_free_removed()
The reference count on page table allocations is increased for every
'counted' PTE (valid or donated) in the table in addition to the initial
reference from ->zalloc_page(). kvm_pgtable_stage2_free_removed() fails
to drop the last reference on the root of the table walk, meaning we
leak memory.

Fix it by dropping the last reference after the free walker returns,
at which point all references for 'counted' PTEs have been released.

Cc: stable@vger.kernel.org
Fixes: 5c359cca1faf ("KVM: arm64: Tear down unlinked stage-2 subtree after break-before-make")
Reported-by: Yu Zhao <yuzhao@google.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Tested-by: Yu Zhao <yuzhao@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230530193213.1663411-1-oliver.upton@linux.dev
2023-05-31 08:02:21 +01:00
Akihiko Odaki
811154e234 KVM: arm64: Populate fault info for watchpoint
When handling ESR_ELx_EC_WATCHPT_LOW, far_el2 member of struct
kvm_vcpu_fault_info will be copied to far member of struct
kvm_debug_exit_arch and exposed to the userspace. The userspace will
see stale values from older faults if the fault info does not get
populated.

Fixes: 8fb2046180a0 ("KVM: arm64: Move early handlers to per-EC handlers")
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230530024651.10014-1-akihiko.odaki@daynix.com
Cc: stable@vger.kernel.org
2023-05-30 08:39:07 +01:00
Arnd Bergmann
66bbb32978 i.MX fixes for 6.4:
- A couple of i.MX8MN/P video clock changes from Adam Ford to fix issue
   with clock re-parenting.
 - Add missing pvcie-supply regulator for imx6qdl-mba6 board.
 - A series of colibri-imx8x board fixes on pin configuration.
 - Set and limit the mode for PMIC bucks for imx6ull-dhcor board to fix
   stability problems.
 - A couple of changes from Frank Li to correct cdns,usb3 bindings
   cdns,on-chip-buff-size property and fix USB 3.0 gadget failure on
   i.MX8QM & QXPB0.
 - Add a required PHY deassert delay for imx8mn-var-som board to fix PHY
   detection failure.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCgAyFiEEFmJXigPl4LoGSz08UFdYWoewfM4FAmRjhcYUHHNoYXduZ3Vv
 QGtlcm5lbC5vcmcACgkQUFdYWoewfM41EQf/Rjk68gAG2dZfFV33PYcnONrJJYuz
 gtnBME7XtHqSR5dKByUIXqtSrn9ROY3RQt0Kp2dWv/dY248PtF4IdldsAjr6tF6P
 Sy8m6tdG9n+tvCgsHGKxhomLm2Wophwt5+Na8G+3XPLzs9PsiBuRLIHWPMAqOiZX
 +TzpzNOwflGt49HRqAObzAexmR24cG9U6N5dcNb/avd4qMguuh4UVkMpTsDuz9gi
 Lpt3K8yCjy8AexlB4Fti/8F1cZUcmJRRsUeFhlduBLgeHKHAEJaTJZuTHUtkmRXT
 CxW3ya2HKGJE4Swea4CDpjhrOS3nZBkp1Z2VyjPUJIszauuBZzHm4Pvkdw==
 =vZ+e
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmRvfT4ACgkQYKtH/8kJ
 Uien9Q/+PVS6G/lEWu3nV8J8GkQlj/fKO4mUnJiSYS5HNeDdBj6M20qlHZvd4dMb
 +Kt6IC5jpnDSRajbjBOxzSD6aAsmV1VUrBa8k/TfteHUnLCfS0okQ/DC9YUVtHgr
 5AVokHlNvFI9dDTTeQ6AuyNp/kn57h6+3jSdBQAhDTyNXYDQEkEmeFYpEx4v7VlL
 E9EEjsBK8f2dQe1x7GSZuNrjU0JRsK7KJZBa7aOShAFHz5Mj/hJ7XFtchpW1ZQ+u
 lBn3VMucLADJMKnDYZ6O28hZ8My3mB9vm0Wd0n0N6slRFOQHzeX2dAcvQEVMMQPV
 Ll6ddU09j5e/rNc/mQUuHhEMOS6pZjM3FISrI9QDJOc3s3wHXQSdoikTF+oBDsO3
 imfMZyGD5W1rapWDTr+i8clmGOZl5riKcsvm5LPrmlBQSZpDKphCpkooryCF7XSM
 +KYrHFSPV8iwgB4uO8/Ow8QTNfGe3pDRUo1eQ8uPryX5ZgTOZKGvpzcO4puK5iCu
 NaXkLm4KcVLq6E7yfIqhPGkiJtV3VCnFDx3D7cupk0IeZW+9Z8/4CrmeO40QcKUb
 qT8k+caijpuCDDl/gHxuHA9Ld1vSbl1VL18OrqWTfK3NbrP8o/dytINlspscy8fj
 JDExUWKVIQrhJqyP9VdYEiTFQ+A4SznE9OVdnLeCu0hM70wfqjg=
 =6H6u
 -----END PGP SIGNATURE-----

Merge tag 'imx-fixes-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes

i.MX fixes for 6.4:

- A couple of i.MX8MN/P video clock changes from Adam Ford to fix issue
  with clock re-parenting.
- Add missing pvcie-supply regulator for imx6qdl-mba6 board.
- A series of colibri-imx8x board fixes on pin configuration.
- Set and limit the mode for PMIC bucks for imx6ull-dhcor board to fix
  stability problems.
- A couple of changes from Frank Li to correct cdns,usb3 bindings
  cdns,on-chip-buff-size property and fix USB 3.0 gadget failure on
  i.MX8QM & QXPB0.
- Add a required PHY deassert delay for imx8mn-var-som board to fix PHY
  detection failure.

* tag 'imx-fixes-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  arm64: dts: imx8: fix USB 3.0 Gadget Failure in QM & QXPB0 at super speed
  dt-binding: cdns,usb3: Fix cdns,on-chip-buff-size type
  arm64: dts: colibri-imx8x: delete adc1 and dsp
  arm64: dts: colibri-imx8x: fix iris pinctrl configuration
  arm64: dts: colibri-imx8x: move pinctrl property from SoM to eval board
  arm64: dts: colibri-imx8x: fix eval board pin configuration
  arm64: dts: imx8mp: Fix video clock parents
  ARM: dts: imx6qdl-mba6: Add missing pvcie-supply regulator
  ARM: dts: imx6ull-dhcor: Set and limit the mode for PMIC buck 1, 2 and 3
  arm64: dts: imx8mn-var-som: fix PHY detection bug by adding deassert delay
  arm64: dts: imx8mn: Fix video clock parents

Link: https://lore.kernel.org/r/20230516133625.GI767028@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-05-25 17:22:38 +02:00
Arnd Bergmann
d14b555c33 Arm FVP/Vexpress fixes for v6.4
Couple of fixes to address the missing required 'cache-unified' property
 in the level 2 and 3 caches on some of the FVP/vexpress platforms.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEunHlEgbzHrJD3ZPhAEG6vDF+4pgFAmRaIG4ACgkQAEG6vDF+
 4pgkag//TU1DhDNQcDveZ8eVb4BdyWolUit4DdiR6raZLsZ8JXbBZ9GYLsZgogui
 q+5i9UyCDfmyEeEf6pBieB4cNiiq/cIRbCzcj+6ngSSf5SaOcFwBDpdml7aFsh5V
 2jNca/kJaP9uxHDmWVMyPPmtdzoIFCFytGX6TVNCor4lW8AW7QuY7373AlyBhgo2
 eAcsZ9ZwsJ7PC80cLE1D7sa+Kxsv63A7Y87ETgSK/HkiwLXJ6aYV766tm3u1jf8S
 GGLQ3zu75WsOEIyhiKCEXTSFPCqNA7Cblv49tw3Ok7AxEYpmSt7Nc7har6H6mwLv
 wxDJB/WtFpW6lzCO22y5+MnO9H3u8vV1ADUYjsHoCxuzGYy67NaE/8mPveupmqHN
 CKpc9VNi0sPlF/Z+pbFLNQbLZDaahujpTTs/CXeZyDNQVRa3a6NwfCMTzgwXcKjS
 qYOW6gSRgJZ7W1UNvGSBWIFHksRpBgAdVrm7CfFJjw/NZ5EA1bgCWbWzXM7tmcQf
 9/mCeRcgOYAFehwhzUBWdTPYNsvQ92xUj2LqzrdIPUWm/0JvfXNRJ6aE7clgY2v1
 eKExorNZtKvaLQYA4nBDMzYkmUefgmCizxHLINADWoA28oRBmsZq6f42cTh5Ft+x
 Me3gLlot3RWHZWtOl+zaDQQaYS4gJudtNK9dKfEpHFGE8VQpgXA=
 =rMXM
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmRvfGsACgkQYKtH/8kJ
 UicW8BAAgyy7PNBRPCCD/9I3Sop8t4x515a691V5mMEZ57X59JyvjDH5x1N723Uk
 0zqHbngSEiCcFzj9ZPs5MSa9ezDQsCtCb8omlhEk2LToE9+fIVBmOycS7Wn1wfvw
 MjH+ioQ9iR5YZThw+fp+m3RXAf2fEqfdYx5zVXJdA73I1sI0Xcg0PgSR4hOLuVdH
 tieLYL6X/UShATHd1Z9f4RUaoR38eYY3+bpNUPSuAZz2F+A1okwqbqCGiwGBaRBI
 QBzAnDdpC9pSw1G0qG4A6yjMBvbfvfcdICrYnJ0XVcax92pVM9IJQA6eLY4zPovw
 MTU9BcX3uFXM9sEN8LTGpSezNIXdeLcGr9WRxB+GX5yx0za8ktj8tUOEnA45K48i
 QgJUjViEP31WsXxKa11bipV0YfddT2hvGFPdsfi7yK/apWExHa6zeCP42rGjBb/X
 H6S+g4iCY314bgvRPiXjsabNYJcr+0+R1vdWQeZMTN/U/ppYzu1gYT7lfU1q7t/I
 zsuv9fA05NiC3uwEoBO6nqxSbr4LOjERM1kVDssA8ooesQwoxeexbl5HBInOG8i+
 1V3bTuScv/wNf5M1XmD4EzcjuiZ2LFM2cjlpqNjsM/G5r1+03tDhgS9xIf8hv4AV
 Pf4aFLUwtuwQJz05V8TS+XhzBj++PEzHXAv10vAEcUMtjbKtznI=
 =Orpf
 -----END PGP SIGNATURE-----

Merge tag 'juno-fixes-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes

Arm FVP/Vexpress fixes for v6.4

Couple of fixes to address the missing required 'cache-unified' property
in the level 2 and 3 caches on some of the FVP/vexpress platforms.

* tag 'juno-fixes-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
  arm64: dts: arm: add missing cache properties
  ARM: dts: vexpress: add missing cache properties

Link: https://lore.kernel.org/r/20230509143508.1188786-1-sudeep.holla@arm.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-05-25 17:19:07 +02:00
Fuad Tabba
a9f0e3d5a0 KVM: arm64: Reload PTE after invoking walker callback on preorder traversal
The preorder callback on the kvm_pgtable_stage2_map() path can replace
a table with a block, then recursively free the detached table. The
higher-level walking logic stashes the old page table entry and
then walks the freed table, invoking the leaf callback and
potentially freeing pgtable pages prematurely.

In normal operation, the call to tear down the detached stage-2
is indirected and uses an RCU callback to trigger the freeing.
RCU is not available to pKVM, which is where this bug is
triggered.

Change the behavior of the walker to reload the page table entry
after invoking the walker callback on preorder traversal, as it
does for leaf entries.

Tested on Pixel 6.

Fixes: 5c359cca1faf ("KVM: arm64: Tear down unlinked stage-2 subtree after break-before-make")
Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230522103258.402272-1-tabba@google.com
2023-05-24 13:47:12 +01:00
Marc Zyngier
d282fa3c5c KVM: arm64: Handle trap of tagged Set/Way CMOs
We appear to have missed the Set/Way CMOs when adding MTE support.
Not that we really expect anyone to use them, but you never know
what stupidity some people can come up with...

Treat these mostly like we deal with the classic S/W CMOs, only
with an additional check that MTE really is enabled.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20230515204601.1270428-3-maz@kernel.org
2023-05-24 13:45:18 +01:00
Marc Zyngier
8d0f019e4c arm64: Add missing Set/Way CMO encodings
Add the missing Set/Way CMOs that apply to tagged memory.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20230515204601.1270428-2-maz@kernel.org
2023-05-24 13:45:18 +01:00
Linus Torvalds
a35747c310 ARM:
* Plug a race in the stage-2 mapping code where the IPA and the PA
   would end up being out of sync
 
 * Make better use of the bitmap API (bitmap_zero, bitmap_zalloc...)
 
 * FP/SVE/SME documentation update, in the hope that this field
   becomes clearer...
 
 * Add workaround for Apple SEIS brokenness to a new SoC
 
 * Random comment fixes
 
 x86:
 
 * add MSR_IA32_TSX_CTRL into msrs_to_save
 
 * fixes for XCR0 handling in SGX enclaves
 
 Generic:
 
 * Fix vcpu_array[0] races
 
 * Fix race between starting a VM and "reboot -f"
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmRp0WIUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPqVwf+OFayNPpURAFqfrOuISYW7hoCL24+
 sCtXyVv4Ei0np1vGekit2h/m8GmxO12xEBibcFeYj+YQItIqu9HvC08fRxAKaMeE
 N3p9iLuS1zcM3cEuZpg0r6QN+pKybttdadl70yho43CtagEM4FmB7dgyAo9AhyXk
 pZUaVfoO6beBQ/J6A6V/Q5xlue1LvHk1+K4rmNcYVTYn6ZOd+yYgvqng1nv5/h9b
 0HgW0aUWkEHAB67/sSnUUro707loMNTowsZlMCtgDk4Fzf8RwQ7qc8lClLk1UPjJ
 DHB6Hif9F0Q5mkrwn+c7xyVlKARaY6/FOshS2Q620q19+4fq5fUD9HgjrQ==
 =ARzH
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Plug a race in the stage-2 mapping code where the IPA and the PA
     would end up being out of sync

   - Make better use of the bitmap API (bitmap_zero, bitmap_zalloc...)

   - FP/SVE/SME documentation update, in the hope that this field
     becomes clearer...

   - Add workaround for Apple SEIS brokenness to a new SoC

   - Random comment fixes

  x86:

   - add MSR_IA32_TSX_CTRL into msrs_to_save

   - fixes for XCR0 handling in SGX enclaves

  Generic:

   - Fix vcpu_array[0] races

   - Fix race between starting a VM and 'reboot -f'"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: VMX: add MSR_IA32_TSX_CTRL into msrs_to_save
  KVM: x86: Don't adjust guest's CPUID.0x12.1 (allowed SGX enclave XFRM)
  KVM: VMX: Don't rely _only_ on CPUID to enforce XCR0 restrictions for ECREATE
  KVM: Fix vcpu_array[0] races
  KVM: VMX: Fix header file dependency of asm/vmx.h
  KVM: Don't enable hardware after a restart/shutdown is initiated
  KVM: Use syscore_ops instead of reboot_notifier to hook restart/shutdown
  KVM: arm64: vgic: Add Apple M2 PRO/MAX cpus to the list of broken SEIS implementations
  KVM: arm64: Clarify host SME state management
  KVM: arm64: Restructure check for SVE support in FP trap handler
  KVM: arm64: Document check for TIF_FOREIGN_FPSTATE
  KVM: arm64: Fix repeated words in comments
  KVM: arm64: Constify start/end/phys fields of the pgtable walker data
  KVM: arm64: Infer PA offset from VA in hyp map walker
  KVM: arm64: Infer the PA offset from IPA in stage-2 map walker
  KVM: arm64: Use the bitmap API to allocate bitmaps
  KVM: arm64: Slightly optimize flush_context()
2023-05-21 13:58:37 -07:00
Will Deacon
09cce60bdd KVM: arm64: Prevent unconditional donation of unmapped regions from the host
Since host stage-2 mappings are created lazily, we cannot rely solely on
the pte in order to recover the target physical address when checking a
host-initiated memory transition as this permits donation of unmapped
regions corresponding to MMIO or "no-map" memory.

Instead of inspecting the pte, move the addr_is_allowed_memory() check
into the host callback function where it is passed the physical address
directly from the walker.

Cc: Quentin Perret <qperret@google.com>
Fixes: e82edcc75c4e ("KVM: arm64: Implement do_share() helper for sharing memory")
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230518095844.1178-1-will@kernel.org
2023-05-19 10:20:30 +01:00
Jean-Philippe Brucker
6254873226 KVM: arm64: vgic: Fix a comment
It is host userspace, not the guest, that issues
KVM_DEV_ARM_VGIC_GRP_CTRL

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230518100914.2837292-5-jean-philippe@linaro.org
2023-05-19 10:20:00 +01:00
Jean-Philippe Brucker
c38b8400ae KVM: arm64: vgic: Fix locking comment
It is now config_lock that must be held, not kvm lock. Replace the
comment with a lockdep annotation.

Fixes: f00327731131 ("KVM: arm64: Use config_lock to protect vgic state")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230518100914.2837292-4-jean-philippe@linaro.org
2023-05-19 10:20:00 +01:00
Jean-Philippe Brucker
9cf2f840c4 KVM: arm64: vgic: Wrap vgic_its_create() with config_lock
vgic_its_create() changes the vgic state without holding the
config_lock, which triggers a lockdep warning in vgic_v4_init():

[  358.667941] WARNING: CPU: 3 PID: 178 at arch/arm64/kvm/vgic/vgic-v4.c:245 vgic_v4_init+0x15c/0x7a8
...
[  358.707410]  vgic_v4_init+0x15c/0x7a8
[  358.708550]  vgic_its_create+0x37c/0x4a4
[  358.709640]  kvm_vm_ioctl+0x1518/0x2d80
[  358.710688]  __arm64_sys_ioctl+0x7ac/0x1ba8
[  358.711960]  invoke_syscall.constprop.0+0x70/0x1e0
[  358.713245]  do_el0_svc+0xe4/0x2d4
[  358.714289]  el0_svc+0x44/0x8c
[  358.715329]  el0t_64_sync_handler+0xf4/0x120
[  358.716615]  el0t_64_sync+0x190/0x194

Wrap the whole of vgic_its_create() with config_lock since, in addition
to calling vgic_v4_init(), it also modifies the global kvm->arch.vgic
state.

Fixes: f00327731131 ("KVM: arm64: Use config_lock to protect vgic state")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230518100914.2837292-3-jean-philippe@linaro.org
2023-05-19 10:20:00 +01:00
Jean-Philippe Brucker
59112e9c39 KVM: arm64: vgic: Fix a circular locking issue
Lockdep reports a circular lock dependency between the srcu and the
config_lock:

[  262.179917] -> #1 (&kvm->srcu){.+.+}-{0:0}:
[  262.182010]        __synchronize_srcu+0xb0/0x224
[  262.183422]        synchronize_srcu_expedited+0x24/0x34
[  262.184554]        kvm_io_bus_register_dev+0x324/0x50c
[  262.185650]        vgic_register_redist_iodev+0x254/0x398
[  262.186740]        vgic_v3_set_redist_base+0x3b0/0x724
[  262.188087]        kvm_vgic_addr+0x364/0x600
[  262.189189]        vgic_set_common_attr+0x90/0x544
[  262.190278]        vgic_v3_set_attr+0x74/0x9c
[  262.191432]        kvm_device_ioctl+0x2a0/0x4e4
[  262.192515]        __arm64_sys_ioctl+0x7ac/0x1ba8
[  262.193612]        invoke_syscall.constprop.0+0x70/0x1e0
[  262.195006]        do_el0_svc+0xe4/0x2d4
[  262.195929]        el0_svc+0x44/0x8c
[  262.196917]        el0t_64_sync_handler+0xf4/0x120
[  262.198238]        el0t_64_sync+0x190/0x194
[  262.199224]
[  262.199224] -> #0 (&kvm->arch.config_lock){+.+.}-{3:3}:
[  262.201094]        __lock_acquire+0x2b70/0x626c
[  262.202245]        lock_acquire+0x454/0x778
[  262.203132]        __mutex_lock+0x190/0x8b4
[  262.204023]        mutex_lock_nested+0x24/0x30
[  262.205100]        vgic_mmio_write_v3_misc+0x5c/0x2a0
[  262.206178]        dispatch_mmio_write+0xd8/0x258
[  262.207498]        __kvm_io_bus_write+0x1e0/0x350
[  262.208582]        kvm_io_bus_write+0xe0/0x1cc
[  262.209653]        io_mem_abort+0x2ac/0x6d8
[  262.210569]        kvm_handle_guest_abort+0x9b8/0x1f88
[  262.211937]        handle_exit+0xc4/0x39c
[  262.212971]        kvm_arch_vcpu_ioctl_run+0x90c/0x1c04
[  262.214154]        kvm_vcpu_ioctl+0x450/0x12f8
[  262.215233]        __arm64_sys_ioctl+0x7ac/0x1ba8
[  262.216402]        invoke_syscall.constprop.0+0x70/0x1e0
[  262.217774]        do_el0_svc+0xe4/0x2d4
[  262.218758]        el0_svc+0x44/0x8c
[  262.219941]        el0t_64_sync_handler+0xf4/0x120
[  262.221110]        el0t_64_sync+0x190/0x194

Note that the current report, which can be triggered by the vgic_irq
kselftest, is a triple chain that includes slots_lock, but after
inverting the slots_lock/config_lock dependency, the actual problem
reported above remains.

In several places, the vgic code calls kvm_io_bus_register_dev(), which
synchronizes the srcu, while holding config_lock (#1). And the MMIO
handler takes the config_lock while holding the srcu read lock (#0).

Break dependency #1, by registering the distributor and redistributors
without holding config_lock. The ITS also uses kvm_io_bus_register_dev()
but already relies on slots_lock to serialize calls.

The distributor iodev is created on the first KVM_RUN call. Multiple
threads will race for vgic initialization, and only the first one will
see !vgic_ready() under the lock. To serialize those threads, rely on
slots_lock rather than config_lock.

Redistributors are created earlier, through KVM_DEV_ARM_VGIC_GRP_ADDR
ioctls and vCPU creation. Similarly, serialize the iodev creation with
slots_lock, and the rest with config_lock.

Fixes: f00327731131 ("KVM: arm64: Use config_lock to protect vgic state")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230518100914.2837292-2-jean-philippe@linaro.org
2023-05-19 10:20:00 +01:00
Peter Collingbourne
c4c597f1b3 arm64: mte: Do not set PG_mte_tagged if tags were not initialized
The mte_sync_page_tags() function sets PG_mte_tagged if it initializes
page tags. Then we return to mte_sync_tags(), which sets PG_mte_tagged
again. At best, this is redundant. However, it is possible for
mte_sync_page_tags() to return without having initialized tags for the
page, i.e. in the case where check_swap is true (non-compound page),
is_swap_pte(old_pte) is false and pte_is_tagged is false. So at worst,
we set PG_mte_tagged on a page with uninitialized tags. This can happen
if, for example, page migration causes a PTE for an untagged page to
be replaced. If the userspace program subsequently uses mprotect() to
enable PROT_MTE for that page, the uninitialized tags will be exposed
to userspace.

Fix it by removing the redundant call to set_page_mte_tagged().

Fixes: e059853d14ca ("arm64: mte: Fix/clarify the PG_mte_tagged semantics")
Signed-off-by: Peter Collingbourne <pcc@google.com>
Cc: <stable@vger.kernel.org> # 6.1
Link: https://linux-review.googlesource.com/id/Ib02d004d435b2ed87603b858ef7480f7b1463052
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20230420214327.2357985-1-pcc@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-05-16 14:59:16 +01:00
Peter Collingbourne
2efbafb91e arm64: Also reset KASAN tag if page is not PG_mte_tagged
Consider the following sequence of events:

1) A page in a PROT_READ|PROT_WRITE VMA is faulted.
2) Page migration allocates a page with the KASAN allocator,
   causing it to receive a non-match-all tag, and uses it
   to replace the page faulted in 1.
3) The program uses mprotect() to enable PROT_MTE on the page faulted in 1.

As a result of step 3, we are left with a non-match-all tag for a page
with tags accessible to userspace, which can lead to the same kind of
tag check faults that commit e74a68468062 ("arm64: Reset KASAN tag in
copy_highpage with HW tags only") intended to fix.

The general invariant that we have for pages in a VMA with VM_MTE_ALLOWED
is that they cannot have a non-match-all tag. As a result of step 2, the
invariant is broken. This means that the fix in the referenced commit
was incomplete and we also need to reset the tag for pages without
PG_mte_tagged.

Fixes: e5b8d9218951 ("arm64: mte: reset the page tag in page->flags")
Cc: <stable@vger.kernel.org> # 5.15
Link: https://linux-review.googlesource.com/id/I7409cdd41acbcb215c2a7417c1e50d37b875beff
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20230420210945.2313627-1-pcc@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-05-16 14:58:54 +01:00
Geert Uytterhoeven
3bc879e355 arm64: perf: Mark all accessor functions inline
When just including <asm/arm_pmuv3.h>:

    arch/arm64/include/asm/arm_pmuv3.h:31:13: error: ‘write_pmevtypern’ defined but not used [-Werror=unused-function]
       31 | static void write_pmevtypern(int n, unsigned long val)
	  |             ^~~~~~~~~~~~~~~~
    arch/arm64/include/asm/arm_pmuv3.h:24:13: error: ‘write_pmevcntrn’ defined but not used [-Werror=unused-function]
       24 | static void write_pmevcntrn(int n, unsigned long val)
	  |             ^~~~~~~~~~~~~~~
    arch/arm64/include/asm/arm_pmuv3.h:16:22: error: ‘read_pmevcntrn’ defined but not used [-Werror=unused-function]
       16 | static unsigned long read_pmevcntrn(int n)
	  |                      ^~~~~~~~~~~~~~

Fix this by adding the missing "inline" keywords to the three accessor
functions that lack them.

Fixes: df29ddf4f04b ("arm64: perf: Abstract system register accesses away")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/d53a19043c0c3bd25f6c203e73a2fb08a9661824.1683561482.git.geert+renesas@glider.be
Signed-off-by: Will Deacon <will@kernel.org>
2023-05-16 14:57:29 +01:00
Linus Walleij
b0abde8062 arm64: vdso: Pass (void *) to virt_to_page()
Like the other calls in this function virt_to_page() expects
a pointer, not an integer.

However since many architectures implement virt_to_pfn() as
a macro, this function becomes polymorphic and accepts both a
(unsigned long) and a (void *).

Fix this up with an explicit cast.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: http://lists.infradead.org/pipermail/linux-arm-kernel/2023-May/832583.html
Signed-off-by: Will Deacon <will@kernel.org>
2023-05-16 14:53:36 +01:00
Min-Hua Chen
d91d580878 arm64/mm: mark private VM_FAULT_X defines as vm_fault_t
This patch fixes several sparse warnings for fault.c:

arch/arm64/mm/fault.c:493:24: sparse: warning: incorrect type in return expression (different base types)
arch/arm64/mm/fault.c:493:24: sparse:    expected restricted vm_fault_t
arch/arm64/mm/fault.c:493:24: sparse:    got int
arch/arm64/mm/fault.c:501:32: sparse: warning: incorrect type in return expression (different base types)
arch/arm64/mm/fault.c:501:32: sparse:    expected restricted vm_fault_t
arch/arm64/mm/fault.c:501:32: sparse:    got int
arch/arm64/mm/fault.c:503:32: sparse: warning: incorrect type in return expression (different base types)
arch/arm64/mm/fault.c:503:32: sparse:    expected restricted vm_fault_t
arch/arm64/mm/fault.c:503:32: sparse:    got int
arch/arm64/mm/fault.c:511:24: sparse: warning: incorrect type in return expression (different base types)
arch/arm64/mm/fault.c:511:24: sparse:    expected restricted vm_fault_t
arch/arm64/mm/fault.c:511:24: sparse:    got int
arch/arm64/mm/fault.c:670:13: sparse: warning: restricted vm_fault_t degrades to integer
arch/arm64/mm/fault.c:670:13: sparse: warning: restricted vm_fault_t degrades to integer
arch/arm64/mm/fault.c:713:39: sparse: warning: restricted vm_fault_t degrades to integer

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Min-Hua Chen <minhuadotchen@gmail.com>
Link: https://lore.kernel.org/r/20230502151909.128810-1-minhuadotchen@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-05-16 14:50:43 +01:00
Frank Li
0f554e37da arm64: dts: imx8: fix USB 3.0 Gadget Failure in QM & QXPB0 at super speed
Resolve USB 3.0 gadget failure for QM and QXPB0 in super speed mode with
single IN and OUT endpoints, like mass storage devices, due to incorrect
ACTUAL_MEM_SIZE in ep_cap2 (32k instead of actual 18k). Implement dt
property cdns,on-chip-buff-size to override ep_cap2 and set it to 18k for
imx8QM and imx8QXP chips. No adverse effects for 8QXP C0.

Cc: stable@vger.kernel.org
Fixes: dce49449e04f ("usb: cdns3: allocate TX FIFO size according to composite EP number")
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2023-05-16 20:43:48 +08:00
Andrejs Cainikovs
b8b23fbe93 arm64: dts: colibri-imx8x: delete adc1 and dsp
i.MX8, i.MX8X, i.MX8XP and i.MX8XL SOC device trees are all based on
imx8-ss-*.dtsi files. For i.MX8X and i.MX8XP these device trees
should be updated with some peripherals removed or updated, similar
to i.MX8XL (imx8dxl-ss-*.dtsi files). However, it looks like only
i.MX8 and i.MX8XL are up to date, but for i.MX8X and i.MX8XP some
of the peripherals got inherited from imx8-ss-*.dtsi files, but in
reality they are not present on SOC.
As a result, during resource partition ownership check U-Boot receives
messages from SCU firmware about these resources not owned by boot
partition. In reality, these resources are not owned by anyone, as
they simply does not exist, but are defined in Linux device tree.
This change removes those peripherals, which are listed during
U-Boot resource partition ownership check as warnings:

  ## Flattened Device Tree blob at 9d400000
     Booting using the fdt blob at 0x9d400000
     Loading Device Tree to 00000000fd652000, end 00000000fd67efff ... OK
  Disable clock-controller@59580000 rsrc 512 not owned
  Disable clock-controller@5ac90000 rsrc 102 not owned

  Starting kernel ...

Fixes: ba5a5615d54f ("arm64: dts: freescale: add initial support for colibri imx8x")
Signed-off-by: Andrejs Cainikovs <andrejs.cainikovs@toradex.com>
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2023-05-15 11:18:53 +08:00
Emanuele Ghidoli
34e5c0cd55 arm64: dts: colibri-imx8x: fix iris pinctrl configuration
Remove GPIO3_IO10 from Iris carrier board pinctrl configuration,
this is already defined in the SOM dtsi since this is a
standard SOM functionality (wake-up button).

Duplicating it leads to the following error message
imx8qxp-pinctrl scu:pinctrl: pin IMX8QXP_QSPI0A_DATA1 already requested

Fixes: aefb5e2d974d ("arm64: dts: colibri-imx8x: Add iris carrier board")
Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
Signed-off-by: Andrejs Cainikovs <andrejs.cainikovs@toradex.com>
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2023-05-15 11:18:51 +08:00
Emanuele Ghidoli
25acffb008 arm64: dts: colibri-imx8x: move pinctrl property from SoM to eval board
Each carrier board device tree except the eval board one already override
iomuxc pinctrl property to configure unused pins as gpio.
So move also the pinctrl property to eval board device tree.
Leave the pin group definition in imx8x-colibri.dtsi to avoid duplication
and simplify configuration of gpio.

Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
Signed-off-by: Andrejs Cainikovs <andrejs.cainikovs@toradex.com>
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2023-05-15 11:18:48 +08:00
Emanuele Ghidoli
a346d4dc74 arm64: dts: colibri-imx8x: fix eval board pin configuration
Fix pinctrl groups to have SODIMM 75 only in one group.
Remove configuration of the pin at SoM level because it is normally
used as CSI_MCLK at camera interface connector.
Without this fix it is not possible, without redefining iomuxc pinctrl,
to use CSI_MCLK signal and leads to the following error messages:

imx8qxp-pinctrl scu:pinctrl: pin IMX8QXP_CSI_MCLK already requested
imx8qxp-pinctrl scu:pinctrl: pin-147 (16-003c) status -22

Fixes: 4d2adf738169 ("arm64: dts: colibri-imx8x: Split pinctrl_hog1")
Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
Signed-off-by: Andrejs Cainikovs <andrejs.cainikovs@toradex.com>
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2023-05-15 11:18:33 +08:00
Adam Ford
07bb2e3688 arm64: dts: imx8mp: Fix video clock parents
There are a few clocks whose parents are set in mipi_dsi
and lcdif nodes, but these clocks are used by the media_blk_ctrl
power domain.  This may cause an issue when re-parenting, because
the media_blk_ctrl may start the clocks before the reparent is
done resulting in a disp_pixel clock having the wrong parent and
rate.

Fix this by moving the assigned-clock-parents and rates to the
media_blk_ctrl node to configure these clocks before they are enabled.

After this patch, both disp1_pix_root and dixp2_pix_root clock
become children of the video_pll1.

video_pll1_ref_sel           24000000
  video_pll1               1039500000
    video_pll1_bypass        1039500000
      video_pll1_out           1039500000
        media_disp2_pix          1039500000
          media_disp2_pix_root_clk   1039500000
        media_disp1_pix          1039500000
          media_disp1_pix_root_clk   1039500000

Fixes: eda09fe149df ("arm64: dts: imx8mp: Add display pipeline components")
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2023-05-15 11:15:31 +08:00
Hugo Villeneuve
f161cea5a2 arm64: dts: imx8mn-var-som: fix PHY detection bug by adding deassert delay
While testing the ethernet interface on a Variscite symphony carrier
board using an imx8mn SOM with an onboard ADIN1300 PHY (EC hardware
configuration), the ethernet PHY is not detected.

The ADIN1300 datasheet indicate that the "Management interface
active (t4)" state is reached at most 5ms after the reset signal is
deasserted.

The device tree in Variscite custom git repository uses the following
property:

    phy-reset-post-delay = <20>;

Add a new MDIO property 'reset-deassert-us' of 20ms to have the same
delay inside the ethphy node. Adding this property fixes the problem
with the PHY detection.

Note that this SOM can also have an Atheros AR8033 PHY. In this case,
a 1ms deassert delay is sufficient. Add a comment to that effect.

Fixes: ade0176dd8a0 ("arm64: dts: imx8mn-var-som: Add Variscite VAR-SOM-MX8MN System on Module")
Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2023-05-14 20:56:19 +08:00
Adam Ford
2ac6c4a637 arm64: dts: imx8mn: Fix video clock parents
There are a few clocks whose parents are set in mipi_dsi
and mxsfb nodes, but these clocks are used by the disp_blk_ctrl
power domain which may cause an issue when re-parenting, resuling
in a disp_pixel clock having the wrong parent and wrong rate.

Fix this by moving the assigned-clock-parents as associate clock
assignments to the power-domain node to setup these clocks before
they are enabled.

Fixes: d825fb6455d5 ("arm64: dts: imx8mn: Add display pipeline components")
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2023-05-14 20:32:25 +08:00
Marc Zyngier
c3a62df457 Merge branch kvm-arm64/pgtable-fixes-6.4 into kvmarm-master/fixes
* kvm-arm64/pgtable-fixes-6.4:
  : .
  : Fixes for concurrent S2 mapping race from Oliver:
  :
  : "So it appears that there is a race between two parallel stage-2 map
  : walkers that could lead to mapping the incorrect PA for a given IPA, as
  : the IPA -> PA relationship picks up an unintended offset. This series
  : eliminates the problem by using the current IPA of the walk as the
  : source-of-truth regarding where we are in a map operation."
  : .
  KVM: arm64: Constify start/end/phys fields of the pgtable walker data
  KVM: arm64: Infer PA offset from VA in hyp map walker
  KVM: arm64: Infer the PA offset from IPA in stage-2 map walker

Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-05-11 15:26:01 +01:00
Marc Zyngier
9a48c597d6 Merge branch kvm-arm64/misc-6.4 into kvmarm-master/fixes
* kvm-arm64/misc-6.4:
  : .
  : Minor changes for 6.4:
  :
  : - Make better use of the bitmap API (bitmap_zero, bitmap_zalloc...)
  :
  : - FP/SVE/SME documentation update, in the hope that this field
  :   becomes clearer...
  :
  : - Add workaround for the usual Apple SEIS brokenness
  :
  : - Random comment fixes
  : .
  KVM: arm64: vgic: Add Apple M2 PRO/MAX cpus to the list of broken SEIS implementations
  KVM: arm64: Clarify host SME state management
  KVM: arm64: Restructure check for SVE support in FP trap handler
  KVM: arm64: Document check for TIF_FOREIGN_FPSTATE
  KVM: arm64: Fix repeated words in comments
  KVM: arm64: Use the bitmap API to allocate bitmaps
  KVM: arm64: Slightly optimize flush_context()

Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-05-11 15:25:58 +01:00
Marc Zyngier
e910baa9c1 KVM: arm64: vgic: Add Apple M2 PRO/MAX cpus to the list of broken SEIS implementations
Unsurprisingly, the M2 PRO is also affected by the SEIS bug, so add it
to the naughty list. And since M2 MAX is likely to be of the same ilk,
flag it as well.

Tested on a M2 PRO mini machine.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20230501182141.39770-1-maz@kernel.org
2023-05-11 15:17:02 +01:00
Krzysztof Kozlowski
55b37d9c8b arm64: dts: arm: add missing cache properties
As all level 2 and level 3 caches are unified, add required
cache-unified properties to fix warnings like:

  foundation-v8.dtb: l2-cache0: 'cache-unified' is a required property

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230421223213.115639-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2023-05-08 20:40:36 +01:00
Linus Torvalds
b115d85a95 Locking changes in v6.4:
- Introduce local{,64}_try_cmpxchg() - a slightly more optimal
    primitive, which will be used in perf events ring-buffer code.
 
  - Simplify/modify rwsems on PREEMPT_RT, to address writer starvation.
 
  - Misc cleanups/fixes.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmRUvUoRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hlIhAArP33rTKi+HAndQ3UHW3XtmHRxEEQTfiE
 wvIoN89h58QW4DGMeAV4ltafbIPQAkI233Aogwz903L0qbDV0Ro4OU3XJembRuWl
 LeOADKwYyypXdOa8XICuY9aIP7e1/h0DF3ySs7inLcwK9JCyAIxnsVHYej+hsRXA
 kZoXN98T3TR1C0V9UQy4SU3HI1lC3tsG3R9Ti9TnYUg3ygVXhRE9lOQ4kv9lFPVz
 BNuj2Blj7KNiVaY9kehrhO54THI7NmsCVZO44Rcl48I0KAcFulAmFcNlE7GnR8Nj
 thj38pU6XAFVHXG8MYjgE+Al+PnK48NtJxexCtHyGvGG4D2aLzRMnkolxAUCcVuK
 G+UBsQm3ybjYgHgt1zuN6ehcpT+5tULkDH8JA7vrgZYaVgxHzsUaHgYfCCWKnmUY
 mPR6aImEmYZwZVNLskhe0HT4mq244bp+VnWlnJ6LZK7t/itenvDhqnj7KTi4Bfej
 lTHplOTitV/8uCEW8V4pX+YTEenVsIQmTc/G3iIabXP/6HzLffA3q4vyW6vKIErE
 pqrpuFA0Z4GB+pU0mJXt7+I7zscDVthwI055jDyQBjA7IcdVGm2MjQ6xcNRW5FYN
 UynvaEMocue4ZO4WdFsd1ZBUd9VfoNzGQspBw46DhCL1MEQBYv36SKQNjej/9aRr
 ilVwqnOWI2s=
 =mM0A
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2023-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:

 - Introduce local{,64}_try_cmpxchg() - a slightly more optimal
   primitive, which will be used in perf events ring-buffer code

 - Simplify/modify rwsems on PREEMPT_RT, to address writer starvation

 - Misc cleanups/fixes

* tag 'locking-core-2023-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/atomic: Correct (cmp)xchg() instrumentation
  locking/x86: Define arch_try_cmpxchg_local()
  locking/arch: Wire up local_try_cmpxchg()
  locking/generic: Wire up local{,64}_try_cmpxchg()
  locking/atomic: Add generic try_cmpxchg{,64}_local() support
  locking/rwbase: Mitigate indefinite writer starvation
  locking/arch: Rename all internal __xchg() names to __arch_xchg()
2023-05-05 12:56:55 -07:00
Linus Torvalds
671e148d07 arm64 fixes for -rc1
- Fix regression in CPU erratum workaround when disabling the MMU
 
 - Fix detection of pointer authentication hwcaps
 
 - Avoid writeable, executable ELF sections in vmlinux
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmRTe+UQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNJXqB/9C9DrrbHUg9ZPqAIUpXkyaxem4gpIS+kyU
 +ard53uweuQHchuR/x2s2K9Sp/ano5jGnQXEjikNy29Opu2UYI/wmsqdJEn3km8q
 kohTRsiFgQ40Y85/3iJ8ug6+llxCxK6AXdZCskdWTP56Jur0WpNiQd0a/ShYQLdX
 wBHdInT3QpDVzd5bEWDtUEj4H//tTCy4rESQyGsLhrHgb/x8uZKgZMtPJp6+Q3Eq
 ofs+PQc0qHr/Ri3ahQOCMbxTbNaLIgUzkyXbZN+y2JtxgE+l8E3Gsir2+Pv7mcSx
 1gCSLCmpwE7rVJpTykN+jA6OSsoSUSJHXs6565nF4n8+ugdL7aqR
 =Tba+
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "A few arm64 fixes that came in during the merge window for -rc1.

  The main thing is restoring the pointer authentication hwcaps, which
  disappeared during some recent refactoring

   - Fix regression in CPU erratum workaround when disabling the MMU

   - Fix detection of pointer authentication hwcaps

   - Avoid writeable, executable ELF sections in vmlinux"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: lds: move .got section out of .text
  arm64: kernel: remove SHF_WRITE|SHF_EXECINSTR from .idmap.text
  arm64: cpufeature: Fix pointer auth hwcaps
  arm64: Fix label placement in record_mmu_state()
2023-05-04 12:45:32 -07:00
Linus Torvalds
29ee463d6f hte: Changes for v6.4-rc1
The changes for the hte/timestamp subsystem include the following:
 - Add Tegra234 HTE provider and relevant DT bindings
 - Update MAINTAINERS file for the HTE subsystem
 -----BEGIN PGP SIGNATURE-----
 
 iIgEABYIADAWIQT4slW2T0Q/rXAa29f4pUxhzZTZKAUCZErbLBIcZGlwZW5wQG52
 aWRpYS5jb20ACgkQ+KVMYc2U2SiW0QEAt3bPgopjIMzaInOguZthR1pHCuKtyK7F
 u4aJAyHv7tIA/jtsFJuFO4LmiwA/IsNits5l7F36oaB94/cQGuRH1M8E
 =opcL
 -----END PGP SIGNATURE-----

Merge tag 'for-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pateldipen1984/linux

Pull hardware timestamp engine updates from Dipen Patel:
 "The changes for the hte subsystem include:

   - Add Tegra234 HTE provider and relevant DT bindings

   - Update MAINTAINERS file for the HTE subsystem"

* tag 'for-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pateldipen1984/linux:
  hte: tegra-194: Use proper includes
  hte: Use device_match_of_node()
  hte: tegra-194: Fix off by one in tegra_hte_map_to_line_id()
  hte: tegra: fix 'struct of_device_id' build error
  hte: Use of_property_present() for testing DT property presence
  gpio: tegra186: Add Tegra234 hte support
  hte: handle nvidia,gpio-controller property
  hte: Deprecate nvidia,slices property
  hte: Add Tegra234 provider
  hte: Re-phrase tegra API document
  arm64: tegra: Add Tegra234 GTE nodes
  dt-bindings: timestamp: Deprecate nvidia,slices property
  dt-bindings: timestamp: Add Tegra234 support
  MAINTAINERS: Add HTE/timestamp subsystem details
2023-05-03 11:00:27 -07:00
Fangrui Song
0fddb79bf2 arm64: lds: move .got section out of .text
Currently, the .got section is placed within the output section .text.
However, when .got is non-empty, the SHF_WRITE flag is set for .text
when linked by lld. GNU ld recognizes .text as a special section and
ignores the SHF_WRITE flag. By renaming .text, we can also get the
SHF_WRITE flag.

The kernel has performed R_AARCH64_RELATIVE resolving very early, and can
then assume that .got is read-only. Let's move .got to the vmlinux_rodata
pseudo-segment.

As Ard Biesheuvel notes:

"This matters to consumers of the vmlinux ELF representation of the
kernel image, such as syzkaller, which disregards writable PT_LOAD
segments when resolving code symbols. The kernel itself does not care
about this distinction, but given that the GOT contains data and not
code, it does not require executable permissions, and therefore does
not belong in .text to begin with."

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20230502074105.1541926-1-maskray@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-05-02 13:12:45 +01:00
ndesaulniers@google.com
4df69e0df2 arm64: kernel: remove SHF_WRITE|SHF_EXECINSTR from .idmap.text
commit d54170812ef1 ("arm64: fix .idmap.text assertion for large kernels")
modified some of the section assembler directives that declare
.idmap.text to be SHF_ALLOC instead of
SHF_ALLOC|SHF_WRITE|SHF_EXECINSTR.

This patch fixes up the remaining stragglers that were left behind.  Add
Fixes tag so that this doesn't precede related change in stable.

Fixes: d54170812ef1 ("arm64: fix .idmap.text assertion for large kernels")
Reported-by: Greg Thelen <gthelen@google.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20230428-awx-v2-1-b197ffa16edc@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-05-02 12:42:22 +01:00
Kristina Martsenko
eda081d2ef arm64: cpufeature: Fix pointer auth hwcaps
The pointer auth hwcaps are not getting reported to userspace, as they
are missing the .matches field. Add the field back.

Fixes: 876e3c8efe79 ("arm64/cpufeature: Pull out helper for CPUID register definitions")
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230428132546.2513834-1-kristina.martsenko@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-05-02 12:41:45 +01:00
Linus Torvalds
c8c655c34e s390:
* More phys_to_virt conversions
 
 * Improvement of AP management for VSIE (nested virtualization)
 
 ARM64:
 
 * Numerous fixes for the pathological lock inversion issue that
   plagued KVM/arm64 since... forever.
 
 * New framework allowing SMCCC-compliant hypercalls to be forwarded
   to userspace, hopefully paving the way for some more features
   being moved to VMMs rather than be implemented in the kernel.
 
 * Large rework of the timer code to allow a VM-wide offset to be
   applied to both virtual and physical counters as well as a
   per-timer, per-vcpu offset that complements the global one.
   This last part allows the NV timer code to be implemented on
   top.
 
 * A small set of fixes to make sure that we don't change anything
   affecting the EL1&0 translation regime just after having having
   taken an exception to EL2 until we have executed a DSB. This
   ensures that speculative walks started in EL1&0 have completed.
 
 * The usual selftest fixes and improvements.
 
 KVM x86 changes for 6.4:
 
 * Optimize CR0.WP toggling by avoiding an MMU reload when TDP is enabled,
   and by giving the guest control of CR0.WP when EPT is enabled on VMX
   (VMX-only because SVM doesn't support per-bit controls)
 
 * Add CR0/CR4 helpers to query single bits, and clean up related code
   where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long" return
   as a bool
 
 * Move AMD_PSFD to cpufeatures.h and purge KVM's definition
 
 * Avoid unnecessary writes+flushes when the guest is only adding new PTEs
 
 * Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s optimizations
   when emulating invalidations
 
 * Clean up the range-based flushing APIs
 
 * Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a single
   A/D bit using a LOCK AND instead of XCHG, and skip all of the "handle
   changed SPTE" overhead associated with writing the entire entry
 
 * Track the number of "tail" entries in a pte_list_desc to avoid having
   to walk (potentially) all descriptors during insertion and deletion,
   which gets quite expensive if the guest is spamming fork()
 
 * Disallow virtualizing legacy LBRs if architectural LBRs are available,
   the two are mutually exclusive in hardware
 
 * Disallow writes to immutable feature MSRs (notably PERF_CAPABILITIES)
   after KVM_RUN, similar to CPUID features
 
 * Overhaul the vmx_pmu_caps selftest to better validate PERF_CAPABILITIES
 
 * Apply PMU filters to emulated events and add test coverage to the
   pmu_event_filter selftest
 
 x86 AMD:
 
 * Add support for virtual NMIs
 
 * Fixes for edge cases related to virtual interrupts
 
 x86 Intel:
 
 * Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if XTILE_DATA is
   not being reported due to userspace not opting in via prctl()
 
 * Fix a bug in emulation of ENCLS in compatibility mode
 
 * Allow emulation of NOP and PAUSE for L2
 
 * AMX selftests improvements
 
 * Misc cleanups
 
 MIPS:
 
 * Constify MIPS's internal callbacks (a leftover from the hardware enabling
   rework that landed in 6.3)
 
 Generic:
 
 * Drop unnecessary casts from "void *" throughout kvm_main.c
 
 * Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the struct
   size by 8 bytes on 64-bit kernels by utilizing a padding hole
 
 Documentation:
 
 * Fix goof introduced by the conversion to rST
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmRNExkUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNyjwf+MkzDael9y9AsOZoqhEZ5OsfQYJ32
 Im5ZVYsPRU2K5TuoWql6meIihgclCj1iIU32qYHa2F1WYt2rZ72rJp+HoY8b+TaI
 WvF0pvNtqQyg3iEKUBKPA4xQ6mj7RpQBw86qqiCHmlfNt0zxluEGEPxH8xrWcfhC
 huDQ+NUOdU7fmJ3rqGitCvkUbCuZNkw3aNPR8dhU8RAWrwRzP2hBOmdxIeo81WWY
 XMEpJSijbGpXL9CvM0Jz9nOuMJwZwCCBGxg1vSQq0xTfLySNMxzvWZC2GFaBjucb
 j0UOQ7yE0drIZDVhd3sdNslubXXU6FcSEzacGQb9aigMUon3Tem9SHi7Kw==
 =S2Hq
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "s390:

   - More phys_to_virt conversions

   - Improvement of AP management for VSIE (nested virtualization)

  ARM64:

   - Numerous fixes for the pathological lock inversion issue that
     plagued KVM/arm64 since... forever.

   - New framework allowing SMCCC-compliant hypercalls to be forwarded
     to userspace, hopefully paving the way for some more features being
     moved to VMMs rather than be implemented in the kernel.

   - Large rework of the timer code to allow a VM-wide offset to be
     applied to both virtual and physical counters as well as a
     per-timer, per-vcpu offset that complements the global one. This
     last part allows the NV timer code to be implemented on top.

   - A small set of fixes to make sure that we don't change anything
     affecting the EL1&0 translation regime just after having having
     taken an exception to EL2 until we have executed a DSB. This
     ensures that speculative walks started in EL1&0 have completed.

   - The usual selftest fixes and improvements.

  x86:

   - Optimize CR0.WP toggling by avoiding an MMU reload when TDP is
     enabled, and by giving the guest control of CR0.WP when EPT is
     enabled on VMX (VMX-only because SVM doesn't support per-bit
     controls)

   - Add CR0/CR4 helpers to query single bits, and clean up related code
     where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long"
     return as a bool

   - Move AMD_PSFD to cpufeatures.h and purge KVM's definition

   - Avoid unnecessary writes+flushes when the guest is only adding new
     PTEs

   - Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s
     optimizations when emulating invalidations

   - Clean up the range-based flushing APIs

   - Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a
     single A/D bit using a LOCK AND instead of XCHG, and skip all of
     the "handle changed SPTE" overhead associated with writing the
     entire entry

   - Track the number of "tail" entries in a pte_list_desc to avoid
     having to walk (potentially) all descriptors during insertion and
     deletion, which gets quite expensive if the guest is spamming
     fork()

   - Disallow virtualizing legacy LBRs if architectural LBRs are
     available, the two are mutually exclusive in hardware

   - Disallow writes to immutable feature MSRs (notably
     PERF_CAPABILITIES) after KVM_RUN, similar to CPUID features

   - Overhaul the vmx_pmu_caps selftest to better validate
     PERF_CAPABILITIES

   - Apply PMU filters to emulated events and add test coverage to the
     pmu_event_filter selftest

   - AMD SVM:
       - Add support for virtual NMIs
       - Fixes for edge cases related to virtual interrupts

   - Intel AMX:
       - Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if
         XTILE_DATA is not being reported due to userspace not opting in
         via prctl()
       - Fix a bug in emulation of ENCLS in compatibility mode
       - Allow emulation of NOP and PAUSE for L2
       - AMX selftests improvements
       - Misc cleanups

  MIPS:

   - Constify MIPS's internal callbacks (a leftover from the hardware
     enabling rework that landed in 6.3)

  Generic:

   - Drop unnecessary casts from "void *" throughout kvm_main.c

   - Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the
     struct size by 8 bytes on 64-bit kernels by utilizing a padding
     hole

  Documentation:

   - Fix goof introduced by the conversion to rST"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (211 commits)
  KVM: s390: pci: fix virtual-physical confusion on module unload/load
  KVM: s390: vsie: clarifications on setting the APCB
  KVM: s390: interrupt: fix virtual-physical confusion for next alert GISA
  KVM: arm64: Have kvm_psci_vcpu_on() use WRITE_ONCE() to update mp_state
  KVM: arm64: Acquire mp_state_lock in kvm_arch_vcpu_ioctl_vcpu_init()
  KVM: selftests: Test the PMU event "Instructions retired"
  KVM: selftests: Copy full counter values from guest in PMU event filter test
  KVM: selftests: Use error codes to signal errors in PMU event filter test
  KVM: selftests: Print detailed info in PMU event filter asserts
  KVM: selftests: Add helpers for PMC asserts in PMU event filter test
  KVM: selftests: Add a common helper for the PMU event filter guest code
  KVM: selftests: Fix spelling mistake "perrmited" -> "permitted"
  KVM: arm64: vhe: Drop extra isb() on guest exit
  KVM: arm64: vhe: Synchronise with page table walker on MMU update
  KVM: arm64: pkvm: Document the side effects of kvm_flush_dcache_to_poc()
  KVM: arm64: nvhe: Synchronise with page table walker on TLBI
  KVM: arm64: Handle 32bit CNTPCTSS traps
  KVM: arm64: nvhe: Synchronise with page table walker on vcpu run
  KVM: arm64: vgic: Don't acquire its_lock before config_lock
  KVM: selftests: Add test to verify KVM's supported XCR0
  ...
2023-05-01 12:06:20 -07:00
Linus Torvalds
58390c8ce1 IOMMU Updates for Linux 6.4
Including:
 
 	- Convert to platform remove callback returning void
 
 	- Extend changing default domain to normal group
 
 	- Intel VT-d updates:
 	    - Remove VT-d virtual command interface and IOASID
 	    - Allow the VT-d driver to support non-PRI IOPF
 	    - Remove PASID supervisor request support
 	    - Various small and misc cleanups
 
 	- ARM SMMU updates:
 	    - Device-tree binding updates:
 	        * Allow Qualcomm GPU SMMUs to accept relevant clock properties
 	        * Document Qualcomm 8550 SoC as implementing an MMU-500
 	        * Favour new "qcom,smmu-500" binding for Adreno SMMUs
 
 	    - Fix S2CR quirk detection on non-architectural Qualcomm SMMU
 	      implementations
 
 	    - Acknowledge SMMUv3 PRI queue overflow when consuming events
 
 	    - Document (in a comment) why ATS is disabled for bypass streams
 
 	- AMD IOMMU updates:
 	    - 5-level page-table support
 	    - NUMA awareness for memory allocations
 
 	- Unisoc driver: Support for reattaching an existing domain
 
 	- Rockchip driver: Add missing set_platform_dma_ops callback
 
 	- Mediatek driver: Adjust the dma-ranges
 
 	- Various other small fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmRONeAACgkQK/BELZcB
 GuPmpw/8C9ruxQ0JU5rcDBXQGvos4gMmxlbELMrBpbbiTtdb35xchpKfdhnECGIF
 k2SrrcF40R/S82SyzNU/eZtGKirtcXvGFraUFgu/QdCcnnqpRHs+IJMXX2NJP+it
 +0wO1uiInt3CN1ERcR4F31cDKiWjDG8bvQVE5LIyiy4KrIU5ld2G91Fkaa0R13Au
 6H+/wKkcUC6OyaGE6wPx474xBkapT20vj5AIQuAWisXJJR0wbBon1sUTo/IRKsU+
 IkNxH0W+1PNImJ+crAdf/nkOlyqoChY4ww6cm07LrOsBLIsX5bCqXfL4HvKthElD
 MEgk2SN5kfjfR5Vf29W4hZVM1CT8VbhO41I7OzaZ6X6RU2PXoldPKlgKtZGeSKn1
 9bcMpSgB0BtbttvBevSkxTo5KHFozXS2DG3DFoMB3yFMme8Th0LrhBZ9oB7NIPNw
 ntMo4K75vviC6Vvzjy4Anj/+y+Zm3W6wDDP7F12O6WZLkK5s4hrSsHUm/MQnnKQP
 muJlG870RnSl73xUQZe3cuBxktXuJ3EHqqYIPE0npzvauu8hhWcis3opf2Y+U2s8
 aBCCIgp5kTKqjHLh2e4lNCKZf1/b/dhxRcRBQhpAIb8YsjMlIJyM+G8Jz6K6gBga
 5Ld+68UQ3oHJwoLV1HCFN8jbpQ9KZn1s9+h3yrYjRAcLNiFb3nU=
 =OvTo
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - Convert to platform remove callback returning void

 - Extend changing default domain to normal group

 - Intel VT-d updates:
     - Remove VT-d virtual command interface and IOASID
     - Allow the VT-d driver to support non-PRI IOPF
     - Remove PASID supervisor request support
     - Various small and misc cleanups

 - ARM SMMU updates:
     - Device-tree binding updates:
         * Allow Qualcomm GPU SMMUs to accept relevant clock properties
         * Document Qualcomm 8550 SoC as implementing an MMU-500
         * Favour new "qcom,smmu-500" binding for Adreno SMMUs

     - Fix S2CR quirk detection on non-architectural Qualcomm SMMU
       implementations

     - Acknowledge SMMUv3 PRI queue overflow when consuming events

     - Document (in a comment) why ATS is disabled for bypass streams

 - AMD IOMMU updates:
     - 5-level page-table support
     - NUMA awareness for memory allocations

 - Unisoc driver: Support for reattaching an existing domain

 - Rockchip driver: Add missing set_platform_dma_ops callback

 - Mediatek driver: Adjust the dma-ranges

 - Various other small fixes and cleanups

* tag 'iommu-updates-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (82 commits)
  iommu: Remove iommu_group_get_by_id()
  iommu: Make iommu_release_device() static
  iommu/vt-d: Remove BUG_ON in dmar_insert_dev_scope()
  iommu/vt-d: Remove a useless BUG_ON(dev->is_virtfn)
  iommu/vt-d: Remove BUG_ON in map/unmap()
  iommu/vt-d: Remove BUG_ON when domain->pgd is NULL
  iommu/vt-d: Remove BUG_ON in handling iotlb cache invalidation
  iommu/vt-d: Remove BUG_ON on checking valid pfn range
  iommu/vt-d: Make size of operands same in bitwise operations
  iommu/vt-d: Remove PASID supervisor request support
  iommu/vt-d: Use non-privileged mode for all PASIDs
  iommu/vt-d: Remove extern from function prototypes
  iommu/vt-d: Do not use GFP_ATOMIC when not needed
  iommu/vt-d: Remove unnecessary checks in iopf disabling path
  iommu/vt-d: Move PRI handling to IOPF feature path
  iommu/vt-d: Move pfsid and ats_qdep calculation to device probe path
  iommu/vt-d: Move iopf code from SVA to IOPF enabling path
  iommu/vt-d: Allow SVA with device-specific IOPF
  dmaengine: idxd: Add enable/disable device IOPF feature
  arm64: dts: mt8186: Add dma-ranges for the parent "soc" node
  ...
2023-04-30 13:00:38 -07:00
Linus Torvalds
825a0714d2 EFI updates for v6.4:
- relocate the LoongArch kernel if the preferred address is already
   occupied;
 
 - implement BTI annotations for arm64 EFI stub and zboot images;
 
 - clean up arm64 zboot Kbuild rules for injecting the kernel code size.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCZEwUOwAKCRAwbglWLn0t
 XMNzAQChdPim0N+l2G4XLa1g8WCGany/+6/B9GHPJVcmQ25zLQD/UaNvAofkHwjR
 Y3P3ZEY1SPEA+UJBL/BTI0wO9/XgpAA=
 =hGWP
 -----END PGP SIGNATURE-----

Merge tag 'efi-next-for-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI updates from Ard Biesheuvel:

 - relocate the LoongArch kernel if the preferred address is already
   occupied

 - implement BTI annotations for arm64 EFI stub and zboot images

 - clean up arm64 zboot Kbuild rules for injecting the kernel code size

* tag 'efi-next-for-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi/zboot: arm64: Grab code size from ELF symbol in payload
  efi/zboot: arm64: Inject kernel code size symbol into the zboot payload
  efi/zboot: Set forward edge CFI compat header flag if supported
  efi/zboot: Add BSS padding before compression
  arm64: efi: Enable BTI codegen and add PE/COFF annotation
  efi/pe: Import new BTI/IBT header flags from the spec
  efi/loongarch: Reintroduce efi_relocate_kernel() to relocate kernel
2023-04-29 17:42:33 -07:00
Andrzej Hajda
068550631f locking/arch: Rename all internal __xchg() names to __arch_xchg()
Decrease the probability of this internal facility to be used by
driver code.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Palmer Dabbelt <palmer@rivosinc.com> [riscv]
Link: https://lore.kernel.org/r/20230118154450.73842-1-andrzej.hajda@intel.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-29 09:08:44 +02:00
Linus Torvalds
f20730efbd SMP cross-CPU function-call updates for v6.4:
- Remove diagnostics and adjust config for CSD lock diagnostics
 
  - Add a generic IPI-sending tracepoint, as currently there's no easy
    way to instrument IPI origins: it's arch dependent and for some
    major architectures it's not even consistently available.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmRK438RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1jJ5Q/5AZ0HGpyqwdFK8GmGznyu5qjP5HwV9pPq
 gZQScqSy4tZEeza4TFMi83CoXSg9uJ7GlYJqqQMKm78LGEPomnZtXXC7oWvTA9M5
 M/jAvzytmvZloSCXV6kK7jzSejMHhag97J/BjTYhZYQpJ9T+hNC87XO6J6COsKr9
 lPIYqkFrIkQNr6B0U11AQfFejRYP1ics2fnbnZL86G/zZAc6x8EveM3KgSer2iHl
 KbrO+xcYyGY8Ef9P2F72HhEGFfM3WslpT1yzqR3sm4Y+fuMG0oW3qOQuMJx0ZhxT
 AloterY0uo6gJwI0P9k/K4klWgz81Tf/zLb0eBAtY2uJV9Fo3YhPHuZC7jGPGAy3
 JusW2yNYqc8erHVEMAKDUsl/1KN4TE2uKlkZy98wno+KOoMufK5MA2e2kPPqXvUi
 Jk9RvFolnWUsexaPmCftti0OCv3YFiviVAJ/t0pchfmvvJA2da0VC9hzmEXpLJVF
 25nBTV/1uAOrWvOpCyo3ElrC2CkQVkFmK5rXMDdvf6ib0Nid4vFcCkCSLVfu+ePB
 11mi7QYro+CcnOug1K+yKogUDmsZgV/u1kUwgQzTIpZ05Kkb49gUiXw9L2RGcBJh
 yoDoiI66KPR7PWQ2qBdQoXug4zfEEtWG0O9HNLB0FFRC3hu7I+HHyiUkBWs9jasK
 PA5+V7HcQRk=
 =Wp7f
 -----END PGP SIGNATURE-----

Merge tag 'smp-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull SMP cross-CPU function-call updates from Ingo Molnar:

 - Remove diagnostics and adjust config for CSD lock diagnostics

 - Add a generic IPI-sending tracepoint, as currently there's no easy
   way to instrument IPI origins: it's arch dependent and for some major
   architectures it's not even consistently available.

* tag 'smp-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  trace,smp: Trace all smp_function_call*() invocations
  trace: Add trace_ipi_send_cpu()
  sched, smp: Trace smp callback causing an IPI
  smp: reword smp call IPI comment
  treewide: Trace IPIs sent via smp_send_reschedule()
  irq_work: Trace self-IPIs sent via arch_irq_work_raise()
  smp: Trace IPIs sent via arch_send_call_function_ipi_mask()
  sched, smp: Trace IPIs sent via send_call_function_single_ipi()
  trace: Add trace_ipi_send_cpumask()
  kernel/smp: Make csdlock_debug= resettable
  locking/csd_lock: Remove per-CPU data indirection from CSD lock debugging
  locking/csd_lock: Remove added data from CSD lock debugging
  locking/csd_lock: Add Kconfig option for csd_debug default
2023-04-28 15:03:43 -07:00
Linus Torvalds
2aff7c706c Objtool changes for v6.4:
- Mark arch_cpu_idle_dead() __noreturn, make all architectures & drivers that did
    this inconsistently follow this new, common convention, and fix all the fallout
    that objtool can now detect statically.
 
  - Fix/improve the ORC unwinder becoming unreliable due to UNWIND_HINT_EMPTY ambiguity,
    split it into UNWIND_HINT_END_OF_STACK and UNWIND_HINT_UNDEFINED to resolve it.
 
  - Fix noinstr violations in the KCSAN code and the lkdtm/stackleak code.
 
  - Generate ORC data for __pfx code
 
  - Add more __noreturn annotations to various kernel startup/shutdown/panic functions.
 
  - Misc improvements & fixes.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmRK1x0RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ghxQ/+IkCynMYtdF5OG9YwbcGJqsPSfOPMEcEM
 pUSFYg+gGPBDT/fJfcVSqvUtdnWbLC2kXt9yiswXz3X3J2nmNkBk5YKQftsNDcul
 TmKeqIIAK51XTncpegKH0EGnOX63oZ9Vxa8CTPdDlb+YF23Km2FoudGRI9F5qbUd
 LoraXqGYeiaeySkGyWmZVl6Uc8dIxnMkTN3H/oI9aB6TOrsi059hAtFcSaFfyemP
 c4LqXXCH7k2baiQt+qaLZ8cuZVG/+K5r2N2cmjO5kmJc6ynIaFnfMe4XxZLjp5LT
 /PulYI15bXkvSARKx5CRh/CDHMOx5Blw+ASO0RhWbdy0WH4ZhhcaVF5AeIpPW86a
 1LBcz97rMp72WmvKgrJeVO1r9+ll4SI6/YKGJRsxsCMdP3hgFpqntXyVjTFNdTM1
 0gH6H5v55x06vJHvhtTk8SR3PfMTEM2fRU5jXEOrGowoGifx+wNUwORiwj6LE3KQ
 SKUdT19RNzoW3VkFxhgk65ThK1S7YsJUKRoac3YdhttpqqqtFV//erenrZoR4k/p
 vzvKy68EQ7RCNyD5wNWNFe0YjeJl5G8gQ8bUm4Xmab7djjgz+pn4WpQB8yYKJLAo
 x9dqQ+6eUbw3Hcgk6qQ9E+r/svbulnAL0AeALAWK/91DwnZ2mCzKroFkLN7napKi
 fRho4CqzrtM=
 =NwEV
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molnar:

 - Mark arch_cpu_idle_dead() __noreturn, make all architectures &
   drivers that did this inconsistently follow this new, common
   convention, and fix all the fallout that objtool can now detect
   statically

 - Fix/improve the ORC unwinder becoming unreliable due to
   UNWIND_HINT_EMPTY ambiguity, split it into UNWIND_HINT_END_OF_STACK
   and UNWIND_HINT_UNDEFINED to resolve it

 - Fix noinstr violations in the KCSAN code and the lkdtm/stackleak code

 - Generate ORC data for __pfx code

 - Add more __noreturn annotations to various kernel startup/shutdown
   and panic functions

 - Misc improvements & fixes

* tag 'objtool-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
  x86/hyperv: Mark hv_ghcb_terminate() as noreturn
  scsi: message: fusion: Mark mpt_halt_firmware() __noreturn
  x86/cpu: Mark {hlt,resume}_play_dead() __noreturn
  btrfs: Mark btrfs_assertfail() __noreturn
  objtool: Include weak functions in global_noreturns check
  cpu: Mark nmi_panic_self_stop() __noreturn
  cpu: Mark panic_smp_self_stop() __noreturn
  arm64/cpu: Mark cpu_park_loop() and friends __noreturn
  x86/head: Mark *_start_kernel() __noreturn
  init: Mark start_kernel() __noreturn
  init: Mark [arch_call_]rest_init() __noreturn
  objtool: Generate ORC data for __pfx code
  x86/linkage: Fix padding for typed functions
  objtool: Separate prefix code from stack validation code
  objtool: Remove superfluous dead_end_function() check
  objtool: Add symbol iteration helpers
  objtool: Add WARN_INSN()
  scripts/objdump-func: Support multiple functions
  context_tracking: Fix KCSAN noinstr violation
  objtool: Add stackleak instrumentation to uaccess safe list
  ...
2023-04-28 14:02:54 -07:00
Linus Torvalds
22b8cc3e78 Add support for new Linear Address Masking CPU feature. This is similar
to ARM's Top Byte Ignore and allows userspace to store metadata in some
 bits of pointers without masking it out before use.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmRK/WIACgkQaDWVMHDJ
 krAL+RAAw33EhsWyYVkeAtYmYBKkGvlgeSDULtfJKe5bynJBTHkGKfM6RE9MSJIt
 5fHWaConGh8HNpy0Us1sDvd/aWcWRm5h7ZcCVD+R4qrgh/vc7ULzM+elXe5jzr4W
 cyuTckF2eW6SVrYg6fH5q+6Uy/moDtrdkLRvwRBf+AYeepB8gvSSH5XixKDNiVBE
 pjNy1xXVZQokqD4tjsFelmLttyacR5OabiE/aeVNoFYf9yTwfnN8N3T6kwuOoS4l
 Lp6NA+/0ux+oBlR+Is+JJG8Mxrjvz96yJGZYdR2YP5k3bMQtHAAjuq2w+GgqZm5i
 j3/E6KQepEGaCfC+bHl68xy/kKx8ik+jMCEcBalCC25J3uxbLz41g6K3aI890wJn
 +5ZtfcmoDUk9pnUyLxR8t+UjOSBFAcRSUE+FTjUH1qEGsMPK++9a4iLXz5vYVK1+
 +YCt1u5LNJbkDxE8xVX3F5jkXh0G01SJsuUVAOqHSNfqSNmohFK8/omqhVRrRqoK
 A7cYLtnOGiUXLnvjrwSxPNOzRrG+GAwqaw8gwOTaYogETWbTY8qsSCEVl204uYwd
 m8io9rk2ZXUdDuha56xpBbPE0JHL9hJ2eKCuPkfvRgJT9YFyTh+e0UdX20k+nDjc
 ang1S350o/Y0sus6rij1qS8AuxJIjHucG0GdgpZk3KUbcxoRLhI=
 =qitk
 -----END PGP SIGNATURE-----

Merge tag 'x86_mm_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 LAM (Linear Address Masking) support from Dave Hansen:
 "Add support for the new Linear Address Masking CPU feature.

  This is similar to ARM's Top Byte Ignore and allows userspace to store
  metadata in some bits of pointers without masking it out before use"

* tag 'x86_mm_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm/iommu/sva: Do not allow to set FORCE_TAGGED_SVA bit from outside
  x86/mm/iommu/sva: Fix error code for LAM enabling failure due to SVA
  selftests/x86/lam: Add test cases for LAM vs thread creation
  selftests/x86/lam: Add ARCH_FORCE_TAGGED_SVA test cases for linear-address masking
  selftests/x86/lam: Add inherit test cases for linear-address masking
  selftests/x86/lam: Add io_uring test cases for linear-address masking
  selftests/x86/lam: Add mmap and SYSCALL test cases for linear-address masking
  selftests/x86/lam: Add malloc and tag-bits test cases for linear-address masking
  x86/mm/iommu/sva: Make LAM and SVA mutually exclusive
  iommu/sva: Replace pasid_valid() helper with mm_valid_pasid()
  mm: Expose untagging mask in /proc/$PID/status
  x86/mm: Provide arch_prctl() interface for LAM
  x86/mm: Reduce untagged_addr() overhead for systems without LAM
  x86/uaccess: Provide untagged_addr() and remove tags before address check
  mm: Introduce untagged_addr_remote()
  x86/mm: Handle LAM on context switch
  x86: CPUID and CR3/CR4 flags for Linear Address Masking
  x86: Allow atomic MM_CONTEXT flags setting
  x86/mm: Rework address range check in get_user() and put_user()
2023-04-28 09:43:49 -07:00
Linus Torvalds
7fa8a8ee94 - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of
switching from a user process to a kernel thread.
 
 - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav.
 
 - zsmalloc performance improvements from Sergey Senozhatsky.
 
 - Yue Zhao has found and fixed some data race issues around the
   alteration of memcg userspace tunables.
 
 - VFS rationalizations from Christoph Hellwig:
 
   - removal of most of the callers of write_one_page().
 
   - make __filemap_get_folio()'s return value more useful
 
 - Luis Chamberlain has changed tmpfs so it no longer requires swap
   backing.  Use `mount -o noswap'.
 
 - Qi Zheng has made the slab shrinkers operate locklessly, providing
   some scalability benefits.
 
 - Keith Busch has improved dmapool's performance, making part of its
   operations O(1) rather than O(n).
 
 - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
   permitting userspace to wr-protect anon memory unpopulated ptes.
 
 - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather
   than exclusive, and has fixed a bunch of errors which were caused by its
   unintuitive meaning.
 
 - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
   which causes minor faults to install a write-protected pte.
 
 - Vlastimil Babka has done some maintenance work on vma_merge():
   cleanups to the kernel code and improvements to our userspace test
   harness.
 
 - Cleanups to do_fault_around() by Lorenzo Stoakes.
 
 - Mike Rapoport has moved a lot of initialization code out of various
   mm/ files and into mm/mm_init.c.
 
 - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
   DRM, but DRM doesn't use it any more.
 
 - Lorenzo has also coverted read_kcore() and vread() to use iterators
   and has thereby removed the use of bounce buffers in some cases.
 
 - Lorenzo has also contributed further cleanups of vma_merge().
 
 - Chaitanya Prakash provides some fixes to the mmap selftesting code.
 
 - Matthew Wilcox changes xfs and afs so they no longer take sleeping
   locks in ->map_page(), a step towards RCUification of pagefaults.
 
 - Suren Baghdasaryan has improved mmap_lock scalability by switching to
   per-VMA locking.
 
 - Frederic Weisbecker has reworked the percpu cache draining so that it
   no longer causes latency glitches on cpu isolated workloads.
 
 - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
   logic.
 
 - Liu Shixin has changed zswap's initialization so we no longer waste a
   chunk of memory if zswap is not being used.
 
 - Yosry Ahmed has improved the performance of memcg statistics flushing.
 
 - David Stevens has fixed several issues involving khugepaged,
   userfaultfd and shmem.
 
 - Christoph Hellwig has provided some cleanup work to zram's IO-related
   code paths.
 
 - David Hildenbrand has fixed up some issues in the selftest code's
   testing of our pte state changing.
 
 - Pankaj Raghav has made page_endio() unneeded and has removed it.
 
 - Peter Xu contributed some rationalizations of the userfaultfd
   selftests.
 
 - Yosry Ahmed has fixed an issue around memcg's page recalim accounting.
 
 - Chaitanya Prakash has fixed some arm-related issues in the
   selftests/mm code.
 
 - Longlong Xia has improved the way in which KSM handles hwpoisoned
   pages.
 
 - Peter Xu fixes a few issues with uffd-wp at fork() time.
 
 - Stefan Roesch has changed KSM so that it may now be used on a
   per-process and per-cgroup basis.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZEr3zQAKCRDdBJ7gKXxA
 jlLoAP0fpQBipwFxED0Us4SKQfupV6z4caXNJGPeay7Aj11/kQD/aMRC2uPfgr96
 eMG3kwn2pqkB9ST2QpkaRbxA//eMbQY=
 =J+Dj
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of
   switching from a user process to a kernel thread.

 - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj
   Raghav.

 - zsmalloc performance improvements from Sergey Senozhatsky.

 - Yue Zhao has found and fixed some data race issues around the
   alteration of memcg userspace tunables.

 - VFS rationalizations from Christoph Hellwig:
     - removal of most of the callers of write_one_page()
     - make __filemap_get_folio()'s return value more useful

 - Luis Chamberlain has changed tmpfs so it no longer requires swap
   backing. Use `mount -o noswap'.

 - Qi Zheng has made the slab shrinkers operate locklessly, providing
   some scalability benefits.

 - Keith Busch has improved dmapool's performance, making part of its
   operations O(1) rather than O(n).

 - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
   permitting userspace to wr-protect anon memory unpopulated ptes.

 - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive
   rather than exclusive, and has fixed a bunch of errors which were
   caused by its unintuitive meaning.

 - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
   which causes minor faults to install a write-protected pte.

 - Vlastimil Babka has done some maintenance work on vma_merge():
   cleanups to the kernel code and improvements to our userspace test
   harness.

 - Cleanups to do_fault_around() by Lorenzo Stoakes.

 - Mike Rapoport has moved a lot of initialization code out of various
   mm/ files and into mm/mm_init.c.

 - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
   DRM, but DRM doesn't use it any more.

 - Lorenzo has also coverted read_kcore() and vread() to use iterators
   and has thereby removed the use of bounce buffers in some cases.

 - Lorenzo has also contributed further cleanups of vma_merge().

 - Chaitanya Prakash provides some fixes to the mmap selftesting code.

 - Matthew Wilcox changes xfs and afs so they no longer take sleeping
   locks in ->map_page(), a step towards RCUification of pagefaults.

 - Suren Baghdasaryan has improved mmap_lock scalability by switching to
   per-VMA locking.

 - Frederic Weisbecker has reworked the percpu cache draining so that it
   no longer causes latency glitches on cpu isolated workloads.

 - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
   logic.

 - Liu Shixin has changed zswap's initialization so we no longer waste a
   chunk of memory if zswap is not being used.

 - Yosry Ahmed has improved the performance of memcg statistics
   flushing.

 - David Stevens has fixed several issues involving khugepaged,
   userfaultfd and shmem.

 - Christoph Hellwig has provided some cleanup work to zram's IO-related
   code paths.

 - David Hildenbrand has fixed up some issues in the selftest code's
   testing of our pte state changing.

 - Pankaj Raghav has made page_endio() unneeded and has removed it.

 - Peter Xu contributed some rationalizations of the userfaultfd
   selftests.

 - Yosry Ahmed has fixed an issue around memcg's page recalim
   accounting.

 - Chaitanya Prakash has fixed some arm-related issues in the
   selftests/mm code.

 - Longlong Xia has improved the way in which KSM handles hwpoisoned
   pages.

 - Peter Xu fixes a few issues with uffd-wp at fork() time.

 - Stefan Roesch has changed KSM so that it may now be used on a
   per-process and per-cgroup basis.

* tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
  mm,unmap: avoid flushing TLB in batch if PTE is inaccessible
  shmem: restrict noswap option to initial user namespace
  mm/khugepaged: fix conflicting mods to collapse_file()
  sparse: remove unnecessary 0 values from rc
  mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area()
  hugetlb: pte_alloc_huge() to replace huge pte_alloc_map()
  maple_tree: fix allocation in mas_sparse_area()
  mm: do not increment pgfault stats when page fault handler retries
  zsmalloc: allow only one active pool compaction context
  selftests/mm: add new selftests for KSM
  mm: add new KSM process and sysfs knobs
  mm: add new api to enable ksm per process
  mm: shrinkers: fix debugfs file permissions
  mm: don't check VMA write permissions if the PTE/PMD indicates write permissions
  migrate_pages_batch: fix statistics for longterm pin retry
  userfaultfd: use helper function range_in_vma()
  lib/show_mem.c: use for_each_populated_zone() simplify code
  mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list()
  fs/buffer: convert create_page_buffers to folio_create_buffers
  fs/buffer: add folio_create_empty_buffers helper
  ...
2023-04-27 19:42:02 -07:00
Linus Torvalds
b6a7828502 modules-6.4-rc1
The summary of the changes for this pull requests is:
 
  * Song Liu's new struct module_memory replacement
  * Nick Alcock's MODULE_LICENSE() removal for non-modules
  * My cleanups and enhancements to reduce the areas where we vmalloc
    module memory for duplicates, and the respective debug code which
    proves the remaining vmalloc pressure comes from userspace.
 
 Most of the changes have been in linux-next for quite some time except
 the minor fixes I made to check if a module was already loaded
 prior to allocating the final module memory with vmalloc and the
 respective debug code it introduces to help clarify the issue. Although
 the functional change is small it is rather safe as it can only *help*
 reduce vmalloc space for duplicates and is confirmed to fix a bootup
 issue with over 400 CPUs with KASAN enabled. I don't expect stable
 kernels to pick up that fix as the cleanups would have also had to have
 been picked up. Folks on larger CPU systems with modules will want to
 just upgrade if vmalloc space has been an issue on bootup.
 
 Given the size of this request, here's some more elaborate details
 on this pull request.
 
 The functional change change in this pull request is the very first
 patch from Song Liu which replaces the struct module_layout with a new
 struct module memory. The old data structure tried to put together all
 types of supported module memory types in one data structure, the new
 one abstracts the differences in memory types in a module to allow each
 one to provide their own set of details. This paves the way in the
 future so we can deal with them in a cleaner way. If you look at changes
 they also provide a nice cleanup of how we handle these different memory
 areas in a module. This change has been in linux-next since before the
 merge window opened for v6.3 so to provide more than a full kernel cycle
 of testing. It's a good thing as quite a bit of fixes have been found
 for it.
 
 Jason Baron then made dynamic debug a first class citizen module user by
 using module notifier callbacks to allocate / remove module specific
 dynamic debug information.
 
 Nick Alcock has done quite a bit of work cross-tree to remove module
 license tags from things which cannot possibly be module at my request
 so to:
 
   a) help him with his longer term tooling goals which require a
      deterministic evaluation if a piece a symbol code could ever be
      part of a module or not. But quite recently it is has been made
      clear that tooling is not the only one that would benefit.
      Disambiguating symbols also helps efforts such as live patching,
      kprobes and BPF, but for other reasons and R&D on this area
      is active with no clear solution in sight.
 
   b) help us inch closer to the now generally accepted long term goal
      of automating all the MODULE_LICENSE() tags from SPDX license tags
 
 In so far as a) is concerned, although module license tags are a no-op
 for non-modules, tools which would want create a mapping of possible
 modules can only rely on the module license tag after the commit
 8b41fc4454e ("kbuild: create modules.builtin without Makefile.modbuiltin
 or tristate.conf").  Nick has been working on this *for years* and
 AFAICT I was the only one to suggest two alternatives to this approach
 for tooling. The complexity in one of my suggested approaches lies in
 that we'd need a possible-obj-m and a could-be-module which would check
 if the object being built is part of any kconfig build which could ever
 lead to it being part of a module, and if so define a new define
 -DPOSSIBLE_MODULE [0]. A more obvious yet theoretical approach I've
 suggested would be to have a tristate in kconfig imply the same new
 -DPOSSIBLE_MODULE as well but that means getting kconfig symbol names
 mapping to modules always, and I don't think that's the case today. I am
 not aware of Nick or anyone exploring either of these options. Quite
 recently Josh Poimboeuf has pointed out that live patching, kprobes and
 BPF would benefit from resolving some part of the disambiguation as
 well but for other reasons. The function granularity KASLR (fgkaslr)
 patches were mentioned but Joe Lawrence has clarified this effort has
 been dropped with no clear solution in sight [1].
 
 In the meantime removing module license tags from code which could never
 be modules is welcomed for both objectives mentioned above. Some
 developers have also welcomed these changes as it has helped clarify
 when a module was never possible and they forgot to clean this up,
 and so you'll see quite a bit of Nick's patches in other pull
 requests for this merge window. I just picked up the stragglers after
 rc3. LWN has good coverage on the motivation behind this work [2] and
 the typical cross-tree issues he ran into along the way. The only
 concrete blocker issue he ran into was that we should not remove the
 MODULE_LICENSE() tags from files which have no SPDX tags yet, even if
 they can never be modules. Nick ended up giving up on his efforts due
 to having to do this vetting and backlash he ran into from folks who
 really did *not understand* the core of the issue nor were providing
 any alternative / guidance. I've gone through his changes and dropped
 the patches which dropped the module license tags where an SPDX
 license tag was missing, it only consisted of 11 drivers.  To see
 if a pull request deals with a file which lacks SPDX tags you
 can just use:
 
   ./scripts/spdxcheck.py -f \
 	$(git diff --name-only commid-id | xargs echo)
 
 You'll see a core module file in this pull request for the above,
 but that's not related to his changes. WE just need to add the SPDX
 license tag for the kernel/module/kmod.c file in the future but
 it demonstrates the effectiveness of the script.
 
 Most of Nick's changes were spread out through different trees,
 and I just picked up the slack after rc3 for the last kernel was out.
 Those changes have been in linux-next for over two weeks.
 
 The cleanups, debug code I added and final fix I added for modules
 were motivated by David Hildenbrand's report of boot failing on
 a systems with over 400 CPUs when KASAN was enabled due to running
 out of virtual memory space. Although the functional change only
 consists of 3 lines in the patch "module: avoid allocation if module is
 already present and ready", proving that this was the best we can
 do on the modules side took quite a bit of effort and new debug code.
 
 The initial cleanups I did on the modules side of things has been
 in linux-next since around rc3 of the last kernel, the actual final
 fix for and debug code however have only been in linux-next for about a
 week or so but I think it is worth getting that code in for this merge
 window as it does help fix / prove / evaluate the issues reported
 with larger number of CPUs. Userspace is not yet fixed as it is taking
 a bit of time for folks to understand the crux of the issue and find a
 proper resolution. Worst come to worst, I have a kludge-of-concept [3]
 of how to make kernel_read*() calls for modules unique / converge them,
 but I'm currently inclined to just see if userspace can fix this
 instead.
 
 [0] https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/
 [1] https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com
 [2] https://lwn.net/Articles/927569/
 [3] https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmRG4m0SHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinQ2oP/0xlvKwJg6Ey8fHZF0qv8VOskE80zoLF
 hMazU3xfqLA+1TQvouW1YBxt3jwS3t1Ehs+NrV+nY9Yzcm0MzRX/n3fASJVe7nRr
 oqWWQU+voYl5Pw1xsfdp6C8IXpBQorpYby3Vp0MAMoZyl2W2YrNo36NV488wM9KC
 jD4HF5Z6xpnPSZTRR7AgW9mo7FdAtxPeKJ76Bch7lH8U6omT7n36WqTw+5B1eAYU
 YTOvrjRs294oqmWE+LeebyiOOXhH/yEYx4JNQgCwPdxwnRiGJWKsk5va0hRApqF/
 WW8dIqdEnjsa84lCuxnmWgbcPK8cgmlO0rT0DyneACCldNlldCW1LJ0HOwLk9pea
 p3JFAsBL7TKue4Tos6I7/4rx1ufyBGGIigqw9/VX5g0Iif+3BhWnqKRfz+p9wiMa
 Fl7cU6u7yC68CHu1HBSisK16cYMCPeOnTSd89upHj8JU/t74O6k/ARvjrQ9qmNUt
 c5U+OY+WpNJ1nXQydhY/yIDhFdYg8SSpNuIO90r4L8/8jRQYXNG80FDd1UtvVDuy
 eq0r2yZ8C0XHSlOT9QHaua/tWV/aaKtyC/c0hDRrigfUrq8UOlGujMXbUnrmrWJI
 tLJLAc7ePWAAoZXGSHrt0U27l029GzLwRdKqJ6kkDANVnTeOdV+mmBg9zGh3/Mp6
 agiwdHUMVN7X
 =56WK
 -----END PGP SIGNATURE-----

Merge tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull module updates from Luis Chamberlain:
 "The summary of the changes for this pull requests is:

   - Song Liu's new struct module_memory replacement

   - Nick Alcock's MODULE_LICENSE() removal for non-modules

   - My cleanups and enhancements to reduce the areas where we vmalloc
     module memory for duplicates, and the respective debug code which
     proves the remaining vmalloc pressure comes from userspace.

  Most of the changes have been in linux-next for quite some time except
  the minor fixes I made to check if a module was already loaded prior
  to allocating the final module memory with vmalloc and the respective
  debug code it introduces to help clarify the issue. Although the
  functional change is small it is rather safe as it can only *help*
  reduce vmalloc space for duplicates and is confirmed to fix a bootup
  issue with over 400 CPUs with KASAN enabled. I don't expect stable
  kernels to pick up that fix as the cleanups would have also had to
  have been picked up. Folks on larger CPU systems with modules will
  want to just upgrade if vmalloc space has been an issue on bootup.

  Given the size of this request, here's some more elaborate details:

  The functional change change in this pull request is the very first
  patch from Song Liu which replaces the 'struct module_layout' with a
  new 'struct module_memory'. The old data structure tried to put
  together all types of supported module memory types in one data
  structure, the new one abstracts the differences in memory types in a
  module to allow each one to provide their own set of details. This
  paves the way in the future so we can deal with them in a cleaner way.
  If you look at changes they also provide a nice cleanup of how we
  handle these different memory areas in a module. This change has been
  in linux-next since before the merge window opened for v6.3 so to
  provide more than a full kernel cycle of testing. It's a good thing as
  quite a bit of fixes have been found for it.

  Jason Baron then made dynamic debug a first class citizen module user
  by using module notifier callbacks to allocate / remove module
  specific dynamic debug information.

  Nick Alcock has done quite a bit of work cross-tree to remove module
  license tags from things which cannot possibly be module at my request
  so to:

   a) help him with his longer term tooling goals which require a
      deterministic evaluation if a piece a symbol code could ever be
      part of a module or not. But quite recently it is has been made
      clear that tooling is not the only one that would benefit.
      Disambiguating symbols also helps efforts such as live patching,
      kprobes and BPF, but for other reasons and R&D on this area is
      active with no clear solution in sight.

   b) help us inch closer to the now generally accepted long term goal
      of automating all the MODULE_LICENSE() tags from SPDX license tags

  In so far as a) is concerned, although module license tags are a no-op
  for non-modules, tools which would want create a mapping of possible
  modules can only rely on the module license tag after the commit
  8b41fc4454e ("kbuild: create modules.builtin without
  Makefile.modbuiltin or tristate.conf").

  Nick has been working on this *for years* and AFAICT I was the only
  one to suggest two alternatives to this approach for tooling. The
  complexity in one of my suggested approaches lies in that we'd need a
  possible-obj-m and a could-be-module which would check if the object
  being built is part of any kconfig build which could ever lead to it
  being part of a module, and if so define a new define
  -DPOSSIBLE_MODULE [0].

  A more obvious yet theoretical approach I've suggested would be to
  have a tristate in kconfig imply the same new -DPOSSIBLE_MODULE as
  well but that means getting kconfig symbol names mapping to modules
  always, and I don't think that's the case today. I am not aware of
  Nick or anyone exploring either of these options. Quite recently Josh
  Poimboeuf has pointed out that live patching, kprobes and BPF would
  benefit from resolving some part of the disambiguation as well but for
  other reasons. The function granularity KASLR (fgkaslr) patches were
  mentioned but Joe Lawrence has clarified this effort has been dropped
  with no clear solution in sight [1].

  In the meantime removing module license tags from code which could
  never be modules is welcomed for both objectives mentioned above. Some
  developers have also welcomed these changes as it has helped clarify
  when a module was never possible and they forgot to clean this up, and
  so you'll see quite a bit of Nick's patches in other pull requests for
  this merge window. I just picked up the stragglers after rc3. LWN has
  good coverage on the motivation behind this work [2] and the typical
  cross-tree issues he ran into along the way. The only concrete blocker
  issue he ran into was that we should not remove the MODULE_LICENSE()
  tags from files which have no SPDX tags yet, even if they can never be
  modules. Nick ended up giving up on his efforts due to having to do
  this vetting and backlash he ran into from folks who really did *not
  understand* the core of the issue nor were providing any alternative /
  guidance. I've gone through his changes and dropped the patches which
  dropped the module license tags where an SPDX license tag was missing,
  it only consisted of 11 drivers. To see if a pull request deals with a
  file which lacks SPDX tags you can just use:

    ./scripts/spdxcheck.py -f \
	$(git diff --name-only commid-id | xargs echo)

  You'll see a core module file in this pull request for the above, but
  that's not related to his changes. WE just need to add the SPDX
  license tag for the kernel/module/kmod.c file in the future but it
  demonstrates the effectiveness of the script.

  Most of Nick's changes were spread out through different trees, and I
  just picked up the slack after rc3 for the last kernel was out. Those
  changes have been in linux-next for over two weeks.

  The cleanups, debug code I added and final fix I added for modules
  were motivated by David Hildenbrand's report of boot failing on a
  systems with over 400 CPUs when KASAN was enabled due to running out
  of virtual memory space. Although the functional change only consists
  of 3 lines in the patch "module: avoid allocation if module is already
  present and ready", proving that this was the best we can do on the
  modules side took quite a bit of effort and new debug code.

  The initial cleanups I did on the modules side of things has been in
  linux-next since around rc3 of the last kernel, the actual final fix
  for and debug code however have only been in linux-next for about a
  week or so but I think it is worth getting that code in for this merge
  window as it does help fix / prove / evaluate the issues reported with
  larger number of CPUs. Userspace is not yet fixed as it is taking a
  bit of time for folks to understand the crux of the issue and find a
  proper resolution. Worst come to worst, I have a kludge-of-concept [3]
  of how to make kernel_read*() calls for modules unique / converge
  them, but I'm currently inclined to just see if userspace can fix this
  instead"

Link: https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/ [0]
Link: https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com [1]
Link: https://lwn.net/Articles/927569/ [2]
Link: https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org [3]

* tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (121 commits)
  module: add debugging auto-load duplicate module support
  module: stats: fix invalid_mod_bytes typo
  module: remove use of uninitialized variable len
  module: fix building stats for 32-bit targets
  module: stats: include uapi/linux/module.h
  module: avoid allocation if module is already present and ready
  module: add debug stats to help identify memory pressure
  module: extract patient module check into helper
  modules/kmod: replace implementation with a semaphore
  Change DEFINE_SEMAPHORE() to take a number argument
  module: fix kmemleak annotations for non init ELF sections
  module: Ignore L0 and rename is_arm_mapping_symbol()
  module: Move is_arm_mapping_symbol() to module_symbol.h
  module: Sync code of is_arm_mapping_symbol()
  scripts/gdb: use mem instead of core_layout to get the module address
  interconnect: remove module-related code
  interconnect: remove MODULE_LICENSE in non-modules
  zswap: remove MODULE_LICENSE in non-modules
  zpool: remove MODULE_LICENSE in non-modules
  x86/mm/dump_pagetables: remove MODULE_LICENSE in non-modules
  ...
2023-04-27 16:36:55 -07:00