Commit Graph

175 Commits

Author SHA1 Message Date
Mostafa Saleh
ec9098d6bf iommu/arm-smmu-v3: Fix access for STE.SHCFG
STE attributes(NSCFG, PRIVCFG, INSTCFG) use value 0 for "Use Icomming",
for some reason SHCFG doesn't follow that, and it is defined as "0b01".

Currently the driver sets SHCFG to Use Incoming for stage-2 and bypass
domains.

However according to the User Manual (ARM IHI 0070 F.b):
	When SMMU_IDR1.ATTR_TYPES_OVR == 0, this field is RES0 and the
	incoming Shareability attribute is used.

This patch adds a condition for writing SHCFG to Use incoming to be
compliant with the architecture, and defines ATTR_TYPE_OVR as a new
feature discovered from IDR1.
This also required to propagate the SMMU through some functions args.

There is no need to add similar condition for the newly introduced function
arm_smmu_get_ste_used() as the values of the STE are the same before and
after any transition, so this will not trigger any change. (we already
do the same for the VMID).

Although this is a misconfiguration from the driver, this has been there
for a long time, so probably no HW running Linux is affected by it.

Reported-by: Will Deacon <will@kernel.org>
Closes: https://lore.kernel.org/all/20240215134952.GA690@willie-the-truck/

Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240323134658.464743-1-smostafa@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-03-26 10:47:39 +00:00
Jason Gunthorpe
0493e739cc iommu/arm-smmu-v3: Add cpu_to_le64() around STRTAB_STE_0_V
STRTAB_STE_0_V is a CPU value, it needs conversion for sparse to be clean.

The missing annotation was a mistake introduced by splitting the ops out
from the STE writer.

Fixes: 7da51af912 ("iommu/arm-smmu-v3: Make STE programming independent of the callers")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202403011441.5WqGrYjp-lkp@intel.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/0-v1-98b23ebb0c84+9f-smmu_cputole_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-03-26 10:44:18 +00:00
Linus Torvalds
8c9c2f851b IOMMU Updates for Linux v6.9
Including:
 
 	- Core changes:
 	  - Constification of bus_type pointer
 	  - Preparations for user-space page-fault delivery
 	  - Use a named kmem_cache for IOVA magazines
 
 	- Intel VT-d changes from Lu Baolu:
 	  - Add RBTree to track iommu probed devices
 	  - Add Intel IOMMU debugfs document
 	  - Cleanup and refactoring
 
 	- ARM-SMMU Updates from Will Deacon:
 	  - Device-tree binding updates for a bunch of Qualcomm SoCs
 	  - SMMUv2: Support for Qualcomm X1E80100 MDSS
 	  - SMMUv3: Significant rework of the driver's STE manipulation and
 	    domain handling code. This is the initial part of a larger scale
 	    rework aiming to improve the driver's implementation of the
 	    IOMMU-API in preparation for hooking up IOMMUFD support.
 
 	- AMD-Vi Updates:
 	  - Refactor GCR3 table support for SVA
 	  - Cleanups
 
 	- Some smaller cleanups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmXuyf8ACgkQK/BELZcB
 GuNXwxAApkjDm7VWM2D2K8Y+8YLbtaljMCCudNZKhgT++HEo4YlXcA5NmOddMIFc
 qhF9EwAWlQfj3krJLJQSZ6v/joKpXSwS6LDYuEGmJ/pIGfN5HqaTsOCItriP7Mle
 ZgRTI28u5ykZt4b6IKG8QeexilQi2DsIxT46HFiHL0GrvcBcdxDuKnE22PNCTwU2
 25WyJzgo//Ht2BrwlhrduZVQUh0KzXYuV5lErvoobmT0v/a4llS20ov+IE/ut54w
 FxIqGR8rMdJ9D2dM0bWRkdJY/vJxokah2QHm0gcna3Gr2iENL2xWFUtm+j1B6Smb
 VuxbwMkB0Iz530eShebmzQ07e2f1rRb4DySriu4m/jb8we20AYqKMYaxQxZkU68T
 1hExo+/QJQil9p1t+7Eur+S1u6gRHOdqfBnCzGOth/zzY1lbEzpdp8b9M8wnGa4K
 Y0EDeUpKtVIP1ZRCBi8CGyU1jgJF13Nx7MnOalgGWjDysB5RPamnrhz71EuD6rLw
 Jxp2EYo8NQPmPbEcl9NDS+oOn5Fz5TyPiMF2GUzhb9KisLxUjriLoTaNyBsdFkds
 2q+x6KY8qPGk37NhN0ktfpk9CtSGN47Pm8ZznEkFt9AR96GJDX+3NhUNAwEKslwt
 1tavDmmdOclOfIpWtaMlKQTHGhuSBZo1A40ATeM/MjHQ8rEtwXk=
 =HV07
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:
 "Core changes:
    - Constification of bus_type pointer
    - Preparations for user-space page-fault delivery
    - Use a named kmem_cache for IOVA magazines

  Intel VT-d changes from Lu Baolu:
    - Add RBTree to track iommu probed devices
    - Add Intel IOMMU debugfs document
    - Cleanup and refactoring

  ARM-SMMU Updates from Will Deacon:
    - Device-tree binding updates for a bunch of Qualcomm SoCs
    - SMMUv2: Support for Qualcomm X1E80100 MDSS
    - SMMUv3: Significant rework of the driver's STE manipulation and
      domain handling code. This is the initial part of a larger scale
      rework aiming to improve the driver's implementation of the
      IOMMU-API in preparation for hooking up IOMMUFD support.

  AMD-Vi Updates:
    - Refactor GCR3 table support for SVA
    - Cleanups

  Some smaller cleanups and fixes"

* tag 'iommu-updates-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (88 commits)
  iommu: Fix compilation without CONFIG_IOMMU_INTEL
  iommu/amd: Fix sleeping in atomic context
  iommu/dma: Document min_align_mask assumption
  iommu/vt-d: Remove scalabe mode in domain_context_clear_one()
  iommu/vt-d: Remove scalable mode context entry setup from attach_dev
  iommu/vt-d: Setup scalable mode context entry in probe path
  iommu/vt-d: Fix NULL domain on device release
  iommu: Add static iommu_ops->release_domain
  iommu/vt-d: Improve ITE fault handling if target device isn't present
  iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected
  PCI: Make pci_dev_is_disconnected() helper public for other drivers
  iommu/vt-d: Use device rbtree in iopf reporting path
  iommu/vt-d: Use rbtree to track iommu probed devices
  iommu/vt-d: Merge intel_svm_bind_mm() into its caller
  iommu/vt-d: Remove initialization for dynamically heap-allocated rcu_head
  iommu/vt-d: Remove treatment for revoking PASIDs with pending page faults
  iommu/vt-d: Add the document for Intel IOMMU debugfs
  iommu/vt-d: Use kcalloc() instead of kzalloc()
  iommu/vt-d: Remove INTEL_IOMMU_BROKEN_GFX_WA
  iommu: re-use local fwnode variable in iommu_ops_from_fwnode()
  ...
2024-03-13 09:15:30 -07:00
Linus Torvalds
4527e83780 Updates for the MSI interrupt subsystem and RISC-V initial MSI support:
- Core and platform-MSI
 
     The core changes have been adopted from previous work which converted
     ARM[64] to the new per device MSI domain model, which was merged to
     support multiple MSI domain per device. The ARM[64] changes are being
     worked on too, but have not been ready yet. The core and platform-MSI
     changes have been split out to not hold up RISC-V and to avoid that
     RISC-V builds on the scheduled for removal interfaces.
 
     The core support provides new interfaces to handle wire to MSI bridges
     in a straight forward way and introduces new platform-MSI interfaces
     which are built on top of the per device MSI domain model.
 
     Once ARM[64] is converted over the old platform-MSI interfaces and the
     related ugliness in the MSI core code will be removed.
 
   - Drivers:
 
     - Add a new driver for the Andes hart-level interrupt controller
 
     - Rework the SiFive PLIC driver to prepare for MSI suport
 
     - Expand the RISC-V INTC driver to support the new RISC-V AIA
       controller which provides the basis for MSI on RISC-V
 
     - A few fixup for the fallout of the core changes.
 
     The actual MSI parts for RISC-V were finalized late and have been
     post-poned for the next merge window.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmXt7MsTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYofrMD/9Dag12ttmbE2uqzTzlTxc7RHC2MX5n
 VJLt84FNNwGPA4r7WLOOqHrfuvfoGjuWT9pYMrVaXCglRG1CMvL10kHMB2f28UWv
 Qpc5PzbJwpD6tqyfRSFHMoJp63DAI8IpS7J3I8bqnRD8+0PwYn3jMA1+iMZkH0B7
 8uO3mxlFhQ7BFvIAeMEAhR0szuAfvXqEtpi1iTgQTrQ4Je4Rf1pmLjEe2rkwDvF4
 p3SAmPIh4+F3IjO7vNsVkQ2yOarTP2cpSns6JmO8mrobLIVX7ZCQ6uVaVCfBhxfx
 WttuJO6Bmh/I15yDe/waH6q9ym+0VBwYRWi5lonMpViGdq4/D2WVnY1mNeLRIfjl
 X65aMWE1+bhiqyIIUfc24hacf0UgBIlMEW4kJ31VmQzb+OyLDXw+UvzWg1dO6XdA
 3L6j1nRgHk0ea5yFyH6SfH/mrfeyqHuwHqo17KFyHxD3jM2H1RRMplpbwXiOIepp
 KJJ/O06eMEzHqzn4B8GCT2EvX6L2ehgoWbLeEDNLQh/3LwA9OdcBzPr6gsweEl0U
 Q7szJgUWZHeMr39F2rnt0GmvkEuu6muEp/nQzfnohjoYZ0PhpMLSq++4Gi+Ko3fz
 2IyecJ+tlbSfyM5//8AdNnOSpsTG3f8u6B/WwhGp5lIDwMnMzCssgfQmRnc3Uyv5
 kU3pdMjURJaTUA==
 =7aXj
 -----END PGP SIGNATURE-----

Merge tag 'irq-msi-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull MSI updates from Thomas Gleixner:
 "Updates for the MSI interrupt subsystem and initial RISC-V MSI
  support.

  The core changes have been adopted from previous work which converted
  ARM[64] to the new per device MSI domain model, which was merged to
  support multiple MSI domain per device. The ARM[64] changes are being
  worked on too, but have not been ready yet. The core and platform-MSI
  changes have been split out to not hold up RISC-V and to avoid that
  RISC-V builds on the scheduled for removal interfaces.

  The core support provides new interfaces to handle wire to MSI bridges
  in a straight forward way and introduces new platform-MSI interfaces
  which are built on top of the per device MSI domain model.

  Once ARM[64] is converted over the old platform-MSI interfaces and the
  related ugliness in the MSI core code will be removed.

  The actual MSI parts for RISC-V were finalized late and have been
  post-poned for the next merge window.

  Drivers:

   - Add a new driver for the Andes hart-level interrupt controller

   - Rework the SiFive PLIC driver to prepare for MSI suport

   - Expand the RISC-V INTC driver to support the new RISC-V AIA
     controller which provides the basis for MSI on RISC-V

   - A few fixup for the fallout of the core changes"

* tag 'irq-msi-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
  irqchip/riscv-intc: Fix low-level interrupt handler setup for AIA
  x86/apic/msi: Use DOMAIN_BUS_GENERIC_MSI for HPET/IO-APIC domain search
  genirq/matrix: Dynamic bitmap allocation
  irqchip/riscv-intc: Add support for RISC-V AIA
  irqchip/sifive-plic: Improve locking safety by using irqsave/irqrestore
  irqchip/sifive-plic: Parse number of interrupts and contexts early in plic_probe()
  irqchip/sifive-plic: Cleanup PLIC contexts upon irqdomain creation failure
  irqchip/sifive-plic: Use riscv_get_intc_hwnode() to get parent fwnode
  irqchip/sifive-plic: Use devm_xyz() for managed allocation
  irqchip/sifive-plic: Use dev_xyz() in-place of pr_xyz()
  irqchip/sifive-plic: Convert PLIC driver into a platform driver
  irqchip/riscv-intc: Introduce Andes hart-level interrupt controller
  irqchip/riscv-intc: Allow large non-standard interrupt number
  genirq/irqdomain: Don't call ops->select for DOMAIN_BUS_ANY tokens
  irqchip/imx-intmux: Handle pure domain searches correctly
  genirq/msi: Provide MSI_FLAG_PARENT_PM_DEV
  genirq/irqdomain: Reroute device MSI create_mapping
  genirq/msi: Provide allocation/free functions for "wired" MSI interrupts
  genirq/msi: Optionally use dev->fwnode for device domain
  genirq/msi: Provide DOMAIN_BUS_WIRED_TO_MSI
  ...
2024-03-11 14:03:03 -07:00
Joerg Roedel
f379a7e9c3 Merge branches 'arm/mediatek', 'arm/renesas', 'arm/smmu', 'x86/vt-d', 'x86/amd' and 'core' into next 2024-03-08 09:05:59 +01:00
Krzysztof Kozlowski
b42a905b6a iommu: constify of_phandle_args in xlate
The xlate callbacks are supposed to translate of_phandle_args to proper
provider without modifying the of_phandle_args.  Make the argument
pointer to const for code safety and readability.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240216144027.185959-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-03-01 13:46:57 +01:00
Jason Gunthorpe
327e10b47a iommu/arm-smmu-v3: Convert to domain_alloc_paging()
Now that the BLOCKED and IDENTITY behaviors are managed with their own
domains change to the domain_alloc_paging() op.

For now SVA remains using the old interface, eventually it will get its
own op that can pass in the device and mm_struct which will let us have a
sane lifetime for the mmu_notifier.

Call arm_smmu_domain_finalise() early if dev is available.

Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/16-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29 15:12:23 +00:00
Jason Gunthorpe
d8cd200609 iommu/arm-smmu-v3: Pass arm_smmu_domain and arm_smmu_device to finalize
Instead of putting container_of() casts in the internals, use the proper
type in this call chain. This makes it easier to check that the two global
static domains are not leaking into call chains they should not.

Passing the smmu avoids the only caller from having to set it and unset it
in the error path.

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/15-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29 15:12:23 +00:00
Jason Gunthorpe
d36464f40f iommu/arm-smmu-v3: Use the identity/blocked domain during release
Consolidate some more code by having release call
arm_smmu_attach_dev_identity/blocked() instead of open coding this.

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/14-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29 15:12:23 +00:00
Jason Gunthorpe
352bd64cd8 iommu/arm-smmu-v3: Add a global static BLOCKED domain
Using the same design as the IDENTITY domain install an
STRTAB_STE_0_CFG_ABORT STE.

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/13-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29 15:12:22 +00:00
Jason Gunthorpe
12dacfb5b9 iommu/arm-smmu-v3: Add a global static IDENTITY domain
Move to the new static global for identity domains. Move all the logic out
of arm_smmu_attach_dev into an identity only function.

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/12-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29 15:12:22 +00:00
Jason Gunthorpe
ae91f6552c iommu/arm-smmu-v3: Check that the RID domain is S1 in SVA
The SVA code only works if the RID domain is a S1 domain and has already
installed the cdtable.

Originally the check for this was in arm_smmu_sva_bind() but when the op
was removed the test didn't get copied over to the new
arm_smmu_sva_set_dev_pasid().

Without the test wrong usage usually will hit a WARN_ON() in
arm_smmu_write_ctx_desc() due to a missing ctx table.

However, the next patches wil change things so that an IDENTITY domain is
not a struct arm_smmu_domain and this will get into memory corruption if
the struct is wrongly casted.

Fail in arm_smmu_sva_set_dev_pasid() if the STE does not have a S1, which
is a proxy for the STE having a pointer to the CD table. Write it in a way
that will be compatible with the next patches.

Fixes: 386fa64fd5 ("arm-smmu-v3/sva: Add SVA domain support")
Reported-by: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Closes: https://lore.kernel.org/linux-iommu/2a828e481416405fb3a4cceb9e075a59@huawei.com/
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/11-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29 15:12:22 +00:00
Jason Gunthorpe
1b50017d39 iommu/arm-smmu-v3: Remove arm_smmu_master->domain
Introducing global statics which are of type struct iommu_domain, not
struct arm_smmu_domain makes it difficult to retain
arm_smmu_master->domain, as it can no longer point to an IDENTITY or
BLOCKED domain.

The only place that uses the value is arm_smmu_detach_dev(). Change things
to work like other drivers and call iommu_get_domain_for_dev() to obtain
the current domain.

The master->domain is subtly protecting the master->domain_head against
being unused as only PAGING domains will set master->domain and only
paging domains use the master->domain_head. To make it simple keep the
master->domain_head initialized so that the list_del() logic just does
nothing for attached non-PAGING domains.

Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/10-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29 15:12:22 +00:00
Jason Gunthorpe
d550ddc5b7 iommu/arm-smmu-v3: Pass smmu_domain to arm_enable/disable_ats()
The caller already has the domain, just pass it in. A following patch will
remove master->domain.

Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/9-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29 15:12:22 +00:00
Jason Gunthorpe
d2e053d732 iommu/arm-smmu-v3: Put writing the context descriptor in the right order
Get closer to the IOMMU API ideal that changes between domains can be
hitless. The ordering for the CD table entry is not entirely clean from
this perspective.

When switching away from a STE with a CD table programmed in it we should
write the new STE first, then clear any old data in the CD entry.

If we are programming a CD table for the first time to a STE then the CD
entry should be programmed before the STE is loaded.

If we are replacing a CD table entry when the STE already points at the CD
entry then we just need to do the make/break sequence.

Lift this code out of arm_smmu_detach_dev() so it can all be sequenced
properly. The only other caller is arm_smmu_release_device() and it is
going to free the cdtable anyhow, so it doesn't matter what is in it.

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/8-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29 15:12:22 +00:00
Jason Gunthorpe
8c73c32c83 iommu/arm-smmu-v3: Do not change the STE twice during arm_smmu_attach_dev()
This was needed because the STE code required the STE to be in
ABORT/BYPASS inorder to program a cdtable or S2 STE. Now that the STE code
can automatically handle all transitions we can remove this step
from the attach_dev flow.

A few small bugs exist because of this:

1) If the core code does BLOCKED -> UNMANAGED with disable_bypass=false
   then there will be a moment where the STE points at BYPASS. Since
   this can be done by VFIO/IOMMUFD it is a small security race.

2) If the core code does IDENTITY -> DMA then any IOMMU_RESV_DIRECT
   regions will temporarily become BLOCKED. We'd like drivers to
   work in a way that allows IOMMU_RESV_DIRECT to be continuously
   functional during these transitions.

Make arm_smmu_release_device() put the STE back to the correct
ABORT/BYPASS setting. Fix a bug where a IOMMU_RESV_DIRECT was ignored on
this path.

As noted before the reordering of the linked list/STE/CD changes is OK
against concurrent arm_smmu_share_asid() because of the
arm_smmu_asid_lock.

Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/7-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29 15:12:22 +00:00
Jason Gunthorpe
65547275d7 iommu/arm-smmu-v3: Compute the STE only once for each master
Currently arm_smmu_install_ste_for_dev() iterates over every SID and
computes from scratch an identical STE. Every SID should have the same STE
contents. Turn this inside out so that the STE is supplied by the caller
and arm_smmu_install_ste_for_dev() simply installs it to every SID.

This is possible now that the STE generation does not inform what sequence
should be used to program it.

This allows splitting the STE calculation up according to the call site,
which following patches will make use of, and removes the confusing NULL
domain special case that only supported arm_smmu_detach_dev().

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/6-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29 15:12:22 +00:00
Jason Gunthorpe
9f7c689115 iommu/arm-smmu-v3: Hold arm_smmu_asid_lock during all of attach_dev
The BTM support wants to be able to change the ASID of any smmu_domain.
When it goes to do this it holds the arm_smmu_asid_lock and iterates over
the target domain's devices list.

During attach of a S1 domain we must ensure that the devices list and
CD are in sync, otherwise we could miss CD updates or a parallel CD update
could push an out of date CD.

This is pretty complicated, and almost works today because
arm_smmu_detach_dev() removes the master from the linked list before
working on the CD entries, preventing parallel update of the CD.

However, it does have an issue where the CD can remain programed while the
domain appears to be unattached. arm_smmu_share_asid() will then not clear
any CD entriess and install its own CD entry with the same ASID
concurrently. This creates a small race window where the IOMMU can see two
ASIDs pointing to different translations.

       CPU0                                   CPU1
arm_smmu_attach_dev()
   arm_smmu_detach_dev()
     spin_lock_irqsave(&smmu_domain->devices_lock, flags);
     list_del(&master->domain_head);
     spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);

				      arm_smmu_mmu_notifier_get()
				       arm_smmu_alloc_shared_cd()
					arm_smmu_share_asid():
                                          // Does nothing due to list_del above
					  arm_smmu_update_ctx_desc_devices()
					  arm_smmu_tlb_inv_asid()
				       arm_smmu_write_ctx_desc()
					 ** Now the ASID is in two CDs
					    with different translation

     arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);

Solve this by wrapping most of the attach flow in the
arm_smmu_asid_lock. This locks more than strictly needed to prepare for
the next patch which will reorganize the order of the linked list, STE and
CD changes.

Move arm_smmu_detach_dev() till after we have initialized the domain so
the lock can be held for less time.

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/5-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29 15:12:21 +00:00
Jason Gunthorpe
71b0aa10b1 iommu/arm-smmu-v3: Build the whole STE in arm_smmu_make_s2_domain_ste()
Half the code was living in arm_smmu_domain_finalise_s2(), just move it
here and take the values directly from the pgtbl_ops instead of storing
copies.

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29 15:12:21 +00:00
Jason Gunthorpe
efe15df087 iommu/arm-smmu-v3: Move the STE generation for S1 and S2 domains into functions
This is preparation to move the STE calculation higher up in to the call
chain and remove arm_smmu_write_strtab_ent(). These new functions will be
called directly from attach_dev.

Reviewed-by: Moritz Fischer <mdf@kernel.org>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29 15:12:21 +00:00
Jason Gunthorpe
7686aa5f8d iommu/arm-smmu-v3: Consolidate the STE generation for abort/bypass
This allows writing the flow of arm_smmu_write_strtab_ent() around abort
and bypass domains more naturally.

Note that the core code no longer supplies NULL domains, though there is
still a flow in the driver that end up in arm_smmu_write_strtab_ent() with
NULL. A later patch will remove it.

Remove the duplicate calculation of the STE in arm_smmu_init_bypass_stes()
and remove the force parameter. arm_smmu_rmr_install_bypass_ste() can now
simply invoke arm_smmu_make_bypass_ste() directly.

Rename arm_smmu_init_bypass_stes() to arm_smmu_init_initial_stes() to
better reflect its purpose.

Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/2-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29 15:12:21 +00:00
Jason Gunthorpe
7da51af912 iommu/arm-smmu-v3: Make STE programming independent of the callers
As the comment in arm_smmu_write_strtab_ent() explains, this routine has
been limited to only work correctly in certain scenarios that the caller
must ensure. Generally the caller must put the STE into ABORT or BYPASS
before attempting to program it to something else.

The iommu core APIs would ideally expect the driver to do a hitless change
of iommu_domain in a number of cases:

 - RESV_DIRECT support wants IDENTITY -> DMA -> IDENTITY to be hitless
   for the RESV ranges

 - PASID upgrade has IDENTIY on the RID with no PASID then a PASID paging
   domain installed. The RID should not be impacted

 - PASID downgrade has IDENTIY on the RID and all PASID's removed.
   The RID should not be impacted

 - RID does PAGING -> BLOCKING with active PASID, PASID's should not be
   impacted

 - NESTING -> NESTING for carrying all the above hitless cases in a VM
   into the hypervisor. To comprehensively emulate the HW in a VM we
   should assume the VM OS is running logic like this and expecting
   hitless updates to be relayed to real HW.

For CD updates arm_smmu_write_ctx_desc() has a similar comment explaining
how limited it is, and the driver does have a need for hitless CD updates:

 - SMMUv3 BTM S1 ASID re-label

 - SVA mm release should change the CD to answert not-present to all
   requests without allowing logging (EPD0)

The next patches/series are going to start removing some of this logic
from the callers, and add more complex state combinations than currently.
At the end everything that can be hitless will be hitless, including all
of the above.

Introduce arm_smmu_write_ste() which will run through the multi-qword
programming sequence to avoid creating an incoherent 'torn' STE in the HW
caches. It automatically detects which of two algorithms to use:

1) The disruptive V=0 update described in the spec which disrupts the
   entry and does three syncs to make the change:
       - Write V=0 to QWORD 0
       - Write the entire STE except QWORD 0
       - Write QWORD 0

2) A hitless update algorithm that follows the same rational that the driver
   already uses. It is safe to change IGNORED bits that HW doesn't use:
       - Write the target value into all currently unused bits
       - Write a single QWORD, this makes the new STE live atomically
       - Ensure now unused bits are 0

The detection of which path to use and the implementation of the hitless
update rely on a "used bitmask" describing what bits the HW is actually
using based on the V/CFG/etc bits. This flows from the spec language,
typically indicated as IGNORED.

Knowing which bits the HW is using we can update the bits it does not use
and then compute how many QWORDS need to be changed. If only one qword
needs to be updated the hitless algorithm is possible.

Later patches will include CD updates in this mechanism so make the
implementation generic using a struct arm_smmu_entry_writer and struct
arm_smmu_entry_writer_ops to abstract the differences between STE and CD
to be plugged in.

At this point it generates the same sequence of updates as the current
code, except that zeroing the VMID on entry to BYPASS/ABORT will do an
extra sync (this seems to be an existing bug).

Going forward this will use a V=0 transition instead of cycling through
ABORT if a hitfull change is required. This seems more appropriate as ABORT
will fail DMAs without any logging, but dropping a DMA due to transient
V=0 is probably signaling a bug, so the C_BAD_STE is valuable.

Add STRTAB_STE_1_SHCFG_INCOMING to s2_cfg, this was editing the STE in
place and subtly inherited the value of data[1] from abort/bypass.

Signed-off-by: Michael Shavit <mshavit@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-29 15:12:21 +00:00
Jason Gunthorpe
b5bf7778b7 iommu/arm-smmu-v3: Do not use GFP_KERNEL under as spinlock
If the SMMU is configured to use a two level CD table then
arm_smmu_write_ctx_desc() allocates a CD table leaf internally using
GFP_KERNEL. Due to recent changes this is being done under a spinlock to
iterate over the device list - thus it will trigger a sleeping while
atomic warning:

  arm_smmu_sva_set_dev_pasid()
    mutex_lock(&sva_lock);
    __arm_smmu_sva_bind()
     arm_smmu_mmu_notifier_get()
      spin_lock_irqsave()
      arm_smmu_write_ctx_desc()
	arm_smmu_get_cd_ptr()
         arm_smmu_alloc_cd_leaf_table()
	  dmam_alloc_coherent(GFP_KERNEL)

This is a 64K high order allocation and really should not be done
atomically.

At the moment the rework of the SVA to follow the new API is half
finished. Recently the CD table memory was moved from the domain to the
master, however we have the confusing situation where the SVA code is
wrongly using the RID domains device's list to track which CD tables the
SVA is installed in.

Remove the logic to replicate the CD across all the domain's masters
during attach. We know which master and which CD table the PASID should be
installed in.

Right now SVA only works when dma-iommu.c is in control of the RID
translation, which means we have a single iommu_domain shared across the
entire group and that iommu_domain is not shared outside the group.

Critically this means that the iommu_group->devices list and RID's
smmu_domain->devices list describe the same set of masters.

For PCI cases the core code also insists on singleton groups so there is
only one entry in the smmu_domain->devices list that is equal to the
master being passed in to arm_smmu_sva_set_dev_pasid().

Only non-PCI cases may have multi-device groups. However, the core code
will repeat the calls to arm_smmu_sva_set_dev_pasid() across the entire
iommu_group->devices list.

Instead of having arm_smmu_mmu_notifier_get() indirectly loop over all the
devices in the group via the RID's smmu_domain, rely on
__arm_smmu_sva_bind() to be called for each device in the group and
install the repeated CD entry that way.

This avoids taking the spinlock to access the devices list and permits the
arm_smmu_write_ctx_desc() to use a sleeping allocation. Leave the
arm_smmu_mm_release() as a confusing situation, this requires tracking
attached masters inside the SVA domain.

Removing the loop allows arm_smmu_write_ctx_desc() to be called outside
the spinlock and thus is safe to use GFP_KERNEL.

Move the clearing of the CD into arm_smmu_sva_remove_dev_pasid() so that
arm_smmu_mmu_notifier_get/put() remain paired functions.

Fixes: 24503148c5 ("iommu/arm-smmu-v3: Refactor write_ctx_desc")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/4e25d161-0cf8-4050-9aa3-dfa21cd63e56@moroto.mountain/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Link: https://lore.kernel.org/r/0-v3-11978fc67151+112-smmu_cd_atomic_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-02-22 12:34:11 +00:00
Lu Baolu
3dfa64aecb iommu: Make iommu_report_device_fault() return void
As the iommu_report_device_fault() has been converted to auto-respond a
page fault if it fails to enqueue it, there's no need to return a code
in any case. Make it return void.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240212012227.119381-17-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16 15:19:37 +01:00
Lu Baolu
b554e396e5 iommu: Make iopf_group_response() return void
The iopf_group_response() should return void, as nothing can do anything
with the failure. This implies that ops->page_response() must also return
void; this is consistent with what the drivers do. The failure paths,
which are all integrity validations of the fault, should be WARN_ON'd,
not return codes.

If the iommu core fails to enqueue the fault, it should respond the fault
directly by calling ops->page_response() instead of returning an error
number and relying on the iommu drivers to do so. Consolidate the error
fault handling code in the core.

Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240212012227.119381-16-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16 15:19:36 +01:00
Lu Baolu
17c51a0ea3 iommu: Separate SVA and IOPF
Add CONFIG_IOMMU_IOPF for page fault handling framework and select it
from its real consumer. Move iopf function declaration from iommu-sva.h
to iommu.h and remove iommu-sva.h as it's empty now.

Consolidate all SVA related code into iommu-sva.c:
- Move iommu_sva_domain_alloc() from iommu.c to iommu-sva.c.
- Move sva iopf handling code from io-pgfault.c to iommu-sva.c.

Consolidate iommu_report_device_fault() and iommu_page_response() into
io-pgfault.c.

Export iopf_free_group() and iopf_group_response() for iopf handlers
implemented in modules. Some functions are renamed with more meaningful
names. No other intentional functionality changes.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-11-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16 15:19:29 +01:00
Lu Baolu
3f02a9dc70 iommu: Merge iommu_fault_event and iopf_fault
The iommu_fault_event and iopf_fault data structures store the same
information about an iopf fault. They are also used in the same way.
Merge these two data structures into a single one to make the code
more concise and easier to maintain.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-8-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16 15:19:26 +01:00
Lu Baolu
1ff25d798e iommu: Remove iommu_[un]register_device_fault_handler()
The individual iommu driver reports the iommu page faults by calling
iommu_report_device_fault(), where a pre-registered device fault handler
is called to route the fault to another fault handler installed on the
corresponding iommu domain.

The pre-registered device fault handler is static and won't be dynamic
as the fault handler is eventually per iommu domain. Replace calling
device fault handler with iommu_queue_iopf().

After this replacement, the registering and unregistering fault handler
interfaces are not needed anywhere. Remove the interfaces and the related
data structures to avoid dead code.

Convert cookie parameter of iommu_queue_iopf() into a device pointer that
is really passed.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16 15:19:24 +01:00
Lu Baolu
66014df73b iommu/arm-smmu-v3: Remove unrecoverable faults reporting
No device driver registers fault handler to handle the reported
unrecoveraable faults. Remove it to avoid dead code.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-02-16 15:19:21 +01:00
Thomas Gleixner
14fd06c776 irqchip: Convert all platform MSI users to the new API
Switch all the users of the platform MSI domain over to invoke the new
interfaces which branch to the original platform MSI functions when the
irqdomain associated to the caller device does not yet provide MSI parent
functionality.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240127161753.114685-7-apatel@ventanamicro.com
2024-02-15 17:55:40 +01:00
Linus Torvalds
0dde2bf67b IOMMU Updates for Linux v6.8
Including:
 
 	- Core changes:
 	  - Fix race conditions in device probe path
 	  - Retire IOMMU bus_ops
 	  - Support for passing custom allocators to page table drivers
 	  - Clean up Kconfig around IOMMU_SVA
 	  - Support for sharing SVA domains with all devices bound to
 	    a mm
 	  - Firmware data parsing cleanup
 	  - Tracing improvements for iommu-dma code
 	  - Some smaller fixes and cleanups
 
 	- ARM-SMMU drivers:
 	  - Device-tree binding updates:
 	     - Add additional compatible strings for Qualcomm SoCs
 	     - Document Adreno clocks for Qualcomm's SM8350 SoC
 	  - SMMUv2:
 	    - Implement support for the ->domain_alloc_paging() callback
 	    - Ensure Secure context is restored following suspend of Qualcomm SMMU
 	      implementation
 	  - SMMUv3:
 	    - Disable stalling mode for the "quiet" context descriptor
 	    - Minor refactoring and driver cleanups
 
 	 - Intel VT-d driver:
 	   - Cleanup and refactoring
 
 	 - AMD IOMMU driver:
 	   - Improve IO TLB invalidation logic
 	   - Small cleanups and improvements
 
 	 - Rockchip IOMMU driver:
 	   - DT binding update to add Rockchip RK3588
 
 	 - Apple DART driver:
 	   - Apple M1 USB4/Thunderbolt DART support
 	   - Cleanups
 
 	 - Virtio IOMMU driver:
 	   - Add support for iotlb_sync_map
 	   - Enable deferred IO TLB flushes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmWecQoACgkQK/BELZcB
 GuN5ZxAAzC5QUKAzANx0puk7QhPpKKlbSvj6Q7iRgCLk00KJO1+VQh9v4ouCmXqF
 kn3Ko8gddjhtrgwN0OQ54F39cLUrp1SBemy71K5YOR+vu8VKtwtmawZGeeRZ+k+B
 Eohw58oaXTiR1maYvoLixLYczLrjklqyJOQ1vZ0GxFGxDqrFByAryHDgG/3OCpJx
 C9e6PsLbbfhfqA8Kv97iKcBqniGbXxAMuodqSUG0buQ3oZgfpIP6Bt3EgUzFGPGk
 3BTlYxowS/gkjUWd3fgjQFIFLTA01u9FhpA2Jb0a4v67pUCR64YxHN7rBQ6ZChtG
 kB9laQfU9re79RsHhqQzr0JT9x/eyq7pzGzjp5TV5TPW6IW+sqjMIPhzd9P08Ef7
 BclkCVobx0jSAHOhnnG4QJiKANr2Y2oM3HfsAJccMMY45RRhUKmVqM7jxMPfGn3A
 i+inlee73xTjZXJse1EWG1fmKKMLvX9LDEp4DyOfn9CqVT+7hpZvzPjfbGr937Rm
 JlwXhF3rQXEpOCagEsbt1vOf+V0e9QiCLf1Y2KpkIkDbE5wwSD/2qLm3tFhJG3oF
 fkW+J14Cid0pj+hY0afGe0kOUOIYlimu0nFmSf0pzMH+UktZdKogSfyb1gSDsy+S
 rsZRGPFhMJ832ExqhlDfxqBebqh+jsfKynlskui6Td5C9ZULaHA=
 =q751
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:
 "Core changes:
   - Fix race conditions in device probe path
   - Retire IOMMU bus_ops
   - Support for passing custom allocators to page table drivers
   - Clean up Kconfig around IOMMU_SVA
   - Support for sharing SVA domains with all devices bound to a mm
   - Firmware data parsing cleanup
   - Tracing improvements for iommu-dma code
   - Some smaller fixes and cleanups

  ARM-SMMU drivers:
   - Device-tree binding updates:
      - Add additional compatible strings for Qualcomm SoCs
      - Document Adreno clocks for Qualcomm's SM8350 SoC
   - SMMUv2:
      - Implement support for the ->domain_alloc_paging() callback
      - Ensure Secure context is restored following suspend of Qualcomm
        SMMU implementation
   - SMMUv3:
      - Disable stalling mode for the "quiet" context descriptor
      - Minor refactoring and driver cleanups

  Intel VT-d driver:
   - Cleanup and refactoring

  AMD IOMMU driver:
   - Improve IO TLB invalidation logic
   - Small cleanups and improvements

  Rockchip IOMMU driver:
   - DT binding update to add Rockchip RK3588

  Apple DART driver:
   - Apple M1 USB4/Thunderbolt DART support
   - Cleanups

  Virtio IOMMU driver:
   - Add support for iotlb_sync_map
   - Enable deferred IO TLB flushes"

* tag 'iommu-updates-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (66 commits)
  iommu: Don't reserve 0-length IOVA region
  iommu/vt-d: Move inline helpers to header files
  iommu/vt-d: Remove unused vcmd interfaces
  iommu/vt-d: Remove unused parameter of intel_pasid_setup_pass_through()
  iommu/vt-d: Refactor device_to_iommu() to retrieve iommu directly
  iommu/sva: Fix memory leak in iommu_sva_bind_device()
  dt-bindings: iommu: rockchip: Add Rockchip RK3588
  iommu/dma: Trace bounce buffer usage when mapping buffers
  iommu/arm-smmu: Convert to domain_alloc_paging()
  iommu/arm-smmu: Pass arm_smmu_domain to internal functions
  iommu/arm-smmu: Implement IOMMU_DOMAIN_BLOCKED
  iommu/arm-smmu: Convert to a global static identity domain
  iommu/arm-smmu: Reorganize arm_smmu_domain_add_master()
  iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
  iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent()
  iommu/arm-smmu-v3: Add a type for the STE
  iommu/arm-smmu-v3: disable stall for quiet_cd
  iommu/qcom: restore IOMMU state if needed
  iommu/arm-smmu-qcom: Add QCM2290 MDSS compatible
  iommu/arm-smmu-qcom: Add missing GMU entry to match table
  ...
2024-01-18 15:16:57 -08:00
Kirill A. Shutemov
5e0a760b44 mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER
commit 23baf831a3 ("mm, treewide: redefine MAX_ORDER sanely") has
changed the definition of MAX_ORDER to be inclusive.  This has caused
issues with code that was not yet upstream and depended on the previous
definition.

To draw attention to the altered meaning of the define, rename MAX_ORDER
to MAX_PAGE_ORDER.

Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-08 15:27:15 -08:00
Joerg Roedel
75f74f85a4 Merge branches 'apple/dart', 'arm/rockchip', 'arm/smmu', 'virtio', 'x86/vt-d', 'x86/amd' and 'core' into next 2024-01-03 09:59:32 +01:00
Jason Gunthorpe
9fde008337 iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
Currently this is exactly the same as ARM_SMMU_DOMAIN_S2, so just remove
it. The ongoing work to add nesting support through iommufd will do
something a little different.

Reviewed-by: Moritz Fischer <mdf@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-13 12:32:19 +00:00
Jason Gunthorpe
12a48fe90d iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent()
The only caller is arm_smmu_install_ste_for_dev() which never has a NULL
master. Remove the confusing if.

Reviewed-by: Moritz Fischer <mdf@kernel.org>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-13 12:32:19 +00:00
Jason Gunthorpe
57b8904887 iommu/arm-smmu-v3: Add a type for the STE
Instead of passing a naked __le16 * around to represent a STE wrap it in a
"struct arm_smmu_ste" with an array of the correct size. This makes it
much clearer which functions will comprise the "STE API".

Reviewed-by: Moritz Fischer <mdf@kernel.org>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-13 12:32:19 +00:00
Wenkai Lin
b41932f544 iommu/arm-smmu-v3: disable stall for quiet_cd
In the stall model, invalid transactions were expected to be
stalled and aborted by the IOPF handler.

However, when killing a test case with a huge amount of data, the
accelerator streamline can not stop until all data is consumed
even if the page fault handler reports errors. As a result, the
kill may take a long time, about 10 seconds with numerous iopf
interrupts.

So disable stall for quiet_cd in the non-force stall model, since
force stall model (STALL_MODEL==0b10) requires CD.S must be 1.

Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Suggested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20231206005727.46150-1-zhangfei.gao@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-12 12:38:52 +00:00
Jason Gunthorpe
eda1a94caf iommu: Mark dev_iommu_priv_set() with a lockdep
A perfect driver would only call dev_iommu_priv_set() from its probe
callback. We've made it functionally correct to call it from the of_xlate
by adding a lock around that call.

lockdep assert that iommu_probe_device_lock is held to discourage misuse.

Exclude PPC kernels with CONFIG_FSL_PAMU turned on because FSL_PAMU uses a
global static for its priv and abuses priv for its domain.

Remove the pointless stores of NULL, all these are on paths where the core
code will free dev->iommu after the op returns.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Tested-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/5-v2-16e4def25ebb+820-iommu_fwspec_p1_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-12-12 10:18:49 +01:00
Tina Zhang
2396046d75 iommu: Add mm_get_enqcmd_pasid() helper function
mm_get_enqcmd_pasid() should be used by architecture code and closely
related to learn the PASID value that the x86 ENQCMD operation should
use for the mm.

For the moment SMMUv3 uses this without any connection to ENQCMD, it
will be cleaned up similar to how the prior patch made VT-d use the
PASID argument of set_dev_pasid().

The motivation is to replace mm->pasid with an iommu private data
structure that is introduced in a later patch.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20231027000525.1278806-4-tina.zhang@intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-12-12 10:11:29 +01:00
Robin Murphy
e7080665c9 iommu: Clean up open-coded ownership checks
Some drivers already implement their own defence against the possibility
of being given someone else's device. Since this is now taken care of by
the core code (and via a slightly different path from the original
fwspec-based idea), let's clean them up.

Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/58a9879ce3f03562bb061e6714fe6efb554c3907.1700589539.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-11-27 11:03:16 +01:00
Michael Shavit
37ed36448f iommu/arm-smmu-v3-sva: Remove bond refcount
Always allocate a new arm_smmu_bond in __arm_smmu_sva_bind and remove
the bond refcount since arm_smmu_bond can never be shared across calls
to __arm_smmu_sva_bind.

The iommu framework will not allocate multiple SVA domains for the same
(device/mm) pair, nor will it call set_dev_pasid for a device if a
domain is already attached on the given pasid. There's also a one-to-one
mapping between MM and PASID. __arm_smmu_sva_bind is therefore never
called with the same (device/mm) pair, and so there's no reason to try
and normalize allocations of the arm_smmu_bond struct for a (device/mm)
pair across set_dev_pasid.

Signed-off-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20230905194849.v1.2.Id3ab7cf665bcead097654937233a645722a4cce3@changeid
Signed-off-by: Will Deacon <will@kernel.org>
2023-10-12 18:37:06 +01:00
Michael Shavit
d912aed14f iommu/arm-smmu-v3-sva: Remove unused iommu_sva handle
The __arm_smmu_sva_bind function returned an unused iommu_sva handle
that can be removed.

Signed-off-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20230905194849.v1.1.Ib483f67c9e2ad90ea2254b4b5ac696e4b68aa638@changeid
Signed-off-by: Will Deacon <will@kernel.org>
2023-10-12 18:37:06 +01:00
Michael Shavit
475918e9c4 iommu/arm-smmu-v3: Rename cdcfg to cd_table
cdcfg is a confusing name, especially given other variables with the cfg
suffix in this driver. cd_table more clearly describes what is being
operated on.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Michael Shavit <mshavit@google.com>
Link: https://lore.kernel.org/r/20230915211705.v8.9.I5ee79793b444ddb933e8bc1eb7b77e728d7f8350@changeid
Signed-off-by: Will Deacon <will@kernel.org>
2023-10-12 17:08:17 +01:00
Michael Shavit
6032f58498 iommu/arm-smmu-v3: Update comment about STE liveness
Update the comment to reflect the fact that the STE is not always
installed. arm_smmu_domain_finalise_s1 intentionnaly calls
arm_smmu_write_ctx_desc while the STE is not installed.

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Michael Shavit <mshavit@google.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20230915211705.v8.8.I7a8beb615e2520ad395d96df94b9ab9708ee0d9c@changeid
Signed-off-by: Will Deacon <will@kernel.org>
2023-10-12 17:08:17 +01:00
Michael Shavit
5e14313df2 iommu/arm-smmu-v3: Cleanup arm_smmu_domain_finalise
Remove unused master parameter now that the CD table is allocated
elsewhere.

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Michael Shavit <mshavit@google.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20230915211705.v8.7.Iff18df41564b9df82bf40b3ec7af26b87f08ef6e@changeid
Signed-off-by: Will Deacon <will@kernel.org>
2023-10-12 17:08:17 +01:00
Michael Shavit
10e4968cd5 iommu/arm-smmu-v3: Move CD table to arm_smmu_master
With this change, each master will now own its own CD table instead of
sharing one with other masters attached to the same domain. Attaching a
stage 1 domain installs CD entries into the master's CD table. SVA
writes its CD entries into each master's CD table if the domain is
shared across masters.

Also add the device to the devices list before writing the CD to the
table so that SVA will know that the CD needs to be re-written to this
device's CD table as well if it decides to update the CD's ASID
concurrently with this function.

Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Michael Shavit <mshavit@google.com>
Link: https://lore.kernel.org/r/20230915211705.v8.6.Ice063dcf87d1b777a72e008d9e3406d2bcf6d876@changeid
Signed-off-by: Will Deacon <will@kernel.org>
2023-10-12 17:08:17 +01:00
Michael Shavit
24503148c5 iommu/arm-smmu-v3: Refactor write_ctx_desc
Update arm_smmu_write_ctx_desc and downstream functions to operate on
a master instead of an smmu domain. We expect arm_smmu_write_ctx_desc()
to only be called to write a CD entry into a CD table owned by the
master. Under the hood, arm_smmu_write_ctx_desc still fetches the CD
table from the domain that is attached to the master, but a subsequent
commit will move that table's ownership to the master.

Note that this change isn't a nop refactor since SVA will call
arm_smmu_write_ctx_desc in a loop for every master the domain is
attached to despite the fact that they all share the same CD table. This
loop may look weird but becomes necessary when the CD table becomes
per-master in a subsequent commit.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Michael Shavit <mshavit@google.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20230915211705.v8.5.I219054a6cf538df5bb22f4ada2d9933155d6058c@changeid
Signed-off-by: Will Deacon <will@kernel.org>
2023-10-12 17:08:17 +01:00
Michael Shavit
1228cc509f iommu/arm-smmu-v3: move stall_enabled to the cd table
A domain can be attached to multiple masters with different
master->stall_enabled values. The stall bit of a CD entry should follow
master->stall_enabled and has an inverse relationship with the
STE.S1STALLD bit.

The stall_enabled bit does not depend on any property of the domain, so
move it out of the arm_smmu_domain struct.  Move it to the CD table
struct so that it can fully describe how CD entries should be written to
it.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Michael Shavit <mshavit@google.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20230915211705.v8.4.I5aa89c849228794a64146cfe86df21fb71629384@changeid
Signed-off-by: Will Deacon <will@kernel.org>
2023-10-12 17:08:17 +01:00
Michael Shavit
e3aad74c51 iommu/arm-smmu-v3: Encapsulate ctx_desc_cfg init in alloc_cd_tables
This is slighlty cleaner: arm_smmu_ctx_desc_cfg is initialized in a
single function instead of having pieces set ahead-of time by its caller.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Michael Shavit <mshavit@google.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20230915211705.v8.3.I875254464d044a8ce8b3a2ad6beb655a4a006456@changeid
Signed-off-by: Will Deacon <will@kernel.org>
2023-10-12 17:08:17 +01:00
Michael Shavit
1f85888340 iommu/arm-smmu-v3: Replace s1_cfg with cdtab_cfg
Remove struct arm_smmu_s1_cfg. This is really just a CD table with a
bit of extra information. Move other attributes of the CD table that
were held there into the existing CD table structure, struct
arm_smmu_ctx_desc_cfg, and replace all usages of arm_smmu_s1_cfg with
arm_smmu_ctx_desc_cfg.

For clarity, use the name "cd_table" for the variables pointing to
arm_smmu_ctx_desc_cfg in the new code instead of cdcfg. A later patch
will make this fully consistent.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Michael Shavit <mshavit@google.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20230915211705.v8.2.I1ef1ed19d7786c8176a0d05820c869e650c8d68f@changeid
Signed-off-by: Will Deacon <will@kernel.org>
2023-10-12 17:08:16 +01:00