45 Commits

Author SHA1 Message Date
Thomas Gleixner
a539cc86a1 x86/vector: Rename send_cleanup_vector() to vector_schedule_cleanup()
Rename send_cleanup_vector() to vector_schedule_cleanup() to prepare for
replacing the vector cleanup IPI with a timer callback.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
Link: https://lore.kernel.org/r/20230621171248.6805-2-xin3.li@intel.com
2023-08-06 14:15:09 +02:00
Peter Zijlstra
b1fe7f2cda x86,intel_iommu: Replace cmpxchg_double()
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20230531132323.855976804@infradead.org
2023-06-05 09:36:38 +02:00
Joerg Roedel
e51b419839 Merge branches 'iommu/fixes', 'arm/allwinner', 'arm/exynos', 'arm/mediatek', 'arm/omap', 'arm/renesas', 'arm/rockchip', 'arm/smmu', 'ppc/pamu', 'unisoc', 'x86/vt-d', 'x86/amd', 'core' and 'platform-remove_new' into next 2023-04-14 13:45:50 +02:00
Christophe JAILLET
41d71e09a1 iommu/vt-d: Do not use GFP_ATOMIC when not needed
There is no need to use GFP_ATOMIC here. GFP_KERNEL is already used for
some other memory allocations just a few lines above.

Commit e3a981d61d15 ("iommu/vt-d: Convert allocations to GFP_KERNEL") has
changed the other memory allocation flags.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/e2a8a1019ffc8a86b4b4ed93def3623f60581274.1675542576.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-04-13 12:05:48 +02:00
Lu Baolu
c7d624520c iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()
The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac932
("iommu/vt-d: Introduce a rwsem to protect global data structures"). It
is used to protect DMAR related global data from DMAR hotplug operations.

Using dmar_global_lock in intel_irq_remapping_alloc() is unnecessary as
the DMAR global data structures are not touched there. Remove it to avoid
below lockdep warning.

 ======================================================
 WARNING: possible circular locking dependency detected
 6.3.0-rc2  Not tainted
 ------------------------------------------------------
 swapper/0/1 is trying to acquire lock:
 ff1db4cb40178698 (&domain->mutex){+.+.}-{3:3},
   at: __irq_domain_alloc_irqs+0x3b/0xa0

 but task is already holding lock:
 ffffffffa0c1cdf0 (dmar_global_lock){++++}-{3:3},
   at: intel_iommu_init+0x58e/0x880

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 ->  (dmar_global_lock){++++}-{3:3}:
        lock_acquire+0xd6/0x320
        down_read+0x42/0x180
        intel_irq_remapping_alloc+0xad/0x750
        mp_irqdomain_alloc+0xb8/0x2b0
        irq_domain_alloc_irqs_locked+0x12f/0x2d0
        __irq_domain_alloc_irqs+0x56/0xa0
        alloc_isa_irq_from_domain.isra.7+0xa0/0xe0
        mp_map_pin_to_irq+0x1dc/0x330
        setup_IO_APIC+0x128/0x210
        apic_intr_mode_init+0x67/0x110
        x86_late_time_init+0x24/0x40
        start_kernel+0x41e/0x7e0
        secondary_startup_64_no_verify+0xe0/0xeb

 ->  (&domain->mutex){+.+.}-{3:3}:
        check_prevs_add+0x160/0xef0
        __lock_acquire+0x147d/0x1950
        lock_acquire+0xd6/0x320
        __mutex_lock+0x9c/0xfc0
        __irq_domain_alloc_irqs+0x3b/0xa0
        dmar_alloc_hwirq+0x9e/0x120
        iommu_pmu_register+0x11d/0x200
        intel_iommu_init+0x5de/0x880
        pci_iommu_init+0x12/0x40
        do_one_initcall+0x65/0x350
        kernel_init_freeable+0x3ca/0x610
        kernel_init+0x1a/0x140
        ret_from_fork+0x29/0x50

 other info that might help us debug this:

 Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(dmar_global_lock);
                                lock(&domain->mutex);
                                lock(dmar_global_lock);
   lock(&domain->mutex);

                *** DEADLOCK ***

Fixes: 9dbb8e3452ab ("irqdomain: Switch to per-domain locking")
Reviewed-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Tested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20230314051836.23817-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20230329134721.469447-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-03-31 10:06:15 +02:00
Jason Gunthorpe
f188bdb5f1 iommu/x86: Replace IOMMU_CAP_INTR_REMAP with IRQ_DOMAIN_FLAG_ISOLATED_MSI
On x86 platforms when the HW can support interrupt remapping the iommu
driver creates an irq_domain for the IR hardware and creates a child MSI
irq_domain.

When the global irq_remapping_enabled is set, the IR MSI domain is
assigned to the PCI devices (by intel_irq_remap_add_device(), or
amd_iommu_set_pci_msi_domain()) making those devices have the isolated MSI
property.

Due to how interrupt domains work, setting IRQ_DOMAIN_FLAG_ISOLATED_MSI on
the parent IR domain will cause all struct devices attached to it to
return true from msi_device_has_isolated_msi(). This replaces the
IOMMU_CAP_INTR_REMAP flag as all places using IOMMU_CAP_INTR_REMAP also
call msi_device_has_isolated_msi()

Set the flag and delete the cap.

Link: https://lore.kernel.org/r/7-v3-3313bb5dd3a3+10f11-secure_msi_jgg@nvidia.com
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-01-11 16:27:23 -04:00
Linus Torvalds
4f292c4de4 New Feature:
* Randomize the per-cpu entry areas
 Cleanups:
 * Have CR3_ADDR_MASK use PHYSICAL_PAGE_MASK instead of open
   coding it
 * Move to "native" set_memory_rox() helper
 * Clean up pmd_get_atomic() and i386-PAE
 * Remove some unused page table size macros
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmOc53UACgkQaDWVMHDJ
 krCUHw//SGZ+La0hLZLAiAiZTXLZZHpYkOmg1Oj1+11qSU11uZzTFqDpauhaKpRS
 cJCSh+D+RXe5e2ipgt0+Zl0hESLt7pJf8258OE4ra0DL/IlyO9uqruAs9Kn3eRS/
 Fk76nG8gdEU+JKJqpG02GqOLslYQuIy96n9hpuj1x25b614+uezPfC7S4XEat0NT
 MbJQ+jnVDf16aJIJkzT+iSwhubDVeh+bSHeO0SSCzX23WLUqDeg5NvlyxoCHGbBh
 UpUTWggV/0pYAkBKRHToeJs8qTWREwuuH/8JGewpe9A0tjdB5wyZfNL2PuracweN
 9MauXC3T5f0+Ca4yIIaPq1fF7Ny/PR2dBFihk27rOD0N7tjaZxNwal2pB1sZcmvZ
 +PAokjyTPVH5ZXjkMYGGAUe1jyjwr2+TgFSZxhTnDuGtyVQiY4pihGKOifLCX6tv
 x6khvYeTBw7wfaDRtKEAf+2kLHYn+71HszHP/8bNKX9T03h+Zf0i1wdZu5xbM5Gc
 VK2wR7bCC+UftJJYG0pldcHg2qaF19RBHK2tLwp7zngUv7lTbkKfkgKjre73KV2a
 D4b76lrqdUMo6UYwYdw7WtDyarZS4OVLq2DcNhwwMddBCaX8kyN5a4AqwQlZYJ0u
 dM+kuMofE8U3yMxmMhJimkZUsj09yLHIqfynY0jbAcU3nhKZZNY=
 =wwVF
 -----END PGP SIGNATURE-----

Merge tag 'x86_mm_for_6.2_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 mm updates from Dave Hansen:
 "New Feature:

   - Randomize the per-cpu entry areas

  Cleanups:

   - Have CR3_ADDR_MASK use PHYSICAL_PAGE_MASK instead of open coding it

   - Move to "native" set_memory_rox() helper

   - Clean up pmd_get_atomic() and i386-PAE

   - Remove some unused page table size macros"

* tag 'x86_mm_for_6.2_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits)
  x86/mm: Ensure forced page table splitting
  x86/kasan: Populate shadow for shared chunk of the CPU entry area
  x86/kasan: Add helpers to align shadow addresses up and down
  x86/kasan: Rename local CPU_ENTRY_AREA variables to shorten names
  x86/mm: Populate KASAN shadow for entire per-CPU range of CPU entry area
  x86/mm: Recompute physical address for every page of per-CPU CEA mapping
  x86/mm: Rename __change_page_attr_set_clr(.checkalias)
  x86/mm: Inhibit _PAGE_NX changes from cpa_process_alias()
  x86/mm: Untangle __change_page_attr_set_clr(.checkalias)
  x86/mm: Add a few comments
  x86/mm: Fix CR3_ADDR_MASK
  x86/mm: Remove P*D_PAGE_MASK and P*D_PAGE_SIZE macros
  mm: Convert __HAVE_ARCH_P..P_GET to the new style
  mm: Remove pointless barrier() after pmdp_get_lockless()
  x86/mm/pae: Get rid of set_64bit()
  x86_64: Remove pointless set_64bit() usage
  x86/mm/pae: Be consistent with pXXp_get_and_clear()
  x86/mm/pae: Use WRITE_ONCE()
  x86/mm/pae: Don't (ab)use atomic64
  mm/gup: Fix the lockless PMD access
  ...
2022-12-17 14:06:53 -06:00
Peter Zijlstra
9ee850acd2 x86_64: Remove pointless set_64bit() usage
The use of set_64bit() in X86_64 only code is pretty pointless, seeing
how it's a direct assignment. Remove all this nonsense.

[nathanchance: unbreak irte]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221022114425.168036718%40infradead.org
2022-12-15 10:37:27 -08:00
Thomas Gleixner
810531a1af iommu/vt-d: Enable PCI/IMS
PCI/IMS works like PCI/MSI-X in the remapping. Just add the feature flag,
but only when on real hardware.

Virtualized IOMMUs need additional support, e.g. for PASID.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232327.081482253@linutronix.de
2022-12-05 22:22:35 +01:00
Thomas Gleixner
9a945234ab iommu/vt-d: Switch to MSI parent domains
Remove the global PCI/MSI irqdomain implementation and provide the required
MSI parent ops so the PCI/MSI code can detect the new parent and setup per
device domains.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.151226317@linutronix.de
2022-12-05 22:22:33 +01:00
Thomas Gleixner
b6d5fc3a52 x86/apic/vector: Provide MSI parent domain
Enable MSI parent domain support in the x86 vector domain and fixup the
checks in the iommu implementations to check whether device::msi::domain is
the default MSI parent domain. That keeps the existing logic to protect
e.g. devices behind VMD working.

The interrupt remap PCI/MSI code still works because the underlying vector
domain still provides the same functionality.

None of the other x86 PCI/MSI, e.g. XEN and HyperV, implementations are
affected either. They still work the same way both at the low level and the
PCI/MSI implementations they provide.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.034672592@linutronix.de
2022-12-05 22:22:33 +01:00
Thomas Gleixner
d474d92d70 x86/apic: Remove X86_IRQ_ALLOC_CONTIGUOUS_VECTORS
Now that the PCI/MSI core code does early checking for multi-MSI support
X86_IRQ_ALLOC_CONTIGUOUS_VECTORS is not required anymore.

Remove the flag and rely on MSI_FLAG_MULTI_PCI_MSI.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20221111122015.865042356@linutronix.de
2022-11-17 15:15:22 +01:00
Thomas Gleixner
527f378c42 iommu/vt-d: Remove bogus check for multi MSI-X
PCI/Multi-MSI is MSI specific and not supported for MSI-X.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Link: https://lore.kernel.org/r/20221111122013.713848846@linutronix.de
2022-11-17 15:15:18 +01:00
Lu Baolu
eb5b20114b iommu/vt-d: Avoid unnecessary global IRTE cache invalidation
Some VT-d hardware implementations invalidate all interrupt remapping
hardware translation caches as part of SIRTP flow. The VT-d spec adds
a ESIRTPS (Enhanced Set Interrupt Remap Table Pointer Support, section
11.4.2 in VT-d spec) capability bit to indicate this.

The spec also states in 11.4.4 that hardware also performs global
invalidation on all interrupt remapping caches as part of Interrupt
Remapping Disable operation if ESIRTPS capability bit is set.

This checks the ESIRTPS capability bit and skip software global cache
invalidation if it's set.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20220921065741.3572495-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-26 15:52:26 +02:00
Lu Baolu
2585a2790e iommu/vt-d: Move include/linux/intel-iommu.h under iommu
This header file is private to the Intel IOMMU driver. Move it to the
driver folder.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
Link: https://lore.kernel.org/r/20220514014322.2927339-8-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-07-15 10:21:31 +02:00
Guoqing Jiang
99e675d473 iommu/vt-d: Fix potential memory leak in intel_setup_irq_remapping()
After commit e3beca48a45b ("irqdomain/treewide: Keep firmware node
unconditionally allocated"). For tear down scenario, fn is only freed
after fail to allocate ir_domain, though it also should be freed in case
dmar_enable_qi returns error.

Besides free fn, irq_domain and ir_msi_domain need to be removed as well
if intel_setup_irq_remapping fails to enable queued invalidation.

Improve the rewinding path by add out_free_ir_domain and out_free_fwnode
lables per Baolu's suggestion.

Fixes: e3beca48a45b ("irqdomain/treewide: Keep firmware node unconditionally allocated")
Suggested-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Link: https://lore.kernel.org/r/20220119063640.16864-1-guoqing.jiang@linux.dev
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220128031002.2219155-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-01-31 16:53:09 +01:00
Linus Torvalds
57151b502c pci-v5.13-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmCRp48UHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vwsVRAAsIYueNKzZczpkeQwHigYzf4HLdKm
 yyT2c/Zlj9REAUOe7ApkowVAJWiMGDJP0J361KIluAGvAxnkMP1V6WlVdByorYd0
 CrXc/UhD//cs+3QDo4SmJRHyL8q5QQTDa8Z/8seVJUYTR/t5OhSpMOuEJPhpeQ1s
 nqUk0yWNJRoN6wn6T/7KqgYEvPhARXo9epuWy5MNPZ5f8E7SRi/QG/6hP8/YOLpK
 A+8beIOX5LAvUJaXxEovwv5UQnSUkeZTGDyRietQYE6xXNeHPKCvZ7vDjjSE7NOW
 mIodD6JcG3n/riYV3sMA5PKDZgsPI3P/qJU6Y6vWBBYOaO/kQX/c7CZ+M2bcZay4
 mh1dW0vOqoTy/pAVwQB2aq08Rrg2SAskpNdeyzduXllmuTyuwCMPXzG4RKmbQ8I1
 qMFb8qOyNulRAWcTKgSMKByEQYASQsFA5yShtaba6h0+vqrseuP6hchBKKOEan8F
 9THTI3ZflKwRvGjkI0MDbp0z0+wPYmNhrcZDpAJ3bEltw58E8TL/9aBtuhajmo8+
 wJ64mZclFuMmSyhsfkAXOvjeKXMlEBaw7vinZGbcACmv4ZGI0MV7r4vVYQbQltcy
 myzB6xJxcWB8N07UpKpUbsGMb9JjTUPlaT36eZNvUZQDntrE1ljt8RSq3nphDrcD
 KmBRU8ru74I2RE0=
 =WvTD
 -----END PGP SIGNATURE-----

Merge tag 'pci-v5.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:
   - Release OF node when pci_scan_device() fails (Dmitry Baryshkov)
   - Add pci_disable_parity() (Bjorn Helgaas)
   - Disable Mellanox Tavor parity reporting (Heiner Kallweit)
   - Disable N2100 r8169 parity reporting (Heiner Kallweit)
   - Fix RCiEP device to RCEC association (Qiuxu Zhuo)
   - Convert sysfs "config", "rom", "reset", "label", "index",
     "acpi_index" to static attributes to help fix races in device
     enumeration (Krzysztof Wilczyński)
   - Convert sysfs "vpd" to static attribute (Heiner Kallweit, Krzysztof
     Wilczyński)
   - Use sysfs_emit() in "show" functions (Krzysztof Wilczyński)
   - Remove unused alloc_pci_root_info() return value (Krzysztof
     Wilczyński)

  PCI device hotplug:
   - Fix acpiphp reference count leak (Feilong Lin)

  Power management:
   - Fix acpi_pci_set_power_state() debug message (Rafael J. Wysocki)
   - Fix runtime PM imbalance (Dinghao Liu)

  Virtualization:
   - Increase delay after FLR to work around Intel DC P4510 NVMe erratum
     (Raphael Norwitz)

  MSI:
   - Convert rcar, tegra, xilinx to MSI domains (Marc Zyngier)
   - For rcar, xilinx, use controller address as MSI doorbell (Marc
     Zyngier)
   - Remove unused hv msi_controller struct (Marc Zyngier)
   - Remove unused PCI core msi_controller support (Marc Zyngier)
   - Remove struct msi_controller altogether (Marc Zyngier)
   - Remove unused default_teardown_msi_irqs() (Marc Zyngier)
   - Let host bridges declare their reliance on MSI domains (Marc
     Zyngier)
   - Make pci_host_common_probe() declare its reliance on MSI domains
     (Marc Zyngier)
   - Advertise mediatek lack of built-in MSI handling (Thomas Gleixner)
   - Document ways of ending up with NO_MSI (Marc Zyngier)
   - Refactor HT advertising of NO_MSI flag (Marc Zyngier)

  VPD:
   - Remove obsolete Broadcom NIC VPD length-limiting quirk (Heiner
     Kallweit)
   - Remove sysfs VPD size checking dead code (Heiner Kallweit)
   - Convert VPF sysfs file to static attribute (Heiner Kallweit)
   - Remove unnecessary pci_set_vpd_size() (Heiner Kallweit)
   - Tone down "missing VPD" message (Heiner Kallweit)

  Endpoint framework:
   - Fix NULL pointer dereference when epc_features not implemented
     (Shradha Todi)
   - Add missing destroy_workqueue() in endpoint test (Yang Yingliang)

  Amazon Annapurna Labs PCIe controller driver:
   - Fix compile testing without CONFIG_PCI_ECAM (Arnd Bergmann)
   - Fix "no symbols" warnings when compile testing with
     CONFIG_TRIM_UNUSED_KSYMS (Arnd Bergmann)

  APM X-Gene PCIe controller driver:
   - Fix cfg resource mapping regression (Dejin Zheng)

  Broadcom iProc PCIe controller driver:
   - Return zero for success of iproc_msi_irq_domain_alloc() (Pali
     Rohár)

  Broadcom STB PCIe controller driver:
   - Add reset_control_rearm() stub for !CONFIG_RESET_CONTROLLER (Jim
     Quinlan)
   - Fix use of BCM7216 reset controller (Jim Quinlan)
   - Use reset/rearm for Broadcom STB pulse reset instead of
     deassert/assert (Jim Quinlan)
   - Fix brcm_pcie_probe() error return for unsupported revision (Wei
     Yongjun)

  Cavium ThunderX PCIe controller driver:
   - Fix compile testing (Arnd Bergmann)
   - Fix "no symbols" warnings when compile testing with
     CONFIG_TRIM_UNUSED_KSYMS (Arnd Bergmann)

  Freescale Layerscape PCIe controller driver:
   - Fix ls_pcie_ep_probe() syntax error (comma for semicolon)
     (Krzysztof Wilczyński)
   - Remove layerscape-gen4 dependencies on OF and ARM64, add dependency
     on ARCH_LAYERSCAPE (Geert Uytterhoeven)

  HiSilicon HIP PCIe controller driver:
   - Remove obsolete HiSilicon PCIe DT description (Dongdong Liu)

  Intel Gateway PCIe controller driver:
   - Remove unused pcie_app_rd() (Jiapeng Chong)

  Intel VMD host bridge driver:
   - Program IRTE with Requester ID of VMD endpoint, not child device
     (Jon Derrick)
   - Disable VMD MSI-X remapping when possible so children can use more
     MSI-X vectors (Jon Derrick)

  MediaTek PCIe controller driver:
   - Configure FC and FTS for functions other than 0 (Ryder Lee)
   - Add YAML schema for MediaTek (Jianjun Wang)
   - Export pci_pio_to_address() for module use (Jianjun Wang)
   - Add MediaTek MT8192 PCIe controller driver (Jianjun Wang)
   - Add MediaTek MT8192 INTx support (Jianjun Wang)
   - Add MediaTek MT8192 MSI support (Jianjun Wang)
   - Add MediaTek MT8192 system power management support (Jianjun Wang)
   - Add missing MODULE_DEVICE_TABLE (Qiheng Lin)

  Microchip PolarFlare PCIe controller driver:
   - Make several symbols static (Wei Yongjun)

  NVIDIA Tegra PCIe controller driver:
   - Add MCFG quirks for Tegra194 ECAM errata (Vidya Sagar)
   - Make several symbols const (Rikard Falkeborn)
   - Fix Kconfig host/endpoint typo (Wesley Sheng)

  SiFive FU740 PCIe controller driver:
   - Add pcie_aux clock to prci driver (Greentime Hu)
   - Use reset-simple in prci driver for PCIe (Greentime Hu)
   - Add SiFive FU740 PCIe host controller driver and DT binding (Paul
     Walmsley, Greentime Hu)

  Synopsys DesignWare PCIe controller driver:
   - Move MSI Receiver init to dw_pcie_host_init() so it is
     re-initialized along with the RC in resume (Jisheng Zhang)
   - Move iATU detection earlier to fix regression (Hou Zhiqiang)

  TI J721E PCIe driver:
   - Add DT binding and TI j721e support for refclk to PCIe connector
     (Kishon Vijay Abraham I)
   - Add host mode and endpoint mode DT bindings for TI AM64 SoC (Kishon
     Vijay Abraham I)

  TI Keystone PCIe controller driver:
   - Use generic config accessors for TI AM65x (K3) to fix regression
     (Kishon Vijay Abraham I)

  Xilinx NWL PCIe controller driver:
   - Add support for coherent PCIe DMA traffic using CCI (Bharat Kumar
     Gogada)
   - Add optional "dma-coherent" DT property (Bharat Kumar Gogada)

  Miscellaneous:
   - Fix kernel-doc warnings (Krzysztof Wilczyński)
   - Remove unused MicroGate SyncLink device IDs (Jiri Slaby)
   - Remove redundant dev_err() for devm_ioremap_resource() failure
     (Chen Hui)
   - Remove redundant initialization (Colin Ian King)
   - Drop redundant dev_err() for platform_get_irq() errors (Krzysztof
     Wilczyński)"

* tag 'pci-v5.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (98 commits)
  riscv: dts: Add PCIe support for the SiFive FU740-C000 SoC
  PCI: fu740: Add SiFive FU740 PCIe host controller driver
  dt-bindings: PCI: Add SiFive FU740 PCIe host controller
  MAINTAINERS: Add maintainers for SiFive FU740 PCIe driver
  clk: sifive: Use reset-simple in prci driver for PCIe driver
  clk: sifive: Add pcie_aux clock in prci driver for PCIe driver
  PCI: brcmstb: Use reset/rearm instead of deassert/assert
  ata: ahci_brcm: Fix use of BCM7216 reset controller
  reset: add missing empty function reset_control_rearm()
  PCI: Allow VPD access for QLogic ISP2722
  PCI/VPD: Add helper pci_get_func0_dev()
  PCI/VPD: Remove pci_vpd_find_tag() SRDT handling
  PCI/VPD: Remove pci_vpd_find_tag() 'offset' argument
  PCI/VPD: Change pci_vpd_init() return type to void
  PCI/VPD: Make missing VPD message less alarming
  PCI/VPD: Remove pci_set_vpd_size()
  x86/PCI: Remove unused alloc_pci_root_info() return value
  MAINTAINERS: Add Jianjun Wang as MediaTek PCI co-maintainer
  PCI: mediatek-gen3: Add system PM support
  PCI: mediatek-gen3: Add MSI support
  ...
2021-05-05 13:24:11 -07:00
Christophe JAILLET
745610c4a3 iommu/vt-d: Fix an error handling path in 'intel_prepare_irq_remapping()'
If 'intel_cap_audit()' fails, we should return directly, as already done in
the surrounding error handling path.

Fixes: ad3d19029979 ("iommu/vt-d: Audit IOMMU Capabilities and add helper functions")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/98d531caabe66012b4fffc7813fd4b9470afd517.1618124777.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-04-15 15:44:38 +02:00
Jon Derrick
9b4a824b88 iommu/vt-d: Use Real PCI DMA device for IRTE
VMD retransmits child device MSI-X with the VMD endpoint's requester-id.
In order to support direct interrupt remapping of VMD child devices,
ensure that the IRTE is programmed with the VMD endpoint's requester-id
using pci_real_dma_dev().

Link: https://lore.kernel.org/r/20210210161315.316097-2-jonathan.derrick@intel.com
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
2021-03-22 14:08:20 +00:00
Kyung Min Park
ad3d190299 iommu/vt-d: Audit IOMMU Capabilities and add helper functions
Audit IOMMU Capability/Extended Capability and check if the IOMMUs have
the consistent value for features. Report out or scale to the lowest
supported when IOMMU features have incompatibility among IOMMUs.

Report out features when below features are mismatched:
  - First Level 5 Level Paging Support (FL5LP)
  - First Level 1 GByte Page Support (FL1GP)
  - Read Draining (DRD)
  - Write Draining (DWD)
  - Page Selective Invalidation (PSI)
  - Zero Length Read (ZLR)
  - Caching Mode (CM)
  - Protected High/Low-Memory Region (PHMR/PLMR)
  - Required Write-Buffer Flushing (RWBF)
  - Advanced Fault Logging (AFL)
  - RID-PASID Support (RPS)
  - Scalable Mode Page Walk Coherency (SMPWC)
  - First Level Translation Support (FLTS)
  - Second Level Translation Support (SLTS)
  - No Write Flag Support (NWFS)
  - Second Level Accessed/Dirty Support (SLADS)
  - Virtual Command Support (VCS)
  - Scalable Mode Translation Support (SMTS)
  - Device TLB Invalidation Throttle (DIT)
  - Page Drain Support (PDS)
  - Process Address Space ID Support (PASID)
  - Extended Accessed Flag Support (EAFS)
  - Supervisor Request Support (SRS)
  - Execute Request Support (ERS)
  - Page Request Support (PRS)
  - Nested Translation Support (NEST)
  - Snoop Control (SC)
  - Pass Through (PT)
  - Device TLB Support (DT)
  - Queued Invalidation (QI)
  - Page walk Coherency (C)

Set capability to the lowest supported when below features are mismatched:
  - Maximum Address Mask Value (MAMV)
  - Number of Fault Recording Registers (NFR)
  - Second Level Large Page Support (SLLPS)
  - Fault Recording Offset (FRO)
  - Maximum Guest Address Width (MGAW)
  - Supported Adjusted Guest Address Width (SAGAW)
  - Number of Domains supported (NDOMS)
  - Pasid Size Supported (PSS)
  - Maximum Handle Mask Value (MHMV)
  - IOTLB Register Offset (IRO)

Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210130184452.31711-1-kyung.min.park@intel.com
Link: https://lore.kernel.org/r/20210204014401.2846425-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-02-04 14:42:00 +01:00
Dinghao Liu
ff2b46d7cf iommu/intel: Fix memleak in intel_irq_remapping_alloc
When irq_domain_get_irq_data() or irqd_cfg() fails
at i == 0, data allocated by kzalloc() has not been
freed before returning, which leads to memleak.

Fixes: b106ee63abcc ("irq_remapping/vt-d: Enhance Intel IR driver to support hierarchical irqdomains")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210105051837.32118-1-dinghao.liu@zju.edu.cn
Signed-off-by: Will Deacon <will@kernel.org>
2021-01-05 19:12:06 +00:00
David Woodhouse
79eb3581bc iommu/vt-d: Simplify intel_irq_remapping_select()
Now that the old get_irq_domain() method has gone, consolidate on just the
map_XXX_to_iommu() functions.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-31-dwmw2@infradead.org
2020-10-28 20:26:28 +01:00
David Woodhouse
ed381fca47 x86: Kill all traces of irq_remapping_get_irq_domain()
All users are converted to use the fwspec based parent domain lookup.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-30-dwmw2@infradead.org
2020-10-28 20:26:28 +01:00
David Woodhouse
a87fb465ff iommu/vt-d: Implement select() method on remapping irqdomain
Preparatory for removing irq_remapping_get_irq_domain()

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-26-dwmw2@infradead.org
2020-10-28 20:26:27 +01:00
David Woodhouse
5d5a971338 x86/ioapic: Generate RTE directly from parent irqchip's MSI message
The I/O-APIC generates an MSI cycle with address/data bits taken from its
Redirection Table Entry in some combination which used to make sense, but
now is just a bunch of bits which get passed through in some seemingly
arbitrary order.

Instead of making IRQ remapping drivers directly frob the I/OA-PIC RTE, let
them just do their job and generate an MSI message. The bit swizzling to
turn that MSI message into the I/O-APIC's RTE is the same in all cases,
since it's a function of the I/O-APIC hardware. The IRQ remappers have no
real need to get involved with that.

The only slight caveat is that the I/OAPIC is interpreting some of those
fields too, and it does want the 'vector' field to be unique to make EOI
work. The AMD IOMMU happens to put its IRTE index in the bits that the
I/O-APIC thinks are the vector field, and accommodates this requirement by
reserving the first 32 indices for the I/O-APIC.  The Intel IOMMU doesn't
actually use the bits that the I/O-APIC thinks are the vector field, so it
fills in the 'pin' value there instead.

[ tglx: Replaced the unreadably macro maze with the cleaned up RTE/msi_msg
  	bitfields and added commentry to explain the mapping magic ]

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-22-dwmw2@infradead.org
2020-10-28 20:26:27 +01:00
Thomas Gleixner
341b4a7211 x86/ioapic: Cleanup IO/APIC route entry structs
Having two seperate structs for the I/O-APIC RTE entries (non-remapped and
DMAR remapped) requires type casts and makes it hard to map.

Combine them in IO_APIC_routing_entry by defining a union of two 64bit
bitfields. Use naming which reflects which bits are shared and which bits
are actually different for the operating modes.

[dwmw2: Fix it up and finish the job, pulling the 32-bit w1,w2 words for
        register access into the same union and eliminating a few more
        places where bits were accessed through masks and shifts.]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-21-dwmw2@infradead.org
2020-10-28 20:26:27 +01:00
Thomas Gleixner
a27dca645d x86/io_apic: Cleanup trigger/polarity helpers
'trigger' and 'polarity' are used throughout the I/O-APIC code for handling
the trigger type (edge/level) and the active low/high configuration. While
there are defines for initializing these variables and struct members, they
are not used consequently and the meaning of 'trigger' and 'polarity' is
opaque and confusing at best.

Rename them to 'is_level' and 'active_low' and make them boolean in various
structs so it's entirely clear what the meaning is.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-20-dwmw2@infradead.org
2020-10-28 20:26:26 +01:00
Thomas Gleixner
5c0d0e2cc6 iommu/intel: Use msi_msg shadow structs
Use the bitfields in the x86 shadow struct to compose the MSI message.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-14-dwmw2@infradead.org
2020-10-28 20:26:25 +01:00
Thomas Gleixner
8c44963b60 x86/apic: Cleanup destination mode
apic::irq_dest_mode is actually a boolean, but defined as u32 and named in
a way which does not explain what it means.

Make it a boolean and rename it to 'dest_mode_logical'

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-9-dwmw2@infradead.org
2020-10-28 20:26:25 +01:00
Thomas Gleixner
721612994f x86/apic: Cleanup delivery mode defines
The enum ioapic_irq_destination_types and the enumerated constants starting
with 'dest_' are gross misnomers because they describe the delivery mode.

Rename then enum and the constants so they actually make sense.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201024213535.443185-6-dwmw2@infradead.org
2020-10-28 20:26:24 +01:00
Thomas Gleixner
9f0ffb4bb3 iommu/vt-d: Remove domain search for PCI/MSI[X]
Now that the domain can be retrieved through device::msi_domain the domain
search for PCI_MSI[X] is not longer required. Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112334.305699301@linutronix.de
2020-09-16 16:52:38 +02:00
Thomas Gleixner
85a8dfc57a iommm/vt-d: Store irq domain in struct device
As a first step to make X86 utilize the direct MSI irq domain operations
store the irq domain pointer in the device struct when a device is probed.

This is done from dmar_pci_bus_add_dev() because it has to work even when
DMA remapping is disabled. It only overrides the irqdomain of devices which
are handled by a regular PCI/MSI irq domain which protects PCI devices
behind special busses like VMD which have their own irq domain.

No functional change. It just avoids the redirection through
arch_*_msi_irqs() and allows the PCI/MSI core to directly invoke the irq
domain alloc/free functions instead of having to look up the irq domain for
every single MSI interupt.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112333.714566121@linutronix.de
2020-09-16 16:52:37 +02:00
Thomas Gleixner
3b9c1d377d x86/msi: Consolidate MSI allocation
Convert the interrupt remap drivers to retrieve the pci device from the msi
descriptor and use info::hwirq.

This is the first step to prepare x86 for using the generic MSI domain ops.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112332.466405395@linutronix.de
2020-09-16 16:52:35 +02:00
Thomas Gleixner
33a65ba470 x86_ioapic_Consolidate_IOAPIC_allocation
Move the IOAPIC specific fields into their own struct and reuse the common
devid. Get rid of the #ifdeffery as it does not matter at all whether the
alloc info is a couple of bytes longer or not.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112332.054367732@linutronix.de
2020-09-16 16:52:32 +02:00
Thomas Gleixner
2bf1e7bced x86/msi: Consolidate HPET allocation
None of the magic HPET fields are required in any way.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112331.943993771@linutronix.de
2020-09-16 16:52:31 +02:00
Thomas Gleixner
6b6256e616 iommu/irq_remapping: Consolidate irq domain lookup
Now that the iommu implementations handle the X86_*_GET_PARENT_DOMAIN
types, consolidate the two getter functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112331.741909337@linutronix.de
2020-09-16 16:52:30 +02:00
Thomas Gleixner
60e5a9397c iommu/vt-d: Consolidate irq domain getter
The irq domain request mode is now indicated in irq_alloc_info::type.

Consolidate the two getter functions into one.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112331.530546013@linutronix.de
2020-09-16 16:52:29 +02:00
Thomas Gleixner
b4c364da32 x86/irq: Add allocation type for parent domain retrieval
irq_remapping_ir_irq_domain() is used to retrieve the remapping parent
domain for an allocation type. irq_remapping_irq_domain() is for retrieving
the actual device domain for allocating interrupts for a device.

The two functions are similar and can be unified by using explicit modes
for parent irq domain retrieval.

Add X86_IRQ_ALLOC_TYPE_IOAPIC/HPET_GET_PARENT and use it in the iommu
implementations. Drop the parent domain retrieval for PCI_MSI/X as that is
unused.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112331.436350257@linutronix.de
2020-09-16 16:52:29 +02:00
Thomas Gleixner
801b5e4c4e x86_irq_Rename_X86_IRQ_ALLOC_TYPE_MSI_to_reflect_PCI_dependency
No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112331.343103175@linutronix.de
2020-09-16 16:52:29 +02:00
Lu Baolu
6e4e9ec650 iommu/vt-d: Serialize IOMMU GCMD register modifications
The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG General
Description) that:

If multiple control fields in this register need to be modified, software
must serialize the modifications through multiple writes to this register.

However, in irq_remapping.c, modifications of IRE and CFI are done in one
write. We need to do two separate writes with STS checking after each. It
also checks the status register before writing command register to avoid
unnecessary register write.

Fixes: af8d102f999a4 ("x86/intel/irq_remapping: Clean up x2apic opt-out security warning mess")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Link: https://lore.kernel.org/r/20200828000615.8281-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04 11:39:21 +02:00
Linus Torvalds
97d052ea3f A set of locking fixes and updates:
- Untangle the header spaghetti which causes build failures in various
     situations caused by the lockdep additions to seqcount to validate that
     the write side critical sections are non-preemptible.
 
   - The seqcount associated lock debug addons which were blocked by the
     above fallout.
 
     seqcount writers contrary to seqlock writers must be externally
     serialized, which usually happens via locking - except for strict per
     CPU seqcounts. As the lock is not part of the seqcount, lockdep cannot
     validate that the lock is held.
 
     This new debug mechanism adds the concept of associated locks.
     sequence count has now lock type variants and corresponding
     initializers which take a pointer to the associated lock used for
     writer serialization. If lockdep is enabled the pointer is stored and
     write_seqcount_begin() has a lockdep assertion to validate that the
     lock is held.
 
     Aside of the type and the initializer no other code changes are
     required at the seqcount usage sites. The rest of the seqcount API is
     unchanged and determines the type at compile time with the help of
     _Generic which is possible now that the minimal GCC version has been
     moved up.
 
     Adding this lockdep coverage unearthed a handful of seqcount bugs which
     have been addressed already independent of this.
 
     While generaly useful this comes with a Trojan Horse twist: On RT
     kernels the write side critical section can become preemtible if the
     writers are serialized by an associated lock, which leads to the well
     known reader preempts writer livelock. RT prevents this by storing the
     associated lock pointer independent of lockdep in the seqcount and
     changing the reader side to block on the lock when a reader detects
     that a writer is in the write side critical section.
 
  - Conversion of seqcount usage sites to associated types and initializers.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8xmPYTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoTuQEACyzQCjU8PgehPp9oMqWzaX2fcVyuZO
 QU2yw6gmz2oTz3ZHUNwdW8UnzGh2OWosK3kDruoD9FtSS51lER1/ISfSPCGfyqxC
 KTjOcB1Kvxwq/3LcCx7Zi3ZxWApat74qs3EhYhKtEiQ2Y9xv9rLq8VV1UWAwyxq0
 eHpjlIJ6b6rbt+ARslaB7drnccOsdK+W/roNj4kfyt+gezjBfojGRdMGQNMFcpnv
 shuTC+vYurAVIiVA/0IuizgHfwZiXOtVpjVoEWaxg6bBH6HNuYMYzdSa/YrlDkZs
 n/aBI/Xkvx+Eacu8b1Zwmbzs5EnikUK/2dMqbzXKUZK61eV4hX5c2xrnr1yGWKTs
 F/juh69Squ7X6VZyKVgJ9RIccVueqwR2EprXWgH3+RMice5kjnXH4zURp0GHALxa
 DFPfB6fawcH3Ps87kcRFvjgm6FBo0hJ1AxmsW1dY4ACFB9azFa2euW+AARDzHOy2
 VRsUdhL9CGwtPjXcZ/9Rhej6fZLGBXKr8uq5QiMuvttp4b6+j9FEfBgD4S6h8csl
 AT2c2I9LcbWqyUM9P4S7zY/YgOZw88vHRuDH7tEBdIeoiHfrbSBU7EQ9jlAKq/59
 f+Htu2Io281c005g7DEeuCYvpzSYnJnAitj5Lmp/kzk2Wn3utY1uIAVszqwf95Ul
 81ppn2KlvzUK8g==
 =7Gj+
 -----END PGP SIGNATURE-----

Merge tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Thomas Gleixner:
 "A set of locking fixes and updates:

   - Untangle the header spaghetti which causes build failures in
     various situations caused by the lockdep additions to seqcount to
     validate that the write side critical sections are non-preemptible.

   - The seqcount associated lock debug addons which were blocked by the
     above fallout.

     seqcount writers contrary to seqlock writers must be externally
     serialized, which usually happens via locking - except for strict
     per CPU seqcounts. As the lock is not part of the seqcount, lockdep
     cannot validate that the lock is held.

     This new debug mechanism adds the concept of associated locks.
     sequence count has now lock type variants and corresponding
     initializers which take a pointer to the associated lock used for
     writer serialization. If lockdep is enabled the pointer is stored
     and write_seqcount_begin() has a lockdep assertion to validate that
     the lock is held.

     Aside of the type and the initializer no other code changes are
     required at the seqcount usage sites. The rest of the seqcount API
     is unchanged and determines the type at compile time with the help
     of _Generic which is possible now that the minimal GCC version has
     been moved up.

     Adding this lockdep coverage unearthed a handful of seqcount bugs
     which have been addressed already independent of this.

     While generally useful this comes with a Trojan Horse twist: On RT
     kernels the write side critical section can become preemtible if
     the writers are serialized by an associated lock, which leads to
     the well known reader preempts writer livelock. RT prevents this by
     storing the associated lock pointer independent of lockdep in the
     seqcount and changing the reader side to block on the lock when a
     reader detects that a writer is in the write side critical section.

   - Conversion of seqcount usage sites to associated types and
     initializers"

* tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  locking/seqlock, headers: Untangle the spaghetti monster
  locking, arch/ia64: Reduce <asm/smp.h> header dependencies by moving XTP bits into the new <asm/xtp.h> header
  x86/headers: Remove APIC headers from <asm/smp.h>
  seqcount: More consistent seqprop names
  seqcount: Compress SEQCNT_LOCKNAME_ZERO()
  seqlock: Fold seqcount_LOCKNAME_init() definition
  seqlock: Fold seqcount_LOCKNAME_t definition
  seqlock: s/__SEQ_LOCKDEP/__SEQ_LOCK/g
  hrtimer: Use sequence counter with associated raw spinlock
  kvm/eventfd: Use sequence counter with associated spinlock
  userfaultfd: Use sequence counter with associated spinlock
  NFSv4: Use sequence counter with associated spinlock
  iocost: Use sequence counter with associated spinlock
  raid5: Use sequence counter with associated spinlock
  vfs: Use sequence counter with associated spinlock
  timekeeping: Use sequence counter with associated raw spinlock
  xfrm: policy: Use sequence counters with associated lock
  netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
  netfilter: conntrack: Use sequence counter with associated spinlock
  sched: tasks: Use sequence counter with associated spinlock
  ...
2020-08-10 19:07:44 -07:00
Ingo Molnar
13c01139b1 x86/headers: Remove APIC headers from <asm/smp.h>
The APIC headers are relatively complex and bring in additional
header dependencies - while smp.h is a relatively simple header
included from high level headers.

Remove the dependency and add in the missing #include's in .c
files where they gained it indirectly before.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-08-06 16:13:09 +02:00
Jon Derrick
ec0160891e irqdomain/treewide: Free firmware node after domain removal
Commit 711419e504eb ("irqdomain: Add the missing assignment of
domain->fwnode for named fwnode") unintentionally caused a dangling pointer
page fault issue on firmware nodes that were freed after IRQ domain
allocation. Commit e3beca48a45b fixed that dangling pointer issue by only
freeing the firmware node after an IRQ domain allocation failure. That fix
no longer frees the firmware node immediately, but leaves the firmware node
allocated after the domain is removed.

The firmware node must be kept around through irq_domain_remove, but should be
freed it afterwards.

Add the missing free operations after domain removal where where appropriate.

Fixes: e3beca48a45b ("irqdomain/treewide: Keep firmware node unconditionally allocated")
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>	# drivers/pci
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1595363169-7157-1-git-send-email-jonathan.derrick@intel.com
2020-07-23 00:08:52 +02:00
Thomas Gleixner
e3beca48a4 irqdomain/treewide: Keep firmware node unconditionally allocated
Quite some non OF/ACPI users of irqdomains allocate firmware nodes of type
IRQCHIP_FWNODE_NAMED or IRQCHIP_FWNODE_NAMED_ID and free them right after
creating the irqdomain. The only purpose of these FW nodes is to convey
name information. When this was introduced the core code did not store the
pointer to the node in the irqdomain. A recent change stored the firmware
node pointer in irqdomain for other reasons and missed to notice that the
usage sites which do the alloc_fwnode/create_domain/free_fwnode sequence
are broken by this. Storing a dangling pointer is dangerous itself, but in
case that the domain is destroyed later on this leads to a double free.

Remove the freeing of the firmware node after creating the irqdomain from
all affected call sites to cure this.

Fixes: 711419e504eb ("irqdomain: Add the missing assignment of domain->fwnode for named fwnode")
Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/873661qakd.fsf@nanos.tec.linutronix.de
2020-07-14 17:44:42 +02:00
Joerg Roedel
672cf6df9b iommu/vt-d: Move Intel IOMMU driver into subdirectory
Move all files related to the Intel IOMMU driver into its own
subdirectory.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200609130303.26974-3-joro@8bytes.org
2020-06-10 17:46:43 +02:00