linux

iv/linux

Author	SHA1	Message	Date
Lu Baolu	0f5432a9b8	iommu/vt-d: Omit devTLB invalidation requests when TES=0 The latest VT-d spec indicates that when remapping hardware is disabled (TES=0 in Global Status Register), upstream ATS Invalidation Completion requests are treated as UR (Unsupported Request). Consequently, the spec recommends in section 4.3 Handling of Device-TLB Invalidations that software refrain from submitting any Device-TLB invalidation requests when address remapping hardware is disabled. Verify address remapping hardware is enabled prior to submitting Device- TLB invalidation requests. Fixes: `792fb43ce2` ("iommu/vt-d: Enable Intel IOMMU scalable mode by default") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20231114011036.70142-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-11-27 11:07:51 +01:00
Joerg Roedel	e51b419839	Merge branches 'iommu/fixes', 'arm/allwinner', 'arm/exynos', 'arm/mediatek', 'arm/omap', 'arm/renesas', 'arm/rockchip', 'arm/smmu', 'ppc/pamu', 'unisoc', 'x86/vt-d', 'x86/amd', 'core' and 'platform-remove_new' into next	2023-04-14 13:45:50 +02:00
Tina Zhang	e60d63e32d	iommu/vt-d: Remove BUG_ON in dmar_insert_dev_scope() The dmar_insert_dev_scope() could fail if any unexpected condition is encountered. However, in this situation, the kernel should attempt recovery and proceed with execution. Remove BUG_ON with WARN_ON, so that kernel can avoid being crashed when an unexpected condition occurs. Signed-off-by: Tina Zhang <tina.zhang@intel.com> Link: https://lore.kernel.org/r/20230406065944.2773296-8-tina.zhang@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-04-13 12:05:53 +02:00
Tina Zhang	ff45ab9646	iommu/vt-d: Remove a useless BUG_ON(dev->is_virtfn) When dmar_alloc_pci_notify_info() is being invoked, the invoker has ensured the dev->is_virtfn is false. So, remove the useless BUG_ON in dmar_alloc_pci_notify_info(). Signed-off-by: Tina Zhang <tina.zhang@intel.com> Link: https://lore.kernel.org/r/20230406065944.2773296-7-tina.zhang@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-04-13 12:05:53 +02:00
Tina Zhang	b31064f881	iommu/vt-d: Make size of operands same in bitwise operations This addresses the following issue reported by klocwork tool: - operands of different size in bitwise operations Suggested-by: Yongwei Ma <yongwei.ma@intel.com> Signed-off-by: Tina Zhang <tina.zhang@intel.com> Link: https://lore.kernel.org/r/20230406065944.2773296-2-tina.zhang@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-04-13 12:05:50 +02:00
Lu Baolu	bfd3c6b9fa	iommu/vt-d: Allow zero SAGAW if second-stage not supported The VT-d spec states (in section 11.4.2) that hardware implementations reporting second-stage translation support (SSTS) field as Clear also report the SAGAW field as 0. Fix an inappropriate check in alloc_iommu(). Fixes: `792fb43ce2` ("iommu/vt-d: Enable Intel IOMMU scalable mode by default") Suggested-by: Raghunathan Srinivasan <raghunathan.srinivasan@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20230318024824.124542-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20230329134721.469447-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-03-31 10:06:15 +02:00
Jacob Pan	fffaed1e24	iommu/ioasid: Rename INVALID_IOASID INVALID_IOASID and IOMMU_PASID_INVALID are duplicated. Rename INVALID_IOASID and consolidate since we are moving away from IOASID infrastructure. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Link: https://lore.kernel.org/r/20230322200803.869130-7-jacob.jun.pan@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-03-31 10:03:27 +02:00
Jacob Pan	760f41d182	iommu/vt-d: Remove virtual command interface Virtual command interface was introduced to allow using host PASIDs inside VMs. It is unused and abandoned due to architectural change. With this patch, we can safely remove this feature and the related helpers. Link: https://lore.kernel.org/r/20230210230206.3160144-2-jacob.jun.pan@linux.intel.com Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20230322200803.869130-2-jacob.jun.pan@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-03-31 10:03:21 +02:00
Kan Liang	d8a7c0cf05	iommu/vt-d: Enable IOMMU perfmon support Register and enable an IOMMU perfmon for each active IOMMU device. The failure of IOMMU perfmon registration doesn't impact other functionalities of an IOMMU device. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20230128200428.1459118-8-kan.liang@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-02-03 11:06:09 +01:00
Kan Liang	4a0d426565	iommu/vt-d: Add IOMMU perfmon overflow handler support While enabled to count events and an event occurrence causes the counter value to increment and roll over to or past zero, this is termed a counter overflow. The overflow can trigger an interrupt. The IOMMU perfmon needs to handle the case properly. New HW IRQs are allocated for each IOMMU device for perfmon. The IRQ IDs are after the SVM range. In the overflow handler, the counter is not frozen. It's very unlikely that the same counter overflows again during the period. But it's possible that other counters overflow at the same time. Read the overflow register at the end of the handler and check whether there are more. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20230128200428.1459118-7-kan.liang@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-02-03 11:06:08 +01:00
Kan Liang	dc57875866	iommu/vt-d: Support Enhanced Command Interface The Enhanced Command Register is to submit command and operand of enhanced commands to DMA Remapping hardware. It can supports up to 256 enhanced commands. There is a HW register to indicate the availability of all 256 enhanced commands. Each bit stands for each command. But there isn't an existing interface to read/write all 256 bits. Introduce the u64 ecmdcap[4] to store the existence of each enhanced command. Read 4 times to get all of them in map_iommu(). Add a helper to facilitate an enhanced command launch. Make sure hardware complete the command. Also add a helper to facilitate the check of PMU essentials. These helpers will be used later. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20230128200428.1459118-4-kan.liang@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-02-03 11:06:05 +01:00
Kan Liang	a6a5006dad	iommu/vt-d: Retrieve IOMMU perfmon capability information The performance monitoring infrastructure, perfmon, is to support collection of information about key events occurring during operation of the remapping hardware, to aid performance tuning and debug. Each remapping hardware unit has capability registers that indicate support for performance monitoring features and enumerate the capabilities. Add alloc_iommu_pmu() to retrieve IOMMU perfmon capability information for each iommu unit. The information is stored in the iommu->pmu data structure. Capability registers are read-only, so it's safe to prefetch and store them in the pmu structure. This could avoid unnecessary VMEXIT when this code is running in the virtualization environment. Add free_iommu_pmu() to free the saved capability information when freeing the iommu unit. Add a kernel config option for the IOMMU perfmon feature. Unless a user explicitly uses the perf tool to monitor the IOMMU perfmon event, there isn't any impact for the existing IOMMU. Enable it by default. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20230128200428.1459118-3-kan.liang@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-02-03 11:06:04 +01:00
Kan Liang	4db96bfe9d	iommu/vt-d: Support size of the register set in DRHD A new field, which indicates the size of the remapping hardware register set for this remapping unit, is introduced in the DMA-remapping hardware unit definition (DRHD) structure with the VT-d Spec 4.0. With this information, SW doesn't need to 'guess' the size of the register set anymore. Update the struct acpi_dmar_hardware_unit to reflect the field. Store the size of the register set in struct dmar_drhd_unit for each dmar device. The 'size' information is ResvZ for the old BIOS and platforms. Fall back to the old guessing method. There is nothing changed. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20230128200428.1459118-2-kan.liang@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2023-02-03 11:06:03 +01:00
Linus Torvalds	08cdc21579	Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd implementation from Jason Gunthorpe: "iommufd is the user API to control the IOMMU subsystem as it relates to managing IO page tables that point at user space memory. It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO container) which is the VFIO specific interface for a similar idea. We see a broad need for extended features, some being highly IOMMU device specific: - Binding iommu_domain's to PASID/SSID - Userspace IO page tables, for ARM, x86 and S390 - Kernel bypassed invalidation of user page tables - Re-use of the KVM page table in the IOMMU - Dirty page tracking in the IOMMU - Runtime Increase/Decrease of IOPTE size - PRI support with faults resolved in userspace Many of these HW features exist to support VM use cases - for instance the combination of PASID, PRI and Userspace IO Page Tables allows an implementation of DMA Shared Virtual Addressing (vSVA) within a guest. Dirty tracking enables VM live migration with SRIOV devices and PASID support allow creating "scalable IOV" devices, among other things. As these features are fundamental to a VM platform they need to be uniformly exposed to all the driver families that do DMA into VMs, which is currently VFIO and VDPA" For more background, see the extended explanations in Jason's pull request: https://lore.kernel.org/lkml/Y5dzTU8dlmXTbzoJ@nvidia.com/ * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (62 commits) iommufd: Change the order of MSI setup iommufd: Improve a few unclear bits of code iommufd: Fix comment typos vfio: Move vfio group specific code into group.c vfio: Refactor dma APIs for emulated devices vfio: Wrap vfio group module init/clean code into helpers vfio: Refactor vfio_device open and close vfio: Make vfio_device_open() truly device specific vfio: Swap order of vfio_device_container_register() and open_device() vfio: Set device->group in helper function vfio: Create wrappers for group register/unregister vfio: Move the sanity check of the group to vfio_create_group() vfio: Simplify vfio_create_group() iommufd: Allow iommufd to supply /dev/vfio/vfio vfio: Make vfio_container optionally compiled vfio: Move container related MODULE_ALIAS statements into container.c vfio-iommufd: Support iommufd for emulated VFIO devices vfio-iommufd: Support iommufd for physical VFIO devices vfio-iommufd: Allow iommufd to be used in place of a container fd vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for vfio_file_enforced_coherent() ...	2022-12-14 09:15:43 -08:00
Xiongfeng Wang	4bedbbd782	iommu/vt-d: Fix PCI device refcount leak in dmar_dev_scope_init() for_each_pci_dev() is implemented by pci_get_device(). The comment of pci_get_device() says that it will increase the reference count for the returned pci_dev and also decrease the reference count for the input pci_dev @from if it is not NULL. If we break for_each_pci_dev() loop with pdev not NULL, we need to call pci_dev_put() to decrease the reference count. Add the missing pci_dev_put() for the error path to avoid reference count leak. Fixes: `2e45528930` ("iommu/vt-d: Unify the way to process DMAR device scope array") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Link: https://lore.kernel.org/r/20221121113649.190393-3-wangxiongfeng2@huawei.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2022-12-02 11:45:33 +01:00
Lu Baolu	1adf3cc20d	iommu: Add max_pasids field in struct iommu_device Use this field to keep the number of supported PASIDs that an IOMMU hardware is able to support. This is a generic attribute of an IOMMU and lifting it into the per-IOMMU device structure makes it possible to allocate a PASID for device without calls into the IOMMU drivers. Any iommu driver that supports PASID related features should set this field before enabling them on the devices. In the Intel IOMMU driver, intel_iommu_sm is moved to CONFIG_INTEL_IOMMU enclave so that the pasid_supported() helper could be used in dmar.c without compilation errors. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2022-11-03 15:47:43 +01:00
Lu Baolu	7ebb5f8e00	Revert "iommu/vt-d: Fix possible recursive locking in intel_iommu_init()" This reverts commit `9cd4f14344`. Some issues were reported on the original commit. Some thunderbolt devices don't work anymore due to the following DMA fault. DMAR: DRHD: handling fault status reg 2 DMAR: [INTR-REMAP] Request device [09:00.0] fault index 0x8080 [fault reason 0x25] Blocked a compatibility format interrupt request Bring it back for now to avoid functional regression. Fixes: `9cd4f14344` ("iommu/vt-d: Fix possible recursive locking in intel_iommu_init()") Link: https://lore.kernel.org/linux-iommu/485A6EA5-6D58-42EA-B298-8571E97422DE@getmailspring.com/ Link: https://bugzilla.kernel.org/show_bug.cgi?id=216497 Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: <stable@vger.kernel.org> # 5.19.x Reported-and-tested-by: George Hilliard <thirtythreeforty@gmail.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220920081701.3453504-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2022-09-21 10:22:54 +02:00
Lu Baolu	9cd4f14344	iommu/vt-d: Fix possible recursive locking in intel_iommu_init() The global rwsem dmar_global_lock was introduced by commit `3a5670e8ac` ("iommu/vt-d: Introduce a rwsem to protect global data structures"). It is used to protect DMAR related global data from DMAR hotplug operations. The dmar_global_lock used in the intel_iommu_init() might cause recursive locking issue, for example, intel_iommu_get_resv_regions() is taking the dmar_global_lock from within a section where intel_iommu_init() already holds it via probe_acpi_namespace_devices(). Using dmar_global_lock in intel_iommu_init() could be relaxed since it is unlikely that any IO board must be hot added before the IOMMU subsystem is initialized. This eliminates the possible recursive locking issue by moving down DMAR hotplug support after the IOMMU is initialized and removing the uses of dmar_global_lock in intel_iommu_init(). Fixes: `d5692d4af0` ("iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices()") Reported-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/894db0ccae854b35c73814485569b634237b5538.1657034828.git.robin.murphy@arm.com Link: https://lore.kernel.org/r/20220718235325.3952426-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2022-09-11 08:19:24 +02:00
Linus Torvalds	4e23eeebb2	Merge tag 'bitmap-6.0-rc1' of https://github.com/norov/linux Pull bitmap updates from Yury Norov: - fix the duplicated comments on bitmap_to_arr64() (Qu Wenruo) - optimize out non-atomic bitops on compile-time constants (Alexander Lobakin) - cleanup bitmap-related headers (Yury Norov) - x86/olpc: fix 'logical not is only applied to the left hand side' (Alexander Lobakin) - lib/nodemask: inline wrappers around bitmap (Yury Norov) * tag 'bitmap-6.0-rc1' of https://github.com/norov/linux: (26 commits) lib/nodemask: inline next_node_in() and node_random() powerpc: drop dependency on <asm/machdep.h> in archrandom.h x86/olpc: fix 'logical not is only applied to the left hand side' lib/cpumask: move some one-line wrappers to header file headers/deps: mm: align MANITAINERS and Docs with new gfp.h structure headers/deps: mm: Split <linux/gfp_types.h> out of <linux/gfp.h> headers/deps: mm: Optimize <linux/gfp.h> header dependencies lib/cpumask: move trivial wrappers around find_bit to the header lib/cpumask: change return types to unsigned where appropriate cpumask: change return types to bool where appropriate lib/bitmap: change type of bitmap_weight to unsigned long lib/bitmap: change return types to bool where appropriate arm: align find_bit declarations with generic kernel iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE) lib/test_bitmap: test the tail after bitmap_to_arr64() lib/bitmap: fix off-by-one in bitmap_to_arr64() lib: test_bitmap: add compile-time optimization/evaluations assertions bitmap: don't assume compiler evaluates small mem*() builtins calls net/ice: fix initializing the bitmap in the switch code bitops: let optimize out non-atomic bitops on compile-time constants ...	2022-08-07 17:52:35 -07:00
Lu Baolu	913432f217	iommu/vt-d: Use IDA interface to manage iommu sequence id Switch dmar unit sequence id allocation and release from bitmap to IDA interface. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lore.kernel.org/r/20220702015610.2849494-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2022-07-15 10:21:40 +02:00
Lu Baolu	2585a2790e	iommu/vt-d: Move include/linux/intel-iommu.h under iommu This header file is private to the Intel IOMMU driver. Move it to the driver folder. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lore.kernel.org/r/20220514014322.2927339-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2022-07-15 10:21:31 +02:00
Lu Baolu	933ab6d301	iommu/vt-d: Move trace/events/intel_iommu.h under iommu This header file is private to the Intel IOMMU driver. Move it to the driver folder. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Steve Wahl <steve.wahl@hpe.com> Link: https://lore.kernel.org/r/20220514014322.2927339-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2022-07-15 10:21:28 +02:00
Alexander Lobakin	b0b0b77ea6	iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE) KASAN reports: [ 4.668325][ T0] BUG: KASAN: wild-memory-access in dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497) [ 4.676149][ T0] Read of size 8 at addr 1fffffff85115558 by task swapper/0/0 [ 4.683454][ T0] [ 4.685638][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc3-00004-g0e862838f290 #1 [ 4.694331][ T0] Hardware name: Supermicro SYS-5018D-FN4T/X10SDV-8C-TLN4F, BIOS 1.1 03/02/2016 [ 4.703196][ T0] Call Trace: [ 4.706334][ T0] <TASK> [ 4.709133][ T0] ? dmar_parse_one_rhsa (arch/x86/include/asm/bitops.h:214 arch/x86/include/asm/bitops.h:226 include/asm-generic/bitops/instrumented-non-atomic.h:142 include/linux/nodemask.h:415 drivers/iommu/intel/dmar.c:497) after converting the type of the first argument (@nr, bit number) of arch_test_bit() from `long` to `unsigned long`[0]. Under certain conditions (for example, when ACPI NUMA is disabled via command line), pxm_to_node() can return %NUMA_NO_NODE (-1). It is valid 'magic' number of NUMA node, but not valid bit number to use in bitops. node_online() eventually descends to test_bit() without checking for the input, assuming it's on caller side (which might be good for perf-critical tasks). There, -1 becomes %ULONG_MAX which leads to an insane array index when calculating bit position in memory. For now, add an explicit check for @node being not %NUMA_NO_NODE before calling test_bit(). The actual logics didn't change here at all. [0] `0e862838f2` Fixes: `ee34b32d8c` ("dmar: support for parsing Remapping Hardware Static Affinity structure") Cc: stable@vger.kernel.org # 2.6.33+ Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>	2022-07-13 09:10:32 -07:00
Yian Chen	316f92a705	iommu/vt-d: Fix PCI bus rescan device hot add Notifier calling chain uses priority to determine the execution order of the notifiers or listeners registered to the chain. PCI bus device hot add utilizes the notification mechanism. The current code sets low priority (INT_MIN) to Intel dmar_pci_bus_notifier and postpones DMAR decoding after adding new device into IOMMU. The result is that struct device pointer cannot be found in DRHD search for the new device's DMAR/IOMMU. Subsequently, the device is put under the "catch-all" IOMMU instead of the correct one. This could cause system hang when device TLB invalidation is sent to the wrong IOMMU. Invalidation timeout error and hard lockup have been observed and data inconsistency/crush may occur as well. This patch fixes the issue by setting a positive priority(1) for dmar_pci_bus_notifier while the priority of IOMMU bus notifier uses the default value(0), therefore DMAR decoding will be in advance of DRHD search for a new device to find the correct IOMMU. Following is a 2-step example that triggers the bug by simulating PCI device hot add behavior in Intel Sapphire Rapids server. echo 1 > /sys/bus/pci/devices/0000:6a:01.0/remove echo 1 > /sys/bus/pci/rescan Fixes: `59ce0515cd` ("iommu/vt-d: Update DRHD/RMRR/ATSR device scope") Cc: stable@vger.kernel.org # v3.15+ Reported-by: Zhang, Bernice <bernice.zhang@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Yian Chen <yian.chen@intel.com> Link: https://lore.kernel.org/r/20220521002115.1624069-1-yian.chen@intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2022-07-06 12:44:07 +02:00
Christoph Hellwig	78013eaadf	x86: remove the IOMMU table infrastructure The IOMMU table tries to separate the different IOMMUs into different backends, but actually requires various cross calls. Rewrite the code to do the generic swiotlb/swiotlb-xen setup directly in pci-dma.c and then just call into the IOMMU drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2022-04-18 07:21:10 +02:00
Andy Shevchenko	2852631d96	iommu/vt-d: Move intel_iommu_ops to header file Compiler is not happy about hidden declaration of intel_iommu_ops. .../iommu.c:414:24: warning: symbol 'intel_iommu_ops' was not declared. Should it be static? Move declaration to header file to make compiler happy. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20220207141240.8253-1-andriy.shevchenko@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220301020159.633356-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2022-03-04 16:46:31 +01:00
Rafael J. Wysocki	f266c11bce	iommu/vtd: Replace acpi_bus_get_device() Replace acpi_bus_get_device() that is going to be dropped with acpi_fetch_acpi_dev(). No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1807113.tdWV9SEqCh@kreacher Signed-off-by: Joerg Roedel <jroedel@suse.de>	2022-02-14 12:59:45 +01:00
Kyung Min Park	914ff7719e	iommu/vt-d: Dump DMAR translation structure when DMA fault occurs When the dmar translation fault happens, the kernel prints a single line fault reason with corresponding hexadecimal code defined in the Intel VT-d specification. Currently, when user wants to debug the translation fault in detail, debugfs is used for dumping the dmar_translation_struct, which is not available when the kernel failed to boot. Dump the DMAR translation structure, pagewalk the IO page table and print the page table entry when the fault happens. This takes effect only when CONFIG_DMAR_DEBUG is enabled. Signed-off-by: Kyung Min Park <kyung.min.park@intel.com> Link: https://lore.kernel.org/r/20210815203845.31287-1-kyung.min.park@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211014053839.727419-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-10-18 12:31:48 +02:00
Bjorn Helgaas	0b482d0c75	iommu/vt-d: Drop "0x" prefix from PCI bus & device addresses `719a193356` ("iommu/vt-d: Tweak the description of a DMA fault") changed the DMA fault reason from hex to decimal. It also added "0x" prefixes to the PCI bus/device, e.g., - DMAR: [INTR-REMAP] Request device [00:00.5] + DMAR: [INTR-REMAP] Request device [0x00:0x00.5] These no longer match dev_printk() and other similar messages in dmar_match_pci_path() and dmar_acpi_insert_dev_scope(). Drop the "0x" prefixes from the bus and device addresses. Fixes: `719a193356` ("iommu/vt-d: Tweak the description of a DMA fault") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20210903193711.483999-1-helgaas@kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210922054726.499110-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-09-28 11:41:11 +02:00
Zhen Lei	5e41c99894	iommu/vt-d: Remove unnecessary oom message Fixes scripts/checkpatch.pl warning: WARNING: Possible unnecessary 'out of memory' message Remove it can help us save a bit of memory. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20210609124937.14260-1-thunder.leizhen@huawei.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210818134852.1847070-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-08-19 10:41:08 +02:00
Lu Baolu	74eb87a0f9	iommu/vt-d: Add cache invalidation latency sampling Queued invalidation execution time is performance critical and needs to be monitored. This adds code to sample the execution time of IOTLB/ devTLB/ICE cache invalidation. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-16-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:13 +02:00
Lu Baolu	719a193356	iommu/vt-d: Tweak the description of a DMA fault The Intel IOMMU driver reports the DMA fault reason in a decimal number while the VT-d specification uses a hexadecimal one. It's inconvenient that users need to covert them everytime before consulting the spec. Let's use hexadecimal number for a DMA fault reason. The fault message uses 0xffffffff as PASID for DMA requests w/o PASID. This is confusing. Tweak this by adding "NO_PASID" explicitly. Reviewed-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210517065425.4953-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210610020115.1637656-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-06-10 09:06:12 +02:00
Rolf Eike Beer	0ee74d5a48	iommu/vt-d: Fix sysfs leak in alloc_iommu() iommu_device_sysfs_add() is called before, so is has to be cleaned on subsequent errors. Fixes: `39ab9555c2` ("iommu: Add sysfs bindings for struct iommu_device") Cc: stable@vger.kernel.org # 4.11.x Signed-off-by: Rolf Eike Beer <eb@emlix.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/17411490.HIIP88n32C@mobilepool36.emlix.com Link: https://lore.kernel.org/r/20210525070802.361755-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-05-27 16:07:08 +02:00
Robin Murphy	2d471b20c5	iommu: Streamline registration interface Rather than have separate opaque setter functions that are easy to overlook and lead to repetitive boilerplate in drivers, let's pass the relevant initialisation parameters directly to iommu_device_register(). Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/ab001b87c533b6f4db71eb90db6f888953986c36.1617285386.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-16 17:20:45 +02:00
Lu Baolu	6ca69e5841	iommu/vt-d: Report more information about invalidation errors When the invalidation queue errors are encountered, dump the information logged by the VT-d hardware together with the pending queue invalidation descriptors. Signed-off-by: Ashok Raj <ashok.raj@intel.com> Tested-by: Guo Kaijie <Kaijie.Guo@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210318005340.187311-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-03-18 11:27:46 +01:00
Joerg Roedel	45e606f272	Merge branches 'arm/renesas', 'arm/smmu', 'x86/amd', 'x86/vt-d' and 'core' into next	2021-02-12 15:27:17 +01:00
Yian Chen	31a75cbbb9	iommu/vt-d: Parse SATC reporting structure Software should parse every SATC table and all devices in the tables reported by the BIOS and keep the information in kernel list for further reference. Signed-off-by: Yian Chen <yian.chen@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210203093329.1617808-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210204014401.2846425-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-02-04 14:42:00 +01:00
Lu Baolu	494b3688bb	iommu/vt-d: Correctly check addr alignment in qi_flush_dev_iotlb_pasid() An incorrect address mask is being used in the qi_flush_dev_iotlb_pasid() to check the address alignment. This leads to a lot of spurious kernel warnings: [ 485.837093] DMAR: Invalidate non-aligned address 7f76f47f9000, order 0 [ 485.837098] DMAR: Invalidate non-aligned address 7f76f47f9000, order 0 [ 492.494145] qi_flush_dev_iotlb_pasid: 5734 callbacks suppressed [ 492.494147] DMAR: Invalidate non-aligned address 7f7728800000, order 11 [ 492.508965] DMAR: Invalidate non-aligned address 7f7728800000, order 11 Fix it by checking the alignment in right way. Fixes: `288d08e780` ("iommu/vt-d: Handle non-page aligned address") Reported-and-tested-by: Guo Kaijie <Kaijie.Guo@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Cc: Liu Yi L <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20210119043500.1539596-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-01-28 13:17:08 +01:00
Lu Baolu	f2dd871799	iommu/vt-d: Add qi_submit trace event This adds a new trace event to track the submissions of requests to the invalidation queue. This event will provide the information like: - IOMMU name - Invalidation type - Descriptor raw data A sample output like: \| qi_submit: iotlb_inv dmar1: 0x100e2 0x0 0x0 0x0 \| qi_submit: dev_tlb_inv dmar1: 0x1000000003 0x7ffffffffffff001 0x0 0x0 \| qi_submit: iotlb_inv dmar2: 0x800f2 0xf9a00005 0x0 0x0 This will be helpful for queued invalidation related debugging. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210114090400.736104-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-01-28 11:32:13 +01:00
Lu Baolu	1efd17e7ac	iommu/vt-d: Fix misuse of ALIGN in qi_flush_piotlb() Use IS_ALIGNED() instead. Otherwise, an unaligned address will be ignored. Fixes: `33cd6e642d` ("iommu/vt-d: Flush PASID-based iotlb for iova over first level") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20201231005323.2178523-1-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>	2021-01-07 13:27:14 +00:00
David Woodhouse	d76b42e927	iommu/vt-d: Don't read VCCAP register unless it exists My virtual IOMMU implementation is whining that the guest is reading a register that doesn't exist. Only read the VCCAP_REG if the corresponding capability is set in ECAP_REG to indicate that it actually exists. Fixes: `3375303e82` ("iommu/vt-d: Add custom allocator for IOASID") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Cc: stable@vger.kernel.org # v5.7+ Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/de32b150ffaa752e0cff8571b17dfb1213fbe71c.camel@infradead.org Signed-off-by: Will Deacon <will@kernel.org>	2020-11-26 14:50:24 +00:00
Lu Baolu	3645a34f5b	iommu/vt-d: Fix compile error with CONFIG_PCI_ATS not set Fix the compile error below (CONFIG_PCI_ATS not set): drivers/iommu/intel/dmar.c: In function ‘vf_inherit_msi_domain’: drivers/iommu/intel/dmar.c:338:59: error: ‘struct pci_dev’ has no member named ‘physfn’; did you mean ‘is_physfn’? 338 \| dev_set_msi_domain(&pdev->dev, dev_get_msi_domain(&pdev->physfn->dev)); \| ^~~~~~ \| is_physfn Fixes: `ff828729be` ("iommu/vt-d: Cure VF irqdomain hickup") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/linux-iommu/CAMuHMdXA7wfJovmfSH2nbAhN0cPyCiFHodTvg4a8Hm9rx5Dj-w@mail.gmail.com/ Link: https://lore.kernel.org/r/20201119055119.2862701-1-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org>	2020-11-19 09:52:26 +00:00
Thomas Gleixner	ff828729be	iommu/vt-d: Cure VF irqdomain hickup The recent changes to store the MSI irqdomain pointer in struct device missed that Intel DMAR does not register virtual function devices. Due to that a VF device gets the plain PCI-MSI domain assigned and then issues compat MSI messages which get caught by the interrupt remapping unit. Cure that by inheriting the irq domain from the physical function device. Ideally the irqdomain would be associated to the bus, but DMAR can have multiple units and therefore irqdomains on a single bus. The VF 'bus' could of course inherit the domain from the PF, but that'd be yet another x86 oddity. Fixes: `85a8dfc57a` ("iommm/vt-d: Store irq domain in struct device") Reported-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: David Woodhouse <dwmw2@infradead.org> Link: https://lore.kernel.org/r/draft-87eekymlpz.fsf@nanos.tec.linutronix.de	2020-11-13 12:00:40 +01:00
Linus Torvalds	5c7e3f3f5c	Merge tag 'iommu-fix-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fix from Joerg Roedel: "Fix a build regression with !CONFIG_IOMMU_API" * tag 'iommu-fix-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/vt-d: Don't dereference iommu_device if IOMMU_API is not built	2020-10-20 09:35:06 -07:00
Bartosz Golaszewski	9def3b1a07	iommu/vt-d: Don't dereference iommu_device if IOMMU_API is not built Since commit `c40aaaac10` ("iommu/vt-d: Gracefully handle DMAR units with no supported address widths") dmar.c needs struct iommu_device to be selected. We can drop this dependency by not dereferencing struct iommu_device if IOMMU_API is not selected and by reusing the information stored in iommu->drhd->ignored instead. This fixes the following build error when IOMMU_API is not selected: drivers/iommu/intel/dmar.c: In function ‘free_iommu’: drivers/iommu/intel/dmar.c:1139:41: error: ‘struct iommu_device’ has no member named ‘ops’ 1139 \| if (intel_iommu_enabled && iommu->iommu.ops) { ^ Fixes: `c40aaaac10` ("iommu/vt-d: Gracefully handle DMAR units with no supported address widths") Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Acked-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20201013073055.11262-1-brgl@bgdev.pl Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-10-19 14:16:02 +02:00
Linus Torvalds	531d29b0b6	Merge tag 'iommu-updates-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - ARM-SMMU Updates from Will: - Continued SVM enablement, where page-table is shared with CPU - Groundwork to support integrated SMMU with Adreno GPU - Allow disabling of MSI-based polling on the kernel command-line - Minor driver fixes and cleanups (octal permissions, error messages, ...) - Secure Nested Paging Support for AMD IOMMU. The IOMMU will fault when a device tries DMA on memory owned by a guest. This needs new fault-types as well as a rewrite of the IOMMU memory semaphore for command completions. - Allow broken Intel IOMMUs (wrong address widths reported) to still be used for interrupt remapping. - IOMMU UAPI updates for supporting vSVA, where the IOMMU can access address spaces of processes running in a VM. - Support for the MT8167 IOMMU in the Mediatek IOMMU driver. - Device-tree updates for the Renesas driver to support r8a7742. - Several smaller fixes and cleanups all over the place. * tag 'iommu-updates-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (57 commits) iommu/vt-d: Gracefully handle DMAR units with no supported address widths iommu/vt-d: Check UAPI data processed by IOMMU core iommu/uapi: Handle data and argsz filled by users iommu/uapi: Rename uapi functions iommu/uapi: Use named union for user data iommu/uapi: Add argsz for user filled data docs: IOMMU user API iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate() iommu/arm-smmu-v3: Add SVA device feature iommu/arm-smmu-v3: Check for SVA features iommu/arm-smmu-v3: Seize private ASID iommu/arm-smmu-v3: Share process page tables iommu/arm-smmu-v3: Move definitions to a header iommu/io-pgtable-arm: Move some definitions to a header iommu/arm-smmu-v3: Ensure queue is read after updating prod pointer iommu/amd: Re-purpose Exclusion range registers to support SNP CWWB iommu/amd: Add support for RMP_PAGE_FAULT and RMP_HW_ERR iommu/amd: Use 4K page for completion wait write-back semaphore iommu/tegra-smmu: Allow to group clients in same swgroup iommu/tegra-smmu: Fix iova->phys translation ...	2020-10-14 12:08:34 -07:00
Linus Torvalds	cf1d2b44f6	Merge tag 'acpi-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These add support for generic initiator-only proximity domains to the ACPI NUMA code and the architectures using it, clean up some non-ACPICA code referring to debug facilities from ACPICA, reduce the overhead related to accessing GPE registers, add a new DPTF (Dynamic Power and Thermal Framework) participant driver, update the ACPICA code in the kernel to upstream revision 20200925, add a new ACPI backlight whitelist entry, fix a few assorted issues and clean up some code. Specifics: - Add support for generic initiator-only proximity domains to the ACPI NUMA code and the architectures using it (Jonathan Cameron) - Clean up some non-ACPICA code referring to debug facilities from ACPICA that are not actually used in there (Hanjun Guo) - Add new DPTF driver for the PCH FIVR participant (Srinivas Pandruvada) - Reduce overhead related to accessing GPE registers in ACPICA and the OS interface layer and make it possible to access GPE registers using logical addresses if they are memory-mapped (Rafael Wysocki) - Update the ACPICA code in the kernel to upstream revision 20200925 including changes as follows: + Add predefined names from the SMBus sepcification (Bob Moore) + Update acpi_help UUID list (Bob Moore) + Return exceptions for string-to-integer conversions in iASL (Bob Moore) + Add a new "ALL <NameSeg>" debugger command (Bob Moore) + Add support for 64 bit risc-v compilation (Colin Ian King) + Do assorted cleanups (Bob Moore, Colin Ian King, Randy Dunlap) - Add new ACPI backlight whitelist entry for HP 635 Notebook (Alex Hung) - Move TPS68470 OpRegion driver to drivers/acpi/pmic/ and split out Kconfig and Makefile specific for ACPI PMIC (Andy Shevchenko) - Clean up the ACPI SoC driver for AMD SoCs (Hanjun Guo) - Add missing config_item_put() to fix refcount leak (Hanjun Guo) - Drop lefrover field from struct acpi_memory_device (Hanjun Guo) - Make the ACPI extlog driver check for RDMSR failures (Ben Hutchings) - Fix handling of lid state changes in the ACPI button driver when input device is closed (Dmitry Torokhov) - Fix several assorted build issues (Barnabás Pőcze, John Garry, Nathan Chancellor, Tian Tao) - Drop unused inline functions and reduce code duplication by using kobj_to_dev() in the NFIT parsing code (YueHaibing, Wang Qing) - Serialize tools/power/acpi Makefile (Thomas Renninger)" * tag 'acpi-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits) ACPICA: Update version to 20200925 Version 20200925 ACPICA: Remove unnecessary semicolon ACPICA: Debugger: Add a new command: "ALL <NameSeg>" ACPICA: iASL: Return exceptions for string-to-integer conversions ACPICA: acpi_help: Update UUID list ACPICA: Add predefined names found in the SMBus sepcification ACPICA: Tree-wide: fix various typos and spelling mistakes ACPICA: Drop the repeated word "an" in a comment ACPICA: Add support for 64 bit risc-v compilation ACPI: button: fix handling lid state changes when input device closed tools/power/acpi: Serialize Makefile ACPI: scan: Replace ACPI_DEBUG_PRINT() with pr_debug() ACPI: memhotplug: Remove 'state' from struct acpi_memory_device ACPI / extlog: Check for RDMSR failure ACPI: Make acpi_evaluate_dsm() prototype consistent docs: mm: numaperf.rst Add brief description for access class 1. node: Add access1 class to represent CPU to memory characteristics ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3 ACPI: Let ACPI know we support Generic Initiator Affinity Structures x86: Support Generic Initiator only proximity domains ...	2020-10-14 11:42:04 -07:00
Linus Torvalds	cc7343724e	Merge tag 'x86-irq-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 irq updates from Thomas Gleixner: "Surgery of the MSI interrupt handling to prepare the support of upcoming devices which require non-PCI based MSI handling: - Cleanup historical leftovers all over the place - Rework the code to utilize more core functionality - Wrap XEN PCI/MSI interrupts into an irqdomain to make irqdomain assignment to PCI devices possible. - Assign irqdomains to PCI devices at initialization time which allows to utilize the full functionality of hierarchical irqdomains. - Remove arch_._msi_irq() functions from X86 and utilize the irqdomain which is assigned to the device for interrupt management. - Make the arch_._msi_irq() support conditional on a config switch and let the last few users select it" * tag 'x86-irq-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) PCI: MSI: Fix Kconfig dependencies for PCI_MSI_ARCH_FALLBACKS x86/apic/msi: Unbreak DMAR and HPET MSI iommu/amd: Remove domain search for PCI/MSI iommu/vt-d: Remove domain search for PCI/MSI[X] x86/irq: Make most MSI ops XEN private x86/irq: Cleanup the arch__msi_irqs() leftovers PCI/MSI: Make arch_._msi_irq[s] fallbacks selectable x86/pci: Set default irq domain in pcibios_add_device() iommm/amd: Store irq domain in struct device iommm/vt-d: Store irq domain in struct device x86/xen: Wrap XEN MSI management into irqdomain irqdomain/msi: Allow to override msi_domain_alloc/free_irqs() x86/xen: Consolidate XEN-MSI init x86/xen: Rework MSI teardown x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init() PCI/MSI: Provide pci_dev_has_special_msi_domain() helper PCI_vmd_Mark_VMD_irqdomain_with_DOMAIN_BUS_VMD_MSI irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI x86/irq: Initialize PCI/MSI domain at PCI init time x86/pci: Reducde #ifdeffery in PCI init code ...	2020-10-12 11:40:41 -07:00
David Woodhouse	c40aaaac10	iommu/vt-d: Gracefully handle DMAR units with no supported address widths Instead of bailing out completely, such a unit can still be used for interrupt remapping. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/linux-iommu/549928db2de6532117f36c9c810373c14cf76f51.camel@infradead.org/ Signed-off-by: Joerg Roedel <jroedel@suse.de>	2020-10-07 11:49:54 +02:00
Jonathan Cameron	01feba590c	ACPI: Do not create new NUMA domains from ACPI static tables that are not SRAT Several ACPI static tables contain references to proximity domains. ACPI 6.3 has clarified that only entries in SRAT may define a new domain (sec 5.2.16). Those tables described in the ACPI spec have additional clarifying text. NFIT: Table 5-132, "Integer that represents the proximity domain to which the memory belongs. This number must match with corresponding entry in the SRAT table." HMAT: Table 5-145, "... This number must match with the corresponding entry in the SRAT table's processor affinity structure ... if the initiator is a processor, or the Generic Initiator Affinity Structure if the initiator is a generic initiator". IORT and DMAR are defined by external specifications. Intel Virtualization Technology for Directed I/O Rev 3.1 does not make any explicit statements, but the general SRAT statement above will still apply. https://software.intel.com/sites/default/files/managed/c5/15/vt-directed-io-spec.pdf IO Remapping Table, Platform Design Document rev D, also makes not explicit statement, but refers to ACPI SRAT table for more information and again the generic SRAT statement above applies. https://developer.arm.com/documentation/den0049/d/ In conclusion, any proximity domain specified in these tables, should be a reference to a proximity domain also found in SRAT, and they should not be able to instantiate a new domain. Hence we switch to pxm_to_node() which will only return existing nodes. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Barry Song <song.bao.hua@hisilicon.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-09-24 12:57:37 +02:00

1 2

58 Commits