Convert iopf_queue_remove_device() to return void instead of an error code,
as the return value is never used. This removal helper is designed never to
fail, so there's no need for error handling.
Ack all outstanding page requests from the device with the response code
IOMMU_PAGE_RESP_INVALID, indicating that the device should not attempt any
retry.
Add comments to this helper explaining the steps involved in removing a
device from the iopf queue and disabling its PRI. The individual drivers
are expected to be adjusted accordingly. Here we just define the expected
behaviors of the individual iommu driver from the core's perspective.
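As a rough sketch of the shape this gives the helper (locking is simplified,
and list/field names such as "partial" and "faults" are illustrative rather
than guaranteed to match the series exactly):
```
/* Sketch, as if in drivers/iommu/io-pgfault.c */
void iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev)
{
	const struct iommu_ops *ops = dev_iommu_ops(dev);
	struct iommu_fault_param *fault_param = dev->iommu->fault_param;
	struct iopf_fault *iopf, *next;

	mutex_lock(&fault_param->lock);

	/* Partial faults never formed a group; simply discard them. */
	list_for_each_entry_safe(iopf, next, &fault_param->partial, list) {
		list_del(&iopf->list);
		kfree(iopf);
	}

	/* Ack every request still awaiting a response so the device
	 * does not retry: respond with IOMMU_PAGE_RESP_INVALID. */
	list_for_each_entry_safe(iopf, next, &fault_param->faults, list) {
		struct iommu_page_response resp = {
			.pasid = iopf->fault.prm.pasid,
			.grpid = iopf->fault.prm.grpid,
			.code  = IOMMU_PAGE_RESP_INVALID,
		};

		ops->page_response(dev, iopf, &resp);
		list_del(&iopf->list);
		kfree(iopf);
	}

	mutex_unlock(&fault_param->lock);
}
```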
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://lore.kernel.org/r/20240212012227.119381-14-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The per-device fault data structure stores information about faults
occurring on a device. Its lifetime spans from IOPF enablement to
disablement. Multiple paths, including IOPF reporting, handling, and
responding, may access it concurrently.
Previously, a mutex protected the fault data from use-after-free. But this
is not performance-friendly due to the performance-critical nature of the
IOPF handling paths.
Refine this with a refcount-based approach. The fault data pointer is
obtained within an RCU read region with a refcount. The fault data
pointer is returned for usage only when the pointer is valid and a
refcount is successfully obtained. The fault data is freed with
kfree_rcu(), ensuring data is only freed after all RCU critical regions
complete.
An iopf handling work starts once an iopf group is created. The handling
work continues until iommu_page_response() is called to respond to the
iopf and the iopf group is freed. During this time, the device fault
parameter should always be available. Add a pointer to the device fault
parameter in the iopf_group structure and hold the reference until the
iopf_group is freed.
Make iommu_page_response() static as it is only used in io-pgfault.c.
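A minimal sketch of the get/put pattern this describes, assuming a "users"
refcount and an "rcu" head in struct iommu_fault_param (function and field
names follow the series but are treated as illustrative here):
```
/* Sketch, as if in drivers/iommu/io-pgfault.c */
static struct iommu_fault_param *iopf_get_dev_fault_param(struct device *dev)
{
	struct iommu_fault_param *fault_param;

	rcu_read_lock();
	fault_param = rcu_dereference(dev->iommu->fault_param);
	/* Only hand the pointer out if a reference could be taken. */
	if (fault_param && !refcount_inc_not_zero(&fault_param->users))
		fault_param = NULL;
	rcu_read_unlock();

	return fault_param;	/* NULL if IOPF is already disabled */
}

static void iopf_put_dev_fault_param(struct iommu_fault_param *fault_param)
{
	if (refcount_dec_and_test(&fault_param->users))
		/* Freed only after all RCU read-side sections complete. */
		kfree_rcu(fault_param, rcu);
}
```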
Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://lore.kernel.org/r/20240212012227.119381-13-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The per-device fault data is a data structure that is used to store
information about faults that occur on a device. This data is allocated
when IOPF is enabled on the device and freed when IOPF is disabled. The
data is used in the paths of iopf reporting, handling, responding, and
draining.
The fault data is protected by two locks:
- dev->iommu->lock: This lock is used to protect the allocation and
freeing of the fault data.
- dev->iommu->fault_parameter->lock: This lock is used to protect the
fault data itself.
Apply the locking mechanism to the fault reporting and responding paths.
The fault_parameter->lock is also added in iopf_queue_discard_partial().
It does not fix any real issue, as iopf_queue_discard_partial() is only
used in the VT-d driver's prq_event_thread(), which is a single-threaded
path that reports the IOPFs.
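A sketch of how the two locks nest on the reporting path (the helper name
is hypothetical; field names follow the description above):
```
/* Sketch, as if in drivers/iommu/io-pgfault.c */
static int report_fault_locked(struct device *dev, struct iopf_fault *evt)
{
	struct dev_iommu *param = dev->iommu;
	struct iommu_fault_param *fault_param;

	/* dev->iommu->lock guards the lifetime of fault_param... */
	mutex_lock(&param->lock);
	fault_param = param->fault_param;
	if (!fault_param) {
		mutex_unlock(&param->lock);
		return -EINVAL;		/* IOPF not enabled on this device */
	}

	/* ...while fault_param->lock guards the fault lists themselves. */
	mutex_lock(&fault_param->lock);
	list_add_tail(&evt->list, &fault_param->faults);
	mutex_unlock(&fault_param->lock);

	mutex_unlock(&param->lock);
	return 0;
}
```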
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-12-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Add CONFIG_IOMMU_IOPF for the page fault handling framework and select it
from its real consumer. Move the iopf function declarations from
iommu-sva.h to iommu.h and remove iommu-sva.h as it's empty now.
Consolidate all SVA related code into iommu-sva.c:
- Move iommu_sva_domain_alloc() from iommu.c to iommu-sva.c.
- Move sva iopf handling code from io-pgfault.c to iommu-sva.c.
Consolidate iommu_report_device_fault() and iommu_page_response() into
io-pgfault.c.
Export iopf_free_group() and iopf_group_response() for iopf handlers
implemented in modules. Some functions are renamed with more meaningful
names. No other intentional functionality changes.
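On the iommu.h side, the result looks roughly like the following: real
declarations when CONFIG_IOMMU_IOPF is selected, static inline stubs
otherwise (the exact set of stubbed helpers shown here is illustrative):
```
#ifdef CONFIG_IOMMU_IOPF
int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev);
void iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev);
void iopf_free_group(struct iopf_group *group);
void iopf_group_response(struct iopf_group *group,
			 enum iommu_page_response_code status);
#else
static inline int
iopf_queue_add_device(struct iopf_queue *queue, struct device *dev)
{
	return -ENODEV;
}
static inline void
iopf_queue_remove_device(struct iopf_queue *queue, struct device *dev)
{
}
static inline void iopf_free_group(struct iopf_group *group)
{
}
static inline void iopf_group_response(struct iopf_group *group,
				       enum iommu_page_response_code status)
{
}
#endif /* CONFIG_IOMMU_IOPF */
```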
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-11-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Make iommu_queue_iopf() more generic by making the iopf_group a minimal
set of iopf's that a domain's iopf handler should handle and respond to.
Add a domain parameter to struct iopf_group so that the handler can
retrieve and use it directly.
Change iommu_queue_iopf() to forward groups of iopf's to the domain's
iopf handler. This is also a necessary step to decouple the sva iopf
handling code from this interface.
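For reference, a sketch of the group structure this describes; the real
structure carries more bookkeeping, so treat the layout as illustrative:
```
struct iopf_group {
	struct iopf_fault last_fault;	/* the fault that closed the group */
	struct list_head faults;	/* partial faults folded into the group */
	struct iommu_domain *domain;	/* resolved once, used by the handler */
	struct device *dev;
	struct work_struct work;	/* handling work queued on the iopf queue */
};
```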
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-10-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Move iopf_group data structure to iommu.h to make it a minimal set of
faults that a domain's page fault handler should handle.
Add a new function, iopf_free_group(), to free a fault group after all
faults in the group are handled. This function will be made global so
that it can be called from other files, such as iommu-sva.c.
Move iopf_queue data structure to iommu.h to allow the workqueue to be
scheduled out of this file.
This will simplify the sequential patches.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-9-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The iommu_fault_event and iopf_fault data structures store the same
information about an iopf fault. They are also used in the same way.
Merge these two data structures into a single one to make the code
more concise and easier to maintain.
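The merged type is essentially just the fault plus a list node (sketched;
treat the exact layout as illustrative):
```
struct iopf_fault {
	struct iommu_fault fault;	/* the reported fault itself */
	struct list_head list;		/* node in a partial-fault or group list */
};
```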
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-8-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The individual iommu driver reports the iommu page faults by calling
iommu_report_device_fault(), where a pre-registered device fault handler
is called to route the fault to another fault handler installed on the
corresponding iommu domain.
The pre-registered device fault handler is static and won't become dynamic,
as the fault handler is eventually per iommu domain. Replace the call to
the device fault handler with iommu_queue_iopf().
After this replacement, the registering and unregistering fault handler
interfaces are not needed anywhere. Remove the interfaces and the related
data structures to avoid dead code.
Convert the cookie parameter of iommu_queue_iopf() into a device pointer,
which is what is actually passed.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The struct dev_iommu contains two pointers, fault_param and iopf_param.
The fault_param pointer points to a data structure that is used to store
pending faults that are awaiting responses. The iopf_param pointer points
to a data structure that is used to store partial faults that are part of
a Page Request Group.
The fault_param and iopf_param pointers are essentially duplicates. This
causes memory waste. Merge the iopf_device_param pointer into the
iommu_fault_param pointer to consolidate the code and save memory. The
consolidated pointer is allocated on demand when the device driver enables
IOPF on the device, and is freed after IOPF is disabled.
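A before/after sketch of the per-device bookkeeping (structure names are
suffixed here only to show both variants side by side; member sets are
abbreviated):
```
/* Before: two parallel, separately allocated per-device pointers. */
struct dev_iommu_before {
	struct mutex lock;
	struct iommu_fault_param *fault_param;	/* pending faults */
	struct iopf_device_param *iopf_param;	/* partial faults */
};

/* After: a single fault parameter, allocated when IOPF is enabled and
 * freed when it is disabled; it carries both the pending and partial
 * fault lists. */
struct dev_iommu_after {
	struct mutex lock;
	struct iommu_fault_param *fault_param;
};
```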
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
struct iommu_fault_page_request and struct iommu_page_response are not
part of uAPI anymore. Convert them to data structures for kAPI.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Yan Zhao <yan.y.zhao@intel.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
No device driver registers a fault handler to handle the reported
unrecoverable faults. Remove it to avoid dead code.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20240212012227.119381-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This is an effort to get rid of all multiplications from allocation
functions in order to prevent integer overflows [1].
Here the multiplication is obviously safe because MTK_PROTECT_PA_ALIGN
is defined as a literal value:
- mtk_iommu.c: 256
- mtk_iommu_v1.c: 128
However, using devm_kcalloc() is more appropriate [2] and improves
readability. This patch has no effect on runtime behavior.
Link: https://github.com/KSPP/linux/issues/162 [1]
Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [2]
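A sketch of the pattern in the mtk_iommu probe path (the wrapper function
is hypothetical; only the two allocator calls matter):
```
/* Hypothetical wrapper illustrating the before/after allocator call. */
static void *mtk_alloc_protect_region(struct device *dev, size_t align)
{
	/*
	 * Before: open-coded multiplication in the allocator argument,
	 *   devm_kzalloc(dev, align * 2, GFP_KERNEL);
	 * After: devm_kcalloc() does the overflow-checked multiplication.
	 */
	return devm_kcalloc(dev, 2, align, GFP_KERNEL);
}
```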
Signed-off-by: Erick Archer <erick.archer@gmx.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20240211182250.12656-1-erick.archer@gmx.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
On many systems that have an AMD IOMMU the following sequence of
warnings is observed during bootup.
```
pci 0000:00:00.2 can't derive routing for PCI INT A
pci 0000:00:00.2: PCI INT A: not connected
```
This series of events happens because of the IOMMU initialization
sequence order and the lack of _PRT entries for the IOMMU.
During initialization the IOMMU driver first enables the PCI device
using pci_enable_device(). This will call acpi_pci_irq_enable()
which will check if the interrupt is declared in a PCI routing table
(_PRT) entry. According to the ACPI spec [1] these routing entries
are only required under PCI root bridges:
The _PRT object is required under all PCI root bridges
The IOMMU is directly connected to the root complex, so there is no
parent bridge to look for a _PRT entry. The first warning is emitted
since no entry could be found in the hierarchy. The second warning is
then emitted because the interrupt hasn't yet been configured to any
value. The pin was configured in pci_read_irq(), but the byte in
PCI_INTERRUPT_LINE returns 0xff, which means "Unknown".
After that sequence of events pci_enable_msi() is called and this
will allocate an interrupt.
That is, both of these warnings are totally harmless because the IOMMU
uses MSI for interrupts. To avoid even trying to probe for a _PRT
entry, mark the IOMMU as IRQ managed. This avoids both warnings.
Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html?highlight=_prt#prt-pci-routing-table [1]
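A sketch of the fix (the helper wrapper is hypothetical; the point is
flagging the IOMMU's PCI device as IRQ managed before pci_enable_device()
is called):
```
/* Hypothetical helper; in practice the assignment sits in the IOMMU's
 * PCI init path, before pci_enable_device(). */
static void amd_iommu_mark_irq_managed(struct pci_dev *pdev)
{
	/*
	 * The IOMMU only ever signals via MSI, so skip the legacy INTx
	 * routing (_PRT) probe and the warnings it produces.
	 */
	pdev->irq_managed = 1;
}
```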
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Fixes: cffe0a2b5a34 ("x86, irq: Keep balance of IOAPIC pin reference count")
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20240122233400.1802-1-mario.limonciello@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Switch all the users of the platform MSI domain over to invoke the new
interfaces which branch to the original platform MSI functions when the
irqdomain associated to the caller device does not yet provide MSI parent
functionality.
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240127161753.114685-7-apatel@ventanamicro.com
This reverts commit 9b3febc3a3da ("iommu/arm-smmu: Convert to
domain_alloc_paging()"). It breaks the Qualcomm MSM8996 platform. Calling
arm_smmu_write_context_bank() from the new code path results in the
platform being reset because of unclocked hardware access.
Fixes: 9b3febc3a3da ("iommu/arm-smmu: Convert to domain_alloc_paging()")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20240213-iommu-revert-domain-alloc-v1-1-325ff55dece4@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
With v1 page table, the AMD IOMMU spec states that the hardware must use
the domain ID to tag its internal translation caches. I/O devices with
different v1 page tables must be given different domain IDs. I/O devices
that share the same v1 page table __may__ be given the same domain ID.
This domain ID management policy is currently implemented by the AMD
IOMMU driver. In this case, only the domain ID is needed when issuing the
INVALIDATE_IOMMU_PAGES command to invalidate the IOMMU translation cache
(TLB).
With the v2 page table, the hardware uses the domain ID and the PASID as
parameters to tag and issue the INVALIDATE_IOMMU_PAGES command. Since the
GCR3 table is set up per device, there is no guarantee that a PASID is
unique across multiple devices; the same PASID on different devices could
refer to different v2 page tables. In such a case, if multiple devices
share the same domain ID, the IOMMU translation cache for these devices
would be polluted due to TLB aliasing.
Hence, avoid the TLB aliasing issue with the v2 page table by allocating a
unique domain ID for each device, even when multiple devices are sharing
the same v1 page table. Please note that this fix results in multiple
INVALIDATE_IOMMU_PAGES commands (one per domain ID) when unmapping a
translation.
The domain ID can be shared until a device starts using a PASID. We will
enhance this code later to allocate a per-device domain ID only when it is
needed.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-18-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
They have been moved to struct iommu_dev_data, and the driver has been
ported to use them.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-17-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Consolidate all flush-related code in one place so that it is easy
to maintain.
No functional changes intended.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-16-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We have removed the iommu_v2 module and converted the v2 page table to use
the common flush functions. We have also moved the GCR3 table to be
per-device. The PASID-related functions are no longer used, so remove
them.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-15-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Use the new per-device struct gcr3_tbl_info. Use the GFP_KERNEL flag
instead of GFP_ATOMIC for GCR3 table allocation, and modify
set_dte_entry() to use the new per-device GCR3 table.
Also, in the free_gcr3_table() path, replace BUG_ON with WARN_ON_ONCE().
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-14-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Remove the code that sets up the GCR3 table, and only handle domain
create / destroy, since GCR3 is no longer part of a domain.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-13-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
If a domain is configured with the v2 page table, set up the default GCR3
with the domain GCR3 pointer so that all devices in the domain use the
same page table for translation. Also return the page table setup status
from the do_attach() function.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-12-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Refactor the GCR3 helper functions in preparation for using a per-device
GCR3 table.
* Add a new function, update_gcr3, to update the per-device GCR3 table.
* Remove the per-domain default GCR3 setup during v2 page table allocation.
  A subsequent patch will add support for setting up the default GCR3 while
  attaching a device to a domain.
* Remove amd_iommu_domain_update() from the v2 page table path, as the
  device detach path will take care of updating the domain.
* Consolidate the GCR3 table related code in one place so that it is easy
  to maintain.
* Rename functions to reflect their usage.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-11-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Add a function to assert that the iommu group mutex is held, so that device
drivers can rely on the group mutex instead of adding another driver-level
lock before modifying driver-specific per-device data structures.
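A minimal sketch of such a helper, assuming it simply forwards to lockdep
(the name follows the series; the body is illustrative):
```
/* Sketch, as if in drivers/iommu/iommu.c where struct iommu_group is
 * visible. */
void iommu_group_mutex_assert(struct device *dev)
{
	struct iommu_group *group = dev->iommu_group;

	lockdep_assert_held(&group->mutex);
}
EXPORT_SYMBOL_GPL(iommu_group_mutex_assert);
```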
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-10-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Consolidate the GCR3 table related code in one place so that it is easy
to maintain.
Note that this patch doesn't move __set_gcr3/__clear_gcr3. We are moving
the GCR3 table from per-domain to per-device. The following series will
rework these functions and move them at that time.
No functional changes intended.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-9-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Add support to invalidate TLB/IOTLB for the given device.
These functions will be used in subsequent patches where we will
introduce per device GCR3 table and SVA support.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-8-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Page table mode (v1, v2 or pt) is a per-domain property. Recently we have
enhanced protection_domain.pd_mode to track the per-domain page table mode.
Use that variable to check the page table mode instead of the global
'amd_iommu_pgtable' in the {map/unmap}_pages path.
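Illustratively, the check in the map/unmap path now follows the domain that
owns the page table rather than the global default (the helper name is a
sketch, not necessarily the one used by the driver):
```
static inline bool pdom_uses_v2_pgtbl(struct protection_domain *pdom)
{
	return pdom->pd_mode == PD_MODE_V2;	/* not amd_iommu_pgtable */
}
```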
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-7-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The AMD IOMMU GCR3 table is indexed by PASID. Each entry stores a guest CR3
register value, which is the address of the root of the guest IO page
table.
The GCR3 table can be programmed per device. However, the Linux AMD IOMMU
driver currently manages the table on a per-domain basis.
PASID is a device feature. When SVA is enabled, a PASID is bound to the
device, not the domain. Hence it makes sense to have a per-device GCR3
table.
Introduce struct iommu_dev_data.gcr3_tbl_info to keep track of the GCR3
table configuration. This will eventually replace the GCR3-related
variables in the protection_domain structure.
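A sketch of the new per-device tracking; the field set is abbreviated and
illustrative:
```
struct gcr3_tbl_info {
	u64 *gcr3_tbl;	/* root of this device's guest CR3 table */
	int glx;	/* number of levels in the GCR3 table */
	bool giov;	/* whether GIOV is set in the DTE */
};

struct iommu_dev_data {
	/* ...existing members... */
	struct gcr3_tbl_info gcr3_info;	/* replaces the per-domain GCR3 state */
};
```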
Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-6-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This enum variable is used to track the type of page table used by the
protection domain. It will replace the protection_domain.flags in
subsequent series.
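Sketched, the new mode tracking looks like this (names per the series,
treated as illustrative here):
```
enum protection_domain_mode {
	PD_MODE_V1 = 1,		/* host (v1) page table */
	PD_MODE_V2,		/* guest (v2) page table */
};

struct protection_domain {
	/* ...existing members... */
	enum protection_domain_mode pd_mode;	/* will replace the flags checks */
};
```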
Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-5-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Introduce get_amd_iommu_from_dev() and get_amd_iommu_from_dev_data(), and
replace rlookup_amd_iommu() with the new helper functions where applicable
to avoid an unnecessary loop when looking up struct amd_iommu from
struct device.
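A sketch of the new accessors, assuming they reduce to container_of()
arithmetic on the iommu_device embedded in struct amd_iommu (the exact
form in the series may differ):
```
static inline struct amd_iommu *get_amd_iommu_from_dev(struct device *dev)
{
	/* dev->iommu->iommu_dev was recorded at probe time: no rlookup loop. */
	return container_of(dev->iommu->iommu_dev, struct amd_iommu, iommu);
}

static inline struct amd_iommu *
get_amd_iommu_from_dev_data(struct iommu_dev_data *dev_data)
{
	return get_amd_iommu_from_dev(dev_data->dev);
}
```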
Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-4-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The IOMMU Guest Translation (GT) feature needs to be enabled before
invalidating guest translations (CMD_INV_IOMMU_PAGES with GN=1).
Currently the GT feature is enabled after setting up the interrupt handler.
So far this was fine, as we were not invalidating the guest page table
before this point.
An upcoming series will introduce a per-device GCR3 table and will
invalidate guest pages after configuring it. Hence move GT feature
enablement to early_enable_iommu().
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-3-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
.. as IOMMU perf counters are always built as part of the kernel.
No functional change intended.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240118090105.5864-7-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The magazine buffers can take gigabytes of kmem memory, dominating all
other allocations. For observability purposes, create a named slab cache
so the iova magazine memory overhead can be clearly observed.
With this change:
> slabtop -o | head
Active / Total Objects (% used) : 869731 / 952904 (91.3%)
Active / Total Slabs (% used) : 103411 / 103974 (99.5%)
Active / Total Caches (% used) : 135 / 211 (64.0%)
Active / Total Size (% used) : 395389.68K / 411430.20K (96.1%)
Minimum / Average / Maximum Object : 0.02K / 0.43K / 8.00K
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
244412 244239 99% 1.00K 61103 4 244412K iommu_iova_magazine
91636 88343 96% 0.03K 739 124 2956K kmalloc-32
75744 74844 98% 0.12K 2367 32 9468K kernfs_node_cache
On this machine it is now clear that the magazines use 242M of kmem memory.
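A minimal sketch of the change: the magazines come from their own named
cache, so the footprint shows up as "iommu_iova_magazine" above (helper
names follow the existing iova.c code; the cache creation itself lives in
iova_cache_get()):
```
/* Sketch, as if in drivers/iommu/iova.c */
static struct kmem_cache *iova_magazine_cache;

static struct iova_magazine *iova_magazine_alloc(gfp_t flags)
{
	return kmem_cache_zalloc(iova_magazine_cache, flags);
}

static void iova_magazine_free(struct iova_magazine *mag)
{
	kmem_cache_free(iova_magazine_cache, mag);
}
```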
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
[ rm: adjust to rework of iova_cache_{get,put} ]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/dc5c51aaba50906a92b9ba1a5137ed462484a7be.1707144953.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The iova_cache_{get,put}() calls really represent top-level lifecycle
management for the whole IOVA library, so it's long been rather
confusing to have them buried right in the middle of the allocator
implementation details. Move them to a more expected position at the end
of the file, where it will then also be easier to expand them. With
this, we can also move the rcache hotplug handler (plus another stray
function) into the rcache portion of the file.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/d4753562f4faa0e6b3aeebcbf88fdb60cc22d715.1707144953.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Failure handling in iova_cache_get() is a little messy, and we'd like
to add some more to it, so let's tidy up a bit first. By leaving the
hotplug handler until last we can take advantage of kmem_cache_destroy()
being NULL-safe to have a single cleanup label. We can also improve the
error reporting, noting that kmem_cache_create() already screams if it
fails, so that one is redundant.
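A sketch of the tidied shape, assuming the magazine cache from the earlier
change: kmem_cache_destroy() ignores NULL, so every failure can funnel
through one label, and the hotplug registration (elided here) moves to the
very end:
```
/* Sketch, as if in drivers/iommu/iova.c */
static struct kmem_cache *iova_cache;
static struct kmem_cache *iova_magazine_cache;

int iova_cache_get(void)
{
	iova_cache = kmem_cache_create("iommu_iova", sizeof(struct iova),
				       0, SLAB_HWCACHE_ALIGN, NULL);
	if (!iova_cache)
		goto out_err;	/* kmem_cache_create() already screams */

	iova_magazine_cache = kmem_cache_create("iommu_iova_magazine",
						sizeof(struct iova_magazine),
						0, SLAB_HWCACHE_ALIGN, NULL);
	if (!iova_magazine_cache)
		goto out_err;

	/* CPU hotplug handler registration (elided) goes here, last, so it
	 * is the only remaining step that can fail. */
	return 0;

out_err:
	kmem_cache_destroy(iova_cache);			/* NULL-safe */
	kmem_cache_destroy(iova_magazine_cache);	/* NULL-safe */
	return -ENOMEM;
}
```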
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/ae4a3bda2d6a9b738221553c838d30473bd624e7.1707144953.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
These macros are not used after commit 518d9b450387 ("iommu/amd: Remove
special mapping code for dma_ops path").
No functional change intended.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240118090105.5864-4-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
These macros are not used after commit ac6d704679d343 ("iommu/dma: Pass
address limit rather than size to iommu_setup_dma_ops()").
No functional change intended.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240118090105.5864-3-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Commit f366a8dac1b8 ("iommu/amd: Clean up RMP entries for IOMMU pages
during SNP shutdown") leads to the following Smatch static checker warning:
drivers/iommu/amd/init.c:3820 iommu_page_make_shared() error: uninitialized symbol 'assigned'.
Fix it.
[ bp: Address the other error cases too. ]
Fixes: f366a8dac1b8 ("iommu/amd: Clean up RMP entries for IOMMU pages during SNP shutdown")
Closes: https://lore.kernel.org/linux-iommu/1be69f6a-e7e1-45f9-9a74-b2550344f3fd@moroto.mountain
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Joerg Roedel <jroedel@suse.com>
Link: https://lore.kernel.org/lkml/20240126041126.1927228-20-michael.roth@amd.com
In preparation for dropping most of the ARCH_QCOM subtypes, stop limiting
the driver just to those machines. Allow it to be built for any 32-bit
Qualcomm platform (ARCH_QCOM).
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Link: https://lore.kernel.org/r/20231216162700.863456-2-dmitry.baryshkov@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
For small bitmaps that aren't PAGE_SIZE-aligned *and* that are less than
512 pages in bitmap length, use an extra page to be able to cover the
entire range, e.g. [1M..3G], which can then be iterated more efficiently
in a single pass rather than two.
Fixes: b058ea3ab5af ("vfio/iova_bitmap: refactor iova_bitmap_set() to better handle page boundaries")
Link: https://lore.kernel.org/r/20240202133415.23819-10-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Tested-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Add support for mocking IOMMU hugepages of 1M (for a 2K mock IO page size).
To avoid breaking test suite defaults, this is done by explicitly creating
an iommu mock device which has hugepage support (i.e. through
MOCK_FLAGS_DEVICE_HUGE_IOVA).
The same scheme of tracking mock base-page indexes in the XArray is
maintained, except that an extra bit is added to mark an entry as a
hugepage. One subpage with the dirty bit set means that the whole hugepage
is dirty (similar to AMD IOMMU non-standard page sizes). The same applies
to clearing: all dirty subpages must be cleared.
This is in preparation for dirty tracking to mark mock hugepages as
dirty to exercise all the iova-bitmap fixes.
Link: https://lore.kernel.org/r/20240202133415.23819-8-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Move the clearing of the dirty bit of the mock domain into
mock_domain_test_and_clear_dirty() helper, simplifying the caller
function.
Additionally, rework the mock_domain_read_and_clear_dirty() loop to
iterate over a potentially variable IO page size. No functional change
intended with the loop refactor.
This is in preparation for dirty tracking support for IOMMU hugepage mock
domains.
Link: https://lore.kernel.org/r/20240202133415.23819-7-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The IOVA bitmap is a zero-copy scheme for recording dirty bits that
iterates over the bitmap user pages in chunks of at most
PAGE_SIZE/sizeof(struct page *) pages.
When the iterations are split up into 64G chunks, the end of the range may
be broken up in a way that aligns with a non-base-page PTE size. This
leads to only part of the huge page being recorded in the bitmap. Note
that in practice this is only a problem for IOMMU dirty tracking, i.e.
when the backing PTEs are in IOMMU hugepages and the bitmap is at base
page granularity. So far this is not something that affects VF dirty
trackers (which report and record at the same granularity).
To fix that, if there is a remainder of bits left to set in which the
current IOVA bitmap doesn't cover, make a copy of the bitmap structure and
iterate-and-set the rest of the bits remaining. Finally, when advancing
the iterator, skip all the bits that were set ahead.
Link: https://lore.kernel.org/r/20240202133415.23819-5-joao.m.martins@oracle.com
Reported-by: Avihai Horon <avihaih@nvidia.com>
Fixes: f35f22cc760e ("iommu/vt-d: Access/Dirty bit support for SS domains")
Fixes: 421a511a293f ("iommu/amd: Access/Dirty bit support in IOPTEs")
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Tested-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
iova_bitmap_mapped_length() doesn't deal correctly with small bitmaps
(< 2M bitmaps) when the starting address isn't u64-aligned, leading to a
tiny part of the IOVA range being skipped. This materializes as data not
being marked dirty that otherwise should have been.
Fix that by using a u8 * in the internal state of the IOVA bitmap. Most of
the data structures use the type of the bitmap to adjust their indexes;
thus changing the type of the bitmap decreases the granularity of the
bitmap indexes.
Fixes: b058ea3ab5af ("vfio/iova_bitmap: refactor iova_bitmap_set() to better handle page boundaries")
Link: https://lore.kernel.org/r/20240202133415.23819-3-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Tested-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Dirty IOMMU hugepages reported at base page-size granularity can lead to
an attempt to set dirty pages in the bitmap beyond the limits that are
pinned.
Bounds-check that the page index of the array we are trying to access is
within the limits before we kmap(), and return otherwise.
While it is also a defensive check, this is also in preparation to defer
setting bits (outside the mapped range) to the next iteration(s) when the
pages become available.
Fixes: b058ea3ab5af ("vfio/iova_bitmap: refactor iova_bitmap_set() to better handle page boundaries")
Link: https://lore.kernel.org/r/20240202133415.23819-2-joao.m.martins@oracle.com
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Tested-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>