linux

iv/linux

Author	SHA1	Message	Date
Joerg Roedel	6a7086431f	Merge branches 'iommu/fixes', 'arm/rockchip', 'arm/renesas', 'arm/smmu', 'arm/core', 'x86/vt-d', 'x86/amd', 's390' and 'core' into next	2017-06-28 14:45:02 +02:00
Suravee Suthikulpanit	84a21dbdef	iommu/amd: Fix interrupt remapping when disable guest_mode Pass-through devices to VM guest can get updated IRQ affinity information via irq_set_affinity() when not running in guest mode. Currently, AMD IOMMU driver in GA mode ignores the updated information if the pass-through device is setup to use vAPIC regardless of guest_mode. This could cause invalid interrupt remapping. Also, the guest_mode bit should be set and cleared only when SVM updates posted-interrupt interrupt remapping information. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Joerg Roedel <jroedel@suse.de> Fixes: d98de49a53e48 ('iommu/amd: Enable vAPIC interrupt remapping mode by default') Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 14:44:56 +02:00
Arvind Yadav	01e1932a17	iommu/vt-d: Constify intel_dma_ops Most dma_map_ops structures are never modified. Constify these structures such that these can be write-protected. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 14:17:12 +02:00
Joerg Roedel	72dcac6334	iommu: Warn once when device_group callback returns NULL This callback should never return NULL. Print a warning if that happens so that we notice and can fix it. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 13:29:46 +02:00
Joerg Roedel	8faf5e5a12	iommu/omap: Return ERR_PTR in device_group call-back Make sure that the device_group callback returns an ERR_PTR instead of NULL. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 13:29:45 +02:00
Joerg Roedel	7f7a2304aa	iommu: Return ERR_PTR() values from device_group call-backs The generic device_group call-backs in iommu.c return NULL in case of error. Since they are getting ERR_PTR values from iommu_group_alloc(), just pass them up instead. Reported-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 13:29:45 +02:00
Joerg Roedel	0929deca40	iommu/s390: Use iommu_group_get_for_dev() in s390_iommu_add_device() The iommu_group_get_for_dev() function also attaches the device to its group, so this code doesn't need to be in the iommu driver. Further by using this function the driver can make use of default domains in the future. Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 12:29:00 +02:00
Sebastian Andrzej Siewior	58c4a95f90	iommu/vt-d: Don't disable preemption while accessing deferred_flush() get_cpu() disables preemption and returns the current CPU number. The CPU number is only used once while retrieving the address of the local's CPU deferred_flush pointer. We can instead use raw_cpu_ptr() while we remain preemptible. The worst thing that can happen is that flush_unmaps_timeout() is invoked multiple times: once by taskA after seeing HIGH_WATER_MARK and then preempted to another CPU and then by taskB which saw HIGH_WATER_MARK on the same CPU as taskA. It is also likely that ->size got from HIGH_WATER_MARK to 0 right after its read because another CPU invoked flush_unmaps_timeout() for this CPU. The access to flush_data is protected by a spinlock so even if we get migrated to another CPU or preempted - the data structure is protected. While at it, I marked deferred_flush static since I can't find a reference to it outside of this file. Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: iommu@lists.linux-foundation.org Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 12:24:40 +02:00
Sebastian Andrzej Siewior	aaffaa8a3b	iommu/iova: Don't disable preempt around this_cpu_ptr() Commit 583248e6620a ("iommu/iova: Disable preemption around use of this_cpu_ptr()") disables preemption while accessing a per-CPU variable. This does keep lockdep quiet. However I don't see the point why it is bad if we get migrated after its access to another CPU. __iova_rcache_insert() and __iova_rcache_get() immediately locks the variable after obtaining it - before accessing its members. _If_ we get migrated away after retrieving the address of cpu_rcache before taking the lock then the other task on the same CPU will retrieve the same address of cpu_rcache and will spin on the lock. alloc_iova_fast() disables preemption while invoking free_cpu_cached_iovas() on each CPU. The function itself uses per_cpu_ptr() which does not trigger a warning (like this_cpu_ptr() does). It _could_ make sense to use get_online_cpus() instead but the we have a hotplug notifier for CPU down (and none for up) so we are good. Cc: Joerg Roedel <joro@8bytes.org> Cc: iommu@lists.linux-foundation.org Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-28 12:23:01 +02:00
Joerg Roedel	0b25635bd4	Merge branch 'for-joerg/arm-smmu/updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu	2017-06-28 10:42:54 +02:00
Geetha Sowjanya	f935448acf	iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126 Cavium ThunderX2 SMMU doesn't support MSI and also doesn't have unique irq lines for gerror, eventq and cmdq-sync. New named irq "combined" is set as a errata workaround, which allows to share the irq line by register single irq handler for all the interrupts. Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Geetha sowjanya <gakula@caviumnetworks.com> [will: reworked irq equality checking and added SPI check] Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:04 +01:00
shameer	99caf177f6	iommu/arm-smmu-v3: Enable ACPI based HiSilicon CMD_PREFETCH quirk(erratum 161010701) HiSilicon SMMUv3 on Hip06/Hip07 platforms doesn't support CMD_PREFETCH command. The dt based support for this quirk is already present in the driver(hisilicon,broken-prefetch-cmd). This adds ACPI support for the quirk using the IORT smmu model number. Signed-off-by: shameer <shameerali.kolothum.thodi@huawei.com> Signed-off-by: hanjun <guohanjun@huawei.com> [will: rewrote patch] Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:04 +01:00
Linu Cherian	e5b829de05	iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74 Cavium ThunderX2 SMMU implementation doesn't support page 1 register space and PAGE0_REGS_ONLY option is enabled as an errata workaround. This option when turned on, replaces all page 1 offsets used for EVTQ_PROD/CONS, PRIQ_PROD/CONS register access with page 0 offsets. SMMU resource size checks are now based on SMMU option PAGE0_REGS_ONLY, since resource size can be either 64k/128k. For this, arm_smmu_device_dt_probe/acpi_probe has been moved before platform_get_resource call, so that SMMU options are set beforehand. Signed-off-by: Linu Cherian <linu.cherian@cavium.com> Signed-off-by: Geetha Sowjanya <geethasowjanya.akula@cavium.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:03 +01:00
Robert Richter	12275bf0a4	iommu/arm-smmu-v3, acpi: Add temporary Cavium SMMU-V3 IORT model number definitions The model number is already defined in acpica and we are actually waiting for the acpi maintainers to include it: https://github.com/acpica/acpica/commit/d00a4eb86e64 Adding those temporary definitions until the change makes it into include/acpi/actbl2.h. Once that is done this patch can be reverted. Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:03 +01:00
Will Deacon	77f3445866	iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table When writing a new table entry, we must ensure that the contents of the table is made visible to the SMMU page table walker before the updated table entry itself. This is currently achieved using wmb(), which expands to an expensive and unnecessary DSB instruction. Ideally, we'd just use cmpxchg64_release when writing the table entry, but this doesn't have memory ordering semantics on !SMP systems. Instead, use dma_wmb(), which emits DMB OSHST. Strictly speaking, this does more than we require (since it targets the outer-shareable domain), but it's likely to be significantly faster than the DSB approach. Reported-by: Linu Cherian <linu.cherian@cavium.com> Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:02 +01:00
Will Deacon	c1004803b4	iommu/io-pgtable: depend on !GENERIC_ATOMIC64 when using COMPILE_TEST with LPAE The LPAE/ARMv8 page table format relies on the ability to read and write 64-bit page table entries in an atomic fashion. With the move to a lockless implementation, we also need support for cmpxchg64 to resolve races when installing table entries concurrently. Unfortunately, not all architectures support cmpxchg64, so the code can fail to compiler when building for these architectures using COMPILE_TEST. Rather than disable COMPILE_TEST altogether, instead check that GENERIC_ATOMIC64 is not selected, which is a reasonable indication that the architecture has support for 64-bit cmpxchg. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:02 +01:00
Robin Murphy	58188afeb7	iommu/arm-smmu-v3: Remove io-pgtable spinlock As for SMMUv2, take advantage of io-pgtable's newfound tolerance for concurrency. Unfortunately in this case the command queue lock remains a point of serialisation for the unmap path, but there may be a little more we can do to ameliorate that in future. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:01 +01:00
Robin Murphy	523d7423e2	iommu/arm-smmu: Remove io-pgtable spinlock With the io-pgtable code now robust against (valid) races, we no longer need to serialise all operations with a lock. This might make broken callers who issue concurrent operations on overlapping addresses go even more wrong than before, but hey, they already had little hope of useful or deterministic results. We do however still have to keep a lock around to serialise the ATS1* translation ops, as parallel iova_to_phys() calls could lead to unpredictable hardware behaviour otherwise. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:01 +01:00
Robin Murphy	119ff305b0	iommu/io-pgtable-arm-v7s: Support lockless operation Mirroring the LPAE implementation, rework the v7s code to be robust against concurrent operations. The same two potential races exist, and are solved in the same manner, with the fixed 2-level structure making life ever so slightly simpler. What complicates matters compared to LPAE, however, is large page entries, since we can't update a block of 16 PTEs atomically, nor assume available software bits to do clever things with. As most users are never likely to do partial unmaps anyway (due to DMA API rules), it doesn't seem unreasonable for this case to remain behind a serialising lock; we just pull said lock down into the bowels of the implementation so it's well out of the way of the normal call paths. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:00 +01:00
Robin Murphy	2c3d273eab	iommu/io-pgtable-arm: Support lockless operation For parallel I/O with multiple concurrent threads servicing the same device (or devices, if several share a domain), serialising page table updates becomes a massive bottleneck. On reflection, though, we don't strictly need to do that - for valid IOMMU API usage, there are in fact only two races that we need to guard against: multiple map requests for different blocks within the same region, when the intermediate-level table for that region does not yet exist; and multiple unmaps of different parts of the same block entry. Both of those are fairly easily solved by using a cmpxchg to install the new table, such that if we then find that someone else's table got there first, we can simply free ours and continue. Make the requisite changes such that we can withstand being called without the caller maintaining a lock. In theory, this opens up a few corners in which wildly misbehaving callers making nonsensical overlapping requests might lead to crashes instead of just unpredictable results, but correct code really does not deserve to pay a significant performance cost for the sake of masking bugs in theoretical broken code. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:00 +01:00
Robin Murphy	81b3c25218	iommu/io-pgtable: Introduce explicit coherency Once we remove the serialising spinlock, a potential race opens up for non-coherent IOMMUs whereby a caller of .map() can be sure that cache maintenance has been performed on their new PTE, but will have no guarantee that such maintenance for table entries above it has actually completed (e.g. if another CPU took an interrupt immediately after writing the table entry, but before initiating the DMA sync). Handling this race safely will add some potentially non-trivial overhead to installing a table entry, which we would much rather avoid on coherent systems where it will be unnecessary, and where we are stirivng to minimise latency by removing the locking in the first place. To that end, let's introduce an explicit notion of cache-coherency to io-pgtable, such that we will be able to avoid penalising IOMMUs which know enough to know when they are coherent. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:58:00 +01:00
Robin Murphy	b9f1ef30ac	iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap Whilst the short-descriptor format's split_blk_unmap implementation has no need to be recursive, it followed the pattern of the LPAE version anyway for the sake of consistency. With the latter now reworked for both efficiency and future scalability improvements, tweak the former similarly, not least to make it less obtuse. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:57:59 +01:00
Robin Murphy	fb3a95795d	iommu/io-pgtable-arm: Improve split_blk_unmap The current split_blk_unmap implementation suffers from some inscrutable pointer trickery for creating the tables to replace the block entry, but more than that it also suffers from hideous inefficiency. For example, the most pathological case of unmapping a level 3 page from a level 1 block will allocate 513 lower-level tables to remap the entire block at page granularity, when only 2 are actually needed (the rest can be covered by level 2 block entries). Also, we would like to be able to relax the spinlock requirement in future, for which the roll-back-and-try-again logic for race resolution would be pretty hideous under the current paradigm. Both issues can be resolved most neatly by turning things sideways: instead of repeatedly recursing into __arm_lpae_map() map to build up an entire new sub-table depth-first, we can directly replace the block entry with a next-level table of block/page entries, then repeat by unmapping at the next level if necessary. With a little refactoring of some helper functions, the code ends up not much bigger than before, but considerably easier to follow and to adapt in future. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:57:59 +01:00
Robin Murphy	9db829d281	iommu/io-pgtable-arm-v7s: Check table PTEs more precisely Whilst we don't support the PXN bit at all, so should never encounter a level 1 section or supersection PTE with it set, it would still be wise to check both table type bits to resolve any theoretical ambiguity. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:57:58 +01:00
Arvind Yadav	5c2d021829	iommu: arm-smmu: Handle return of iommu_device_register. iommu_device_register returns an error code and, although it currently never fails, we should check its return value anyway. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> [will: adjusted to follow arm-smmu.c] Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:57:58 +01:00
Arvind Yadav	ebdd13c93f	iommu: arm-smmu-v3: make of_device_ids const of_device_ids are not supposed to change at runtime. All functions working with of_device_ids provided by <linux/of.h> work with const of_device_ids. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:57:57 +01:00
Robin Murphy	84c24379a7	iommu/arm-smmu: Plumb in new ACPI identifiers Revision C of IORT now allows us to identify ARM MMU-401 and the Cavium ThunderX implementation. Wire them up so that we can probe these models once firmware starts using the new codes in place of generic ones, and so that the appropriate features and quirks get enabled when we do. For the sake of backports and mitigating sychronisation problems with the ACPICA headers, we'll carry a backup copy of the new definitions locally for the short term to make life simpler. CC: stable@vger.kernel.org # 4.10 Acked-by: Robert Richter <rrichter@cavium.com> Tested-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:57:57 +01:00
Arvind Yadav	60ab7a75c8	iommu/io-pgtable-arm-v7s: constify dummy_tlb_ops. File size before: text data bss dec hex filename 6146 56 9 6211 1843 drivers/iommu/io-pgtable-arm-v7s.o File size After adding 'const': text data bss dec hex filename 6170 24 9 6203 183b drivers/iommu/io-pgtable-arm-v7s.o Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:57:57 +01:00
Sunil Goutham	b847de4e50	iommu/arm-smmu-v3: Increase CMDQ drain timeout value Waiting for a CMD_SYNC to be processed involves waiting for the command queue to drain, which can take an awful lot longer than waiting for a single entry to become available. Consequently, the common timeout value of 100us has been observed to be too short on some platforms when a CMD_SYNC is issued into a queued full of TLBI commands. This patch resolves the issue by using a different (1s) timeout when waiting for the CMDQ to drain and using a simple back-off mechanism when polling the cons pointer in the absence of WFE support. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> [will: rewrote commit message and cosmetic changes] Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-23 17:57:56 +01:00
Joerg Roedel	9ce3a72cd7	iommu/amd: Free already flushed ring-buffer entries before full-check To benefit from IOTLB flushes on other CPUs we have to free the already flushed IOVAs from the ring-buffer before we do the queue_ring_full() check. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-22 12:54:22 +02:00
Joerg Roedel	ffa080ebb5	iommu/amd: Remove amd_iommu_disabled check from amd_iommu_detect() This check needs to happens later now, when all previously enabled IOMMUs have been disabled. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-22 12:54:21 +02:00
Joerg Roedel	7ad820e433	iommu/amd: Free IOMMU resources when disabled on command line After we made sure that all IOMMUs have been disabled we need to make sure that all resources we allocated are released again. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-22 12:54:21 +02:00
Joerg Roedel	f601927136	iommu/amd: Set global pointers to NULL after freeing them Avoid any tries to double-free these pointers. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-22 12:54:20 +02:00
Joerg Roedel	151b09031a	iommu/amd: Check for error states first in iommu_go_to_state() Check if we are in an error state already before calling into state_next(). Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-22 12:54:20 +02:00
Joerg Roedel	1b1e942e34	iommu/amd: Add new init-state IOMMU_CMDLINE_DISABLED This will be used when during initialization we detect that the iommu should be disabled. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-22 12:54:20 +02:00
Joerg Roedel	90b3eb03e1	iommu/amd: Rename free_on_init_error() The function will also be used to free iommu resources when amd_iommu=off was specified on the kernel command line. So rename the function to reflect that. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-22 12:54:20 +02:00
Joerg Roedel	1112374153	iommu/amd: Disable IOMMUs at boot if they are enabled When booting, make sure the IOMMUs are disabled. They could be previously enabled if we boot into a kexec or kdump kernel. So make sure they are off. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-22 12:54:19 +02:00
Joerg Roedel	54bd635704	iommu/amd: Suppress IO_PAGE_FAULTs in kdump kernel When booting into a kdump kernel, suppress IO_PAGE_FAULTs by default for all devices. But allow the faults again when a domain is assigned to a device. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-16 10:21:05 +02:00
Joerg Roedel	ac3b708ad4	iommu/amd: Remove queue_release() function We can use queue_ring_free_flushed() instead, so remove this redundancy. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-08 14:39:57 +02:00
Joerg Roedel	fca6af6a59	iommu/amd: Add per-domain timer to flush per-cpu queues Add a timer to each dma_ops domain so that we flush unused IOTLB entries regularily, even if the queues don't get full all the time. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-08 14:39:51 +02:00
Joerg Roedel	a6e3f6f030	iommu/amd: Add flush counters to struct dma_ops_domain The counters are increased every time the TLB for a given domain is flushed. We also store the current value of that counter into newly added entries of the flush-queue, so that we can tell whether this entry is already flushed. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-08 14:39:26 +02:00
Joerg Roedel	e241f8e76c	iommu/amd: Add locking to per-domain flush-queue With locking we can safely access the flush-queues of other cpus. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-08 14:39:16 +02:00
Joerg Roedel	fd62190a67	iommu/amd: Make use of the per-domain flush queue Fill the flush-queue on unmap and only flush the IOMMU and device TLBs when a per-cpu queue gets full. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-08 14:39:10 +02:00
Joerg Roedel	d4241a2761	iommu/amd: Add per-domain flush-queue data structures Make the flush-queue per dma-ops domain and add code allocate and free the flush-queues; Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-08 14:39:05 +02:00
Joerg Roedel	460c26d05b	iommu/amd: Rip out old queue flushing code The queue flushing is pretty inefficient when it flushes the queues for all cpus at once. Further it flushes all domains from all IOMMUs for all CPUs, which is overkill as well. Rip it out to make room for something more efficient. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-08 14:38:57 +02:00
Tom Lendacky	23e967e17c	iommu/amd: Reduce delay waiting for command buffer space Currently if there is no room to add a command to the command buffer, the driver performs a "completion wait" which only returns when all commands on the queue have been processed. There is no need to wait for the entire command queue to be executed before adding the next command. Update the driver to perform the same udelay() loop that the "completion wait" performs, but instead re-read the head pointer to determine if sufficient space is available. The very first time it is found that there is no space available, the udelay() will be skipped to immediately perform the opportunistic read of the head pointer. If it is still found that there is not sufficient space, then the udelay() will be performed. Signed-off-by: Leo Duran <leo.duran@amd.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-08 14:31:03 +02:00
Tom Lendacky	d334a5637d	iommu/amd: Reduce amount of MMIO when submitting commands As newer, higher speed devices are developed, perf data shows that the amount of MMIO that is performed when submitting commands to the IOMMU causes performance issues. Currently, the command submission path reads the command buffer head and tail pointers and then writes the tail pointer once the command is ready. The tail pointer is only ever updated by the driver so it can be tracked by the driver without having to read it from the hardware. The head pointer is updated by the hardware, but can be read opportunistically. Reading the head pointer only when it appears that there might not be room in the command buffer and then re-checking the available space reduces the number of times the head pointer has to be read. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-06-08 14:31:03 +02:00
Tobias Klauser	e2f9d45fb4	iommu/amd: Constify irq_domain_ops struct irq_domain_ops is not modified, so it can be made const. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-05-30 11:45:07 +02:00
Joerg Roedel	30bf2df65b	iommu/amd: Ratelimit io-page-faults per device Misbehaving devices can cause an endless chain of io-page-faults, flooding dmesg and making the system-log unusable or even prevent the system from booting. So ratelimit the error messages about io-page-faults on a per-device basis. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-05-30 11:44:19 +02:00
Tobias Klauser	71bb620df6	iommu/vt-d: Constify irq_domain_ops struct irq_domain_ops is not modified, so it can be made const. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-05-30 11:41:32 +02:00

1 2 3 4 5 ...

1838 Commits