It's not only for PCI devices any more, and the scope information for an
ACPI device provides the bus and devfn, so those have to be stored here too.
It is the device pointer itself which needs to be protected with RCU,
so the __rcu annotation follows it into the definition of struct
dmar_dev_scope, since we're no longer just passing arrays of device
pointers around.
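For reference, the resulting structure is essentially this (a minimal
sketch matching the description above):

    struct dmar_dev_scope {
        struct device __rcu *dev;   /* RCU-protected device pointer */
        u8 bus;                     /* bus/devfn kept for ACPI scope entries */
        u8 devfn;
    };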
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
In commit 2e12bc29 ("intel-iommu: Default to non-coherent for domains
unattached to iommus") we decided to err on the side of caution and
always assume that it's possible that a device will be attached which is
behind a non-coherent IOMMU.
In some cases, however, that just *cannot* happen. If there *are* no
IOMMUs in the system which are non-coherent, then we don't need to do
it. And flushing the dcache is a *significant* performance hit.
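A sketch of the relaxed logic (iommu_count/iommu_coherency are fields of
struct dmar_domain; the helper is hypothetical):

    if (domain->iommu_count)
        coherent = domain->iommu_coherency;
    else
        /* Unattached domain: be pessimistic only if some IOMMU in
         * the system really is non-coherent. */
        coherent = !any_noncoherent_iommu_present();  /* hypothetical */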
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
There is a race condition between the existing clear/free code and the
hardware. The IOMMU is actually permitted to cache the intermediate
levels of the page tables, and doesn't need to walk the table from the
very top of the PGD each time. So the existing back-to-back calls to
dma_pte_clear_range() and dma_pte_free_pagetable() can lead to a
use-after-free where the IOMMU reads from a freed page table.
When freeing page tables we actually need to do the IOTLB flush, with
the 'invalidation hint' bit clear to indicate that it's not just a
leaf-node flush, after unlinking each page table page from the next level
up but before actually freeing it.
So in the rewritten domain_unmap() we just return a list of pages (using
pg->freelist to make a list of them), and then the caller is expected to
do the appropriate IOTLB flush (or tear down the domain completely,
whatever), before finally calling dma_free_pagelist() to free the pages.
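A sketch of the caller-side pattern (domain_unmap() and
dma_free_pagelist() as described above; the flush call is abridged):

    struct page *freelist;

    /* Unlink the page-table pages, collecting them via pg->freelist */
    freelist = domain_unmap(domain, start_pfn, last_pfn);

    /* IOTLB flush with the invalidation hint clear (ih = 0), so the
     * IOMMU's cached non-leaf entries are purged too */
    iommu_flush_iotlb_psi(iommu, domain_id, start_pfn, npages, 0);

    /* Only now is it safe to hand the pages back */
    dma_free_pagelist(freelist);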
As an added bonus, we no longer need to flush the CPU's data cache for
pages which are about to be *removed* from the page table hierarchy anyway,
in the non-cache-coherent case. This drastically improves the performance
of large unmaps.
As a side-effect of all these changes, this also fixes the fact that
intel_iommu_unmap() was neglecting to free the page tables for the range
in question after clearing them.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
We have this horrid API where iommu_unmap() can unmap more than it's asked
to, if the IOVA in question happens to be mapped with a large page.
Instead of propagating this nonsense to the point where we end up returning
the page order from dma_pte_clear_range(), let's just do it once and adjust
the 'size' parameter accordingly.
Augment pfn_to_dma_pte() to return the level at which the PTE was found,
which will also be useful later if we end up changing the API for
iommu_iova_to_phys() to behave the same way as is being discussed upstream.
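Inside intel_iommu_unmap() the adjustment then looks roughly like this
(a sketch; level_to_offset_bits() is the driver's existing
level-to-address-bits helper):

    struct dma_pte *pte;
    int level = 0;

    pte = pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, &level);

    /* If the IOVA sits in a large page, widen 'size' to cover the
     * whole mapping before clearing -- once, right here. */
    if (pte && size < VTD_PAGE_SIZE << level_to_offset_bits(level))
        size = VTD_PAGE_SIZE << level_to_offset_bits(level);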
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The iopte_free() function should check for NULL, because
kmem_cache_free() will panic on a NULL argument.
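The fix is simply to guard the free (sketch):

    static void iopte_free(u32 *iopte)
    {
        /* kmem_cache_free() is not NULL-safe, unlike kfree() */
        if (iopte)
            kmem_cache_free(iopte_cachep, iopte);
    }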
Signed-off-by: Zhouyi Zhou <yizhouzhou@ict.ac.cn>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Now that we have a PCI bus notification based mechanism to update the
DMAR device scope array, we can extend it to support boot-time
initialization too, which helps to unify and simplify the
implementation.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
The current Intel DMAR/IOMMU driver assumes that all PCI devices associated
with DMAR/RMRR/ATSR device scope arrays are created at boot time and
won't change at runtime, so it caches pointers to the associated PCI
device objects. That assumption may now be wrong due to:
1) introduction of PCI host bridge hotplug
2) PCI device hotplug through sysfs interfaces.
Wang Yijing has tried to solve this issue by caching the <bus, dev, func>
tuple instead of the PCI device object pointer, but that's still
unreliable because the PCI bus number may change in case of hotplug.
Please refer to http://lkml.org/lkml/2013/11/5/64
Message from Yijing's mail:
after remove and rescan a pci device
[ 611.857095] dmar: DRHD: handling fault status reg 2
[ 611.857109] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr ffff7000
[ 611.857109] DMAR:[fault reason 02] Present bit in context entry is clear
[ 611.857524] dmar: DRHD: handling fault status reg 102
[ 611.857534] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr ffff6000
[ 611.857534] DMAR:[fault reason 02] Present bit in context entry is clear
[ 611.857936] dmar: DRHD: handling fault status reg 202
[ 611.857947] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr ffff5000
[ 611.857947] DMAR:[fault reason 02] Present bit in context entry is clear
[ 611.858351] dmar: DRHD: handling fault status reg 302
[ 611.858362] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr ffff4000
[ 611.858362] DMAR:[fault reason 02] Present bit in context entry is clear
[ 611.860819] IPv6: ADDRCONF(NETDEV_UP): eth3: link is not ready
[ 611.860983] dmar: DRHD: handling fault status reg 402
[ 611.860995] dmar: INTR-REMAP: Request device [[86:00.3] fault index a4
[ 611.860995] INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear
This patch introduces a new mechanism to update the DRHD/RMRR/ATSR device scope
caches by hooking PCI bus notifications.
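A sketch of the hook (the handler shape is the standard notifier
pattern; the two scope-update helpers are hypothetical):

    static int dmar_pci_bus_notifier(struct notifier_block *nb,
                                     unsigned long action, void *data)
    {
        struct pci_dev *pdev = to_pci_dev(data);

        if (action == BUS_NOTIFY_ADD_DEVICE)
            dmar_scope_add_dev(pdev);   /* hypothetical */
        else if (action == BUS_NOTIFY_DEL_DEVICE)
            dmar_scope_del_dev(pdev);   /* hypothetical */

        return NOTIFY_OK;
    }

    static struct notifier_block dmar_pci_bus_nb = {
        .notifier_call = dmar_pci_bus_notifier,
    };

    /* registered once via bus_register_notifier(&pci_bus_type, &dmar_pci_bus_nb) */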
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Global DMA and interrupt remapping resources may be accessed in
interrupt context, so use RCU instead of rwsem to protect them
in such cases.
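The reader side in interrupt context then becomes lockless (sketch;
iterator name from the DMAR code):

    rcu_read_lock();
    for_each_iommu(iommu, drhd) {
        /* dereference RCU-protected scope data without sleeping */
    }
    rcu_read_unlock();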
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Introduce a global rwsem dmar_global_lock, which will be used to
protect DMAR related global data structures from DMAR/PCI/memory
device hotplug operations in process context.
DMA and interrupt remapping related data structures are read-mostly,
and only change when a memory/PCI/DMAR hotplug event happens.
So a global rwsem solution is adopted as a balance between simplicity
and performance.
For the interrupt remapping driver, functions such as
intel_irq_remapping_supported(), dmar_table_init(),
intel_enable_irq_remapping(), disable_irq_remapping(),
reenable_irq_remapping() and enable_drhd_fault_handling() are called
during boot, suspend and resume with interrupts disabled, so there is
no need to take the global lock there.
For interrupt remapping entry allocation, the locking model is:
    down_read(&dmar_global_lock);
    /* Find corresponding iommu */
    iommu = map_hpet_to_ir(id);
    if (iommu)
        /*
         * Allocate remapping entry and mark entry busy,
         * the IOMMU won't be hot-removed until the
         * allocated entry has been released.
         */
        index = alloc_irte(iommu, irq, 1);
    up_read(&dmar_global_lock);
For the DMA remapping driver, we only use the dmar_global_lock rwsem to
protect functions which are called in process context. Functions which
may be called in interrupt context will be protected with RCU in the
following patches.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Introduce for_each_dev_scope()/for_each_active_dev_scope() to walk
all (or only active) device scope entries. This will help the following
RCU-related patches.
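Typical usage (a sketch; the iterator yields the index and the
dereferenced device pointer):

    struct device *tmp;
    int i;

    for_each_active_dev_scope(drhd->devices, drhd->devices_cnt, i, tmp)
        if (tmp == dev)
            return true;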
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Reduce duplicated code for handling virtual machine domains; there are
no functional changes. This also improves code readability.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Enhance get_domain_for_dev() to release the allocated resources if it
fails to create a domain for a PCIe endpoint; otherwise the allocated
resources are leaked.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Function get_domain_for_dev() is a little complex; simplify it
by factoring out dmar_search_domain_by_dev_info() and
dmar_insert_dev_info().
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Move private structures and variables into intel-iommu.c, which will
help to simplify locking policy for hotplug. Also delete redundant
declarations.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Factor out function dmar_alloc_dev_scope() from dmar_parse_dev_scope()
for later reuse.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Function device_notifier() in intel-iommu.c only removes the
device_domain_info data structure associated with a PCI device when
handling PCI device driver unbinding events. If a PCI device has never
been bound to a PCI device driver, there will be no
BUS_NOTIFY_UNBOUND_DRIVER event when hot-removing the PCI device, so the
associated device_domain_info data structure may get lost.
On the other hand, if iommu_pass_through is enabled,
iommu_prepare_static_identity_mapping() will create a device_domain_info
data structure for each PCIe-to-PCIe bridge and PCIe endpoint, whether
or not drivers are bound to those PCIe devices. So those
device_domain_info data structures will get lost when hot-removing the
associated PCIe devices if they have never been bound to any PCI device
driver.
Worse, this is not only a memory leak but also a bug that caches stale
information, because the structures are kept on the device_domain_list
and domain->devices lists.
Fix the bug by also removing the device_domain_info data structure when
handling the BUS_NOTIFY_DEL_DEVICE event.
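A sketch of the widened check in device_notifier() (simplified;
find_domain() and domain_remove_one_dev_info() are the driver's
existing helpers):

    struct pci_dev *pdev = to_pci_dev(data);
    struct dmar_domain *domain;

    /* Previously only BUS_NOTIFY_UNBOUND_DRIVER was handled, which
     * never fires for driverless devices. */
    if (action != BUS_NOTIFY_UNBOUND_DRIVER &&
        action != BUS_NOTIFY_DEL_DEVICE)
        return 0;

    domain = find_domain(pdev);
    if (!domain)
        return 0;

    domain_remove_one_dev_info(domain, pdev);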
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Function device_notifier() in intel-iommu.c fails to remove
device_domain_info data structures for PCI devices if they are
associated with si_domain because iommu_no_mapping() returns true
for those PCI devices. This causes a memory leak and the caching of
stale information on the domain->devices list.
Fix the issue by not calling iommu_no_mapping() and by skipping the
iommu_pass_through check.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
The OMAP IOMMU driver locates the IOMMU associated with a device using the
IOMMU name stored in the device archdata iommu field. That field is
expected to be populated by platform code and is left unset for DT-based
devices. This results in a crash when the IOMMU driver attaches a domain
to a device.
Fix this by allocating the archdata iommu structure when devices are
added and freeing it when they are removed. Devices without an OF node, and
devices without an iommus property in their OF node are ignored. The
iommu name is initialized from the IOMMU device node name.
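A sketch of the add-device path (close to, but not necessarily
identical to, the final code; error handling simplified):

    static int omap_iommu_add_device(struct device *dev)
    {
        struct omap_iommu_arch_data *arch_data;
        struct platform_device *pdev;
        struct device_node *np;

        if (!dev->of_node)
            return 0;

        np = of_parse_phandle(dev->of_node, "iommus", 0);
        if (!np)
            return 0;   /* no iommus property: ignore */

        pdev = of_find_device_by_node(np);
        if (WARN_ON(!pdev)) {
            of_node_put(np);
            return -EINVAL;
        }

        arch_data = kzalloc(sizeof(*arch_data), GFP_KERNEL);
        if (!arch_data) {
            of_node_put(np);
            return -ENOMEM;
        }

        arch_data->name = kstrdup(dev_name(&pdev->dev), GFP_KERNEL);
        dev->archdata.iommu = arch_data;

        of_node_put(np);
        return 0;
    }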
This should be simplified when removing non-DT support completely from
the IOMMU users as the IOMMU name won't be needed anymore, and the
IOMMU device pointer could then be stored in the archdata iommu field
directly.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
[s-anna@ti.com: updated to use device name instead of OF name]
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
The remoteproc MMUs in OMAP4+ SoCs have some additional debug
registers that can give out the PC value in addition to the
MMU fault address. The PC value can be extracted properly only
on the DSP cores, and is not available on the ARM processors
within the IPU sub-systems. Instead, the MMUs have been enhanced
to throw a bus-error response back to the IPU processors.
This functionality is programmable through the MMU_GP_REG register.
The cores are simply stalled if the MMU_GP_REG.BUS_ERR_BACK_EN bit
is not set. When it is set, a bus-error exception is raised, allowing the
processor to handle it as a bus fault and provide additional debug
information. This feature is turned on by default by the driver on
IOMMUs supporting it.
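The enable path then boils down to a single register write (sketch; the
register offset and bit value here are assumptions for illustration):

    #define MMU_GP_REG                  0x88      /* offset: assumption */
    #define MMU_GP_REG_BUS_ERR_BACK_EN  BIT(31)   /* bit: assumption */

    if (obj->has_bus_err_back)
        iommu_write_reg(obj, MMU_GP_REG_BUS_ERR_BACK_EN, MMU_GP_REG);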
Signed-off-by: Subramaniam Chanderashekarapuram <subramaniam.ca@ti.com>
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
As OMAP2+ is moving to a full DT boot for all SoC families, commit
7ce93f3 ("ARM: OMAP2+: Fix more missing data for omap3.dtsi file")
adds basic DT bits for OMAP3. But the driver is not yet converted,
so this will not work and the driver will not be probed. Convert it!
The legacy boot mode is still supported until OMAP3 is converted
to DT-only boot.
Signed-off-by: Florian Vaussard <florian.vaussard@epfl.ch>
[s-anna@ti.com: dev_name adaptation and improved error checking]
Signed-off-by: Suman Anna <s-anna@ti.com>
[tony@atomide.com: Ack for arch/arm/*omap* parts]
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
When booting with a devicetree, no platform data is provided.
Do not prematurely exit iommu_enable() and iommu_disable() in
such a case.
Note: as OMAP does not yet have a proper reset controller driver,
IOMMUs requiring a reset signal should use pdata-quirks as a
transitional solution.
Signed-off-by: Florian Vaussard <florian.vaussard@epfl.ch>
Signed-off-by: Suman Anna <s-anna@ti.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
There are a couple of issues with the error return paths in
omap_iommu_attach():
1. omap_iommu_attach() returns NULL or an ERR_PTR in case of error,
but omap_iommu_attach_dev() only checks with IS_ERR(). Thus a NULL
return value (in case driver_find_device() fails) will cause the
kernel to panic when omap_iommu_attach_dev() dereferences the
pointer.
2. A try_module_get() failure returns the (valid) success value
previously returned by iommu_enable(), instead of an error.
Both the above issues have been fixed up to return the proper
ERR_PTR.
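A sketch of the two fixes (simplified; the match callback name is
illustrative):

    /* 1: a driver_find_device() failure must yield an ERR_PTR, not NULL */
    dev = driver_find_device(&omap_iommu_driver.driver, NULL, (void *)name,
                             device_match_by_alias);
    if (!dev)
        return ERR_PTR(-ENODEV);

    /* 2: a try_module_get() failure must not leak iommu_enable()'s
     * success value as the return code */
    if (!try_module_get(obj->owner)) {
        err = -ENODEV;
        goto err_module;
    }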
Signed-off-by: Florian Vaussard <florian.vaussard@epfl.ch>
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Use the various devm_ interfaces to simplify the cleanup in
probe and remove functions.
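For example (illustrative), the MMIO mapping in probe() collapses to:

    res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
    obj->regbase = devm_ioremap_resource(&pdev->dev, res);
    if (IS_ERR(obj->regbase))
        return PTR_ERR(obj->regbase);

    /* no iounmap()/release_mem_region() needed in remove() any more */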
Signed-off-by: Florian Vaussard <florian.vaussard@epfl.ch>
Signed-off-by: Suman Anna <s-anna@ti.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Commit 78a2e12f51d9 ("iommu: shmobile: Enable driver compilation with
COMPILE_TEST") added an optional dependency on SH_MOBILE. But that
Kconfig symbol doesn't exist. It seems ARCH_SHMOBILE was intended. Use
that.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
This patch corrects the PASID format in the INVALIDATE_IOTLB_PAGES
command; the bug was caused by incorrect information in
the AMD IOMMU Architectural Specification v2.01 document.
Incorrect format:
cmd->data[0][16:23] = PASID[7:0]
cmd->data[1][16:27] = PASID[19:8]
Correct format:
cmd->data[0][16:23] = PASID[15:8]
cmd->data[1][16:23] = PASID[7:0]
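In code, the corrected packing is (a sketch derived directly from the
format above):

    cmd->data[0] |= ((pasid >> 8) & 0xff) << 16;   /* PASID[15:8] */
    cmd->data[1] |= (pasid & 0xff) << 16;          /* PASID[7:0]  */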
However, this does not affect the IOMMUv2 hardware implementation,
and has been corrected since version 2.02 of the specification
(available through AMD NDA).
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
The 'order' parameter of the IOMMU-aware dma-mapping implementation was
introduced mainly as a hack to reduce the size of the bitmap used for
tracking the IO virtual address space. Now that it is possible to
dynamically resize the bitmap, this hack is no longer needed and can be
removed without any impact on the client devices. This way the parameters
of arm_iommu_create_mapping() become much easier to understand. The 'size'
parameter now means the maximum supported IO address space size.
The code will allocate (and resize) the bitmap in chunks, ensuring that a
single chunk is not larger than a single memory page, to avoid unreliable
allocations larger than PAGE_SIZE in atomic context.
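Callers now pass just the bus, base address and size (usage sketch):

    struct dma_iommu_mapping *mapping;

    /* 128 MiB of IO virtual address space; no 'order' argument */
    mapping = arm_iommu_create_mapping(&platform_bus_type, 0, SZ_128M);
    if (IS_ERR(mapping))
        return PTR_ERR(mapping);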
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Commit 1463fe44fd0f ("iommu/arm-smmu: Don't use VMIDs for stage-1
translations") moved our TLB invalidation from context creation time to
context destruction time, but forgot to update an associated comment.
This patch fixes the broken comment.
Signed-off-by: Will Deacon <will.deacon@arm.com>
These should have been octal.
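For example (illustrative file name and mode, not the exact lines
touched): a decimal 644 is really octal 01204, an unintended permission
pattern, so the mode must be spelled in octal:

    -   debugfs_create_file("iotlb", 644, parent, obj, &debug_fops);
    +   debugfs_create_file("iotlb", 0644, parent, obj, &debug_fops);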
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Hiroshi DOYU <Hiroshi.DOYU@nokia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On coherent systems, publishing new page tables to the SMMU walker is
achieved with a dsb instruction. In fact, this can be a dsb(ishst) which
also provides the mandatory barrier option for arm64.
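A sketch of the flush helper's two branches (feature flag and
cache-maintenance call per the driver; details abridged):

    static void arm_smmu_flush_pgtable(struct arm_smmu_device *smmu,
                                       void *addr, size_t size)
    {
        if (smmu->features & ARM_SMMU_FEAT_COHERENT_WALK)
            dsb(ishst);                           /* publish to the walker */
        else
            dmac_flush_range(addr, addr + size);  /* clean to PoC */
    }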
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit 972157cac528 ("arm/smmu: Use irqsafe spinlock for domain lock")
fixed our page table locks to be the irq{save,restore} variants, since
the DMA mapping API can be invoked from interrupt context.
This patch cleans up our use of the flags variable so we can distinguish
between IRQ flags (now `flags') and pte protection bits (now `prot').
Signed-off-by: Will Deacon <will.deacon@arm.com>
In such a case we have to use secure aliases of some non-secure
registers.
This handling is switched on by the DT property
"calxeda,smmu-secure-config-access" for an SMMU node.
Signed-off-by: Andreas Herrmann <andreas.herrmann@calxeda.com>
[will: merged with driver option handling patch]
Signed-off-by: Will Deacon <will.deacon@arm.com>
The DT parsing code that determines stream IDs uses
of_parse_phandle_with_args, and thus MAX_MASTER_STREAMIDS
is always bound by MAX_PHANDLE_ARGS.
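So the limit can be expressed directly in terms of the phandle bound
(sketch):

    /* no more stream IDs than phandle args can carry */
    #define MAX_MASTER_STREAMIDS    MAX_PHANDLE_ARGS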
Signed-off-by: Andreas Herrmann <andreas.herrmann@calxeda.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The lock might be taken via the DMA-API, which can be called from
interrupt context, so use the irqsave/irqrestore spinlock variants.
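Hence the irq{save,restore} variants (sketch):

    unsigned long flags;

    spin_lock_irqsave(&smmu_domain->lock, flags);
    /* ... page-table updates, also reachable from IRQ context ... */
    spin_unlock_irqrestore(&smmu_domain->lock, flags);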
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Acked-by: Will Deacon <will.deacon@arm.com>
We currently include <linux/irqreturn.h> in <linux/pci.h>, but I'm about to
remove that from linux/pci.h, so add explicit includes where needed.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Whilst trying to bring up an SMMUv2 implementation with the table
walker plumbed into a coherent interconnect, I noticed that the memory
transactions targeting the CPU caches from the SMMU were marked as
outer-shareable instead of inner-shareable.
After a bunch of digging, it seems that we actually need to program
CBARn.BPSHCFG for s1-s2-bypass contexts to act as non-shareable in order
for the shareability configured in the corresponding TTBCR not to be
overridden with an outer-shareable attribute.
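A sketch of the fix in the context-bank setup (the BPSHCFG field is
from the architecture; the macro names and values are assumptions):

    reg = CBAR_TYPE_S1_TRANS_S2_BYPASS;
    /* Non-shareable BPSHCFG, so the TTBCR shareability wins instead
     * of being forced outer-shareable. */
    reg |= CBAR_S1_BPSHCFG_NSH << CBAR_S1_BPSHCFG_SHIFT;   /* assumed */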
Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Now that we populate page tables as we traverse them ("iommu/arm-smmu:
fix pud/pmd entry fill sequence"), we need to ensure that we flush out
our zeroed tables after initial allocation, to prevent speculative TLB
fills using bogus data.
This patch adds additional calls to arm_smmu_flush_pgtable during
initial table allocation, and moves the dsb required by coherent table
walkers into the helper.
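For instance, the top-level table allocation now flushes its zeroed
contents before the hardware can walk it (sketch):

    pgd = kzalloc(PTRS_PER_PGD * sizeof(pgd_t), GFP_KERNEL);
    if (!pgd)
        return -ENOMEM;

    /* purge the zeroed table from the CPU caches / order the stores
     * before the walker can speculatively fetch from it */
    arm_smmu_flush_pgtable(smmu, pgd, PTRS_PER_PGD * sizeof(pgd_t));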
Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Commit a44a9791e778 ("iommu/arm-smmu: use mutex instead of spinlock for
locking page tables") replaced the page table spinlock with a mutex, to
allow blocking allocations to satisfy lazy mapping requests.
Unfortunately, it turns out that IOMMU mappings are created from atomic
context (e.g. spinlock held during a dma_map), so this change doesn't
really help us in practice.
This patch is a partial revert of the offending commit, bringing back
the original spinlock but replacing our page table allocations for any
levels below the pgd (which is allocated during domain init) with
GFP_ATOMIC instead of GFP_KERNEL.
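So sub-pgd table allocations now look like this (sketch):

    /* called with the domain spinlock held, possibly from IRQ
     * context, so sleeping allocations are off the table */
    pmd = (pmd_t *)get_zeroed_page(GFP_ATOMIC);
    if (!pmd)
        return -ENOMEM;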
Cc: <stable@vger.kernel.org>
Reported-by: Andreas Herrmann <andreas.herrmann@calxeda.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The ARM SMMU driver's population of puds and pmds is broken, since we
iterate over the next level of table repeatedly setting the current
level descriptor to point at the pmd being initialised. This is clearly
wrong when dealing with multiple pmds/puds.
This patch fixes the problem by moving the pud/pmd population out of the
loop and instead performing it when we allocate the next level (like we
correctly do for ptes already). The starting address for the next level
is then calculated prior to entering the loop.
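A sketch of the corrected shape for the pmd case (simplified; flush
calls omitted):

    if (pud_none(*pud)) {
        /* allocate and link the next level exactly once */
        pmd = (pmd_t *)get_zeroed_page(GFP_ATOMIC);
        if (!pmd)
            return -ENOMEM;
        pud_populate(NULL, pud, pmd);
        pmd += pmd_index(addr);
    } else {
        pmd = pmd_offset(pud, addr);
    }

    /* ... then iterate over the pmds without touching the pud again */
    do {
        next = pmd_addr_end(addr, end);
        ret = arm_smmu_alloc_init_pte(smmu, pmd, addr, next, pfn, prot);
        if (ret)
            return ret;
        pfn += (next - addr) >> PAGE_SHIFT;
    } while (pmd++, addr = next, addr < end);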
Cc: <stable@vger.kernel.org>
Signed-off-by: Yifan Zhang <zhangyf@marvell.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>