IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
[ Upstream commit a34f3e20ddff02c4f12df2c0635367394e64c63d ]
The page request queue is per IOMMU, its allocation should be made
NUMA-aware for performance reasons.
Fixes: a222a7f0bb6c ("iommu/vt-d: Implement page request handling")
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240403214007.985600-1-jacob.jun.pan@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 70bad345e622c23bb530016925c936ab04a646ac ]
When the kernel is comiled with CONFIG_IRQ_REMAP=y but without
CONFIG_IOMMU_INTEL compilation fails since commit def054b01a8678 with an
undefined reference to device_rbtree_find(). This patch makes sure that
intel specific code is only compiled with CONFIG_IOMMU_INTEL=y.
Signed-off-by: Bert Karwatzki <spasswolf@web.de>
Fixes: 80a9b50c0b9e ("iommu/vt-d: Improve ITE fault handling if target device isn't present")
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20240307194419.15801-1-spasswolf@web.de
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e645c20e8e9cde549bc233435d3c1338e1cd27fe ]
The enforce_cache_coherency callback ensures DMA cache coherency for
devices attached to the domain.
Intel IOMMU supports enforced DMA cache coherency when the Snoop
Control bit in the IOMMU's extended capability register is set.
Supporting it differs between legacy and scalable modes.
In legacy mode, it's supported page-level by setting the SNP field
in second-stage page-table entries. In scalable mode, it's supported
in PASID-table granularity by setting the PGSNP field in PASID-table
entries.
In legacy mode, mappings before attaching to a device have SNP
fields cleared, while mappings after the callback have them set.
This means partial DMAs are cache coherent while others are not.
One possible fix is replaying mappings and flipping SNP bits when
attaching a domain to a device. But this seems to be over-engineered,
given that all real use cases just attach an empty domain to a device.
To meet practical needs while reducing mode differences, only support
enforce_cache_coherency on a domain without mappings if SNP field is
used.
Fixes: fc0051cb9590 ("iommu/vt-d: Check domain force_snooping against attached devices")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20231114011036.70142-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9a16ab9d640274b20813d2d17475e18d3e99d834 ]
In the iommu probe_device path, domain_context_mapping() allows setting
up the context entry for a non-PCI device. However, in the iommu
release_device path, domain_context_clear() only clears context entries
for PCI devices.
Make domain_context_clear() behave consistently with
domain_context_mapping() by clearing context entries for both PCI and
non-PCI devices.
Fixes: 579305f75d34 ("iommu/vt-d: Update to use PCI DMA aliases")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20231114011036.70142-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit da37dddcf4caf015c400a930301d2ee27a7a15fb ]
When IOMMU hardware operates in legacy mode, the TT field of the context
entry determines the translation type, with three supported types (Section
9.3 Context Entry):
- DMA translation without device TLB support
- DMA translation with device TLB support
- Passthrough mode with translated and translation requests blocked
Device TLB support is absent when hardware is configured in passthrough
mode.
Disable the PCI ATS feature when IOMMU is configured for passthrough
translation type in legacy (non-scalable) mode.
Fixes: 0faa19a1515f ("iommu/vt-d: Decouple PASID & PRI enabling from SVA")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20231114011036.70142-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0f5432a9b839847dcfe9fa369d72e3d646102ddf ]
The latest VT-d spec indicates that when remapping hardware is disabled
(TES=0 in Global Status Register), upstream ATS Invalidation Completion
requests are treated as UR (Unsupported Request).
Consequently, the spec recommends in section 4.3 Handling of Device-TLB
Invalidations that software refrain from submitting any Device-TLB
invalidation requests when address remapping hardware is disabled.
Verify address remapping hardware is enabled prior to submitting Device-
TLB invalidation requests.
Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20231114011036.70142-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 85b80fdffa867d75dfb9084a839e7949e29064e8 upstream.
The VT-d spec requires (10.4.4 Global Command Register, TE field) that:
Hardware implementations supporting DMA draining must drain any in-flight
DMA read/write requests queued within the Root-Complex before switching
address translation on or off and reflecting the status of the command
through the TES field in the Global Status register.
Unfortunately, some integrated graphic devices fail to do so after some
kind of power state transition. As the result, the system might stuck in
iommu_disable_translation(), waiting for the completion of TE transition.
Add MTL to the quirk list for those devices and skips TE disabling if the
qurik hits.
Fixes: b1012ca8dc4f ("iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu")
Cc: stable@vger.kernel.org
Signed-off-by: Abdul Halim, Mohd Syazwan <mohd.syazwan.abdul.halim@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20231116022324.30120-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e7ad6c2a4b1aa710db94060b716f53c812cef565 upstream.
Commit 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when
invalidating TLBs") moved the secondary TLB invalidations into the TLB
invalidation functions to ensure that all secondary TLB invalidations
happen at the same time as the CPU invalidation and added a flush-all
type of secondary TLB invalidation for the batched mode, where a range
of [0, -1UL) is used to indicates that the range extends to the end of
the address space.
However, using an end address of -1UL caused an overflow in the Intel
IOMMU driver, where the end address was rounded up to the next page.
As a result, both the IOTLB and device ATC were not invalidated correctly.
Add a flush all helper function and call it when the invalidation range
is from 0 to -1UL, ensuring that the entire caches are invalidated
correctly.
Fixes: 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when invalidating TLBs")
Cc: stable@vger.kernel.org
Cc: Huang Ying <ying.huang@intel.com>
Cc: Alistair Popple <apopple@nvidia.com>
Tested-by: Luo Yuzhang <yuzhang.luo@intel.com> # QAT
Tested-by: Tony Zhu <tony.zhu@intel.com> # DSA
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20231117090933.75267-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The iommu_suspend() syscore suspend callback is invoked with IRQ disabled.
Allocating memory with the GFP_KERNEL flag may re-enable IRQs during
the suspend callback, which can cause intermittent suspend/hibernation
problems with the following kernel traces:
Calling iommu_suspend+0x0/0x1d0
------------[ cut here ]------------
WARNING: CPU: 0 PID: 15 at kernel/time/timekeeping.c:868 ktime_get+0x9b/0xb0
...
CPU: 0 PID: 15 Comm: rcu_preempt Tainted: G U E 6.3-intel #r1
RIP: 0010:ktime_get+0x9b/0xb0
...
Call Trace:
<IRQ>
tick_sched_timer+0x22/0x90
? __pfx_tick_sched_timer+0x10/0x10
__hrtimer_run_queues+0x111/0x2b0
hrtimer_interrupt+0xfa/0x230
__sysvec_apic_timer_interrupt+0x63/0x140
sysvec_apic_timer_interrupt+0x7b/0xa0
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1f/0x30
...
------------[ cut here ]------------
Interrupts enabled after iommu_suspend+0x0/0x1d0
WARNING: CPU: 0 PID: 27420 at drivers/base/syscore.c:68 syscore_suspend+0x147/0x270
CPU: 0 PID: 27420 Comm: rtcwake Tainted: G U W E 6.3-intel #r1
RIP: 0010:syscore_suspend+0x147/0x270
...
Call Trace:
<TASK>
hibernation_snapshot+0x25b/0x670
hibernate+0xcd/0x390
state_store+0xcf/0xe0
kobj_attr_store+0x13/0x30
sysfs_kf_write+0x3f/0x50
kernfs_fop_write_iter+0x128/0x200
vfs_write+0x1fd/0x3c0
ksys_write+0x6f/0xf0
__x64_sys_write+0x1d/0x30
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
Given that only 4 words memory is needed, avoid the memory allocation in
iommu_suspend().
CC: stable@kernel.org
Fixes: 33e07157105e ("iommu/vt-d: Avoid GFP_ATOMIC where it is not needed")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Ooi, Chin Hao <chin.hao.ooi@intel.com>
Link: https://lore.kernel.org/r/20230921093956.234692-1-rui.zhang@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20230925120417.55977-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This includes a shared branch with VFIO:
- Enhance VFIO_DEVICE_GET_PCI_HOT_RESET_INFO so it can work with iommufd
FDs, not just group FDs. This removes the last place in the uAPI that
required the group fd.
- Give VFIO a new device node /dev/vfio/devices/vfioX (the so called cdev
node) which is very similar to the FD from VFIO_GROUP_GET_DEVICE_FD.
The cdev is associated with the struct device that the VFIO driver is
bound to and shows up in sysfs in the normal way.
- Add a cdev IOCTL VFIO_DEVICE_BIND_IOMMUFD which allows a newly opened
/dev/vfio/devices/vfioX to be associated with an IOMMUFD, this replaces
the VFIO_GROUP_SET_CONTAINER flow.
- Add cdev IOCTLs VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT to allow the IOMMU
translation the vfio_device is associated with to be changed. This is a
significant new feature for VFIO as previously each vfio_device was
fixed to a single translation.
The translation is under the control of iommufd, so it can be any of
the different translation modes that iommufd is learning to create.
At this point VFIO has compilation options to remove the legacy interfaces
and in modern mode it behaves like a normal driver subsystem. The
/dev/vfio/iommu and /dev/vfio/groupX nodes are not present and each
vfio_device only has a /dev/vfio/devices/vfioX cdev node that represents
the device.
On top of this is built some of the new iommufd functionality:
- IOMMU_HWPT_ALLOC allows userspace to directly create the low level
IO Page table objects and affiliate them with IOAS objects that hold
the translation mapping. This is the basic functionality for the
normal IOMMU_DOMAIN_PAGING domains.
- VFIO_DEVICE_ATTACH_IOMMUFD_PT can be used to replace the current
translation. This is wired up to through all the layers down to the
driver so the driver has the ability to implement a hitless
replacement. This is necessary to fully support guest behaviors when
emulating HW (eg guest atomic change of translation)
- IOMMU_GET_HW_INFO returns information about the IOMMU driver HW that
owns a VFIO device. This includes support for the Intel iommu, and
patches have been posted for all the other server IOMMU.
Along the way are a number of internal items:
- New iommufd kapis iommufd_ctx_has_group(), iommufd_device_to_ictx(),
iommufd_device_to_id(), iommufd_access_detach(), iommufd_ctx_from_fd(),
iommufd_device_replace()
- iommufd now internally tracks iommu_groups as it needs some per-group
data
- Reorganize how the internal hwpt allocation flows to have more robust
locking
- Improve the access interfaces to support detach and replace of an IOAS
from an access
- New selftests and a rework of how the selftests creates a mock iommu
driver to be more like a real iommu driver
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZO/QDQAKCRCFwuHvBreF
YZ2iAP4hNEF6MJLRI2A28V3I/80f3x9Ed3Cirp/Q8ZdVEE+HYQD8DFaafJ0y3iPQ
5mxD4ZrZ9KfUns/gUqCT5oPHjrcvSAM=
=EQCw
-----END PGP SIGNATURE-----
Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd updates from Jason Gunthorpe:
"On top of the vfio updates is built some new iommufd functionality:
- IOMMU_HWPT_ALLOC allows userspace to directly create the low level
IO Page table objects and affiliate them with IOAS objects that
hold the translation mapping. This is the basic functionality for
the normal IOMMU_DOMAIN_PAGING domains.
- VFIO_DEVICE_ATTACH_IOMMUFD_PT can be used to replace the current
translation. This is wired up to through all the layers down to the
driver so the driver has the ability to implement a hitless
replacement. This is necessary to fully support guest behaviors
when emulating HW (eg guest atomic change of translation)
- IOMMU_GET_HW_INFO returns information about the IOMMU driver HW
that owns a VFIO device. This includes support for the Intel iommu,
and patches have been posted for all the other server IOMMU.
Along the way are a number of internal items:
- New iommufd kernel APIs: iommufd_ctx_has_group(),
iommufd_device_to_ictx(), iommufd_device_to_id(),
iommufd_access_detach(), iommufd_ctx_from_fd(),
iommufd_device_replace()
- iommufd now internally tracks iommu_groups as it needs some
per-group data
- Reorganize how the internal hwpt allocation flows to have more
robust locking
- Improve the access interfaces to support detach and replace of an
IOAS from an access
- New selftests and a rework of how the selftests creates a mock
iommu driver to be more like a real iommu driver"
Link: https://lore.kernel.org/lkml/ZO%2FTe6LU1ENf58ZW@nvidia.com/
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (34 commits)
iommufd/selftest: Don't leak the platform device memory when unloading the module
iommu/vt-d: Implement hw_info for iommu capability query
iommufd/selftest: Add coverage for IOMMU_GET_HW_INFO ioctl
iommufd: Add IOMMU_GET_HW_INFO
iommu: Add new iommu op to get iommu hardware information
iommu: Move dev_iommu_ops() to private header
iommufd: Remove iommufd_ref_to_users()
iommufd/selftest: Make the mock iommu driver into a real driver
vfio: Support IO page table replacement
iommufd/selftest: Add IOMMU_TEST_OP_ACCESS_REPLACE_IOAS coverage
iommufd: Add iommufd_access_replace() API
iommufd: Use iommufd_access_change_ioas in iommufd_access_destroy_object
iommufd: Add iommufd_access_change_ioas(_id) helpers
iommufd: Allow passing in iopt_access_list_id to iopt_remove_access()
vfio: Do not allow !ops->dma_unmap in vfio_pin/unpin_pages()
iommufd/selftest: Add a selftest for IOMMU_HWPT_ALLOC
iommufd/selftest: Return the real idev id from selftest mock_domain
iommufd: Add IOMMU_HWPT_ALLOC
iommufd/selftest: Test iommufd_device_replace()
iommufd: Make destroy_rwsem use a lock class per object type
...
coalescing lots of silly duplicates.
* Use static_calls() instead of indirect calls for apic->foo()
* Tons of cleanups an crap removal along the way
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmTvfO8ACgkQaDWVMHDJ
krAP2A//ccii/LuvtTnNEIMMR5w2rwTdHv91ancgFkC8pOeNk37Z8sSLq8tKuLFA
vgjBIysVIqunuRcNCJ+eqwIIxYfU+UGCWHppzLwO+DY3Q7o9EoTL0BgytdAqxpQQ
ntEVarqWq25QYXKFoAqbUTJ1UXa42/8HfiXAX/jvP+ACXfilkGPZre6ASxlXeOhm
XbgPuNQPmXi2WYQH9GCQEsz2Nh80hKap8upK2WbQzzJ3lXsm+xA//4klab0HCYwl
Uc302uVZozyXRMKbAlwmgasTFOLiV8KKriJ0oHoktBpWgkpdR9uv/RDeSaFR3DAl
aFmecD4k/Hqezg4yVl+4YpEn2KjxiwARCm4PMW5AV7lpWBPBHAOOai65yJlAi9U6
bP8pM0+aIx9xg7oWfsTnQ7RkIJ+GZ0w+KZ9LXFM59iu3eV1pAJE3UVyUehe/J1q9
n8OcH0UeHRlAb8HckqVm1AC7IPvfHw4OAPtUq7z3NFDwbq6i651Tu7f+i2bj31cX
77Ames+fx6WjxUjyFbJwaK44E7Qez3waztdBfn91qw+m0b+gnKE3ieDNpJTqmm5b
mKulV7KJwwS6cdqY3+Kr+pIlN+uuGAv7wGzVLcaEAXucDsVn/YAMJHY2+v97xv+n
J9N+yeaYtmSXVlDsJ6dndMrTQMmcasK1CVXKxs+VYq5Lgf+A68w=
=eoKm
-----END PGP SIGNATURE-----
Merge tag 'x86_apic_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 apic updates from Dave Hansen:
"This includes a very thorough rework of the 'struct apic' handlers.
Quite a variety of them popped up over the years, especially in the
32-bit days when odd apics were much more in vogue.
The end result speaks for itself, which is a removal of a ton of code
and static calls to replace indirect calls.
If there's any breakage here, it's likely to be around the 32-bit
museum pieces that get light to no testing these days.
Summary:
- Rework apic callbacks, getting rid of unnecessary ones and
coalescing lots of silly duplicates.
- Use static_calls() instead of indirect calls for apic->foo()
- Tons of cleanups an crap removal along the way"
* tag 'x86_apic_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits)
x86/apic: Turn on static calls
x86/apic: Provide static call infrastructure for APIC callbacks
x86/apic: Wrap IPI calls into helper functions
x86/apic: Mark all hotpath APIC callback wrappers __always_inline
x86/xen/apic: Mark apic __ro_after_init
x86/apic: Convert other overrides to apic_update_callback()
x86/apic: Replace acpi_wake_cpu_handler_update() and apic_set_eoi_cb()
x86/apic: Provide apic_update_callback()
x86/xen/apic: Use standard apic driver mechanism for Xen PV
x86/apic: Provide common init infrastructure
x86/apic: Wrap apic->native_eoi() into a helper
x86/apic: Nuke ack_APIC_irq()
x86/apic: Remove pointless arguments from [native_]eoi_write()
x86/apic/noop: Tidy up the code
x86/apic: Remove pointless NULL initializations
x86/apic: Sanitize APIC ID range validation
x86/apic: Prepare x2APIC for using apic::max_apic_id
x86/apic: Simplify X2APIC ID validation
x86/apic: Add max_apic_id member
x86/apic: Wrap APIC ID validation into an inline
...
There are two main use cases for mmu notifiers. One is by KVM which uses
mmu_notifier_invalidate_range_start()/end() to manage a software TLB.
The other is to manage hardware TLBs which need to use the
invalidate_range() callback because HW can establish new TLB entries at
any time. Hence using start/end() can lead to memory corruption as these
callbacks happen too soon/late during page unmap.
mmu notifier users should therefore either use the start()/end() callbacks
or the invalidate_range() callbacks. To make this usage clearer rename
the invalidate_range() callback to arch_invalidate_secondary_tlbs() and
update documention.
Link: https://lkml.kernel.org/r/6f77248cd25545c8020a54b4e567e8b72be4dca1.1690292440.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Andrew Donnellan <ajd@linux.ibm.com>
Cc: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Cc: Frederic Barrat <fbarrat@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Nicolin Chen <nicolinc@nvidia.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zhi Wang <zhi.wang.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add intel_iommu_hw_info() to report cap_reg and ecap_reg information.
Link: https://lore.kernel.org/r/20230818101033.4100-6-yi.l.liu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
For the case that VT-d page is smaller than mm page, converting dma pfn
should be handled in two cases which are for start pfn and for end pfn.
Currently the calculation of end dma pfn is incorrect and the result is
less than real page frame number which is causing the mapping of iova
always misses some page frames.
Rename the mm_to_dma_pfn() to mm_to_dma_pfn_start() and add a new helper
for converting end dma pfn named mm_to_dma_pfn_end().
Signed-off-by: Yanfei Xu <yanfei.xu@intel.com>
Link: https://lore.kernel.org/r/20230625082046.979742-1-yanfei.xu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Even the PCI devices don't support pasid capability, PASID table is
mandatory for a PCI device in scalable mode. However flushing cache
of pasid directory table for these devices are not taken after pasid
table is allocated as the "size" of table is zero. Fix it by
calculating the size by page order.
Found this when reading the code, no real problem encountered for now.
Fixes: 194b3348bdbb ("iommu/vt-d: Fix PASID directory pointer coherency")
Suggested-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Yanfei Xu <yanfei.xu@intel.com>
Link: https://lore.kernel.org/r/20230616081045.721873-1-yanfei.xu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The core code now prevents devices with RMRR regions from being assigned
to user space. There is no need to check for this condition in individual
drivers. Remove it to avoid duplicate code.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20230724060352.113458-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This allows the upper layers to set a domain to a PASID of a device
if the PASID feature is supported by the IOMMU hardware. The typical
use cases are, for example, kernel DMA with PASID and hardware
assisted mediated device drivers.
The attaching device and pasid information is tracked in a per-domain
list and is used for IOTLB and devTLB invalidation.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20230802212427.1497170-8-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The domain_flush_pasid_iotlb() helper function is used to flush the IOTLB
entries for a given PASID. Previously, this function assumed that
RID2PASID was only used for the first-level DMA translation. However, with
the introduction of the set_dev_pasid callback, this assumption is no
longer valid.
Add a check before using the RID2PASID for PASID invalidation. This check
ensures that the domain has been attached to a physical device before
using RID2PASID.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20230802212427.1497170-7-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Currently draining page requests and responses for a pasid is part of SVA
implementation. This is because the driver only supports attaching an SVA
domain to a device pasid. As we are about to support attaching other types
of domains to a device pasid, the prq draining code becomes generic.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20230802212427.1497170-6-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The pasid_mutex was used to protect the paths of set/remove_dev_pasid().
It's duplicate with iommu_sva_lock. Remove it to avoid duplicate code.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20230802212427.1497170-5-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The VT-d spec requires to use PASID-based-IOTLB invalidation descriptor
to invalidate IOTLB and the paging-structure caches for a first-stage
page table. Add a generic helper to do this.
RID2PASID is used if the domain has been attached to a physical device,
otherwise real PASIDs that the domain has been attached to will be used.
The 'real' PASID attachment is handled in the subsequent change.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20230802212427.1497170-4-jacob.jun.pan@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
PCIe Process address space ID (PASID) is used to tag DMA traffic, it
provides finer grained isolation than requester ID (RID).
For each device/RID, 0 is a special PASID for the normal DMA (no
PASID). This is universal across all architectures that supports PASID,
therefore warranted to be reserved globally and declared in the common
header. Consequently, we can avoid the conflict between different PASID
use cases in the generic code. e.g. SVA and DMA API with PASIDs.
This paved away for device drivers to choose global PASID policy while
continue doing normal DMA.
Noting that VT-d could support none-zero RID/NO_PASID, but currently not
used.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20230802212427.1497170-2-jacob.jun.pan@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Rename send_cleanup_vector() to vector_schedule_cleanup() to prepare for
replacing the vector cleanup IPI with a timer callback.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
Link: https://lore.kernel.org/r/20230621171248.6805-2-xin3.li@intel.com
This is a step toward making __iommu_probe_device() self contained.
It should, under proper locking, check if the device is already associated
with an iommu driver and resolve parallel probes. All but one of the
callers open code this test using two different means, but they all
rely on dev->iommu_group.
Currently the bus_iommu_probe()/probe_iommu_group() and
probe_acpi_namespace_devices() rejects already probed devices with an
unlocked read of dev->iommu_group. The OF and ACPI "replay" functions use
device_iommu_mapped() which is the same read without the pointless
refcount.
Move this test into __iommu_probe_device() and put it under the
iommu_probe_device_lock. The store to dev->iommu_group is in
iommu_group_add_device() which is also called under this lock for iommu
driver devices, making it properly locked.
The only path that didn't have this check is the hotplug path triggered by
BUS_NOTIFY_ADD_DEVICE. The only way to get dev->iommu_group assigned
outside the probe path is via iommu_group_add_device(). Today the only
caller is VFIO no-iommu which never associates with an iommu driver. Thus
adding this additional check is safe.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v3-328044aa278c+45e49-iommu_probe_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Including:
- Core changes:
- iova_magazine_alloc() optimization
- Make flush-queue an IOMMU driver capability
- Consolidate the error handling around device attachment
- AMD IOMMU changes:
- AVIC Interrupt Remapping Improvements
- Some minor fixes and cleanups
- Intel VT-d changes from Lu Baolu:
- Small and misc cleanups
- ARM-SMMU changes from Will Deacon:
- Device-tree binding updates:
* Add missing clocks for SC8280XP and SA8775 Adreno SMMUs
* Add two new Qualcomm SMMUs in SDX75 and SM6375
- Workarounds for Arm MMU-700 errata:
* 1076982: Avoid use of SEV-based cmdq wakeup
* 2812531: Terminate command batches with a CMD_SYNC
* Enforce single-stage translation to avoid nesting-related errata
- Set the correct level hint for range TLB invalidation on teardown
- Some other minor fixes and cleanups (including Freescale PAMU and
virtio-iommu changes)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmSVnS0ACgkQK/BELZcB
GuM4txAAvtE5pMxM4V/9uTJt+de/vd8XiaH2kfQEULCJm2Yz07Z5+oE+QRtjPc2D
No+98IGMJCNOg+U+6JZ8P2GR3/soFvKdYjhY/iKTXK+C6jiy3dStIFN/KzzHkbpu
Y/fUZ5B+DizTO6837osDWIdAz3PcwV3Vk/ogHe3FoHWU13RJYOMp2FAox0QreBNE
kb7tK3ki/RCasbF9rMt9ClB0SZEVDysRkYF7AtXtsMNVm5jpQAITXVcNUYMeaJFL
n0J8hjn3EiZj7dgzxbL5bRgDyfPadwJkWz2BxkQ6x0gopgHu0EimGL8p2Bei2f8x
lv2y692L6zZth2ZgjSkecf3Lo4YHirsP/1U1zrLDjEgeBZ0vRxiX0qsvCb9692C1
+shy5jOX22ub+zJ2UFHMNGKu3ZdhcKi+meejdqM/GrHcRfZABh26bQILFnPF3Oxp
2WFb2v7Hq9qdQP50jsGbLji6n165aRW969fBdsk1uDUoCDHNOcdHQS3FsiKAAz5d
/Z/3PR9tQgnF9bDXJB6RbGJ1rQxHlfvarOQCAYiC02ALj4FnuSLiFSBLe1bI4InR
AgmnQaH2jmFMWHibdvj3q3sm33sLhOjmAE+ZX0YOhFfgrRGHq88qRwV53IfW477E
8a+6A+tnu28axk7yVJMvvz5/PeYkD2CMeplYQycUiaQutjvN9sk=
=aRMe
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
"Core changes:
- iova_magazine_alloc() optimization
- Make flush-queue an IOMMU driver capability
- Consolidate the error handling around device attachment
AMD IOMMU changes:
- AVIC Interrupt Remapping Improvements
- Some minor fixes and cleanups
Intel VT-d changes from Lu Baolu:
- Small and misc cleanups
ARM-SMMU changes from Will Deacon:
- Device-tree binding updates:
- Add missing clocks for SC8280XP and SA8775 Adreno SMMUs
- Add two new Qualcomm SMMUs in SDX75 and SM6375
- Workarounds for Arm MMU-700 errata:
- 1076982: Avoid use of SEV-based cmdq wakeup
- 2812531: Terminate command batches with a CMD_SYNC
- Enforce single-stage translation to avoid nesting-related errata
- Set the correct level hint for range TLB invalidation on teardown
.. and some other minor fixes and cleanups (including Freescale PAMU
and virtio-iommu changes)"
* tag 'iommu-updates-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (50 commits)
iommu/vt-d: Remove commented-out code
iommu/vt-d: Remove two WARN_ON in domain_context_mapping_one()
iommu/vt-d: Handle the failure case of dmar_reenable_qi()
iommu/vt-d: Remove unnecessary (void*) conversions
iommu/amd: Remove extern from function prototypes
iommu/amd: Use BIT/BIT_ULL macro to define bit fields
iommu/amd: Fix DTE_IRQ_PHYS_ADDR_MASK macro
iommu/amd: Fix compile error for unused function
iommu/amd: Improving Interrupt Remapping Table Invalidation
iommu/amd: Do not Invalidate IRT when IRTE caching is disabled
iommu/amd: Introduce Disable IRTE Caching Support
iommu/amd: Remove the unused struct amd_ir_data.ref
iommu/amd: Switch amd_iommu_update_ga() to use modify_irte_ga()
iommu/arm-smmu-v3: Set TTL invalidation hint better
iommu/arm-smmu-v3: Document nesting-related errata
iommu/arm-smmu-v3: Add explicit feature for nesting
iommu/arm-smmu-v3: Document MMU-700 erratum 2812531
iommu/arm-smmu-v3: Work around MMU-600 erratum 1076982
dt-bindings: arm-smmu: Add SDX75 SMMU compatible
dt-bindings: arm-smmu: Add SM6375 GPU SMMU
...
These lines of code were commented out when they were first added in commit
ba39592764ed ("Intel IOMMU: Intel IOMMU driver"). We do not want to restore
them because the VT-d spec has deprecated the read/write draining hit.
VT-d spec (section 11.4.2):
"
Hardware implementation with Major Version 2 or higher (VER_REG), always
performs required drain without software explicitly requesting a drain in
IOTLB invalidation. This field is deprecated and hardware will always
report it as 1 to maintain backward compatibility with software.
"
Remove the code to make the code cleaner.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20230609060514.15154-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Remove the WARN_ON(did == 0) as the domain id 0 is reserved and
set once the domain_ids is allocated. So iommu_init_domains will
never return 0.
Remove the WARN_ON(!table) as this pointer will be accessed in
the following code, if empty "table" really happens, the kernel
will report a NULL pointer reference warning at the first place.
Signed-off-by: Yanfei Xu <yanfei.xu@intel.com>
Link: https://lore.kernel.org/r/20230605112659.308981-3-yanfei.xu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
dmar_reenable_qi() may not succeed. Check and return when it fails.
Signed-off-by: Yanfei Xu <yanfei.xu@intel.com>
Link: https://lore.kernel.org/r/20230605112659.308981-2-yanfei.xu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
It remains really handy to have distinct DMA domain types within core
code for the sake of default domain policy selection, but we can now
hide that detail from drivers by using the new capability instead.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Jerry Snitselaar <jsnitsel@redhat.com> # amd, intel, smmu-v3
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1c552d99e8ba452bdac48209fa74c0bdd52fd9d9.1683233867.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Passing a special type to domain_alloc to indirectly query whether flush
queues are a worthwhile optimisation with the given driver is a bit
clunky, and looking increasingly anachronistic. Let's put that into an
explicit capability instead.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Jerry Snitselaar <jsnitsel@redhat.com> # amd, intel, smmu-v3
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/f0086a93dbccb92622e1ace775846d81c1c4b174.1683233867.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Including:
- Convert to platform remove callback returning void
- Extend changing default domain to normal group
- Intel VT-d updates:
- Remove VT-d virtual command interface and IOASID
- Allow the VT-d driver to support non-PRI IOPF
- Remove PASID supervisor request support
- Various small and misc cleanups
- ARM SMMU updates:
- Device-tree binding updates:
* Allow Qualcomm GPU SMMUs to accept relevant clock properties
* Document Qualcomm 8550 SoC as implementing an MMU-500
* Favour new "qcom,smmu-500" binding for Adreno SMMUs
- Fix S2CR quirk detection on non-architectural Qualcomm SMMU
implementations
- Acknowledge SMMUv3 PRI queue overflow when consuming events
- Document (in a comment) why ATS is disabled for bypass streams
- AMD IOMMU updates:
- 5-level page-table support
- NUMA awareness for memory allocations
- Unisoc driver: Support for reattaching an existing domain
- Rockchip driver: Add missing set_platform_dma_ops callback
- Mediatek driver: Adjust the dma-ranges
- Various other small fixes and cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmRONeAACgkQK/BELZcB
GuPmpw/8C9ruxQ0JU5rcDBXQGvos4gMmxlbELMrBpbbiTtdb35xchpKfdhnECGIF
k2SrrcF40R/S82SyzNU/eZtGKirtcXvGFraUFgu/QdCcnnqpRHs+IJMXX2NJP+it
+0wO1uiInt3CN1ERcR4F31cDKiWjDG8bvQVE5LIyiy4KrIU5ld2G91Fkaa0R13Au
6H+/wKkcUC6OyaGE6wPx474xBkapT20vj5AIQuAWisXJJR0wbBon1sUTo/IRKsU+
IkNxH0W+1PNImJ+crAdf/nkOlyqoChY4ww6cm07LrOsBLIsX5bCqXfL4HvKthElD
MEgk2SN5kfjfR5Vf29W4hZVM1CT8VbhO41I7OzaZ6X6RU2PXoldPKlgKtZGeSKn1
9bcMpSgB0BtbttvBevSkxTo5KHFozXS2DG3DFoMB3yFMme8Th0LrhBZ9oB7NIPNw
ntMo4K75vviC6Vvzjy4Anj/+y+Zm3W6wDDP7F12O6WZLkK5s4hrSsHUm/MQnnKQP
muJlG870RnSl73xUQZe3cuBxktXuJ3EHqqYIPE0npzvauu8hhWcis3opf2Y+U2s8
aBCCIgp5kTKqjHLh2e4lNCKZf1/b/dhxRcRBQhpAIb8YsjMlIJyM+G8Jz6K6gBga
5Ld+68UQ3oHJwoLV1HCFN8jbpQ9KZn1s9+h3yrYjRAcLNiFb3nU=
=OvTo
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- Convert to platform remove callback returning void
- Extend changing default domain to normal group
- Intel VT-d updates:
- Remove VT-d virtual command interface and IOASID
- Allow the VT-d driver to support non-PRI IOPF
- Remove PASID supervisor request support
- Various small and misc cleanups
- ARM SMMU updates:
- Device-tree binding updates:
* Allow Qualcomm GPU SMMUs to accept relevant clock properties
* Document Qualcomm 8550 SoC as implementing an MMU-500
* Favour new "qcom,smmu-500" binding for Adreno SMMUs
- Fix S2CR quirk detection on non-architectural Qualcomm SMMU
implementations
- Acknowledge SMMUv3 PRI queue overflow when consuming events
- Document (in a comment) why ATS is disabled for bypass streams
- AMD IOMMU updates:
- 5-level page-table support
- NUMA awareness for memory allocations
- Unisoc driver: Support for reattaching an existing domain
- Rockchip driver: Add missing set_platform_dma_ops callback
- Mediatek driver: Adjust the dma-ranges
- Various other small fixes and cleanups
* tag 'iommu-updates-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (82 commits)
iommu: Remove iommu_group_get_by_id()
iommu: Make iommu_release_device() static
iommu/vt-d: Remove BUG_ON in dmar_insert_dev_scope()
iommu/vt-d: Remove a useless BUG_ON(dev->is_virtfn)
iommu/vt-d: Remove BUG_ON in map/unmap()
iommu/vt-d: Remove BUG_ON when domain->pgd is NULL
iommu/vt-d: Remove BUG_ON in handling iotlb cache invalidation
iommu/vt-d: Remove BUG_ON on checking valid pfn range
iommu/vt-d: Make size of operands same in bitwise operations
iommu/vt-d: Remove PASID supervisor request support
iommu/vt-d: Use non-privileged mode for all PASIDs
iommu/vt-d: Remove extern from function prototypes
iommu/vt-d: Do not use GFP_ATOMIC when not needed
iommu/vt-d: Remove unnecessary checks in iopf disabling path
iommu/vt-d: Move PRI handling to IOPF feature path
iommu/vt-d: Move pfsid and ats_qdep calculation to device probe path
iommu/vt-d: Move iopf code from SVA to IOPF enabling path
iommu/vt-d: Allow SVA with device-specific IOPF
dmaengine: idxd: Add enable/disable device IOPF feature
arm64: dts: mt8186: Add dma-ranges for the parent "soc" node
...
The dmar_insert_dev_scope() could fail if any unexpected condition is
encountered. However, in this situation, the kernel should attempt
recovery and proceed with execution. Remove BUG_ON with WARN_ON, so that
kernel can avoid being crashed when an unexpected condition occurs.
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Link: https://lore.kernel.org/r/20230406065944.2773296-8-tina.zhang@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When dmar_alloc_pci_notify_info() is being invoked, the invoker has
ensured the dev->is_virtfn is false. So, remove the useless BUG_ON in
dmar_alloc_pci_notify_info().
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Link: https://lore.kernel.org/r/20230406065944.2773296-7-tina.zhang@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Domain map/unmap with invalid parameters shouldn't crash the kernel.
Therefore, using if() replaces the BUG_ON.
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Link: https://lore.kernel.org/r/20230406065944.2773296-6-tina.zhang@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When performing domain_context_mapping or getting dma_pte of a pfn, the
availability of the domain page table directory is ensured. Therefore,
the domain->pgd checkings are unnecessary.
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Link: https://lore.kernel.org/r/20230406065944.2773296-5-tina.zhang@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
VT-d iotlb cache invalidation request with unexpected type is considered
as a bug to developers, which can be fixed. So, when such kind of issue
comes out, it needs to be reported through the kernel log, instead of
halting the system. Replacing BUG_ON with warning reporting.
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Link: https://lore.kernel.org/r/20230406065944.2773296-4-tina.zhang@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When encountering an unexpected invalid pfn range, the kernel should
attempt recovery and proceed with execution. Therefore, using WARN_ON to
replace BUG_ON to avoid halting the machine.
Besides, one redundant checking is reduced.
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Link: https://lore.kernel.org/r/20230406065944.2773296-3-tina.zhang@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This addresses the following issue reported by klocwork tool:
- operands of different size in bitwise operations
Suggested-by: Yongwei Ma <yongwei.ma@intel.com>
Signed-off-by: Tina Zhang <tina.zhang@intel.com>
Link: https://lore.kernel.org/r/20230406065944.2773296-2-tina.zhang@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
There's no more usage, remove PASID supervisor support.
Suggested-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20230331231137.1947675-3-jacob.jun.pan@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Supervisor Request Enable (SRE) bit in a PASID entry is for permission
checking on DMA requests. When SRE = 0, DMA with supervisor privilege
will be blocked. However, for in-kernel DMA this is not necessary in that
we are targeting kernel memory anyway. There's no need to differentiate
user and kernel for in-kernel DMA.
Let's use non-privileged (user) permission for all PASIDs used in kernel,
it will be consistent with DMA without PASID (RID_PASID) as well.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20230331231137.1947675-2-jacob.jun.pan@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The kernel coding style does not require 'extern' in function prototypes
in .h files, so remove them from drivers/iommu/intel/iommu.h as they are
not needed.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20230331045452.500265-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
There is no need to use GFP_ATOMIC here. GFP_KERNEL is already used for
some other memory allocations just a few lines above.
Commit e3a981d61d15 ("iommu/vt-d: Convert allocations to GFP_KERNEL") has
changed the other memory allocation flags.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/e2a8a1019ffc8a86b4b4ed93def3623f60581274.1675542576.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>