IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
gvt->types needs to be freed on error.
Fixes: bc90d097ae14 ("drm/i915/gvt: define weight according to vGPU type")
Reported-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://lore.kernel.org/r/20220923092652.100656-2-hch@lst.de
[aw: Correct fixes commit ID as reported by Stephen Rothwell]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
When converting to directly create the vfio_device the mdev driver has to
put a vfio_register_emulated_iommu_dev() in the probe() and a pairing
vfio_unregister_group_dev() in the remove.
This was missed for gvt, add it.
Cc: stable@vger.kernel.org
Fixes: 978cf586ac35 ("drm/i915/gvt: convert to use vfio_register_emulated_iommu_dev")
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/0-v1-013609965fe8+9d-vfio_gvt_unregister_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
1. Modify some annotation information formats to keep the
entire driver annotation format consistent.
2. Modify some log description formats to be consistent with
the format of the entire driver log.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20220926093332.28824-6-liulongfang@huawei.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The QM_QUE_ISO_CFG macro definition is no longer used
and needs to be deleted from the current driver.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20220926093332.28824-5-liulongfang@huawei.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Remove unused function parameters for vf_qm_fun_reset() and
ensure the device is enabled before the reset operation
is performed.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20220926093332.28824-4-liulongfang@huawei.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The queue address of the accelerator device should be combined into
a dma address in a way of combining the low and high bits.
The previous combination is wrong and needs to be modified.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20220926093332.28824-3-liulongfang@huawei.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
During the process of compatibility and matching of live migration
device information, if the isolation status of the two devices is
inconsistent, the live migration needs to be exited.
The current driver does not return the error code correctly and
needs to be fixed.
Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Longfang Liu <liulongfang@huawei.com>
Link: https://lore.kernel.org/r/20220926093332.28824-2-liulongfang@huawei.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The iommu_group comes from the struct device that a driver has been bound
to and then created a struct vfio_device against. To keep the iommu layer
sane we want to have a simple rule that only an attached driver should be
using the iommu API. Particularly only an attached driver should hold
ownership.
In VFIO's case since it uses the group APIs and it shares between
different drivers it is a bit more complicated, but the principle still
holds.
Solve this by waiting for all users of the vfio_group to stop before
allowing vfio_unregister_group_dev() to complete. This is done with a new
completion to know when the users go away and an additional refcount to
keep track of how many device drivers are sharing the vfio group. The last
driver to be unregistered will clean up the group.
This solves crashes in the S390 iommu driver that come because VFIO ends
up racing releasing ownership (which attaches the default iommu_domain to
the device) with the removal of that same device from the iommu
driver. This is a side case that iommu drivers should not have to cope
with.
iommu driver failed to attach the default/blocking domain
WARNING: CPU: 0 PID: 5082 at drivers/iommu/iommu.c:1961 iommu_detach_group+0x6c/0x80
Modules linked in: macvtap macvlan tap vfio_pci vfio_pci_core irqbypass vfio_virqfd kvm nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink mlx5_ib sunrpc ib_uverbs ism smc uvdevice ib_core s390_trng eadm_sch tape_3590 tape tape_class vfio_ccw mdev vfio_iommu_type1 vfio zcrypt_cex4 sch_fq_codel configfs ghash_s390 prng chacha_s390 libchacha aes_s390 mlx5_core des_s390 libdes sha3_512_s390 nvme sha3_256_s390 sha512_s390 sha256_s390 sha1_s390 sha_common nvme_core zfcp scsi_transport_fc pkey zcrypt rng_core autofs4
CPU: 0 PID: 5082 Comm: qemu-system-s39 Tainted: G W 6.0.0-rc3 #5
Hardware name: IBM 3931 A01 782 (LPAR)
Krnl PSW : 0704c00180000000 000000095bb10d28 (iommu_detach_group+0x70/0x80)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000001 0000000900000027 0000000000000039 000000095c97ffe0
00000000fffeffff 00000009fc290000 00000000af1fda50 00000000af590b58
00000000af1fdaf0 0000000135c7a320 0000000135e52258 0000000135e52200
00000000a29e8000 00000000af590b40 000000095bb10d24 0000038004b13c98
Krnl Code: 000000095bb10d18: c020003d56fc larl %r2,000000095c2bbb10
000000095bb10d1e: c0e50019d901 brasl %r14,000000095be4bf20
#000000095bb10d24: af000000 mc 0,0
>000000095bb10d28: b904002a lgr %r2,%r10
000000095bb10d2c: ebaff0a00004 lmg %r10,%r15,160(%r15)
000000095bb10d32: c0f4001aa867 brcl 15,000000095be65e00
000000095bb10d38: c004002168e0 brcl 0,000000095bf3def8
000000095bb10d3e: eb6ff0480024 stmg %r6,%r15,72(%r15)
Call Trace:
[<000000095bb10d28>] iommu_detach_group+0x70/0x80
([<000000095bb10d24>] iommu_detach_group+0x6c/0x80)
[<000003ff80243b0e>] vfio_iommu_type1_detach_group+0x136/0x6c8 [vfio_iommu_type1]
[<000003ff80137780>] __vfio_group_unset_container+0x58/0x158 [vfio]
[<000003ff80138a16>] vfio_group_fops_unl_ioctl+0x1b6/0x210 [vfio]
pci 0004:00:00.0: Removing from iommu group 4
[<000000095b5b62e8>] __s390x_sys_ioctl+0xc0/0x100
[<000000095be5d3b4>] __do_syscall+0x1d4/0x200
[<000000095be6c072>] system_call+0x82/0xb0
Last Breaking-Event-Address:
[<000000095be4bf80>] __warn_printk+0x60/0x68
It indicates that domain->ops->attach_dev() failed because the driver has
already passed the point of destructing the device.
Fixes: 9ac8545199a1 ("iommu: Fix use-after-free in iommu_release_device")
Reported-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/0-v2-a3c5f4429e2a+55-iommu_group_lifetime_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
All the functions that dereference struct vfio_container are moved into
container.c.
Simple code motion, no functional change.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/8-v3-297af71838d2+b9-vfio_container_split_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This is a container item.
A following patch will move the vfio_container functions to their own .c
file.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/7-v3-297af71838d2+b9-vfio_container_split_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
To vfio_container_ioctl_check_extension().
A following patch will turn this into a non-static function, make it clear
it is related to the container.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/6-v3-297af71838d2+b9-vfio_container_split_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This miscdev, noiommu driver and a couple of globals are all container
items. Move this init into its own functions.
A following patch will move the vfio_container functions to their own .c
file.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/5-v3-297af71838d2+b9-vfio_container_split_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This can all be accomplished using typical IS_ENABLED techniques, drop it
all.
Also rename the variable to vfio_noiommu so this can be made global in
following patches.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v3-297af71838d2+b9-vfio_container_split_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This splits up the ioctl of vfio_group_ioctl_set_container() so it
determines the type of file then invokes a type specific attachment
function. Future patches will add iommufd to this function as an
alternative type.
A following patch will move the vfio_container functions to their own .c
file.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v3-297af71838d2+b9-vfio_container_split_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
To vfio_group_detach_container(). This function is really a container
function.
Fold the WARN_ON() into it as a precondition assertion.
A following patch will move the vfio_container functions to their own .c
file.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/2-v3-297af71838d2+b9-vfio_container_split_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
and replace kref. With it a 'vfio-dev/vfioX' node is created under the
sysfs path of the parent, indicating the device is bound to a vfio
driver, e.g.:
/sys/devices/pci0000\:6f/0000\:6f\:01.0/vfio-dev/vfio0
It is also a preparatory step toward adding cdev for supporting future
device-oriented uAPI.
Add Documentation/ABI/testing/sysfs-devices-vfio-dev.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220921104401.38898-16-kevin.tian@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
With the addition of vfio_put_device() now the names become confusing.
vfio_put_device() is clear from object life cycle p.o.v given kref.
vfio_device_put()/vfio_device_try_get() are helpers for tracking
users on a registered device.
Now rename them:
- vfio_device_put() -> vfio_device_put_registration()
- vfio_device_try_get() -> vfio_device_try_get_registration()
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20220921104401.38898-15-kevin.tian@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
ccw is the only exception which cannot use vfio_alloc_device() because
its private device structure is designed to serve both mdev and parent.
Life cycle of the parent is managed by css_driver so vfio_ccw_private
must be allocated/freed in css_driver probe/remove path instead of
conforming to vfio core life cycle for mdev.
Given that use a wait/completion scheme so the mdev remove path waits
after vfio_put_device() until receiving a completion notification from
@release. The completion indicates that all active references on
vfio_device have been released.
After that point although free of vfio_ccw_private is delayed to
css_driver it's at least guaranteed to have no parallel reference on
released vfio device part from other code paths.
memset() in @probe is removed. vfio_device is either already cleared
when probed for the first time or cleared in @release from last probe.
The right fix is to introduce separate structures for mdev and parent,
but this won't happen in short term per prior discussions.
Remove vfio_init/uninit_group_dev() as no user now.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Link: https://lore.kernel.org/r/20220921104401.38898-14-kevin.tian@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Implement amba's own vfio_device_ops.
Remove vfio_platform_probe/remove_common() given no user now.
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20220921104401.38898-13-kevin.tian@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Move vfio_device_ops from platform core to platform drivers so device
specific init/cleanup can be added.
Introduce two new helpers vfio_platform_init/release_common() for the
use in driver @init/@release.
vfio_platform_probe/remove_common() will be deprecated.
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20220921104401.38898-12-kevin.tian@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Also add a comment to mark that vfio core releases device_set if @init
fails.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220921104401.38898-11-kevin.tian@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
and manage available_instances inside @init/@release.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220921104401.38898-10-kevin.tian@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Move vfio_device to the start of intel_vgpu as required by the new
helpers.
Change intel_gvt_create_vgpu() to use intel_vgpu as the first param
as other vgpu helpers do.
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://lore.kernel.org/r/20220921104401.38898-9-kevin.tian@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
and manage avail_mbytes inside @init/@release.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220921104401.38898-8-kevin.tian@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
and manage available ports inside @init/@release.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220921104401.38898-7-kevin.tian@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
and manage mdpy_count inside @init/@release.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220921104401.38898-6-kevin.tian@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Tidy up @probe so all migration specific initialization logic is moved
to migration specific @init callback.
Remove vfio_pci_core_{un}init_device() given no user now.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Link: https://lore.kernel.org/r/20220921104401.38898-5-kevin.tian@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
mlx5 has its own @init/@release for handling migration cap.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220921104401.38898-4-kevin.tian@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Also introduce two pci core helpers as @init/@release for pci drivers:
- vfio_pci_core_init_dev()
- vfio_pci_core_release_dev()
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220921104401.38898-3-kevin.tian@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The idea is to let vfio core manage the vfio_device life cycle instead
of duplicating the logic cross drivers. This is also a preparatory
step for adding struct device into vfio_device.
New pair of helpers together with a kref in vfio_device:
- vfio_alloc_device()
- vfio_put_device()
Drivers can register @init/@release callbacks to manage any private
state wrapping the vfio_device.
However vfio-ccw doesn't fit this model due to a life cycle mess
that its private structure mixes both parent and mdev info hence must
be allocated/freed outside of the life cycle of vfio device.
Per prior discussions this won't be fixed in short term by IBM folks.
Instead of waiting for those modifications introduce another helper
vfio_init_device() so ccw can call it to initialize a pre-allocated
vfio_device.
Further implication of the ccw trick is that vfio_device cannot be
freed uniformly in vfio core. Instead, require *EVERY* driver to
implement @release and free vfio_device inside. Then ccw can choose
to delay the free at its own discretion.
Another trick down the road is that kvzalloc() is used to accommodate
the need of gvt which uses vzalloc() while all others use kzalloc().
So drivers should call a helper vfio_free_device() to free the
vfio_device instead of assuming that kfree() or vfree() is appliable.
Later once the ccw mess is fixed we can remove those tricks and
fully handle structure alloc/free in vfio core.
Existing vfio_{un}init_group_dev() will be deprecated after all
existing usages are converted to the new model.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Co-developed-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20220921104401.38898-2-kevin.tian@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Now that everything is ready set the driver DMA logging callbacks if
supported by the device.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20220908183448.195262-11-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Handle async error events and health/recovery flow to safely stop the
tracker upon error scenarios.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20220908183448.195262-10-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Report dirty pages from tracker.
It includes:
Querying for dirty pages in a given IOVA range, this is done by
modifying the tracker into the reporting state and supplying the
required range.
Using the CQ event completion mechanism to be notified once data is
ready on the CQ/QP to be processed.
Once data is available turn on the corresponding bits in the bit map.
This functionality will be used as part of the 'log_read_and_clear'
driver callback in the next patches.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20220908183448.195262-9-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Add support for creating and destroying page tracker object.
This object is used to control/report the device dirty pages.
As part of creating the tracker need to consider the device capabilities
for max ranges and adapt/combine ranges accordingly.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20220908183448.195262-8-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Init QP based resources for dirty tracking to be used upon start
logging.
It includes:
Creating the host and firmware RC QPs, move each of them to its expected
state based on the device specification, etc.
Creating the relevant resources which are needed by both QPs as of UAR,
PD, etc.
Creating the host receive side resources as of MKEY, CQ, receive WQEs,
etc.
The above resources are cleaned-up upon stop logging.
The tracker object that will be introduced by next patches will use
those resources.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20220908183448.195262-7-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Introduce the DMA logging feature support in the vfio core layer.
It includes the processing of the device start/stop/report DMA logging
UAPIs and calling the relevant driver 'op' to do the work.
Specifically,
Upon start, the core translates the given input ranges into an interval
tree, checks for unexpected overlapping, non aligned ranges and then
pass the translated input to the driver for start tracking the given
ranges.
Upon report, the core translates the given input user space bitmap and
page size into an IOVA kernel bitmap iterator. Then it iterates it and
call the driver to set the corresponding bits for the dirtied pages in a
specific IOVA range.
Upon stop, the driver is called to stop the previous started tracking.
The next patches from the series will introduce the mlx5 driver
implementation for the logging ops.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20220908183448.195262-6-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The new facility adds a bunch of wrappers that abstract how an IOVA range
is represented in a bitmap that is granulated by a given page_size. So it
translates all the lifting of dealing with user pointers into its
corresponding kernel addresses backing said user memory into doing finally
the (non-atomic) bitmap ops to change various bits.
The formula for the bitmap is:
data[(iova / page_size) / 64] & (1ULL << (iova % 64))
Where 64 is the number of bits in a unsigned long (depending on arch)
It introduces an IOVA iterator that uses a windowing scheme to minimize the
pinning overhead, as opposed to pinning it on demand 4K at a time. Assuming
a 4K kernel page and 4K requested page size, we can use a single kernel
page to hold 512 page pointers, mapping 2M of bitmap, representing 64G of
IOVA space.
An example usage of these helpers for a given @base_iova, @page_size,
@length and __user @data:
bitmap = iova_bitmap_alloc(base_iova, page_size, length, data);
if (IS_ERR(bitmap))
return -ENOMEM;
ret = iova_bitmap_for_each(bitmap, arg, dirty_reporter_fn);
iova_bitmap_free(bitmap);
Each iteration of the @dirty_reporter_fn is called with a unique @iova
and @length argument, indicating the current range available through the
iova_bitmap. The @dirty_reporter_fn uses iova_bitmap_set() to mark dirty
areas (@iova_length) within that provided range, as following:
iova_bitmap_set(bitmap, iova, iova_length);
The facility is intended to be used for user bitmaps representing dirtied
IOVAs by IOMMU (via IOMMUFD) and PCI Devices (via vfio-pci).
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20220908183448.195262-5-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
DMA logging allows a device to internally record what DMAs the device is
initiating and report them back to userspace. It is part of the VFIO
migration infrastructure that allows implementing dirty page tracking
during the pre copy phase of live migration. Only DMA WRITEs are logged,
and this API is not connected to VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE.
This patch introduces the DMA logging involved uAPIs.
It uses the FEATURE ioctl with its GET/SET/PROBE options as of below.
It exposes a PROBE option to detect if the device supports DMA logging.
It exposes a SET option to start device DMA logging in given IOVAs
ranges.
It exposes a SET option to stop device DMA logging that was previously
started.
It exposes a GET option to read back and clear the device DMA log.
Extra details exist as part of vfio.h per a specific option.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220908183448.195262-4-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
L and S are swapped in the message.
s/VFIO_FLS_MC/VFIO_FSL_MC/
Also use 'ret' instead of 'WARN_ON(ret)' to avoid a duplicated message.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Diana Craciun <diana.craciun@oss.nxp.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Link: https://lore.kernel.org/r/a7c1394346725b7435792628c8d4c06a0a745e0b.1662134821.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Query ADV_VIRTUALIZATION capabilities which provide information for
advanced virtualization related features.
Current capabilities refer to the page tracker object which is used for
tracking the pages that are dirtied by the device.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20220905105852.26398-3-yishaih@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Introduce ifc related stuff to enable using page tracker.
A page tracker is a dirty page tracking object used by the device to
report the tracking log.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20220905105852.26398-2-yishaih@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
- Fix handling of PCI domains in /proc on 32-bit systems using the recently added support
for numbering buses from zero for each domain.
- A fix and a revert for some changes to use READ/WRITE_ONCE() which caused problems with
KASAN enabled due to sanitisation calls being introduced in low-level paths that can't
cope with it.
- Fix build errors on 32-bit caused by the syscall table being misaligned sometimes.
- Two fixes to get IBM Cell native machines booting again, which had bit-rotted while my
QS22 was temporarily out of action.
- Fix the papr_scm driver to not assume the order of events returned by the hypervisor is
stable, and a related compile fix.
Thanks to: Aneesh Kumar K.V, Christophe Leroy, Jordan Niethe, Kajol Jain, Masahiro Yamada,
Nathan Chancellor, Pali Rohár, Vaibhav Jain, Zhouyi Zhou.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmMUiKATHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgEyCD/wI2tkpoI7NleAwsmoiMqsxSBWChwBz
d1Z1PeEoUSQtrvNJG3D/nRZrd4DUb/i0nYV0EYpDqxg0gWBBSGhRIr8NdOVlh2Ot
w8uzhy579ngtEOExFJtDUfg/j5Gdobeto20aQg6G80HlvzF/DH11Z4HvFDg7UXFj
Y4Kq6sFSF52SUK+e20lsCzRV8PrUZHy2Oi8KqKmrwfaqtbZMfi1r2LLXHFDwGVqI
DfGWdebcJ1ojxrFotShh9IoCBNjiqlp+XiZK3GlrYl/SElTmyhvmJ1izTSQf/SwL
MvRDZj5ZXycMraZ0rx+TiBTFVP8EaX9rtg6kF/nAFLJkKJYOw0JeO1knljrfGMOb
Q9jj2QaSORXI3AMishueg2Du+InzTIrIdGBa3WXaHGZzU8t5KFzMcmR+MJ2dw60Y
YNHa85lRJZkc0w5Acq7tpc5mhK/Wa80LAAWaR80Obg5Y4NviJdyD/fFKssGTYHyC
NXl17nAY8pc79yOLrh7jJ7kxp0A4GkST8Jq/jej7ppFl2PDpqwFNKvyvqqrR0Lll
EVtjdxx4Ty4gu+tviA5/BTnQIcwhIrWn4M12orXI/KBtVKneJ56uu7gtMS4VWXQR
WV3y91Qv9OguXDcm66nwnO1m5I7s6PJFeSmWS1o4Gm3YRQjp4s3NIVemnqs34DXs
iVsA3MR/vaKwFw==
=3cQx
-----END PGP SIGNATURE-----
Merge tag 'powerpc-6.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix handling of PCI domains in /proc on 32-bit systems using the
recently added support for numbering buses from zero for each domain.
- A fix and a revert for some changes to use READ/WRITE_ONCE() which
caused problems with KASAN enabled due to sanitisation calls being
introduced in low-level paths that can't cope with it.
- Fix build errors on 32-bit caused by the syscall table being
misaligned sometimes.
- Two fixes to get IBM Cell native machines booting again, which had
bit-rotted while my QS22 was temporarily out of action.
- Fix the papr_scm driver to not assume the order of events returned by
the hypervisor is stable, and a related compile fix.
Thanks to Aneesh Kumar K.V, Christophe Leroy, Jordan Niethe, Kajol Jain,
Masahiro Yamada, Nathan Chancellor, Pali Rohár, Vaibhav Jain, and Zhouyi
Zhou.
* tag 'powerpc-6.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/papr_scm: Ensure rc is always initialized in papr_scm_pmu_register()
Revert "powerpc/irq: Don't open code irq_soft_mask helpers"
powerpc: Fix hard_irq_disable() with sanitizer
powerpc/rtas: Fix RTAS MSR[HV] handling for Cell
Revert "powerpc: Remove unused FW_FEATURE_NATIVE references"
powerpc: align syscall table for ppc32
powerpc/pci: Enable PCI domains in /proc when PCI bus numbers are not unique
powerpc/papr_scm: Fix nvdimm event mappings
-Wformat was recently re-enabled for builds with clang, then quickly
re-disabled, due to concerns stemming from the frequency of default
argument promotion related warning instances.
commit 258fafcd0683 ("Makefile.extrawarn: re-enable -Wformat for clang")
commit 21f9c8a13bb2 ("Revert "Makefile.extrawarn: re-enable -Wformat for clang"")
ISO WG14 has ratified N2562 to address default argument promotion
explicitly for printf, as part of the upcoming ISO C2X standard.
The behavior of clang was changed in clang-16 to not warn for the cited
cases in all language modes.
Add a version check, so that users of clang-16 now get the full effect
of -Wformat. For older clang versions, re-enable flags under the
-Wformat group that way users still get some useful checks related to
format strings, without noisy default argument promotion warnings. I
intentionally omitted -Wformat-y2k and -Wformat-security from being
re-enabled, which are also part of -Wformat in clang-16.
Link: https://github.com/ClangBuiltLinux/linux/issues/378
Link: https://github.com/llvm/llvm-project/issues/57102
Link: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2562.pdf
Suggested-by: Justin Stitt <jstitt007@gmail.com>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Youngmin Nam <youngmin.nam@samsung.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- MAINTAINERS update
- fix resource leaks in gpio-mockup and gpio-pxa
- add missing locking in gpio-pca953x
- use 32-bit I/O in gpio-realtek-otto
- make irq_chip structures immutable in four more drivers
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmMTwMsACgkQEacuoBRx
13IAkQ//SmLx/kNTjpuWtqa6MQKR3hbbXmLYbpMI3hTqpQ+Q/SXR6pxEJXmj1+K6
GlC9oSN3oV0AsayLaJ2YgmqH59rtjS+dJILXpmIx1jchnnBD7xXNeA3uX4GoJid0
NEKbBRb9dUz/CwXNiLmGeOPaDoi81K0ha+9RP4/lnjEzmexc6jfWd4cdics2XCyr
Zr8qY8bk0kNzWWdEuhXIN2WFL2XMnWSF/Y9L0+LtpMZf5BtT1vVu/6qwL4Da2KTQ
pnmMmDjQG1npe1AJ/CvObhzxYyaJRFPA9zCjhCJskm2SiwxwXTCArjMsl2vFcboi
f8JdqYfbH2D4LpJsWh03K7OEhPRzThdLBKy1HOJkrbqqVTKmUl750121O1p2AV9H
bMUpsmXw+ali/C4e4wBvNcS0XZK7vP1T3yAzjukZK2KZvhEDBo9YQECWnuUhFh/J
kB6qt9PlhdwHdLhe7+LnPWTuUdF5yQL4DR+53Tigjnqt9tQ5Mh1d2V/XVjRATppa
+cmCKp5NvVTmoenjO1c9QnMErzrCWoQ8mZSEalIoC5vQ6wzXM9OaiRla0V9DS8aV
dpankWmKyZ74i3f0QWMKHoqx+AKP+6AC7iRtPcSzZe2QjVOsWeStwHl6+GzlazDk
/Ml+iK9eZsOCJvJFq+5/r3n3sJWGcl5a4WpkZ2QYQ95fJprZhNQ=
=VkA4
-----END PGP SIGNATURE-----
Merge tag 'gpio-fixes-for-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
"A a set of fixes from the GPIO subsystem.
Most are small driver fixes except the realtek-otto driver patch which
is pretty big but addresses a significant flaw that can cause the CPU
to stay infinitely busy on uncleared ISR on some platforms.
Summary:
- MAINTAINERS update
- fix resource leaks in gpio-mockup and gpio-pxa
- add missing locking in gpio-pca953x
- use 32-bit I/O in gpio-realtek-otto
- make irq_chip structures immutable in four more drivers"
* tag 'gpio-fixes-for-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: ws16c48: Make irq_chip immutable
gpio: 104-idio-16: Make irq_chip immutable
gpio: 104-idi-48: Make irq_chip immutable
gpio: 104-dio-48e: Make irq_chip immutable
gpio: realtek-otto: switch to 32-bit I/O
gpio: pca953x: Add mutex_lock for regcache sync in PM
gpio: mockup: remove gpio debugfs when remove device
gpio: pxa: use devres for the clock struct
MAINTAINERS: rectify entry for XILINX GPIO DRIVER
Kernel warns about mutable irq_chips:
"not an immutable chip, please consider fixing!"
Make the struct irq_chip const, flag it as IRQCHIP_IMMUTABLE, add the
new helper functions, and call the appropriate gpiolib functions.
Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Kernel warns about mutable irq_chips:
"not an immutable chip, please consider fixing!"
Make the struct irq_chip const, flag it as IRQCHIP_IMMUTABLE, add the
new helper functions, and call the appropriate gpiolib functions.
Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>