linux/drivers/vfio
Abhishek Sahu cc2742fe36 vfio/pci: Implement VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY/EXIT
Currently, if the runtime power management is enabled for vfio-pci
based devices in the guest OS, then the guest OS will do the register
write for PCI_PM_CTRL register. This write request will be handled in
vfio_pm_config_write() where it will do the actual register write of
PCI_PM_CTRL register. With this, the maximum D3hot state can be
achieved for low power. If we can use the runtime PM framework, then
we can achieve the D3cold state (on the supported systems) which will
help in saving maximum power.

1. D3cold state can't be achieved by writing PCI standard
   PM config registers. This patch implements the following
   newly added low power related device features:
    - VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY
    - VFIO_DEVICE_FEATURE_LOW_POWER_EXIT

   The VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY feature will allow the
   device to make use of low power platform states on the host
   while the VFIO_DEVICE_FEATURE_LOW_POWER_EXIT will prevent
   further use of those power states.

2. The vfio-pci driver uses runtime PM framework for low power entry and
   exit. On the platforms where D3cold state is supported, the runtime
   PM framework will put the device into D3cold otherwise, D3hot or some
   other power state will be used.

   There are various cases where the device will not go into the runtime
   suspended state. For example,

   - The runtime power management is disabled on the host side for
     the device.
   - The user keeps the device busy after calling LOW_POWER_ENTRY.
   - There are dependent devices that are still in runtime active state.

   For these cases, the device will be in the same power state that has
   been configured by the user through PCI_PM_CTRL register.

3. The hypervisors can implement virtual ACPI methods. For example,
   in guest linux OS if PCI device ACPI node has _PR3 and _PR0 power
   resources with _ON/_OFF method, then guest linux OS invokes
   the _OFF method during D3cold transition and then _ON during D0
   transition. The hypervisor can tap these virtual ACPI calls and then
   call the low power device feature IOCTL.

4. The 'pm_runtime_engaged' flag tracks the entry and exit to
   runtime PM. This flag is protected with 'memory_lock' semaphore.

5. All the config and other region access are wrapped under
   pm_runtime_resume_and_get() and pm_runtime_put(). So, if any
   device access happens while the device is in the runtime suspended
   state, then the device will be resumed first before access. Once the
   access has been finished, then the device will again go into the
   runtime suspended state.

6. The memory region access through mmap will not be allowed in the low
   power state. Since __vfio_pci_memory_enabled() is a common function,
   so check for 'pm_runtime_engaged' has been added explicitly in
   vfio_pci_mmap_fault() to block only mmap'ed access.

Signed-off-by: Abhishek Sahu <abhsahu@nvidia.com>
Link: https://lore.kernel.org/r/20220829114850.4341-5-abhsahu@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-01 15:29:11 -06:00
..
fsl-mc vfio: de-extern-ify function prototypes 2022-06-27 09:38:02 -06:00
mdev vfio/mdev: Use the driver core to create the 'remove' file 2022-04-21 07:36:56 -04:00
pci vfio/pci: Implement VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY/EXIT 2022-09-01 15:29:11 -06:00
platform vfio: de-extern-ify function prototypes 2022-06-27 09:38:02 -06:00
Kconfig vfio: Use kconfig if XX/endif blocks instead of repeating 'depends on' 2021-08-26 10:36:51 -06:00
Makefile vfio: Move vfio.c to vfio_main.c 2022-08-08 14:33:41 -06:00
vfio_iommu_spapr_tce.c vfio/spapr_tce: Fix the comment 2022-07-22 16:24:47 -06:00
vfio_iommu_type1.c vfio: Replace phys_pfn with pages for vfio_pin_pages() 2022-07-25 13:41:22 -06:00
vfio_main.c vfio: Increment the runtime PM usage count during IOCTL call 2022-09-01 15:29:11 -06:00
vfio_spapr_eeh.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
vfio.h vfio: Replace phys_pfn with pages for vfio_pin_pages() 2022-07-25 13:41:22 -06:00
virqfd.c vfio/virqfd: Drain events from eventfd in virqfd_wakeup() 2020-11-15 09:49:10 -05:00