linux/drivers/pci
Rafael J. Wysocki 9d5286d4e7 PCI/PM: Drain runtime-idle callbacks before driver removal
A race condition between the .runtime_idle() callback and the .remove()
callback in the rtsx_pcr PCI driver leads to a kernel crash due to an
unhandled page fault [1].

The problem is that rtsx_pci_runtime_idle() is not expected to be running
after pm_runtime_get_sync() has been called, but the latter doesn't really
guarantee that.  It only guarantees that the suspend and resume callbacks
will not be running when it returns.

However, if a .runtime_idle() callback is already running when
pm_runtime_get_sync() is called, the latter will notice that the runtime PM
status of the device is RPM_ACTIVE and it will return right away without
waiting for the former to complete.  In fact, it cannot wait for
.runtime_idle() to complete because it may be called from that callback (it
arguably does not make much sense to do that, but it is not strictly
prohibited).

Thus in general, whoever is providing a .runtime_idle() callback needs
to protect it from running in parallel with whatever code runs after
pm_runtime_get_sync().  [Note that .runtime_idle() will not start after
pm_runtime_get_sync() has returned, but it may continue running then if it
has started earlier.]

One way to address that race condition is to call pm_runtime_barrier()
after pm_runtime_get_sync() (not before it, because a nonzero value of the
runtime PM usage counter is necessary to prevent runtime PM callbacks from
being invoked) to wait for the .runtime_idle() callback to complete should
it be running at that point.  A suitable place for doing that is in
pci_device_remove() which calls pm_runtime_get_sync() before removing the
driver, so it may as well call pm_runtime_barrier() subsequently, which
will prevent the race in question from occurring, not just in the rtsx_pcr
driver, but in any PCI drivers providing .runtime_idle() callbacks.

Link: https://lore.kernel.org/lkml/20240229062201.49500-1-kai.heng.feng@canonical.com/ # [1]
Link: https://lore.kernel.org/r/5761426.DvuYhMxLoT@kreacher
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Ricky Wu <ricky_wu@realtek.com>
Acked-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: <stable@vger.kernel.org>
2024-03-05 12:08:31 -06:00
..
controller pci-v6.8-changes 2024-01-17 16:23:17 -08:00
endpoint Char/Misc and other Driver changes for 6.8-rc1 2024-01-17 16:47:17 -08:00
hotplug Revert "PCI: acpiphp: Reassign resources on bridge if necessary" 2023-12-15 14:55:10 -06:00
msi PCI/MSI: Use FIELD_GET/PREP() 2023-10-24 10:54:04 -05:00
pcie pci-v6.8-changes 2024-01-17 16:23:17 -08:00
switch PCI: switchtec: Fix stdev_release() crash after surprise hot remove 2023-11-22 09:44:06 -06:00
access.c PCI: Move pci_clear_and_set_dword() helper to PCI header 2023-12-13 13:35:41 +00:00
ats.c PCI/ATS: Use FIELD_GET() 2023-10-24 16:55:45 -05:00
bus.c Devicetree updates for v6.6: 2023-08-30 16:59:03 -07:00
doe.c PCI/DOE: Fix destroy_work_on_stack() race 2023-07-27 15:20:47 -05:00
ecam.c
host-bridge.c
iov.c PCI: Use resource names in PCI log messages 2023-12-15 17:28:42 -06:00
irq.c
Kconfig PCI: Replace unnecessary UTF-8 in Kconfig 2023-10-06 14:30:48 -05:00
Makefile PCI: Create device tree node for bridge 2023-08-22 14:56:09 -05:00
mmap.c
of_property.c PCI: of_property: Handle interrupt parsing failures 2023-09-29 17:33:46 -05:00
of.c PCI: of: Destroy changeset when adding PCI device node fails 2023-09-29 17:33:51 -05:00
p2pdma.c PCI/P2PDMA: Remove redundant goto 2023-10-23 12:17:52 -05:00
pci-acpi.c Merge branch 'pci/pm' 2023-10-28 13:30:59 -05:00
pci-bridge-emul.c
pci-bridge-emul.h
pci-driver.c PCI/PM: Drain runtime-idle callbacks before driver removal 2024-03-05 12:08:31 -06:00
pci-label.c
pci-mid.c
pci-pf-stub.c
pci-stub.c
pci-sysfs.c Driver core changes for 6.7-rc1 2023-11-03 15:15:47 -10:00
pci.c cxl for v6.8 2024-01-18 16:22:43 -08:00
pci.h pci-v6.8-changes 2024-01-17 16:23:17 -08:00
probe.c PCI: Log bridge info when first enumerating bridge 2023-12-15 17:28:43 -06:00
proc.c
quirks.c pci-v6.8-changes 2024-01-17 16:23:17 -08:00
remove.c PCI: Create device tree node for bridge 2023-08-22 14:56:09 -05:00
rom.c
search.c PCI: Add pci_get_base_class() helper 2023-09-28 16:49:44 -05:00
setup-bus.c PCI: Use resource names in PCI log messages 2023-12-15 17:28:42 -06:00
setup-irq.c
setup-res.c PCI: Use resource names in PCI log messages 2023-12-15 17:28:42 -06:00
slot.c
syscall.c PCI: Use consistent put_user() pointer types 2023-08-25 08:15:13 -05:00
vc.c PCI/VC: Use FIELD_GET() 2023-10-24 16:55:45 -05:00
vgaarb.c pci-v6.7-changes 2023-11-02 14:05:18 -10:00
vpd.c PCI/VPD: Add runtime power management to sysfs interface 2023-08-11 14:19:16 -05:00
xen-pcifront.c x86: always initialize xen-swiotlb when xen-pcifront is enabling 2023-07-31 17:54:27 +02:00