drm/amdgpu: Fix resume failures when device is gone

Problem: When device goes into suspend and unplugged during it then all HW programming during resume fails leading to a bad SW during pci remove handling which follows. Because device is first resumed and only later removed we cannot rely on drm_dev_enter/exit here. Fix: Use a flag we use for PCIe error recovery to avoid accessing registres. This allows to successfully complete pm resume sequence and finish pci remove. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-17 07:18:29 -04:00 · 2021-09-17 07:18:29 -04:00 · ebe86a57c8
commit ebe86a57c8
parent c03509cbc0
1 changed files with 4 additions and 0 deletions
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@ -1508,6 +1508,10 @@ static int amdgpu_pmops_resume(struct device *dev)
 	struct amdgpu_device *adev = drm_to_adev(drm_dev);
 	int r;

+	/* Avoids registers access if device is physically gone */
+	if (!pci_device_is_present(adev->pdev))
+		adev->no_hw_access = true;
+
 	r = amdgpu_device_resume(drm_dev, true);
 	if (amdgpu_acpi_is_s0ix_active(adev))
 		adev->in_s0ix = false;