linux/drivers/gpu/drm/msm
Javier Martinez Canillas 0a58d2ae57
drm/msm: Make .remove and .shutdown HW shutdown consistent
Drivers' .remove and .shutdown callbacks are executed on different code
paths. The former is called when a device is removed from the bus, while
the latter is called at system shutdown time to quiesce the device.

This means that some overlap exists between the two, because both have to
take care of properly shutting down the hardware. But currently the logic
used in these two callbacks isn't consistent in msm drivers, which could
lead to kernel panic.

For example, on .remove the component is deleted and its .unbind callback
leads to the hardware being shutdown but only if the DRM device has been
marked as registered.

That check doesn't exist in the .shutdown logic and this can lead to the
driver calling drm_atomic_helper_shutdown() for a DRM device that hasn't
been properly initialized.

A situation like this can happen if drivers for expected sub-devices fail
to probe, since the .bind callback will never be executed. If that is the
case, drm_atomic_helper_shutdown() will attempt to take mutexes that are
only initialized if drm_mode_config_init() is called during a device bind.

This bug was attempted to be fixed in commit 623f279c77 ("drm/msm: fix
shutdown hook in case GPU components failed to bind"), but unfortunately
it still happens in some cases as the one mentioned above, i.e:

  systemd-shutdown[1]: Powering off.
  kvm: exiting hardware virtualization
  platform wifi-firmware.0: Removing from iommu group 12
  platform video-firmware.0: Removing from iommu group 10
  ------------[ cut here ]------------
  WARNING: CPU: 6 PID: 1 at drivers/gpu/drm/drm_modeset_lock.c:317 drm_modeset_lock_all_ctx+0x3c4/0x3d0
  ...
  Hardware name: Google CoachZ (rev3+) (DT)
  pstate: a0400009 (NzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : drm_modeset_lock_all_ctx+0x3c4/0x3d0
  lr : drm_modeset_lock_all_ctx+0x48/0x3d0
  sp : ffff80000805bb80
  x29: ffff80000805bb80 x28: ffff327c00128000 x27: 0000000000000000
  x26: 0000000000000000 x25: 0000000000000001 x24: ffffc95d820ec030
  x23: ffff327c00bbd090 x22: ffffc95d8215eca0 x21: ffff327c039c5800
  x20: ffff327c039c5988 x19: ffff80000805bbe8 x18: 0000000000000034
  x17: 000000040044ffff x16: ffffc95d80cac920 x15: 0000000000000000
  x14: 0000000000000315 x13: 0000000000000315 x12: 0000000000000000
  x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
  x8 : ffff80000805bc28 x7 : 0000000000000000 x6 : 0000000000000000
  x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
  x2 : ffff327c00128000 x1 : 0000000000000000 x0 : ffff327c039c59b0
  Call trace:
   drm_modeset_lock_all_ctx+0x3c4/0x3d0
   drm_atomic_helper_shutdown+0x70/0x134
   msm_drv_shutdown+0x30/0x40
   platform_shutdown+0x28/0x40
   device_shutdown+0x148/0x350
   kernel_power_off+0x38/0x80
   __do_sys_reboot+0x288/0x2c0
   __arm64_sys_reboot+0x28/0x34
   invoke_syscall+0x48/0x114
   el0_svc_common.constprop.0+0x44/0xec
   do_el0_svc+0x2c/0xc0
   el0_svc+0x2c/0x84
   el0t_64_sync_handler+0x11c/0x150
   el0t_64_sync+0x18c/0x190
  ---[ end trace 0000000000000000 ]---
  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018
  Mem abort info:
    ESR = 0x0000000096000004
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x04: level 0 translation fault
  Data abort info:
    ISV = 0, ISS = 0x00000004
    CM = 0, WnR = 0
  user pgtable: 4k pages, 48-bit VAs, pgdp=000000010eab1000
  [0000000000000018] pgd=0000000000000000, p4d=0000000000000000
  Internal error: Oops: 96000004 [#1] PREEMPT SMP
  ...
  Hardware name: Google CoachZ (rev3+) (DT)
  pstate: a0400009 (NzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : ww_mutex_lock+0x28/0x32c
  lr : drm_modeset_lock_all_ctx+0x1b0/0x3d0
  sp : ffff80000805bb50
  x29: ffff80000805bb50 x28: ffff327c00128000 x27: 0000000000000000
  x26: 0000000000000000 x25: 0000000000000001 x24: 0000000000000018
  x23: ffff80000805bc10 x22: ffff327c039c5ad8 x21: ffff327c039c5800
  x20: ffff80000805bbe8 x19: 0000000000000018 x18: 0000000000000034
  x17: 000000040044ffff x16: ffffc95d80cac920 x15: 0000000000000000
  x14: 0000000000000315 x13: 0000000000000315 x12: 0000000000000000
  x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
  x8 : ffff80000805bc28 x7 : 0000000000000000 x6 : 0000000000000000
  x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
  x2 : ffff327c00128000 x1 : 0000000000000000 x0 : 0000000000000018
  Call trace:
   ww_mutex_lock+0x28/0x32c
   drm_modeset_lock_all_ctx+0x1b0/0x3d0
   drm_atomic_helper_shutdown+0x70/0x134
   msm_drv_shutdown+0x30/0x40
   platform_shutdown+0x28/0x40
   device_shutdown+0x148/0x350
   kernel_power_off+0x38/0x80
   __do_sys_reboot+0x288/0x2c0
   __arm64_sys_reboot+0x28/0x34
   invoke_syscall+0x48/0x114
   el0_svc_common.constprop.0+0x44/0xec
   do_el0_svc+0x2c/0xc0
   el0_svc+0x2c/0x84
   el0t_64_sync_handler+0x11c/0x150
   el0t_64_sync+0x18c/0x190
  Code: aa0103f4 d503201f d2800001 aa0103e3 (c8e37c02)
  ---[ end trace 0000000000000000 ]---
  Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
  Kernel Offset: 0x495d77c00000 from 0xffff800008000000
  PHYS_OFFSET: 0xffffcd8500000000
  CPU features: 0x800,00c2a015,19801c82
  Memory Limit: none
  ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

Fixes: 9d5cbf5fe4 ("drm/msm: add shutdown support for display platform_driver")
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220816134612.916527-1-javierm@redhat.com
2022-08-22 05:13:55 +02:00
..
adreno drm/msm/adreno: Defer enabling runpm until hw_init() 2022-07-06 18:54:41 -07:00
disp Merge tag 'drm-msm-next-2022-07-10' of https://gitlab.freedesktop.org/drm/msm into drm-next 2022-07-13 10:55:53 +10:00
dp Merge tag 'drm-msm-next-2022-07-10' of https://gitlab.freedesktop.org/drm/msm into drm-next 2022-07-13 10:55:53 +10:00
dsi Merge tag 'drm-msm-next-2022-07-10' of https://gitlab.freedesktop.org/drm/msm into drm-next 2022-07-13 10:55:53 +10:00
hdmi Merge tag 'drm-msm-next-2022-07-10' of https://gitlab.freedesktop.org/drm/msm into drm-next 2022-07-13 10:55:53 +10:00
Kconfig Merge tag 'drm-msm-next-2022-05-09' of https://gitlab.freedesktop.org/drm/msm into drm-next 2022-05-11 12:40:47 +10:00
Makefile drm/msm/dp: rewrite dss_module_power to use bulk clock functions 2022-07-04 21:05:28 +03:00
msm_atomic_trace.h
msm_atomic_tracepoints.c
msm_atomic.c drm/msm: Avoid dirtyfb stalls on video mode displays (v2) 2022-02-25 07:59:58 -08:00
msm_debugfs.c drm: Drop drm_framebuffer.h from drm_crtc.h 2022-06-20 23:53:55 +03:00
msm_debugfs.h
msm_drv.c drm/msm: Make .remove and .shutdown HW shutdown consistent 2022-08-22 05:13:55 +02:00
msm_drv.h drm: Remove unnecessary include statements of drm_plane_helper.h 2022-07-26 18:42:04 +02:00
msm_fb.c drm: Drop drm_framebuffer.h from drm_crtc.h 2022-06-20 23:53:55 +03:00
msm_fbdev.c drm: Drop drm_framebuffer.h from drm_crtc.h 2022-06-20 23:53:55 +03:00
msm_fence.c drm/msm: Fix fence rollover issue 2022-07-05 06:01:11 +03:00
msm_fence.h drm/msm/gem: Add fenced vma unpin 2022-04-21 15:03:12 -07:00
msm_gem_prime.c drm/msm: Ensure mmap offset is initialized 2022-06-01 17:20:08 -07:00
msm_gem_shrinker.c drm/msm: Make enable_eviction flag static 2022-07-07 08:50:36 -07:00
msm_gem_submit.c Merge tag 'drm-msm-fixes-2022-06-28' of https://gitlab.freedesktop.org/drm/msm into drm-fixes 2022-06-29 14:16:46 +10:00
msm_gem_vma.c drm/msm/gem: Drop early returns in close/purge vma 2022-06-15 13:22:45 -07:00
msm_gem.c drm/msm: Switch to pfn mappings 2022-07-06 18:54:42 -07:00
msm_gem.h drm/msm/gem: Drop obj lock in msm_gem_free_object() 2022-07-06 18:54:41 -07:00
msm_gpu_devfreq.c drm/msm: Avoid unclocked GMU register access in 6xx gpu_busy 2022-07-06 08:38:06 -07:00
msm_gpu_trace.h
msm_gpu_tracepoints.c
msm_gpu.c drm/msm: Deprecate MSM_BO_UNCACHED harder 2022-07-06 18:54:42 -07:00
msm_gpu.h drm/msm/gpu: Add GEM debug label to devcore 2022-07-06 18:54:41 -07:00
msm_gpummu.c
msm_io_utils.c
msm_iommu.c drm/msm: use for_each_sgtable_sg to iterate over scatterlist 2022-06-14 11:54:38 -07:00
msm_kms.h drm/msm: don't free the IRQ if it was not requested 2022-05-18 15:55:46 -07:00
msm_mdss.c drm/msm/dpu: Move min BW request and full BW disable back to mdss 2022-06-01 16:16:19 -07:00
msm_mmu.h
msm_perf.c
msm_rd.c drm/msm: Add support for pointer params 2022-04-21 15:01:08 -07:00
msm_ringbuffer.c drm/msm/gem: Separate object and vma unpin 2022-06-15 13:06:54 -07:00
msm_ringbuffer.h drm/msm/gpu: Drop duplicate fence counter 2022-04-21 15:03:11 -07:00
msm_submitqueue.c drm/msm: Add a way to override processes comm/cmdline 2022-04-21 15:01:09 -07:00
NOTES