drm/i915: Sleep and retry a GPU reset if at first we don't succeed

As we declare the GPU wedged if the reset fails, such a failure is quite
terminal. Before taking that drastic action, let's sleep first and try
active, in the hope that the hardware has quietened down and is then
able to reset. After a few such attempts, it is fair to say that the HW
is truly wedged.

v2: Always print the failure message now, we precheck whether resets are
disabled.

References: https://bugs.freedesktop.org/show_bug.cgi?id=104007
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171201122011.16841-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
This commit is contained in:
Chris Wilson 2017-12-01 12:20:11 +00:00
parent 0502138933
commit f7096d40ee

View File

@ -1877,7 +1877,9 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
{
struct i915_gpu_error *error = &i915->gpu_error;
int ret;
int i;
might_sleep();
lockdep_assert_held(&i915->drm.struct_mutex);
GEM_BUG_ON(!test_bit(I915_RESET_BACKOFF, &error->flags));
@ -1900,15 +1902,23 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
goto error;
}
ret = intel_gpu_reset(i915, ALL_ENGINES);
if (ret) {
if (ret != -ENODEV)
DRM_ERROR("Failed to reset chip: %i\n", ret);
else
if (!intel_has_gpu_reset(i915)) {
DRM_DEBUG_DRIVER("GPU reset disabled\n");
goto error;
}
for (i = 0; i < 3; i++) {
ret = intel_gpu_reset(i915, ALL_ENGINES);
if (ret == 0)
break;
msleep(100);
}
if (ret) {
dev_err(i915->drm.dev, "Failed to reset chip\n");
goto error;
}
i915_gem_reset(i915);
intel_overlay_reset(i915);