Chris Wilson 10e36489ab drm/i915/execlists: Always clear pending&inflight requests on reset
If we skip the reset as we found the engine inactive at the time of the
reset, we still need to clear the residual inflight & pending request
bookkeeping to reflect the current state of HW.

Otherwise, we may end up stuck in a loop like:

<7> [416.490346] hangcheck rcs0
<7> [416.490371] hangcheck 	Awake? 1
<7> [416.490376] hangcheck 	Hangcheck: 8003 ms ago
<7> [416.490380] hangcheck 	Reset count: 0 (global 0)
<7> [416.490383] hangcheck 	Requests:
<7> [416.491210] hangcheck 	RING_START: 0x0017b000
<7> [416.491983] hangcheck 	RING_HEAD:  0x00000048
<7> [416.491992] hangcheck 	RING_TAIL:  0x00000048
<7> [416.492006] hangcheck 	RING_CTL:   0x00000000
<7> [416.492037] hangcheck 	RING_MODE:  0x00000200 [idle]
<7> [416.492044] hangcheck 	RING_IMR: 00000000
<7> [416.492809] hangcheck 	ACTHD:  0x00000000_9ca00048
<7> [416.492824] hangcheck 	BBADDR: 0x00000000_00001004
<7> [416.492838] hangcheck 	DMA_FADDR: 0x00000000_00000000
<7> [416.492845] hangcheck 	IPEIR: 0x00000000
<7> [416.492852] hangcheck 	IPEHR: 0x00000000
<7> [416.492863] hangcheck 	Execlist status: 0x00018001 00000000, entries 12
<7> [416.492869] hangcheck 	Execlist CSB read 1, write 1, tasklet queued? no (enabled)
<7> [416.492938] hangcheck 		Pending[0] ring:{start:0017b000, hwsp:fedf9000, seqno:00016fd6}, rq:  20ffa:16fd6!+  prio=-4094 @ 8307ms: signaled
<7> [416.492972] hangcheck 		Queue priority hint: -4093
<7> [416.492979] hangcheck 		Q  20ffa:16fd8-  prio=-4093 @ 8307ms: [i915]
<7> [416.492985] hangcheck 		Q  20ffa:16fda  prio=-4094 @ 8307ms: [i915]
<7> [416.492990] hangcheck 		Q  20ffa:16fdc  prio=-4094 @ 8307ms: [i915]
<7> [416.492996] hangcheck 		Q  20ffa:16fde  prio=-4094 @ 8307ms: [i915]
<7> [416.493001] hangcheck 		Q  20ffa:16fe0  prio=-4094 @ 8307ms: [i915]
<7> [416.493007] hangcheck 		Q  20ffa:16fe2  prio=-4094 @ 8307ms: [i915]
<7> [416.493013] hangcheck 		Q  20ffa:16fe4  prio=-4094 @ 8307ms: [i915]
<7> [416.493021] hangcheck 		...skipping 21 queued requests...
<7> [416.493027] hangcheck 		Q  20ffa:17010  prio=-4094 @ 8307ms: [i915]
<7> [416.493081] hangcheck HWSP:
<7> [416.493089] hangcheck [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [416.493094] hangcheck *
<7> [416.493100] hangcheck [0040] 10008002 00000000 10000018 00000000 10000018 00000000 10000001 00000000
<7> [416.493106] hangcheck [0060] 10000018 00000000 10000001 00000000 10000018 00000000 10000001 00000000
<7> [416.493111] hangcheck *
<7> [416.493117] hangcheck [00a0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001
<7> [416.493123] hangcheck [00c0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [416.493127] hangcheck *
<7> [416.493132] hangcheck Idle? no
<6> [416.512124] i915 0000:00:02.0: GPU HANG: ecode 11:0:0x00000000, hang on rcs0
<6> [416.512205] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<6> [416.512207] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<6> [416.512208] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<6> [416.512210] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<6> [416.512212] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<5> [416.513602] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
<7> [424.489258] hangcheck rcs0
<7> [424.489263] hangcheck 	Awake? 1
<7> [424.489267] hangcheck 	Hangcheck: 5954 ms ago
<7> [424.489271] hangcheck 	Reset count: 1 (global 0)
<7> [424.489274] hangcheck 	Requests:
<7> [424.490128] hangcheck 	RING_START: 0x00000000
<7> [424.490870] hangcheck 	RING_HEAD:  0x00000000
<7> [424.490877] hangcheck 	RING_TAIL:  0x00000000
<7> [424.490887] hangcheck 	RING_CTL:   0x00000000
<7> [424.490897] hangcheck 	RING_MODE:  0x00000200 [idle]
<7> [424.490904] hangcheck 	RING_IMR: 00000000
<7> [424.490917] hangcheck 	ACTHD:  0x00000000_00000000
<7> [424.490930] hangcheck 	BBADDR: 0x00000000_00000000
<7> [424.490943] hangcheck 	DMA_FADDR: 0x00000000_00000000
<7> [424.490950] hangcheck 	IPEIR: 0x00000000
<7> [424.490956] hangcheck 	IPEHR: 0x00000000
<7> [424.490968] hangcheck 	Execlist status: 0x00000001 00000000, entries 12
<7> [424.490972] hangcheck 	Execlist CSB read 11, write 11, tasklet queued? no (enabled)
<7> [424.490983] hangcheck 		Pending[0] ring:{start:0017b000, hwsp:fedf9000, seqno:00016fd6}, rq:  20ffa:16fd6!+  prio=-4094 @ 16305ms: signaled
<7> [424.490989] hangcheck 		Queue priority hint: -4093
<7> [424.490996] hangcheck 		Q  20ffa:16fd8-  prio=-4093 @ 16305ms: [i915]
<7> [424.491001] hangcheck 		Q  20ffa:16fda  prio=-4094 @ 16305ms: [i915]
<7> [424.491006] hangcheck 		Q  20ffa:16fdc  prio=-4094 @ 16305ms: [i915]
<7> [424.491011] hangcheck 		Q  20ffa:16fde  prio=-4094 @ 16305ms: [i915]
<7> [424.491016] hangcheck 		Q  20ffa:16fe0  prio=-4094 @ 16305ms: [i915]
<7> [424.491022] hangcheck 		Q  20ffa:16fe2  prio=-4094 @ 16305ms: [i915]
<7> [424.491048] hangcheck 		Q  20ffa:16fe4  prio=-4094 @ 16305ms: [i915]
<7> [424.491057] hangcheck 		...skipping 21 queued requests...
<7> [424.491063] hangcheck 		Q  20ffa:17010  prio=-4094 @ 16305ms: [i915]
<7> [424.491095] hangcheck HWSP:
<7> [424.491102] hangcheck [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [424.491106] hangcheck *
<7> [424.491113] hangcheck [0040] 10008002 00000000 10000018 00000000 10000018 00000000 10000001 00000000
<7> [424.491118] hangcheck [0060] 10000018 00000000 10000001 00000000 10000018 00000000 10000001 00000000
<7> [424.491122] hangcheck *
<7> [424.491127] hangcheck [00a0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 0000000b
<7> [424.491133] hangcheck [00c0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [424.491136] hangcheck *
<7> [424.491141] hangcheck Idle? no
<5> [424.491834] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0

Where not having cleared the pending array on reset, it persists
indefinitely.

Fixes: fff8102aaed5 ("drm/i915/execlists: Process interrupted context on reset")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190730133035.1977-2-chris@chris-wilson.co.uk
2019-08-01 09:24:59 +01:00
2019-07-16 12:21:41 -07:00
2019-07-11 15:40:06 -07:00
2019-07-20 12:09:52 -07:00
2019-07-20 09:34:55 -07:00
2019-07-20 09:34:55 -07:00
2019-07-18 09:36:51 -07:00
2019-07-20 09:34:55 -07:00
2019-06-18 14:37:27 +01:00
2019-07-19 12:22:04 -07:00
2019-03-10 17:48:21 -07:00
2019-07-21 14:05:38 -07:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.
Description
No description provided
Readme 5.7 GiB
Languages
C 97.6%
Assembly 1%
Shell 0.5%
Python 0.3%
Makefile 0.3%