69696 Commits

Author SHA1 Message Date
Chris Wilson
651dabe27f drm/i915/gem: Always test execution status on closing the context
Verify that if a context is active at the time it is closed, that it is
either persistent and preemptible (with hangcheck running) or it shall
be removed from execution.

Fixes: 9a40bddd47ca ("drm/i915/gt: Expose heartbeat interval via sysfs")
Testcase: igt/gem_ctx_persistence/heartbeat-close
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-3-chris@chris-wilson.co.uk
(cherry picked from commit d3bb2f9b5ee66d5e000293edd6b6575e59d11db9)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:51 -04:00
Chris Wilson
ca65fc0d8e drm/i915/gt: Always send a pulse down the engine after disabling heartbeat
Currently, we check we can send a pulse prior to disabling the
heartbeat to verify that we can change the heartbeat, but since we may
re-evaluate execution upon changing the heartbeat interval we need another
pulse afterwards to refresh execution.

v2: Tvrtko asked if we could reduce the double pulse to a single, which
opened up a discussion of how we should handle the pulse-error after
attempting to change the property, and the desire to serialise
adjustment of the property with its validating pulse, and unwind upon
failure.

Fixes: 9a40bddd47ca ("drm/i915/gt: Expose heartbeat interval via sysfs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-2-chris@chris-wilson.co.uk
(cherry picked from commit 3dd66a94de59d7792e7917eb3075342e70f06f44)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:48 -04:00
Chris Wilson
7d442ea7c5 drm/i915: Cancel outstanding work after disabling heartbeats on an engine
We only allow persistent requests to remain on the GPU past the closure
of their containing context (and process) so long as they are continuously
checked for hangs or allow other requests to preempt them, as we need to
ensure forward progress of the system. If we allow persistent contexts
to remain on the system after the the hangcheck mechanism is disabled,
the system may grind to a halt. On disabling the mechanism, we sent a
pulse along the engine to remove all executing contexts from the engine
which would check for hung contexts -- but we did not prevent those
contexts from being resubmitted if they survived the final hangcheck.

Fixes: 9a40bddd47ca ("drm/i915/gt: Expose heartbeat interval via sysfs")
Testcase: igt/gem_ctx_persistence/heartbeat-stop
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-1-chris@chris-wilson.co.uk
(cherry picked from commit 7a991cd3e3da9a56d5616b62d425db000a3242f2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:46 -04:00
Chris Wilson
3cfea8c97c drm/i915/gem: Hold request reference for canceling an active context
We have to be very careful while walking the timeline->requests list
under the RCU guard, as the requests (and so rq->link) use
SLAB_TYPESAFE_BY_RCU and so the requests may be reallocated within an
rcu grace period. As the requests are reallocated, they are removed from
one list and placed on another, and if we are iterating over that
request at that moment, the list iteration jumps from one list to the
next and promptly gets confused. Verify we hold the request reference
to ensure that the request is not added to a new list behind our backs.

<4> [582.745252] general protection fault, probably for non-canonical address 0xcccccccccccccd5c: 0000 [#1] PREEMPT SMP PTI
<4> [582.745297] CPU: 0 PID: 1475 Comm: gem_ctx_persist Not tainted 5.9.0-rc1-CI-CI_DRM_8908+ #1
<4> [582.745304] Hardware name: Intel Corporation NUC7CJYH/NUC7JYB, BIOS JYGLKCPX.86A.0027.2018.0125.1347 01/25/2018
<4> [582.745317] RIP: 0010:__lock_acquire+0x2c3/0x1f40
<4> [582.745323] Code: 00 65 8b 05 c7 8a ef 7e 85 c0 0f 85 b4 07 00 00 44 8b 9d c4 08 00 00 45 85 db 0f 84 0f 01 00 00 ba 05 00 00 00 e9 c8 06 00 00 <48> 81 3f c0 89 c7 82 b8 00 00 00 00 41 0f 45 c0 83 fe 01 41 89 c3
<4> [582.745334] RSP: 0018:ffffc9000461bc40 EFLAGS: 00010002
<4> [582.745340] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
<4> [582.745345] RDX: 0000000000000000 RSI: 0000000000000000 RDI: cccccccccccccd5c
<4> [582.745350] RBP: ffff8881ec4a2880 R08: 0000000000000001 R09: 0000000000000001
<4> [582.745356] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
<4> [582.745361] R13: 0000000000000000 R14: 0000000000000000 R15: cccccccccccccd5c
<4> [582.745367] FS:  00007fb44da78e40(0000) GS:ffff888278000000(0000) knlGS:0000000000000000
<4> [582.745373] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [582.745378] CR2: 00007fb44daad040 CR3: 0000000268428000 CR4: 0000000000350ef0
<4> [582.745383] Call Trace:
<4> [582.745390]  ? __lock_acquire+0x913/0x1f40
<4> [582.745397]  lock_acquire+0xb5/0x3c0
<4> [582.745526]  ? kill_engines+0x19a/0x4b0 [i915]
<4> [582.745533]  ? find_held_lock+0x2d/0x90
<4> [582.745541]  _raw_spin_lock_irq+0x30/0x40
<4> [582.745635]  ? kill_engines+0x19a/0x4b0 [i915]
<4> [582.745727]  kill_engines+0x19a/0x4b0 [i915]
<4> [582.745820]  context_close+0x195/0x410 [i915]
<4> [582.745912]  i915_gem_context_close+0x5b/0x160 [i915]
<4> [582.745994]  i915_driver_postclose+0x14/0x40 [i915]
<4> [582.746003]  drm_file_free.part.13+0x240/0x290
<4> [582.746009]  drm_release_noglobal+0x16/0x50
<4> [582.746016]  __fput+0xa5/0x250
<4> [582.746021]  task_work_run+0x6e/0xb0
<4> [582.746028]  exit_to_user_mode_prepare+0x178/0x180
<4> [582.746034]  syscall_exit_to_user_mode+0x36/0x220
<4> [582.746040]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
<4> [582.746045] RIP: 0033:0x7fb44d1dc421
<4> [582.746050] Code: f7 d8 64 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 8b 05 ea cf 20 00 85 c0 75 16 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3f f3 c3 0f 1f 44 00 00 53 89 fb 48 83 ec 10
<4> [582.746062] RSP: 002b:00007ffed2e83818 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
<4> [582.746069] RAX: 0000000000000000 RBX: 0000556410bfe840 RCX: 00007fb44d1dc421
<4> [582.746075] RDX: 000000000000000a RSI: 00000000c0406469 RDI: 0000000000000008
<4> [582.746080] RBP: 0000000000000008 R08: 00007fb44d1c51cc R09: 00007fb44d1c5240
<4> [582.746086] R10: 0000000000000001 R11: 0000000000000246 R12: 00000000fffffffb
<4> [582.746091] R13: 0000000000000006 R14: 0000000000000000 R15: 000000000000000a
<4> [582.746099] Modules linked in: vgem mei_hdcp snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio btusb btrtl btbcm btintel x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul bluetooth ghash_clmulni_intel ecdh_generic ecc i915 r8169 realtek mei_me mei snd_hda_intel i2c_hid snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm pinctrl_geminilake pinctrl_intel prime_numbers [last unloaded: test_drm_mm]

Fixes: 736e785f9b28 ("drm/i915/gem: Reduce context termination list iteration guard to RCU")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200925101107.27869-2-chris@chris-wilson.co.uk
(cherry picked from commit badef44deff1fae8d21c5c1cfc4dde95fb5bf993)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:42 -04:00
Chris Wilson
5701a66edb drm/i915: Redo "Remove i915_request.lock requirement for execution callbacks"
The reordering and rebasing of commit 2e4c6c1a9db5 ("drm/i915: Remove
i915_request.lock requirement for execution callbacks") caused it to
revert an earlier correction. Let us restore commit 99f0a640d464
("drm/i915: Remove requirement for holding i915_request.lock for
breadcrumbs")

Fixes: 2e4c6c1a9db5 ("drm/i915: Remove i915_request.lock requirement for execution callbacks")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200925101107.27869-1-chris@chris-wilson.co.uk
(cherry picked from commit 35faeb7de9ef83da510a048f2016061f1e31d5fc)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:40 -04:00
Chris Wilson
4fe9af8e88 drm/i915/gem: Serialise debugfs i915_gem_objects with ctx->mutex
Since the debugfs may peek into the GEM contexts as the corresponding
client/fd is being closed, we may try and follow a dangling pointer.
However, the context closure itself is serialised with the ctx->mutex,
so if we hold that mutex as we inspect the state coupled in the context,
we know the pointers within the context are stable and will remain valid
as we inspect their tables.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200723172119.17649-3-chris@chris-wilson.co.uk
(cherry picked from commit 102f5aa491f262c818e607fc4fee08a724a76c69)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:37 -04:00
Matthew Auld
cef8ce5528 drm/i915: check i915_vm_alloc_pt_stash for errors
If we are really unlucky and encounter an error during
i915_vm_alloc_pt_stash, we end up passing an empty pt/pd stash all the
way down into the low-level ppgtt alloc code, leading to explosions,
since it expects at least the required number of pt/pd for the va range.

[  211.981418] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  211.981421] #PF: supervisor read access in kernel mode
[  211.981422] #PF: error_code(0x0000) - not-present page
[  211.981424] PGD 80000008439cb067 P4D 80000008439cb067 PUD 84a37f067 PMD 0
[  211.981427] Oops: 0000 [#1] SMP PTI
[  211.981428] CPU: 1 PID: 1301 Comm: i915_selftest Tainted: G     U    I       5.9.0-rc5+ #3
[  211.981430] Hardware name:  /NUC6i7KYB, BIOS KYSKLi70.86A.0050.2017.0831.1924 08/31/2017
[  211.981521] RIP: 0010:__gen8_ppgtt_alloc+0x1ed/0x3c0 [i915]
[  211.981523] Code: c1 48 c7 c7 5d 5d fe c0 65 ff 0d ee 1d 03 3f e8 d9 91 1f e2 8b 55 c4 31 c0 48 8b 75 b8 85 d2 0f 95 c0 48 8b 1c c6 48 89 45 98 <48> 8b 03 48 8b 90 58 02 00 00 48 85 d2 0f 84 07 ea 15 00 48 81 fa
[  211.981526] RSP: 0018:ffffba2cc0eb3970 EFLAGS: 00010202
[  211.981527] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000004
[  211.981529] RDX: 0000000000000002 RSI: ffff9be998bdb8c0 RDI: ffff9be99c844300
[  211.981530] RBP: ffffba2cc0eb39d8 R08: 0000000000000640 R09: ffff9be97cdfd000
[  211.981531] R10: ffff9be97cdfd614 R11: 0000000000000000 R12: 0000000000000000
[  211.981532] R13: ffff9be98607ba20 R14: ffff9be995a0b400 R15: ffffba2cc0eb39e8
[  211.981534] FS:  00007f0f10b31000(0000) GS:ffff9be99fc40000(0000) knlGS:0000000000000000
[  211.981536] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  211.981538] CR2: 0000000000000000 CR3: 000000084d74e006 CR4: 00000000003706e0
[  211.981539] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  211.981541] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  211.981542] Call Trace:
[  211.981609]  gen8_ppgtt_alloc+0x79/0x90 [i915]
[  211.981678]  ppgtt_bind_vma+0x36/0x80 [i915]
[  211.981756]  __vma_bind+0x39/0x40 [i915]
[  211.981818]  fence_work+0x21/0x98 [i915]
[  211.981879]  fence_notify+0x8d/0x128 [i915]
[  211.981939]  __i915_sw_fence_complete+0x62/0x240 [i915]
[  211.982018]  i915_vma_pin_ww+0x1ee/0x9c0 [i915]

Fixes: cd0452aa2a0d ("drm/i915: Preallocate stashes for vma page-directories")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921160844.73186-1-matthew.auld@intel.com
(cherry picked from commit 1604cb2aa7fafd83e11f9257f765a5f5dd7c19d3)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:35 -04:00
Maarten Lankhorst
159ace7ffe drm/i915: Fix uninitialised variable in intel_context_create_request.
In case backoff fails with an error, we return an undefined rq,
assign err to rq correctly.

Fixes: 8a929c9eb1c2 ("drm/i915: Use ww pinning for intel_context_create_request()")
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200918111208.1392128-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 4316b19dee27cc5cd34a95fdbc0a3a5237507701)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:32 -04:00
Chris Wilson
7d55531476 drm/i915: Break up error capture compression loops with cond_resched()
As the error capture will compress user buffers as directed to by the
user, it can take an arbitrary amount of time and space. Break up the
compression loops with a call to cond_resched(), that will allow other
processes to schedule (avoiding the soft lockups) and also serve as a
warning should we try to make this loop atomic in the future.

Testcase: igt/gem_exec_capture/many-*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200916090059.3189-2-chris@chris-wilson.co.uk
(cherry picked from commit 293f43c80c0027ff9299036c24218ac705ce584e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:30 -04:00
Dan Carpenter
eb2a27086a drm/i915: Fix an error code i915_gem_object_copy_blt()
This code should use "vma[1]" instead of "vma".  The "vma" variable is a
valid pointer.

Fixes: 6b05030496f7 ("drm/i915: Convert i915_gem_object/client_blt.c to use ww locking as well, v2.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200911075243.GG12635@kadam
(cherry picked from commit 68ba71e3ae6dd86a23486655e33c5f8c9bd90777)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:28 -04:00
Chris Wilson
922d369b29 drm/i915/gt: Clear the buffer pool age before use
If we create a new node, it is possible for the slab allocator to return
us a recently freed node. If that node was just retired, it will retain
the current jiffy as its node->age. There is then a miniscule window,
where as that node is retired, it will appear on the free list with an
incorrect age and be eligible for reuse by one thread, and then by a
second thread as the correct node->age is written.

Fixes: 06b73c2d0b65 ("drm/i915/gt: Delay taking the spinlock for grabbing from the buffer pool")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915091417.4086-3-chris@chris-wilson.co.uk
(cherry picked from commit 9bb34ff25c458a2a48fb61409df42f465ede37f8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:23 -04:00
Chris Wilson
ba2ebf605d drm/i915/gem: Prevent using pgprot_writecombine() if PAT is not supported
Let's not try and use PAT attributes for I915_MAP_WC if the CPU doesn't
support PAT.

Fixes: 6056e50033d9 ("drm/i915/gem: Support discontiguous lmem object maps")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Link: https://patchwork.freedesktop.org/patch/msgid/20200915091417.4086-2-chris@chris-wilson.co.uk
(cherry picked from commit 121ba69ffddc60df11da56f6d5b29bdb45c8eb80)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:20 -04:00
Chris Wilson
4caf017ee9 drm/i915/gem: Avoid implicit vmap for highmem on x86-32
On 32b, highmem using a finite set of indirect PTE (i.e. vmap) to provide
virtual mappings of the high pages.  As these are finite, map_new_virtual()
must wait for some other kmap() to finish when it runs out. If we map a
large number of objects, there is no method for it to tell us to release
the mappings, and we deadlock.

However, if we make an explicit vmap of the page, that uses a larger
vmalloc arena, and also has the ability to tell us to release unwanted
mappings. Most importantly, it will fail and propagate an error instead
of waiting forever.

Fixes: fb8621d3bee8 ("drm/i915: Avoid allocating a vmap arena for a single page") #x86-32
References: e87666b52f00 ("drm/i915/shrinker: Hook up vmap allocation failure notifier")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v4.7+
Link: https://patchwork.freedesktop.org/patch/msgid/20200915091417.4086-1-chris@chris-wilson.co.uk
(cherry picked from commit 060bb115c2d664f04db9c7613a104dfaef3fdd98)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2020-09-30 14:24:17 -04:00
Jiansong Chen
39ad082459 drm/amdgpu: disable gfxoff temporarily for navy_flounder
gfxoff is temporarily disabled for navy_flounder, since
at present the feature caused some tdr when performing
display operations.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30 13:53:21 -04:00
Guchun Chen
4a20300bc2 drm/amdgpu: drop duplicated ecc check for vega10 (v5)
The same ECC check has been executed in amdgpu_ras_init for vega10,
prior to gmc_v9_0_late_init.

v2: drop all atombios helper callings
v3: use bit operation
v4: correct inline comment, remove parity check statement
v5: squash in build fix

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30 13:53:21 -04:00
Dmytro Laktyushkin
d3768874e5 drm/amd/display: add pipe reassignment prevention code to dcn3
Add code to gracefuly handle any pipe reassignment
occuring on dcn3 hardware. This should only happen when new
surfaces are used for an update rather than old ones updated.

Fixes: 69fc1f4b976cea ("amd/drm/display: avoid dcn3 on flip opp change for slave pipes")
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30 13:50:22 -04:00
Oak Zeng
8ffff9b449 drm/amdgpu: use function pointer for gfxhub functions
gfxhub functions are now called from function pointers,
instead of from asic-specific functions.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30 13:50:13 -04:00
Ramesh Errabolu
825c91d090 drm/amd/amdgpu: Prepare implementation to support reporting of CU usage
[Why]
Allow user to know number of compute units (CU) that are in use at any
given moment.

[How]
Read registers of SQ that give number of waves that are in flight
of various queues. Use this information to determine number of CU's
in use.

Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30 13:50:06 -04:00
Ramesh Errabolu
b8810a142a drm/amd/amdgpu: Clean up header file of symbols that are defined to be static
[Why]
Header file exports functions get_gpu_clock_counter(), get_cu_info() and
select_se_sh() that are defined to be static

Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30 13:49:44 -04:00
Jiansong Chen
95433a1305 drm/amdgpu: disable gfxoff temporarily for navy_flounder
gfxoff is temporarily disabled for navy_flounder, since
at present the feature caused some tdr when performing
display operations.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30 09:47:43 -04:00
Evan Quan
b195152536 drm/amd/pm: setup APU dpm clock table in SMU HW initialization
As the dpm clock table is needed during DC HW initialization.
And that (DC HW initialization) comes before smu_late_init()
where current APU dpm clock table setup is performed. So, NULL
pointer dereference will be triggered. By moving APU dpm clock
table setup to smu_hw_init(), this can be avoided.

Fixes: 02cf91c113ea ("drm/amd/powerplay: postpone operations not required for hw setup to late_init")
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reported-by: Dirk Gouders <dirk@gouders.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-30 09:29:00 -04:00
Zack Rusin
f54c444289 drm/vmwgfx: Fix error handling in get_node
ttm_mem_type_manager_func.get_node was changed to return -ENOSPC
instead of setting the node pointer to NULL. Unfortunately
vmwgfx still had two places where it was explicitly converting
-ENOSPC to 0 causing regressions. This fixes those spots by
allowing -ENOSPC to be returned. That seems to fix recent
regressions with vmwgfx.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Sigend-off-by: Roland Scheidegger <sroland@vmware.com>
2020-09-30 05:44:28 +02:00
Dirk Gouders
548c7ba7dc drm/amd/display: remove duplicate call to rn_vbios_smu_get_smu_version()
Commit 78fe9f63947a2b ("drm/amd/display: Remove DISPCLK Limit Floor for Certain SMU Versions")
added a call to rn_vbios_smu_get_smu_version() to set clk_mgr->smu_ver.
That field is initialized prior to the if-statement, already.

Fixes: 78fe9f63947a2b (drm/amd/display: Remove DISPCLK Limit Floor for Certain SMU Versions)
Signed-off-by: Dirk Gouders <dirk@gouders.net>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Sung Lee <sung.lee@amd.com>
Cc: Yongqiang Sun <yongqiang.sun@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 17:10:31 -04:00
Alex Deucher
3c26d0314c drm/amdgpu/swsmu/smu12: fix force clock handling for mclk
The state array is in the reverse order compared to other asics
(high to low rather than low to high).

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1313
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 17:09:59 -04:00
Jean Delvare
a39d0d7bdf drm/amdgpu: restore proper ref count in amdgpu_display_crtc_set_config
A recent attempt to fix a ref count leak in
amdgpu_display_crtc_set_config() turned out to be doing too much and
"fixed" an intended decrease as if it were a leak. Undo that part to
restore the proper balance. This is the very nature of this function
to increase or decrease the power reference count depending on the
situation.

Consequences of this bug is that the power reference would
eventually get down to 0 while the display was still in use,
resulting in that display switching off unexpectedly.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: e008fa6fb415 ("drm/amdgpu: fix ref count leak in amdgpu_display_crtc_set_config")
Cc: stable@vger.kernel.org
Cc: Navid Emamdoost <navid.emamdoost@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 17:09:22 -04:00
Alex Deucher
c73d05eaba drm/amdgpu/display: fix CFLAGS setup for DCN30
Properly handle clang and older versions of gcc.

Fixes: e77165bf7b02a3 ("drm/amd/display: Add DCN3 blocks to Makefile")
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 17:08:02 -04:00
Flora Cui
898c7302f4 drm/amd/display: fix return value check for hdcp_work
max_caps might be 0, thus hdcp_work might be ZERO_SIZE_PTR

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 17:07:34 -04:00
Jiansong Chen
0c7014154d drm/amdgpu: remove gpu_info fw support for sienna_cichlid etc.
Remove gpu_info fw support for sienna_cichlid etc., since the
information can be retrieved from discovery binary.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 17:07:06 -04:00
Sudheesh Mavila
97cf32996c drm/amd/pm: Removed fixed clock in auto mode DPM
SMU10_UMD_PSTATE_PEAK_FCLK value should not be used to set the DPM.

Suggested-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 17:05:02 -04:00
Kent Russell
f94582e4bc drm/amdgpu: Use SKU instead of DID for FRU check v2
The VG20 DIDs 66a0, 66a1 and 66a4 are used for various SKUs that may or may
not have the FRU EEPROM on it. Parse the VBIOS to check for server SKU
variants (D131 or D134) until a more general solution can be determined.

v2: Remove string-based logic, correct the VBIOS string comment

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:14:09 -04:00
Alex Deucher
485d531c69 drm/amdgpu/swsmu/smu12: fix force clock handling for mclk
The state array is in the reverse order compared to other asics
(high to low rather than low to high).

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1313
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:14:03 -04:00
Dirk Gouders
808ec542c0 drm/amd/display: remove duplicate call to rn_vbios_smu_get_smu_version()
Commit 78fe9f63947a2b ("drm/amd/display: Remove DISPCLK Limit Floor for Certain SMU Versions")
added a call to rn_vbios_smu_get_smu_version() to set clk_mgr->smu_ver.
That field is initialized prior to the if-statement, already.

Fixes: 78fe9f63947a2b (drm/amd/display: Remove DISPCLK Limit Floor for Certain SMU Versions)
Signed-off-by: Dirk Gouders <dirk@gouders.net>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Sung Lee <sung.lee@amd.com>
Cc: Yongqiang Sun <yongqiang.sun@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:12:36 -04:00
Xiaojian Du
12a6727dee drm/amd/powerplay: add one sysfs file to support the feature to modify gfx clock on Raven/Raven2/Picasso APU.
This patch is to add one sysfs file -- "pp_od_clk_voltage" for
Raven/Raven2/Picasso APU, which is only used by dGPU like VEGA10.
This sysfs file supports the feature to modify gfx engine clock(Mhz units), it can
be used to configure the min value and the max value for gfx clock limited in the
safe range.

Command guide:
echo "s level clock" > pp_od_clk_voltage
	s - adjust teh sclk level
	level - 0 or 1, "0" represents the min value, "1" represents the max value
	clock - the clock value(Mhz units), like 400, 800 or 1200, the value must be within the
                OD_RANGE limits.
Example:
$ cat pp_od_clk_voltage
OD_SCLK:
0:        200Mhz
1:       1400Mhz
OD_RANGE:
SCLK:     200MHz       1400MHz

$ echo "s 0 600" > pp_od_clk_voltage
$ echo "s 1 1000" > pp_od_clk_voltage
$ cat pp_od_clk_voltage
OD_SCLK:
0:        600Mhz
1:       1000Mhz
OD_RANGE:
SCLK:     200MHz       1400MHz

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:12:16 -04:00
Shashank Sharma
72e71a82d6 drm/amdgpu: add new trace event for page table update
This patch adds a new trace event to track the PTE update
events. This specific event will provide information like:
- start and end of virtual memory mapping
- HW engine flags for the map
- physical address for mapping

This will be particularly useful for memory profiling tools
(like RMV) which are monitoring the page table update events.

V2: Added physical address lookup logic in trace point
V3: switch to use __dynamic_array
    added nptes int the TPprint arguments list
    added page size in the arg list
V4: Addressed Christian's review comments
    add start/end instead of seg
    use incr instead of page_sz to be accurate
V5: Addressed Christian's review comments:
    add pid and vm context information in the event
V6: Re-sequence the variables (put pid and ctx_id first)

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:12:02 -04:00
Guchun Chen
125b1deb60 drm/amdgpu: fix incorrect comment
It should be one copy-paste typo.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:11:50 -04:00
Jean Delvare
3514521ccb drm/amdgpu: restore proper ref count in amdgpu_display_crtc_set_config
A recent attempt to fix a ref count leak in
amdgpu_display_crtc_set_config() turned out to be doing too much and
"fixed" an intended decrease as if it were a leak. Undo that part to
restore the proper balance. This is the very nature of this function
to increase or decrease the power reference count depending on the
situation.

Consequences of this bug is that the power reference would
eventually get down to 0 while the display was still in use,
resulting in that display switching off unexpectedly.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: e008fa6fb415 ("drm/amdgpu: fix ref count leak in amdgpu_display_crtc_set_config")
Cc: stable@vger.kernel.org
Cc: Navid Emamdoost <navid.emamdoost@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:11:45 -04:00
Jason Yan
faf0389f1e drm/amd/display: make two symbols static
This addresses the following sparse warning:

drivers/gpu/drm/amd/amdgpu/../display/dc/dce110/dce110_hw_sequencer.c:2740:6:
warning: symbol 'dce110_set_cursor_position' was not declared. Should it
be static?
drivers/gpu/drm/amd/amdgpu/../display/dc/dce110/dce110_hw_sequencer.c:2785:6:
warning: symbol 'dce110_set_cursor_attribute' was not declared. Should
it be static?

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:11:40 -04:00
Jason Yan
0ac900bae2 drm/amd/display: make get_color_space_type() static
This addresses the following sparse warning:

drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_hw_sequencer.c:180:26:
warning: symbol 'get_color_space_type' was not declared. Should it be
static?

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:11:35 -04:00
Dmytro Laktyushkin
69fc1f4b97 amd/drm/display: avoid dcn3 on flip opp change for slave pipes
At the moment on flip opp reassignment does not work in all cases
for non root pipes.
This change simply makes sure we prefer pipes not used previously
when splitting in dcn3.

Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Reviewed-by: Eric Bernstein <eric.bernstein@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:11:23 -04:00
Chiawen Huang
8353d30e74 drm/amd/display: disable stream if pixel clock changed with link active
[Why]
Vbios uses preferred timing to turn on edp but OS could use other
timing.  If change pixel clock when link active, there is unexpected
garbage on monitor.

[How]
Once pixel clock changed, the driver needs to disable stream.

Signed-off-by: Chiawen Huang <chiawen.huang@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:11:13 -04:00
Anthony Koo
d9beecfc79 drm/amd/display: [FW Promotion] Release 0.0.35
[Header Changes]
   - Definition for retaining ABM settings during disable
   - Addition of some new AUX interface definitions
   - Addition of some outbox definitions

Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:11:04 -04:00
Aric Cyr
cbd975d0b1 drm/amd/display: Revert check for flip pending before locking pipes
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:09:05 -04:00
Wesley Chalmers
ec30798a82 drm/amd/display: Add debug param to force dio disable
[WHY]
At the moment, some tests are failing because cur_link_settings is
invalid. As a workaround, add an option to force dio disable.

Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:08:59 -04:00
Joshua Aberback
8e02c26a58 drm/amd/display: Calc DLG from dummy p-state if full p-state unsupported
[Why]
Currently, when full p-state changes are not supported, DLG parameters
are calculated for no p-state support at all. However, we are required
to always support dummy p-state changes, so we should instead calculate
DLG based on dummy p-state latency when full p-state is unsupported.
This behaviour already exists for DCN2.

[How]
 - move DLG calculation inside WM calculation
 - if p-state unsupported, do not recalculate for set A, instead copy from
set C, and perform DLG calculation with dummy p-state latency

Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:08:50 -04:00
Chiawen Huang
ba578afd5a drm/amd/display: disable stream if pixel clock changed with link active
[Why]
Vbios uses preferred timing to turn on edp but OS could use other
timing. If change pixel clock when link active, there is unexpected
garbage on monitor.

[How]
Once pixel clock changed, the driver needs to disable stream.

Signed-off-by: Chiawen Huang <chiawen.huang@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:08:44 -04:00
Wyatt Wood
89b151ade7 drm/amd/display: Ensure all debug bits are passed to fw
[Why]
Some debug bits are not being copied from driver to fw.

[How]
Copy debug bits properly.

Signed-off-by: Wyatt Wood <wyatt.wood@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:08:37 -04:00
Eric Bernstein
4ab1edbc9d drm/amd/display: Add dp_set_dsc_pps_info_packet to virtual stream encoder
Signed-off-by: Eric Bernstein <eric.bernstein@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:08:31 -04:00
Alvin Lee
4a3dea8932 drm/amd/display: Update NV1x SR latency values
[Why]
HW team measurement requires updating values

[How]
Update bounding box values

Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-29 16:08:15 -04:00
Dave Airlie
06c14f5c2d Mediatek DRM Next for Linux 5.10
1. Move Mediatek HDMI PHY driver from DRM folder to PHY folder
 2. Convert mtk-dpi to drm_bridge API
 3. Disable tmds on mt2701
 -----BEGIN PGP SIGNATURE-----
 
 iQJMBAABCgA2FiEEACwLKSDmq+9RDv5P4cpzo8lZTiQFAl9f6twYHGNodW5rdWFu
 Zy5odUBrZXJuZWwub3JnAAoJEOHKc6PJWU4k+FwQAItKwQieI/hfjE6+8AI4atW+
 e7HV3BQ2MkctIqeis/RZv1ubOtVFbpy7U0ndOU2ejFTLs92qBDgf5x91ywFZBK9P
 4n5BnU7uMmkhSSUJFfGAutADCONq2AsCCp7SNwqhCk865cYowbc0RBbE/6FyXPHB
 XbmGKKyU61F0X/MdIXPIC37zUNIr0aynMHqo2dirhpQd3wKDxUcWaRzC3VC7tmA+
 OgR0KQJondBtNeW0lHXv/beAyLbqQgMwlNbGNG0omWjnsO92BvmFyK1W1WYUMCRx
 UMzbjzFV8SHn2ewVbGaNj8hgfnp3qA1CZ7qNcTZdYt3cEvj8xB8iGP4kRdzcCKh7
 iqHqMQNrC+vWPtL6uNl/9MkO6mXpXL0bQj2tPkMf2tR33VbMS+L700QRgoC6PPrE
 JTGb0/UEPrdL/wghJHOOJ+oYo/gaHjPgWZZ/FmPdS3VNZ9DqG1xGmzRdNpixCFKV
 RK185UKiGgoUfv14kGWRu9YXS0I+nGl4tzj3xXCM0uxXEL0z7rxWhMEEkWzQfHqf
 87hsbt8WD0/TAdEhUYdqfClKbxzlTMAiIbH9rTJii/wAXtP2mSx8fjoM3K4mr5k4
 wx6RFPVpJkD7/CPNsJHuhicwn3AiBOlnEO9tfYqhqSOez1nFF9HaLur4XLpq4W0c
 KqALebgZ9zM9UnJGlhIY
 =NyM2
 -----END PGP SIGNATURE-----

Merge tag 'mediatek-drm-next-5.10' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-next

Mediatek DRM Next for Linux 5.10

1. Move Mediatek HDMI PHY driver from DRM folder to PHY folder
2. Convert mtk-dpi to drm_bridge API
3. Disable tmds on mt2701

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200914231227.30500-1-chunkuang.hu@kernel.org
2020-09-29 10:26:44 +10:00
Rob Clark
200a2186b6 drm/msm: fix 32b build warns
Neither of these code-paths apply to older 32b devices, but it is rude
to introduce warnings.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200929001925.2916984-1-robdclark@gmail.com
2020-09-29 10:20:51 +10:00