240 Commits

Author SHA1 Message Date
Mukul Joshi
93c5bcd4ea drm/amdgpu: Conditionally reset SDMA RAS error counts
Reset SDMA RAS error counts during init only if persistent
EDC harvesting is not supported.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Aaron Liu
e2329e74a6 drm/amdgpu: enable sdma0 tmz for Raven/Renoir(V2)
Without driver loaded, SDMA0_UTCL1_PAGE.TMZ_ENABLE is set to 1
by default for all asic. On Raven/Renoir, the sdma goldsetting
changes SDMA0_UTCL1_PAGE.TMZ_ENABLE to 0.
This patch restores SDMA0_UTCL1_PAGE.TMZ_ENABLE to 1.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-07-01 00:02:22 -04:00
Lee Jones
47a6c67648 drm/amd/amdgpu/sdma_v4_0: Realign functions with their headers
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:764: warning: expecting prototype for sdma_v4_0_page_ring_set_wptr(). Prototype was for sdma_v4_0_ring_set_wptr() instead
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:830: warning: expecting prototype for sdma_v4_0_ring_set_wptr(). Prototype was for sdma_v4_0_page_ring_set_wptr() instead

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-21 10:32:18 -04:00
Feifei Xu
5d11699914 drm/amdgpu: Correct and simplify sdma 4.x irq.num_types
Correct and init the sdma4.x irq.num_types.

v2: squash in fix (Alex)

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-28 23:36:04 -04:00
Feifei Xu
3d2bee9188 drm/amdgpu: Change the sdma interrupt print level
Change the print level into debug.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-28 23:35:51 -04:00
Alex Sierra
126bbd4ab5 drm/amdgpu: extend xnack limit page fault timeout
Extending this timeout will prevent IH from storm interrupts coming
from SDMA while a page fault is active. Currently, on Aldebaran,
handling that many interrupts can take a lot of CPU time
(up to 4 seconds).
This eventually causes timeouts in other tasks.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-23 17:16:20 -04:00
Christian König
c107171b8d drm/amdgpu: add the sched_score to amdgpu_ring_init
Allow separate ring to share the same scheduler score.

No functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:44:56 -04:00
Alex Sierra
07744e9069 drm/amdgpu: UTLC1 RB SDMA timeout on Aldebaran
[Why]
This causes infinite retries on the UTCL1 RB, preventing
higher priority RB such as paging RB.

[How]
Set to one the SDMAx_UTLC1_TIMEOUT registers for all SDMAs.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:56:36 -04:00
Hawking Zhang
2fdb91a25e drm/amdgpu: add sdma v4_4 ras function
sdma ras function is the main structure to support
sdma ras on aldebaran. the patch initializes late_init
late_fini callbacks.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li<Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:31 -04:00
Hawking Zhang
a6d9d6ab84 drm/amdgpu: apply sdma golden settings for aldebaran
perform one-time initialization for sdma registers

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:28 -04:00
Kevin Wang
e747ca0a4e drm/amdgpu: declare sdma firmware binary file for aldebaran
declare sdma firmware binary file for aldebaran

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:26 -04:00
Kevin Wang
5af81c6e6e drm/amdgpu: add aldebaran sdma firmware support (v2)
add sdma firmware load support for soc model

v2: drop some emulator leftovers (Alex)

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:02:17 -05:00
Le Ma
b61a273e5d drm/amdgpu: add sdma block support for aldebaran
Add initial sdma support for aldebaran, and this asic has 5 sdma instances.

v2: remove adundant condition check

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <Evan.Quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:01:59 -05:00
Feifei Xu
8f211fe8ac drm/amdgpu: add sdma 4_x interrupts printing
Add VM_HOLE/DOORBELL_INVALID_BE/POLL_TIMEOUT/SRBMWRITE
interrupt info printing.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-05 15:12:23 -05:00
Feifei Xu
e528556577 drm/amdgpu: simplify the sdma 4_x MGCG/MGLS logic.
SDMA 4_x asics share the same MGCG/MGLS setting.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-05 15:12:15 -05:00
Jiapeng Chong
fec432f557 drm/amdgpu: Remove unnecessary conversion to bool
Fix the following coccicheck warnings:

./drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:2252:40-45: WARNING: conversion
to bool not needed here.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-03 10:51:37 -05:00
Likun Gao
9ca0674a71 drm/amdgpu: remove redundant logic related HDP
Remove hdp_flush function from amdgpu_nbio struct as it have been unified
into hdp struct.
Remove the include about hdp register which was not used.
V2: Remove hdp golden setting which is unnecessary.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-05 11:33:14 -05:00
Lee Jones
4c724ae91d drm/amd/amdgpu/sdma_v4_0: Repair a bunch of kernel-doc problems
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:848: warning: Function parameter or member 'job' not described in 'sdma_v4_0_ring_emit_ib'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:848: warning: Function parameter or member 'flags' not described in 'sdma_v4_0_ring_emit_ib'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:923: warning: Function parameter or member 'addr' not described in 'sdma_v4_0_ring_emit_fence'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:923: warning: Function parameter or member 'seq' not described in 'sdma_v4_0_ring_emit_fence'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:923: warning: Function parameter or member 'flags' not described in 'sdma_v4_0_ring_emit_fence'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:923: warning: Excess function parameter 'fence' description in 'sdma_v4_0_ring_emit_fence'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1117: warning: Function parameter or member 'ring' not described in 'sdma_v4_0_rb_cntl'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1117: warning: Function parameter or member 'rb_cntl' not described in 'sdma_v4_0_rb_cntl'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1581: warning: Function parameter or member 'timeout' not described in 'sdma_v4_0_ring_test_ib'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1682: warning: Function parameter or member 'value' not described in 'sdma_v4_0_vm_write_pte'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1682: warning: Excess function parameter 'addr' description in 'sdma_v4_0_vm_write_pte'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1682: warning: Excess function parameter 'flags' description in 'sdma_v4_0_vm_write_pte'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1734: warning: Function parameter or member 'ring' not described in 'sdma_v4_0_ring_pad_ib'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1782: warning: Function parameter or member 'vmid' not described in 'sdma_v4_0_ring_emit_vm_flush'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1782: warning: Function parameter or member 'pd_addr' not described in 'sdma_v4_0_ring_emit_vm_flush'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1782: warning: Excess function parameter 'vm' description in 'sdma_v4_0_ring_emit_vm_flush'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:2508: warning: Function parameter or member 'ib' not described in 'sdma_v4_0_emit_copy_buffer'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:2508: warning: Function parameter or member 'tmz' not described in 'sdma_v4_0_emit_copy_buffer'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:2508: warning: Excess function parameter 'ring' description in 'sdma_v4_0_emit_copy_buffer'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:2534: warning: Function parameter or member 'ib' not described in 'sdma_v4_0_emit_fill_buffer'
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:2534: warning: Excess function parameter 'ring' description in 'sdma_v4_0_emit_fill_buffer'

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-12-01 16:04:42 -05:00
Stanley.Yang
055e94a867 drm/amdgpu: only skip smc sdma sos ta and asd fw in SRIOV for navi12
The KFDTopologyTest.BasicTest will failed if skip smc, sdma, sos, ta
and asd fw in SRIOV for vega10, so adjust above fw and skip load them
in SRIOV only for navi12.

v2: remove unnecessary asic type check.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-12-01 15:58:16 -05:00
Deepak R Varma
8e607d7e27 drm/amdgpu/sdma: use "*" adjacent to data name
When declaring pointer data, the "*" symbol should be used adjacent to
the data name as per the coding standards. This resolves following
issues reported by checkpatch script:
	ERROR: "foo *   bar" should be "foo *bar"
	ERROR: "foo * bar" should be "foo *bar"
	ERROR: "foo*            bar" should be "foo *bar"
	ERROR: "(foo*)" should be "(foo *)"

Signed-off-by: Deepak R Varma <mh12gx2825@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-02 15:35:48 -05:00
Prike Liang
f74d0535e9 drm/amdgpu/sdma: add sdma engine support for green_sardine (v2)
Initialize the SDMA IP for green_sardine.

v2: use apu flags

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-07 14:45:17 -04:00
Jingwen Chen
162b786f0f drm/amd: Skip not used microcode loading in SRIOV
smc, sdma, sos, ta and asd fw is not used in SRIOV. Skip them to
accelerate sw_init for navi12.

v2: skip above fw in SRIOV for vega10 and sienna_cichlid
v3: directly skip psp fw loading in SRIOV
Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by:  Emily.Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25 16:54:00 -04:00
Zheng Bin
724dc53b92 drm/amd/amdgpu: fix comparison pointer to bool warning in sdma_v4_0.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1003:4-9: WARNING: Comparison to bool
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c:1083:5-11: WARNING: Comparison to bool

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Mukul Joshi
62f6b1162e drm/amdgpu: Enable SDMA utilization for Arcturus
SDMA utilization calculations are enabled/disabled by
writing to SDMAx_PUB_DUMMY_REG2 register. Currently,
enable this only for Arcturus.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
Nirmoy Das
75e1658ea0 drm/amdgpu: call release_firmware() without a NULL check
The release_firmware() function is NULL tolerant so we do not need
to check for NULL param before calling it.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01 01:59:27 -04:00
Joseph Greathouse
d4dd336409 drm/amdgpu: Reconfigure ULV for gfx9 server SKUs
SDMA ULV can benefit low-power modes, but can sometimes cause
latency increases in small SDMA transfers. Server SKUs have a
different trade-off space in this domain, so this configures
the server SKUs' ULV hysteresis times differently than consumer
SKUs'.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01 01:59:21 -04:00
Alex Deucher
d0767e0e0f drm/amdgpu/sdma4: simplify the logic around powering up sdma
Just check if it's an APU.  The checks for the ppfuncs are
pointless because if we don't have them we can't power up
sdma anyway so we shouldn't even be in this code in the first
place.  I'm not sure about the in_gpu_reset check.  This
probably needs to be double checked.  The fini logic doesn't
match the init logic however with that in_gpu_reset check
in place which seems odd.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-28 14:00:49 -04:00
Alex Deucher
70534d1ee8 drm/amdgpu: simplify raven and renoir checks
Just check for APU.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-28 14:00:49 -04:00
Alex Deucher
e27fb8215f drm/amdgpu/sdma4: add renoir to powergating setup
Looks like renoir should be handled here as well.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-28 14:00:49 -04:00
Alex Deucher
54f78a7655 drm/amdgpu: add apu flags (v2)
Add some APU flags to simplify handling of different APU
variants.  It's easier to understand the special cases
if we use names flags rather than checking device ids and
silicon revisions.

v2: rebase on latest code

Acked-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-22 13:41:53 -04:00
Dave Airlie
370fb6b0aa Merge tag 'amd-drm-next-5.8-2020-04-30' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.8-2020-04-30:

amdgpu:
- SR-IOV fixes
- SDMA fix for Navi
- VCN 2.5 DPG fixes
- Display fixes
- Display stuttering fixes for pageflip and cursor
- Add support for handling encrypted GPU memory
- Add UAPI for encrypted GPU memory
- Rework IB pool handling

amdkfd:
- Expose asic revision in topology
- Add UAPI for GWS (Global Wave Sync) resource management

UAPI:
- Add amdgpu UAPI for encrypted GPU memory
  Used by: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401
- Add amdkfd UAPI for GWS (Global Wave Sync) resource management
  Thunk usage of KFD ioctl: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/blob/roc-2.8.0/src/queues.c#L840
  ROCr usage of Thunk API: https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/roc-3.1.0/src/core/runtime/amd_gpu_agent.cpp#L597
  HCC code using ROCr API: 98ee9f3494/lib/hsa/mcwamp_hsa.cpp (L2161)
  HIP code using HCC API: cf8589b8c8/src/hip_module.cpp (L567)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430212951.3902-1-alexander.deucher@amd.com
2020-05-08 13:31:08 +10:00
Dave Airlie
937eea297e Merge tag 'amd-drm-next-5.8-2020-04-24' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.8-2020-04-24:

amdgpu:
- Documentation improvements
- Enable FRU chip access on boards that support it
- RAS updates
- SR-IOV updates
- Powerplay locking fixes for older SMU versions
- VCN DPG (dynamic powergating) cleanup
- VCN 2.5 DPG enablement
- Rework GPU scheduler handling
- Improve scheduler priority handling
- Add SPM (streaming performance monitor) golden settings for navi
- GFX10 clockgating fixes
- DC ABM (automatic backlight modulation) fixes
- DC cursor and plane fixes
- DC watermark fixes
- DC clock handling fixes
- DC color management fixes
- GPU reset fixes
- Clean up MMIO access macros
- EEPROM access fixes
- Misc code cleanups

amdkfd:
- Misc code cleanups

radeon:
- Clean up safe reg list generation
- Misc code cleanups

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200424190827.4542-1-alexander.deucher@amd.com
2020-04-30 11:08:54 +10:00
Aaron Liu
b7c163fe91 drm/amdgpu: enable TMZ bit in sdma copy pkt for sdma v4
Enable sdma TMZ mode via setting TMZ bit in sdma copy pkt
for sdma v4

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-28 16:20:29 -04:00
Aaron Liu
be7538ff74 drm/amdgpu: expand sdma copy_buffer interface with tmz parameter
This patch expands sdma copy_buffer interface with tmz parameter.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-28 16:20:29 -04:00
Yong Zhao
e1046a1f70 drm/amdgpu: Adjust the SDMA doorbell info printing
Turn off the printing by default because it is not very useful, while
adding more details.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22 18:11:49 -04:00
Nirmoy Das
1c6d567bdf drm/amdgpu: rework sched_list generation
Generate HW IP's sched_list in amdgpu_ring_init() instead of
amdgpu_ctx.c. This makes amdgpu_ctx_init_compute_sched(),
ring.has_high_prio and amdgpu_ctx_init_sched() unnecessary.
This patch also stores sched_list for all HW IPs in one big
array in struct amdgpu_device which makes amdgpu_ctx_init_entity()
much more leaner.

v2:
fix a coding style issue
do not use drm hw_ip const to populate amdgpu_ring_type enum

v3:
remove ctx reference and move sched array and num_sched to a struct
use num_scheds to detect uninitialized scheduler list

v4:
use array_index_nospec for user space controlled variables
fix possible checkpatch.pl warnings

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-09 10:43:14 -04:00
Linus Torvalds
f365ab31ef drm for 5.7-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJehCE6AAoJEAx081l5xIa+bKAQAJj49Hj3WvSJV6cSl3LmgwPV
 IcWpR6LLVrCOQiS600NyXnbv6lTmCtnIfwEneQqm5ltQvJk38QcKQvSua6+ETi9f
 0hl7IiytLwv2R/pS0g9jgSsKmbeP2bDwBAR44vK6pcK6WOgCrkpoL1F4YZEDUILT
 ewnRb52afF3Hw8PSG0lwgBYd7G0uto49t3nt86LjGftJgB3wPFlluVOzLHTeEh0w
 FZEyKuqS8hq8EZFfG1Gu1hS2ylO9y1VgYxiv18jDyRb8jUPq+gzqH6roTPRIronA
 whGZgC6SkyZ9NthCLBu4ITbO9wStAHoawzFfax25QwwuoOyrikuvGy3PfEUu+ixL
 bW2UtYK6BHLGnvZChH6E7i9J6qQbNYCn3Ty5bB1KsY06sHVoP1jYUBSicPN2ELWc
 9KaBI+WROBNEkge/rizwUFfD/u0MZaaSRsVSlGDdHkD7IFj2tDPhfNWdXZQl4EwR
 JnndT6cu97htf7tkid7RASpJ/hnwJTb1hg0Cc9kPblOrGqEDF0K5845FLR9VptGl
 5c2/KyKM/GaI793fP1TG76uDegBhV9337mUF1ZW3c6gCE7QXncZXM5jrIFRE9q2L
 IbvuyUYRof1gW0r1R0WVJYAi2CRBeMd7qgkQjMrbTw0o7FiYDJXB7dc5pYj2um2v
 mSVHikC2S1rbUdH0xbWM
 =I0vz
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2020-04-01' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "This is the main drm pull request for 5.7-rc1.

  Highlights:

   - i915 enables Tigerlake by default

   - i915 and amdgpu have initial OLED backlight support

     [ Jani Nikula pipes up and points out that we've had a bunch of
       "initial support" code for a long time already, but only now
       Lyude made it actually work on real world machines ]

   - vmwgfx add support to enable OpenGL 4 userspace

   - zero length arrays are mostly removed.

  Detailed summary:

  new driver:
   - tidss: TI Keystone platform display subsystem

  core:
   - new drm device warn macros
   - mode config valid for memory constrained devices
   - bridge bus format negotation
   - consolidated fake vblank event handling
   - dma_alloc related cleanups
   - drop get_crtc callback
   - dp: DP1.4 EDID corruption test
   - EDID CEA detailed timings improvements
   - relicense some code to dual GPL2/MIT
   - convert core vblank support to per-crtc support
   - rework drm_global_mutex
   - bridge rework to allow omap_dss custom driver removeal
   - remove drm_fb_helper connector interrfaces
   - zero-length array removal

  scheduler:
   - support for modifying the sched list
   - revert job distribution optimization
   - helper to pick least loaded scheduler
   - race condition fix

  mst:
   - various fixes
   - remove register_connector callback

  i915:
   - uapi to allows userspace specific CS ring buffer sizes
   - Tigerlake enablement patches + Tigerlake enabled by default
   - new sysfs entries for engine properties
   - display/logging refactors
   - eDP/DP fixes for DPCD
   - Gen7 back to aliasing-ppgtt
   - Gen8+ irq refactor
   - Avoid globals
   - GEM locking fixes and simplifications
   - Ice Lake and Elkhart Lake fixes and workarounds
   - Baytrail/Haswell instability fix
   - GVT - VFIO edid better support

  amdgpu:
   - Rework VM update handling in preparation for HMM support
   - drm load/unload removal fixups
   - USB-C PD firmware updates
   - HDCP srm support
   - Navi/renoir PM watermark fixes
   - OLED panel support
   - Optimize debugging vram access
   - Use BACO for runtime pm
   - DC clock programming optimizations and fixes
   - PSP fw loading sequence updates
   - Drop DRIVER_USE_AGP
   - Remove legacy drm load and unload callbacks
   - ACP Kconfig fix
   - Lots of fixes across the driver

  amdkfd:
   - runtime pm support
   - more gfx config details in amdgpu

  radeon:
   - drop DRIVER_USE_AGP

  vmwgfx:
   - Disable DMA when SEV encryption in use
   - Shader Model 5 support - needed for GL4 support

  msm:
   - DPU resource manager refactor
   - dpu using atomic global state

  mediatek:
   - MT8183 DPI support

  etnaviv:
   - out-of-bounds read fix
   - expose feature flags for GC400 STM32MP1 SoC
   - runtime suspend entry fix
   - dma32 zone fix

  hisilicon:
   - mode selection fixes

  meson:
   - YUV420 support

  lima:
   - add support for heap buffers

  tinydrm:
   - removal of owner field
   - explicit DT dependency removal
   - YAML schema conversion

  tegra:
   - misc cleanups

  tidss:
   - new driver

  virtio:
   - better batching of notifications to host
   - memory handling reworked
   - shmem + gpu context fixes

  hibmc:
   - add gamma_set support
   - improve DPMS support

  pl111:
   - Integrator IM-PD1 support

  sun4i:
   - LVDS support for A20 + A33
   - DSI panel handling improvements"

* tag 'drm-next-2020-04-01' of git://anongit.freedesktop.org/drm/drm: (1537 commits)
  drm/i915/display: Fix mode private_flags comparison at atomic_check
  drm/i915/gt: Stage the transfer of the virtual breadcrumb
  drm/i915/gt: Select the deepest available parking mode for rc6
  drm/i915: Avoid live-lock with i915_vma_parked()
  drm/i915/gt: Treat idling as a RPS downclock event
  drm/i915/gt: Cancel a hung context if already closed
  drm/i915: Use explicit flag to mark unreachable intel_context
  drm/amdgpu: don't try to reserve training bo for sriov (v2)
  drm/amdgpu/smu11: add support for SMU AC/DC interrupts
  drm/amdgpu/swSMU: handle manual AC/DC notifications
  drm/amdgpu/swSMU: handle DC controlled by GPIO for navi1x
  drm/amdgpu/swSMU: set AC/DC mode based on the current system state (v2)
  drm/amdgpu/swSMU: correct the bootup power source for Navi1X (v2)
  drm/amdgpu/swSMU: use the smu11 power source helper for navi1x
  drm/amdgpu/smu11: add a helper to set the power source
  drm/amd/swSMU: add callback to set AC/DC power source (v2)
  drm/scheduler: fix rare NULL ptr race
  drm/amdgpu: fix the coverage issue to clear ArcVPGRs
  drm/amd/display: Fix pageflip event race condition for DCN.
  drm/[radeon|amdgpu]: Remove HAINAN board from max_sclk override check
  ...
2020-04-01 15:24:20 -07:00
Christian König
1675c3a24d drm/amdgpu: stop disable the scheduler during HW fini
When we stop the HW for example for GPU reset we should not stop the
front-end scheduler. Otherwise we run into intermediate failures during
command submission.

The scheduler should only be stopped in very few cases:
1. We can't get the hardware working in ring or IB test after a GPU reset.
2. The KIQ scheduler is not used in the front-end and should be disabled during GPU reset.
3. In amdgpu_ring_fini() when the driver unloads.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Test-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-01 14:44:45 -04:00
Alex Sierra
04cdac5c17 drm/amdgpu: infinite retries fix from UTLC1 RB SDMA
[Why]
Previously these registers were set to 0. This was causing an
infinite retry on the UTCL1 RB, preventing higher priority RB such as paging RB.

[How]
Set to one the SDMAx_UTLC1_TIMEOUT registers for all SDMAs on Vega10, Vega12,
Vega20 and Arcturus.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-01 14:44:44 -04:00
xinhui pan
c8e42d5785 drm/amdgpu: implement more ib pools (v2)
We have three ib pools, they are normal, VM, direct pools.

Any jobs which schedule IBs without dependence on gpu scheduler should
use DIRECT pool.

Any jobs schedule direct VM update IBs should use VM pool.

Any other jobs use NORMAL pool.

v2: squash in coding style fix

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-01 14:44:44 -04:00
Wang Xiayang
aad7012c31 drm/amdgpu: fix two documentation mismatch issues
The function name mentioned in the documentational comments mismatches
the actual one. The mismatch may make trouble for automatic documentation
generation. One of the erronous name has seen to be misused
and fixed in commit bc5ab2d29b8a ("drm/amdgpu: fix typo in function
sdma_v4_0_page_resume").

There is apparently no functional change in the patch.

Signed-off-by: Wang Xiayang <xywang.sjtu@sjtu.edu.cn>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2020-03-17 19:02:28 +01:00
Hawking Zhang
86153f1be2 drm/amdgpu: add reset_ras_error_count function for SDMA
SDMA ras error counters are dirty ones after cold reboot
Read operation is needed to reset them to 0

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-03-05 00:32:32 -05:00
Nirmoy Das
a9d4fe2fd6 drm/amdgpu: remove unnecessary conversion to bool
Better clean that up before some automation starts to complain about it

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-22 16:55:27 -05:00
Aaron Liu
d8459d1b7f drm/amdgpu: update goldensetting for renoir
Update mmSDMA0_UTCL1_WATERMK golden setting for renoir.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:10 -05:00
Hawking Zhang
49da2ccd2d drm/amdgpu: check sdma ras funcs pointer before accessing
sdma ras funcs are not supported by ASIC prior
to vega20

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:09 -05:00
Hawking Zhang
5e62db9df6 drm/amdgpu: read sdma edc counter to clear the counters
SDMA edc counter registers were added in gfx edc counters
array. When querying gfx error counter in that array, there
is no way to differentiate sdma instance number for different
asic and then results to NULL pointer access when trying to
read sdma register base address for instances greater
than 2 on Vega20.
In addition, this also results to wrong gfx error counters
since it actually added sdma edc counters.
Therefore, sdma edc counter registers should be separated
from gfx edc counter regsiter array and only get initialized
when driver tries to enable sdma ras.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:08 -05:00
Hawking Zhang
1dd5ead294 drm/amdgpu: add ras_late_init and ras_fini for sdma v4
move ras_late_init and ras_fini to sdma_ras_funcs table

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:08 -05:00
Hawking Zhang
93070deb58 drm/amdgpu: add query_ras_error_count function for sdma v4
query_ras_error_count function will be invoked to query
single bit error count detected in sdma ip block

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-14 10:18:08 -05:00
Luben Tuikov
ce73516d42 drm/amdgpu: simplify padding calculations (v2)
Simplify padding calculations.

v2: Comment update and spacing.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-23 14:55:44 -05:00
Nirmoy Das
0c88b43032 drm/amdgpu: replace vm_pte's run-queue list with drm gpu scheds list
drm_sched_entity_init() takes drm gpu scheduler list instead of
drm_sched_rq list. This makes conversion of drm_sched_rq list
to drm gpu scheduler list unnecessary

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18 16:09:12 -05:00