12083 Commits

Author SHA1 Message Date
Pierre-Eric Pelloux-Prayer
febc9c65b3 drm/amdgpu: use sdma_v6 single packet invalidation
This achieves the same result as the sequence used in emit_flush_gpu_tlb
but the invalidation is now a single packet instead of the 3 packets
required to implement reg_write_reg_wait.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-11 18:03:45 -04:00
Lijo Lazar
418431bcc9 drm/amdgpu: Fix warnings
Fix below warning due to incompatible types in conditional operator

../pm/swsmu/smu13/smu_v13_0_6_ppt.c:315:17: sparse: sparse: incompatible
types in conditional expression (different base types):

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Link: https://lore.kernel.org/oe-kbuild-all/202303082135.NjdX1Bij-lkp@intel.com/
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-11 18:03:44 -04:00
Tong Liu01
5591a051b8 drm/amdgpu: refine get gpu clock counter method
[why]
regGOLDEN_TSC_COUNT_LOWER/regGOLDEN_TSC_COUNT_UPPER are protected and
unaccessible under sriov.
The clock counter high bit may update during reading process.

[How]
Replace regGOLDEN_TSC_COUNT_LOWER/regGOLDEN_TSC_COUNT_UPPER with
regCP_MES_MTIME_LO/regCP_MES_MTIME_HI to get gpu clock under sriov.
Refine get gpu clock counter method to make the result more precise.

Signed-off-by: Tong Liu01 <Tong.Liu01@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-11 18:03:44 -04:00
YiPeng Chai
fc926faefc drm/amdgpu: optimize redundant code in umc_v6_7
Optimize redundant code in umc_v6_7.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-11 18:03:44 -04:00
lyndonli
5e08e9c742 drm/amdgpu: Fix sdma v4 sw fini error
Fix sdma v4 sw fini error for sdma 4.2.2 to
solve the following general protection fault

[  +0.108196] general protection fault, probably for non-canonical
address 0xd5e5a4ae79d24a32: 0000 [#1] PREEMPT SMP PTI
[  +0.000018] RIP: 0010:free_fw_priv+0xd/0x70
[  +0.000022] Call Trace:
[  +0.000012]  <TASK>
[  +0.000011]  release_firmware+0x55/0x80
[  +0.000021]  amdgpu_ucode_release+0x11/0x20 [amdgpu]
[  +0.000415]  amdgpu_sdma_destroy_inst_ctx+0x4f/0x90 [amdgpu]
[  +0.000360]  sdma_v4_0_sw_fini+0xce/0x110 [amdgpu]

Signed-off-by: lyndonli <Lyndon.Li@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-11 18:03:44 -04:00
Shane Xiao
edd48e6d8f drm/amdgpu: DROP redundant drm_prime_sg_to_dma_addr_array
For DMA-MAP userptr on other GPUs, the dma address array
will be populated in amdgpu_ttm_backend_bind.

Remove the redundant call from the driver.

v2:
  update the comment

Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-11 18:03:44 -04:00
YiPeng Chai
e86bd8b21d drm/amdgpu: optimize redundant code in umc_v8_10
Optimize redundant code in umc_v8_10

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-11 18:03:44 -04:00
Shane Xiao
af152c2120 amd/amdgpu: Inherit coherence flags base on original BO flags
For SG BO to DMA-map userptrs on other GPUs, the SG BO
need inherit MTYPEs in PTEs from original BO.

If we set the flags, the device can be coherent with
the CPUs and other GPUs.

v2:
  1. Drop unnecessary flags check
  2. Remove local variable align

Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-11 18:03:44 -04:00
Shane Xiao
207bbfb63d drm/amdgpu: Add userptr bo support for mGPUs when iommu is on
For userptr bo with iommu on, multiple GPUs use same system
memory dma mapping address when both adev and bo_adev are in
identity mode or in the same iommu group.

If RAM direct map to one GPU, other GPUs can share the original
BO in order to reduce dma address array usage when RAM can
direct map to these GPUs. However, we should explicit check
whether RAM can direct map to all these GPUs.

This patch fixes a potential issue that where RAM is
direct mapped on some but not all GPUs.

v2:
  1. Update comment
  2. Add helper function reuse_dmamap

Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-11 18:03:44 -04:00
Srinivasan Shanmugam
11f25c844e drm/amd/amdgpu: Drop the hang limit parameter
The driver doesn't resubmit jobs on hangs any more, hence drop
the hang limit parameter - amdgpu_job_hang_limit, wherever it is used.

Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Kent Russell <kent.russell@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-11 18:03:43 -04:00
Yifan Zha
554836cc24 drm/amdgpu: Add MES KIQ clear to tell RLC that KIQ is dequeued
[Why]
As MES KIQ is dequeued, tell RLC that KIQ is inactive

[How]
Clear the RLC_CP_SCHEDULERS Active bit which RLC checks KIQ status
In addition, driver can halt MES under SRIOV when unloading driver

v2:
Use scheduler0 mask to clear KIQ portion of RLC_CP_SCHEDULERS

Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Reviewed-by: Horace Chen <horace.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-11 18:03:35 -04:00
Yifan Zha
9eb28ac1a2 drm/amdgpu: Add MES KIQ dequeue in MES hw fini
[Why]
Need dequeue MES KIQ under SRIOV when unloading driver

[How]
Modify mes_v11_0_kiq_dequeue_sched which was used to dequeue MES SCHED
to support veriable pipe.
Add MES KIQ dequeue in hw fini

Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Reviewed-by: Horace Chen <horace.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-11 18:03:22 -04:00
Jack Xiao
97998b893c drm/amd/amdgpu: introduce gc_*_mes_2.bin v2
To avoid new mes fw running with old driver, rename
mes schq fw to gc_*_mes_2.bin.

v2: add MODULE_FIRMWARE declaration
v3: squash in fixup patch

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-11 18:03:21 -04:00
Tim Huang
980d5baeb2 drm/amdgpu: allow more APUs to do mode2 reset when go to S4
Skip mode2 reset only for IMU enabled APUs when do S4.

This patch is to fix the regression issue
https://gitlab.freedesktop.org/drm/amd/-/issues/2483
It is generated by commit b589626674de ("drm/amdgpu: skip ASIC reset
for APUs when go to S4").

Fixes: b589626674de ("drm/amdgpu: skip ASIC reset for APUs when go to S4")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2483
Tested-by: Yuan Perry <Perry.Yuan@amd.com>
Signed-off-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-11 18:03:11 -04:00
Amber Lin
cebbfdd5f0 drm/amdkfd: Set noretry/xnack for GC 9.4.3
For GC 9.4.3, disable retry as default and XNACK can be different
modes per process.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:55 -04:00
Hawking Zhang
9c224e058d drm/amdgpu: Correct xgmi_wafl block name
Fix backward compatibility issue to stay with
the old name of xgmi_wafl node.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:53 -04:00
Yifan Zha
d2cdc01451 drm/amdgpu: Add JPEG IP block to SRIOV reinit
[Why]
Reset(mode1) failed as JPRG IP did not reinit under sriov.

[How]
Add JPEG IP block to sriov reinit function.

Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Reviewed-by: Horace Chen <Horace.Chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:53 -04:00
Srinivasan Shanmugam
e86c30e951 drm/amd/amdgpu: Remove initialisation of globals to 0 or NULL
Global variables do not need to be initialized to 0 or NULL and checkpatch
flags this error in drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:

ERROR: do not initialise globals to NULL
+char *amdgpu_disable_cu = NULL;
+char *amdgpu_virtual_display = NULL;

Fix this checkpatch error.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Hamza Mahfooz <Hamza.Mahfooz@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:44 -04:00
Hawking Zhang
69bacf1545 drm/amdgpu: add GMC ip block for GC 9.4.3
Add GMC IP handling for GC 9.4.3

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:43 -04:00
Le Ma
018f7300d4 drm/amdgpu: initialize gfxhub v1_2 and mmhub v1_8 funcs
Initialize gfxhub1.2 and mmhub1.8 function calls

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:43 -04:00
Le Ma
4d77b7e534 drm/amdgpu: add mmhub v1_8 support
Hack the mmhub 1.7 reg offset for initial support

v2: squash in header switch, CG funcs (Alex)

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:43 -04:00
Le Ma
6d4496bcfe drm/amdgpu: add gfxhub v1_2 support
Hack the gc 9.0 reg offset for initial support

v2: squash in header switch (Alex)

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:43 -04:00
Hawking Zhang
ab1a157ea7 drm/amdgpu: add gmc ip block support for GC 9.4.3
Initialize various gmc sw/hw settings/configurations
for GC 9.4.3.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:43 -04:00
Hawking Zhang
f3409f76a6 drm/amdgpu: Set family for GC 9.4.3
Set family for GC 9.4.3

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:43 -04:00
Hawking Zhang
5d55e1d02a drm/amdgpu: init nbio v7_9 callbacks
switch to the new nbio generation for NBIO 7.9.0.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:43 -04:00
Hawking Zhang
39def24f8c drm/amdgpu: add nbio v7_9 support
v7_9 is a new nbio generation ip.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:42 -04:00
Srinivasan Shanmugam
120ceaf78e drm/amd/amdgpu: Fix error do not initialise globals to 0
Global variables do not need to be initialized to 0 and checkpatch
flags this error in drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:

ERROR: do not initialise globals to 0
+int amdgpu_no_queue_eviction_on_vm_fault = 0;

Fix this checkpatch error.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Hamza Mahfooz <Hamza.Mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:32 -04:00
Luben Tuikov
8782007b5f drm/amdgpu: Return from switch early for EEPROM I2C address
As soon as control->i2c_address is set, return; remove the "break;" from the
switch--it is unnecessary. This mimics what happens when for some cases in the
switch, we call helper functions with "return <helper function>".

Remove final function "return true;" to indicate that the switch is final and
terminal, and that there should be no code after the switch.

Cc: Candice Li <candice.li@amd.com>
Cc: Kent Russell <kent.russell@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:32 -04:00
Luben Tuikov
1bb745d759 drm/amdgpu: Remove second moot switch to set EEPROM I2C address
Remove second switch since it already has its own function and case in the
first switch. This also avoids requalifying the EEPROM I2C address for VEGA20,
SIENNA CICHLID, and ALDEBARAN, as those have been set by the first switch and
shouldn't match SMU v13.0.x.

Cc: Candice Li <candice.li@amd.com>
Cc: Kent Russell <kent.russell@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Fixes: 158225294683 ("drm/amdgpu: Add EEPROM I2C address for smu v13_0_0")
Fixes: c9bdc6c3cf39 ("drm/amdgpu: Add EEPROM I2C address support for ip discovery")
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:32 -04:00
Stanley.Yang
059478929a drm/amdgpu: print ras drv fw debug info
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Candice Li <candice.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:32 -04:00
Hawking Zhang
9af357bc3e drm/amdgpu: Add fatal error handling in nbio v4_3
GPU will stop working once fatal error is detected.
it will inform driver to do reset to recover from
the fatal error.

v2: squash in logic fix (Srinivasan)
v3: squash in logic fix (Dan)

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Candice Li <candice.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-31 11:18:32 -04:00
Tim Huang
2fec9dc8e0 drm/amdgpu: allow more APUs to do mode2 reset when go to S4
Skip mode2 reset only for IMU enabled APUs when do S4.

This patch is to fix the regression issue
https://gitlab.freedesktop.org/drm/amd/-/issues/2483
It is generated by commit b589626674de ("drm/amdgpu: skip ASIC reset
for APUs when go to S4").

Fixes: b589626674de ("drm/amdgpu: skip ASIC reset for APUs when go to S4")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2483
Tested-by:  Yuan  Perry <Perry.Yuan@amd.com>
Signed-off-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
2023-03-30 11:23:58 -04:00
Tong Liu01
f5a5b08139 drm/amdgpu: skip unload tmr when tmr is not loaded
[why]
Skip TMR unload for Navi12 and CHIP_SIENNA_CICHLID SRIOV as TMR is
not loaded at all

Signed-off-by: Tong Liu01 <Tong.Liu01@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-27 18:20:14 -04:00
Kai-Heng Feng
2b072442f4 drm/amdgpu/nv: Apply ASPM quirk on Intel ADL + AMD Navi
S2idle resume freeze can be observed on Intel ADL + AMD WX5500. This is
caused by commit 0064b0ce85bb ("drm/amd/pm: enable ASPM by default").

The root cause is still not clear for now.

So extend and apply the ASPM quirk from commit e02fe3bc7aba
("drm/amdgpu: vi: disable ASPM on Intel Alder Lake based systems"), to
workaround the issue on Navi cards too.

Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2458
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2023-03-23 09:34:35 -04:00
Jane Jian
e06bfcc1a1 drm/amdgpu/gfx: set cg flags to enter/exit safe mode
sriov needs to enter/exit safe mode in update umd p state
add the cg flag to let it enter or exit while needed

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-23 09:32:01 -04:00
YuBiao Wang
033c56474a drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs
[Why]
For engines not supporting soft reset, i.e. VCN, there will be a failed
ib test before mode 1 reset during asic reset. The fences in this case
are never signaled and next time when we try to free the sa_bo, kernel
will hang.

[How]
During pre_asic_reset, driver will clear job fences and afterwards the
fences' refcount will be reduced to 1. For drm_sched_jobs it will be
released in job_free_cb, and for non-sched jobs like ib_test, it's meant
to be released in sa_bo_free but only when the fences are signaled. So
we have to force signal the non_sched bad job's fence during
pre_asic_reset or the clear is not complete.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-23 09:31:30 -04:00
Tong Liu01
4eb0b49a0a drm/amdgpu: add mes resume when do gfx post soft reset
[why]
when gfx do soft reset, mes will also do reset, if mes is not
resumed when do recover from soft reset, mes is unable to respond
in later sequence

[how]
resume mes when do gfx post soft reset

Signed-off-by: Tong Liu01 <Tong.Liu01@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-23 09:30:59 -04:00
Tim Huang
b589626674 drm/amdgpu: skip ASIC reset for APUs when go to S4
For GC IP v11.0.4/11, PSP TMR need to be reserved
for ASIC mode2 reset. But for S4, when psp suspend,
it will destroy the TMR that fails the ASIC reset.

[  96.006101] amdgpu 0000:62:00.0: amdgpu: MODE2 reset
[  100.409717] amdgpu 0000:62:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000011 SMN_C2PMSG_82:0x00000002
[  100.411593] amdgpu 0000:62:00.0: amdgpu: Mode2 reset failed!
[  100.412470] amdgpu 0000:62:00.0: PM: pci_pm_freeze(): amdgpu_pmops_freeze+0x0/0x50 [amdgpu] returns -62
[  100.414020] amdgpu 0000:62:00.0: PM: dpm_run_callback(): pci_pm_freeze+0x0/0xd0 returns -62
[  100.415311] amdgpu 0000:62:00.0: PM: pci_pm_freeze+0x0/0xd0 returned -62 after 4623202 usecs
[  100.416608] amdgpu 0000:62:00.0: PM: failed to freeze async: error -62

We can skip the reset on APUs, assuming we can resume them
properly. Verified on some GFX11, GFX10 and old GFX9 APUs.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
2023-03-23 09:29:55 -04:00
Tim Huang
aaee0ce460 drm/amdgpu: reposition the gpu reset checking for reuse
Move the amdgpu_acpi_should_gpu_reset out of
CONFIG_SUSPEND to share it with hibernate case.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
2023-03-23 09:29:32 -04:00
Jack Xiao
5ee33d905f drm/amd/amdgpu: limit one queue per gang
Limit one queue per gang in mes self test,
due to mes schq fw change.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 01:07:04 -04:00
YiPeng Chai
28606c4e58 drm/amdgpu: resume ras for gfx v11_0_3 during reset on SRIOV
Gfx v11_0_3 supports ras on SRIOV, so need to resume ras
during reset.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:58:50 -04:00
YiPeng Chai
ec64350d01 drm/amdgpu: reinit mes ip block during reset on SRIOV
Reinit mes ip block during reset on SRIOV.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:58:44 -04:00
YiPeng Chai
fe120b9f5c drm/amdgpu: enable ras for mp0 v13_0_10 on SRIOV
Enable ras for mp0 v13_0_10 on SRIOV.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:58:37 -04:00
Kai-Heng Feng
3ad5dcfe00 drm/amdgpu/nv: Apply ASPM quirk on Intel ADL + AMD Navi
S2idle resume freeze can be observed on Intel ADL + AMD WX5500. This is
caused by commit 0064b0ce85bb ("drm/amd/pm: enable ASPM by default").

The root cause is still not clear for now.

So extend and apply the ASPM quirk from commit e02fe3bc7aba
("drm/amdgpu: vi: disable ASPM on Intel Alder Lake based systems"), to
workaround the issue on Navi cards too.

Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2458
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:58:08 -04:00
Hawking Zhang
cfa0759827 drm/amdgpu: Initialize umc ras callback
Fix a coding error which results to null interrupt
handler for umc ras.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:48:01 -04:00
Lee Jones
a37558e63b drm/amd/amdgpu/amdgpu_vce: Provide description for amdgpu_vce_validate_bo()'s 'p' param
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:599: warning: Function parameter or member 'p' not described in 'amdgpu_vce_validate_bo'

v2: reword (Alex)

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:48:00 -04:00
Lee Jones
0b9ff428de drm/amd/amdgpu/amdgpu_mes: Ensure amdgpu_bo_create_kernel()'s return value is checked
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c: In function ‘amdgpu_mes_ctx_alloc_meta_data’:
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c:1099:13: warning: variable ‘r’ set but not used [-Wunused-but-set-variable]

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Jack Xiao <Jack.Xiao@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:48:00 -04:00
Lee Jones
25a75f56be drm/amd/amdgpu/ih_v6_0: Repair misspelling and provide descriptions for 'ih'
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/ih_v6_0.c:392: warning: Function parameter or member 'ih' not described in 'ih_v6_0_get_wptr'
 drivers/gpu/drm/amd/amdgpu/ih_v6_0.c:432: warning: Function parameter or member 'ih' not described in 'ih_v6_0_irq_rearm'
 drivers/gpu/drm/amd/amdgpu/ih_v6_0.c:458: warning: Function parameter or member 'ih' not described in 'ih_v6_0_set_rptr'

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:48:00 -04:00
Lee Jones
3e4bc662ec drm/amd/amdgpu/gmc_v11_0: Provide a few missing param descriptions relating to hubs and flushes
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:282: warning: Function parameter or member 'vmhub' not described in 'gmc_v11_0_flush_gpu_tlb'
 drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:282: warning: Function parameter or member 'flush_type' not described in 'gmc_v11_0_flush_gpu_tlb'
 drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:322: warning: Function parameter or member 'flush_type' not described in 'gmc_v11_0_flush_gpu_tlb_pasid'
 drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:322: warning: Function parameter or member 'all_hub' not described in 'gmc_v11_0_flush_gpu_tlb_pasid'

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:48:00 -04:00
Lee Jones
4caee043bd drm/amd/amdgpu/amdgpu_vm_pt: Supply description for amdgpu_vm_pt_free_dfs()'s unlocked param
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c:683: warning: Function parameter or member 'unlocked' not described in 'amdgpu_vm_pt_free_dfs'

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-22 00:48:00 -04:00