21 Commits

Author SHA1 Message Date
Zhigang Luo
488b83f4d5 drm/amdgpu: remove sriov vf mmhub system aperture and fb location programming
host driver programmed mmhub system aperture and fb location for vf, no
need to program in guest side.

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-By : Shaoyun.liu <shaoyunl@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-08 12:15:01 -04:00
Dennis Li
7780f50358 drm/amdgpu: add function to clear MMEA error status for aldebaran
For aldebaran, hardware will not clear error status automatically when
reading error status register, insteadly driver should set clear bit of
the error status register explicitly to clear error status.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-05-10 18:11:44 -04:00
Felix Kuehling
9705c85ff2 drm/amdgpu: Enable retry faults unconditionally on Aldebaran
This is needed to allow per-process XNACK mode selection in the SQ when
booting with XNACK off by default.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-20 21:47:34 -04:00
Oak Zeng
5f41741a6d Revert "drm/amdgpu: workaround the TMR MC address issue (v2)"
This reverts commit 2f055097daef498da57552f422f49de50a1573e6.
2f055097daef498da57552f422f49de50a1573e6 was a driver workaround
when PSP firmware was not ready. Now the PSP fw is ready so we
revert this driver workaround.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-20 21:43:49 -04:00
Hawking Zhang
1f8d3ad2a0 drm/amdgpu: only harvest gcea/mmea error status in aldebaran
In aldebaran, driver only needs to harvest SDP
RdRspStatus, WrRspStatus and first parity error
on RdRsp data. Check error type before harvest
error information.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-20 21:35:55 -04:00
Oak Zeng
0ca565ab97 drm/amdgpu: Calling address translation functions to simplify codes
Use amdgpu_gmc_vram_pa and amdgpu_gmc_vram_cpu_pa
to simplify codes. No logic change.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-15 16:03:01 -04:00
John Clements
89514083f8 drm/amdgpu: update mmhub 1.7 ras error reporting
only output ras error status if an error bit is set

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-15 16:02:39 -04:00
Hawking Zhang
8bc7b360ad drm/amdgpu: split mmhub callbacks into ras and non-ras ones
mmhub ras is only avaiable in cerntain mmhub ip
generation.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:51:19 -04:00
Oak Zeng
64f171581a drm/amdgpu: fix a few compiler warnings
1. make function mmhub_v1_7_setup_vm_pt_regs static
2. indent a if statement

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:29:26 -04:00
Oak Zeng
2f055097da drm/amdgpu: workaround the TMR MC address issue (v2)
With the 2-level gart page table,  vram is squeezed into gart aperture
and FB aperture is disabled. Therefore all VRAM virtual addresses are
 in the GART aperture. However currently PSP requires TMR addresses
in FB aperture. So we need some design change at PSP FW level to support
this 2-level gart table driver change. Right now this PSP FW support
doesn't exist. To workaround this issue temporarily, FB aperture is
added back and the gart aperture address is converted back to FB aperture
for this PSP TMR address.

Will revert it after we get a fix from PSP FW.

v2: squash in tmr fix for other asics (Kevin)

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:52 -04:00
Oak Zeng
0c19cab555 drm/amdgpu: HW setup of 2-level vmid0 page table
Set up HW for 2-level vmid0 page table: 1. Set up
PAGE_TABLE_START/END registers. Currently only plan
to do 2-level page table for ALDEBARAN, so only gfxhub1.0
and mmhub1.7 is changed. 2. Set page table base register.
For 2-level page table, the page table base should point
to PDB0. 3. Disable AGP and FB aperture as they are not
used.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:49 -04:00
Oak Zeng
7b454b3a34 drm/amdgpu: Use different gart table parameters for 2-level gart table
If use gart for FB translation, we will squeeze vram into
sysvm aperture. This requires 2 level gart table. Add
page table depth and page table block size parameters
to gmc. This is prepare work to 2-level gart table
construction

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:42 -04:00
Oak Zeng
1f928f5159 drm/amdgpu: Use physical translation mode to access page table
On A+A platform, CPU write page directory and page table in cached
mode. So it is necessary for page table walker to snoop CPU cache.
This setting is necessary for page walker to snoop page directory
and page table data out of CPU cache.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Acked-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:27 -04:00
Lijo Lazar
6d9059217a drm/amdgpu: Fix aldebaran MMHUB CG/LS logic
Aldebaran MMHUB CG/LS logic is controlled by VBIOS. Enable the state
change logic only if driver is used for control.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:57:41 -04:00
Hawking Zhang
842811369f drm/amdgpu: switch to cached noretry setting for aldebaran
global noretry setting now is cached to gmc.noretry

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:57:00 -04:00
Hawking Zhang
b45589b837 drm/amdgpu: add mmhub error status query callback for aldebaran
The callback will be invoked to query mmea error
status when needed.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li<Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:46 -04:00
Hawking Zhang
27ad2ca667 drm/amdgpu: add mmhub ras error reset callback for aldebaran
The callback will be invoked to reset mmhub ras error
counters when needed.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li<Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:44 -04:00
Hawking Zhang
cbb84e7aab drm/amdgpu: add mmhub ras error query callback for aldebaran
The callback will be invoked to harvest all kinds
of mmhub ras error

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li<Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:55:41 -04:00
Kevin Wang
5be50a8fd8 drm/amdgpu: switch to use reg distance member for mmhub v1_7
switch to use register distance member for mmhub v1_7
instead of hardcode

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:09 -04:00
Oak Zeng
4da999cdfc drm/amdgpu: Clean up mmhub functions for aldebaran
Add more function pointers to amdgpu_mmhub_funcs. ASIC specific
implementation of most mmhub functions are called from a general
function pointer, instead of calling different function for
different ASIC.

V2: Split patch into upstreamable and aldebaran

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:05 -04:00
Le Ma
f37945d50f drm/amdgpu: add mmhub support for aldebaran (v3)
v1: dupilcate mmhub_v1_7.c from mmhub_v1_0.c because
mmhub register address for aldebaran is different
from existing asics (Le)
v2: switch to latest mmhub_v9_4_2 register headers (Hawking)
v3: squash in init VM_L2_CNTL3 default value for mmhub v1_7

Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-10 00:01:47 -05:00