231 Commits

Author SHA1 Message Date
Feifei Xu
0698b13403 drm/amdgpu: skip PP_MP1_STATE_UNLOAD on aldebaran
This message is not needed on Aldebaran.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:47:50 -04:00
Chengming Gui
4a7ffbdb27 drm/amd/amdgpu: set MP1 state to UNLOAD before reload its FW for vega20/ALDEBARAN
When resume from gpu reset, need set MP1 state to UNLOAD before reload SMU
FW otherwise will cause following errors:
[  121.642772] [drm] reserve 0x400000 from 0x87fec00000 for PSP TMR [  123.801051] [drm] failed to load ucode id (24) [  123.801055] [drm] psp command (0x6) failed and response status is (0x0) [  123.801214] [drm:psp_load_smu_fw [amdgpu]] *ERROR* PSP load smu failed!
[  123.801398] [drm:psp_resume [amdgpu]] *ERROR* PSP resume failed [  123.801536] [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -22 [  123.801632] amdgpu 0000:04:00.0: amdgpu: GPU reset(9) failed [  123.801691] amdgpu 0000:07:00.0: amdgpu: GPU reset(9) failed [  123.802899] amdgpu 0000:04:00.0: amdgpu: GPU reset end with ret = -22

v2: add error info and including ALDEBARAN also

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:47:43 -04:00
Lijo Lazar
a2052839cd drm/amdgpu: Add PSP public function to load a list of FWs
v1: Adds a function to load a list of FWs as passed by the caller. This is
needed as only a select need to loaded for some use cases.

v2: Omit unrelated change, remove info log, fix return value when count is 0

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:46:18 -04:00
Tian Tao
36000c7a51 drm/amdgpu: Convert sysfs sprintf/snprintf family to sysfs_emit
Fix the following coccicheck warning:
drivers/gpu//drm/amd/amdgpu/amdgpu_ras.c:434:9-17: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_xgmi.c:220:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_xgmi.c:249:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/df_v3_6.c:208:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_psp.c:2973:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:75:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:112:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:58:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:93:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:125:9-17: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_gtt_mgr.c:52:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_gtt_mgr.c:71:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:140:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:164:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:186:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:208:8-16: WARNING:
use scnprintf or sprintf
drivers/gpu//drm/amd/amdgpu/amdgpu_atombios.c:1916:8-16: WARNING:
use scnprintf or sprintf

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:43:47 -04:00
John Clements
cad7b7510c drm/amdgpu: added support for dynamic GECC
updated host to send boot config to psp to enable GECC for sienna cichlid

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:43:06 -04:00
Evan Quan
1689fca0d6 drm/amd/pm: fix Navi1x runtime resume failure V2
The RLC was put into a wrong state on runtime suspend. Thus the RLC
autoload will fail on the succeeding runtime resume. By adding an
intermediate PPSMC_MSG_PrepareMp1ForUnload(some GC hard reset involved,
designed for PnP), we can bring RLC back into the desired state.

V2: integrate INTERRUPTS_ENABLED flag clearing into current
    mp1 state set routines

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09 16:37:53 -04:00
Feifei Xu
99d1da6774 drm/amdgpu:disable XGMI TA unload for A+A aldebaran
In gpu reset, XGMI TA unload will cause gpu hang.
Skip it on A+A aldebaran.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:27:36 -04:00
Kevin Wang
03597b47d6 Revert "drm/amdgpu: add psp RAP L0 check support"
This reverts commit d86fd724e59af96ae5ab6630f4f07a076e9b80cd.

Disable PSP RAP L0 self test until to RAP feature ready.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:03:45 -04:00
Oak Zeng
47bfa5f60f drm/amdgpu: Increase PSP runtime TMR region size
Aldebaran uses more than 4M runtime TMR. The current
hard coded 4M TMR is not big enough for Aldebaran.
Increase it to 8M.

v2: Only do 8M size for ALDEBARAN (Hawking)

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:51 -04:00
Kevin Wang
d86fd724e5 drm/amdgpu: add psp RAP L0 check support
add PSP RAP L0 check when RAP TA is loaded.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:25 -04:00
Kevin Wang
2fb3c5d0d1 drm/amdgpu: change psp_rap_invoke() function return value
RAP TA is an optional firmware. if it doesn’t exist,
the driver should bypass psp_rap_invoke() function.

1. bypass psp_rap_invoke() when RAP TA is not loaded.
2. add new parameter (status) to query RAP TA status.
   (the status value is different with psp_ta_invoke(),
3. fix the 'rap_status' MThread critical problem.
   (used without lock)

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:22 -04:00
Oak Zeng
2f055097da drm/amdgpu: workaround the TMR MC address issue (v2)
With the 2-level gart page table,  vram is squeezed into gart aperture
and FB aperture is disabled. Therefore all VRAM virtual addresses are
 in the GART aperture. However currently PSP requires TMR addresses
in FB aperture. So we need some design change at PSP FW level to support
this 2-level gart table driver change. Right now this PSP FW support
doesn't exist. To workaround this issue temporarily, FB aperture is
added back and the gart aperture address is converted back to FB aperture
for this PSP TMR address.

Will revert it after we get a fix from PSP FW.

v2: squash in tmr fix for other asics (Kevin)

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:58:52 -04:00
Hawking Zhang
b335f289fe drm/amdgpu: apply new pmfw loading sequence to arcturus and onwards
Arcturus and onwards products should follow the same sequence
that have pmfw loading ahead of tmr setup

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:57:44 -04:00
John Clements
0d2c1855d5 drm/amdgpu: added support for register list loading (v2)
call host to  psp cmd to load reg list

v2: update to latest interface (Alex)

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:41 -04:00
John Clements
b2aa382ae7 drm/amdgpu: added register list driver ctx (v2)
updated psp bin parsing and load register list

v2: update to latest interface (Alex)

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:54:38 -04:00
Hawking Zhang
9fbd96a136 drm/amdgpu: enable psp v13 ip block for aldebaran
Add psp v13 ip block to soc ip init list for aldebaran

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:33 -04:00
Hawking Zhang
428ad99e9c drm/amdgpu: load pmfw prior to other non-psp fw for aldebaran
PMFW should be loaded before any operation that
may toggling DF-Cstate. otherwsie, tOS has no
choice but to locally toggle DF Cstate (i.e.
disable DF-Cstate even it already enabled by VBIOS)

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:24 -04:00
Hawking Zhang
ee82108325 drm/amdgpu: init psp v13 ip function
Initialze psp ip function for aldebaran

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 22:52:15 -04:00
Jinzhou Su
ecaafb7b5a drm/amdgpu: Add secure display TA interface
Add interface to load, unload, invoke command for
secure display TA.

v2: Add debugfs interface for secure display TA
v3: fix warning in copy_from_user (Alex)

Signed-off-by: Jinzhou.Su <Jinzhou.Su@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-13 23:58:14 -05:00
Jiawei Gu
0d232dada3 drm/amdgpu: fix potential memory leak during navi12 deinitialization
Navi12 HDCP & DTM deinitialization needs continue to free bo if already
created though initialized flag is not set.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-05 11:35:26 -05:00
pengzhou
57995aa8ff drm/amdgpu: do optimization for psp command submit
In the psp command submit logic,
the function msleep(1) delayed too long,
Changing it to usleep_range(10, 100) to
have a better performance.

Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Reviewed-by: Emily.Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-05 11:33:35 -05:00
Dennis Li
eb5f4f4653 drm/amdgpu: fix a memory protection fault when remove amdgpu device
ASD and TA share the same firmware in SIENNA_CICHLID and only TA
firmware is requested during boot, so only need release TA firmware when
remove device.

[   83.877150] general protection fault, probably for non-canonical address 0x1269f97e6ed04095: 0000 [#1] SMP PTI
[   83.888076] CPU: 0 PID: 1312 Comm: modprobe Tainted: G        W  OE     5.9.0-rc5-deli-amd-vangogh-0.0.6.6-114-gdd99d5669a96-dirty #2
[   83.901160] Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018
[   83.912353] RIP: 0010:free_fw_priv+0xc/0x120
[   83.917531] Code: e8 99 cd b0 ff b8 a1 ff ff ff eb 9f 4c 89 f7 e8 8a cd b0 ff b8 f4 ff ff ff eb 90 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 41 54 53 <4c> 8b 67 18 48 89 fb 4c 89 e7 e8 45 94 41 00 b8 ff ff ff ff f0 0f
[   83.937576] RSP: 0018:ffffbc34c13a3ce0 EFLAGS: 00010206
[   83.943699] RAX: ffffffffbb681850 RBX: ffffa047f117eb60 RCX: 0000000080800055
[   83.951879] RDX: ffffbc34c1d5f000 RSI: 0000000080800055 RDI: 1269f97e6ed04095
[   83.959955] RBP: ffffbc34c13a3cf0 R08: 0000000000000000 R09: 0000000000000001
[   83.968107] R10: ffffbc34c13a3cc8 R11: 00000000ffffff00 R12: ffffa047d6b23378
[   83.976166] R13: ffffa047d6b23338 R14: ffffa047d6b240c8 R15: 0000000000000000
[   83.984295] FS:  00007f74f6712540(0000) GS:ffffa047fbe00000(0000) knlGS:0000000000000000
[   83.993323] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   84.000056] CR2: 0000556a1cca4e18 CR3: 000000021faa8004 CR4: 00000000003706f0
[   84.008128] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   84.016155] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   84.024174] Call Trace:
[   84.027514]  release_firmware.part.11+0x4b/0x70
[   84.033017]  release_firmware+0x13/0x20
[   84.037803]  psp_sw_fini+0x77/0xb0 [amdgpu]
[   84.042857]  amdgpu_device_fini+0x38c/0x5d0 [amdgpu]
[   84.048815]  amdgpu_driver_unload_kms+0x43/0x70 [amdgpu]
[   84.055055]  drm_dev_unregister+0x73/0xb0 [drm]
[   84.060499]  drm_dev_unplug+0x28/0x30 [drm]
[   84.065598]  amdgpu_dev_uninit+0x1b/0x40 [amdgpu]
[   84.071223]  amdgpu_pci_remove+0x4e/0x70 [amdgpu]
[   84.076835]  pci_device_remove+0x3e/0xc0
[   84.081609]  device_release_driver_internal+0xfb/0x1c0
[   84.087558]  driver_detach+0x4d/0xa0
[   84.092041]  bus_remove_driver+0x5f/0xe0
[   84.096854]  driver_unregister+0x2f/0x50
[   84.101594]  pci_unregister_driver+0x22/0xa0
[   84.106806]  amdgpu_exit+0x15/0x2b [amdgpu]

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-01-05 11:31:48 -05:00
Lee Jones
c18dd61ae4 drm/amd/amdgpu/amdgpu_psp: Make local function 'parse_ta_bin_descriptor' static
Fixes the following W=1 kernel build warning(s):

 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:2576:5: warning: no previous prototype for ‘parse_ta_bin_descriptor’ [-Wmissing-prototypes]

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-12-01 16:04:41 -05:00
Kenneth Feng
c95ec47ccb drm/amd/amdgpu: skip unload message in reset
This has been confirmed that unload message is not needed from SIENNA_CICHLID in reset.
Otherwise it will cause the fw wrong state after reset and no response for any messages.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-24 12:03:24 -05:00
Bhawanpreet Lakha
6cb445e803 drm/amdgpu: Use PSP_FW_NAME_LEN instead of magic number
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-13 17:29:46 -05:00
John Clements
c26dab27e5 drm/amdgpu: resolved ASD loading issue on sienna
updated fw header v2 parser to set asd fw memory

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-04 17:08:41 -05:00
Deepak R Varma
c4c5ae67d1 drm/amdgpu/amdgpu: use "*" adjacent to data name
When declaring pointer data, the "*" symbol should be used adjacent to
the data name as per the coding standards. This resolves following
issues reported by checkpatch script:
	ERROR: "foo *   bar" should be "foo *bar"
	ERROR: "foo * bar" should be "foo *bar"
	ERROR: "foo*            bar" should be "foo *bar"
	ERROR: "(foo*)" should be "(foo *)"

Signed-off-by: Deepak R Varma <mh12gx2825@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-02 15:35:50 -05:00
Deepak R Varma
f3729f7b1a drm/amdgpu/amdgpu: improve code indentation and alignment
General code indentation and alignment changes such as replace spaces
by tabs or align function arguments as per the coding style
guidelines. The patch corrects issues for various amdgpu_*.c files
for this driver. Issue reported by checkpatch script.

Signed-off-by: Deepak R Varma <mh12gx2825@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-11-02 15:34:29 -05:00
John Clements
19ae333001 drm/amdgpu: added support for psp fw attestation
loaded fw can be queried from sys fs interface

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-26 13:27:00 -04:00
Andrey Grodzovsky
23d9bd60bd drm/amd/psp: Fix sysfs: cannot create duplicate filename
psp sysfs not cleaned up on driver unload for sienna_cichlid

Fixes: ce87c98db428e7 ("drm/amdgpu: Include sienna_cichlid in USBC PD FW support.")
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-16 15:11:17 -04:00
Likun Gao
5bab858eee drm/amdgpu: add rlc iram and dram firmware support
Support to load RLC iram and dram ucode when RLC firmware struct use v2.2

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-15 12:23:04 -04:00
Tao Zhou
462c272b90 drm/amdgpu: add psp support for dimgrey_cavefish(v2)
General psp support for dimgrey_cavefish.

v2: remove the checks for asd load and reroute ih.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-12 14:01:12 -04:00
Tao Zhou
0a305e34c7 drm/amdgpu: increase size of psp fw_name string(v2)
Increase fw_name string size so longer chip name can be stored.

v2: define macro for the length of psp fw name.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-12 14:01:09 -04:00
Huang Rui
ed3b735332 drm/amdgpu: enable psp support for vangogh
This patch is to enable psp support for vangogh

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:28 -04:00
Huang Rui
5120cb5409 drm/amdgpu: add TOC firmware support for apu (v3)
APU needs load toc firmware for gfx10 series on psp front door loading.

v2: rebase against latest code
v3: clarify error message

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05 15:15:28 -04:00
Jingwen Chen
162b786f0f drm/amd: Skip not used microcode loading in SRIOV
smc, sdma, sos, ta and asd fw is not used in SRIOV. Skip them to
accelerate sw_init for navi12.

v2: skip above fw in SRIOV for vega10 and sienna_cichlid
v3: directly skip psp fw loading in SRIOV
Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by:  Emily.Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25 16:54:00 -04:00
Andrey Grodzovsky
ce87c98db4 drm/amdgpu: Include sienna_cichlid in USBC PD FW support.
Create sysfs interface also for sienna_cichlid.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:42 -04:00
John Clements
9c7e2ceb1d drm/amdgpu: Update RAS init handling
Output RAS init status

If RAS init fails, teardown RAS context

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
Andrey Grodzovsky
bf36b52e78 drm/amdgpu: Avoid accessing HW when suspending SW state
At this point the ASIC is already post reset by the HW/PSP
so the HW not in proper state to be configured for suspension,
some blocks might be even gated and so best is to avoid touching it.

v2: Rename in_dpc to more meaningful name

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:39 -04:00
Tao Zhou
c56c90f413 drm/amdgpu: add asd fw check before loading asd
asd is not ready for some ASICs in early stage, and psp->asd_fw is more generic than ASIC name in the check.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Jiansong Chen <Jiansong.Chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26 16:40:19 -04:00
Luben Tuikov
1348969ab6 drm/amdgpu: drm_device to amdgpu_device by inline-f (v2)
Get the amdgpu_device from the DRM device by use
of an inline function, drm_to_adev(). The inline
function resolves a pointer to struct drm_device
to a pointer to struct amdgpu_device.

v2: Use a typed visible static inline function
    instead of an invisible macro.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 13:06:06 -04:00
Dennis Li
53b3f8f40e drm/amdgpu: refine codes to avoid reentering GPU recovery
if other threads have holden the reset lock, recovery will
fail to try_lock. Therefore we introduce atomic hive->in_reset
and adev->in_gpu_reset, to avoid reentering GPU recovery.

v2:
drop "? true : false" in the definition of amdgpu_in_reset

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24 12:22:56 -04:00
Christian König
f1403342eb drm/amdgpu: revert "fix system hang issue during GPU reset"
The whole approach wasn't thought through till the end.

We already had a reset lock like this in the past and it caused the same problems like this one.

Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary.

This reverts commit df9c8d1aa278c435c30a69b8f2418b4a52fcb929.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:40 -04:00
Wenhui Sheng
8602692b6f drm/amdgpu: enable RAP TA load
Enable the RAP TA loading path and add RAP test
trigger interface.

v2: fix potential mem leak issue

Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com>
Reviewed-by: Guchun Chen <Guchun.Chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-14 16:22:39 -04:00
Liu ChengZhe
f61772cd13 drm amdgpu: Skip tmr load for SRIOV
1. For Navi12, CHIP_SIENNA_CICHLID, skip tmr load operation;
2. Check pointer before release firmware.

v2: use CHIP_SIENNA_CICHLID instead
v3: remove local "bool ret"; fix grammer issue
v4: use my name instead of "root"
v5: fix grammer issue and indent issue

Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-30 15:36:44 -04:00
Dennis Li
df9c8d1aa2 drm/amdgpu: fix system hang issue during GPU reset
when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover,
the atomic adev->in_gpu_reset and hive->in_reset are used to avoid
re-entering GPU recovery.

During GPU reset and resume, it is unsafe that other threads access GPU,
which maybe cause GPU reset failed. Therefore the new rw_semaphore
adev->reset_sem is introduced, which protect GPU from being accessed by
external threads during recovery.

v2:
1. add rwlock for some ioctls, debugfs and file-close function.
2. change to use dqm->is_resetting and dqm_lock for protection in kfd
driver.
3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid
re-enter GPU recovery for the same GPU hang.

v3:
1. change back to use adev->reset_sem to protect kfd callback
functions, because dqm_lock couldn't protect all codes, for example:
free_mqd must be called outside of dqm_lock;

[ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[ 1230.177221] Call Trace:
[ 1230.178249]  dump_stack+0x98/0xd5
[ 1230.179443]  amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu]
[ 1230.180673]  gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu]
[ 1230.181882]  amdgpu_gart_unbind+0xa9/0xe0 [amdgpu]
[ 1230.183098]  amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu]
[ 1230.184239]  ? ttm_bo_put+0x171/0x5f0 [ttm]
[ 1230.185394]  ttm_tt_unbind+0x21/0x40 [ttm]
[ 1230.186558]  ttm_tt_destroy.part.12+0x12/0x60 [ttm]
[ 1230.187707]  ttm_tt_destroy+0x13/0x20 [ttm]
[ 1230.188832]  ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm]
[ 1230.189979]  ttm_bo_put+0x1be/0x5f0 [ttm]
[ 1230.191230]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[ 1230.192522]  amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu]
[ 1230.193833]  free_mqd+0x25/0x40 [amdgpu]
[ 1230.195143]  destroy_queue_cpsch+0x1a7/0x270 [amdgpu]
[ 1230.196475]  pqm_destroy_queue+0x105/0x260 [amdgpu]
[ 1230.197819]  kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu]
[ 1230.199154]  kfd_ioctl+0x277/0x500 [amdgpu]
[ 1230.200458]  ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu]
[ 1230.201656]  ? tomoyo_file_ioctl+0x19/0x20
[ 1230.202831]  ksys_ioctl+0x98/0xb0
[ 1230.204004]  __x64_sys_ioctl+0x1a/0x20
[ 1230.205174]  do_syscall_64+0x5f/0x250
[ 1230.206339]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

2. remove try_lock and introduce atomic hive->in_reset, to avoid
re-enter GPU recovery.

v4:
1. remove an unnecessary whitespace change in kfd_chardev.c
2. remove comment codes in amdgpu_device.c
3. add more detailed comment in commit message
4. define a wrap function amdgpu_in_reset

v5:
1. Fix some style issues.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com>
Suggested-by: Luben Tukov <luben.tuikov@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-27 16:21:37 -04:00
John Clements
cff5f79019 drm/amdgpu: load asd for sienna cichlid
do not abort psp asd load sequence for sienna cichlid

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21 15:37:38 -04:00
Jiansong Chen
c82b38ec2e drm/amdgpu: add psp support for navy_flounder
Currently skip ASD FW loading and ih reroute per
sienna_cichlid.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-15 12:46:47 -04:00
John Clements
a330272936 drm/amdgpu: correct ta header v2 ucode init start address
resolve bug calculating fw start address within binary

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-15 12:45:18 -04:00
Colin Ian King
4afaa61db9 drm/amdgpu: fix spelling mistake "Falied" -> "Failed"
There is a spelling mistake in a DRM_ERROR error message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-10 17:43:02 -04:00