drm/amd/amdkfd: Don't send command to HWS on kfd reset

When kfd needs to be reset, sending commands to the HWS might cause a hang and an unnecessary timeout.
This change avoids touching the HW in pre_reset and keeps the queues in the evicted state
when the reset is done, so they are not put back on the runlist. These queues will be destroyed
on process termination.

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
shaoyunl 2021-11-03 10:49:47 -04:00 committed by Alex Deucher
parent e6ef9b396b
commit b8c20c74ab
2 changed files with 6 additions and 2 deletions


@@ -1430,7 +1430,7 @@ static int unmap_queues_cpsch(struct device_queue_manager *dqm,
 	if (!dqm->sched_running)
 		return 0;
-	if (dqm->is_hws_hang)
+	if (dqm->is_hws_hang || dqm->is_resetting)
 		return -EIO;
 	if (!dqm->active_runlist)
 		return retval;


@@ -1715,7 +1715,11 @@ int kfd_process_evict_queues(struct kfd_process *p)
 		r = pdd->dev->dqm->ops.evict_process_queues(pdd->dev->dqm,
 							    &pdd->qpd);
-		if (r) {
+		/* evict return -EIO if HWS is hang or asic is resetting, in this case
+		 * we would like to set all the queues to be in evicted state to prevent
+		 * them been add back since they actually not be saved right now.
+		 */
+		if (r && r != -EIO) {
 			pr_err("Failed to evict process queues\n");
 			goto fail;
 		}
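
The combined effect of the two hunks can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the driver code: every `model_*` name, the `inject_error` field, and the `hws_commands_sent` counter are hypothetical stand-ins invented for illustration. The sketch also assumes that the per-queue evicted flag is set before the unmap attempt and that the caller rolls the flags back on a fatal error; neither behaviour is shown in the diff above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct device_queue_manager. */
struct model_dqm {
	bool sched_running;
	bool is_hws_hang;
	bool is_resetting;
	bool active_runlist;
	int inject_error;      /* hypothetical: non-zero simulates another failure */
	int hws_commands_sent; /* hypothetical: counts simulated HWS submissions */
};

/* Hypothetical stand-in for a user queue; only the evicted flag is modelled. */
struct model_queue {
	bool evicted;
};

/* Mirrors the early-return order of the patched unmap_queues_cpsch(). */
static int model_unmap_queues(struct model_dqm *dqm)
{
	if (!dqm->sched_running)
		return 0;
	if (dqm->is_hws_hang || dqm->is_resetting)
		return -EIO; /* never touch the HWS while hung or resetting */
	if (!dqm->active_runlist)
		return 0;
	if (dqm->inject_error)
		return dqm->inject_error;

	dqm->hws_commands_sent++; /* stands in for the real unmap packet */
	dqm->active_runlist = false;
	return 0;
}

/* Assumption: queues are flagged evicted first, then the unmap is attempted. */
static int model_evict_process_queues(struct model_dqm *dqm,
				      struct model_queue *queues, size_t n)
{
	for (size_t i = 0; i < n; i++)
		queues[i].evicted = true;
	return model_unmap_queues(dqm);
}

/* Mirrors the patched error check in kfd_process_evict_queues(). */
static int model_process_evict_queues(struct model_dqm *dqm,
				      struct model_queue *queues, size_t n)
{
	int r = model_evict_process_queues(dqm, queues, n);

	if (r && r != -EIO) {
		/* Fatal error: roll back (assumed) so the queues stay usable. */
		for (size_t i = 0; i < n; i++)
			queues[i].evicted = false;
		return r;
	}
	/* On 0 or -EIO the queues stay evicted; r is simply passed through. */
	return r;
}
```

Calling `model_process_evict_queues()` with `is_resetting` set leaves every queue flagged as evicted and `hws_commands_sent` at zero, which is the behaviour the commit message describes. Any error other than `-EIO` still takes the failure path.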