drm/xe/mmio: update gt_count when probing multi-tile

It looks like the single-tile PVC in CI dies during module load when doing
the pcode init. From the logs we try to access the address
0000000000138124 which doesn't map to anything, however 0x138124 also
looks to be the PCODE_MAILBOX register. So looks like the per-tile
mmio register mapping is NULL.

During probe the tile count is potentially trimmed, since we don't know
the real count until we actually probe the device. This seems to be
the case for single-tile PVC or similar devices.  However it looks like
the gt_count is never adjusted to respect this updated tile count. As a
result when later doing some for_each_gt() loop, like we do for the
pcode, we can get back some GT that maps to some non-existent tile
which hasn't been properly set up, leading to crashes.

Try to fix this by adjusting the gt_count after probing the tiles for
real.

v2: Fix typo so it actually builds

References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/383
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
This commit is contained in:
Matthew Auld 2023-06-26 18:20:40 +01:00 committed by Rodrigo Vivi
parent 35c8a96439
commit 356010a1a0

View File

@ -335,6 +335,12 @@ static void xe_mmio_probe_tiles(struct xe_device *xe)
adj_tile_count = xe->info.tile_count =
REG_FIELD_GET(TILE_COUNT, mtcfg) + 1;
/*
* FIXME: Needs some work for standalone media, but should be impossible
* with multi-tile for now.
*/
xe->info.gt_count = xe->info.tile_count;
drm_info(&xe->drm, "tile_count: %d, adj_tile_count %d\n",
xe->info.tile_count, adj_tile_count);