1234988 Commits

Author SHA1 Message Date
Matthew Brost
4f1411e2da drm/xe: Reinstate render / compute cache invalidation in ring ops
Render / compute engines have additional caches (not just TLBs) that
need to be invalidated each batch, reinstate these invalidations in ring
ops.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:21 -05:00
Maarten Lankhorst
38c04b47ce drm/xe: Use atomic instead of mutex for xe_device_mem_access_ongoing
xe_guc_ct_fast_path() is called from an irq context, and cannot lock
the mutex used by xe_device_mem_access_ongoing().

Fortunately it is easy to fix, and the atomic guarantees are good enough
to ensure xe->mem_access.hold_rpm is set before last ref is dropped.

As far as I can tell, the runtime ref in device access should be
killable, but don't dare to do it yet.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:20 -05:00
Matthew Brost
044f0cfb19 drm/xe: Drop zero length arrays
Zero-length arrays as fake flexible arrays are deprecated and we are
moving towards adopting C99 flexible-array members instead.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:20 -05:00
Matthew Auld
ce79c6c43a drm/xe/buddy: add compatible and intersects hooks
Copy this from i915. We need .compatible for vram -> vram transfers, so
they don't just get nooped by ttm, if need to move something from
mappable to non-mappble or vice versa. The .intersects is needed for
eviction, to determine if a victim resource is worth eviction. e.g if we
need mappable space there is no point in evicting a resource that has
zero mappable pages.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:20 -05:00
Matthew Auld
793e6612de drm/xe/buddy: add visible tracking
Replace the allocation code with the i915 version. This simplifies the
code a little, and importantly we get the accounting at the mgr level,
which is useful for debug (and maybe userspace), plus per resource
tracking so we can easily check if a resource is using one or pages in
the mappable part of vram (useful for eviction), or if the resource is
completely within the mappable portion (useful for checking if the
resource can be safely CPU mapped).

v2: Fix missing PAGE_SHIFT
v3: (Gwan-gyeong Mun)
  - Fix incorrect usage of ilog2(mm.chunk_size).
  - Fix calculation when checking for impossible allocation sizes, also
    check much earlier.
v4: (Gwan-gyeong Mun)
  - Fix calculation when extending the [fpfn, lpfn] range due to the
    roundup_pow_of_two().
v5: (Gwan-gyeong Mun)
  - Move the check for running out of mappable VRAM to before doing any of
    the roundup_pow_of_two().
v6: (Jani)
  - Stop abusing BUG_ON(). We can easily just use WARN_ON() here and
    return a proper error to the caller, which is much nicer if we ever
    trigger these.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:20 -05:00
Balasubramani Vivekanandan
11a2407ed5 drm/xe: Stop accepting value in xe_migrate_clear
Although xe_migrate_clear() has a value argument, currently the driver
is only passing 0 at all the places this function is invoked with the
exception the kunit tests are using the parameter to validate this
function with different values.
xe_migrate_clear() is failing on platforms with link copy engines
because xe_migrate_clear() via emit_clear() is using the blitter
instruction XY_FAST_COLOR_BLT to clear the memory. But this instruction
is not supported by link copy engine.
So the solution is to use the alternate instruction MEM_SET when
platform contains link copy engine. But MEM_SET instruction accepts only
8-bit value for setting whereas the value agrument of xe_migrate_clear()
is 32-bit.
So instead of spreading this limitation around all invocations of
xe_migrate_clear() and causing more confusion, it was decided to not
accept any value itself as driver does not really need this currently.

All the kunit tests are adapted as per the new function prototype.

This will be followed by a patch to add support for link copy engines.

Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:20 -05:00
Balasubramani Vivekanandan
eb230dc47d drm/xe: Use max wopcm size when validating the preset GuC wopcm size
When the GuC wopcm base and size registers are populated by BIOS/IFWI,
validate the parameters against the maximum allowed wopcm size.

Bpsec: 44982

Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:20 -05:00
Matthew Auld
1a653b879d drm/xe/buddy: remove the virtualized start
Hopefully not needed anymore. We can add a .compatible() hook once we
need to differentiate between mappable and non-mappable vram. If the
allocation is not contiguous then the start value is kind of
meaningless, so rather just mark as invalid.

In upstream, TTM wants to eventually remove the ttm_resource.start
usage.

References: 544432703b2f ("drm/ttm: Add new callbacks to ttm res mgr")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:20 -05:00
Lucas De Marchi
436dbd6bff drm/xe/mcr: Separate version from engine type selection
In order to improve readability and make it more future proof,
split the engine type from the graphics/platform checks.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230317223441.3891073-1-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:20 -05:00
Matthew Auld
2492f4544e drm/xe/vram: start tracking the io_size
First step towards supporting small-bar is to track the io_size for
vram. We can longer assume that the io_size == vram size. This way we
know how much is CPU accessible via the BAR, and how much is not.
Effectively giving us a two tiered vram, where in some later patches we
can support different allocation strategies depending on if the memory
needs to be CPU accessible or not.

Note as this stage we still clamp the vram size to the usable vram size.
Only in the final patch do we turn this on for real, and allow distinct
io_size and vram_size.

v2: (Lucas):
  - Improve the commit message, plus improve the kernel-doc for the
    io_size to give a better sense of what it actually is.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:14 -05:00
Thomas Hellström
8e41443e1b drm/xe/vm: Defer vm rebind until next exec if nothing to execute
If all compute engines of a vm in compute mode are idle,
defer a rebind to the next exec to avoid the VM unnecessarily trying
to make memory resident and compete with other VMs for available
memory space.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:14 -05:00
Thomas Hellström
7cba3396fd drm/xe/tests: Test both CPU- and GPU page-table updates with the migrate test
Add a test parameter to force GPU page-table updates with the migrate
test and test both CPU- and GPU updates. Also provide some timing
results.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:13 -05:00
Thomas Hellström
a5dfb471bb drm/xe: Use a small negative initial seqno
Causes an early 32-bit wrap and may thus help CI catch wrapping errors
that may otherwise not show early enough.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:13 -05:00
Thomas Hellström
155c916554 drm/xe: Introduce xe_engine_is_idle()
Introduce xe_engine_is_idle, and replace the static function in
xe_migrate.c.
The latter had two flaws. First the seqno == 1 test might return a false
true value each time the seqno counter wrapped, Second, the
cur_seqno == next_seqno test would never return true.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:13 -05:00
Thomas Hellström
17a28ea23c drm/xe/tests: Support CPU page-table updates in the migrate test
The migrate test currently supports only GPU pagetable updates and
will thus break if we fix the CPU pagetable update selection.

Fix the migrate test first.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:13 -05:00
Thomas Hellström
fc1cc68030 drm/xe/migrate: Update cpu page-table updates
Don't wait for GPU to be able to update page-tables using CPU. Putting
ourselves to sleep may be more of a problem than using GPU for
page-table updates. Also allow the vm to be NULL since the migrate
kunit test uses NULL for vm.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:13 -05:00
Thomas Hellström
7c51050b3b drm/xe: Use a define to set initial seqno for fences
Also for HW fences, write the initial seqno - 1 to the HW completed
seqno to initialize.

v2:
- Use __dma_fence_is_later() to compare hw fence seqnos. (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:13 -05:00
Mauro Carvalho Chehab
8eb7ad99ae drm/xe/xe_uc_fw: Use firmware files from standard locations
The GuC/HuC firmware files used by Xe drivers are the same as
used by i915. Use the already-known location to find those
firmware files, for a couple of reasons:

1. Avoid having the same firmware placed on two different
   places on MODULE_FIRMWARE(), if both 915 and xe drivers
   are compiled;

2. Having firmware files located on different locations may end
   creating bigger initramfs, as the same files will be copied
   twice my mkinitrd/dracut/...;

3. this is the place where those firmware files are located at
   https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
   Upstream doesn't expect them to have on other places;

4. When built with display support, DMC firmware will be
   loaded from i915/ directory. It is very confusing to have
   some firmware files on a different place for the same driver.

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Lucas de Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
[ Mostly agree with the direction of "use the firmware blobs from
  upstream at their current location for these platforms". Previous
  directory was not wrong as the plan was to have it handled in the
  upstream firmware repo. For future platforms the location can be
  changed if the support is only in xe ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230310081338.3275583-1-mauro.chehab@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:13 -05:00
Jani Nikula
5fd92bdd54 drm/xe/irq: the irq handler local variable need not be static
It's just a local variable.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:12 -05:00
Lucas De Marchi
ccbb6ad52a drm/xe: Replace i915 with xe in uapi
All structs and defines had already been renamed to "xe", but some
comments with "i915" were left over. Rename them.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20230313211628.2492587-1-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:12 -05:00
Lucas De Marchi
fd93946d59 drm/xe: Add missing LRC workarounds for graphics 1200
Synchronize LRC workarounds for graphics version 1200 with i915 up to
commit 7cdae9e9ee5e ("drm/i915: Move DG2 tuning to the right function").
These were probably missed for TGL/RKL before because in i915 it uses a
!IS_DG1() condition.  Avoid a similar issue by just checking the
graphics version 1200 since DG1 is 1210.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-14-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:11 -05:00
Lucas De Marchi
95ff48c2e7 drm/xe: Add missing ADL-P engine workaround
Add the one missing workaround for ADL-P when comparing to i915 up to
commit 7cdae9e9ee5e ("drm/i915: Move DG2 tuning to the right function").

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-13-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:09 -05:00
Lucas De Marchi
8cd7e97597 drm/xe: Add missing DG2 lrc workarounds
Synchronize with i915 the DG2 lrc workarounds as of
commit 4d14d7717f19 ("drm/i915/selftest: Fix ktime_get() and h/w access
order").

A few simplifications were done when the WA should be applied to some
steps of a subplatform and all the steppings of the other subplatforms.
In this case, it was simply applied to all the steppings, which only
means applying it to a few more A* steppings.

The implementation of the workaround 16011186671 triggers a bug in the
RTP infra: it's not possible to set the flag the usual way when having
multiple actions in the entry. This may be fixed later, but for now it's
sufficient to just set the flag directly without the helper macro.

v2: Fix 14014947963 to use FIELD_SET (Matt Roper)

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-12-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:08 -05:00
Lucas De Marchi
11f78b1308 drm/xe: Add missing DG2 lrc tunings
Synchronize with i915 the DG2 tunings as of
commit 4d14d7717f19 ("drm/i915/selftest: Fix ktime_get() and h/w access
order").

Contrary to the tuning "gang timer" for TGL, there is no quick
justification for why the read back is disabled in i915. Keep it
with that flag for now. That can be tentatively removed later when the
read values are checked.

v2: Use XEHP_FF_MODE2 instead of GEN12_FF_MODE2 (Matt Roper)

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-11-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:30:06 -05:00
Lucas De Marchi
4d5ab12163 drm/xe: Add missing DG2 engine workarounds
Synchronize with i915 the DG2 gt workarounds as of
commit 4d14d7717f19 ("drm/i915/selftest: Fix ktime_get() and h/w access
order").

A few simplifications were done when the WA should be applied to
some steps of a subplatform and all the steppings of the other
subplatforms. This happened with Wa_1509727124, Wa_22012856258 and a few
others. In figure the pre-production steppings will be removed, so this
can be already simplified a little bit.

v2:
  - Make 1308578152 conditional on first gslice fused off
  - Add the missing Wa_1608949956/Wa_14010198302 (Matt Roper)
v3:
  - Do not duplicate the implementation of 18019627453 since it's
    already covered by other WA numbers in graphics versions 1200 and
    1210

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-10-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:59 -05:00
Lucas De Marchi
911aeb0f61 drm/xe: Add missing DG2 gt workarounds and tunings
Synchronize with i915 the DG2 gt workarounds as of
commit 4d14d7717f19 ("drm/i915/selftest: Fix ktime_get() and h/w access
order").

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-9-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:56 -05:00
Lucas De Marchi
4688d9ce2e drm/xe: Add PVC engine workarounds
Sync PVC engine workarounds with i915.

v2: Remove 16016694945. It was added by mistake. It's a GT workaround,
already present in the GT table (Matt Roper)

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-8-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:54 -05:00
Lucas De Marchi
a19220fa5f drm/xe: Add PVC gt workarounds
Synchronize with i915 the PVC gt workarounds as of committ
commit 4d14d7717f19 ("drm/i915/selftest: Fix ktime_get() and h/w
access order").

v2: Add masked flag to XEHPC_LNCFMISCCFGREG0 (Matt Roper)

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-7-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:49 -05:00
Lucas De Marchi
6b5ccd6360 drm/xe: Reorder WAs to consider the platform
Now that number of platforms is growing, it's getting hard to know the
workarounds for each platform. Split the entries inside the same table
so the workarounds checking IP version are listed first, followed by
each platform. Next step when it grows too much is to split in smaller
tables.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-6-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:47 -05:00
Lucas De Marchi
6647e2fe23 drm/xe/debugfs: Dump register save-restore tables
Add debugfs entry to dump the final tables with register save-restore
information.

For the workarounds, this has a format a little bit different than when the
values are applied because we don't want to read the values from the HW
when dumping via debugfs. For whitelist it just re-uses the print
function added for when the whitelist is being built.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-5-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:47 -05:00
Lucas De Marchi
d855d2246e drm/xe: Print whitelist while applying
Besides printing the various register save-restore, it's also useful to
know the register being allowed/denied access from unprivileged batch
buffers. Print them during device probe.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-4-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:47 -05:00
Lucas De Marchi
5be84050dd drm/xe/reg_sr: Tweak verbosity for register printing
If there is no register to save-restore or whitelist, just return. This
drops some noise from the log, particurlarly for platforms with several
engines like PVC:

	[drm:xe_reg_sr_apply_mmio [xe]] Applying bcs0 save-restore MMIOs
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs0 registers
	[drm:xe_reg_sr_apply_mmio [xe]] Applying bcs1 save-restore MMIOs
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs1 registers
	[drm:xe_reg_sr_apply_mmio [xe]] Applying bcs2 save-restore MMIOs
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs2 registers
	[drm:xe_reg_sr_apply_mmio [xe]] Applying bcs5 save-restore MMIOs
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs5 registers
	[drm:xe_reg_sr_apply_mmio [xe]] Applying bcs6 save-restore MMIOs
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs6 registers
	[drm:xe_reg_sr_apply_mmio [xe]] Applying bcs7 save-restore MMIOs
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs7 registers
	[drm:xe_reg_sr_apply_mmio [xe]] Applying bcs8 save-restore MMIOs
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs8 registers
	[drm:xe_reg_sr_apply_mmio [xe]] Applying ccs0 save-restore MMIOs
	[drm:xe_reg_sr_apply_mmio [xe]] REG[0x20e4] = 0x00008000
	[drm:xe_reg_sr_apply_mmio [xe]] REG[0xb01c] = 0x00000001
	[drm:xe_reg_sr_apply_mmio [xe]] REG[0xe48c] = 0x00000800
	[drm:xe_reg_sr_apply_mmio [xe]] REG[0xe7c8] = 0x40000000
	...

On a PVC system it should show something like below. Whitelist calls
are still there since they aren't actually empty - driver just doesn't
print each individual entry. This will be fixed in future.

	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs0 registers
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs1 registers
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs2 registers
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs5 registers
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs6 registers
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs7 registers
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs8 registers
	[drm:xe_reg_sr_apply_mmio [xe]] Applying ccs0 save-restore MMIOs
	[drm:xe_reg_sr_apply_mmio [xe]] REG[0x20e4] = 0x00008000
	[drm:xe_reg_sr_apply_mmio [xe]] REG[0xb01c] = 0x00000001
	[drm:xe_reg_sr_apply_mmio [xe]] REG[0xe48c] = 0x00000800
	[drm:xe_reg_sr_apply_mmio [xe]] REG[0xe7c8] = 0x40000000

v2: Only tweak log verbosity, leave the whitelist printout for later
    since decoding the whitelist is more complex.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-3-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:47 -05:00
Lucas De Marchi
143800547b drm/xe/rtp: Add match helper for gslice fused off
Add match helper to detect when the first gslice is fused off, as needed
by future workarounds.

v2:
  - Add warning if called on a platform without geometry pipeline
    (Matt Roper)
  - Hardcode 4 as the number of gslices, which matches all the currently
    supported platforms. PVC doesn't have geometry pipeline and
    shouldn't use this function (Matt Roper)

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-2-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:47 -05:00
Matthew Auld
69db25e447 drm/xe: add xe_ttm_stolen_cpu_access_needs_ggtt()
xe_ttm_stolen_cpu_inaccessible() was originally meant to just cover the
case where stolen is not directly CPU accessible on some older
integrated platforms, and as such a GGTT mapping was also required for
CPU access (as per the check in xe_bo_create_pin_map_at()).

However with small-bar systems on dgfx we have one more case where
stolen is also inaccessible, however here we don't have any fallback
GGTT mode for CPU access. Fix the check in xe_bo_create_pin_map_at() to
make this distinction clear. In such a case the later vmap() will fail
anyway.

v2: fix kernel-doc warning
v3: Simplify further and remove cpu_inaccessible()

Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:47 -05:00
Matthew Auld
ddad061e8f drm/xe: one more s/lmem/vram/
Looks to have been introduced in some very recent changes, in-between
merging the driver wide s/lmem/vram/.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:47 -05:00
Riana Tauro
11823d48ab drm/xe: Fix overflow in vram manager
The overflow caused xe_bo_restore_kernel to return an error
Fix overflow in vram manager alloc function.

Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:46 -05:00
Jani Nikula
6d4f49b7de drm/xe: make compound literal initialization const
Be careful about having const in the compound literal initialization to
keep the initializers in rodata. Here, the impact is 1.8k of mutable
data moved to rodata.

add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-1804 (-1804)
Data                                         old     new   delta
__compound_literal                          1804       -   -1804
Total: Before=42425, After=40621, chg -4.25%
add/remove: 0/0 grow/shrink: 1/0 up/down: 1804/0 (1804)
RO Data                                      old     new   delta
__compound_literal                          7696    9500   +1804
Total: Before=138535, After=140339, chg +1.30%

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230309121746.479146-1-jani.nikula@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:46 -05:00
Lucas De Marchi
91ed180b41 drm/xe/pvc: Remove A* steppings
The PVC pre-production A* steppings are not going to be supported in xe
driver - the steppings are important for the WAs and since we are not
adding the pre-productions ones, there is no need to add the stepping.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:46 -05:00
Lucas De Marchi
766849c4ac drm/xe: Name LRC wa after the engine it belongs
This makes it easier when printing the register-save-restore values
to know what is the engine.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:46 -05:00
Lucas De Marchi
f647eff172 drm/xe: Remove dump function from reg_sr
The dump function was originally added with the idea that it could be
re-used both for printing the reg-sr data and saving it to pass to GuC
via ADS. This was not used by the GuC integration, so remove it now to
give place to a new debug.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:46 -05:00
Lucas De Marchi
043790f3ed drm/xe/rtp: Add match for render reset domain
This allows to create WA/tuning rules that match the first engine that
is either of compute or render class. This matters for platforms that
don't have a render engine and that may have arbitrary compute engines
fused off: some register programming need to be added to one of those
engines.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:46 -05:00
Lucas De Marchi
4c128558fe drm/xe/rtp: Move match function from wa to rtp
Match functions are generally useful for other parts of the code (e.g.
xe_tuning.c). Move and rename the single one available to create a place
where similar match functions can be added.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:46 -05:00
Lucas De Marchi
1415283bef drm/xe: Constify xe_dss_mask_group_ffs()
Due to how xe_dss_mask_t is implemented, the type is a pointer. Since
this is only used for looking up the bits, make it const so it can be
used together with a const gt passed around.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:45 -05:00
Lucas De Marchi
8846ffb457 drm/xe: Allow const propagation in gt_to_xe()
Replace the inline function with a _Generic() so gt_to_xe() can work
with a const struct xe_gt*, which leads to a const struct xe *.
This allows a const gt being passed around and when the xe device is
needed, compiler won't issue a warning that calling gt_to_xe() would
discard the const. Rather, just propagate the const to the xe pointer
being returned.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:45 -05:00
Lucas De Marchi
6b980aa88d drm/xe/mcr: Document how to initialize group/instance
Add a sentence about the initialization so it's clear for newcomers how
to tweak the init functions for new platforms.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:45 -05:00
Lucas De Marchi
e84535d860 drm/xe/mcr: Add L3BANK steering for DG2
Some register ranges with replication type L3BANK were missing from the
driver table. The following warning was triggering when adding a
workaround touching the register 0xb188:

	xe 0000:03:00.0: Did not find MCR register 0xb188 in any MCR steering table

Add the L3BANK ranges according to the spec.

v2:
  - Fix typo in one of the ranges: s/0x00BCFF/0x008CFF/ (Matt Roper)
  - Add termination rule in the init function for L3BANK (Matt Roper)

Bspec: 66534
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:45 -05:00
Thomas Hellström
4b1430f775 drm/xe/vm: Use the correct vma destroy sequence on userptr failure
Fix the below warning by using the correct vma destroy sequence:

[   92.204921] ------------[ cut here ]------------
[   92.204954] WARNING: CPU: 3 PID: 2449 at drivers/gpu/drm/xe/xe_vm.c:933 xe_vma_destroy+0x280/0x290 [xe]
[   92.205002] Modules linked in: ccm nft_objref cmac nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables nfnetlink ip6table_filter iptable_filter bnep sunrpc vfat fat iwlmvm mac80211 intel_rapl_msr ee1004 ppdev intel_rapl_common snd_hda_codec_realtek libarc4 iTCO_wdt snd_hda_codec_generic intel_pmc_bxt x86_pkg_temp_thermal iTCO_vendor_support intel_powerclamp coretemp intel_cstate iwlwifi btusb btrtl btbcm snd_hda_intel btintel snd_intel_dspcfg eeepc_wmi snd_hda_codec asus_wmi bluetooth snd_hwdep snd_seq ledtrig_audio snd_hda_core snd_seq_device sparse_keymap cfg80211 snd_pcm intel_uncore joydev platform_profile mei_me wmi_bmof intel_wmi_thunderbolt snd_timer pcspkr ecdh_generic i2c_i801 snd
[   92.205060]  ecc mei rfkill soundcore idma64 i2c_smbus parport_pc parport acpi_pad acpi_tad xe drm_ttm_helper ttm i2c_algo_bit drm_suballoc_helper kunit drm_buddy gpu_sched drm_display_helper drm_kms_helper drm crct10dif_pclmul crc32_pclmul crc32c_intel nvme nvme_core e1000e ghash_clmulni_intel drm_panel_orientation_quirks video wmi pinctrl_tigerlake usb_storage ip6_tables ip_tables fuse
[   92.205242] CPU: 3 PID: 2449 Comm: xe_vm Tainted: G     U             6.1.0+ #120
[   92.205254] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 0403 01/26/2021
[   92.205266] RIP: 0010:xe_vma_destroy+0x280/0x290 [xe]
[   92.205299] Code: 74 15 48 8b 93 a0 01 00 00 48 8b 83 a8 01 00 00 48 89 42 08 48 89 10 4c 89 ab a0 01 00 00 4c 89 ab a8 01 00 00 e9 1b fe ff ff <0f> 0b e9 a3 fe ff ff 0f 0b e9 82 fe ff ff 66 90 0f 1f 44 00 00 48
[   92.205322] RSP: 0018:ffffaadd465c3a58 EFLAGS: 00010246
[   92.205331] RAX: 0000000000000000 RBX: ffff9706d53ed400 RCX: 0000000000000001
[   92.205341] RDX: ffff9706d53ed480 RSI: ffffffffa756dc2b RDI: ffffffffa760a05e
[   92.205351] RBP: 0000000000000000 R08: 0000000000000000 R09: 000000002c5370a2
[   92.205361] R10: ffff9706ca520000 R11: 0000000022c5370a R12: ffff9706cad03800
[   92.205370] R13: 000000000004ffff R14: fffffffffffffff2 R15: 0000000000000000
[   92.205380] FS:  00007fe98203a940(0000) GS:ffff970dffac0000(0000) knlGS:0000000000000000
[   92.205392] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   92.205400] CR2: 00007fe982ccb000 CR3: 000000010d6e6003 CR4: 0000000000770ee0
[   92.205410] PKRU: 55555554
[   92.205415] Call Trace:
[   92.205419]  <TASK>
[   92.205426]  vm_bind_ioctl_lookup_vma+0x9bb/0xbf0 [xe]
[   92.205461]  ? lock_is_held_type+0xe3/0x140
[   92.205472]  ? xe_vm_find_overlapping_vma+0x77/0x90 [xe]
[   92.205503]  ? __vm_bind_ioctl_lookup_vma.constprop.0+0x9e/0xe0 [xe]
[   92.205533]  ? __lock_acquire+0x3a3/0x1fb0
[   92.205543]  ? register_lock_class+0x38/0x480
[   92.205550]  ? __lock_acquire+0x3a3/0x1fb0
[   92.205558]  ? __lock_acquire+0x3a3/0x1fb0
[   92.205567]  ? __lock_acquire+0x3a3/0x1fb0
[   92.205579]  ? lock_acquire+0xbf/0x2b0
[   92.205586]  ? lock_acquire+0xcf/0x2b0
[   92.205597]  xe_vm_bind_ioctl+0x977/0x1c30 [xe]
[   92.205630]  ? find_held_lock+0x2b/0x80
[   92.205640]  ? lock_release+0x131/0x2c0
[   92.205648]  ? xe_vm_ttm_bo+0x40/0x40 [xe]
[   92.205677]  drm_ioctl_kernel+0xa1/0x150 [drm]
[   92.205706]  drm_ioctl+0x221/0x420 [drm]
[   92.205727]  ? xe_vm_ttm_bo+0x40/0x40 [xe]
[   92.205764]  __x64_sys_ioctl+0x8d/0xd0
[   92.205774]  do_syscall_64+0x37/0x90
[   92.205781]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   92.205790] RIP: 0033:0x7fe982be8d6f
[   92.205797] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[   92.205821] RSP: 002b:00007ffde9f9c560 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   92.205832] RAX: ffffffffffffffda RBX: 00007fadeadbe000 RCX: 00007fe982be8d6f
[   92.205842] RDX: 00007ffde9f9c5f0 RSI: 0000000040786445 RDI: 0000000000000003
[   92.205851] RBP: 00007ffde9f9c5f0 R08: 00007fadeadbe000 R09: 0000000000040000
[   92.205861] R10: 0000000000000003 R11: 0000000000000246 R12: 0000000040786445
[   92.205871] R13: 0000000000000003 R14: 0000000000000003 R15: 00007fe982e02000
[   92.205888]  </TASK>
[   92.205892] irq event stamp: 82723
[   92.205897] hardirqs last  enabled at (82731): [<ffffffffa617660e>] __up_console_sem+0x5e/0x70
[   92.205910] hardirqs last disabled at (82738): [<ffffffffa61765f3>] __up_console_sem+0x43/0x70
[   92.205922] softirqs last  enabled at (82182): [<ffffffffa60f026d>] __irq_exit_rcu+0xed/0x160
[   92.205935] softirqs last disabled at (82163): [<ffffffffa60f026d>] __irq_exit_rcu+0xed/0x160
[   92.205947] ---[ end trace 0000000000000000 ]---

Reported-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:45 -05:00
Matt Roper
13fb0c9872 drm/xe: Add support for CCS engine fusing
For Xe_HP platforms that can have multiple CCS engines, the
presence/absence of each CCS is inferred by the presence/absence of any
DSS in the corresponding quadrant of the GT's DSS mask.

This handling is only needed on platforms that can have more than one
CCS.  The CCS is never fused off on platforms like MTL that can only
have one.

v2:
 - Add extra warnings to try to catch mistakes where the register counts
   in get_num_dss_regs() are updated without corresponding updates to
   the register parameters passed to load_dss_mask().  (Lucas)
 - Add kerneldoc for xe_gt_topology_has_dss_in_quadrant() and clarify
   why we care about quadrants of the DSS space.  (Lucas)
 - Ensure CCS engine counting treats engine mask as 64-bit.  (Lucas)

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230309005530.3140173-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:45 -05:00
Matt Roper
7c7225ddaa drm/xe: Separate engine fuse handling into dedicated functions
The single function to handle fuse registers for all types of engines is
becoming a bit long and hard to follow (and we haven't even added the
compute engines yet).  Let's split it into dedicated functions for each
engine class.

v2:
 - Add note about BCS0 always being present.  (Bala)
 - Add forcewake assertion to read_copy_fuses.  (Bala)

Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230309005530.3140173-1-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:45 -05:00
Matthew Auld
2a8477f761 drm/xe: s/lmem/vram/
This seems to be the preferred nomenclature in xe. Currently we are
intermixing vram and lmem, which is confusing.

v2 (Gwan-gyeong Mun & Lucas):
  - Rather apply to the entire driver

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:29:45 -05:00