1235076 Commits

Author SHA1 Message Date
Lucas De Marchi
3512a78a3c drm/xe: Use XE_REG/XE_REG_MCR
These should replace the _MMIO() and MCR_REG() from i915, with the goal
of being more extensible, allowing to pass the additional fields for
struct xe_reg and struct xe_reg_mcr. Replace all uses of _MMIO() and
MCR_REG() in xe.

Since the RTP, reg-save-restore and WA infra are not ready to use the
new type, just undef the macro like was done for the i915 types
previously. That conversion will come later.

v2: Remove MEDIA_SOFT_SCRATCH_COUNT/MEDIA_SOFT_SCRATCH re-added by
    mistake (Matt Roper)

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230427223256.1432787-8-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:32:21 -05:00
Lucas De Marchi
36e22be498 drm/xe: Introduce xe_reg/xe_reg_mcr
Stop using i915 types for registers. Use our own types. Differently from
i915, this will keep under the register definition the knowledge for the
different types of registers. For now, the "flags"/"options" are mcr and
masked, although only the former is being used.

Additionally MCR registers have their own type. The only place that
should really look inside a xe_mcr_reg_t is that code dealing with the
steering and using other APIs when the register is MCR has been a source
of problem in the past.

Most of the driver is agnostic to the register differences since they
either use the definition from the header or already call the correct
MCR_REG()/_MMIO() macros. By embeding the struct xe_reg inside the
struct it's also possible to guarantee the compiler will break if
using RANDOM_MCR_REG.reg is attempted, since now the u32 is inside the
inner struct.

v2:
  - Deep a dedicated type for MCR registers to avoid misuse
    (Matt Roper, Jani)
  - Drop the typedef and just use a struct since it's not an opaque type
    (Jani)
  - Add more kernel-doc
v3:
  - Use only 22 bits for the register address since all the platforms
    supported so far have only 4MB of MMIO per tile  (Matt Roper)

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230427223256.1432787-7-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:32:21 -05:00
Lucas De Marchi
143e3bc783 drm/xe: Clarify register types on PAT programming
Clarify a few things related to the PAT programming, particularly on
MTL:

	- The register type doesn't change depending on the GT - what
	  happens is that media GT writes to other set of registers that
	  are not MCR
	- Remove "UNICAST": otherwise it's confusing why it's not using
	  MCR registers with the unicast function variant

Also, there isn't much reason to keep those parts as macros: promote
them to proper functions and let the compiler inline if it sees fit.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230427223256.1432787-5-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:32:21 -05:00
Lucas De Marchi
5f230a144a drm/xe: Use REG_FIELD/REG_BIT for all regs/*.h
Convert the macro declarations to the equivalent GENMASK and
and bitfield prep for all registers.

v2 (Matt Roper):
  - Fix wrong conversion of RPM_CONFIG0_CRYSTAL_CLOCK_FREQ_MASK
  - Reorder fields of XEHP_SLICE_UNIT_LEVEL_CLKGATE for consistency
  - Simplify CTC_SOURCE_* by only defining CTC_SOURCE_DIVIDE_LOGIC
    as REG_BIT(0)

v3: Also remove DOP_CLOCK_GATE_ENABLE that is unused and wrongly defined

Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://lore.kernel.org/r/20230427223256.1432787-4-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:32:21 -05:00
Lucas De Marchi
d9b79ad275 drm/xe: Drop gen afixes from registers
The defines for the registers were brought over from i915 while
bootstrapping the driver. As xe supports TGL and later only, it doesn't
make sense to keep the GEN* prefixes and suffixes in the registers: TGL
is graphics version 12, previously called "GEN12". So drop the prefix
everywhere.

v2:
  - Also drop _TGL suffix and reword commit message as suggested
    by Matt Roper. While at it, rename VSUNIT_CLKGATE_DIS_TGL to
    VSUNIT_CLKGATE2_DIS with the additional "2", so it doesn't clash
    with the define for the other register

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230427223256.1432787-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:32:15 -05:00
Lucas De Marchi
7b829f6dd6 drm/xe/guc: Convert GuC registers to REG_FIELD/REG_BIT
Cleanup GuC register declarations by converting them to use REG_FIELD,
REG_BIT and REG_GENMASK. While converting, also reorder the bitfields
so they follow the convention of declaring the higher bits first.

v2:
  - Drop unused HUC_LOADING_AGENT_VCR and DMA_ADDRESS_SPACE_GTT (Matt Roper)
  - Simplify HUC_LOADING_AGENT_GUC define (Matt Roper)

Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://lore.kernel.org/r/20230427223256.1432787-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:32:15 -05:00
Maarten Lankhorst
1bd4db39de drm/xe: Remove extra xe_mmio_read32 from xe_mmio_wait32
Commit 7aaec3a623ad ("drm/xe: Let's return last value read on
xe_mmio_wait32.") mentions that we should return the last value read,
but we never actually return it. This breaks display which depends on
the value being actually returned where needed.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Fixes: 7aaec3a623ad ("drm/xe: Let's return last value read on xe_mmio_wait32.")
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/257
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:47 -05:00
Lucas De Marchi
a9b1a13614 drm/xe/guc: Move GuC registers to regs/
There's no good reason to keep the GuC registers outside the regs/
directory: move the header with GuC registers under that.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:47 -05:00
Lucas De Marchi
e8178f8076 drm/xe/guc: Rename GEN11_SOFT_SCRATCH for clarity
That register is a completely different register, it's not the same as
SOFT_SCRATCH for GEN11 and beyond. Rename to to the same name as the
bspec uses, including the new variant for media. Also, move the
definitions to the guc header.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:46 -05:00
Lucas De Marchi
56492dacee drm/xe: Rename instruction field to avoid confusion
There was both BLT_DEPTH_32 and XY_FAST_COLOR_BLT_DEPTH_32 - also add
the prefix to the first to make it clear this is about the FAST_**COPY**
operation. While at it, remove the GEN9_ prefix.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:46 -05:00
Lucas De Marchi
bb36f4b4ed drm/xe: Rename RC0/RC6 macros
Follow up commits will mass-remove the gen prefix/suffix. For GEN6_RC0
and GEN6_RC6 that would make the variable too short and easy to
conflict. So, add "GT_" prefix that is also part of the register name.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:43 -05:00
Lucas De Marchi
58e19acf0c drm/xe: Cleanup page-related defines
Rename the following defines to lose the GEN* prefixes since they don't
make sense for xe:

GEN8_PTE_SHIFT		-> XE_PTE_SHIFT
GEN8_PAGE_SIZE		-> XE_PAGE_SIZE
GEN8_PTE_MASK		-> XE_PTE_MASK
GEN8_PDE_SHIFT		-> XE_PDE_SHIFT
GEN8_PDES		-> XE_PDES
GEN8_PDE_MASK		-> XE_PDE_MASK
GEN8_64K_PTE_SHIFT	-> XE_64K_PTE_SHIFT
GEN8_64K_PAGE_SIZE	-> XE_64K_PAGE_SIZE
GEN8_64K_PTE_MASK	-> XE_64K_PTE_MASK
GEN8_64K_PDE_MASK	-> XE_64K_PDE_MASK
GEN8_PDE_PS_2M		-> XE_PDE_PS_2M
GEN8_PDPE_PS_1G		-> XE_PDPE_PS_1G
GEN8_PDE_IPS_64K	-> XE_PDE_IPS_64K
GEN12_GGTT_PTE_LM	-> XE_GGTT_PTE_LM
GEN12_USM_PPGTT_PTE_AE	-> XE_USM_PPGTT_PTE_AE
GEN12_PPGTT_PTE_LM	-> XE_PPGTT_PTE_LM
GEN12_PDE_64K		-> XE_PDE_64K
GEN12_PTE_PS64		-> XE_PTE_PS64
GEN8_PAGE_PRESENT	-> XE_PAGE_PRESENT
GEN8_PAGE_RW		-> XE_PAGE_RW
PTE_READ_ONLY		-> XE_PTE_READ_ONLY

Keep an XE_ prefix to make sure we don't mix the defines for the CPU
(e.g. PAGE_SIZE) with the ones fro the GPU).

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:43 -05:00
Rodrigo Vivi
9d3c8fb98b drm/xe: Fix print of RING_EXECLIST_SQ_CONTENTS_HI
On xe_hw_engine_print_state we were printing:
value_of(0x510) + 4 instead of
value_of(0x514) as desired.

So, let's properly define a RING_EXECLIST_SQ_CONTENTS_HI
register to fix the issue and also to avoid other issues
like that.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
2023-12-19 18:31:43 -05:00
Rodrigo Vivi
052df73b9e drm/xe: Update comment on why d3cold is still blocked.
The main issue with buddy allocator was fixed, but then
we ended up on other issues, so we need to step back and rethink
our strategy with D3cold. So, let's update the comment with a
todo list so we don't get tempted in removing it before we are
really ready.

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Riana Tauro <riana.tauro@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2023-12-19 18:31:43 -05:00
José Roberto de Souza
a121594006 drm/xe: Limit the system memory size to half of the system memory
ttm_global_init() imposes this limitation.

Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:42 -05:00
Francois Dugast
fdb3abcebb drm/xe: Fix build without CONFIG_PM_SLEEP
Build without CONFIG_PM_SLEEP (such as for riscv) was failing due
to unused xe_pci_runtime_* functions.

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:42 -05:00
Riana Tauro
a180f4e13c drm/xe/guc_pc: Reorder forcewake and xe_pm_runtime calls
When the device is runtime suspended, reading some of the sysfs
entries under device/gt#/ causes a resume error
This is due to the ordering of pm_runtime and forcewake calls.
Reorder to wake up using xe_pm_runtime_get and then forcewake

v2: add goto statements (Rodrigo)

Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:42 -05:00
Balasubramani Vivekanandan
96cb46df56 drm/xe: Keep all resize bar related prints inside xe_resize_vram_bar
xe_resize_vram_bar() function is already printing the status of bar
resizing. It has prints covering both success and failure.
There is no need of additional prints in the caller which were not so
easily to follow.

Modified all BAR size prints to consistently print the size in MiB.

Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:42 -05:00
Matt Roper
e881b1292f drm/xe: Drop GFX_FLSH_CNTL_GEN6 write during GGTT invalidation
The write of GFX_FLSH_CNTL_GEN6 was inherited from the i915 codebase
where it was used to force a flush of the write-combine buffer in cases
where the GSM/GGTT were mapped as WC.  Since Xe never uses WC mappings
of the GGTT, this register write is unnecessary.  Furthermore, this
register was removed on Xe_HP-based platforms, so this write winds up
clobbering an unrelated register.

v2:
 - Also drop GFX_FLSH_CNTL_GEN6 from the register file now that it's no
   longer used.  (Lucas)

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230418230247.3802438-1-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:42 -05:00
Matt Roper
d33dc1dc29 drm/xe: Fix xe_mmio_rmw32 operation
xe_mmio_rmw32 was failing to invert the passed in mask, resulting in a
register update that wasn't the expected RMW operation.  Fortunately the
impact of this mistake was limited, since this function isn't heavily
used in Xe right now; this will mostly fix some GuC PM interrupt
unmasking.

v2:
 - Rename parameters as 'clr' and 'set' to clarify semantics.  (Lucas)

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230421145006.10940-1-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:42 -05:00
Matthew Auld
7500477ded drm/xe/lrc: give start_seqno a better default
If looking at the initial engine dump we should expect this to match
XE_FENCE_INITIAL_SEQNO - 1.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:42 -05:00
Matthew Auld
1a9d163c42 drm/xe/sched_job: prefer dma_fence_is_later
Doesn't look like we are accounting for seqno wrap. Just use
__dma_fence_is_later() like we already do for xe_hw_fence_signaled().

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:41 -05:00
Matt Roper
79f2432e31 drm/xe/sr: Apply masked registers properly
The 'clear' field for register save/restore entries was being placed in
the value bits of the register rather than the mask bits; make sure it
gets shifted into the mask bits.

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230419224909.4000920-1-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:41 -05:00
Matthew Auld
fa4fe0db08 drm/xe/tlb: fix expected_seqno calculation
It looks like when tlb_invalidation.seqno overflows
TLB_INVALIDATION_SEQNO_MAX, we start counting again from one, as per
send_tlb_invalidation(). This is also inline with initial value we give
it in xe_gt_tlb_invalidation_init().  When calculating the
expected_seqno we should also take this into account.

While we are here also print out the values if we ever trigger the
warning.

v2 (José):
  - drm_WARN_ON() is preferred over plain WARN_ON(), since it gives
    information on the originating device.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/248
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:41 -05:00
Anusha Srivatsa
a8a39c15b0 drm/xe: Add Rocketlake device info
Add missing device info for Rocketlake.
While at it, also set the value for IS_ROCKETLAKE
macro which is right now set to 0.

v2: Also add abox_mask to the device info(Lucas)
v3: rebase
v4: Set IS_ROCKETLAKE (Anusha)

Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Tested-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>(v2)
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:41 -05:00
Matthew Auld
221896e54a drm/xe/mmio: stop incorrectly triggering drm_warn
CI keeps triggering:

xe 0000:03:00.0: [drm] Restricting VRAM size to PCI resource size
(0x400000000->0x3fa000000)

Due to usable_size vs vram_size differences. However, we only want to
trigger the drm_warn() to let developers know that the system they are
using is going clamp the VRAM size to match the IO size, where they can
likely only use 256M of VRAM. Once we properly support small-bar we can
revisit this.

v2 (Lucas): Drop the TODO for now

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:41 -05:00
Anusha Srivatsa
94324e6bed drm/xe: GuC and HuC loading support for RKL
Rocketlake uses TGL GuC and HuC

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:41 -05:00
Matt Roper
3c6be2542e drm/xe: Only request PCODE_WRITE_MIN_FREQ_TABLE on LLC platforms
PCODE_WRITE_MIN_FREQ_TABLE is only applicable to platforms with an LLC.
Change the discrete GPU check to an LLC check instead; this take care of
skipping not only the discrete platforms, but also integrated platforms
like MTL that do not have an LLC.

Fixes MTL dmesg error:

  xe 0000:00:02.0: [drm] *ERROR* PCODE Mailbox failed: 1 Illegal Command

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230410183910.2696628-3-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:41 -05:00
Matt Roper
bf08dd47d1 drm/xe: Track whether platform has LLC
Some driver initialization is conditional on the presence of an LLC.
Add an extra feature flag to support this.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230410183910.2696628-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:41 -05:00
Matt Roper
689f40f520 drm/xe: Use packed bitfields for xe->info feature flags
Replace 'bool' fields with single bits to allow the various device
feature flags to pack more tightly.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20230410183910.2696628-1-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:40 -05:00
Matthew Brost
67f2f0d737 drm/xe: Don't grab runtime PM ref in engine create IOCTL
A VM had a runtime PM ref, a engine can't be created without a VM, and
the engine holds a ref to the VM thus this is unnecessary. Beyond that
taking a ref in the engine create IOCTL and dropping it in the destroy
IOCTL is wrong as a user doesn't have to call the destroy IOCTL (e.g.
they can just kill the process or close the driver FD). If a user does
this PM refs are leaked.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:40 -05:00
Matt Roper
0a12a612c8 drm/xe: Let primary and media GT share a kernel_bb_pool
The media GT requires a valid gt->kernel_bb_pool during driver probe to
allocate the WA and NOOP batchbuffers used to record default context
images.  Dynamically allocate the bb_pools so that the primary and media
GT can use the same pool during driver init.

The media GT still shouldn't be need the USM pool, so only hook up the
kernel_bb_pool for now.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20230410200229.2726648-1-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:40 -05:00
Niranjana Vishwanathapura
2988cf02ee drm/xe: Fix memory use after free
The wait_event_timeout() on g2h_fence.wq which is declared on
stack can return before the wake_up() gets called, resulting in a
stack out of bound access when wake_up() accesses the g2h_fene.wq.

Do not declare g2h_fence related wait_queue_head_t on stack.

Fixes the below KASAN BUG and associated kernel crashes.

BUG: KASAN: stack-out-of-bounds in do_raw_spin_lock+0x6f/0x1e0
Read of size 4 at addr ffff88826252f4ac by task kworker/u128:5/467

CPU: 25 PID: 467 Comm: kworker/u128:5 Tainted: G  U 6.3.0-rc4-xe #1
Workqueue: events_unbound g2h_worker_func [xe]
Call Trace:
 <TASK>
 dump_stack_lvl+0x64/0xb0
 print_report+0xc2/0x600
 kasan_report+0x96/0xc0
 do_raw_spin_lock+0x6f/0x1e0
 _raw_spin_lock_irqsave+0x47/0x60
 __wake_up_common_lock+0xc0/0x150
 dequeue_one_g2h+0x20f/0x6a0 [xe]
 g2h_worker_func+0xa9/0x180 [xe]
 process_one_work+0x527/0x990
 worker_thread+0x2d1/0x640
 kthread+0x174/0x1b0
 ret_from_fork+0x29/0x50
 </TASK>

Tested-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Bruce Chang <yu.bruce.chang@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:40 -05:00
Matthew Auld
36919ebeaa drm/xe: fix suspend-resume for dgfx
This stopped working now that TTM treats moving a pinned object through
ttm_bo_validate() as an error, for the general case. Add some new
routines to handle the new special casing needed for suspend-resume.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/244
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:40 -05:00
Matt Roper
21cc8aaddd drm/xe: Clean up xe_device_desc
Now that most of the characteristics of a device are associated with the
graphics and media IPs, the remaining contents of xe_device_desc can be
cleaned up a bit:

 * 'gt' is unused; drop it
 * DEV_INFO_FOR_EACH_FLAG only covers two flags and is only used in this
   one file; drop the unnecessary macro complexity
 * Convert .has_4tile to a single bitfield bit so that it can be packed
   with the other feature flags
 * Move 'platform' lower in the structure for better packing

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230406235621.1914492-10-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:40 -05:00
Matt Roper
3713ed52ef drm/xe: Add KUnit test for xe_pci.c IP engine lists
Add a simple KUnit test to ensure that the hardware engine lists for
GMD_ID IP definitions are sensible (i.e., no graphics engines defined
for the media IP and vice versa).

Only the IP descriptors for GMD_ID platforms are checked for now.
Presumably the engine lists on older pre-GMD_ID platforms shouldn't be
changing.  We can extend the KUnit testing in the future if we decide we
want to check those as well.

v2:
 - Add missing 'const' in xe_call_for_each_media_ip to avoid compiler
   warning.

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230406235621.1914492-9-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:40 -05:00
Matt Roper
5822bba943 drm/xe: Select graphics/media descriptors from GMD_ID
If graphics_desc and media_desc are not specified in a platform's
xe_device_desc, treat this as an indication that the IP version should
be determined from the hardware's GMD_ID register.

Note that leaving media_desc unset for a platform that simply doesn't
have the IP (e.g., PVC) is also okay --- a read of the GMD_ID register
offset will be attempted, but since there's no register at that location
a value of '0' will be returned, effectively disabling media support.

Mapping of version -> IP description is done via a table lookup; this
table will be re-used in future patches for some KUnit testing.

v2:
 - Drop dummy structures.  NULL can be safely used for both the GMD_ID
   cases and the "media not present case."
 - Use a table-based lookup of GMD_ID versions rather than a simple
   switch statement; the table will allow us to easily perform kunit
   testing of all the IP descriptors.

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230406235621.1914492-8-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:40 -05:00
Matt Roper
9a08b2b935 drm/xe: Add printable name to IP descriptors
Printing the name, along with the IP version number, will help reduce
confusion about which IP is present on a platform.

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230406235621.1914492-7-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:39 -05:00
Matt Roper
bd75664b9c drm/xe: Clarify GT counting logic
The total number of GTs supported by a platform should be one primary
GT, one media GT (if media version >= 13), and a number of remote tile
GTs dependent on the graphics IP present.  Express this more clearly in
the device setup.

Note that xe->info.tile_count is inaccurately named; the rest of the
driver treats this as the GT count, not just the tile count.  This
will need to be cleaned up at some point down the road.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230406235621.1914492-6-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:39 -05:00
Matt Roper
33b270d939 drm/xe: Move engine masks into IP descriptor structures
Break the top-level platform_engine_mask field into separate
hw_engine_mask fields in the graphics and media structures.  Since
hardware has more flexibility to mix-and-match IP versions going
forward, this allows each IP to list exactly which engines it provides;
the final per-GT engine list can then be constructured from those:

 * On platforms without a standalone media GT (i.e., media IP versions
   prior to 13), the primary GT's engine list is the union of the
   graphics IP's engine list and the media IP's engine list.
 * Otherwise, GT0's engine list is the graphics IP's engine list.
 * For GT1 and beyond, the type of GT determines which IP's engine list
   is used.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230406235621.1914492-5-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:39 -05:00
Matt Roper
ce22dece00 drm/xe: Move most platform traits to graphics IP
Most of the traits currently in the device descriptor structures are
either tied to the graphics IP or should be inferred from the graphics
IP.  This becomes important on MTL and beyond where IP versions are
supposed to be detected from the hardware's GMD_ID registers rather than
mapped from PCI devid.

Engine masks are left where they are for now; they'll be dealt with
separately in a future patch.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230406235621.1914492-4-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:39 -05:00
Matt Roper
c94f32e4f5 drm/xe: Set require_force_probe in each platform's description
Set require_force_probe explicitly in each platform's description
structure rather than embedding it within the FOO_FEATURES macros.  Even
though we expect all platforms currently supported by the Xe driver to
be under force_probe protection, this will help prepare for some other
upcoming restructuring.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230406235621.1914492-3-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:39 -05:00
Matt Roper
c8d72dfb28 drm/xe: Start splitting xe_device_desc into graphics/media structures
Rather than storing all characteristics for an entire platform in the
xe_device_desc structure, create secondary graphics and media structures
to hold traits and feature flags specific to those IPs.  This will
eventually allow us to assign the graphics and media characteristics at
runtime based on the contents of the relevant GMD_ID registers.

For now, just move the IP versions into the new structures to keep
things simple.  Other IP-specific fields will migrate to these
structures in future patches.

Note that there's one functional change introduced by this:  previously
PVC was recognized as media version 12.60.  That's technically true, but
in practice the media engines are fused off on all production hardware.
By simply not assigning a media IP structure to PVC it will effectively
be treated as IP version 0.0 now (which the rest of the driver should
treat as non-existent media).

v2:
 - Split the new structures out to their own header.  This will ease the
   addition of KUnit tests later.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230406235621.1914492-2-matthew.d.roper@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:39 -05:00
Lucas De Marchi
61e72e77b6 drm/xe: Always log GuC/HuC firmware versions
When debugging issues related to GuC/HuC, it's important to know what is
the firmware version being used. The version from the filename can't be
relied upon, also because it normally only contains the major version
(except for the ones under experimental support).

Log the version from the blob after reading the CSS header. Example:

	xe 0000:03:00.0: [drm] Using GuC firmware (70.5) from i915/dg2_guc_70.bin

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20230405224725.1993719-1-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:39 -05:00
Matthew Brost
1c060057ec drm/xe: Always write GEN12_RCU_MODE.GEN12_RCU_MODE_CCS_ENABLE for CCS engines
If CCS0 was fused we did not write this register thus CCS engine were
not enabled resulting in driver load failures.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:38 -05:00
Lucas De Marchi
ad55ead7f3 drm/xe: Update GuC/HuC firmware autoselect logic
Update the logic to autoselect GuC/HuC for the platforms with the
following improvements:

- Document what is the firmware file that is expected to be
  loaded and what is checked from blob headers

- When the platform is under force-probe it's desired to enforce
  the full-version requirement so the correct firmware is used
  before widespread adoption and backward-compatibility
  commitments

- Directory from which we expect firmware blobs to be available in
  upstream linux-firmware repository depends on the platform: for
  the ones supported by i915 it uses the i915/ directory, but the ones
  expected to be supported by xe, it's on the xe/ directory. This
  means that for platforms in the intersection, the firmware is
  loaded from a different directory, but that is not much important
  in the firmware repo and it avoids firmware duplication.

- Make the table with the firmware definitions clearly state the
  versions being expected. Now with macros to select the version it's
  possible to choose between full-version/major-version for GuC and
  full-version/no-version for HuC. These are similar to the macros used
  in i915, but implemented in a slightly different way to avoid
  duplicating the macros for each firmware/type and functionality,
  besides adding the support for different directories.

- There is no check added regarding force-probe since xe should
  reuse the same firmware files published for i915 for past
  platforms. This can be improved later with additional
  kunit checking against a hardcoded list of platforms that
  falls in this category.

- As mentioned in the TODO, the major version fallback was not
  implemented before as currently each platform only supports one
  major. That can be easily added later.

- GuC version for MTL and PVC were updated to 70.6.4, using the exact
  full version, while the

After this the GuC firmware used by PVC changes to pvc_guc_70.5.2.bin
since it's using a file not published yet.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Link: https://lore.kernel.org/r/20230324051754.1346390-4-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:32 -05:00
Lucas De Marchi
b9d773fc51 drm/xe: Add test for GT workarounds and tunings
In order to avoid mistakes when populating the workarounds, it's good to
be able to test if the entries added are all compatible for a certain
platform. The platform itself is not needed as long as we create fake
devices with enough configuration for the RTP helpers to process the
tables.  Common mistakes that can be avoided:

	- Entries clashing the bitfields being updated
	- Register type being mixed (MCR vs regular / masked vs regular)
	- Unexpected errors while adding the reg_sr entry

To test, inject a duplicate entry in gt_was, but with platform == tigerlake
rather than the currenct graphics version check:

       { XE_RTP_NAME("14011059788"),
	 XE_RTP_RULES(PLATFORM(TIGERLAKE)),
	 XE_RTP_ACTIONS(SET(GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE))
       },

This produces the following result:

	$  ./tools/testing/kunit/kunit.py run \
		--kunitconfig drivers/gpu/drm/xe/.kunitconfig  xe_wa

	[14:18:02] Starting KUnit Kernel (1/1)...
	[14:18:02] ============================================================
	[14:18:02] ==================== xe_wa (1 subtest) =====================
	[14:18:02] ======================== xe_wa_gt  =========================
	[14:18:02] [drm:xe_reg_sr_add] *ERROR* Discarding save-restore reg 9550 (clear: 00000200, set: 00000200, masked: no): ret=-22
	[14:18:02]     # xe_wa_gt: ASSERTION FAILED at drivers/gpu/drm/xe/tests/xe_wa_test.c:116
	[14:18:02]     Expected gt->reg_sr.errors == 0, but
	[14:18:02]         gt->reg_sr.errors == 1 (0x1)
	[14:18:02] [FAILED] TIGERLAKE (B0)
	[14:18:02] [PASSED] DG1 (A0)
	[14:18:02] [PASSED] DG1 (B0)
	...

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://lore.kernel.org/r/20230401085151.1786204-8-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:32 -05:00
Lucas De Marchi
4cc0440229 drm/xe: Add basic unit tests for rtp
Add some basic unit tests for rtp. This is intended to prove the
functionality of the rtp itself, like coalescing entries, rejecting
non-disjoint values, etc.

Contrary to the other tests in xe, this is a unit test to test the
sw-side only, so it can be executed on any machine - it doesn't interact
with the real hardware. Running it produces the following output:

	$ ./tools/testing/kunit/kunit.py run --raw_output-kunit  \
		--kunitconfig drivers/gpu/drm/xe/.kunitconfig xe_rtp
	...
	[01:26:27] Starting KUnit Kernel (1/1)...
	KTAP version 1
	1..1
	    KTAP version 1
	    # Subtest: xe_rtp
	    1..1
		KTAP version 1
		# Subtest: xe_rtp_process_tests
		ok 1 coalesce-same-reg
		ok 2 no-match-no-add
		ok 3 no-match-no-add-multiple-rules
		ok 4 two-regs-two-entries
		ok 5 clr-one-set-other
		ok 6 set-field
	[drm:xe_reg_sr_add] *ERROR* Discarding save-restore reg 0001 (clear: 00000001, set: 00000001, masked: no): ret=-22
		ok 7 conflict-duplicate
	[drm:xe_reg_sr_add] *ERROR* Discarding save-restore reg 0001 (clear: 00000003, set: 00000000, masked: no): ret=-22
		ok 8 conflict-not-disjoint
	[drm:xe_reg_sr_add] *ERROR* Discarding save-restore reg 0001 (clear: 00000002, set: 00000002, masked: no): ret=-22
	[drm:xe_reg_sr_add] *ERROR* Discarding save-restore reg 0001 (clear: 00000001, set: 00000001, masked: yes): ret=-22
		ok 9 conflict-reg-type
	    # xe_rtp_process_tests: pass:9 fail:0 skip:0 total:9
	    ok 1 xe_rtp_process_tests
	# Totals: pass:9 fail:0 skip:0 total:9
	ok 1 xe_rtp
	...

Note that the ERRORs in the kernel log are expected since it's testing
incompatible entries.

v2:
  - Use parameterized table for tests  (Michał Winiarski)
  - Move everything to the xe_rtp_test.ko and only add a few exports to the
    right namespace
  - Add more tests to cover FIELD_SET, CLR, partially true rules, etc

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst<maarten.lankhorst@linux.intel.com> # v1
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://lore.kernel.org/r/20230401085151.1786204-7-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:32 -05:00
Lucas De Marchi
7bf350ecb2 drm/xe/reg_sr: Save errors for kunit integration
When there's an entry that is dropped when xe_reg_sr_add(), there's
not much we can do other than reporting the error - it's for certain a
driver issue or conflicting workarounds/tunings. Save the number of
errors to be used later by kunit to report where it happens.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230401085151.1786204-6-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:32 -05:00
Lucas De Marchi
e460410023 drm/xe: Generalize fake device creation
Instead of requiring tests to initialize a fake device an keep it in
sync with xe_pci.c when it's platform-dependent, export a function from
xe_pci.c to be used and piggy back on the device info creation. For
simpler tests that don't need any specific platform and just need a fake
xe device to pass around, xe_pci_fake_device_init_any() can be used.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230401085151.1786204-5-lucas.demarchi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19 18:31:32 -05:00