linux

iv/linux

Author	SHA1	Message	Date
Matthew Brost	0f688c0eb6	drm/xe: Return 2MB page size for compact 64k PTEs Compact 64k PTEs are only intended to be used within a single VMA which covers the entire 2MB range of the compact 64k PTEs. Add XE_VMA_PTE_COMPACT VMA flag to indicate compact 64k PTEs are used and update xe_vma_max_pte_size to return at least 2MB if set. v2: Include missing changes Fixes: 8f33b4f054fc ("drm/xe: Avoid doing rebinds") Fixes: c47794bdd63d ("drm/xe: Set max pte size when skipping rebinds") Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/758 Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240219211942.3633795-4-matthew.brost@intel.com	2024-02-20 08:39:45 -08:00
Matthew Brost	15f0e0c2c4	drm/xe: Add XE_VMA_PTE_64K VMA flag Add XE_VMA_PTE_64K VMA flag to ensure skipping rebinds does not cross 64k page boundaries. Fixes: 8f33b4f054fc ("drm/xe: Avoid doing rebinds") Fixes: c47794bdd63d ("drm/xe: Set max pte size when skipping rebinds") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240219211942.3633795-3-matthew.brost@intel.com	2024-02-20 08:39:34 -08:00
Matthew Brost	19adaccef8	drm/xe: Fix xe_vma_set_pte_size xe_vma_set_pte_size had a return value and did not set the 4k VMA flag. Both of these were incorrect. Fix these. Fixes: c47794bdd63d ("drm/xe: Set max pte size when skipping rebinds") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240219211942.3633795-2-matthew.brost@intel.com	2024-02-20 08:39:33 -08:00
Priyanka Dandamudi	a0df2cc858	drm/xe/xe_bo_move: Enhance xe_bo_move trace Enhanced xe_bo_move trace to be more readable. It will help to show the migration details. Src and dst details. v2: Modify trace_xe_bo_move(), it takes the integer mem_type rather than a string. Make mem_type_to_name() extern, it will be used by trace.(Thomas) v3: Move mem_type_to_name() to xe_bo.[ch] (Thomas, Matt) v4: Add device details to reduce ambiquity related to vram0/vram1. (Oak) v5: Rename mem_type_to_name to xe_mem_type_to_name. (Thomas) v6: Optimised code to use xe_bo_device(__entry->bo). (Thomas) Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Oak Zeng <oak.zeng@intel.com> Cc: Kempczynski Zbigniew <Zbigniew.Kempczynski@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Brian Welty <brian.welty@intel.com> Signed-off-by: Priyanka Dandamudi <priyanka.dandamudi@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240220044748.948496-1-priyanka.dandamudi@intel.com	2024-02-20 08:35:14 +01:00
Lucas De Marchi	237412e453	drm/xe: Enable 32bits build Now that all the issues with 32bits are fixed, enable it again. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240119001612.2991381-6-lucas.demarchi@intel.com	2024-02-19 23:19:15 -08:00
Thomas Hellström	f1a9abc0cf	drm/xe/uapi: Remove support for persistent exec_queues Persistent exec_queues delays explicit destruction of exec_queues until they are done executing, but destruction on process exit is still immediate. It turns out no UMD is relying on this functionality, so remove it. If there turns out to be a use-case in the future, let's re-add. Persistent exec_queues were never used for LR VMs v2: - Don't add an "UNUSED" define for the missing property (Lucas, Rodrigo) v3: - Remove the remaining struct xe_exec_queue::persistent state (Niranjana, Lucas) Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240209113444.8396-1-thomas.hellstrom@linux.intel.com	2024-02-19 12:54:48 +01:00
Arnd Bergmann	f2c9364db5	drm/xe: avoid function cast warnings clang-16 warns about a cast between incompatible function types: drivers/gpu/drm/xe/xe_range_fence.c:155:10: error: cast from 'void ()(const void )' to 'void ()(struct xe_range_fence )' converts to incompatible function type [-Werror,-Wcast-function-type-strict] 155 \| .free = (void ()(struct xe_range_fence rfence)) kfree, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Avoid this with a trivial helper function that calls kfree() here. v2: - s/* rfence/*rfence/ (Thomas) Fixes: 845f64bdbfc9 ("drm/xe: Introduce a range-fence utility") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240213095719.454865-1-arnd@kernel.org	2024-02-15 09:33:25 +01:00
Matthew Brost	761b333718	drm/xe: Remove exec queue bind.fence_* struct xe_exec_queue bind.fence_* members are unused. Remove these. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240213043251.3482928-1-matthew.brost@intel.com	2024-02-14 09:42:47 -08:00
José Roberto de Souza	9bc36e58d1	drm/xe: Add uAPI to query GuC firmware submission version Due to a bug in GuC firmware, Mesa can't enable by default the usage of compute engines in DG2 and newer. A new GuC firmware fixed the issue but until now there was no way for Mesa to know if KMD was running with the fixed GuC version or not, so this uAPI is required. It may be expanded in future to query other firmware versions too. This is querying XE_UC_FW_VER_COMPATIBILITY/submission version because that is also supported by VFs, while XE_UC_FW_VER_RELEASE don't. i915 uAPI: https://patchwork.freedesktop.org/series/129627/ Mesa usage: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25233 v2: - fixed drm_xe_query_uc_fw_version documentation - moved branch_ver as the first version number Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240208183539.185095-1-jose.souza@intel.com	2024-02-13 16:31:51 -05:00
Michal Wajdeczko	be46d7aacf	drm/xe/vf: Don't support MCR registers if VF VF drivers can't operate on MCR registers. Make sure that driver is not trying to read nor write using any of MCR register. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-9-michal.wajdeczko@intel.com	2024-02-13 18:59:54 +01:00
Michal Wajdeczko	96eb895c7e	drm/xe/vf: Don't program PAT if VF PAT programming can only be done by the PF driver. Besides VF drivers don't have access to control registers. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-8-michal.wajdeczko@intel.com	2024-02-13 18:59:53 +01:00
Michal Wajdeczko	602f9ebf32	drm/xe/vf: Don't enable hwmon if VF Registers used by hwmon are not available for VF drivers. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-7-michal.wajdeczko@intel.com	2024-02-13 18:59:52 +01:00
Michal Wajdeczko	3ed34c6552	drm/xe/vf: Don't check if LMEM is initialized if VF It is PF driver responsibility to verify that LMEM was correctly initialized, also VF drivers don't have access to GU_CNTL register. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-6-michal.wajdeczko@intel.com	2024-02-13 18:59:51 +01:00
Michal Wajdeczko	60da62fbe9	drm/xe/vf: Don't initialize stolen memory manager if VF VF drivers don't have access to the stolen memory. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-5-michal.wajdeczko@intel.com	2024-02-13 18:59:50 +01:00
Michal Wajdeczko	18bc97fb4a	drm/xe/vf: Don't program MOCS if VF MOCS programming may only be done by the PF driver. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-4-michal.wajdeczko@intel.com	2024-02-13 18:59:49 +01:00
Michal Wajdeczko	aec14e3370	drm/xe/vf: Don't try to capture engine data unavailable to VF Don't capture engine ring registers as thoe are not available for the VF driver. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-3-michal.wajdeczko@intel.com	2024-02-13 18:59:48 +01:00
Michal Wajdeczko	a43d506008	drm/xe/vf: Assume fixed GSM size if VF VFs can't use size mirrored from PCI config, but it should be safe to assume it covers full 4GiB GGTT. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240213154355.1221-2-michal.wajdeczko@intel.com	2024-02-13 18:59:47 +01:00
Thomas Hellström	157261c58b	drm/xe/pt: Allow for stricter type- and range checking Distinguish between xe_pt and the xe_pt_dir subclass when allocating and freeing. Also use a fixed-size array for the xe_pt_dir page entries to make life easier for dynamic range- checkers. Finally rename the page-directory child pointer array to "children". While no functional change, this fixes ubsan splats similar to: [ 51.463021] ------------[ cut here ]------------ [ 51.463022] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/xe/xe_pt.c:47:9 [ 51.463023] index 0 is out of range for type 'xe_ptw []' [ 51.463024] CPU: 5 PID: 2778 Comm: xe_vm Tainted: G U 6.8.0-rc1+ #218 [ 51.463026] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 2001 02/01/2023 [ 51.463027] Call Trace: [ 51.463028] <TASK> [ 51.463029] dump_stack_lvl+0x47/0x60 [ 51.463030] __ubsan_handle_out_of_bounds+0x95/0xd0 [ 51.463032] xe_pt_destroy+0xa5/0x150 [xe] [ 51.463088] __xe_pt_unbind_vma+0x36c/0x9b0 [xe] [ 51.463144] xe_vm_unbind+0xd8/0x580 [xe] [ 51.463204] ? drm_exec_prepare_obj+0x3f/0x60 [drm_exec] [ 51.463208] __xe_vma_op_execute+0x5da/0x910 [xe] [ 51.463268] ? __drm_gpuvm_sm_unmap+0x1cb/0x220 [drm_gpuvm] [ 51.463272] ? radix_tree_node_alloc.constprop.0+0x89/0xc0 [ 51.463275] ? drm_gpuva_it_remove+0x1f3/0x2a0 [drm_gpuvm] [ 51.463279] ? drm_gpuva_remove+0x2f/0xc0 [drm_gpuvm] [ 51.463283] xe_vm_bind_ioctl+0x1a55/0x20b0 [xe] [ 51.463344] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 51.463414] drm_ioctl_kernel+0xb6/0x120 [ 51.463416] drm_ioctl+0x287/0x4e0 [ 51.463418] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 51.463481] __x64_sys_ioctl+0x94/0xd0 [ 51.463484] do_syscall_64+0x86/0x170 [ 51.463486] ? syscall_exit_to_user_mode+0x7d/0x200 [ 51.463488] ? do_syscall_64+0x96/0x170 [ 51.463490] ? do_syscall_64+0x96/0x170 [ 51.463492] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 51.463494] RIP: 0033:0x7f246bfe817d [ 51.463498] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00 [ 51.463501] RSP: 002b:00007ffc1bd19ad0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 51.463502] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f246bfe817d [ 51.463504] RDX: 00007ffc1bd19b60 RSI: 0000000040886445 RDI: 0000000000000003 [ 51.463505] RBP: 00007ffc1bd19b20 R08: 0000000000000000 R09: 0000000000000000 [ 51.463506] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc1bd19b60 [ 51.463508] R13: 0000000040886445 R14: 0000000000000003 R15: 0000000000010000 [ 51.463510] </TASK> [ 51.463517] ---[ end trace ]--- v2 - Fix kerneldoc warning (Matthew Brost) Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240209112655.4872-1-thomas.hellstrom@linux.intel.com	2024-02-12 22:57:35 +01:00
Jani Nikula	f8237c8c6a	drm/xe: use drm based debugging instead of dev Prefer drm_dbg() over dev_dbg(). Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240212145757.645094-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2024-02-12 20:28:11 +02:00
Matthew Auld	63fb531fbf	drm/xe/display: fix i915_gem_object_is_shmem() wrapper shmem ensures the memory is cleared on allocation, however here we are using TTM, which doesn't natively support shmem (other than for swap), but instead just allocates normal system memory. And we only zero such memory for userspace allocations. In the case of intel_fbdev we are missing the memset_io() since display path incorrectly thinks object is shmem based. Fixes: 44e694958b95 ("drm/xe/display: Implement display support") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240205153110.38340-2-matthew.auld@intel.com	2024-02-09 10:28:48 +00:00
Dani Liberman	82bd83a0cf	drm/xe/irq: allocate all possible msix interrupts If platform supports MSIX, driver needs to allocate all possible interrupts. v2: - drop msix_cap and use the api return code instead. - fix commit message. v3: - pass specific type in irq flags. Cc: Ohad Sharabi <osharabi@habana.ai> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Dani Liberman <dliberman@habana.ai> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240124075058.2302235-1-dliberman@habana.ai	2024-02-08 13:26:44 -08:00
Thomas Hellström	eb538b5574	drm/xe/vm: Avoid reserving zero fences The function xe_vm_prepare_vma was blindly accepting zero as the number of fences and forwarded that to drm_exec_prepare_obj. However, that leads to an out-of-bounds shift in the dma_resv_reserve_fences() and while one could argue that the dma_resv code should be robust against that, avoid attempting to reserve zero fences. Relevant stack trace: [773.183188] ------------[ cut here ]------------ [773.183199] UBSAN: shift-out-of-bounds in ../include/linux/log2.h:57:13 [773.183241] shift exponent 64 is too large for 64-bit type 'long unsigned int' [773.183254] CPU: 2 PID: 1816 Comm: xe_evict Tainted: G U 6.8.0-rc3-xe #1 [773.183256] Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 2014 10/14/2022 [773.183257] Call Trace: [773.183258] <TASK> [773.183260] dump_stack_lvl+0xaf/0xd0 [773.183266] dump_stack+0x10/0x20 [773.183283] ubsan_epilogue+0x9/0x40 [773.183286] __ubsan_handle_shift_out_of_bounds+0x10f/0x170 [773.183293] dma_resv_reserve_fences.cold+0x2b/0x48 [773.183295] ? ww_mutex_lock+0x3c/0x110 [773.183301] drm_exec_prepare_obj+0x45/0x60 [drm_exec] [773.183313] xe_vm_prepare_vma+0x33/0x70 [xe] [773.183375] xe_vma_destroy_unlocked+0x55/0xa0 [xe] [773.183427] xe_vm_close_and_put+0x526/0x940 [xe] Fixes: 2714d5093620 ("drm/xe: Convert pagefaulting code to use drm_exec") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240208132115.3132-1-thomas.hellstrom@linux.intel.com	2024-02-08 16:32:08 +01:00
Lucas De Marchi	6badfc463d	drm/xe: Avoid cryptic message when there's no GuC definition If there's no GuC firmware entry in the table and the user didn't pass an override path, the error message is very cryptic: xe will simply try to continue and then fail when submitting the default context: xe 0000:00:02.0: [drm:xe_pci_probe [xe]] XE_LUNARLAKE 64b0:0001 dgfx:0 gfx:Xe2_LPG (20.04) media:Xe2_LPM (20.00) display:no dma_m_s:46 tc:1 gscfi:0 ... xe: probe of 0000:00:02.0 failed with error -22 Add an explicit error message and bail out: xe 0000:00:02.0: [drm:xe_pci_probe [xe]] XE_LUNARLAKE 64b0:0001 dgfx:0 gfx:Xe2_LPG (20.04) media:Xe2_LPM (20.00) display:no dma_m_s:46 tc:1 gscfi:0 xe 0000:00:02.0: [drm] ERROR No GuC firmware defined for platform xe 0000:00:02.0: [drm] ERROR GuC init failed with -2 ... xe: probe of 0000:00:02.0 failed with error -2 Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240201224724.551130-3-lucas.demarchi@intel.com	2024-02-07 15:08:21 -08:00
Lucas De Marchi	4588341896	drm/xe: Always allow to override firmware The current logic for firmware selection is reminiscent from i915 where there are 2 backends and several platforms support only 1: execlist or GuC. The xe driver has only the GuC backend and it simply doesn't work without it. Allow developers to override the firmware path even if there isn't a firmware entry in the table yet: this allows developers to more easily test the very first firmware before adding it there. The justification above is only true for GuC, however those override paths should really be viewed as developer aid param. Simply make the same logic for all firmwares and allow the override path to be used for all of them. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240201224724.551130-2-lucas.demarchi@intel.com	2024-02-07 15:07:31 -08:00
Matthew Brost	d9890c028d	drm/xe: Remove TEST_VM_ASYNC_OPS_ERROR TEST_VM_ASYNC_OPS_ERROR is broken and unused. Remove for now and will pull back in a later time when it is used, fixed, and properly hidden behind a Kconfig option. Also fixup the supported flags value. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240206045010.2981051-1-matthew.brost@intel.com	2024-02-06 08:51:49 -08:00
Riana Tauro	95ec8c1d6c	drm/xe/pm: add debug logs for D3cold add additional debug logs for PME# capability and presence of ACPI _PR3 resources. This is to identify the reason why the card is not capable of D3cold. No functional changes Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240206055917.2629027-1-riana.tauro@intel.com	2024-02-06 08:49:45 -05:00
Karthik Poosa	404669db60	drm/xe/hwmon: Refactor xe hwmon Check latest platform first in xe_hwmon_get_reg. Move PVC HWMON registers to regs/xe_pcode.h. Suggested-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240201180600.434822-1-karthik.poosa@intel.com	2024-02-06 08:42:03 -05:00
Matthew Auld	8087199cd5	drm/xe/vm: don't ignore error when in_kthread If GUP fails and we are in_kthread, we can have pinned = 0 and ret = 0. If that happens we call sg_alloc_append_table_from_pages() with n_pages = 0, which is not well behaved and can trigger: kernel BUG at include/linux/scatterlist.h:115! depending on if the pages array happens to be zeroed or not. Even if we don't hit that it crashes later when trying to dma_map the returned table. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202171435.427630-2-matthew.auld@intel.com	2024-02-06 10:36:03 +00:00
Matthew Brost	5ad6af5c91	drm/xe: Assume large page size if VMA not yet bound The calculation to determine max page size of a VMA during a REMAP operations assumes the VMA has been bound. This assumption is not true if the VMA is from an eariler operation in an array of binds. If a VMA has not been bound use the maximum page size which will ensure the previous / next REMAP operations are not incorrectly skipped. Fixes: 8f33b4f054fc ("drm/xe: Avoid doing rebinds") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240205231714.2956225-1-matthew.brost@intel.com	2024-02-05 17:42:19 -08:00
Nirmoy Das	17ffcdb041	drm/xe/query: Use kzalloc for drm_xe_query_engines Use kzalloc like other routines for better consistency. v2: Improve the subject(Matt) Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240131051838.24705-1-nirmoy.das@intel.com	2024-02-02 22:44:55 -08:00
John Harrison	db0adab049	drm/xe/guc: Add support for LNL firmware First release of GuC firmware for LNL is now available, so start using it. v2: Actually use xe directory. Doh! (review feedback from Lucas) Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202200017.2133438-6-John.C.Harrison@Intel.com	2024-02-02 22:12:27 -08:00
John Harrison	32ca46bf29	drm/xe/guc: Update to GuC firmware 70.19.2 API compatibility version: 1.8.2 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202200017.2133438-5-John.C.Harrison@Intel.com	2024-02-02 22:12:01 -08:00
John Harrison	6650ad3e09	drm/xe/uc: Include patch version in expectations Patch level releases can be just as important as major level releases if they fix a critical bug. So include the patch version in the expectation check so the user is properly informed if they need to update. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202200017.2133438-4-John.C.Harrison@Intel.com	2024-02-02 22:11:25 -08:00
Xiaoming Wang	86c99abb5f	drm/xe/display: Fix memleak in display initialization intel_power_domains_init is called twice in xe_device_probe: 1) intel_power_domains_init() xe_display_init_nommio() xe_device_probe() 2) intel_power_domains_init() intel_display_driver_probe_noirq() xe_display_init_noirq() xe_device_probe() It needs remove one to avoid power_domains->power_wells double malloc. unreferenced object 0xffff88811150ee00 (size 512): comm "systemd-udevd", pid 506, jiffies 4294674198 (age 3605.560s) hex dump (first 32 bytes): 10 b4 9d a0 ff ff ff ff ff ff ff ff ff ff ff ff ................ ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff8134b901>] __kmem_cache_alloc_node+0x1c1/0x2b0 [<ffffffff812c98b2>] __kmalloc+0x52/0x150 [<ffffffffa08b0033>] __set_power_wells+0xc3/0x360 [xe] [<ffffffffa08562fc>] xe_display_init_nommio+0x4c/0x70 [xe] [<ffffffffa07f0d1c>] xe_device_probe+0x3c/0x5a0 [xe] [<ffffffffa082e48f>] xe_pci_probe+0x33f/0x5a0 [xe] [<ffffffff817f2187>] local_pci_probe+0x47/0xa0 [<ffffffff817f3db3>] pci_device_probe+0xc3/0x1f0 [<ffffffff8192f2a2>] really_probe+0x1a2/0x410 [<ffffffff8192f598>] __driver_probe_device+0x78/0x160 [<ffffffff8192f6ae>] driver_probe_device+0x1e/0x90 [<ffffffff8192f92a>] __driver_attach+0xda/0x1d0 [<ffffffff8192c95c>] bus_for_each_dev+0x7c/0xd0 [<ffffffff8192e159>] bus_add_driver+0x119/0x220 [<ffffffff81930d00>] driver_register+0x60/0x120 [<ffffffffa05e50a0>] 0xffffffffa05e50a0 The call to intel_power_domains_cleanup() needs to stay where it is for now. The main issue is that while the init is called by the display side, shared by i915 and xe, the cleanup is called by a non-shared code path. Fixing that will be done as a separate commit. Fixes: 44e694958b95 ("drm/xe/display: Implement display support") Signed-off-by: Xiaoming Wang <xiaoming.wang@intel.com> [ reword commit message and explain why the fini needs to stay where it is ] Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202215658.561298-1-lucas.demarchi@intel.com	2024-02-02 22:05:07 -08:00
Matthew Brost	72f86ed3c8	drm/xe: Map both mem.kernel_bb_pool and usm.bb_pool For integrated devices we need to map both mem.kernel_bb_pool and usm.bb_pool to be able to run batches from both pools. Fixes: a682b6a42d4d ("drm/xe: Support device page faults on integrated platforms") Tested-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202033440.2351862-1-matthew.brost@intel.com	2024-02-02 17:37:58 -08:00
Arnd Bergmann	774ef5dfc9	drm/xe: circumvent bogus stringop-overflow warning gcc-13 warns about an array overflow that it sees but that is prevented by the "asid % NUM_PF_QUEUE" calculation: drivers/gpu/drm/xe/xe_gt_pagefault.c: In function 'xe_guc_pagefault_handler': include/linux/fortify-string.h:57:33: error: writing 16 bytes into a region of size 0 [-Werror=stringop-overflow=] include/linux/fortify-string.h:689:26: note: in expansion of macro '__fortify_memcpy_chk' 689 \| #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \ \| ^~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/xe/xe_gt_pagefault.c:341:17: note: in expansion of macro 'memcpy' 341 \| memcpy(pf_queue->data + pf_queue->tail, msg, len * sizeof(u32)); \| ^~~~~~ drivers/gpu/drm/xe/xe_gt_types.h:102:25: note: at offset [1144, 265324] into destination object 'tile' of size 8 I found that rewriting the assignment using pointer addition rather than the equivalent array index calculation prevents the warning, so use that instead. I sent a bug report against gcc for the false positive warning. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113214 Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240103114819.2913937-1-arnd@kernel.org	2024-02-02 14:01:12 -08:00
Matthew Brost	447f74d223	drm/xe: Pick correct userptr VMA to repin on REMAP op failure A REMAP op is composed of 3 VMA's - unmap, prev map, and next map. When op_execute fails with -EAGAIN we need to update the local VMA pointer to the current op state and then repin the VMA if it is a userptr. Fixes a failure seen in xe_vm.munmap-style-unbind-userptr-one-partial. Fixes: b06d47be7c83 ("drm/xe: Port Xe to GPUVA") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240201004849.2219558-3-matthew.brost@intel.com	2024-02-02 06:55:09 -08:00
Matthew Brost	a856b67a84	drm/xe: Take a reference in xe_exec_queue_last_fence_get() Take a reference in xe_exec_queue_last_fence_get(). Also fix a reference counting underflow bug VM bind and unbind. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240201004849.2219558-2-matthew.brost@intel.com	2024-02-02 06:45:57 -08:00
Matthew Brost	5fcbf83e39	drm/xe: Drop rebind argument from xe_pt_prepare_bind This is unused, drop it. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240201184844.2317004-1-matthew.brost@intel.com	2024-02-01 18:43:11 -08:00
Matthew Brost	3acc1ff1a7	drm/xe: Fix loop in vm_bind_ioctl_ops_unwind The logic for the unwind loop is incorrect resulting in an infinite loop. Fix to unwind to go from the last operations list to he first. Fixes: 617eebb9c480 ("drm/xe: Fix array of binds") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240201175532.2303168-1-matthew.brost@intel.com	2024-02-01 18:42:39 -08:00
Suraj Kandpal	d6beadc8d7	drm/xe/gsc: Add status check during gsc header readout Before checking if data is present in the message reply check the status in header and see if it indicates any error. --v2 - Use drm_err() instead of drm_dbg_kms() [Daniele] --v3 - Use &xe->drm in drm_err to make it more cleaner [Daniele] Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240124045248.687023-1-suraj.kandpal@intel.com	2024-02-01 13:25:47 -08:00
Thomas Hellström	5bd24e7882	drm/xe/vm: Subclass userptr vmas The construct allocating only parts of the vma structure when the userptr part is not needed is very fragile. A developer could add additional fields below the userptr part, and the code could easily attempt to access the userptr part even if its not persent. So introduce xe_userptr_vma which subclasses struct xe_vma the proper way, and accordingly modify a couple of interfaces. This should also help if adding userptr helpers to drm_gpuvm. v2: - Fix documentation of to_userptr_vma() (Matthew Brost) - Fix allocation and freeing of vmas to clearer distinguish between the types. Closes: https://lore.kernel.org/intel-xe/0c4cc1a7-f409-4597-b110-81f9e45d1ffe@embeddedor.com/T/#u Fixes: a4cc60a55fd9 ("drm/xe: Only alloc userptr part of xe_vma for userptrs") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240131091628.12318-1-thomas.hellstrom@linux.intel.com	2024-02-01 09:49:26 +01:00
Matthew Brost	152ca51d8d	drm/xe: Use LRC prefix rather than CTX prefix in lrc desc defines The sparc build fails [1] due to CTX_VALID being redefined. Fix this by using a better naming convention of LRC_VALID as this define is used in setting bits in the lrc descriptor. To be uniform, change other define with LRC prefix too. [1] https://lore.kernel.org/all/20240123111235.3097079-1-geert@linux-m68k.org/ v2: - s/LEGACY_64B_CONTEXT/LRC_LEGACY_64B_CONTEXT (Lucas) Fixes: 0bc519d20ffa ("drm/xe: Remove GEN[0-9]*_ prefixes") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240123212638.1605626-1-matthew.brost@intel.com	2024-01-31 20:00:47 -08:00
Matthew Brost	d83d8ae275	drm/xe: Make all GuC ABI shift values unsigned All GuC ABI definitions are unsigned and not defining as unsigned is causing build errors [1]. [1] https://lore.kernel.org/all/20240123111235.3097079-1-geert@linux-m68k.org/ Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240131025424.2087936-1-matthew.brost@intel.com	2024-01-31 09:40:10 -08:00
Matt Roper	996da37ffa	drm/xe: Convert job timeouts from assert to warning xe_assert() is intended to be used only for "impossible" situations that should never be hit (and if they are hit it means there's a driver bug somewhere); assertions are only compiled into debug builds. Although we expect jobs submitted by the kernel to be well-behaved and run without error, timeouts are a legitimate possibility for reasons beyond our control (bad firmware, flaky hardware, etc.). We should use a real WARN if we encounter these, even for non-debug builds, to ensure the issue is being properly highlighted in bug reports and such. Also give the WARNs more human-readable messages and move them below the general notice-level message that gets printed for any kind of timeout to make the errors a bit more understandable. v2: - Convert the VM / exec_queue_killed assertion as well. (MattB) Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240130200308.1429134-2-matthew.d.roper@intel.com	2024-01-31 08:09:30 -08:00
Thomas Hellström	78366eed68	drm/xe: Don't use __user error pointers The error pointer macros are not aware of __user pointers and as a consequence sparse warns. Have the copy_mask() function return an integer instead of a __user pointer. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240117134048.165425-5-thomas.hellstrom@linux.intel.com	2024-01-31 14:34:35 +01:00
Thomas Hellström	97fd7a7e4e	drm/xe: Annotate mcr_[un]lock() These functions acquire and release the gt::mcr_lock. Annotate accordingly. Fix the corresponding sparse warning. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Fixes: fb1d55efdfcb ("drm/xe: Cleanup OPEN_BRACE style issues") Cc: Francois Dugast <francois.dugast@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240117134048.165425-4-thomas.hellstrom@linux.intel.com	2024-01-31 14:34:35 +01:00
Jani Nikula	1e5a4dfe38	drm/xe: drop display/ subdir from include directories There are very few places that need to include anything from under display/. Require the display/ prefix in #include directives, and drop the subdirectory from the header search path. Sort the include lists while at it. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240122101428.2683468-2-jani.nikula@intel.com	2024-01-31 14:59:07 +02:00
Jani Nikula	f01ece502a	drm/xe: move xe_display.[ch] under display/ All the other display related files are under display/ subdirectory, also move xe_display.[ch] there. Sort the build list while at it. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240122101428.2683468-1-jani.nikula@intel.com	2024-01-31 14:58:40 +02:00
Matthew Brost	d1df9bfbf6	drm/xe: Only allow 1 ufence per exec / bind IOCTL The way exec ufences are coded only 1 ufence per IOCTL will be signaled. It is possible to fix this but for current use cases 1 ufence per IOCTL is sufficient. Enforce a limit of 1 ufence per IOCTL (both exec and bind to be uniform). v2: - Add fixes tag (Thomas) Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Mika Kahola <mika.kahola@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240124234413.1640825-1-matthew.brost@intel.com	2024-01-30 15:51:04 -08:00

1 2 3 4 5 ...

1248885 Commits