linux

iv/linux

Author	SHA1	Message	Date
Matthew Auld	fcf98d68c0	drm/xe: fix mem_access for early lrc generation We spawn some hw queues during device probe to generate the default LRC for every engine type, however the queue destruction step is typically async. Queue destruction needs to do stuff like GuC context deregister which requires GuC CT, which in turn requires an active mem_access ref. The caller during probe is meant to hold the mem_access token, however due to the async destruction it might have already been dropped if we are unlucky. Similar to how we already handle migrate VMs for which there is no mem_access ref, fix this by keeping the callers token alive, releasing it only when destroying the queue. We can treat a NULL vm as indication that we need to grab our own extra ref. Fixes the following splat sometimes seen during load: [ 1682.899930] WARNING: CPU: 1 PID: 8642 at drivers/gpu/drm/xe/xe_device.c:537 xe_device_assert_mem_access+0x27/0x30 [xe] [ 1682.900209] CPU: 1 PID: 8642 Comm: kworker/u24:97 Tainted: G U W E N 6.6.0-rc3+ #6 [ 1682.900214] Workqueue: submit_wq xe_sched_process_msg_work [xe] [ 1682.900303] RIP: 0010:xe_device_assert_mem_access+0x27/0x30 [xe] [ 1682.900388] Code: 90 90 90 66 0f 1f 00 0f 1f 44 00 00 53 48 89 fb e8 1e 6c 03 00 48 85 c0 74 06 5b c3 cc cc cc cc 8b 83 28 23 00 00 85 c0 75 f0 <0f> 0b 5b c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 [ 1682.900390] RSP: 0018:ffffc900021cfb68 EFLAGS: 00010246 [ 1682.900394] RAX: 0000000000000000 RBX: ffff8886a96d8000 RCX: 0000000000000000 [ 1682.900396] RDX: 0000000000000001 RSI: ffff8886a6311a00 RDI: ffff8886a96d8000 [ 1682.900398] RBP: ffffc900021cfcc0 R08: 0000000000000001 R09: 0000000000000000 [ 1682.900400] R10: ffffc900021cfcd0 R11: 0000000000000002 R12: 0000000000000004 [ 1682.900402] R13: 0000000000000000 R14: ffff8886a6311990 R15: ffffc900021cfd74 [ 1682.900405] FS: 0000000000000000(0000) GS:ffff888829880000(0000) knlGS:0000000000000000 [ 1682.900407] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1682.900409] CR2: 000055f70bad3fb0 CR3: 000000025243a004 CR4: 00000000003706e0 [ 1682.900412] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1682.900413] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1682.900415] Call Trace: [ 1682.900418] <TASK> [ 1682.900420] ? xe_device_assert_mem_access+0x27/0x30 [xe] [ 1682.900504] ? __warn+0x85/0x170 [ 1682.900510] ? xe_device_assert_mem_access+0x27/0x30 [xe] [ 1682.900596] ? report_bug+0x171/0x1a0 [ 1682.900604] ? handle_bug+0x3c/0x80 [ 1682.900608] ? exc_invalid_op+0x17/0x70 [ 1682.900612] ? asm_exc_invalid_op+0x1a/0x20 [ 1682.900621] ? xe_device_assert_mem_access+0x27/0x30 [xe] [ 1682.900706] ? xe_device_assert_mem_access+0x12/0x30 [xe] [ 1682.900790] guc_ct_send_locked+0xb9/0x1550 [xe] [ 1682.900882] ? lock_acquire+0xca/0x2b0 [ 1682.900885] ? guc_ct_send+0x3c/0x1a0 [xe] [ 1682.900977] ? lock_is_held_type+0x9b/0x110 [ 1682.900984] ? __mutex_lock+0xc0/0xb90 [ 1682.900989] ? __pfx___drm_printfn_info+0x10/0x10 [ 1682.900999] guc_ct_send+0x53/0x1a0 [xe] [ 1682.901090] ? __lock_acquire+0xf22/0x21b0 [ 1682.901097] ? process_one_work+0x1a0/0x500 [ 1682.901109] xe_guc_ct_send+0x19/0x50 [xe] [ 1682.901202] set_min_preemption_timeout+0x75/0xa0 [xe] [ 1682.901294] disable_scheduling_deregister+0x55/0x250 [xe] [ 1682.901383] ? xe_sched_process_msg_work+0x76/0xd0 [xe] [ 1682.901467] ? lock_release+0xc9/0x260 [ 1682.901474] xe_sched_process_msg_work+0x82/0xd0 [xe] [ 1682.901559] process_one_work+0x20a/0x500 v2: Add the splat Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:07 -05:00
Daniele Ceraolo Spurio	5152234e2e	drm/xe/gsc: Define GSC FW for MTL We track GSC FW based on its compatibility version, which is what determines the interface it supports. Also add a modparam override like the ones for GuC and HuC. v2: fix module param description (John) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:06 -05:00
Daniele Ceraolo Spurio	9897eb8555	drm/xe/gsc: Define GSCCS for MTL Add the GSCCS to the media_xelpmp engine list. Note that since the GSCCS is only used with the GSC FW, we can consider it disabled if we don't have the FW available. v2: mark GSCCS as allowed on the media IP in kunit tests Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:06 -05:00
Daniele Ceraolo Spurio	0881cbe040	drm/xe/gsc: Query GSC compatibility version The version is obtained via a dedicated MKHI GSC HECI command. The compatibility version is what we want to match against for the GSC, so we need to call the FW version checker after obtaining the version. Since this is the first time we send a GSC HECI command via the GSCCS, this patch also introduces common infrastructure to send such commands to the GSC. Communication with the GSC FW is done via input/output buffers, whose addresses are provided via a GSCCS command. The buffers contain a generic header and a client-specific packet (e.g. PXP, HDCP); the clients don't care about the header format and/or the GSCCS command in the batch, they only care about their client-specific header. This patch therefore introduces helpers that allow the callers to automatically fill in the input header, submit the GSCCS job and decode the output header, to make it so that the caller only needs to worry about their client-specific input and output messages. v3: squash of 2 separate patches ahead of merge, so that the common functions and their first user are added at the same time Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Suraj Kandpal <suraj.kandpal@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.Com> #v1 Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:06 -05:00
Daniele Ceraolo Spurio	f63182b45d	drm/xe/gsc: Trigger a driver flr to cleanup the GSC on unload GSC is only killed by an FLR, so we need to trigger one on unload to make sure we stop it. This is because we assign a chunk of memory to the GSC as part of the FW load, so we need to make sure it stops using it when we release it to the system on driver unload. Note that this is not a problem of the unload per-se, because the GSC will not touch that memory unless there are requests for it coming from the driver; therefore, no accesses will happen while Xe is not loaded, but if we re-load the driver then the GSC might wake up and try to access that old memory location again. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:06 -05:00
Daniele Ceraolo Spurio	aae84bf1cd	drm/xe/gsc: Implement WA 14015076503 When the GSC FW is loaded, we need to inform it when a GSCCS reset is coming and then wait 200ms for it to get ready to process the reset. v2: move WA code to GSC file, use variable in Makefile (John) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <john.c.harrison@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:06 -05:00
Daniele Ceraolo Spurio	dd0e89e5ed	drm/xe/gsc: GSC FW load The GSC FW must be copied in a 4MB stolen memory allocation, whose GGTT address is then passed as a parameter to a dedicated load instruction submitted via the GSC engine. Since the GSC load is relatively slow (up to 250ms), we perform it asynchronously via a worker. This requires us to make sure that the worker has stopped before suspending/unloading. Note that we can't yet use xe_migrate_copy for the copy because it doesn't work with stolen memory right now, so we do a memcpy from the CPU side instead. v2: add comment about timeout value, fix GSC status checking before load (John) Bspec: 65306, 65346 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:06 -05:00
Daniele Ceraolo Spurio	985d5a49e8	drm/xe/gsc: Parse GSC FW header The GSC blob starts with a layout header, from which we can move to the boot directory, which in turns allows us to find the CPD. The CPD uses the same format as the one in the HuC binary, so we can re-use the same parsing code to get to the manifest, which contains the release and security versions of the FW. v2: Fix comments in struct definition (John) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:06 -05:00
Daniele Ceraolo Spurio	0d1caff4a3	drm/xe/gsc: Introduce GSC FW Add the basic definitions and init function. Same as HuC, GSC is only supported on the media GT on MTL and newer platforms. Note that the GSC requires submission resources which can't be allocated during init (because we don't have the hwconfig yet), so it can't be marked as loadable at the end of the init function. The allocation of those resources will come in the patch that makes use of them to load the FW. v2: better comment, move num FWs define inside the enum (John) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:06 -05:00
Daniele Ceraolo Spurio	2e7227b4b7	drm/xe/uc: Rework uC version tracking The GSC firmware, support for which is coming soon for Xe, has both a release version (updated on every release) and a compatibility version (update only on interface changes). The GuC has something similar, with a global release version and a submission version (which is also known as the VF compatibility version). The main difference is that for the GuC we still want to check the driver requirement against the release version, while for the GSC we'll need to check against the compatibility version. Instead of special casing the GSC, this patch reworks the FW logic so that we store both versions at the uc_fw level for all binaries and we allow checking against either of the versions. Initially, we'll use it to support GSC, but the logic could be re-used to allow VFs to check against the GuC compatibility version. Note that the GSC version has 4 numbers (major, minor, hotfix, build), so support for that has been added as part of the rework and will be used in follow-up patches. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:05 -05:00
Bommithi Sakeena	adce1b393f	drm/xe: Encapsulate all the module parameters Encapsulate all the module parameters in one single global struct variable. This also removes the extra xe_module.h from includes. v2: naming consistency as suggested by Jani and Lucas v3: fix checkpatch errors/warnings v4: adding blank line after struct declaration Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Bommithi Sakeena <bommithi.sakeena@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:05 -05:00
Tejas Upadhyay	a409901f51	drm/xe/xe2: Add workaround 14020013138 This workaround applies to Xe2_LPG A0 V3: - Apply rule RENDER class V2(Matt): - Apply WA in lrc context Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:05 -05:00
Matt Roper	f91bacce8d	drm/xe/dg2: Drop Wa_22014600077 The workaround database has been updated to drop this workaround for all DG2 variants. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20231127190332.4099519-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:05 -05:00
Lucas De Marchi	812ec747a3	drm/xe: Sync MTL PCI IDs with i915 For Xe1 platforms, it's better to follow the way i915 adds the PCI IDs to the header, so it's easier to catch up when there is an update. This brings the same logic applied in commit 2e3c369f23a7 ("drm/i915/mtl: Eliminate subplatforms") to the equivalent xe header. The end result of this header for Xe1 platforms is now in sync with i915 as of commit 5032c607e886 ("drm/i915: ATS-M device ID update"). This can be seen by $ git show 5032c607e886:include/drm/i915_pciids.h > a.h $ git diff --color-words --no-index a.h include/drm/xe_pciids.h Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20231121195209.802235-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:01 -05:00
Thomas Hellström	fdb6a05383	drm/xe: Internally change the compute_mode and no_dma_fence mode naming The name "compute_mode" can be confusing since compute uses either this mode or fault_mode to achieve the long-running semantics, and compute_mode can, moving forward, enable fault_mode under the hood to work around hardware limitations. Also the name no_dma_fence_mode really refers to what we elsewhere call long-running mode and the mode contrary to what its name suggests allows dma-fences as in-fences. So in an attempt to be more consistent, rename no_dma_fence_mode -> lr_mode compute_mode -> preempt_fence_mode And adjust flags so that preempt_fence_mode sets XE_VM_FLAG_LR_MODE fault_mode sets XE_VM_FLAG_LR_MODE \| XE_VM_FLAG_FAULT_MODE v2: - Fix a typo in the commit message (Oak Zeng) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231127123349.23698-1-thomas.hellstrom@linux.intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:58 -05:00
Thomas Hellström	d2f51c50b9	drm/xe/vm: Fix ASID XA usage xa_alloc_cyclic() returns 1 on successful allocation, if wrapping occurs, but the code incorrectly treats that as an error. Fix that. Also, xa_alloc_cyclic() requires xa_init_flags(..., XA_FLAGS_ALLOC), so fix that, and assuming we don't want a zero ASID, instead of using XA_FLAGS_ALLOC1, adjust the xa limits at alloc_cyclic time. v2: - On CONFIG_DRM_XE_DEBUG, Initialize the cyclic ASID allocation in such a way that the next allocated ASID will be the maximum one, and the one following will cause an ASID wrap, (all to have CI test high ASIDs and ASID wraps). v3: - Stricter return value checking from xa_alloc_cyclic() (Matthew Auld) Suggested-by: Ohad Sharabi <osharabi@habana.ai> Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/946 Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Ohad Sharabi <osharabi@habana.ai> #v1 Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231124153345.97385-5-thomas.hellstrom@linux.intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:57 -05:00
Thomas Hellström	e7c9e049e0	drm/xe/bo: Remove leftover trace_printk() trace_printk() is not intended for production code. Remove it. Suggested-by: Ohad Sharabi <osharabi@habana.ai> Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/946 Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Ohad Sharabi <osharabi@habana.ai> Link: https://patchwork.freedesktop.org/patch/msgid/20231122110359.4087-4-thomas.hellstrom@linux.intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:57 -05:00
Thomas Hellström	a21fe5ee59	drm/xe/bo: Rename xe_bo_get_sg() to xe_bo_sg() Using "get" typically refers to obtaining a refcount, which we don't do here so rename to xe_bo_sg(). Suggested-by: Ohad Sharabi <osharabi@habana.ai> Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/946 Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Ohad Sharabi<osharabi@habana.ai> Link: https://patchwork.freedesktop.org/patch/msgid/20231122110359.4087-3-thomas.hellstrom@linux.intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:57 -05:00
Thomas Hellström	8c54ee8a86	drm/xe: Ensure that we don't access the placements array out-of-bounds Ensure, using xe_assert that the various try_add_<placement> functions don't access the bo placements array out-of-bounds. v2: - Remove the places argument to make sure the xe_assert operates on the array we're actually populating. (Matthew Auld) Suggested-by: Ohad Sharabi <osharabi@habana.ai> Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/946 Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Ohad Sharabi <osharabi@habana.ai> #v1 Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231123153158.12779-2-thomas.hellstrom@linux.intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:57 -05:00
Michal Wajdeczko	2475ac27df	drm/xe: Print virtualization mode during probe We already print some basic information about the device, add virtualization information, until we expose that elsewhere. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20231115073804.1861-3-michal.wajdeczko@intel.com Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:57 -05:00
Michal Wajdeczko	13e5c32c84	drm/xe: Prepare for running in different SR-IOV modes We will be adding support for the SR-IOV and driver might be then running, in addition to existing non-virtualized bare-metal mode, also in Physical Function (PF) or Virtual Function (VF) mode. Since these additional modes require some changes to the driver, define enum flag to represent different SR-IOV modes and add a function where we will detect the actual mode in the runtime. We start with a forced bare-metal mode as it is sufficient to enable basic functionality and ensures no impact to existing code. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20231115073804.1861-2-michal.wajdeczko@intel.com Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:57 -05:00
Michal Wajdeczko	d6d14854dd	drm/xe: Add device flag to indicate SR-IOV support The Single Root I/O Virtualization (SR-IOV) extension to the PCI Express (PCIe) specification suite is supported starting from 12th generation of Intel Graphics processors. Add a device flag that we will use to enable SR-IOV specific code paths and to indicate our readiness to support SR-IOV. We will enable this flag for the specific platforms once all required changes and additions will be ready and merged. Bspec: 52391 Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20231115073804.1861-1-michal.wajdeczko@intel.com Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:57 -05:00
Tejas Upadhyay	8bfbe174d7	drm/xe/xe2: Add workaround 14019449301 This workaround applies to Xe2_LPM V3(MattR): - Reorder reg and wa placement - Add base parameter to reg macro for better definition V2(MattR): - Change name of register - Loop for all engines - Driver permanent WA, applies to all steps Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:56 -05:00
Tejas Upadhyay	f25d8291ac	drm/xe/xe2: Add workaround 16021867713 This workaround applies to Xe2_LPM as well Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:56 -05:00
Tejas Upadhyay	11ea758c14	drm/xe/xe2: Add workaround 14017421178 This workaround applies to Xe2_LPM Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:56 -05:00
Gustavo Sousa	b3f0654f55	drm/xe/mmio: Make xe_mmio_wait32() aware of interrupts With the current implementation, a preemption or other kind of interrupt might happen between xe_mmio_read32() and ktime_get_raw(). Such an interruption (specially in the case of preemption) might be long enough to cause a timeout without giving a chance of a new check on the register value on a next iteration, which would have happened otherwise. This issue causes some sporadic timeouts in some code paths. As an example, we were experiencing some rare timeouts when waiting for PLL unlock for C10/C20 PHYs (see intel_cx0pll_disable()). After debugging, we found out that the PLL unlock was happening within the expected time period (20us), which suggested a bug in xe_mmio_wait32(). To fix the issue, ensure that we do a last check out of the loop if necessary. This change was tested with the aforementioned PLL unlocking code path. Experiments showed that, before this change, we observed reported timeouts in 54 of 5000 runs; and, after this change, no timeouts were reported in 5000 runs. v2: - Prefer an implementation without a barrier (v1 switched the order of xe_mmio_read32() and ktime_get_raw() calls and added a barrier() in between). (Lucas, Rodrigo) Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20231116214000.70573-3-gustavo.sousa@intel.com Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:56 -05:00
Gustavo Sousa	5c09bd6ccd	drm/xe/mmio: Move xe_mmio_wait32() to xe_mmio.c This function is big enough, let's move it to a shared compilation unit. While at it, document it. Here is the output of running bloat-o-metter on the new and old module (execution provided by Lucas): $ ./scripts/bloat-o-meter build64/drivers/gpu/drm/xe/xe.ko{.old,} add/remove: 2/0 grow/shrink: 0/58 up/down: 554/-15645 (-15091) (...) # Lines in between omitted Total: Before=2181322, After=2166231, chg -0.69% The overall reduction in the size is not that significant. Nevertheless, keeping the function as inline arguably does not bring too much benefit as well. As noted by Lucas, we would probably benefit from an inline function that did the fast-path check: do an optimistic first check before entering the wait-logic, which itself would go to a compilation unit. We might come back to implement this in the future if we have data to justify it. v2: - Add note in documentation for @timeout_us regarding the exponential backoff strategy. (Lucas) - Share output of bloat-o-meter in the commit message. (Lucas) Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20231116214000.70573-2-gustavo.sousa@intel.com Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:52 -05:00
Ruthuvikas Ravikumar	a6a4ea6d7d	drm/xe: Add mocs kunit This kunit verifies the hardware values of mocs and l3cc registers with the KMD programmed values. v14: Fix CHECK. v13: Remove ret after forcewake. v11: Add KUNIT_ASSERT_EQ_MSG for Forcewake. v9/v10: Add Forcewake Fail. v8: Remove xe_bo.h and xe_pm.h Remove mocs and l3cc from live_mocs. Pull debug and err msg for mocs/l3cc out of if else block. Add HAS_LNCF_MOCS. v7: correct checkpath v6: Change ssize_t type. Change forcewake domain to XE_FW_GT. Update change of MOCS registers are multicast on Xe_HP and beyond patch. v5: Release forcewake. Remove single statement braces. Fix debug statements. v4: Drop stratch and vaddr. Fix debug statements. Fix indentation. v3: Fix checkpath. v2: Fix checkpath. Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Cc: Mathew D Roper <matthew.d.roper@intel.com> Reviewed-by: Mathew D Roper <matthew.d.roper@intel.com> Signed-off-by: Ruthuvikas Ravikumar <ruthuvikas.ravikumar@intel.com> Link: https://lore.kernel.org/r/20231116215152.2248859-1-ruthuvikas.ravikumar@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:52 -05:00
Lucas De Marchi	1d425066f1	drm/xe: Fix modpost warning on kunit modules When built with W=1, the following warnings show up on modpost: MODPOST drivers/gpu/drm/xe/Module.symvers WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/xe/tests/xe_bo_test.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/xe/tests/xe_dma_buf_test.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/xe/tests/xe_migrate_test.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/xe/tests/xe_pci_test.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/xe/tests/xe_rtp_test.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/xe/tests/xe_wa_test.o Add the module description for each of these to fix the warning. Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20231120221904.695630-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:51 -05:00
Haridhar Kalvala	1a3d4d76ba	drm/xe: ATS-M device ID update ATS-M device ID update. BSpec: 44477 Signed-off-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20231120065507.1543676-1-haridhar.kalvala@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:51 -05:00
José Roberto de Souza	1bec833316	drm/xe: Add missing RPL and ADL Those are ids present in i915 but missing in Xe. Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:51 -05:00
José Roberto de Souza	baf9089c80	drm/xe: Include RPL-U to pciidlist RPL-U is defined as a subplatform but those PCI ids were not included in pciidlist so Xe KMD would never probe device with those ids. This is following what i915 does to include RPL-U to PCI ids probe list. v2: - change order to match i915 Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:51 -05:00
Matthew Brost	40709aa761	drm/xe: Only set xe_vma_op.map fields for GPUVA map operations DRM_XE_VM_BIND_OP_MAP_* IOCTL operations can result in GPUVA unmap, remap, or map operations in vm_bind_ioctl_ops_create. The xe_vma_op.map fields are blindly set which is incorrect for GPUVA unmap or remap operations. Fix this by only setting xe_vma_op.map for GPUVA map operations. Also restructure a bit vm_bind_ioctl_ops_create to make the code a bit more readable. Reported-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:51 -05:00
Lucas De Marchi	0bc519d20f	drm/xe: Remove GEN[0-9]*_ prefixes After noticing in logs there were still mentions to GEN6 registers, it was clear commit d9b79ad275e7 ("drm/xe: Drop gen afixes from registers") didn't take care of all the afixes. Some were added later, but there are also constants and strings still using that. Continue the cleanup removing the remaining ones. To keep it consistent with code nearby, a few other changes are made: - Remove prefix in INTEL_LEGACY_64B_CONTEXT - Remove GEN8_CTX_L3LLC_COHERENT since it's unused - Rename GEN9_FREQ_SCALER to GT_FREQUENCY_SCALER v2: Use XELP_ as prefix for NUM_MOCS_ENTRIES and remove changes to MOCS_ENTRIES as this is now done as part of a previous commit (Matt Roper) Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20231117174049.527192-3-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:39 -05:00
Lucas De Marchi	4399e95102	drm/xe/mocs: Bring comment about mocs back to reality The mocs documentation was copied from i915 and doesn't match the reality in xe. Reword it so it matches what the code is doing. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20231117174049.527192-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:39 -05:00
Lucas De Marchi	8735f8616d	drm/xe: Fold GEN11_MOCS_ENTRIES into gen12_mocs_desc GEN11_MOCS_ENTRIES dates back from importing the table from the i915 module. The macro was used so the it could be maintained in a single place and platforms would just override with additional entries. With the platforms supported by xe, each of them is just defining individual tables without re-using this define. Move it inside gen12_mocs_desc that is the only user. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20231117174049.527192-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:39 -05:00
Matthew Auld	e7b4ebd7c6	drm/xe/bo: don't hold dma-resv lock over drm_gem_handle_create This seems to create a locking inversion with object_name_lock. The lock is held by drm_prime_fd_to_handle when calling our xe_gem_prime_import hook, which might eventually go on to grab the dma-resv lock during the attach. However we also have the opposite locking order in xe_gem_create_ioctl which is holding the dma-resv lock when calling drm_gem_handle_create, which wants to eventually grab object_name_lock: -> #1 (reservation_ww_class_mutex){+.+.}-{3:3}: <4> [635.739288] lock_acquire+0x169/0x3d0 <4> [635.739294] __ww_mutex_lock.constprop.0+0x164/0x1e60 <4> [635.739300] ww_mutex_lock_interruptible+0x42/0x1a0 <4> [635.739305] drm_gem_shmem_pin+0x4b/0x140 [drm_shmem_helper] <4> [635.739317] dma_buf_dynamic_attach+0x101/0x430 <4> [635.739323] xe_gem_prime_import+0xcc/0x2e0 [xe] <4> [635.739499] drm_prime_fd_to_handle_ioctl+0x184/0x2e0 [drm] <4> [635.739594] drm_ioctl_kernel+0x16f/0x250 [drm] <4> [635.739693] drm_ioctl+0x35e/0x620 [drm] <4> [635.739789] __x64_sys_ioctl+0xb7/0xf0 <4> [635.739794] do_syscall_64+0x3c/0x90 <4> [635.739799] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 <4> [635.739805] -> #0 (&dev->object_name_lock){+.+.}-{3:3}: <4> [635.739813] check_prev_add+0x1ba/0x14a0 <4> [635.739818] __lock_acquire+0x203e/0x2ff0 <4> [635.739823] lock_acquire+0x169/0x3d0 <4> [635.739827] __mutex_lock+0x124/0x1310 <4> [635.739832] drm_gem_handle_create+0x32/0x50 [drm] <4> [635.739927] xe_gem_create_ioctl+0x1d3/0x550 [xe] <4> [635.740102] drm_ioctl_kernel+0x16f/0x250 [drm] <4> [635.740197] drm_ioctl+0x35e/0x620 [drm] <4> [635.740293] __x64_sys_ioctl+0xb7/0xf0 <4> [635.740297] do_syscall_64+0x3c/0x90 <4> [635.740302] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 <4> [635.740307] It looks like it should be safe to simply drop the dma-resv lock prior to publishing the object when calling drm_gem_handle_create. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/743 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:39 -05:00
Michal Wajdeczko	cac74742fa	drm/xe/guc: Use valid scratch register for posting read There are only 4 scratch registers VF_SW_FLAG(0..3) on each GuC. We shouldn't use non-existing register VF_SW_FLAG(4) for posting read. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:39 -05:00
Michal Wajdeczko	1d087cb7d8	drm/xe/guc: Fix handling of GUC_HXG_TYPE_NO_RESPONSE_BUSY If GuC responds with the NO_RESPONSE_BUSY message, we extend our timeout while waiting for the actual response, but we wrongly assumed that the next message will be RESPONSE_SUCCESS, missing that we still can get RESPONSE_FAILURE. Change the condition for the expected message type, using only common bits from RESPONSE_SUCCESS and RESPONSE_FAILURE (as they differ, by ABI design, only by the last bit). v2: add comment/checks to the code (Matt) Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:38 -05:00
Michal Wajdeczko	cd1c9c54c3	drm/xe/guc: Copy response data from proper registers While copying GuC response from the scratch registers to the buffer, formula to identify next scratch register is broken. Fix it. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:38 -05:00
Michal Wajdeczko	4a9b7d29c1	drm/xe/guc: Fix wrong assert about full_len This variable holds full length of the message, including header length so it should be checked against GUC_CTB_MSG_MAX_LEN. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:38 -05:00
Matt Roper	32dd40fb48	drm/xe/dg2: Wa_18028616096 now applies to all DG2 The workaround database was just updated to extend this workaround to DG2-G11 (whereas previously it applied only to G10 and G12). Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20231115183029.2649992-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:38 -05:00
Rodrigo Vivi	aaa115ffaa	drm/xe/uapi: Be more specific about the vm_bind prefetch region Let's bring a bit of clarity on this 'region' field that is part of vm_bind operation struct. Rename and document to make it more than obvious that it is a region instance and not a mask and also that it should only be used with the prefetch operation itself. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com>	2023-12-21 11:44:38 -05:00
Rodrigo Vivi	4a349c8611	drm/xe/uapi: Differentiate WAIT_OP from WAIT_MASK On one hand the WAIT_OP represents the operation use for waiting such as ==, !=, > and so on. On the other hand, the mask is applied to the value used for comparision. Split those two to bring clarity to the uapi. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com>	2023-12-21 11:44:38 -05:00
Rodrigo Vivi	9ad743515c	drm/xe/uapi: Standardize the FLAG naming and assignment Only cosmetic things. No functional change on this patch. Define every flag with (1 << n) and use singular FLAG name. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>	2023-12-21 11:44:38 -05:00
Rodrigo Vivi	b02606d323	drm/xe/uapi: Rename query's mem_usage to mem_regions 'Usage' gives an impression of telemetry information where someone would query to see how the memory is currently used and available size, etc. However this API is more than this. It is about a global view of all the memory regions available in the system and user space needs to have this information so they can then use the mem_region masks that are returned for the engine access. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com>	2023-12-21 11:44:38 -05:00
Rodrigo Vivi	45c30d8000	drm/xe/uapi: Rename *_mem_regions masks - 'native' doesn't make much sense on integrated devices. - 'slow' is not necessarily true and doesn't go well with opposition to 'native'. Instead, let's use 'near' vs 'far'. It makes sense with all the current Intel GPUs and it is future proof. Right now, there's absolutely no need to define among the 'far' memory, which ones are slower, either in terms of latency, nunmber of hops or bandwidth. In case of this might become a requirement in the future, a new query could be added to indicate the certain 'distance' between a given engine and a memory_region. But for now, this fulfill all of the current requirements in the most straightforward way for the userspace drivers. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com>	2023-12-21 11:44:37 -05:00
Francois Dugast	5ca2c4b800	drm/xe/uapi: Change rsvd to pad in struct drm_xe_class_instance Change rsvd to pad in struct drm_xe_class_instance to prevent the field from being used in future. v2: Change from fixup to regular commit because this touches the uAPI (Francois Dugast) Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:34 -05:00
Francois Dugast	3ac4a7896d	drm/xe/uapi: Add _FLAG to uAPI constants usable for flags Most constants defined in xe_drm.h which can be used for flags are named DRM_XE__FLAG_, which is helpful to identify them. Make this systematic and add _FLAG where it was missing. Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:33 -05:00
Francois Dugast	d5dc73dbd1	drm/xe/uapi: Add missing DRM_ prefix in uAPI constants Most constants defined in xe_drm.h use DRM_XE_ as prefix which is helpful to identify the name space. Make this systematic and add this prefix where it was missing. v2: - fix vertical alignment of define values - remove double DRM_ in some variables (José Roberto de Souza) v3: Rebase Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:33 -05:00

1 2 3 4 5 ...

1235693 Commits