linux

iv/linux

Author	SHA1	Message	Date
Matthew Auld	9306b2b2df	drm/i915/ttm: fix 32b build Since segment_pages is no longer a compile time constant, it looks the DIV_ROUND_UP(node->size, segment_pages) breaks the 32b build. Simplest is just to use the ULL variant, but really we should need not need more than u32 for the page alignment (also we are limited by that due to the sg->length type), so also make it all u32. Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Fixes: bc99f1209f19 ("drm/i915/ttm: fix sg_table construction") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Nirmoy Das <nirmoy.das@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220712174050.592550-1-matthew.auld@intel.com	2022-07-13 16:06:08 +01:00
Andrzej Hajda	ab3edc679c	drm/i915/selftests: fix subtraction overflow bug On some machines hole_end can be small enough to cause subtraction overflow. On the other side (addr + 2 * min_alignment) can overflow in case of mock tests. This patch should handle both cases. Fixes: e1c5f754067b59 ("drm/i915: Avoid overflow in computing pot_hole loop termination") Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3674 Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220624113528.2159210-1-andrzej.hajda@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2022-07-12 17:57:27 -04:00
Chris Wilson	c877bed82e	drm/i915/gt: Only kick the signal worker if there's been an update One impact of commit 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround") is that it stores many, many more fences. Whereas adding an exclusive fence used to remove the shared fence list, that list is now preserved and the write fences included into the list. Not just a single write fence, but now a write/read fence per context. That causes us to have to track more fences than before (albeit half of those are redundant), and we trigger more interrupts for multi-engine workloads. As part of reducing the impact from handling more signaling, we observe we only need to kick the signal worker after adding a fence iff we have good cause to believe that there is work to be done in processing the fence i.e. we either need to enable the interrupt or the request is already complete but we don't know if we saw the interrupt and so need to check signaling. References: 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Karolina Drobnik <karolina.drobnik@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/d7b953c7a4ba747c8196a164e2f8c5aef468d048.1657289332.git.karolina.drobnik@intel.com	2022-07-12 17:44:43 -04:00
Chris Wilson	1ea7fe77c0	drm/i915: Bump GT idling delay to 2 jiffies In monitoring a transcode pipeline that is latency sensitive (it waits between submitting frames, and each frame requires work on rcs/vcs/vecs engines), it is found that it took longer than a single jiffy for it to sustain its workload. Allowing an extra jiffy headroom for the userspace prevents us from prematurely parking and having to exit powersaving immediately. Link: https://gitlab.freedesktop.org/drm/intel/-/issues/6284 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Karolina Drobnik <karolina.drobnik@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/e37911ec087a9ce50630d6faf61fa2c0d5f96d44.1657289332.git.karolina.drobnik@intel.com	2022-07-12 17:44:40 -04:00
Chris Wilson	394e2b57a9	drm/i915/gem: Look for waitboosting across the whole object prior to individual waits We employ a "waitboost" heuristic to detect when userspace is stalled waiting for results from earlier execution. Under latency sensitive work mixed between the gpu/cpu, the GPU is typically under-utilised and so RPS sees that low utilisation as a reason to downclock the frequency, causing longer stalls and lower throughput. The user left waiting for the results is not impressed. On applying commit 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround") it was observed that deinterlacing h264 on Haswell performance dropped by 2-5x. The reason being that the natural workload was not intense enough to trigger RPS (using HW evaluation intervals) to upclock, and so it was depending on waitboosting for the throughput. Commit 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround") changes the composition of dma-resv from keeping a single write fence + multiple read fences, to a single array of multiple write and read fences (a maximum of one pair of write/read fences per context). The iteration order was also changed implicitly from all-read fences then the single write fence, to a mix of write fences followed by read fences. It is that ordering change that belied the fragility of waitboosting. Currently, a waitboost is inspected at the point of waiting on an outstanding fence. If the GPU is backlogged such that we haven't yet stated the request we need to wait on, we force the GPU to upclock until the completion of that request. By changing the order in which we waited upon requests, we ended up waiting on those requests in sequence and as such we saw that each request was already started and so not a suitable candidate for waitboosting. Instead of asking whether to boost each fence in turn, we can look at whether boosting is required for the dma-resv ensemble prior to waiting on any fence, making the heuristic more robust to the order in which fences are stored in the dma-resv. Reported-by: Thomas Voegtle <tv@lio96.de> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6284 Fixes: 047a1b877ed4 ("dma-buf & drm/amdgpu: remove dma_resv workaround") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Karolina Drobnik <karolina.drobnik@intel.com> Tested-by: Thomas Voegtle <tv@lio96.de> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/07e05518d9f6620d20cc1101ec1849203fe973f9.1657289332.git.karolina.drobnik@intel.com	2022-07-12 17:44:36 -04:00
Chris Wilson	33da978947	drm/i915/gt: Serialize TLB invalidates with GT resets Avoid trying to invalidate the TLB in the middle of performing an engine reset, as this may result in the reset timing out. Currently, the TLB invalidate is only serialised by its own mutex, forgoing the uncore lock, but we can take the uncore->lock as well to serialise the mmio access, thereby serialising with the GDRST. Tested on a NUC5i7RYB, BIOS RYBDWi35.86A.0380.2019.0517.1530 with i915 selftest/hangcheck. Cc: stable@vger.kernel.org # v4.4 and upper Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store") Reported-by: Mauro Carvalho Chehab <mchehab@kernel.org> Tested-by: Mauro Carvalho Chehab <mchehab@kernel.org> Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Chris Wilson <chris.p.wilson@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1e59a7c45dd919a530256b9ac721ac6ea86c0677.1657639152.git.mchehab@kernel.org	2022-07-12 17:38:01 -04:00
Chris Wilson	336561a914	drm/i915/gt: Serialize GRDOM access between multiple engine resets Don't allow two engines to be reset in parallel, as they would both try to select a reset bit (and send requests to common registers) and wait on that register, at the same time. Serialize control of the reset requests/acks using the uncore->lock, which will also ensure that no other GT state changes at the same time as the actual reset. Cc: stable@vger.kernel.org # v4.4 and upper Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/e0a2d894e77aed7c2e36b0d1abdc7dbac3011729.1657639152.git.mchehab@kernel.org	2022-07-12 17:37:59 -04:00
Matt Roper	b7580e669c	drm/i915/dg2: Add Wa_15010599737 This workaround may need to be extended to other platforms soon, but for now it's marked as DG2-specific. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220708215804.2889246-1-matthew.d.roper@intel.com	2022-07-12 08:58:19 -07:00
Matthew Auld	bc99f1209f	drm/i915/ttm: fix sg_table construction If we encounter some monster sized local-memory page that exceeds the maximum sg length (UINT32_MAX), ensure that don't end up with some misaligned address in the entry that follows, leading to fireworks later. Also ensure we have some coverage of this in the selftests. v2(Chris): - Use round_down consistently to avoid udiv errors v3(Nirmoy): - Also update the max_segment in the selftest Fixes: f701b16d4cc5 ("drm/i915/ttm: add i915_sg_from_buddy_resource") Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6379 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Nirmoy Das <nirmoy.das@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220711085859.24198-1-matthew.auld@intel.com	2022-07-11 16:35:37 +01:00
Dan Carpenter	d50f5a109c	drm/i915/selftests: fix a couple IS_ERR() vs NULL tests The shmem_pin_map() function doesn't return error pointers, it returns NULL. Fixes: be1cb55a07bf ("drm/i915/gt: Keep a no-frills swappable copy of the default context state") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220708094104.GL2316@kadam	2022-07-11 10:14:51 +01:00
Radhakrishna Sripada	7835303982	drm/i915/mtl: Add MeteorLake PCI IDs Add Meteorlake PCI IDs. Split into M, and P subplatforms. v2: Update PCI id's v3: Move id 7d60 under MTL_M(MattR) Bspec: 55420 Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220708000335.2869311-3-radhakrishna.sripada@intel.com	2022-07-08 13:25:33 -07:00
Radhakrishna Sripada	bcf9b29662	drm/i915/mtl: Add MeteorLake platform info MTL has Xe_LPD+ display IP (version = 14), MTL graphics IP (version = 12.70), and Xe_LPM+ media IP (version = 13). Bspec: 55413 Bspec: 55416 Bspec: 55417 Bspec: 55418 Bspec: 55726 Bspec: 45544 Bspec: 65380 v2: rearrange the fields in pci_info(MattR) Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> [mattrope: Moved IS_METEORLAKE() higher in header] Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220708000335.2869311-2-radhakrishna.sripada@intel.com	2022-07-08 13:21:26 -07:00
Matt Roper	9a92732f04	drm/i915/gt: Add general DSS steering iterator to intel_gt_mcr Although all DSS belong to a single pool on Xe_HP platforms (i.e., they're not organized into slices from a topology point of view), we do still need to pass 'group' and 'instance' targets when steering register accesses to a specific instance of a per-DSS multicast register. The rules for how to determine group and instance IDs (which previously used legacy terms "slice" and "subslice") varies by platform. Some platforms determine steering by gslice membership, some platforms by cslice membership, and future platforms may have other rules. Since looping over each DSS and performing steered unicast register accesses is a relatively common pattern, let's add a dedicated iteration macro to handle this (and replace the platform-specific "instdone" loop we were using previously. This will avoid the calling code needing to figure out the details about how to obtain steering IDs for a specific DSS. Most of the places where we use this new loop are in the GPU errorstate code at the moment, but we do have some additional features coming in the future that will also need to loop over each DSS and steer some register accesses accordingly. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220701232006.1016135-1-matthew.d.roper@intel.com	2022-07-08 09:32:57 -07:00
Umesh Nerlige Ramappa	ca437b45ac	i915/perf: Disable OA sseu config param for gfx12.50+ The global sseu config is applicable only to gen11 platforms where concurrent media, render and OA use cases may cause some subslices to be turned off and hence lose NOA configuration. Ideally we want to return ENODEV for non-gen11 platforms, however, this has shipped with gfx12, so disable only for gfx12.50+. v2: gfx12 is already shipped with this, disable for gfx12.50+ (Lionel) v3: (Matt) - Update commit message and replace "12.5" with "12.50" - Replace DRM_DEBUG() with driver specific drm_dbg() Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220707193002.2859653-2-umesh.nerlige.ramappa@intel.com	2022-07-08 08:27:36 -07:00
Umesh Nerlige Ramappa	2fec539112	i915/perf: Replace DRM_DEBUG with driver specific drm_dbg call DRM_DEBUG is not the right debug call to use in i915 OA, replace it with driver specific drm_dbg() call (Matt). Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220707193002.2859653-1-umesh.nerlige.ramappa@intel.com	2022-07-08 08:27:08 -07:00
Chris Wilson	027c38b412	drm/i915/selftests: Grab the runtime pm in shrink_thp Since we are not holding a wakeref, shrinking a bound object is not guaranteed. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6370 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220706154738.235204-1-matthew.auld@intel.com	2022-07-07 09:16:43 +01:00
Alan Previn	b94a1a207d	drm/i915/guc: Asynchronous flush of GuC log regions Both error-capture and relay-logging mechanism use the GuC log infrastructure. That means the KMD must send a log flush complete notification back to GuC after reading the data out. This call is currently being sent synchronously. However, synchronous H2Gs cause problems when the system is backed up. There is no need for this to be synchronous. The KMD wasn't even looking at the return status from it. So make it asynchronous and then there is no issue about time outs. Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220607002314.1451656-2-alan.previn.teres.alexis@intel.com	2022-07-06 14:38:56 -07:00
Thomas Hellström	1926a6b759	drm/i915: Fix vm use-after-free in vma destruction In vma destruction, the following race may occur: Thread 1: Thread 2: i915_vma_destroy(); ... list_del_init(vma->vm_link); ... mutex_unlock(vma->vm->mutex); __i915_vm_release(); release_references(); And in release_reference() we dereference vma->vm to get to the vm gt pointer, leading to a use-after free. However, __i915_vm_release() grabs the vm->mutex so the vm won't be destroyed before vma->vm->mutex is released, so extract the gt pointer under the vm->mutex to avoid the vma->vm dereference in release_references(). v2: Fix a typo in the commit message (Andi Shyti) Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5944 Fixes: e1a7ab4fca0c ("drm/i915: Remove the vm open count") Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.con> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220620123659.381772-1-thomas.hellstrom@linux.intel.com	2022-07-04 10:25:50 +02:00
Niranjana Vishwanathapura	99c0b3ce6c	drm/doc/rfc: VM_BIND uapi definition VM_BIND and related uapi definitions v2: Reduce the scope to simple Mesa use case. v3: Expand VM_UNBIND documentation and add I915_GEM_VM_BIND/UNBIND_FENCE_VALID and I915_GEM_VM_BIND_TLB_FLUSH flags. v4: Remove I915_GEM_VM_BIND_TLB_FLUSH flag and add additional documentation for vm_bind/unbind. v5: Remove TLB flush requirement on VM_UNBIND. Add version support to stage implementation. v6: Define and use drm_i915_gem_timeline_fence structure for all timeline fences. v7: Rename I915_PARAM_HAS_VM_BIND to I915_PARAM_VM_BIND_VERSION. Update documentation on async vm_bind/unbind and versioning. Remove redundant vm_bind/unbind FENCE_VALID flag, execbuf3 batch_count field and I915_EXEC3_SECURE flag. v8: Remove I915_GEM_VM_BIND_READONLY and minor documentation updates. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220701003110.24843-4-niranjana.vishwanathapura@intel.com	2022-07-01 16:47:07 -07:00
Niranjana Vishwanathapura	a913bde810	drm/i915: Update i915 uapi documentation Add some missing i915 uapi documentation which the new i915 VM_BIND feature documentation will be refer to. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220701003110.24843-3-niranjana.vishwanathapura@intel.com	2022-07-01 16:47:01 -07:00
Niranjana Vishwanathapura	ece91c882d	drm/doc/rfc: VM_BIND feature design document VM_BIND design document with description of intended use cases. v2: Reduce the scope to simple Mesa use case. v3: Expand documentation on dma-resv usage, TLB flushing and execbuf3. v4: Remove vm_bind tlb flush request support. v5: Update TLB flushing documentation. v6: Update out of order completion documentation. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220701003110.24843-2-niranjana.vishwanathapura@intel.com	2022-07-01 16:45:23 -07:00
Matt Roper	8618b8489b	drm/i915: DG2 and ATS-M device ID updates Small BAR support has now landed, which allows us to add the PCI IDs that correspond to add-in card designs of DG2 and ATS-M. There's also one additional MB-down PCI ID that recently appeared (0x5698) so we add it too. Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220701152231.529511-2-matthew.d.roper@intel.com	2022-07-01 16:22:13 -07:00
Gustavo Sousa	3b05c96078	drm/i915/pvc: Implement w/a 16016694945 A new PVC-specific workaround has just been added to the BSpec. BSpec: 64027 Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220630201407.16770-1-gustavo.sousa@intel.com	2022-07-01 08:29:02 -07:00
Matthew Auld	eb1c535f0d	drm/i915: turn on small BAR support With the uAPI in place we should now have enough in place to ensure a working system on small BAR configurations. v2: (Nirmoy & Thomas): - s/full BAR/Resizable BAR/ which is hopefully more easily understood by users. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-13-matthew.auld@intel.com	2022-07-01 08:30:47 +01:00
Matthew Auld	efeb3caf43	drm/i915/ttm: disallow CPU fallback mode for ccs pages Falling back to memcpy/memset shouldn't be allowed if we know we have CCS state to manage using the blitter. Otherwise we are potentially leaving the aux CCS state in an unknown state, which smells like an info leak. Fixes: 48760ffe923a ("drm/i915/gt: Clear compress metadata for Flat-ccs objects") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Cc: Ramalingam C <ramalingam.c@intel.com> Reviewed-by: Ramalingam C <ramalingam.c@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-12-matthew.auld@intel.com	2022-07-01 08:30:31 +01:00
Matthew Auld	bfe53be268	drm/i915/ttm: handle blitter failure on DG2 If the move or clear operation somehow fails, and the memory underneath is not cleared, like when moving to lmem, then we currently fallback to memcpy or memset. However with small-BAR systems this fallback might no longer be possible. For now we use the set_wedged sledgehammer if we ever encounter such a scenario, and mark the object as borked to plug any holes where access to the memory underneath can happen. Add some basic selftests to exercise this. v2: - In the selftests make sure we grab the runtime pm around the reset. Also make sure we grab the reset lock before checking if the device is wedged, since the wedge might still be in-progress and hence the bit might not be set yet. - Don't wedge or put the object into an unknown state, if the request construction fails (or similar). Just returning an error and skipping the fallback should be safe here. - Make sure we wedge each gt. (Thomas) - Peek at the unknown_state in io_reserve, that way we don't have to export or hand roll the fault_wait_for_idle. (Thomas) - Add the missing read-side barriers for the unknown_state. (Thomas) - Some kernel-doc fixes. (Thomas) v3: - Tweak the ordering of the set_wedged, also add FIXME. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-11-matthew.auld@intel.com	2022-07-01 08:30:00 +01:00
Matthew Auld	11f01dcf3b	drm/i915/selftests: ensure we reserve a fence slot We should always be explicit and allocate a fence slot before adding a new fence. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-10-matthew.auld@intel.com	2022-07-01 08:30:00 +01:00
Matthew Auld	938d2fd17d	drm/i915/selftests: skip the mman tests for stolen It's not supported, and just skips later anyway. With small-BAR things get more complicated since all of stolen is likely not even CPU accessible, hence not passing I915_BO_ALLOC_GPU_ONLY just results in the object create failing. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-9-matthew.auld@intel.com	2022-07-01 08:30:00 +01:00
Matthew Auld	71b1669ea9	drm/i915/uapi: tweak error capture on recoverable contexts A non-recoverable context must be used if the user wants proper error capture on discrete platforms. In the future the kernel may want to blit the contents of some objects when later doing the capture stage. Also extend to newer integrated platforms. v2(Thomas): - Also extend to newer integrated platforms, for capture buffer memory allocation purposes. v3 (Reported-by: kernel test robot <lkp@intel.com>): - Fix build on !CONFIG_DRM_I915_CAPTURE_ERROR Testcase: igt@gem_exec_capture@capture-recoverable Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-8-matthew.auld@intel.com	2022-07-01 08:30:00 +01:00
Matthew Auld	d42a738e5a	drm/i915/error: skip non-mappable pages Skip capturing any lmem pages that can't be copied using the CPU. This in now only best effort on platforms that have small BAR. Testcase: igt@gem-exec-capture@capture-invisible Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-7-matthew.auld@intel.com	2022-07-01 08:30:00 +01:00
Matthew Auld	525e93f631	drm/i915/uapi: add NEEDS_CPU_ACCESS hint If set, force the allocation to be placed in the mappable portion of I915_MEMORY_CLASS_DEVICE. One big restriction here is that system memory (i.e I915_MEMORY_CLASS_SYSTEM) must be given as a potential placement for the object, that way we can always spill the object into system memory if we can't make space. Testcase: igt@gem-create@create-ext-cpu-access-sanity-check Testcase: igt@gem-create@create-ext-cpu-access-big Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-6-matthew.auld@intel.com	2022-07-01 08:30:00 +01:00
Matthew Auld	1dbd07e088	drm/i915/uapi: apply ALLOC_GPU_ONLY by default On small BAR configurations, when dealing with I915_MEMORY_CLASS_DEVICE allocations, we assume that by default, all userspace allocations should be placed in the non-CPU visible portion. Note that dumb buffers are not included here, since these are not "GPU accelerated" and likely need CPU access. We choose to just always set GPU_ONLY, and let the backend figure out if that should be ignored or not, for example on full BAR systems. In a later patch userspace will be able to provide a hint if CPU access to the buffer is needed. v2(Thomas) - Apply GPU_ONLY on all discrete devices, but only if the BO can be placed in LMEM. Down in the depths this should be turned into a noop, where required, and as an annotation it still make some sense. If we apply it regardless of the placements then we end up needing to check the placements during exec capture. Also it's slightly inconsistent since the NEEDS_CPU_ACCESS can only be applied on objects that can be placed in LMEM. The other annoyance would be gem_create_ext vs plain gem_create, if we were to always apply GPU_ONLY. Testcase: igt@gem-create@create-ext-cpu-access-sanity-check Testcase: igt@gem-create@create-ext-cpu-access-big Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-5-matthew.auld@intel.com	2022-07-01 08:29:59 +01:00
Matthew Auld	be4e366602	drm/i915: remove intel_memory_region avail No longer used. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-4-matthew.auld@intel.com	2022-07-01 08:29:59 +01:00
Matthew Auld	141f733bb3	drm/i915/uapi: expose the avail tracking Vulkan would like to have a rough measure of how much device memory can in theory be allocated. Also add unallocated_cpu_visible_size to track the visible portion, in case the device is using small BAR. Also tweak the locking so we nice consistent values for both the mm->avail and the visible tracking. v2: tweak the locking slightly so we update the mm->avail and visible tracking as one atomic operation, such that userspace doesn't get strange values when sampling the values. Testcase: igt@i915_query@query-regions-unallocated Testcase: igt@i915_query@query-regions-sanity-check Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-3-matthew.auld@intel.com	2022-07-01 08:29:59 +01:00
Matthew Auld	3f4309cbdc	drm/i915/uapi: add probed_cpu_visible_size Userspace wants to know the size of CPU visible portion of device local-memory, and on small BAR devices the probed_size is no longer enough. In Vulkan, for example, it would like to know the size in bytes for CPU visible VkMemoryHeap. We already track the io_size for each region, so plumb that through to the region query. v2: Drop the ( -1 = unknown ) stuff, which is confusing since nothing can currently ever return such a value. Testcase: igt@i915_query@query-regions-sanity-check Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-2-matthew.auld@intel.com	2022-07-01 08:29:59 +01:00
Matthew Auld	fff1d972f4	drm/doc: add rfc section for small BAR uapi Add an entry for the new uapi needed for small BAR on DG2+. v2: - Some spelling fixes and other small tweaks. (Akeem & Thomas) - Rework error capture interactions, including no longer needing NEEDS_CPU_ACCESS for objects marked for capture. (Thomas) - Add probed_cpu_visible_size. (Lionel) v3: - Drop the vma query for now. - Add unallocated_cpu_visible_size as part of the region query. - Improve the docs some more, including documenting the expected behaviour on older kernels, since this came up in some offline discussion. v4: - Various improvements all over. (Tvrtko) v5: - Include newer integrated platforms when applying the non-recoverable context and error capture restriction. (Thomas) Mesa: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16739 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Cc: mesa-dev@lists.freedesktop.org Acked-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Acked-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Jordan Justen <jordan.l.justen@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-1-matthew.auld@intel.com	2022-07-01 08:29:29 +01:00
Daniele Ceraolo Spurio	971e4a9781	drm/i915/guc: ADL-N should use the same GuC FW as ADL-S The only difference between the ADL S and P GuC FWs is the HWConfig support. ADL-N does not support HWConfig, so we should use the same binary as ADL-S, otherwise the GuC might attempt to fetch a config table that does not exist. ADL-N is internally identified as an ADL-P, so we need to special-case it in the FW selection code. Fixes: 7e28d0b26759 ("drm/i915/adl-n: Enable ADL-N platform") Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220621233005.3952293-1-daniele.ceraolospurio@intel.com	2022-06-30 14:25:09 -07:00
Vinay Belgaumkar	79398d24da	drm/i915/guc/slpc: Add a new SLPC selftest This test will validate we can achieve actual frequency of RP0. Pcode grants frequencies based on what GuC is requesting. However, thermal throttling can limit what is being granted. Add a test to request for max, but don't fail the test if RP0 is not granted due to throttle reasons. Also optimize the selftest by using a common run_test function to avoid code duplication. Rename the "clamp" tests to vary_max_freq and vary_min_freq. v2: Fix compile warning v3: Review comments (Ashutosh). Added a FIXME for the media RP0 case. v4: Checkpatch (strict) fixes, remove FIXME and other comments (Ashutosh) Fixes commit 8ee2c227822e ("drm/i915/guc/slpc: Add SLPC selftest") Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220627230346.27720-1-vinay.belgaumkar@intel.com	2022-06-29 13:50:23 -07:00
Nirmoy Das	a069685637	drm/i915: Fix a lockdep warning at error capture For some platfroms we use stop_machine version of gen8_ggtt_insert_page/gen8_ggtt_insert_entries to avoid a concurrent GGTT access bug but this causes a circular locking dependency warning: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ggtt->error_mutex); lock(dma_fence_map); lock(&ggtt->error_mutex); lock(cpu_hotplug_lock); Fix this by calling gen8_ggtt_insert_page/gen8_ggtt_insert_entries directly at error capture which is concurrent GGTT access safe because reset path make sure of that. v2: Fix rebase conflict and added a comment. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5595 Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Ramalingam C <ramalingam.c@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220624110821.29190-1-nirmoy.das@intel.com	2022-06-29 14:52:50 +05:30
Vinay Belgaumkar	58eaa6b3fb	drm/i915/guc/slpc: Use non-blocking H2G for waitboost SLPC min/max frequency updates require H2G calls. We are seeing timeouts when GuC channel is backed up and it is unable to respond in a timely fashion causing warnings and affecting CI. This is seen when waitboosting happens during a stress test. this patch updates the waitboost path to use a non-blocking H2G call instead, which returns as soon as the message is successfully transmitted. v2: Use drm_notice to report any errors that might occur while sending the waitboost H2G request (Tvrtko) v3: Add drm_notice inside force_min_freq (Ashutosh) Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220623003225.23301-1-vinay.belgaumkar@intel.com	2022-06-27 15:08:58 -07:00
Umesh Nerlige Ramappa	0667429ce6	drm/i915/reset: Add additional steps for Wa_22011802037 for execlist backend For execlists backend, current implementation of Wa_22011802037 is to stop the CS before doing a reset of the engine. This WA was further extended to wait for any pending MI FORCE WAKEUPs before issuing a reset. Add the extended steps in the execlist path of reset. In addition, extend the WA to gen11. v2: (Tvrtko) - Clarify comments, commit message, fix typos - Use IS_GRAPHICS_VER for gen 11/12 checks v3: (Daneile) - Drop changes to intel_ring_submission since WA does not apply to it - Log an error if MSG IDLE is not defined for an engine Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Fixes: f6aa0d713c88 ("drm/i915: Add Wa_22011802037 force cs halt") Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220621192105.2100585-1-umesh.nerlige.ramappa@intel.com	2022-06-27 12:21:22 -07:00
Alan Previn	59bcdb564b	drm/i915/guc: Don't update engine busyness stats too frequently Using two different types of workoads, it was observed that guc_update_engine_gt_clks was being called too frequently and/or causing a CPU-to-lmem bandwidth hit over PCIE. Details on the workloads and numbers are in the notes below. Background: At the moment, guc_update_engine_gt_clks can be invoked via one of 3 ways. #1 and #2 are infrequent under normal operating conditions: 1.When a predefined "ping_delay" timer expires so that GuC- busyness can sample the GTPM clock counter to ensure it doesn't miss a wrap-around of the 32-bits of the HW counter. (The ping_delay is calculated based on 1/8th the time taken for the counter go from 0x0 to 0xffffffff based on the GT frequency. This comes to about once every 28 seconds at a GT frequency of 19.2Mhz). 2.In preparation for a gt reset. 3.In response to __gt_park events (as the gt power management puts the gt into a lower power state when there is no work being done). Root-cause: For both the workloads described farther below, it was observed that when user space calls IOCTLs that unparks the gt momentarily and repeats such calls many times in quick succession, it triggers calling guc_update_engine_gt_clks as many times. However, the primary purpose of guc_update_engine_gt_clks is to ensure we don't miss the wraparound while the counter is ticking. Thus, the solution is to ensure we skip that check if gt_park is calling this function earlier than necessary. Solution: Snapshot jiffies when we do actually update the busyness stats. Then get the new jiffies every time intel_guc_busyness_park is called and bail if we are being called too soon. Use half of the ping_delay as a safe threshold. NOTE1: Workload1: IGTs' gem_create was modified to create a file handle, allocate memory with sizes that range from a min of 4K to the max supported (in power of two step-sizes). Its maps, modifies and reads back the memory. Allocations and modification is repeated until total memory allocation reaches the max. Then the file handle is closed. With this workload, guc_update_engine_gt_clks was called over 188 thousand times in the span of 15 seconds while this test ran three times. With this patch, the number of calls reduced to 14. NOTE2: Workload2: 30 transcode sessions are created in quick succession. While these sessions are created, pcm-iio tool was used to measure I/O read operation bandwidth consumption sampled at 100 milisecond intervals over the course of 20 seconds. The total bandwidth consumed over 20 seconds without this patch was measured at average at 311KBps per sample. With this patch, the number went down to about 175Kbps which is about a 43% savings. Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220623023157.211650-2-alan.previn.teres.alexis@intel.com	2022-06-27 11:35:54 -07:00
Niranjana Vishwanathapura	bcb9aa45d5	Revert "drm/i915: Hold reference to intel_context over life of i915_request" This reverts commit 1e98d8c52ed5dfbaf273c4423c636525c2ce59e7. The problem with this patch is that it makes i915_request to hold a reference to intel_context, which in turn holds a reference on the VM. This strong back referencing can lead to reference loops which leads to resource leak. An example is the upcoming VM_BIND work which requires VM to hold a reference to some shared VM specific BO. But this BO's dma-resv fences holds reference to the i915_request thus leading to reference loop. v2: Do not use reserved requests for virtual engines Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Ramalingam C <ramalingam.c@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Mathew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220614184348.23746-3-ramalingam.c@intel.com	2022-06-27 23:47:26 +05:30
Niranjana Vishwanathapura	7307e91bfc	drm/i915: Do not access rq->engine without a reference In i915_fence_get_driver_name(), user may not hold a reference to rq->engine. Hence do not access it. Instead, store required device private pointer in 'rq->i915' and use it. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Ramalingam C <ramalingam.c@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220614184348.23746-2-ramalingam.c@intel.com	2022-06-27 23:46:35 +05:30
Matt Roper	7d8097073c	drm/i915: Prefer "XEHP_" prefix for registers We've been introducing new registers with a mix of "XEHP_" (architecture) and "XEHPSDV_" (platform) prefixes. For consistency, let's settle on "XEHP_" as the preferred form. XEHPSDV_RP_STATE_CAP stays with its current name since that's truly a platform-specific register and not something that applies to the Xe_HP architecture as a whole. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Caz Yokoyama <caz@caztech.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220624210328.308630-2-matthew.d.roper@intel.com	2022-06-27 07:44:25 -07:00
Matt Roper	8524bb6714	drm/i915: Correct duplicated/misplaced GT register definitions XEHPSDV_FLAT_CCS_BASE_ADDR, GEN8_L3_LRA_1_GPGPU, and MMCD_MISC_CTRL were duplicated between i915_reg.h and intel_gt_regs.h. These are all GT registers, so we should drop the copy from i915_reg.h. XEHPSDV_TILE0_ADDR_RANGE was defined in i915_reg.h, but really belongs in intel_gt_regs.h. Move it. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220624210328.308630-1-matthew.d.roper@intel.com	2022-06-27 07:44:17 -07:00
Matthew Auld	563aaf4a92	drm/i915: tweak the ordering in cpu_write_needs_clflush For imported dma-buf objects we leave the object as cache_coherent = 0 across all platforms, which is reasonable given that have no clue what the memory underneath is, and its not like the driver can ever manually clflush the pages anyway (like with i915_gem_clflush_object) for such objects. However on discrete we choose to treat cache_dirty = true as a programmer error, leading to a warning. The simplest fix looks to be to just change the ordering in cpu_write_needs_clflush to prevent ever setting cache_dirty for dma-buf objects on discrete. Fixes: d028a7690d87 ("drm/i915/dmabuf: Fix prime_mmap to work when using LMEM") Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5266 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220622155919.355081-1-matthew.auld@intel.com	2022-06-27 12:56:52 +01:00
Akeem G Abodunrin	373269ae6f	drm/i915/selftests: Increase timeout for live_parallel_switch With GuC submission, it takes a little bit longer switching contexts among all available engines simultaneously, when running live_parallel_switch subtest. Increase the timeout. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5885 Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220622141104.334432-1-matthew.auld@intel.com	2022-06-23 15:44:14 +01:00
Lucas De Marchi	9ce07d94c9	drm/i915/gt: Re-do the intel-gtt split Re-do what was attempted in commit 7a5c922377b4 ("drm/i915/gt: Split intel-gtt functions by arch"). The goal of that commit was to split the handlers for older hardware that depend on intel-gtt.ko so i915 can be built for non-x86 archs, after some more patches. Other archs do not need intel-gtt.ko. Main issue with the previous approach: it moved all the hooks, including the gen8, which is used by all platforms gen8 and newer. Re-do the split moving only the handlers for gen < 6, which are the only ones calling out to the separate module. While at it do some minor cleanups: - Rename the prefix s/gen5_/gmch_/ to be more accurate what platforms are covered by intel_ggtt_gmch.c - Remove dead code for gen12 out of needs_idle_maps() - Remove TODO comment leftover - Re-order if/else ladder in ggtt_probe_hw() to keep newest platforms first v2: Add minor cleanups (Matt Roper) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220617230559.2109427-2-lucas.demarchi@intel.com	2022-06-22 15:52:56 -07:00
Lucas De Marchi	64e06652e3	agp/intel: Rename intel-gtt symbols Exporting the symbols like intel_gtt_* creates some confusion inside i915 that has symbols named similarly. In an attempt to isolate platforms needing intel-gtt.ko, commit 7a5c922377b4 ("drm/i915/gt: Split intel-gtt functions by arch") moved way too much inside gt/intel_gt_gmch.c, even the functions that don't callout to this module. Rename the symbols to make the separation clear. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220617230559.2109427-1-lucas.demarchi@intel.com	2022-06-22 15:52:55 -07:00

1 2 3 4 5 ...

1089993 Commits