linux

iv/linux

Author	SHA1	Message	Date
Chris Wilson	f4d49692ad	drm/i915/gt: Mark up the racy read of execlists->context_tag Since we are using bitops on context_tag to allow us to reserve and release inflight tags concurrently, the scan for the next bit is intentionally racy. [ 516.446854] BUG: KCSAN: data-race in execlists_schedule_in.isra.0 [i915] / execlists_schedule_out [i915] [ 516.446874] [ 516.446886] write (marked) to 0xffff8881f7644048 of 8 bytes by interrupt on cpu 2: [ 516.447076] execlists_schedule_out+0x538/0x6a0 [i915] [ 516.447263] process_csb+0x10b/0x3d0 [i915] [ 516.447449] execlists_submission_tasklet+0x30/0x170 [i915] [ 516.447468] tasklet_action_common.isra.0+0x42/0x90 [ 516.447484] __do_softirq+0xc8/0x206 [ 516.447498] irq_exit+0xcd/0xe0 [ 516.447516] do_IRQ+0x44/0xc0 [ 516.447535] ret_from_intr+0x0/0x1c [ 516.447550] cpuidle_enter_state+0x199/0x400 [ 516.447572] cpuidle_enter+0x50/0x90 [ 516.447587] do_idle+0x197/0x1e0 [ 516.447600] cpu_startup_entry+0x14/0x20 [ 516.447619] start_secondary+0xf9/0x130 [ 516.447643] secondary_startup_64+0xa4/0xb0 [ 516.447655] [ 516.447671] read to 0xffff8881f7644048 of 8 bytes by task 460 on cpu 1: [ 516.447863] execlists_schedule_in.isra.0+0x3cf/0x5a0 [i915] [ 516.448064] execlists_dequeue+0xf8f/0x1690 [i915] [ 516.448252] __execlists_submission_tasklet+0x48/0x60 [i915] [ 516.448440] execlists_submit_request+0x2e2/0x310 [i915] [ 516.448634] submit_notify+0x8f/0xc8 [i915] [ 516.448820] __i915_sw_fence_complete+0x61/0x420 [i915] [ 516.449005] i915_sw_fence_complete+0x58/0x80 [i915] [ 516.449208] i915_sw_fence_commit+0x16/0x20 [i915] [ 516.449399] __i915_request_queue+0x60/0x70 [i915] [ 516.449590] i915_gem_do_execbuffer+0x33f1/0x4a00 [i915] [ 516.449782] i915_gem_execbuffer2_ioctl+0x2a2/0x550 [i915] [ 516.449800] drm_ioctl_kernel+0xe9/0x130 [ 516.449814] drm_ioctl+0x27d/0x45e [ 516.449827] ksys_ioctl+0x89/0xb0 [ 516.449842] __x64_sys_ioctl+0x42/0x60 [ 516.449864] do_syscall_64+0x6e/0x2c0 [ 516.449878] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200511075722.13483-1-chris@chris-wilson.co.uk	2020-05-11 11:01:12 +01:00
Gustavo A. R. Silva	f1e79c7e18	drm/i915: Replace zero-length array with flexible-array The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] sizeof(flexible-array-member) triggers a warning because flexible array members have incomplete type[1]. There are some instances of code in which the sizeof operator is being incorrectly/erroneously applied to zero-length arrays and the result is zero. Such instances may be hiding some bugs. So, this work (flexible-array member conversions) will also help to get completely rid of those sorts of issues. This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200507185408.GA14561@embeddedor	2020-05-09 12:59:23 +01:00
Chris Wilson	e41627db6f	drm/i915/gt: Improve precision on defer_request assert The kernel_context does not use initial-breadcrumbs, so when we ask if its requests have started we do so by comparing against the completion seqno of the previous request. This is very imprecise, not precise enough for the defer_request assertion. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1847 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200508104220.9872-1-chris@chris-wilson.co.uk	2020-05-08 15:09:11 +01:00
Mika Kuoppala	972282c4cf	drm/i915/gen12: Add aux table invalidate for all engines All engines, exception being blitter as it does not care about the form, can access compressed surfaces. So we need to add forced aux table invalidates for those engines. v2: virtual instance masking (Chris) v3: bug on if not found (Chris) References: d248b371f747 ("drm/i915/gen12: Invalidate aux table entries forcibly") References bspec#43904, hsdes#1809175790 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chuansheng Liu <chuansheng.liu@intel.com> Cc: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200507142045.8668-1-mika.kuoppala@linux.intel.com	2020-05-07 20:18:28 +01:00
Chris Wilson	eec39e441c	drm/i915: Remove wait priority boosting Upon waiting a request (when asked), we gave that request a small priority boost, not enough for it to cause preemption, but enough for it to be scheduled next before all equals. We also used that bit to give new clients a small priority boost, similar to FQ_CODEL, such that we favoured short interactive tasks ahead of long running streams. However, this is causing lots of complications with timeslicing where we both want to honour the boost and yet ignore it. Those complications cause unexpected user behaviour (tasks not being timesliced and run concurrently as epxected), and the easiest way to resolve that is to remove the boost. Hopefully, we can find a compromise again if we need to, but in theory timeslicing itself and future more advanced schedulers should give us the interactivity boost we seek. Testcase: igt/gem_exec_schedule/lateslice Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200507152338.7452-3-chris@chris-wilson.co.uk	2020-05-07 20:08:58 +01:00
Chris Wilson	6b6cd2ebd8	drm/i915: Mark concurrent submissions with a weak-dependency We recorded the dependencies for WAIT_FOR_SUBMIT in order that we could correctly perform priority inheritance from the parallel branches to the common trunk. However, for the purpose of timeslicing and reset handling, the dependency is weak -- as we the pair of requests are allowed to run in parallel and not in strict succession. The real significance though is that this allows us to rearrange groups of WAIT_FOR_SUBMIT linked requests along the single engine, and so can resolve user level inter-batch scheduling dependencies from user semaphores. Fixes: c81471f5e95c ("drm/i915: Copy across scheduler behaviour flags across submit fences") Testcase: igt/gem_exec_fence/submit Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: <stable@vger.kernel.org> # v5.6+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200507155109.8892-1-chris@chris-wilson.co.uk	2020-05-07 19:49:21 +01:00
Mika Kuoppala	d248b371f7	drm/i915/gen12: Invalidate aux table entries forcibly Aux table invalidation can fail on update. So next access may cause memory access to be into stale entry. Proposed workaround is to invalidate entries between all batchbuffers. v2: correct register address (Yang) v3: respect the order (Chris) References bspec#43904, hsdes#1809175790 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chuansheng Liu <chuansheng.liu@intel.com> Cc: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Yang A Shi <yang.a.shi@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200506165310.1239-1-mika.kuoppala@linux.intel.com	2020-05-07 07:44:42 +01:00
Mika Kuoppala	0c7c0c8e6f	drm/i915/gen12: Flush L3 Flush TDL,L3 and EUs Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200506144734.29297-3-mika.kuoppala@linux.intel.com	2020-05-07 07:44:41 +01:00
Mika Kuoppala	32d7171ee2	drm/i915/gen12: Fix HDC pipeline flush HDC pipeline flush is bit on the first dword of the PIPE_CONTROL, not the second. Make it so. v2: function naming (Chris) Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200506144734.29297-2-mika.kuoppala@linux.intel.com	2020-05-07 07:44:41 +01:00
Mika Kuoppala	f02ac414ba	Revert "drm/i915/tgl: Include ro parts of l3 to invalidate" This reverts commit 62037ffff229b7d94f1db5ef8d2e2ec819832ef3. L3 ro cache invalidation is part of the dword0 of pipe control. Also it is not relevant to this gen. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200506144734.29297-1-mika.kuoppala@linux.intel.com	2020-05-07 07:44:40 +01:00
Chris Wilson	977253df64	drm/i915/gt: Stop holding onto the pinned_default_state As we only restore the default context state upon banning a context, we only need enough of the state to run the ring and nothing more. That is we only need our bare protocontext. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504180745.15645-1-chris@chris-wilson.co.uk	2020-05-05 21:12:33 +01:00
Chris Wilson	b68be5c623	drm/i915/execlists: Record the active CCID from before reset If we cannot trust the reset will flush out the CS event queue such that process_csb() reports an accurate view of HW, we will need to search the active and pending contexts to determine which was actually running at the time we issued the reset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200505084629.31365-1-chris@chris-wilson.co.uk	2020-05-05 12:05:40 +01:00
Chris Wilson	25fd6de315	drm/i915/gt: Small tidy of gen8+ breadcrumb emission Use a local to shrink a line under 80 columns, and refactor the common emit_xcs_breadcrumb() wrapper of ggtt-write. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504180507.6017-1-chris@chris-wilson.co.uk	2020-05-05 09:16:59 +01:00
Chris Wilson	8757797ff9	drm/i915/selftests: Repeat the rps clock frequency measurement Repeat the measurement of the clock frequency a few times and use the median to try and reduce the systematic measurement error. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504044903.7626-6-chris@chris-wilson.co.uk	2020-05-04 18:21:28 +01:00
Ville Syrjälä	dab3aff7b1	drm/i915: Remove cnl pre-prod workarounds Remove all the stepping dependent cnl workarounds. Bspec lists more steppings than this so presumably these are classed as pre-production. And this is cnl after all so no one should really care anyway. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430125822.21985-2-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2020-05-04 18:44:52 +03:00
Chris Wilson	e3d291301f	drm/i915/gem: Implement legacy MI_STORE_DATA_IMM The older arches did not convert MI_STORE_DATA_IMM to using the GTT, but left them writing to a physical address. The notes suggest that the primary reason would be so that the writes were cache coherent, as the CPU cache uses physical tagging. As such we did not implement the legacy variant of MI_STORE_DATA_IMM and so left all the relocations synchronous -- but with a small function to convert from the vma address into the physical address, we can implement asynchronous relocs on these older arches, fixing up a few tests that require them. In order to be able to test the legacy paths, refactor the gpu relocations so that we can hook them up to a selftest. v2: Use an array of offsets not enum labels for the selftest v3: Refactor the common igt_hexdump() Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/757 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504140629.28240-1-chris@chris-wilson.co.uk	2020-05-04 15:15:04 +01:00
Chris Wilson	389b7f00c7	drm/i915/gt: Sanitize RPS interrupts upon resume Currently we clear and disable the RPS pm interrupts on module load, and presume that they remain disabled forevermore. However, the mask is cleared on suspend and so after resume they may start showing up again unexepectedly. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1811 Fixes: 8e99299a04bc ("drm/i915/gt: Track use of RPS interrupts in flags") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi@etezian.org> Reviewed-by: Andi Shyti <andi@etezian.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200502173512.32353-1-chris@chris-wilson.co.uk	2020-05-03 08:24:36 +01:00
Chris Wilson	a211da9c77	drm/i915/gt: Make timeslicing an explicit engine property In order to allow userspace to rely on timeslicing to reorder their batches, we must support preemption of those user batches. Declare timeslicing as an explicit property that is a combination of having the kernel support and HW support. Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 8ee36e048c98 ("drm/i915/execlists: Minimalistic timeslicing") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501122249.12417-1-chris@chris-wilson.co.uk	2020-05-01 15:17:33 +01:00
Chris Wilson	16e8745967	drm/i915/gt: Move the batch buffer pool from the engine to the gt Since the introduction of 'soft-rc6', we aim to park the device quickly and that results in frequent idling of the whole device. Currently upon idling we free the batch buffer pool, and so this renders the cache ineffective for many workloads. If we want to have an effective cache of recently allocated buffers available for reuse, we need to decouple that cache from the engine powermanagement and make it timer based. As there is no reason then to keep it within the engine (where it once made retirement order easier to track), we can move it up the hierarchy to the owner of the memory allocations. v2: Hook up to debugfs/drop_caches to clear the cache on demand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430111819.10262-2-chris@chris-wilson.co.uk	2020-04-30 19:12:02 +01:00
Chris Wilson	de3b4d9361	drm/i915/gt: Restore aggressive post-boost downclocking We reduced the clocks slowly after a boost event based on the observation that the smoothness of animations suffered. However, since reducing the evalution intervals, we should be able to respond to the rapidly fluctuating workload of a simple desktop animation and so restore the more aggressive downclocking. References: 2a8862d2f3da ("drm/i915: Reduce the RPS shock") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-6-chris@chris-wilson.co.uk	2020-04-30 00:57:38 +01:00
Chris Wilson	3f88dde6ee	drm/i915/gt: Apply the aggressive downclocking to parking We treat parking as a manual RPS timeout event, and downclock the GPU for the next unpark and batch execution. However, having restored the aggressive downclocking and observed that we have very light workloads whose only interaction is through the manual parking events, carry over the aggressive downclocking to the fake RPS events. References: 21abf0bf168d ("drm/i915/gt: Treat idling as a RPS downclock event") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-5-chris@chris-wilson.co.uk	2020-04-30 00:57:37 +01:00
Chris Wilson	36d516be86	drm/i915/gt: Switch to manual evaluation of RPS As with the realisation for soft-rc6, we respond to idling the engines within microseconds, far faster than the response times for HW RC6 and RPS. Furthermore, our fast parking upon idle, prevents HW RPS from running for many desktop workloads, as the RPS evaluation intervals are on the order of tens of milliseconds, but the typical workload is just a couple of milliseconds, but yet we still need to determine the best frequency for user latency versus power. Recognising that the HW evaluation intervals are a poor fit, and that they were deprecated [in bspec at least] from gen10, start to wean ourselves off them and replace the EI with a timer and our accurate busy-stats. The principle benefit of manually evaluating RPS intervals is that we can be more responsive for better performance and powersaving for both spiky workloads and steady-state. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1698 Fixes: 98479ada421a ("drm/i915/gt: Treat idling as a RPS downclock event") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-4-chris@chris-wilson.co.uk	2020-04-30 00:57:37 +01:00
Chris Wilson	8e99299a04	drm/i915/gt: Track use of RPS interrupts in flags Use the new intel_rps.flags field to store whether or not interrupts are being used with RPS. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi@etezian.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-3-chris@chris-wilson.co.uk	2020-04-30 00:57:36 +01:00
Chris Wilson	9bad2adbdd	drm/i915/gt: Move rps.enabled/active to flags Pull the boolean intel_rps.enabled and intel_rps.active into a single flags field, in preparation for more. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-2-chris@chris-wilson.co.uk	2020-04-30 00:57:35 +01:00
Chris Wilson	426d0073fb	drm/i915/gt: Always enable busy-stats for execlists In the near future, we will utilize the busy-stats on each engine to approximate the C0 cycles of each, and use that as an input to a manual RPS mechanism. That entails having busy-stats always enabled and so we can remove the enable/disable routines and simplify the pmu setup. As a consequence of always having the stats enabled, we can also show the current active time via sysfs/engine/xcs/active_time_ns. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-1-chris@chris-wilson.co.uk	2020-04-30 00:57:34 +01:00
Chris Wilson	be1cb55a07	drm/i915/gt: Keep a no-frills swappable copy of the default context state We need to keep the default context state around to instantiate new contexts (aka golden rendercontext), and we also keep it pinned while the engine is active so that we can quickly reset a hanging context. However, the default contexts are large enough to merit keeping in swappable memory as opposed to kernel memory, so we store them inside shmemfs. Currently, we use the normal GEM objects to create the default context image, but we can throw away all but the shmemfs file. This greatly simplifies the tricky power management code which wants to run underneath the normal GT locking, and we definitely do not want to use any high level objects that may appear to recurse back into the GT. Though perhaps the primary advantage of the complex GEM object is that we aggressively cache the mapping, but here we are recreating the vm_area everytime time we unpark. At the worst, we add a lightweight cache, but first find a microbenchmark that is impacted. Having started to create some utility functions to make working with shmemfs objects easier, we can start putting them to wider use, where GEM objects are overkill, such as storing persistent error state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Ramalingam C <ramalingam.c@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429172429.6054-1-chris@chris-wilson.co.uk	2020-04-29 19:02:37 +01:00
Dan Carpenter	8c35a19576	drm/i915/selftests: fix error handling in __live_lrc_indirect_ctx_bb() If intel_context_create() fails then it leads to an error pointer dereference. I shuffled things around to make error handling easier. Fixes: 1dd47b54baea ("drm/i915: Add live selftests for indirect ctx batchbuffers") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200429132425.GE815283@mwanda	2020-04-29 15:16:35 +01:00
Nathan Chancellor	2ea4a7ba9b	drm/i915/gt: Avoid uninitialized use of rpcurupei in frequency_show When building with clang + -Wuninitialized: drivers/gpu/drm/i915/gt/debugfs_gt_pm.c:407:7: warning: variable 'rpcurupei' is uninitialized when used here [-Wuninitialized] rpcurupei, ^~~~~~~~~ drivers/gpu/drm/i915/gt/debugfs_gt_pm.c:304:16: note: initialize the variable 'rpcurupei' to silence this warning u32 rpcurupei, rpcurup, rpprevup; ^ = 0 1 warning generated. rpupei is assigned twice; based on the second argument to intel_uncore_read, it seems this one should have been assigned to rpcurupei. Fixes: 9c878557b1eb ("drm/i915/gt: Use the RPM config register to determine clk frequencies") Link: https://github.com/ClangBuiltLinux/linux/issues/1016 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200429030051.920203-1-natechancellor@gmail.com	2020-04-29 07:46:21 +01:00
Chris Wilson	f6a7c21c99	drm/i915/execlists: Verify we don't submit two identical CCIDs Check that we do not submit two contexts into ELSP with the same CCID [upper portion of the descriptor]. References: https://gitlab.freedesktop.org/drm/intel/-/issues/1793 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-3-chris@chris-wilson.co.uk	2020-04-28 22:17:36 +01:00
Chris Wilson	5c4a53e3b1	drm/i915/execlists: Track inflight CCID The presumption is that by using a circular counter that is twice as large as the maximum ELSP submission, we would never reuse the same CCID for two inflight contexts. However, if we continually preempt an active context such that it always remains inflight, it can be resubmitted with an arbitrary number of paired contexts. As each of its paired contexts will use a new CCID, eventually it will wrap and submit two ELSP with the same CCID. Rather than use a simple circular counter, switch over to a small bitmap of inflight ids so we can avoid reusing one that is still potentially active. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796 Fixes: 2935ed5339c4 ("drm/i915: Remove logical HW ID") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.5+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-2-chris@chris-wilson.co.uk	2020-04-28 22:17:36 +01:00
Chris Wilson	2632f174a2	drm/i915/execlists: Avoid reusing the same logical CCID The bspec is confusing on the nature of the upper 32bits of the LRC descriptor. Once upon a time, it said that it uses the upper 32b to decide if it should perform a lite-restore, and so we must ensure that each unique context submitted to HW is given a unique CCID [for the duration of it being on the HW]. Currently, this is achieved by using a small circular tag, and assigning every context submitted to HW a new id. However, this tag is being cleared on repinning an inflight context such that we end up re-using the 0 tag for multiple contexts. To avoid accidentally clearing the CCID in the upper 32bits of the LRC descriptor, split the descriptor into two dwords so we can update the GGTT address separately from the CCID. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796 Fixes: 2935ed5339c4 ("drm/i915: Remove logical HW ID") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.5+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-1-chris@chris-wilson.co.uk	2020-04-28 22:17:36 +01:00
Chris Wilson	96a4faf524	drm/i915/selftests: Tweak the tolerance for clock ticks to 12.5% Give a small bump for our tolerance on comparing the expected vs measured clock ticks/time from 10% to 12.5% to accommodate a bad result on Sandybridge that was off by 10.3%. Hopefully, that is the worst we will see. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1802 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428114307.5153-1-chris@chris-wilson.co.uk	2020-04-28 14:25:21 +01:00
Colin Ian King	d631461d5c	drm/i915/gt: fix spelling mistake "evalution" -> "evaluation" There is a spelling mistaking in a pr_notice message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200428084920.1035125-1-colin.king@canonical.com	2020-04-28 09:53:59 +01:00
Chris Wilson	6dc0d028f5	drm/i915/gt: Fix up clock frequency The bspec lists both the clock frequency and the effective interval. The interval corresponds to observed behaviour, so adjust the frequency to match. v2: Mika rightfully asked if we could measure the clock frequency from a selftest. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200427154554.12736-1-chris@chris-wilson.co.uk	2020-04-27 17:34:33 +01:00
Chris Wilson	4243cd5388	drm/i915/gt: Sanitize GT first We see that if the HW doesn't actually sleep, the HW may eat the poison we set in its write-only HWSP during sanitize: intel_gt_resume.part.8: 0000:00:02.0 __gt_unpark: 0000:00:02.0 gt_sanitize: 0000:00:02.0 force:yes process_csb: 0000:00:02.0 vcs0: cs-irq head=5, tail=90 process_csb: 0000:00:02.0 vcs0: csb[0]: status=0x5a5a5a5a:0x5a5a5a5a assert_pending_valid: Nothing pending for promotion! The CS TAIL pointer should have been reset by reset_csb_pointers(), so in this case it is likely that we have read back from the CPU cache and so we must clflush our control over that page. In doing so, push the sanitisation to the start of the GT sequence so that our poisoning is assuredly before we start talking to the HW. References: https://gitlab.freedesktop.org/drm/intel/-/issues/1794 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200427084000.10999-1-chris@chris-wilson.co.uk	2020-04-27 11:39:23 +01:00
Chris Wilson	2759e39535	drm/i915/gt: Check cacheline is valid before acquiring The hwsp_cacheline pointer from i915_request is very, very flimsy. The i915_request.timeline (and the hwsp_cacheline) are lost upon retiring (after an RCU grace). Therefore we need to confirm that once we have the right pointer for the cacheline, it is not in the process of being retired and disposed of before we attempt to acquire a reference to the cacheline. <3>[ 547.208237] BUG: KASAN: use-after-free in active_debug_hint+0x6a/0x70 [i915] <3>[ 547.208366] Read of size 8 at addr ffff88822a0d2710 by task gem_exec_parall/2536 <4>[ 547.208547] CPU: 3 PID: 2536 Comm: gem_exec_parall Tainted: G U 5.7.0-rc2-ged7a286b5d02d-kasan_117+ #1 <4>[ 547.208556] Hardware name: Dell Inc. XPS 13 9350/, BIOS 1.4.12 11/30/2016 <4>[ 547.208564] Call Trace: <4>[ 547.208579] dump_stack+0x96/0xdb <4>[ 547.208707] ? active_debug_hint+0x6a/0x70 [i915] <4>[ 547.208719] print_address_description.constprop.6+0x16/0x310 <4>[ 547.208841] ? active_debug_hint+0x6a/0x70 [i915] <4>[ 547.208963] ? active_debug_hint+0x6a/0x70 [i915] <4>[ 547.208975] __kasan_report+0x137/0x190 <4>[ 547.209106] ? active_debug_hint+0x6a/0x70 [i915] <4>[ 547.209127] kasan_report+0x32/0x50 <4>[ 547.209257] ? i915_gemfs_fini+0x40/0x40 [i915] <4>[ 547.209376] active_debug_hint+0x6a/0x70 [i915] <4>[ 547.209389] debug_print_object+0xa7/0x220 <4>[ 547.209405] ? lockdep_hardirqs_on+0x348/0x5f0 <4>[ 547.209426] debug_object_assert_init+0x297/0x430 <4>[ 547.209449] ? debug_object_free+0x360/0x360 <4>[ 547.209472] ? lock_acquire+0x1ac/0x8a0 <4>[ 547.209592] ? intel_timeline_read_hwsp+0x4f/0x840 [i915] <4>[ 547.209737] ? i915_active_acquire_if_busy+0x66/0x120 [i915] <4>[ 547.209861] i915_active_acquire_if_busy+0x66/0x120 [i915] <4>[ 547.209990] ? __live_alloc.isra.15+0xc0/0xc0 [i915] <4>[ 547.210005] ? rcu_read_lock_sched_held+0xd0/0xd0 <4>[ 547.210017] ? print_usage_bug+0x580/0x580 <4>[ 547.210153] intel_timeline_read_hwsp+0xbc/0x840 [i915] <4>[ 547.210284] __emit_semaphore_wait+0xd5/0x480 [i915] <4>[ 547.210415] ? i915_fence_get_timeline_name+0x110/0x110 [i915] <4>[ 547.210428] ? lockdep_hardirqs_on+0x348/0x5f0 <4>[ 547.210442] ? _raw_spin_unlock_irq+0x2a/0x40 <4>[ 547.210567] ? __await_execution.constprop.51+0x2e0/0x570 [i915] <4>[ 547.210706] i915_request_await_dma_fence+0x8f7/0xc70 [i915] Fixes: 85bedbf191e8 ("drm/i915/gt: Eliminate the trylock for reading a timeline's hwsp") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: <stable@vger.kernel.org> # v5.6+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200427093038.29219-1-chris@chris-wilson.co.uk	2020-04-27 11:39:23 +01:00
Chris Wilson	68ace460c5	drm/i915/execlists: Check preempt-timeout target before submit_ports We evaluate active, which is a pointer into execlists->inflight[] during dequeue to decide how long a preempt-timeout we need to apply. However, as soon as we do the submit_ports, the HW may send its ACK interrupt causing us to promote execlists->pending[] tp execlists->inflight[], overwriting the value of active. We know *active is only stable until we submit (as we only submit when there is no pending promotion). [ 16.102328] BUG: KCSAN: data-race in execlists_dequeue+0x1449/0x1600 [i915] [ 16.102356] [ 16.102375] race at unknown origin, with read to 0xffff8881e9500488 of 8 bytes by task 429 on cpu 1: [ 16.102780] execlists_dequeue+0x1449/0x1600 [i915] [ 16.103160] __execlists_submission_tasklet+0x48/0x60 [i915] [ 16.103540] execlists_submit_request+0x38e/0x3c0 [i915] [ 16.103940] submit_notify+0x8f/0xc0 [i915] [ 16.104308] __i915_sw_fence_complete+0x61/0x420 [i915] [ 16.104683] i915_sw_fence_complete+0x58/0x80 [i915] [ 16.105054] i915_sw_fence_commit+0x16/0x20 [i915] [ 16.105457] __i915_request_queue+0x60/0x70 [i915] [ 16.105843] i915_gem_do_execbuffer+0x2d6b/0x4230 [i915] [ 16.106227] i915_gem_execbuffer2_ioctl+0x2b0/0x580 [i915] [ 16.106257] drm_ioctl_kernel+0xe9/0x130 [ 16.106279] drm_ioctl+0x27d/0x45e [ 16.106311] ksys_ioctl+0x89/0xb0 [ 16.106336] __x64_sys_ioctl+0x42/0x60 [ 16.106370] do_syscall_64+0x6e/0x2c0 [ 16.106397] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200426094231.21995-1-chris@chris-wilson.co.uk	2020-04-27 11:36:59 +01:00
Mika Kuoppala	b8a1181122	drm/i915: Use indirect ctx bb to mend CMD_BUF_CCTL Use indirect ctx bb to load cmd buffer control value from context image to avoid corruption. v2: add to lrc layout (Chris) v3: end to a cacheline (Chris) v4: add to lrc fixed (Chris) v5: value in offset+1 Testcase: igt/i915_selftest/gt_lrc Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200424230632.30333-1-mika.kuoppala@linux.intel.com	2020-04-25 19:08:56 +01:00
Mika Kuoppala	1dd47b54ba	drm/i915: Add live selftests for indirect ctx batchbuffers Indirect ctx batchbuffers are a hw feature of which batch can be run, by hardware, during context restoration stage. Driver can setup a batchbuffer and also an offset into the context image. When context image is marshalled from memory to registers, and when the offset from the start of context register state is equal of what driver pre-determined, batch will run. So one can manipulate context restoration process at cacheline granularity, given some limitations, as you need to have rudimentaries in place before you can run a batch. Add selftest which will write the ring start register to a canary spot. This will test that hardware will run a batchbuffer for the context in question. v2: request wait fix, naming (Chris) v3: test order (Chris) v4: rebase Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200424214841.28076-3-mika.kuoppala@linux.intel.com	2020-04-25 19:08:18 +01:00
Mika Kuoppala	685d21096f	drm/i915: Add per ctx batchbuffer wa for timestamp Restoration of a previous timestamp can collide with updating the timestamp, causing a value corruption. Combat this issue by using indirect ctx bb to modify the context image during restoring process. We can preload value into scratch register. From which we then do the actual write with LRR. LRR is faster and thus less error prone as probability of race drops. v2: tidying (Chris) v3: lrr for all engines v4: grp v5: reg bit v6: wa_bb_offset, virtual engines (Chris) References: HSDES#16010904313 Testcase: igt/i915_selftest/gt_lrc Suggested-by: Joseph Koston <joseph.koston@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200424230546.30271-1-mika.kuoppala@linux.intel.com	2020-04-25 18:39:32 +01:00
Mika Kuoppala	168c6d231b	drm/i915: Add engine scratch register to live_lrc_fixed General purpose registers are per engine and in a fixed location. Add to live_lrc_fixed. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200424214841.28076-1-mika.kuoppala@linux.intel.com	2020-04-25 17:58:33 +01:00
Chris Wilson	9c878557b1	drm/i915/gt: Use the RPM config register to determine clk frequencies For many configuration details within RC6 and RPS we are programming intervals for the internal clocks. From gen11, these clocks are configuration via the RPM_CONFIG and so for convenience, we would like to convert to/from more natural units (ns). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200424162805.25920-2-chris@chris-wilson.co.uk	2020-04-24 19:10:17 +01:00
Chris Wilson	555a322429	drm/i915/gt: Trace RPS events Add tracek to the RPS events (interrupts, worker, enabling, threshold selection, frequency setting), so that if we have to debug reticent HW we have some traces to start from. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200424162805.25920-1-chris@chris-wilson.co.uk	2020-04-24 18:38:46 +01:00
Chris Wilson	1ebf7aaf3a	drm/i915/gt: Prefer soft-rc6 over RPS DOWN_TIMEOUT The RPS DOWN_TIMEOUT interrupt is signaled after a period of rc6, and upon receipt of that interrupt we reprogram the GPU clocks down to the next idle notch [to help convserve power during rc6]. However, on execlists, we benefit from soft-rc6 immediately parking the GPU and setting idle frequencies upon idling [within a jiffie], and here the interrupt prevents us from restarting from our last frequency. In the process, we can simply opt for a static pm_events mask and rely on the enable/disable interrupts to flush the worker on parking. This will reduce the amount of oscillation observed during steady workloads with microsleeps, as each time the rc6 timeout occurs we immediately follow with a waitboost for a dropped frame. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422001703.1697-1-chris@chris-wilson.co.uk	2020-04-24 17:20:58 +01:00
Chris Wilson	50689771c8	drm/i915: Only close vma we open The history of i915_vma_close() is confusing, as is its use. As the lifetime of the i915_vma is currently bounded by the object it is attached to, we needed a means of identify when a vma was no longer in use by userspace (via the user's fd). This is further complicated by that only ppgtt vma should be closed at the user's behest, as the ggtt were always shared. Now that we attach the vma to a lut on the user's context, the open count does indicate how many unique and open context/vm are referencing this vma from the user. As such, we can and should just use the open_count to track when the vma is still in use by userspace. It's a poor man's replacement for reference counting. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1193 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422190558.30509-1-chris@chris-wilson.co.uk	2020-04-24 11:24:45 +01:00
Mika Kuoppala	b4892e4404	drm/i915: Make define for lrc state offset More often than not, we need a byte offset into lrc register state from the start of the hw state. Make it so. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200423182355.21837-3-mika.kuoppala@linux.intel.com	2020-04-24 00:52:14 +01:00
Mika Kuoppala	f1cc6acf22	drm/i915/selftests: Add context batchbuffers registers to live_lrc_fixed Add per ctx bb and indirect ctx bb register locations to live_lrc_fixed for verification. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200423224159.22078-1-chris@chris-wilson.co.uk	2020-04-24 00:36:13 +01:00
Chris Wilson	b97f77baa8	drm/i915/gt: Check carefully for an idle engine in wait-for-idle intel_gt_wait_for_idle() tries to wait until all the outstanding requests are retired and the GPU is idle. As a side effect of retiring requests, we may submit more work to flush any pm barriers, and so the wait-for-idle tries to flush the background pm work and catch the new requests. However, if the work completed in the background before we were able to flush, it would queue the extra barrier request without us noticing -- and so we would return from wait-for-idle with one request remaining. (This breaks e.g. record_default_state where we need to wait until that barrier is retired, and it may slow suspend down by causing us to wait on the background retirement worker as opposed to immediately retiring the barrier.) However, since we track if there has been a submission since the engine pm barrier, we can very quickly detect if the idle barrier is still outstanding. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1763 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200423085940.28168-1-chris@chris-wilson.co.uk	2020-04-23 16:16:32 +01:00
Chris Wilson	36fe164d8d	drm/i915/gt: Carefully order virtual_submission_tasklet During the virtual engine's submission tasklet, we take the request and insert into the submission queue on each of our siblings. This seems quite simply, and so no problems with ordering. However, the sibling execlists' submission tasklets may run concurrently with the virtual engine's tasklet, submitting the request to HW before the virtual finishes its task of telling all the siblings. If this happens, the sibling tasklet may reorder the ve->sibling[] array that the virtual engine tasklet is processing. This can only reorder within the elements already processed by the virtual engine, nevertheless the race is detected by KCSAN: [ 185.580014] BUG: KCSAN: data-race in execlists_dequeue [i915] / virtual_submission_tasklet [i915] [ 185.580054] [ 185.580076] write to 0xffff8881f1919860 of 8 bytes by interrupt on cpu 2: [ 185.580553] execlists_dequeue+0x6ad/0x1600 [i915] [ 185.581044] __execlists_submission_tasklet+0x48/0x60 [i915] [ 185.581517] execlists_submission_tasklet+0xd3/0x170 [i915] [ 185.581554] tasklet_action_common.isra.0+0x42/0x90 [ 185.581585] __do_softirq+0xc8/0x206 [ 185.581613] run_ksoftirqd+0x15/0x20 [ 185.581641] smpboot_thread_fn+0x15a/0x270 [ 185.581669] kthread+0x19a/0x1e0 [ 185.581695] ret_from_fork+0x1f/0x30 [ 185.581717] [ 185.581736] read to 0xffff8881f1919860 of 8 bytes by interrupt on cpu 0: [ 185.582231] virtual_submission_tasklet+0x10e/0x5c0 [i915] [ 185.582265] tasklet_action_common.isra.0+0x42/0x90 [ 185.582291] __do_softirq+0xc8/0x206 [ 185.582315] run_ksoftirqd+0x15/0x20 [ 185.582340] smpboot_thread_fn+0x15a/0x270 [ 185.582368] kthread+0x19a/0x1e0 [ 185.582395] ret_from_fork+0x1f/0x30 [ 185.582417] We can prevent this race by checking for the ve->request after looking up the sibling array. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200423115315.26825-1-chris@chris-wilson.co.uk	2020-04-23 16:14:27 +01:00
Chris Wilson	15501287b1	drm/i915/execlists: Drop request-before-CS assertion When we migrated to execlists, one of the conditions we wanted to test for was whether the breadcrumb seqno was being written before the breadcumb interrupt was delivered. This was following on from issues observed on previous generations which were not so strongly ordered. With the removal of the missed interrupt detection, we have not reliable means of detecting the out-of-order seqno/interrupt but instead tried to assert that the relationship between the CS event interrupt and the breadwrite should be strongly ordered. However, Icelake proves it is possible for the HW implementation to forget about minor little details such as write ordering and so the order between processing the CS event and the breadcrumb is unreliable. Remove the unreliable assertion, but leave a debug telltale in case we have reason to suspect. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1658 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422141749.28709-1-chris@chris-wilson.co.uk	2020-04-22 17:17:50 +01:00

... 3 4 5 6 7 ...

1323 Commits