linux

iv/linux

Author	SHA1	Message	Date
Umesh Nerlige Ramappa	a5a6d92f77	drm/i915/perf: Simply use stream->ctx Earlier code used exclusive_stream to check for user passed context. Simplify this by accessing stream->ctx. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-7-umesh.nerlige.ramappa@intel.com	2022-10-27 12:36:24 -07:00
Umesh Nerlige Ramappa	cceb084905	drm/i915/perf: Enable bytes per clock reporting in OA XEHPSDV and DG2 provide a way to configure bytes per clock vs commands per clock reporting. Enable bytes per clock setting on enabling OA. Bspec: 51762 Bspec: 52201 v2: - Fix commit msg (Ashutosh) - Fix checkpatch issues v3: - s/commands/bytes/ in code comment and commmit msg Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-6-umesh.nerlige.ramappa@intel.com	2022-10-27 12:36:19 -07:00
Umesh Nerlige Ramappa	a5c3a3cbf0	drm/i915/perf: Determine gen12 oa ctx offset at runtime Some SKUs of same gen12 platform may have different oactxctrl offsets. For gen12, determine oactxctrl offsets at runtime. v2: (Lionel) - Move MI definitions to intel_gpu_commands.h - Ensure __find_reg_in_lri does read past context image size v3: (Ashutosh) - Drop unnecessary use of double underscores - fix find_reg_in_lri - Return error if oa context offset is U32_MAX - Error out if oa_ctx_ctrl_offset does not find offset v4: (Ashutosh) - Warn on odd MI LRI_LEN - Remove unnecessary check for valid_oactxctrl_offset - Drop valid_oactxctrl_offset macro v5: Drop unrelated comment Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-5-umesh.nerlige.ramappa@intel.com	2022-10-27 12:36:13 -07:00
Umesh Nerlige Ramappa	2d9da58521	drm/i915/perf: Fix noa wait predication for DG2 Predication for batch buffer commands changed in XEHPSDV. MI_BATCH_BUFFER_START predicates based on MI_SET_PREDICATE_RESULT register. The MI_SET_PREDICATE_RESULT register can only be modified with MI_SET_PREDICATE command. When configured, the MI_SET_PREDICATE command sets MI_SET_PREDICATE_RESULT based on bit 0 of MI_PREDICATE_RESULT_2. Use this to configure predication in noa_wait. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-4-umesh.nerlige.ramappa@intel.com	2022-10-27 12:35:59 -07:00
Umesh Nerlige Ramappa	81d5f7d914	drm/i915/perf: Add 32-bit OAG and OAR formats for DG2 Add new OA formats for DG2. MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893 v2: - Update commit title (Ashutosh) - Coding style fixes (Lionel) - 64 bit OA formats need UMD changes in GPUvis, drop for now and send in a separate series with UMD changes v3: - Update commit message to drop 64 bit related description Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #1 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-3-umesh.nerlige.ramappa@intel.com	2022-10-27 12:35:57 -07:00
Umesh Nerlige Ramappa	682aa4373f	drm/i915/perf: Fix OA filtering logic for GuC mode With GuC mode of submission, GuC is in control of defining the context id field that is part of the OA reports. To filter reports, UMD and KMD must know what sw context id was chosen by GuC. There is not interface between KMD and GuC to determine this, so read the upper-dword of EXECLIST_STATUS to filter/squash OA reports for the specific context. v2: Explain guc id stealing w.r.t OA use case Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026222102.5526-2-umesh.nerlige.ramappa@intel.com	2022-10-27 12:35:56 -07:00
Nathan Chancellor	a8a4f0467d	drm/i915: Fix CFI violations in gt_sysfs When booting with CONFIG_CFI_CLANG, there are numerous violations when accessing the files under /sys/devices/pci0000:00/0000:00:02.0/drm/card0/gt/gt0: $ cd /sys/devices/pci0000:00/0000:00:02.0/drm/card0/gt/gt0 $ grep . * id:0 punit_req_freq_mhz:350 rc6_enable:1 rc6_residency_ms:214934 rps_act_freq_mhz:1300 rps_boost_freq_mhz:1300 rps_cur_freq_mhz:350 rps_max_freq_mhz:1300 rps_min_freq_mhz:350 rps_RP0_freq_mhz:1300 rps_RP1_freq_mhz:350 rps_RPn_freq_mhz:350 throttle_reason_pl1:0 throttle_reason_pl2:0 throttle_reason_pl4:0 throttle_reason_prochot:0 throttle_reason_ratl:0 throttle_reason_status:0 throttle_reason_thermal:0 throttle_reason_vr_tdc:0 throttle_reason_vr_thermalert:0 $ sudo dmesg &\| grep "CFI failure at" [ 214.595903] CFI failure at kobj_attr_show+0x19/0x30 (target: id_show+0x0/0x70 [i915]; expected type: 0xc527b809) [ 214.596064] CFI failure at kobj_attr_show+0x19/0x30 (target: punit_req_freq_mhz_show+0x0/0x40 [i915]; expected type: 0xc527b809) [ 214.596407] CFI failure at kobj_attr_show+0x19/0x30 (target: rc6_enable_show+0x0/0x40 [i915]; expected type: 0xc527b809) [ 214.596528] CFI failure at kobj_attr_show+0x19/0x30 (target: rc6_residency_ms_show+0x0/0x270 [i915]; expected type: 0xc527b809) [ 214.596682] CFI failure at kobj_attr_show+0x19/0x30 (target: act_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.596792] CFI failure at kobj_attr_show+0x19/0x30 (target: boost_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.596893] CFI failure at kobj_attr_show+0x19/0x30 (target: cur_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.596996] CFI failure at kobj_attr_show+0x19/0x30 (target: max_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.597099] CFI failure at kobj_attr_show+0x19/0x30 (target: min_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.597198] CFI failure at kobj_attr_show+0x19/0x30 (target: RP0_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.597301] CFI failure at kobj_attr_show+0x19/0x30 (target: RP1_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.597405] CFI failure at kobj_attr_show+0x19/0x30 (target: RPn_freq_mhz_show+0x0/0xe0 [i915]; expected type: 0xc527b809) [ 214.597538] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.597701] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.597836] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.597952] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.598071] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.598177] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.598307] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.598439] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) [ 214.598542] CFI failure at kobj_attr_show+0x19/0x30 (target: throttle_reason_bool_show+0x0/0x50 [i915]; expected type: 0xc527b809) With kCFI, indirect calls are validated against their expected type versus actual type and failures occur when the two types do not match. The ultimate issue is that these sysfs functions are expecting to be called via dev_attr_show() but they may also be called via kobj_attr_show(), as certain files are created under two different kobjects that have two different sysfs_ops in intel_gt_sysfs_register(), hence the warnings above. When accessing the gt_ files under /sys/devices/pci0000:00/0000:00:02.0/drm/card0, which are using the same sysfs functions, there are no violations, meaning the functions are being called with the proper type. To make everything work properly, adjust certain functions to match the type of the ->show() and ->store() members in 'struct kobj_attribute'. Add a macro to generate functions for that can be called via both dev_attr_{show,store}() or kobj_attr_{show,store}() so that they can be called through both kobject locations without violating kCFI and adjust the attribute groups to account for this. Link: https://github.com/ClangBuiltLinux/linux/issues/1716 Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221013205909.1282545-1-nathan@kernel.org	2022-10-27 19:14:53 +02:00
Karolina Drobnik	67f99e3447	i915/i915_gem_context: Remove debug message in i915_gem_context_create_ioctl We know that as long as GEM context create ioctl succeeds, a context was created. There is no need to write about it, especially when such a message heavily pollutes dmesg and makes debugging actual errors harder. Since commit baa89ba3f1fe ("drm/i915/gem: initial conversion to new logging macros using coccinelle"), the logging for creating a new user context was moved under the driver debug output (for lack of a means for per-user logs, and a lack of user-focused drm.debug parameter). This only reveals how obnoxious having that spam be part of the driver debug logs, so remove it. [ from Chris Wilson ] Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Karolina Drobnik <karolina.drobnik@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221025091903.986819-1-karolina.drobnik@intel.com	2022-10-27 12:28:17 +02:00
Robert Beckett	78a07fe777	drm/i915: stop abusing swiotlb_max_segment swiotlb_max_segment used to return either the maximum size that swiotlb could bounce, or for Xen PV PAGE_SIZE even if swiotlb could bounce buffer larger mappings. This made i915 on Xen PV work as it bypasses the coherency aspect of the DMA API and can't cope with bounce buffering and this avoided bounce buffering for the Xen/PV case. So instead of adding this hack back, check for Xen/PV directly in i915 for the Xen case and otherwise use the proper DMA API helper to query the maximum mapping size. Replace swiotlb_max_segment() calls with dma_max_mapping_size(). In i915_gem_object_get_pages_internal() no longer consider max_segment only if CONFIG_SWIOTLB is enabled. There can be other (iommu related) causes of specific max segment sizes. Fixes: a2daa27c0c61 ("swiotlb: simplify swiotlb_max_segment") Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Robert Beckett <bob.beckett@collabora.com> Signed-off-by: Christoph Hellwig <hch@lst.de> [hch: added the Xen hack, rewrote the changelog] Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221020110308.1582518-1-hch@lst.de	2022-10-27 10:10:05 +01:00
Matthew Auld	b0feda9ce7	Revert "drm/i915/uapi: expose GTT alignment" The process for merging uAPI is to have UMD side ready and reviewed and merged before merging. Revert for now until that is ready. This reverts commit d54576a074a29d4901d0a693cd84e1a89057f694. Reported-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Michal Mrozek <michal.mrozek@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Yang A Shi <yang.a.shi@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221024101946.28974-1-matthew.auld@intel.com	2022-10-27 09:21:56 +01:00
Alan Previn	a7ac9d84b8	drm/i915/guc: Remove intel_context:number_committed_requests counter With the introduction of the delayed disable-sched behavior, we use the GuC's xarray of valid guc-id's as a way to identify if new requests had been added to a context when the said context is being checked for closure. Additionally that prior change also closes the race for when a new incoming request fails to cancel the pending delayed disable-sched worker. With these two complementary checks, we see no more use for intel_context:guc_state:number_committed_requests. Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221006225121.826257-3-alan.previn.teres.alexis@intel.com	2022-10-26 17:29:47 -07:00
Matthew Brost	8332109430	drm/i915/guc: Delay disabling guc_id scheduling for better hysteresis Add a delay, configurable via debugfs (default 34ms), to disable scheduling of a context after the pin count goes to zero. Disable scheduling is a costly operation as it requires synchronizing with the GuC. So the idea is that a delay allows the user to resubmit something before doing this operation. This delay is only done if the context isn't closed and less than a given threshold (default is 3/4) of the guc_ids are in use. Alan Previn: Matt Brost first introduced this patch back in Oct 2021. However no real world workload with measured performance impact was available to prove the intended results. Today, this series is being republished in response to a real world workload that benefited greatly from it along with measured performance improvement. Workload description: 36 containers were created on a DG2 device where each container was performing a combination of 720p 3d game rendering and 30fps video encoding. The workload density was configured in a way that guaranteed each container to ALWAYS be able to render and encode no less than 30fps with a predefined maximum render + encode latency time. That means the totality of all 36 containers and their workloads were not saturating the engines to their max (in order to maintain just enough headrooom to meet the min fps and max latencies of incoming container submissions). Problem statement: It was observed that the CPU core processing the i915 soft IRQ work was experiencing severe load. Using tracelogs and an instrumentation patch to count specific i915 IRQ events, it was confirmed that the majority of the CPU cycles were caused by the gen11_other_irq_handler() -> guc_irq_handler() code path. The vast majority of the cycles was determined to be processing a specific G2H IRQ: i.e. INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE. These IRQs are sent by GuC in response to i915 KMD sending H2G requests: INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_SET. Those H2G requests are sent whenever a context goes idle so that we can unpin the context from GuC. The high CPU utilization % symptom was limiting density scaling. Root Cause Analysis: Because the incoming execution buffers were spread across 36 different containers (each with multiple contexts) but the system in totality was NOT saturated to the max, it was assumed that each context was constantly idling between submissions. This was causing a thrashing of unpinning contexts from GuC at one moment, followed quickly by repinning them due to incoming workload the very next moment. These event-pairs were being triggered across multiple contexts per container, across all containers at the rate of > 30 times per sec per context. Metrics: When running this workload without this patch, we measured an average of ~69K INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE events every 10 seconds or ~10 million times over ~25+ mins. With this patch, the count reduced to ~480 every 10 seconds or about ~28K over ~10 mins. The improvement observed is ~99% for the average counts per 10 seconds. Design awareness: Selftest impact. As temporary WA disable this feature for the selftests. Selftests are very timing sensitive and any change in timing can cause failure. A follow up patch will fixup the selftests to understand this delay. Design awareness: Race between guc_request_alloc and guc_context_close. If a context close is issued while there is a request submission in flight and a delayed schedule disable is pending, guc_context_close and guc_request_alloc will race to cancel the delayed disable. To close the race, make sure that guc_request_alloc waits for guc_context_close to finish running before checking any state. Design awareness: GT Reset event. If a gt reset is triggered, as preparation steps, add an additional step to ensure all contexts that have a pending delay-disable-schedule task be flushed of it. Move them directly into the closed state after cancelling the worker. This is okay because the existing flow flushes all yet-to-arrive G2H's dropping them anyway. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221006225121.826257-2-alan.previn.teres.alexis@intel.com	2022-10-26 17:29:43 -07:00
Alan Previn	befb231d5d	drm/i915/guc: Fix GuC error capture sizing estimation and reporting During GuC error capture initialization, we estimate the amount of size we need for the error-capture-region of the shared GuC-log-buffer. This calculation was incorrect so fix that. With the fixed calculation we can reduce the allocation of error-capture region from 4MB to 1MB (see note2 below for reasoning). Additionally, switch from drm_notice to drm_debug for the 3X spare size check since that would be impossible to hit without redesigning gpu_coredump framework to hold multiple captures. NOTE1: Even for 1x the min size estimation case, actually running out of space is a corner case because it can only occur if all engine instances get reset all at once and i915 isn't able extract the capture data fast enough within G2H handler worker. NOTE2: With the corrected calculation, a DG2 part required ~77K and a PVC required ~115K (1X min-est-size that is calculated as one-shot all-engine- reset scenario). Fixes: d7c15d76a554 ("drm/i915/guc: Check sizing of guc_capture output") Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris.p.wilson@intel.com> Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221026060506.1007830-2-alan.previn.teres.alexis@intel.com	2022-10-26 15:00:39 -07:00
Vinay Belgaumkar	37d52e446e	drm/i915/slpc: Use platform limits for min/max frequency GuC will set the min/max frequencies to theoretical max on ATS-M. This will break kernel ABI, so limit min/max frequency to RP0(platform max) instead. Also modify the SLPC selftest to update the min frequency when we have a server part so that we can iterate between platform min and max. v2: Check softlimits instead of platform limits (Riana) v3: More review comments (Ashutosh) v4: No need to use saved_min_freq and other comments (Ashutosh) Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/7030 Acked-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221024225453.4856-1-vinay.belgaumkar@intel.com	2022-10-26 14:24:30 -07:00
Vinay Belgaumkar	f864a29afc	drm/i915/slpc: Optmize waitboost for SLPC Waitboost (when SLPC is enabled) results in a H2G message. This can result in thousands of messages during a stress test and fill up an already full CTB. There is no need to request for boost if min softlimit is equal or greater than it. v2: Add the tracing back, and check requested freq in the worker thread (Tvrtko) v3: Check requested freq in dec_waiters as well v4: Only check min_softlimit against boost_freq. Limit this optimization for server parts for now. v5: min_softlimit can be greater than boost (Ashutosh) Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221024171108.14373-1-vinay.belgaumkar@intel.com	2022-10-26 12:20:25 -07:00
Gustavo Sousa	e62f31e173	drm/i915/xelp: Add Wa_1806527549 Workaround to be applied to platforms using XE_LP graphics. BSpec: 52890 Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221019161334.119885-1-gustavo.sousa@intel.com	2022-10-26 10:50:13 -07:00
Alan Previn	5f8a3f65fc	drm/i915/guc: Add compute reglist for guc err capture We missed this at initial upstream because at that time none of the GuC enabled platforms had a compute engine. Add this now. Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221019072930.17755-3-alan.previn.teres.alexis@intel.com	2022-10-24 14:44:44 -07:00
Alan Previn	a894077890	drm/i915/guc: Add error-capture init warnings when needed If GuC is being used and we initialized GuC-error-capture, we need to be warning if we don't provide an error-capture register list in the firmware ADS, for valid GT engines. A warning makes sense as this would impact debugability without realizing why a reglist wasn't retrieved and reported by GuC. However, depending on the platform, we might have certain engines that have a register list for engine instance error state but not for engine class. Thus, add a check only to warn if the register list was non existent vs an empty list (use the empty lists to skip the warning). NOTE: if a future platform were to introduce new registers in place of what was an empty list on existing / legacy hardware engines no warning is provided as the empty list is meant to be used intentionally. As an example, if a future hardware were to add blitter engine-class-registers (new) on top of the legacy blitter engine-instance-register (HEAD, TAIL, etc.), no warning is generated. Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221019072930.17755-2-alan.previn.teres.alexis@intel.com	2022-10-24 14:44:41 -07:00
John Harrison	d7a8680ec9	drm/i915: Improve long running compute w/a for GuC submission A workaround was added to the driver to allow compute workloads to run 'forever' by disabling pre-emption on the RCS engine for Gen12. It is not totally unbound as the heartbeat will kick in eventually and cause a reset of the hung engine. However, this does not work well in GuC submission mode. In GuC mode, the pre-emption timeout is how GuC detects hung contexts and triggers a per engine reset. Thus, disabling the timeout means also losing all per engine reset ability. A full GT reset will still occur when the heartbeat finally expires, but that is a much more destructive and undesirable mechanism. The purpose of the workaround is actually to give compute tasks longer to reach a pre-emption point after a pre-emption request has been issued. This is necessary because Gen12 does not support mid-thread pre-emption and compute tasks can have long running threads. So, rather than disabling the timeout completely, just set it to a 'long' value. v2: Review feedback from Tvrtko - must hard code the 'long' value instead of determining it algorithmically. So make it an extra CONFIG definition. Also, remove the execlist centric comment from the existing pre-emption timeout CONFIG option given that it applies to more than just execlists. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Michal Mrozek <michal.mrozek@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221006213813.1563435-5-John.C.Harrison@Intel.com	2022-10-24 12:12:47 -07:00
John Harrison	47daf84a8b	drm/i915: Make the heartbeat play nice with long pre-emption timeouts Compute workloads are inherently not pre-emptible for long periods on current hardware. As a workaround for this, the pre-emption timeout for compute capable engines was disabled. This is undesirable with GuC submission as it prevents per engine reset of hung contexts. Hence the next patch will re-enable the timeout but bumped up by an order of magnitude. However, the heartbeat might not respect that. Depending upon current activity, a pre-emption to the heartbeat pulse might not even be attempted until the last heartbeat period. Which means that only one period is granted for the pre-emption to occur. With the aforesaid bump, the pre-emption timeout could be significantly larger than this heartbeat period. So adjust the heartbeat code to take the pre-emption timeout into account. When it reaches the final (high priority) period, it now ensures the delay before hitting reset is bigger than the pre-emption timeout. v2: Fix for selftests which adjust the heartbeat period manually. v3: Add FIXME comment about selftests. Add extra FIXME comment and drm_notices when setting heartbeat to a non-default value (review feedback from Tvrtko) Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221006213813.1563435-4-John.C.Harrison@Intel.com	2022-10-24 12:12:36 -07:00
John Harrison	c3bd49cd9a	drm/i915: Fix compute pre-emption w/a to apply to compute engines An earlier patch added support for compute engines. However, it missed enabling the anti-pre-emption w/a for the new engine class. So move the 'compute capable' flag earlier and use it for the pre-emption w/a test. Fixes: c674c5b9342e ("drm/i915/xehp: CCS should use RCS setup functions") Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: "Michał Winiarski" <michal.winiarski@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Ramalingam C <ramalingam.c@intel.com> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221006213813.1563435-3-John.C.Harrison@Intel.com	2022-10-24 12:12:21 -07:00
John Harrison	568944af44	drm/i915/guc: Limit scheduling properties to avoid overflow GuC converts the pre-emption timeout and timeslice quantum values into clock ticks internally. That significantly reduces the point of 32bit overflow. On current platforms, worst case scenario is approximately 110 seconds. Rather than allowing the user to set higher values and then get confused by early timeouts, add limits when setting these values. v2: Add helper functions for clamping (review feedback from Tvrtko). v3: Add a bunch of BUG_ON range checks in addition to the checks already in the clamping functions (Tvrtko) Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221006213813.1563435-2-John.C.Harrison@Intel.com	2022-10-24 12:11:59 -07:00
Andrzej Hajda	c61aa7407d	drm/i915/gt: use intel_uncore_rmw when appropriate This patch replaces all occurences of the form intel_uncore_write(reg, intel_uncore_read(reg) OP val) with intel_uncore_rmw. Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221019143818.244339-2-andrzej.hajda@intel.com	2022-10-24 20:33:58 +02:00
Andrzej Hajda	5490c50438	drm/i915: use intel_uncore_rmw when appropriate This patch replaces all occurences of the form intel_uncore_write(reg, intel_uncore_read(reg) OP val) with intel_uncore_rmw. Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221019143818.244339-1-andrzej.hajda@intel.com	2022-10-24 20:32:55 +02:00
Matt Roper	a47e8a46a7	drm/i915/xelpg: Fix write to MTL_MCR_SELECTOR A misplaced closing parenthesis caused the groupid/instanceid values to be considered part of the ternary operator's condition instead of being OR'd into the resulting value. Fixes: f32898c94a10 ("drm/i915/xelpg: Add multicast steering") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221019222437.3035182-1-matthew.d.roper@intel.com	2022-10-20 19:07:42 -07:00
Tvrtko Ursulin	6407cf5332	drm/i915/selftests: Stop using kthread_stop() Since a7c01fa93aeb ("signal: break out of wait loops on kthread_stop()") kthread_stop() started asserting a pending signal which wreaks havoc with a few of our selftests. Mainly because they are not fully expecting to handle signals, but also cutting the intended test runtimes short due signal_pending() now returning true (via __igt_timeout), which therefore breaks both the patterns of: kthread_run() ..sleep for igt_timeout_ms to allow test to exercise stuff.. kthread_stop() And check for errors recorded in the thread. And also: Main thread \| Test thread ---------------+------------------------------ kthread_run() \| kthread_stop() \| do stuff until __igt_timeout \| -- exits early due signal -- Where this kthread_stop() was assume would have a "join" semantics, which it would have had if not the new signal assertion issue. To recap, threads are now likely to catch a previously impossible ERESTARTSYS or EINTR, marking the test as failed, or have a pointlessly short run time. To work around this start using kthread_work(er) API which provides an explicit way of waiting for threads to exit. And for cases where parent controls the test duration we add explicit signaling which threads will now use instead of relying on kthread_should_stop(). Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221020130841.3845791-1-tvrtko.ursulin@linux.intel.com	2022-10-20 22:06:19 +01:00
Ville Syrjälä	03eababbf3	drm/i915: s/HAS_BAR2_SMEM_STOLEN/HAS_LMEMBAR_SMEM_STOLEN/ The fact that LMEMBAR is BAR2 should be of no real interest to anyone. So use the name of the BAR rather than its index. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221005154159.18750-3-ville.syrjala@linux.intel.com Acked-by: Matthew Auld <matthew.auld@intel.com>	2022-10-20 22:35:42 +03:00
Ville Syrjälä	0492a34c83	drm/i915: Name our BARs based on the spec We use all kinds of weird names for our base address registers. Take the names from the spec and stick to them to avoid confusing everyone. The only exceptions are IOBAR and LMEMBAR since naming them IOBAR_BAR and LMEMBAR_BAR looks too funny, and yet I think that adding the _BAR to GTTMMADR & co. (which don't have one in the spec name) does make it more clear what they are. And IOBAR vs. GTTMMADR_BAR also looks a bit too inconsistent for my taste. v2: Fix gvt build v3: Add GEN2_IO_BAR for completeness Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221005195646.17201-1-ville.syrjala@linux.intel.com Acked-by: Matthew Auld <matthew.auld@intel.com>	2022-10-20 21:08:42 +03:00
Ville Syrjälä	5bfcff516c	drm/i915: Extract intel_mmio_bar() We have the same code to determine the MMIO BAR in two places. Collect it to a single place. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221005154159.18750-1-ville.syrjala@linux.intel.com Reviewed-by: Matthew Auld <matthew.auld@intel.com>	2022-10-20 17:31:22 +03:00
Nirmoy Das	6667d78a11	drm/i915: Refactor ttm ghost obj detection Currently i915_ttm_to_gem() returns NULL for ttm ghost object which makes it unclear when we should add a NULL check for a caller of i915_ttm_to_gem() as ttm ghost objects are expected behaviour for certain cases. Create a separate function to detect ttm ghost object and use that in places where we expect a ghost obj from ttm. Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014131427.21102-1-nirmoy.das@intel.com	2022-10-19 13:29:21 +01:00
Matt Roper	2d3093fd5e	drm/i915/pvc: Update forcewake domain for CCS register ranges The bspec was just updated with a correction to the forcewake domain required when accessing registers in the CCS engine ranges (0x1a000 - 0x1ffff and 0x26000 - 0x27fff) on PVC; these ranges require a wake on the RENDER domain, not the GT domain. Bspec: 67609 Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Harish Chegondi <harish.chegondi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014233004.1053678-1-matthew.d.roper@intel.com	2022-10-18 14:44:50 -07:00
Daniele Ceraolo Spurio	21f213e67e	drm/i915/huc: bump timeout for delayed load and reduce print verbosity We're observing sporadic HuC delayed load timeouts in CI, due to mei_pxp binding completing later than we expected. HuC is still loaded when the bind occurs, but in the meantime i915 has started allowing submission to the VCS engines even if HuC is not there. In most of the cases I've observed, the timeout was due to the init/resume of another driver between i915 and mei hitting errors and thus adding an extra delay, but HuC was still loaded before userspace could submit, because the whole resume process time was increased by the delays. Given that there is no upper bound to the delay that can be introduced by other drivers, I've reached the following compromise with the media team: 1) i915 is going to bump the timeout to 5s, to reduce the probability of reaching it. We still expect HuC to be loaded before userspace starts submitting, so increasing the timeout should have no impact on normal operations, but in case something weird happens we don't want to stall video submissions for too long. 2) The media driver will cope with the failing submissions that manage to go through between i915 init/resume complete and HuC loading, if any ever happen. This could cause a small corruption of video playback immediately after a resume (we should be safe on boot because the media driver polls the HUC_STATUS ioctl before starting submissions). Since we're accepting the timeout as a valid outcome, I'm also reducing the print verbosity from error to notice. v2: use separate prints for MEI GSC and MEI PXP init timeouts (John) v3: add MISSING_CASE to the if-else chain (John) References: https://gitlab.freedesktop.org/drm/intel/-/issues/7033 Fixes: 27536e03271d ("drm/i915/huc: track delayed HuC load with a fence") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tony Ye <tony.ye@intel.com> Cc: John Harrison <john.c.harrison@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221013203245.1801788-1-daniele.ceraolospurio@intel.com	2022-10-17 13:36:05 -07:00
Matt Roper	a7ec65fc7e	drm/i915/xelpmp: Add multicast steering for media GT MTL's media IP (Xe_LPM+) only has a single type of steering ("OAADDRM") which selects between media slice 0 and media slice 1. We'll always steer to media slice 0 unless it is fused off (which is the case when VD0, VE0, and SFC0 are all reported as unavailable). Bspec: 67789 Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-15-matthew.d.roper@intel.com	2022-10-17 10:18:50 -07:00
Matt Roper	f32898c94a	drm/i915/xelpg: Add multicast steering MTL's graphics IP (Xe_LPG) once again changes the multicast register types and steering details. Key changes from past platforms: * The number of instances of some MCR types (NODE, OAAL2, and GAM) vary according to the MTL subplatform and cannot be read from fuse registers. However steering to instance #0 will always provided a non-terminated value, so we can lump these all into a single "instance0" table. * The MCR steering register (and its bitfields) has changed. Unlike past platforms, we will be explicitly steering all types of MCR accesses, including those for "SLICE" and "DSS" ranges; we no longer rely on implicit steering. On previous platforms, various hardware/firmware agents that needed to access registers typically had their own steering control registers, allowing them to perform multicast steering without clobbering the CPU/kernel steering. Starting with MTL, more of these agents now share a single steering register (0xFD4) and it is no longer safe for us to assume that the value will remain unchanged from how we initialized it during startup. There is also a slight chance of race conditions between the driver and a hardware/firmware agent, so the hardware provides a semaphore register that can be used to coordinate access to the steering register. Support for the semaphore register will be introduced in a future patch. v2: - Use Xe_LPG terminology instead of "MTL 3D" since it's the IP version we're matching on now rather than the platform. - Don't combine l3bank and mslice masks into a union. It's not related to the other changes here and we might still need both of them on some future platform. - Separate debug dumping of steering settings to a separate helper function. (Tvrtko) - Update debug dumping to include DSS ranges (and future-proof it so that any new ranges added on future platforms will also be dumped). - Restore MULTICAST bit at the end of rw_with_mcr_steering_fw() if we cleared it. Also force the MULTICAST bit to true at the beginning of multicast writes just to be safe. (Bala) Bspec: 67788, 67112 Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-14-matthew.d.roper@intel.com	2022-10-17 10:18:41 -07:00
Matt Roper	58bc2453ab	drm/i915: Define multicast registers as a new type Rather than treating multicast registers as 'i915_reg_t' let's define them as a completely new type. This will allow the compiler to help us make sure we're using multicast-aware functions to operate on multicast registers. This plan does break down a bit in places where we're just maintaining heterogeneous lists of registers (e.g., various MMIO whitelists used by perf, GVT, etc.) rather than performing reads/writes. We only really care about the offset in those cases, so for now we can "cast" the registers as non-MCR, leaving us with a list of i915_reg_t's, but we may want to look for better ways to store mixed collections of i915_reg_t and i915_mcr_reg_t in the future. v2: - Add TLB invalidation registers v3: - Make type checking of i915_mmio_reg_offset() stricter. It will accept either i915_reg_t or i915_mcr_reg_t, but will now raise a compile error if any other type is passed, even if that type contains a 'reg' field. (Jani) - Drop a ton of GVT changes; allowing i915_mmio_reg_offset() to take either an i915_reg_t or an i915_mcr_reg_t means that the huge lists of MMIO_D*() macros used in GVT will continue to work without modification. We need only make changes to structures that have an explicit i915_reg_t in them now. Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-13-matthew.d.roper@intel.com	2022-10-17 10:16:35 -07:00
Matt Roper	9e49bda902	drm/i915/gt: Add MCR-specific workaround initializers Let's be more explicit about which of our workarounds are updating MCR registers. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-12-matthew.d.roper@intel.com	2022-10-17 10:16:22 -07:00
Matt Roper	cf35f6afb9	drm/i915/guc: Handle save/restore of MCR registers explicitly MCR registers can be placed on the GuC's save/restore list, but at the moment they are always handled in a multicast manner (i.e., the GuC reads one instance to save the value and then does a multicast write to restore that single value to all instances). In the future the GuC will probably give us an alternate interface to do unicast per-instance save/restore operations, so we should be very clear about which registers on the list are MCR registers (and in the future which save/restore behavior we want for them). Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-11-matthew.d.roper@intel.com	2022-10-17 10:16:05 -07:00
Matt Roper	46c507f03a	drm/i915/gt: Always use MCR functions on multicast registers Rather than relying on the implicit behavior of intel_uncore_() functions, let's always use the intel_gt_mcr_() functions to operate on multicast/replicated registers. v2: - Add TLB invalidation registers v3: - Switch more uncore operations in mmio_invalidate_full() to MCR operations for Xe_HP. (Bala) Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-10-matthew.d.roper@intel.com	2022-10-17 10:15:47 -07:00
Matt Roper	a9e69428b1	drm/i915: Define MCR registers explicitly Rather than using the same _MMIO() macro to define MCR registers as singleton registers, let's use a new MCR_REG() macro to make it clear that these registers are special and should be handled accordingly. For now MCR_REG() will still generate an i915_reg_t with the given offset, but we'll change that in future patches. Bspec: 66673, 66696, 66534, 67609 Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-9-matthew.d.roper@intel.com	2022-10-17 10:14:24 -07:00
Matt Roper	3068bec83e	drm/i915/gt: Add intel_gt_mcr_wait_for_reg_fw() Xe_HP has some MCR registers that need to be polled for completion of operations like TLB invalidation. Those registers are in the GAM range, which rolls up the status from each unit into the 'primary' instance's value. This makes it useful to have a dedicated 'wait for register' function that handles this on MCR registers, similar to the __intel_wait_for_register_fw() function we already have for regular registers. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-8-matthew.d.roper@intel.com	2022-10-17 10:13:59 -07:00
Matt Roper	ab1b2d40d6	drm/i915/xehp: Check for faults on primary GAM On Xe_HP the fault registers are now in a multicast register range. However as part of the GAM these registers follow special rules and we need only read from the "primary" GAM's instance to get the information we need. So a single intel_gt_mcr_read_any() (which will automatically steer to the primary GAM) is sufficient; we don't need to loop over each instance of the MCR register. v2: - Update more instances of fault registers. (Bala) Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-7-matthew.d.roper@intel.com	2022-10-17 10:13:46 -07:00
Matt Roper	851435ec36	drm/i915/gt: Add intel_gt_mcr_multicast_rmw() operation There are cases where we wish to read from any non-terminated MCR register instance (or the primary instance in the case of GAM ranges), clear/set some bits, and then write the value back out to the register in a multicast manner. Adding a "multicast RMW" will avoid the need to open-code this. v2: - Return a u32 to align with the recent change to intel_uncore_rmw. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-6-matthew.d.roper@intel.com	2022-10-17 10:13:31 -07:00
Matt Roper	e4abeab946	drm/i915/gt: Correct prefix on a few registers We have a few registers that have existed for several hardware generations, but are only used by the driver on Xe_HP and beyond. In cases where the Xe_HP version of the register is now replicated and uses multicast behavior, but earlier generations were singleton, let's change the register prefix to "XEHP_" to help clarify that we're using the newer multicast form of the register. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-5-matthew.d.roper@intel.com	2022-10-17 10:13:16 -07:00
Matt Roper	fb8af92055	drm/i915/gt: Drop a few unused register definitions Let's drop a few register definitions that are unused anywhere in the driver today. Since the referenced offsets are part of what is now considered a multicast register region, the current definitions would not be correct for use on any future platform. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-4-matthew.d.roper@intel.com	2022-10-17 10:13:06 -07:00
Matt Roper	77fa9efc16	drm/i915/xehp: Create separate reg definitions for new MCR registers Starting in Xe_HP, several registers our driver works with have been converted from singleton registers into replicated registers with multicast behavior. Although the registers are still located at the same MMIO offsets as on previous platforms, let's duplicate the register definitions in preparation for upcoming patches that will handle multicast registers in a special manner. The registers that are now replicated on Xe_HP are: * PAT_INDEX (mslice replication) * FF_MODE2 (gslice replication) * COMMON_SLICE_CHICKEN3 (gslice replication) * SLICE_COMMON_ECO_CHICKEN1 (gslice replication) * SLICE_UNIT_LEVEL_CLKGATE (gslice replication) * LNCFCMOCS (lncf replication) Note that there are a couple places in selftest_mocs.c where the gen9 version of LNCFCMOCS is still used without regards for which platform we're on. Those cases are just doing an offset lookup and not issuing any CPU reads/writes of the register, so the potentially multicast nature of the register doesn't come into play. v2: - Add commit message note about the unconditional GEN9_LNCFCMOCS usage in selftest_mocs. (Bala) - Include some additional TLB registers. Bspec: 66534 Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-3-matthew.d.roper@intel.com	2022-10-17 10:12:54 -07:00
Matt Roper	dfa13f1bfc	drm/i915/gen8: Create separate reg definitions for new MCR registers Gen8 was the first time our hardware had multicast registers (or at least the first time the multicast nature was exposed and MMIO accesses could be steered). There are some registers that transitioned from singleton behavior to multicast during the gen7 -> gen8 transition; let's duplicate the register definitions for those registers in preparation for upcoming patches that will handle MCR registers in a special manner. The registers adjusted are: * MISCCPCTL * SAMPLER_INSTDONE * ROW_INSTDONE * ROW_CHICKEN2 * HALF_SLICE_CHICKEN1 * HALF_SLICE_CHICKEN3 v2: - Use the gen8 version of HALF_SLICE_CHICKEN3 in GVT's gen9 engine MMIO list. (Bala) - Update to the gen8 version of MISCCPCTL in a couple new workarounds that were recently added for DG2/PVC. (Bala) Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-2-matthew.d.roper@intel.com	2022-10-17 10:12:43 -07:00
Dale B Stimson	a6a924abf8	drm/i915/hwmon: Extend power/energy for XEHPSDV Extend hwmon power/energy for XEHPSDV especially per gt level energy usage. v2: Update to latest HWMON spec (Ashutosh) v3: Fix review comments (Ashutosh) v4: Fix review comments (Anshuman) v5: s/hwmon_device_register_with_info/ devm_hwmon_device_register_with_info/ (Ashutosh) v6: Change contact to intel-gfx (Rodrigo) GEN12_RPSTAT1 is available for all Gen12+ (Andi) Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Dale B Stimson <dale.b.stimson@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221013154526.2105579-8-ashutosh.dixit@intel.com	2022-10-17 14:57:50 +05:30
Ashutosh Dixit	4c2572fe0a	drm/i915/hwmon: Expose power1_max_interval Expose power1_max_interval, that is the tau corresponding to PL1, as a custom hwmon attribute. Some bit manipulation is needed because of the format of PKG_PWR_LIM_1_TIME in GT0_PACKAGE_RAPL_LIMIT register (1.x * power(2,y)). v2: Update date and kernel version in Documentation (Badal) v3: Cleaned up hwm_power1_max_interval_store() (Badal) v4: - Fixed review comments (Anshuman) - In hwm_power1_max_interval_store() get PKG_MAX_WIN from pkg_power_sku when it is valid (Ashutosh) - KernelVersion: 6.2, Date: February 2023 in doc (Tvrtko) v5: On some of the DGFX setups it is seen that although pkg_power_sku is valid the field PKG_WIN_MAX is not populated. So it is decided to stick to default value of PKG_WIN_MAX (Ashutosh) v6: Change contact to intel-gfx (Rodrigo) Fixed variable types in hwm_power1_max_interval_store (Andi) Documented PKG_MAX_WIN_DEFAULT (Andi) Removed else in hwm_attributes_visible (Andi) Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221013154526.2105579-7-ashutosh.dixit@intel.com	2022-10-17 14:57:23 +05:30
Ashutosh Dixit	c8939848f7	drm/i915/hwmon: Expose card reactive critical power Expose the card reactive critical (I1) power. I1 is exposed as power1_crit in microwatts (typically for client products) or as curr1_crit in milliamperes (typically for server). v2: Add curr1_crit functionality (Ashutosh) v3: Use HWMON_CHANNEL_INFO to define power1_crit, curr1_crit (Badal) v4: Use hwm_ prefix for static functions (Ashutosh) v5: KernelVersion: 6.2, Date: February 2023 in doc (Tvrtko) v6: Change contact to intel-gfx (Rodrigo) Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221013154526.2105579-6-ashutosh.dixit@intel.com	2022-10-17 14:55:28 +05:30
Dale B Stimson	c41b8bdcc2	drm/i915/hwmon: Show device level energy usage Use i915 HWMON to display device level energy input. v2: Updated the date and kernel version in feature description v3: - Cleaned up hwm_energy function and removed unused function i915_hwmon_energy_status_get (Ashutosh) v4: KernelVersion: 6.2, Date: February 2023 in doc (Tvrtko) v5: Change contact to intel-gfx (Rodrigo) Change return type of hwm_energy to void (Andi) Signed-off-by: Dale B Stimson <dale.b.stimson@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Riana Tauro <riana.tauro@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221013154526.2105579-5-ashutosh.dixit@intel.com	2022-10-17 14:53:50 +05:30

1 2 3 4 5 ...

1123349 Commits