linux

iv/linux

Author	SHA1	Message	Date
Ashutosh Dixit	1b44019a93	drm/i915/guc: Disable PL1 power limit when loading GuC firmware On dGfx, the PL1 power limit being enabled and set to a low value results in a low GPU operating freq. It also negates the freq raise operation which is done before GuC firmware load. As a result GuC firmware load can time out. Such timeouts were seen in the GL #8062 bug below (where the PL1 power limit was enabled and set to a low value). Therefore disable the PL1 power limit when allowed by HW when loading GuC firmware. v2: - Take mutex (to disallow writes to power1_max) across GuC reset/fw load - Add hwm_power_max_restore to error return code path v3 (Jani N): - Add/remove explanatory comments - Function renames - Type corrections - Locking annotation v4: - Don't hold the lock across GuC reset (Rodrigo) - New locking scheme (suggested by Rodrigo) - Eliminate rpm_get in power_max_disable/restore, not needed (Tvrtko) v5: - Fix uninitialized pl1en variable compile warning reported by kernel build robot by creating new err_rps label Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8062 Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230420164041.1428455-3-ashutosh.dixit@intel.com	2023-04-26 18:02:40 -04:00
Ashutosh Dixit	d81268ee1c	drm/i915/hwmon: Get mutex and rpm ref just once in hwm_power_max_write In preparation for follow-on patches, refactor hwm_power_max_write to take hwmon_lock and runtime pm wakeref at start of the function and release them at the end, therefore acquiring these just once each. Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230420164041.1428455-2-ashutosh.dixit@intel.com	2023-04-26 18:02:33 -04:00
John Harrison	80ab317990	drm/i915/guc: Actually return an error if GuC version range check fails Dan Carpenter pointed out that 'err' was not being set in the case where the GuC firmware version range check fails. Fix that. Note that while this is a bug fix for a previous patch (see Fixes tag below). It is an exceedingly low risk bug. The range check is asserting that the GuC firmware version is within spec. So it should not be possible to ever have a firmware file that fails this check. If larger version numbers are required in the future, that would be a backwards breaking spec change and thus require a major version bump, in which case an old i915 driver would not load that new version anyway. Fixes: 9bbba0667f37 ("drm/i915/guc: Use GuC submission API version number") Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230421224742.2357198-1-John.C.Harrison@Intel.com	2023-04-26 10:33:07 -07:00
Tejas Upadhyay	47d8b30296	drm/i915/mtl: Add workaround 14018778641 WA 18018781329 is applicable now across all MTL steppings. V2: - Remove IS_MTL check, code already running for MTL - Matt Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230424101749.3719600-1-tejas.upadhyay@intel.com	2023-04-25 18:48:51 +02:00
Tejas Upadhyay	e991b5244d	drm/i915/selftest: Record GT error for gt failure igt_live_test has pr_err dumped in case of some GT failures. It will be more informative regarding GT if we use gt_err instead. Cc: Andi Shyti <andi.shyti@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230424133607.3736825-1-tejas.upadhyay@intel.com	2023-04-25 18:03:57 +02:00
Fei Yang	a161b6dba6	drm/i915/mtl: workaround coherency issue for Media This patch implements Wa_22016122933. In MTL, memory writes initiated by the Media tile update the whole cache line, even for partial writes. This creates a coherency problem for cacheable memory if both CPU and GPU are writing data to different locations within a single cache line. This patch circumvents the issue by making CPU/GPU shared memory uncacheable (WC on CPU side, and PAT index 2 for GPU). Additionally, it ensures that CPU writes are visible to the GPU with an intel_guc_write_barrier(). While fixing the CTB issue, we noticed some random GSC firmware loading failure because the share buffers are cacheable (WB) on CPU side but uncached on GPU side. To fix these issues we need to map such shared buffers as WC on CPU side. Since such allocations are not all done through GuC allocator, to avoid too many code changes, the i915_coherent_map_type() is now hard coded to return WC for MTL. v2: Simplify the commit message(Matt). BSpec: 45101 Signed-off-by: Fei Yang <fei.yang@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230424182902.3663500-3-fei.yang@intel.com	2023-04-25 09:23:31 +02:00
Fei Yang	341ad0e8e2	drm/i915/mtl: Add PTE encode function PTE encode functions are platform dependent. This patch implements PTE functions for MTL, and ensures the correct PTE encode function is used by calling pte_encode function pointer instead of the hardcoded gen8 version of PTE encode. Fixes: b76c0deef627 ("drm/i915/mtl: Define MOCS and PAT tables for MTL") Signed-off-by: Fei Yang <fei.yang@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230424182902.3663500-2-fei.yang@intel.com	2023-04-25 09:23:28 +02:00
Andi Shyti	66ca1d8f22	drm/i915/i915_drv: Use i915 instead of dev_priv insied the file_priv structure In the process of renaming all instances of 'dev_priv' to 'i915', start using 'i915' within the i915_drv.h file. Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230421190026.294208-3-andi.shyti@linux.intel.com	2023-04-23 02:29:27 +02:00
Andi Shyti	64e22551b6	drm/i915/i915_drv: Use proper parameter naming in for_each_engine() for_each_engine() loops through engines in the GT, not in dev_priv. Because it's misleading, call it "gt__" instead of "dev_priv__". Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230421190026.294208-2-andi.shyti@linux.intel.com	2023-04-23 02:29:27 +02:00
Fei Yang	faca6aaa48	drm/i915/mtl: fix mocs selftest Media GT has a different base for MOCS register, need to apply gsi_offset to the mmio address if not using the intel_uncore_r/w functions for register access. Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Fei Yang <fei.yang@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230421182535.292670-3-andi.shyti@linux.intel.com	2023-04-23 02:11:23 +02:00
Madhumitha Tolakanahalli Pradeep	b76c0deef6	drm/i915/mtl: Define MOCS and PAT tables for MTL On MTL, GT can no longer allocate on LLC - only the CPU can. This, along with programming new register bits that MTL requires calls for a MOCS/PAT table update. Also the PAT index registers are multicasted for primary GT, and there is an address jump from index 7 to 8. This patch makes sure that these registers are programmed in the proper way. BSpec: 44509, 45101, 44235 Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Fei Yang <fei.yang@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230421182535.292670-2-andi.shyti@linux.intel.com	2023-04-23 02:11:21 +02:00
Fei Yang	7787af2565	drm/i915/mtl: Set has_llc=0 On MTL, LLC is not shared between GT and CPU, set has_llc=0. Signed-off-by: Fei Yang <fei.yang@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230420102349.15302-1-nirmoy.das@intel.com	2023-04-21 08:24:05 -07:00
Haridhar Kalvala	3bece767da	drm/i915/mtl: WA to clear RDOP clock gating Workaround implementation to clear RDOP clock gating. Bspec: 66622 Signed-off-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230418220446.2205509-5-radhakrishna.sripada@intel.com	2023-04-19 20:09:46 -07:00
Madhumitha Tolakanahalli Pradeep	514b8a79aa	drm/i915/mtl: Extend Wa_22011802037 to MTL A-step Wa_22011802037 was being applied to all graphics_ver 11 & 12. This patch updates the if statement to apply the W/A to right platforms and extends it to MTL-M:A step. v1.1: Fix checkpatch warning. v2: Change the check to reflect the wa at other places(Lucas) Bspec: 66622 Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230418220446.2205509-4-radhakrishna.sripada@intel.com	2023-04-19 15:33:32 -07:00
Tejas Upadhyay	0c29efa23f	drm/i915/selftests: Consider multi-gt instead of to_gt() In order to enable complete multi-GT, loop through all the GTs, rather than relying on the to_gt(), which only provides a reference to the primary GT. Problem appear when it runs on platform like MTL where different set of engines are possible on different GTs. Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230419060036.3422635-4-tejas.upadhyay@intel.com	2023-04-19 16:04:53 +02:00
Tejas Upadhyay	a347279dec	drm/i915/gem: Consider multi-gt instead of to_gt() In order to enable complete multi-GT, use the GT reference obtained directly from the engine, rather than relying on the to_gt(), which only provides a reference to the primary GT. Problem appear when it runs on platform like MTL where different set of engines are possible on different GTs. Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230419060036.3422635-3-tejas.upadhyay@intel.com	2023-04-19 16:04:53 +02:00
Tejas Upadhyay	a6704f4a54	drm/i915/gt: Consider multi-gt instead of to_gt() In order to enable complete multi-GT, use the GT reference obtained directly from the engine, rather than relying on the to_gt(), which only provides a reference to the primary GT. Problem appear when it runs on platform like MTL where different set of engines are possible on different GTs. Cc: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230419060036.3422635-2-tejas.upadhyay@intel.com	2023-04-19 16:04:52 +02:00
Andi Shyti	d1f3b5e92c	drm/i915: Make IRQ reset and postinstall multi-gt aware In multi-gt systems IRQs need to be reset and enabled per GT. This might add some redundancy when handling interrupts for engines that might not exist in every tile, but helps to keep the code cleaner and more understandable. Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230417235356.1291060-1-andi.shyti@linux.intel.com	2023-04-18 13:19:21 +02:00
Cong Liu	8bfbdadce8	drm/i915: Fix memory leaks in i915 selftests This patch fixes memory leaks on error escapes in function fake_get_pages Fixes: c3bfba9a2225 ("drm/i915: Check for integer truncation on scatterlist creation") Signed-off-by: Cong Liu <liucong2@kylinos.cn> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230414224109.1051922-1-andi.shyti@linux.intel.com	2023-04-15 22:38:54 +02:00
Lucas De Marchi	adfbae9ffe	drm/i915/gt: Avoid out-of-bounds access when loading HuC When HuC is loaded by GSC, there is no header definition for the kernel to look at and firmware is just handed to GSC. However when reading the version, it should still check the size of the blob to guarantee it's not incurring into out-of-bounds array access. If firmware is smaller than expected, the following message is now printed: # echo boom > /lib/firmware/i915/dg2_huc_gsc.bin # dmesg \| grep -i huc [drm] GT0: HuC firmware i915/dg2_huc_gsc.bin: invalid size: 5 < 184 [drm] ERROR GT0: HuC firmware i915/dg2_huc_gsc.bin: fetch failed -ENODATA ... Even without this change the size, header and signature are still checked by GSC when loading, so this only avoids the out-of-bounds array access. Fixes: a7b516bd981f ("drm/i915/huc: Add fetch support for gsc-loaded HuC binary") Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230413200349.3492571-1-lucas.demarchi@intel.com	2023-04-14 08:55:45 -07:00
Nirmoy Das	b90b044c64	drm/i915/mtl: Disable stolen memory backed FB for A0 Stolen memory is not usable for MTL A0 stepping beyond certain access size and we have no control over userspace access size of /dev/fb which can be backed by stolen memory. So disable stolen memory backed fb by setting i915->dsm.usable_size to zero. v2: remove hsdes reference and fix commit message(Andi) v3: use revid as we want to target SOC stepping(Radhakrishna) Cc: Matthew Auld <matthew.auld@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230404181342.23362-1-nirmoy.das@intel.com	2023-04-11 16:20:12 +02:00
Joonas Lahtinen	ea68a3e9d1	Merge drm/drm-next into drm-intel-gt-next Need to pull in commit from drm-next (earlier in drm-intel-next): 1eca0778f4b3 ("drm/i915: add struct i915_dsm to wrap dsm members together") In order to merge following patch to drm-intel-gt-next: https://patchwork.freedesktop.org/patch/530942/?series=114925&rev=6 Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2023-04-11 15:43:45 +03:00
Daniel Vetter	55bf14961d	Mediatek DRM Next for Linux 6.4 1. Add support for 10-bit overlays 2. Add MediaTek SoC DRM (vdosys1) support for mt8195 3. Change mmsys compatible for mt8195 mediatek-drm 4. Only trigger DRM HPD events if bridge is attached 5. Change the aux retries times when receiving AUX_DEFER -----BEGIN PGP SIGNATURE----- iQJMBAABCgA2FiEEACwLKSDmq+9RDv5P4cpzo8lZTiQFAmQ0mPkYHGNodW5rdWFu Zy5odUBrZXJuZWwub3JnAAoJEOHKc6PJWU4kv/wP/ixaRMppN9cIVB9UICxn31xJ vzMyXScZ2cei/0nlTtWh2BOiHuw5BPvdUK9W+FqNDTf/EgRE7q9TLvEyB6DGEGOM 2Ahkcqw2rSnigP5rGFUKt9zFoVIc6YtiCG2jS65Q3UhuoBvoQZckTi/HhKGlw9KQ piQCXZfTCSg67FHX109J1ASkh4B7hio8Pnk9X0+JqGI1U+y2RgWKhJV+o8BVmImE 3VFqEbqNPswy2GMYUWyRPTz89dw6ZPHswwbRPuOD5kGCY82QAZNxoVx8FkQD4GCv 1QTpZKb3dW8dUQBnkxXGtZ+8M/0H9ZjSvr4yjg0HF2HFQPof3UPie0MG7e4X1xUw 6spB4jSacvMHw89oIWg2eaIJ6D9iJVkevp8PiJsxS/pgvqHM4gCW9mCsEHxKVMQb M2I0Z5tyfmiljrsiK0TDyv5uKHuozW6SX/nmCfLwvL4c1k/KOZuA7JBAEH2zMTMs CBXMnFa9MgaVXdamEu/tbUOBfbaSsXde7dKWds1jYwIF3bAwRfuE/gtSkBVEqu1W EABKUZTyd8/VwfsaEaQSF7N3e7bBSf5tjGZH/m2RHp2Eu/FsfeWxZK8/l8LMmGnr 4gF/dHTNOFMBg469EXS8BKouh40kJJjoj0c1IAOp5wcmH73RuGgXExbnVu3W7siC fZPLDzUTSak5qC7tcpP1 =T2t7 -----END PGP SIGNATURE----- Merge tag 'mediatek-drm-next-6.4' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-next Mediatek DRM Next for Linux 6.4 1. Add support for 10-bit overlays 2. Add MediaTek SoC DRM (vdosys1) support for mt8195 3. Change mmsys compatible for mt8195 mediatek-drm 4. Only trigger DRM HPD events if bridge is attached 5. Change the aux retries times when receiving AUX_DEFER Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20230410233005.2572-1-chunkuang.hu@kernel.org	2023-04-11 12:28:10 +02:00
Daniel Vetter	b8d85bb505	Merge tag 'drm-msm-next-2023-04-10' of https://gitlab.freedesktop.org/drm/msm into drm-next main pull request for v6.4 Core Display: ============ * Bugfixes for error handling during probe * rework UBWC decoder programming * prepare_commit cleanup * bindings for SM8550 (MDSS, DPU), SM8450 (DP) * timeout calculation fixup * atomic: use drm_crtc_next_vblank_start() instead of our own custom thing to calculate the start of next vblank DP: == * interrupts cleanup DPU: === * DSPP sub-block flush on sc7280 * support AR30 in addition to XR30 format * Allow using REC_0 and REC_1 to handle wide (4k) RGB planes * Split the HW catalog into individual per-SoC files DSI: === * rework DSI instance ID detection on obscure platforms GPU: === * uapi C++ compatibility fix * a6xx: More robust gdsc reset * a3xx and a4xx devfreq support * update generated headers * various cleanups and fixes * GPU and GEM updates to avoid allocations which could trigger reclaim (shrinker) in fence signaling path * dma-fence deadline hint support and wait-boost * a640 speedbin support * a650 speedbin support Conflicts in drivers/gpu/drm/msm/adreno/adreno_gpu.c: Conflict between the 7fa5047a436b ("drm: Use of_property_present() for testing DT property presence") and 9f251f934012 ("drm/msm/adreno: Use OPP for every GPU generation"). The latter removed the of_ function call outright, so I went with what's in the PR unchanged. From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvwuj5tabyW910+N-B=5kFNAC7QNYoQ=0xi3roBjQvFFQ@mail.gmail.com Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2023-04-11 12:21:50 +02:00
Daniel Vetter	838ac90d8d	This tag contains additional habanalabs driver changes for v6.4: - uAPI changes: - Add a definition of a new Gaudi2 server type. This is used by userspace to know what is the connectivity between the accelerators inside the server - New features and improvements: - speedup h/w queues test in Gaudi2 to reduce device initialization times. - Firmware related fixes: - Fixes to the handshake protocol during f/w initialization. - Sync f/w events interrupt in hard reset to avoid warning message. - Improvements to extraction of the firmware version. - Misc bug fixes and code cleanups. Notable fixes are: - Multiple fixes for interrupt handling in Gaudi2. - Unmap mapped memory in case TLB invalidation fails. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEE7TEboABC71LctBLFZR1NuKta54AFAmQ0BEAACgkQZR1NuKta 54CxNQgArVWo/+IdKz4a+JtCHfHZm2lfPp7bl4Ck05PgF085lpX6bfO5NpTGAOJd AJJNX5faqqRakg3BdDZ9u2FxGRGQltfjJt1KVD78vlOaux6nMw83XWWYDmzYHwPT GYqPuluoiEQMqH4h95kwt3PyS9iILPon9gGIp4JVJCEWhFFa6TYyNr/rKdqNJUHz ujfhLpkkV8GZ7KwMdtWCSTuprG7icHyu3yolZmNJ0Fx8AIZ0CQacxIr4WLa2n4Tv CNnJYCTN9DNh8HfwDtm1wVb6Nq1am3DwZGOPXsy0Y8RhKf2VrGAFpW9wG07mL4Ep EbPqTtO7fT4KGhdZ4JcUeeRg5F3gTw== =Eit8 -----END PGP SIGNATURE----- Merge tag 'drm-habanalabs-next-2023-04-10' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux into drm-next This tag contains additional habanalabs driver changes for v6.4: - uAPI changes: - Add a definition of a new Gaudi2 server type. This is used by userspace to know what is the connectivity between the accelerators inside the server - New features and improvements: - speedup h/w queues test in Gaudi2 to reduce device initialization times. - Firmware related fixes: - Fixes to the handshake protocol during f/w initialization. - Sync f/w events interrupt in hard reset to avoid warning message. - Improvements to extraction of the firmware version. - Misc bug fixes and code cleanups. Notable fixes are: - Multiple fixes for interrupt handling in Gaudi2. - Unmap mapped memory in case TLB invalidation fails. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Oded Gabbay <ogabbay@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20230410124637.GA2441888@ogabbay-vm-u20.habana-labs.com	2023-04-11 12:02:38 +02:00
Tomer Tayar	56499c4615	accel/habanalabs: add missing error flow in hl_sysfs_init() hl_sysfs_fini() is called only if hl_sysfs_init() completes successfully. Therefore if hl_sysfs_init() fails, need to remove any sysfs group that was added until that point. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2023-04-08 10:44:23 +03:00
Moti Haimovski	31420f93b5	accel/habanalabs: speedup h/w queues test in Gaudi2 HW queues testing at driver load and after reset takes a substantial amount of time. This commit reduces the queues test time in Gaudi2 devices by running all the tests in parallel instead of one after the other. Time measurements on tests duration shows that the new method is almost x100 faster than the serial approach. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2023-04-08 10:43:34 +03:00
Dani Liberman	91204e4703	accel/habanalabs: fix handling of arc farm sei event There is only single eq entry for arc farm sei event which aggregates events from the four arc farms. Fix the code to handle this event according to this behavior. Signed-off-by: Dani Liberman <dliberman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2023-04-08 10:42:13 +03:00
Ofir Bitton	b207e166db	accel/habanalabs: remove Gaudi1 multi MSI code Multi MSI interrupts aren't working in Gaudi1 and because of that, we are only using a single MSI interrupt. Therefore, let's remove this dead code in order to avoid confusion. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2023-04-08 10:40:35 +03:00
Oded Gabbay	a25c2f7a46	accel/habanalabs/uapi: new Gaudi2 server type Add definition of a new Gaudi2 server type. This represents the connectivity between the cards in that server type. Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>	2023-04-08 10:39:34 +03:00
Ofir Bitton	38f3c732fc	accel/habanalabs: fixes for unexpected error interrupt Removing redundant asic prop variable as we don't need to expose this to common code. In addition, fix some typos. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2023-04-08 10:39:34 +03:00
Koby Elbaz	c19350efa9	accel/habanalabs: don't wait for STS_OK after sending COMMS WFE Sending COMMS_GOTO_WFE instructs the FW's CPU to halt (WFE state). Once sent, FW's CPU isn't expected to continue communicating with LKD. Therefore, the stage of waiting for COMMS_STS_OK should be skipped or else waiting for COMMS_STS_OK will simply timeout, which will trigger unexpected behavior. Signed-off-by: Koby Elbaz <kelbaz@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2023-04-08 10:39:34 +03:00
Tal Cohen	802f25b6c2	accel/habanalabs: sync f/w events interrupt in hard reset Receiving events from FW, while the device is in hard reset, causes a warning message in Driver log. The message may point to a problem in the Driver or FW. But It also can appear as a result of events that have been sent from FW just before the hard reset. In order to avoid receiving events from FW while the device is in reset and is already in 'disabled' mode, sync the f/w events interrupt right before setting the device to 'disabled'. Signed-off-by: Tal Cohen <talcohen@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2023-04-08 10:39:34 +03:00
Ofir Bitton	82a1b48a4e	accel/habanalabs: fix wrong reset and event flags During event handling, driver sets relevant reset and user event notifier flags. Fix few wrong flags settings. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2023-04-08 10:39:34 +03:00
Tomer Tayar	d4801c0485	accel/habanalabs: fix events mask of decoder abnormal interrupts The decoder IRQ status register may have several set bits upon an abnormal interrupt. Therefore, when setting the events mask, need to check all bits and not using if-else. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2023-04-08 10:39:34 +03:00
Tomer Tayar	9cf56f0d97	accel/habanalabs: remove completion from abnormal interrupt work name Decoder abnormal interrupts are for errors and not for completion, so rename the relevant work and work function to not include 'completion'. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2023-04-08 10:39:34 +03:00
Ofir Bitton	49fd071d15	accel/habanalabs: print raw binning masks in debug level There are rare cases of failures when cards are initialized due to wrong values in efuse mappings that are parsed by firmware. To help debug those cases, print (in debug level) the raw binning masks as fetched from the firmware during device initialization. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2023-04-08 10:39:34 +03:00
Ofir Bitton	d1943f1b97	accel/habanalabs: fix HBM MMU interrupt handling Current mapping between HMMU event and HMMU block is wrong. In addition the captured address in case of a page fault or an access error is scrambled, Hence we must call the descramble function. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2023-04-08 10:39:34 +03:00
Dafna Hirschfeld	12f7701138	accel/habanalabs: improvements to FW ver extraction 1. Rename the func to hl_get_preboot_major_minor because we also set the extracted values in hdev fields. 2. Free the allocated string in the calling function which makes more sense Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2023-04-08 10:39:33 +03:00
Dani Liberman	6306e81583	accel/habanalabs: fix access error clear event The register which needs to be cleared is the valid register instead of the address. Signed-off-by: Dani Liberman <dliberman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2023-04-08 10:39:33 +03:00
Tal Cohen	3a8d7c3a7d	accel/habanalabs: send disable pci when compute ctx is active Fix an issue in hard reset flow in which the driver didn't send a disable pci message if there was an active compute context. In hard reset, disable pci message should be sent no matter if a compute context exists or not. Signed-off-by: Tal Cohen <talcohen@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>	2023-04-08 10:39:33 +03:00
Tal Cohen	a855f710f5	accel/habanalabs: remove duplicated disable pci msg The disable pci message is sent in reset device. It informs the FW not to raise more EQs. The Driver may ignore received EQs, when the device is in disabled mode. The duplication happens when hard reset is scheduled during compute reset and also performs 'escalate_reset_flow'. Signed-off-by: Tal Cohen <talcohen@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>	2023-04-08 10:39:33 +03:00
Koby Elbaz	9d7fef7c59	accel/habanalabs: change COMMS warning messages to error level COMMS protocol is used for LKD <--> FW communication, and any communication failure between the two might turn out to be destructive, hence, it should be well emphasized. Signed-off-by: Koby Elbaz <kelbaz@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>	2023-04-08 10:39:33 +03:00
Dafna Hirschfeld	fb10da9337	accel/habanalabs: check return value of add_va_block_locked since the function might fail and we should propagate the failure. Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>	2023-04-08 10:39:33 +03:00
Tal Cohen	957b247bca	accel/habanalabs: print event type when device is disabled When the device is in disabled state, the driver isn't suppose to receive any events from FW. Printing the event type, as part of the message that was already printed, shall help to get more info if this unexpected message is received. Signed-off-by: Tal Cohen <talcohen@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>	2023-04-08 10:39:33 +03:00
Koby Elbaz	6c31c13759	accel/habanalabs: unmap mapped memory when TLB inv fails Once a memory mapping is added to the page tables, it's followed by a TLB invalidation request which could potentially fail (HW failure). Removing the mapping is simply a part of this failure handling routine. TLB invalidation failure prints were updated to be more accurate. Signed-off-by: Koby Elbaz <kelbaz@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>	2023-04-08 10:39:33 +03:00
Cai Huoqing	248ed9e227	accel/habanalabs: Remove redundant pci_clear_master Remove pci_clear_master to simplify the code, the bus-mastering is also cleared in do_pci_disable_device, like this: ./drivers/pci/pci.c:2197 static void do_pci_disable_device(struct pci_dev *dev) { u16 pci_command; pci_read_config_word(dev, PCI_COMMAND, &pci_command); if (pci_command & PCI_COMMAND_MASTER) { pci_command &= ~PCI_COMMAND_MASTER; pci_write_config_word(dev, PCI_COMMAND, pci_command); } pcibios_disable_device(dev); }. And dev->is_busmaster is set to 0 in pci_disable_device. Signed-off-by: Cai Huoqing <cai.huoqing@linux.dev> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>	2023-04-08 10:39:33 +03:00
Lionel Landwerlin	16fc9c08f0	drm/i915: disable sampler indirect state in bindless heap By default the indirect state sampler data (border colors) are stored in the same heap as the SAMPLER_STATE structure. For userspace drivers that can be 2 different heaps (dynamic state heap & bindless sampler state heap). This means that border colors have to copied in 2 different places so that the same SAMPLER_STATE structure find the right data. This change is forcing the indirect state sampler data to only be in the dynamic state pool (more convenient for userspace drivers, they only have to have one copy of the border colors). This is reproducing the behavior of the Windows drivers. BSpec: 46052 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230407093237.3296286-1-lionel.g.landwerlin@intel.com	2023-04-07 17:45:05 -07:00
Dmitry Baryshkov	ac7e7c9c65	drm/msm/dpu: drop unused macros from hw catalog Drop the version comparison macros from dpu_hw_catalog.h, they are unused. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/530889/ Link: https://lore.kernel.org/r/20230404130622.509628-43-dmitry.baryshkov@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>	2023-04-07 03:54:50 +03:00
Dmitry Baryshkov	dac76a0144	drm/msm/dpu: fetch DPU configuration from match data In email discussion it was noted that there can be different SoC device having slightly different SoC features, but sharing the same DPU hw revision. Stop fetching catalog data using core_rev and use platform's match data instead. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/530891/ Link: https://lore.kernel.org/r/20230404130622.509628-42-dmitry.baryshkov@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>	2023-04-07 03:54:39 +03:00

1 2 3 4 5 ...

1171806 Commits